Cadence Consumption Analysis

The Smart Route: What You Save

What Joe Pays

$200/mo

Claude Max subscription (per account)

API Equivalent Value (8 days)

$3,726

What this would cost at Anthropic API rates

Monthly Projected Savings

$13,772

($3,726 / 8 days = ~$466/day x 30 days) - $200 subscription

With 2 Accounts

$400/mo

J@TPJG + M@TPJG = double capacity, same savings ratio

Without Max (API Pricing)

$15.00 per million input tokens
$75.00 per million output tokens
Pay-per-use, no ceiling
8 days = $3,726
30 days = ~$13,972

With Max ($200/mo)

Flat $200/month per account
Same Opus 4 model, same capabilities
Session + weekly rate limits (manageable with Governor)
8 days = $200 (already included)
30 days = $200 (same price)

Bottom line: Joe gets ~$14,000/month of Anthropic API compute for $200. That's a 70x return on the subscription when fully utilized. The Governor system ensures we stay within rate limits to maximize this value without hitting walls.
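The projection behind these figures is a straight-line extrapolation. A minimal sketch, assuming usage continues at the observed 8-day pace (the variable and function names are illustrative, not part of any tool):

```python
# Figures taken from the summary above; names are illustrative.
API_EQUIVALENT_8_DAYS = 3726.00   # USD, API-rate value of 8 days of usage
DAYS_OBSERVED = 8
SUBSCRIPTION_PER_MONTH = 200.00   # USD, one Claude Max account


def project_monthly(api_value: float, days: int, month_days: int = 30) -> float:
    """Linearly extrapolate an observed API-equivalent value to a full month."""
    return api_value / days * month_days


daily = API_EQUIVALENT_8_DAYS / DAYS_OBSERVED
monthly = project_monthly(API_EQUIVALENT_8_DAYS, DAYS_OBSERVED)
savings = monthly - SUBSCRIPTION_PER_MONTH
multiple = monthly / SUBSCRIPTION_PER_MONTH

print(f"Daily API-equivalent: ${daily:,.2f}")
print(f"30-day projection:    ${monthly:,.2f}")
print(f"Projected savings:    ${savings:,.2f}")
print(f"Value per $1 of plan: {multiple:.1f}x")
```

The result is only as good as the assumption that the next 22 days look like the first 8; a quieter stretch lowers the multiple proportionally.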

Table of Contents

0. The Smart Route: What You Save
1. Executive Summary
2. Where the Money Goes
3. Session-by-Session Breakdown
4. Cost by Activity Type
5. Subagent Economics
6. Compaction & Context Management
7. Tool Usage Patterns
8. How to Structure Requests Efficiently
9. Cheap vs Expensive Operations
10. Practical Tips for Joe
11. Methodology & Data Notes

1. Executive Summary

Analysis of 12 sessions across 8 days shows that cache operations dominate cost (90.5% of total), not output generation (9.2%). The biggest cost driver is the large cached context (~200K tokens per API call, of which ~50K is Cadence's constitutional context) being re-read on every call. Here are the key findings:

Biggest Cost Driver

Cache Read

50.3% of total cost ($1,873). Every API call reads ~200K cached tokens.

Second Biggest

Cache Create

40.2% of total cost ($1,496). New context gets written to cache frequently.

Most Expensive Activity

Compaction

$2.26 per compact turn (19 in largest session = $43)

Cheapest Activity

Conversation

$0.09 per turn. Short Q&A with no tool use.

Model Split

90% Opus

6,926 Opus calls vs 755 Sonnet calls

Avg Subagent Cost

$7.79

217 subagents spawned across all sessions

Key Insight: The cost is not proportional to how much Cadence "writes" — it's proportional to how many turns happen. Each turn reads the full context from cache (~200K tokens at $1.50/MTok = ~$0.30 just for the cache read). A session with 100 short turns costs more than a session with 20 long turns.
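To make the turn-count effect concrete, here is a rough model using the Opus rates quoted in this report. It is a sketch under simplifying assumptions: one API call per turn, a constant ~200K cached context, and hypothetical output sizes per turn. Cache creation is ignored.

```python
OPUS_CACHE_READ_PER_MTOK = 1.50   # USD per million cached tokens read
OPUS_OUTPUT_PER_MTOK = 75.00      # USD per million output tokens
CONTEXT_TOKENS = 200_000          # approximate cached context per call


def session_cost(turns: int, output_tokens_per_turn: int,
                 calls_per_turn: float = 1.0) -> float:
    """Estimate cache-read plus output cost for a session, in USD."""
    cache_read = (turns * calls_per_turn * CONTEXT_TOKENS
                  * OPUS_CACHE_READ_PER_MTOK / 1e6)
    output = turns * output_tokens_per_turn * OPUS_OUTPUT_PER_MTOK / 1e6
    return cache_read + output


# Hypothetical sessions: many short replies vs. few long replies.
many_short = session_cost(turns=100, output_tokens_per_turn=100)
few_long = session_cost(turns=20, output_tokens_per_turn=2_000)

print(f"100 short turns: ${many_short:.2f}")
print(f"20 long turns:   ${few_long:.2f}")
```

In this model the 20-turn session generates four times as much output yet costs less than a third as much, because the cache-read term scales with turns rather than with words written.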

2. Where the Money Goes

Cost Component Breakdown

Cache Read 50% | Cache Create 40% | Output 9%

Component	Tokens	Rate (per MTok)	Est. Cost	% of Total	Data
Cache Read (reading cached context on each API call)	1,248,656,731	$1.50	$1,873	50.3%	REAL DATA
Cache Create (writing new context to cache)	79,799,267	$18.75	$1,496	40.2%	REAL DATA
Output (Cadence's generated responses + code)	4,569,351	$75.00	$343	9.2%	REAL DATA
Input (non-cached prompt tokens)	930,200	$15.00	$14	0.4%	REAL DATA
TOTAL	—	—	$3,726	100%	ESTIMATE

What Joe actually pays: $200/month (Claude Max). The numbers above show what this usage would cost at API rates ($15/MTok input, $75/MTok output, $1.50/MTok cache read, $18.75/MTok cache write). Joe's Max subscription caps this at $200/mo — meaning $3,526 in savings over 8 days alone. The Governor system maximizes utilization within rate limits.
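The estimated costs in the table are token counts multiplied by the per-MTok rate. A short sketch that reproduces them (the dictionary layout and function name are illustrative; every token is priced at the Opus rates, as the table does):

```python
# (tokens, USD per million tokens) from the component table above
COMPONENTS = {
    "cache_read":   (1_248_656_731, 1.50),
    "cache_create": (79_799_267, 18.75),
    "output":       (4_569_351, 75.00),
    "input":        (930_200, 15.00),
}


def component_costs(components: dict) -> dict:
    """Return the estimated USD cost of each component."""
    return {name: tokens * rate / 1e6
            for name, (tokens, rate) in components.items()}


costs = component_costs(COMPONENTS)
total = sum(costs.values())

for name, cost in costs.items():
    print(f"{name:<13} ${cost:>9,.2f}  {cost / total:6.1%}")
print(f"{'total':<13} ${total:>9,.2f}")
```

Since roughly 10% of calls went to Sonnet, pricing everything at Opus rates slightly overstates the API-equivalent total.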

3. Session-by-Session Breakdown

Session	Date	Duration	User Msgs	Subagents	Output Tokens	Est. Cost	$/Hour	Description
`96564c9f`	Mar 6	29.7h	1,254	30	820,563	$671	$22.59	FIRST MOMENTS: Wake-up protocol, identity formation
`d0d6d3fb`	Mar 7	37.9h	1,078	53	1,195,254	$975	$25.72	CIV PROJECT: Telegram, portal, deep work
`1b14c79c`	Mar 9	15.1h	572	33	541,924	$483	$32.00	RECOVERY: Cross-civ restart, Telegram reconnect
`efcd9667`	Mar 10	34.0h	1,427	59	1,397,293	$1,028	$30.25	HEAVIEST: Full workday of research, builds, web, agents
`360ceed6`	Mar 11	0.5h	41	0	6,175	$10	$20.84	CRASH RECOVERY: Quick restart handoff
`50d3927e`	Mar 11	25.1h	244	26	399,936	$361	$14.39	EFFICIENT: Post-crash, focused agent work
`b2b1a45d`	Mar 12	5.3h	295	16	211,658	$182	$34.21	CURRENT: Today's session (in progress)

Most efficient session: 50d3927e on Mar 11 at $14.39/hr — focused work with fewer but deeper agent delegations. Most expensive per hour: b2b1a45d at $34.21/hr — high turn frequency during active use today.

Key Observations

Session efcd9667 (Mar 10) was the most expensive at $1,028 — it ran for 34 hours with 1,427 user messages and 59 subagents. This was a heavy workday with research, file builds, and web scraping.
Short sessions are cheap — the crash recovery session cost only $10 for 30 minutes of focused work.
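Longer sessions can get cheaper per hour once the cache stabilizes and fewer cache-create operations happen, but duration alone does not set the rate: the 34-hour session efcd9667 still ran $30.25/hr, while the 25.1-hour session 50d3927e ran $14.39/hr.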
Heavy subagent use correlates with higher cost but also higher output. Session d0d6d3fb spawned 53 subagents and produced 1.2M output tokens.
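The $/Hour column is estimated cost divided by wall-clock duration (idle time included). A small sketch for ranking sessions, using the rounded figures from the table, so results can differ from the table by a few cents:

```python
# (session id, duration in hours, estimated cost in USD) from the table above
SESSIONS = [
    ("96564c9f", 29.7, 671),
    ("d0d6d3fb", 37.9, 975),
    ("1b14c79c", 15.1, 483),
    ("efcd9667", 34.0, 1028),
    ("360ceed6", 0.5, 10),
    ("50d3927e", 25.1, 361),
    ("b2b1a45d", 5.3, 182),
]


def cost_per_hour(sessions: list) -> dict:
    """Map each session id to its estimated USD cost per hour."""
    return {sid: cost / hours for sid, hours, cost in sessions}


rates = cost_per_hour(SESSIONS)
cheapest = min(rates, key=rates.get)
priciest = max(rates, key=rates.get)

for sid, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{sid}  ${rate:6.2f}/hr")
print(f"Most efficient: {cheapest}; most expensive per hour: {priciest}")
```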

4. Cost by Activity Type

Based on deep analysis of the largest session (efcd9667, 1,427 turns). Each turn was classified by what tools were used.

Activity Type	Turns	% of Session	Avg Cost/Turn	Total Est. Cost	Efficiency
Compact Resume (context compaction & continuation)	19	1.3%	$2.26	$42.95	MOST EXPENSIVE / TURN
Telegram Messages (processing TG messages from Joe)	72	5.0%	$0.84	$60.23	MODERATE
Web Research (WebSearch + WebFetch operations)	46	3.2%	$0.68	$31.40	MODERATE
File Write/Edit (creating or modifying files)	187	13.1%	$0.53	$99.41	GOOD VALUE
Agent Spawn (delegating to subagents)	35	2.5%	$0.54	$18.82	GOOD VALUE
Bash Commands (running shell commands)	489	34.3%	$0.44	$215.27	EFFICIENT
File Read/Search (Read, Grep, Glob operations)	357	25.0%	$0.39	$138.32	EFFICIENT
Pure Conversation (chat with no tools used)	202	14.2%	$0.09	$18.78	CHEAPEST
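The report does not spell out the classification rules, so the following is only one plausible way to assign a turn to a single activity type from the tools it used. The priority order, the function name, and the two boolean flags are assumptions made for illustration; only the tool and category names come from this report.

```python
# Hypothetical priority order: the first matching rule wins.
RULES = [
    ("Web Research", {"WebSearch", "WebFetch"}),
    ("Agent Spawn", {"Agent"}),
    ("File Write/Edit", {"Write", "Edit"}),
    ("Bash Commands", {"Bash"}),
    ("File Read/Search", {"Read", "Grep", "Glob"}),
]


def classify_turn(tools_used, is_compact_resume=False, from_telegram=False):
    """Assign one activity label to a turn based on the tools it called."""
    if is_compact_resume:
        return "Compact Resume"
    if from_telegram:
        return "Telegram Messages"
    used = set(tools_used)
    for label, tools in RULES:
        if used & tools:      # any overlap with this rule's tool set
            return label
    return "Pure Conversation"


print(classify_turn([]))
print(classify_turn(["Read", "Bash"]))
print(classify_turn(["WebFetch", "Write"]))
```

Any single-label scheme like this is heuristic: a turn that reads three files and runs one shell command has to land in exactly one bucket, which is why the per-activity costs are marked as estimates.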

Visual: Cost Per Turn by Activity

Compact Resume: $2.26/turn
Telegram Msg: $0.84/turn
Web Research: $0.68/turn
Agent Spawn: $0.54/turn
File Write/Edit: $0.53/turn
Bash Commands: $0.44/turn
File Read/Search: $0.39/turn
Conversation: $0.09/turn

5. Subagent Economics

Total Subagents Spawned

217

Across 12 sessions in 8 days

Total Subagent Cost

$1,690

45.3% of total spending

Avg Cost Per Subagent

$7.79

Varies widely by task complexity

Subagent Output Tokens

2.4M

52.5% of all output generated

Primary vs Subagent Split

Primary (Conductor): $2,036 (54.7%)

Subagents: $1,690 (45.3%)

Component	Input Tokens	Output Tokens	Cache Read	Cache Create	Est. Cost
Primary (Conductor)	108,051	2,169,642	858,413,630	31,163,617	$2,036
Subagents (all 217)	822,149	2,399,709	390,243,101	48,635,650	$1,690

Subagents generate MORE output tokens than the Primary (2.4M vs 2.2M) because they do the actual work — writing code, creating documents, running analysis. The Primary's cost is dominated by cache reads (orchestration context).
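The same token table also shows which component dominates each side. A sketch that prices each row at the Opus rates used in this report and reports the largest share (names are illustrative):

```python
RATES = {"input": 15.00, "output": 75.00,
         "cache_read": 1.50, "cache_create": 18.75}   # USD per MTok

USAGE = {
    "Primary": {"input": 108_051, "output": 2_169_642,
                "cache_read": 858_413_630, "cache_create": 31_163_617},
    "Subagents": {"input": 822_149, "output": 2_399_709,
                  "cache_read": 390_243_101, "cache_create": 48_635_650},
}


def cost_shares(tokens: dict):
    """Return (total USD, {component: share of total}) for one row."""
    costs = {name: count * RATES[name] / 1e6 for name, count in tokens.items()}
    total = sum(costs.values())
    return total, {name: cost / total for name, cost in costs.items()}


for role, tokens in USAGE.items():
    total, shares = cost_shares(tokens)
    top = max(shares, key=shares.get)
    print(f"{role}: ${total:,.0f}, largest component: {top} ({shares[top]:.0%})")
```

At these rates the Primary's largest component is cache reads (about 63% of its cost), while for subagents it is cache creation (about 54%), consistent with each spawn having to build a fresh context.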

Optimization opportunity: The Primary reads ~858M cache tokens across the period just for orchestration. Reducing CLAUDE.md + constitutional context size could save 10-20% on cache read costs alone. Also, sessions with fewer but more targeted subagent spawns (like 50d3927e at $14.39/hr) are more cost-efficient than sessions with many short-lived agents.

6. Compaction & Context Management

Compaction (/compact) happens when context fills up. It summarizes the conversation and starts a continuation. This is expensive because it creates large new cache entries.

Compact Turns Observed

19

In the largest session alone

Cost Per Compaction

$2.26

About 5x the ~$0.44 average turn in the activity table

Avg Summary Size

19,220

Characters in compact summary

Cache Create/Compact

111,573

Tokens written to cache per compact

Why Compaction Is Expensive

Rebuilds the entire cache — after compaction, a fresh ~100K+ token context must be written (cache_create at $18.75/MTok)
Multiple API calls per compact — averages 3.4 API calls per compact turn (vs 1.5 for normal turns)
Long summaries — the compact summary averages 19K characters, which expands the context for subsequent turns
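The first reason accounts for most of the bill. A quick check using only figures from this section (the average tokens written per compact and the cache-create rate):

```python
CACHE_CREATE_PER_MTOK = 18.75         # USD per million tokens written to cache
TOKENS_WRITTEN_PER_COMPACT = 111_573  # average cache-create tokens per compact
COST_PER_COMPACT_TURN = 2.26          # USD, average observed per compact turn

cache_write_cost = TOKENS_WRITTEN_PER_COMPACT * CACHE_CREATE_PER_MTOK / 1e6
share = cache_write_cost / COST_PER_COMPACT_TURN

print(f"Cache write per compact: ${cache_write_cost:.2f}")
print(f"Share of compact turn cost: {share:.0%}")
```

So the cache rewrite alone is roughly $2.09 of the $2.26; the extra API calls and the summary output make up the small remainder.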

Optimization: Compact LESS frequently. The 80% context warning is a good trigger. Avoid rapid-fire short messages that fill context quickly — batch your requests into fewer, more detailed messages instead.

7. Tool Usage Patterns

Tool	Total Calls	% of All Tools	Cost Impact	Notes
Bash	2,602	57.6%	LOW	Most frequent. Each call is cheap ($0.44/turn avg). Backbone of file ops, git, python.
Read	665	14.7%	LOW	Reading files. Very efficient at $0.39/turn avg.
Edit	364	8.1%	MODERATE	File modifications. Good value — creates actual deliverables.
Grep	213	4.7%	LOW	Code search. Very efficient for finding information.
Write	171	3.8%	MODERATE	Creating new files. Good value when creating HTML reports, configs.
Agent	166	3.7%	HIGH	Spawning subagents. Each spawn creates a new context ($7.79 avg per agent lifecycle).
WebSearch	92	2.0%	MODERATE	Web research. Multiple API calls per search turn (avg 2.7).
Glob	82	1.8%	LOW	File pattern matching. Extremely efficient.
WebFetch	60	1.3%	MODERATE	Fetching web pages. Content size varies — large pages cost more.
TaskUpdate/Create	86	1.9%	LOW	Task management. Minimal token overhead.

8. How to Structure Requests Efficiently

The #1 Rule: Fewer Turns = Lower Cost

Every turn (message + response) reads the full cached context. A session with 100 turns costs roughly the same in cache reads regardless of whether each turn is 5 words or 500 words. Batch your instructions into fewer, richer messages.

Expensive Pattern (Many Short Messages)

"Check the email"
"What did it say?"
"Reply to that one"
"Also check Drive"
"Make a report about it"

5 turns = ~$2.00 in cache reads alone

Efficient Pattern (One Detailed Message)

"Check email, summarize anything from clients. Also check Drive for any new files from KLJ. Then create an HTML report with the findings and email summary."

1 turn = ~$0.40 in cache reads (but more output tokens)
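The two patterns above imply a simple rule: each turn avoided saves about one context read. A minimal sketch, assuming the ~$0.40 per-turn cache-read figure used above and ignoring the somewhat larger output of the batched reply (the function name is illustrative):

```python
CACHE_READ_PER_TURN = 0.40  # USD, per-turn figure used in the patterns above


def batching_savings(messages_before: int, messages_after: int = 1) -> float:
    """Estimate USD saved in cache reads by merging messages into fewer turns."""
    if messages_after > messages_before:
        raise ValueError("batching cannot increase the number of messages")
    return (messages_before - messages_after) * CACHE_READ_PER_TURN


print(f"5 messages -> 1: ${batching_savings(5):.2f} saved")
print(f"3 messages -> 1: ${batching_savings(3):.2f} saved")
```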

Batching Rules of Thumb

Instead of...	Try...	Savings
5 separate "RAS [topic]" messages	One message: "RAS these 5 topics: [list]"	~$1.60 in cache reads
"Check email" then "Reply to Joe's msg" then "Draft for KLJ"	"Check email. Reply to any from Joe with [X]. Draft response to KLJ about [Y]."	~$0.80
Sending corrections one at a time ("change this", "also fix that")	One message with all corrections listed	~$0.40 per correction avoided
"What's the status?" then "Ok, do the next thing"	"Status check, then proceed with the next priority"	~$0.40

Request Phrasing That Helps

Be specific up front. The more context you give in your first message, the fewer clarification turns are needed.

"RAS [URL]" — Great. Clear, one-shot. Cadence knows to Research, Analyze, Synthesize.
"Create [deliverable] with [specific requirements]" — Great. One turn gets the job done.
"Build a report on X. Include Y tables, Z charts. Use TPJG styling. Save to exports/" — Perfect. Complete spec in one message.

9. Cheap vs Expensive Operations

Operation	Est. Cost	Rating	Notes
Quick question / status check	$0.09	CHEAP	Pure conversation, no tools. Best for "what's the status?" type queries.
File search (Grep/Glob/Read)	$0.39	CHEAP	Finding and reading files. Very efficient.
Run a bash command	$0.44	CHEAP	Shell operations, git, python scripts.
Edit/write a file	$0.53	MODERATE	Good value — produces actual deliverables.
Spawn a subagent (simple task)	~$3-5	MODERATE	Short-lived agent for focused task. New cache creation adds cost.
Web research (search + fetch)	$0.68	MODERATE	Multiple API calls. Fetched content increases context size.
Process a Telegram message	$0.84	MODERATE	Includes routing, thinking, and often tool use.
Spawn a subagent (complex research)	~$8-15	EXPENSIVE	Long-running agent with web research, many files read.
Compaction (/compact)	$2.26	EXPENSIVE	Rebuilds entire cache. Necessary but costly. Delay until 80% context.
Full wake-up protocol	~$10-20	EXPENSIVE	Reads constitutional docs, checks email, activates memory, intel scan. First 5 min of every session.
Multi-agent team lead session	~$50-100+	VERY EXPENSIVE	Team lead + multiple specialists. Highest output but highest cost.

When to Use Cadence vs Do It Yourself

Best Uses for Cadence (High ROI)

Creating HTML reports, documents, and deliverables
Research + synthesis (RAS tasks)
Processing meeting transcripts
Data analysis (spreadsheets, financials)
Email drafting and management
Code/automation development
Complex multi-step tasks that would take you hours

Better Done Manually (Low ROI)

Quick Google searches (faster to just search yourself)
Simple copy-paste operations
Short replies to texts/messages
Calendar management (no API access)
Anything requiring real-time phone interaction
Quick math that a calculator handles

10. Practical Tips for Joe

A. Session Management

Tip 1: Batch Messages

Combine Related Asks

Instead of 5 short messages, send 1 detailed one. Saves ~$1.60 per batch of 5.

Tip 2: One Request, One Session

Big Tasks = Own Session

Heavy research or builds are better as dedicated sessions that can compact once, not mixed in with chat.

Tip 3: Avoid Rapid Fire

Wait for Completion

Let Cadence finish before sending the next thing. Interruptions waste the current turn's work.

Tip 4: Night Mode

Stagger Non-Priority

Already doing this. "Going to bed, stagger processing" is great — reduces concurrent turns.

B. Request Phrasing

Scenario	Efficient Phrasing	Why It's Better
Research task	"RAS [URL or topic]. Create HTML report at exports/[name].html with TPJG styling."	One turn: clear deliverable, clear format, clear location.
Multiple tasks	"Three things: (1) Check email and respond to client inquiries. (2) Update the LOC tracker with [data]. (3) Create a status summary."	Numbered list lets Cadence batch all three in one session flow.
Corrections to a deliverable	"Update [file]: change X to Y, fix the table header, add a row for Z, and move section A above section B."	All corrections in one turn instead of 4 separate messages.
Exploratory question	"What's our current status on [X]? If there are blockers, suggest 3 solutions."	Pre-empts the follow-up question ("ok so what should we do?").
File delivery	"Create [document], save to exports/, and provide the Cloudflare Pages link."	Includes delivery instructions upfront — no follow-up needed.

C. Cost-Saving Opportunities

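Reduce constitutional context size: CLAUDE.md + CLAUDE-OPS.md + CLAUDE-AGENTS.md total ~50K tokens loaded on every API call. Shrinking the whole cached context by 20% would save ~$375 in cache reads over the same 8-day period (20% of $1,873). Trimming only the ~50K constitutional portion by 20% removes ~10K tokens per call, or roughly $115 across 7,681 API calls at $1.50/MTok. ESTIMATE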
Use Sonnet for routine tasks — Only 755 of 7,681 API calls used Sonnet. Routing simple file operations and bash commands to Sonnet (at 1/5 the price) could save $200-400/week. ESTIMATE
Compact timing — Compact at 80% context, not 70%. Each compaction costs $2.26. If you can reduce compactions from 19 to 12 per heavy session, that saves ~$16 per session.
Fewer, deeper agent spawns — 217 subagents at $7.79 avg = $1,690. If 20% of spawns could be eliminated by having the Primary handle simple tasks directly, savings would be ~$338. ESTIMATE
Session continuity — Crashes force expensive restarts (wake-up protocol = $10-20 each time). The 360ceed6 crash recovery + 50d3927e restart together cost ~$371. Preventing crashes saves both money and continuity.
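Several of the estimates above are one-line calculations, sketched here so they can be rerun with different inputs. The function names are illustrative, and the 10% routing fraction in the last example is a hypothetical input, not a measured one.

```python
COST_PER_COMPACTION = 2.26       # USD per compact turn
TOTAL_SUBAGENT_COST = 1690.00    # USD across 217 subagents
DAILY_API_EQUIVALENT = 3726 / 8  # USD per day at API rates
SONNET_PRICE_RATIO = 1 / 5       # Sonnet 4 rates relative to Opus 4


def compaction_savings(before: int, after: int) -> float:
    """USD saved per session by compacting fewer times."""
    return (before - after) * COST_PER_COMPACTION


def spawn_savings(fraction_eliminated: float) -> float:
    """USD saved if a fraction of subagent spend is avoided."""
    return TOTAL_SUBAGENT_COST * fraction_eliminated


def sonnet_routing_savings(fraction_routed: float, days: int = 7) -> float:
    """USD saved if a fraction of spend moves from Opus to Sonnet rates."""
    moved = DAILY_API_EQUIVALENT * days * fraction_routed
    return moved * (1 - SONNET_PRICE_RATIO)


print(f"19 -> 12 compactions:   ${compaction_savings(19, 12):.2f} per session")
print(f"20% fewer spawns:       ${spawn_savings(0.20):.0f}")
print(f"10% routed to Sonnet:   ${sonnet_routing_savings(0.10):.0f} per week")
```

Under these assumptions, moving about a tenth of weekly spend to Sonnet lands inside the $200-400/week range quoted above.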

D. Quick Reference: Request Cost Tiers

Tier	Cost Range	Examples
MICRO	<$0.25	Status checks, simple questions, "yes/no" confirmations
LIGHT	$0.25 – $1	File reads, bash commands, single file edits, short Telegram replies
MEDIUM	$1 – $10	Web research, single agent delegation, HTML report creation, email processing
HEAVY	$10 – $50	Multi-agent research, full wake-up protocol, complex multi-step builds
INTENSIVE	$50+	Team lead sessions with multiple specialists, full-day autonomous operation
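For anyone scripting against these tiers, the table maps onto a simple threshold lookup. A sketch, assuming each tier's lower bound is inclusive (the table does not say which side the boundaries fall on):

```python
import bisect

# Upper edges of MICRO, LIGHT, MEDIUM and HEAVY, in USD.
TIER_BOUNDS = [0.25, 1.00, 10.00, 50.00]
TIER_NAMES = ["MICRO", "LIGHT", "MEDIUM", "HEAVY", "INTENSIVE"]


def cost_tier(cost_usd: float) -> str:
    """Return the cost tier name for an estimated request cost."""
    if cost_usd < 0:
        raise ValueError("cost cannot be negative")
    # bisect_right places a value equal to a bound into the higher tier.
    return TIER_NAMES[bisect.bisect_right(TIER_BOUNDS, cost_usd)]


for example in (0.09, 0.44, 2.26, 15.00, 75.00):
    print(f"${example:>6.2f} -> {cost_tier(example)}")
```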

11. Methodology & Data Notes

Data Sources

Primary sessions: 12 JSONL ledger files in ~/.claude/projects/-home-aiciv/ (73MB total)
CIV project sessions: 1 JSONL ledger in ~/.claude/projects/-home-aiciv-civ/ (14MB)
Subagent data: 217 subagent JSONL files across session subdirectories
Session metadata: 60 session files in ~/memories/sessions/

What Is Real vs Estimated

Metric	Source	Confidence
Token counts (input, output, cache)	Anthropic API usage data in JSONL	HIGH — direct from API responses
Tool call counts	Parsed from assistant message content blocks	HIGH — direct from message data
Session durations	First/last timestamp in each JSONL	GOOD — includes idle time
Dollar costs	Calculated from token counts x published Opus pricing	ESTIMATED — actual plan costs may differ significantly
Per-activity costs	Turn classification + average token usage	ESTIMATED — classification is heuristic-based
Savings projections	Extrapolated from patterns	ESTIMATED — rough projections, not guarantees

Pricing Used

Model	Input (per MTok)	Output (per MTok)	Cache Read (per MTok)	Cache Create (per MTok)
Claude Opus 4	$15.00	$75.00	$1.50	$18.75
Claude Sonnet 4	$3.00	$15.00	$0.30	$3.75
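The dollar estimates come from applying this pricing table to the usage counters in each JSONL record. The sketch below shows the general approach, not the exact script used for this report. The usage field names match the Anthropic API response format; the record layout (a `message` object holding `model` and `usage`), the substring match on the model name, and the sample model strings are assumptions for illustration.

```python
import json

# USD per million tokens, keyed by the API usage field each rate applies to.
PRICING = {
    "opus": {"input_tokens": 15.00, "output_tokens": 75.00,
             "cache_read_input_tokens": 1.50,
             "cache_creation_input_tokens": 18.75},
    "sonnet": {"input_tokens": 3.00, "output_tokens": 15.00,
               "cache_read_input_tokens": 0.30,
               "cache_creation_input_tokens": 3.75},
}


def estimate_cost(jsonl_lines) -> float:
    """Sum the estimated USD cost of every record that carries usage data."""
    total = 0.0
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line).get("message")
        if not isinstance(message, dict) or not message.get("usage"):
            continue  # user messages and metadata records have no usage
        model = message.get("model", "")
        family = next((name for name in PRICING if name in model), None)
        if family is None:
            continue  # unknown model: skip rather than guess a price
        for field, rate in PRICING[family].items():
            total += (message["usage"].get(field) or 0) * rate / 1e6
    return total


usage = {"input_tokens": 10, "output_tokens": 100,
         "cache_read_input_tokens": 200_000,
         "cache_creation_input_tokens": 1_000}
sample = [
    json.dumps({"message": {"model": "claude-opus-4", "usage": usage}}),
    json.dumps({"message": {"model": "claude-sonnet-4", "usage": usage}}),
    json.dumps({"type": "user", "message": {"content": "hello"}}),
]
print(f"Estimated cost: ${estimate_cost(sample):.5f}")
```

Pricing per model, as done here, would give a slightly lower total than the component table earlier in the report, which applies Opus rates to all tokens.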

Important: If you are on a Claude Max subscription ($100/mo or $200/mo), the dollar amounts above do not represent your real spend; your cost is the subscription fee. However, the relative patterns still hold: operations that consume more tokens will hit usage rate limits faster, even on a subscription plan. The token counts and ratios in this report remain useful for understanding which operations are "heavy" vs "light" regardless of billing model.
