Why OpenClaw Uses 9,600 Tokens for a Simple Question (And How to Fix It) | OpenClaw DC
A simple 'What model are you?' question in OpenClaw costs 9,600+ tokens. That is because OpenClaw sends ~8,000 tokens of system instructions, skills, and context with every single request. After 5 conversation turns, a request can cost about 13.4x the first message. Here is exactly where every token goes and how to stop the bleeding.
Need help cutting your OpenClaw costs?
Book a Call for a personalized token audit and optimization session.
TL;DR
Every OpenClaw message carries ~8,000 tokens of invisible overhead. A 20-token question actually costs 9,600+ tokens. After 5 turns, context snowballing can push the cost of a request to about 13.4x the first message. Background tasks add another 3-5x on top. Five targeted fixes below can cut your token usage by 60-80%.
Where 9,600 Tokens Go
You type “What model are you?” That is roughly 7 tokens. OpenClaw charges you 9,600+. Here is the breakdown of what gets sent to the API alongside your tiny question:
| Component | Approximate Tokens | Purpose |
|---|---|---|
| System prompt | ~3,200 | Core instructions, safety rules, behavior definitions |
| SOUL.md | ~1,800 | Your custom personality, preferences, project context |
| Skill definitions | ~1,500 | Descriptions for every enabled skill (tools, integrations) |
| Memory context | ~800 | Persistent memory entries loaded from previous sessions |
| Conversation scaffolding | ~280 | Message formatting, role tags, structural tokens |
| Your actual question | ~7 | “What model are you?” |
| Total | ~9,585 | What the API actually processes and bills |
Your 7 tokens of real content represent 0.07% of the total request. The other 99.93% is overhead that OpenClaw sends on your behalf, silently, every time.
This is not a bug. It is how OpenClaw maintains its personality, remembers your preferences, and knows which tools are available. But understanding it is the first step to controlling it.
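To make the proportions concrete, here is a minimal Python sketch that recomputes each component's share from the table. The numbers are copied from the table; the names `COMPONENTS`, `STATED_TOTAL`, and `share` are illustrative only. One thing the arithmetic surfaces: the six listed rows add up to about 7,600 tokens, so roughly 2,000 tokens of the stated ~9,585 total are not itemized. The sketch reports that gap as "not itemized", which is an assumption about the table, not something the source explains.

```python
# Approximate per-request token counts copied from the table above.
COMPONENTS = {
    "System prompt": 3200,
    "SOUL.md": 1800,
    "Skill definitions": 1500,
    "Memory context": 800,
    "Conversation scaffolding": 280,
    "Your actual question": 7,
}
STATED_TOTAL = 9585  # the table's "Total" row


def share(tokens: int, total: int) -> float:
    """Return tokens as a percentage of total."""
    return 100.0 * tokens / total


itemized = sum(COMPONENTS.values())
# Assumption: whatever the rows do not account for is other overhead.
unitemized = STATED_TOTAL - itemized

for name, tokens in COMPONENTS.items():
    print(f"{name:<26}{tokens:>6,}  {share(tokens, STATED_TOTAL):6.2f}%")
print(f"{'Itemized subtotal':<26}{itemized:>6,}")
print(f"{'Not itemized':<26}{unitemized:>6,}")
```

Swap in your own measurements to see which component dominates your requests.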
The Context Snowball
The system prompt overhead stays constant. What grows is the conversation history. OpenClaw sends the entire conversation back to the API with every new message. Here is how costs accumulate across a typical session:
| Turn | New User Tokens | New Assistant Tokens | Cumulative Context | Total Request Size |
|---|---|---|---|---|
| 1 | 20 | 150 | 170 | 9,770 |
| 2 | 30 | 200 | 400 | 10,000 |
| 3 | 25 | 300 | 725 | 10,325 |
| 4 | 40 | 500 | 1,265 | 10,865 |
| 5 | 35 | 120,000 | 121,300 | 130,900 |
Turn 5 is not hypothetical. That is what happens when you ask OpenClaw to write a long document, refactor a file, or generate detailed output. One verbose response can balloon the context from manageable to massive.
By turn 5, after a single long response, the request is about 13.4x the size of the first one (130,900 tokens versus 9,770). Every subsequent turn carries all of that weight forward.
The math is simple but brutal: if turn 1 costs $0.01, turn 5 costs about $0.13. Late in a 20-turn session, a single turn can cost 50-100x what the first message did.
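The snowball table can be reproduced with a few lines of Python. This is a sketch under two assumptions: the fixed overhead is 9,600 tokens per request (inferred from the table, where 9,770 minus 170 is 9,600), and "total request size" follows the table's convention of counting each turn's assistant tokens in that turn's total. `OVERHEAD`, `TURNS`, and `request_sizes` are illustrative names.

```python
OVERHEAD = 9_600  # fixed per-request overhead implied by the table

# (new user tokens, new assistant tokens) per turn, from the table above
TURNS = [(20, 150), (30, 200), (25, 300), (40, 500), (35, 120_000)]


def request_sizes(turns, overhead=OVERHEAD):
    """Return the total request size after each turn, history included."""
    sizes, history = [], 0
    for user_tokens, assistant_tokens in turns:
        history += user_tokens + assistant_tokens  # history never shrinks
        sizes.append(overhead + history)
    return sizes


sizes = request_sizes(TURNS)
for turn, size in enumerate(sizes, start=1):
    print(f"turn {turn}: {size:>7,} tokens  ({size / sizes[0]:.1f}x turn 1)")
```

Replace `TURNS` with token counts from one of your own sessions to see where the curve bends.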
The Background Drain
Your visible conversation is only part of the story. OpenClaw runs background tasks that consume tokens without you sending a single message:
Heartbeat checks. OpenClaw periodically pings the API to verify your session is active. Each ping carries the full system prompt overhead. If heartbeats fire every 30 seconds, that is 9,600 tokens every half minute you are idle.
Autocomplete suggestions. As you type, OpenClaw may send partial input to generate inline suggestions. Each suggestion request carries the full context. Type slowly and you could trigger dozens of these per message.
Conversation title generation. After your first message, OpenClaw sends a separate API call to generate a title for the conversation. Another 9,600+ token request you never asked for.
Tool result processing. When OpenClaw runs a skill (file search, web lookup, code execution), the results get added to context and often trigger a follow-up API call to summarize or format the output.
Combined, these background tasks inflate your actual token usage by 3-5x. You think you used 100K tokens in a session. The API dashboard says 400K. The difference is background overhead.
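A rough way to size the background drain, using only the figures quoted above: a 30-second heartbeat interval, 9,600 tokens per ping, and a 3-5x multiplier on visible usage. The 30-second interval is the article's hypothetical, not a documented OpenClaw default, and both function names are made up for this sketch.

```python
def idle_heartbeat_tokens(idle_seconds: int,
                          interval_seconds: int = 30,
                          tokens_per_ping: int = 9_600) -> int:
    """Tokens consumed by heartbeats alone during an idle period."""
    pings = idle_seconds // interval_seconds
    return pings * tokens_per_ping


def billed_range(visible_tokens: int, low: float = 3.0, high: float = 5.0):
    """Apply the 3-5x background multiplier to visible usage."""
    return int(visible_tokens * low), int(visible_tokens * high)


# Ten idle minutes with a 30-second heartbeat.
print(idle_heartbeat_tokens(10 * 60))
# What a session that "felt like" 100K tokens might actually bill.
print(billed_range(100_000))
```

Under these assumptions, the 400K dashboard figure in the example above sits in the middle of the range that `billed_range(100_000)` returns.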
21.5 Million Tokens in One Day
One user reported burning 21.5 million tokens in a single day of normal use. That is roughly $65-215 in API costs depending on the model, gone in 24 hours.
What went wrong:
- Long unbroken sessions. They never reset context, letting the snowball grow for hours.
- Verbose SOUL.md. Their configuration file was 3,000+ tokens of detailed instructions, adding weight to every single request.
- All skills enabled. Every skill definition shipped with every API call, whether they used that skill or not.
- Background tasks running. Autocomplete and heartbeats fired continuously while they read responses and thought about their next message.
- Tool-heavy workflow. Each tool invocation added results to context and triggered follow-up processing calls.
The user was not doing anything unusual. They were coding, asking questions, and running searches. The architecture just compounds costs in ways that are invisible until you check the bill.
Reports of 1-3 million tokens consumed within minutes of normal use are not uncommon. The users who notice are the ones watching their dashboards. Most people do not check until the invoice arrives.
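The $65-215 estimate for 21.5 million tokens can be checked with simple arithmetic. The per-million prices below ($3 and $10) are back-computed from that range and used as assumed blended rates. They are not quoted from any provider's price list, and real bills depend on the input/output split and the model.

```python
def cost_usd(tokens: int, usd_per_million: float) -> float:
    """Cost of a token volume at a flat blended price per million tokens."""
    return tokens / 1_000_000 * usd_per_million


DAILY_TOKENS = 21_500_000

# Assumed blended rates, inferred from the $65-215 range in the text.
for price in (3.0, 10.0):
    print(f"${price:.2f}/M tokens -> ${cost_usd(DAILY_TOKENS, price):,.2f} per day")
```

Plug in your own provider's rates to turn a dashboard token count into a dollar figure.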
5 Fixes to Cut Token Usage
1. Trim Your SOUL.md
Your SOUL.md file ships with every request. Every line costs tokens on every turn. Audit it ruthlessly.
Before: 1,800 tokens of detailed personality instructions, project history, and style guides.
After: 400 tokens of essential instructions only.
Savings: ~1,400 tokens per request. Over a 20-turn session that is 28,000 tokens, and with the 3-5x background multiplier it comes to roughly 84,000-140,000 tokens saved.
Keep only what changes OpenClaw’s behavior in ways you actually notice. Remove aspirational instructions that do not produce measurable differences in output quality.
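To decide what to cut, it helps to know which sections of the file are heaviest. The sketch below splits a markdown file on its headings and estimates tokens per section. It assumes about 4 characters per token, a common rule of thumb for English text and not a real tokenizer, so treat the numbers as relative weights. The sample text and function names are purely illustrative.

```python
CHARS_PER_TOKEN = 4  # rough heuristic, not an actual tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def section_costs(markdown: str) -> list[tuple[str, int]]:
    """Estimate tokens per heading-delimited section, largest first."""
    sections, title, body = [], "(preamble)", []
    for line in markdown.splitlines():
        if line.startswith("#"):
            if body:
                sections.append((title, estimate_tokens("\n".join(body))))
            title, body = line.lstrip("#").strip(), []
        else:
            body.append(line)
    if body:
        sections.append((title, estimate_tokens("\n".join(body))))
    return sorted(sections, key=lambda item: item[1], reverse=True)


# Illustrative stand-in for a real SOUL.md.
sample = ("# Style\n" + "Be concise. " * 40
          + "\n# History\n" + "Notes about an old project. " * 100)
for name, tokens in section_costs(sample):
    print(f"{name:<12}~{tokens} tokens")
```

To audit a real file, pass its contents instead of `sample`, for example the result of `pathlib.Path(...).read_text()` on wherever your SOUL.md lives.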
2. Disable Unused Skills
Every skill definition adds 50-200 tokens to the system prompt. If you have 15 skills enabled but only use 3, you are paying for 12 useless skill descriptions on every single API call.
Review your enabled skills and disable anything you have not used in the past week. You can always re-enable them when needed.
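The cost of unused skills is easy to estimate from the 50-200 tokens-per-skill range above. A minimal sketch, with a hypothetical helper name:

```python
def unused_skill_overhead(enabled: int, used: int, per_skill=(50, 200)):
    """Return (low, high) tokens wasted per request on unused skills."""
    unused = max(enabled - used, 0)
    return unused * per_skill[0], unused * per_skill[1]


# 15 skills enabled, 3 actually used.
low, high = unused_skill_overhead(enabled=15, used=3)
print(f"{low:,}-{high:,} wasted tokens on every request")
```

Multiply the result by the number of requests in a session, background calls included, to get the session-level waste.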
3. Reset Sessions After 3-5 Turns
Context snowballing is the single biggest cost driver. The fix is simple: start a new session before the snowball gets heavy.
After 3-5 productive turns, or immediately after receiving a long response, open a new session. You lose conversation continuity, but subsequent turns can cost 5-10x less.
If you need to reference earlier context, copy the key information into your new session rather than carrying 100K tokens of history.
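The effect of resetting can be sketched by summing request sizes over a session with and without resets. The model is deliberately simple and every input is an assumption: a fixed 9,600-token overhead, a constant number of tokens added to history per turn, and a short summary (`carry_over`) pasted into each fresh session.

```python
def session_total(turn_tokens, overhead=9_600, reset_every=None, carry_over=0):
    """Sum request sizes over a session.

    turn_tokens: tokens added to history on each turn (user + assistant).
    reset_every: start a fresh session after this many turns (None = never).
    carry_over:  tokens of summary pasted into each fresh session.
    """
    total, history = 0, 0
    for i, added in enumerate(turn_tokens):
        if reset_every and i > 0 and i % reset_every == 0:
            history = carry_over  # new session: only the summary survives
        history += added
        total += overhead + history
    return total


turns = [2_000] * 20  # assumed: 20 turns adding 2,000 tokens each
no_reset = session_total(turns)
with_reset = session_total(turns, reset_every=5, carry_over=300)
print(f"no reset:   {no_reset:,} tokens")
print(f"with reset: {with_reset:,} tokens")
```

The gap widens quickly when individual turns are large, which is why resetting right after a long response matters most.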
4. Disable Background Features
Turn off autocomplete, heartbeat checks, and automatic title generation if your setup allows it. Check your OpenClaw configuration for settings related to:
- Inline completions or suggestions
- Session keep-alive or heartbeat intervals
- Automatic conversation naming
Each disabled feature eliminates thousands of tokens per minute of background consumption.
5. Use /usage to Monitor in Real Time
Run /usage during your session to see current token consumption. Watch the numbers after each turn. If consumption is climbing faster than expected, you know context is snowballing and it is time to reset.
Make it a habit: check /usage every 3-4 messages. Catching a runaway session early prevents the worst cost spikes.
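If you jot down the context size after each turn, a tiny check can tell you when the session has jumped. This sketch makes no assumption about the output format of /usage. It expects numbers you copied by hand, and both the function name and the 2x threshold are arbitrary choices for illustration.

```python
def first_runaway_turn(readings, jump_factor=2.0):
    """Return the 1-based turn whose reading is at least jump_factor
    times the previous reading, or None if no such jump occurs."""
    for i in range(1, len(readings)):
        previous, current = readings[i - 1], readings[i]
        if previous > 0 and current >= jump_factor * previous:
            return i + 1
    return None


# Context sizes noted after each turn (here: the snowball table's totals).
readings = [9_770, 10_000, 10_325, 10_865, 130_900]
turn = first_runaway_turn(readings)
if turn is not None:
    print(f"Context jumped at turn {turn}: consider starting a new session")
```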
Try This Now
Run /status in your current OpenClaw session. Look at the context size. If it is over 100K tokens, start a new session immediately. You are paying 10x or more per message compared to a fresh session. A new session costs nothing. Staying in a bloated one costs you on every single turn.
Related Guides
- How to Cut OpenClaw API Costs breaks down provider-level strategies for reducing per-token pricing.
- OpenClaw Spending Limits shows how to set hard caps so runaway sessions cannot exceed your budget.
- From $600 to $20 Monthly walks through a complete cost reduction case study.
Keep Your Token Costs Under Control
Token overhead is not going away. It is baked into how OpenClaw works. But the difference between a user who understands context mechanics and one who does not is easily 5-10x in monthly costs.
Trim your SOUL.md, disable unused skills, reset sessions regularly, kill background tasks, and monitor with /usage. Those five changes alone will cut most users’ consumption by 60-80%.