Why does OpenClaw use so many tokens for a simple question?

OpenClaw sends approximately 8,000 tokens of system instructions, SOUL.md configuration, skill definitions, and memory context with every single request. A simple question adds only ~20 user tokens, but the total cost is 9,600+ tokens because of this overhead.

How does OpenClaw token usage grow with each conversation turn?

OpenClaw accumulates the full conversation history with each turn. By turn 5, the context includes all previous messages plus the system prompt, costing 13.3x more than the first message. This is called context snowballing.

What are background tasks in OpenClaw and how do they affect token usage?

Background tasks include heartbeat checks, autocomplete suggestions, and conversation title generation. These run automatically without user action and can inflate total token consumption by 3-5x beyond what your actual conversations use.

How can I reduce OpenClaw token usage?

Five proven fixes: trim your SOUL.md file, disable unused skills, reset sessions every 3-5 turns, disable background features like autocomplete and title generation, and use the /usage command to monitor consumption in real time.

Is it possible to use 21 million OpenClaw tokens in one day?

Yes. One user burned 21.5 million tokens in a single day through a combination of long unbroken sessions, verbose SOUL.md configuration, all skills enabled, and background tasks running continuously. Most of that consumption was invisible overhead, not productive work.

Guide April 6, 2026

Why OpenClaw Uses 9,600 Tokens for a Simple Question (And How to Fix It)

A simple 'What model are you?' question in OpenClaw costs 9,600+ tokens. That is because OpenClaw sends ~8,000 tokens of system instructions, skills, and context with EVERY single request. After 5 conversation turns, the cost is 13.3x the first message. Here is exactly where every token goes and how to stop the bleeding.

TL;DR

Every OpenClaw message carries ~8,000 tokens of invisible overhead. A 20-token question actually costs 9,600+ tokens. After 5 turns, context snowballing pushes costs to 13.3x the first message. Background tasks add another 3-5x on top. Five targeted fixes below can cut your token usage by 60-80%.

Where 9,600 Tokens Go

You type “What model are you?” That is roughly 7 tokens. OpenClaw charges you 9,600+. Here is the breakdown of what gets sent to the API alongside your tiny question:

Component	Approximate Tokens	Purpose
System prompt	~3,200	Core instructions, safety rules, behavior definitions
SOUL.md	~1,800	Your custom personality, preferences, project context
Skill definitions	~1,500	Descriptions for every enabled skill (tools, integrations)
Memory context	~800	Persistent memory entries loaded from previous sessions
Conversation scaffolding	~280	Message formatting, role tags, structural tokens
Your actual question	~7	“What model are you?”
Total	~9,585	What the API actually processes and bills

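Your 7 tokens of real content represent about 0.07% of the total request. The other 99.93% is overhead that OpenClaw sends on your behalf, silently, every time. Note that the itemized rows above add up to roughly 7,600 tokens; the remaining ~2,000 tokens of the ~9,585 total are not broken out in this table.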
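If you want to check the arithmetic yourself, here is a minimal sketch. The numbers are the approximate estimates from the table above, not measurements, and the variable names are just illustrative. It sums the itemized rows and computes how small a share of the stated total your question really is.

```python
# Approximate token estimates copied from the breakdown table above.
components = {
    "system_prompt": 3_200,
    "soul_md": 1_800,
    "skill_definitions": 1_500,
    "memory_context": 800,
    "conversation_scaffolding": 280,
    "user_question": 7,
}

REPORTED_TOTAL = 9_585  # total stated in the table

# Sum of the rows that the table actually itemizes.
itemized_total = sum(components.values())

# Share of the stated total that is your own text.
user_share = components["user_question"] / REPORTED_TOTAL

print(f"Itemized rows sum to {itemized_total:,} tokens")
print(f"User question share of stated total: {user_share:.2%}")
for name, tokens in components.items():
    print(f"{name:26s} {tokens:>6,}  {tokens / REPORTED_TOTAL:6.1%}")
```

Swap in your own estimates to see which component dominates your setup.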

This is not a bug. It is how OpenClaw maintains its personality, remembers your preferences, and knows which tools are available. But understanding it is the first step to controlling it.

The Context Snowball

The system prompt overhead stays constant. What grows is the conversation history. OpenClaw sends the entire conversation back to the API with every new message. Here is how costs accumulate across a typical session:

Turn	New User Tokens	New Assistant Tokens	Cumulative Context	Total Request Size
1	20	150	170	9,770
2	30	200	400	10,000
3	25	300	725	10,325
4	40	500	1,265	10,865
5	35	120,000	121,300	130,900

Turn 5 is not hypothetical. That is what happens when you ask OpenClaw to write a long document, refactor a file, or generate detailed output. One verbose response can balloon the context from manageable to massive.

By turn 5 with a single long response, you are paying 13.3x what the first message cost. Every subsequent turn carries all of that weight forward.

The math is simple but brutal: if turn 1 costs $0.01, turn 5 costs about $0.13. In a 20-turn session, each late turn can cost 50-100x what the first message did.
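The snowball table can be reproduced with a short simulation. This is a sketch under one assumption taken from the table itself: a fixed overhead of 9,600 tokens per request (9,770 minus the 170 tokens of turn 1). The function name and structure are illustrative, not anything OpenClaw exposes.

```python
BASE_OVERHEAD = 9_600  # fixed per-request overhead implied by the table (9,770 - 170)

# (new user tokens, new assistant tokens) for each turn, from the table above
turns = [(20, 150), (30, 200), (25, 300), (40, 500), (35, 120_000)]


def request_sizes(turns, base_overhead=BASE_OVERHEAD):
    """Return the total request size after each turn.

    History only ever grows: every turn adds its user and assistant
    tokens to the context that is resent on the next request.
    """
    sizes = []
    history = 0
    for user_tokens, assistant_tokens in turns:
        history += user_tokens + assistant_tokens
        sizes.append(base_overhead + history)
    return sizes


sizes = request_sizes(turns)
ratio = sizes[-1] / sizes[0]

for turn, size in enumerate(sizes, start=1):
    print(f"turn {turn}: {size:,} tokens")
print(f"turn 5 vs turn 1: {ratio:.1f}x")
```

Dividing the turn 5 request by the turn 1 request lands at roughly 13x, in line with the figure quoted above. Replace the last tuple with a short response and the ratio collapses, which is the whole argument for resetting after long outputs.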

The Background Drain

Your visible conversation is only part of the story. OpenClaw runs background tasks that consume tokens without you sending a single message:

Heartbeat checks. OpenClaw periodically pings the API to verify your session is active. Each ping carries the full system prompt overhead. If heartbeats fire every 30 seconds, that is 9,600 tokens every half minute you are idle.

Autocomplete suggestions. As you type, OpenClaw may send partial input to generate inline suggestions. Each suggestion request carries the full context. Type slowly and you could trigger dozens of these per message.

Conversation title generation. After your first message, OpenClaw sends a separate API call to generate a title for the conversation. Another 9,600+ token request you never asked for.

Tool result processing. When OpenClaw runs a skill (file search, web lookup, code execution), the results get added to context and often trigger a follow-up API call to summarize or format the output.

Combined, these background tasks inflate your actual token usage by 3-5x. You think you used 100K tokens in a session. The API dashboard says 400K. The difference is background overhead.
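To get a feel for the scale, here is a rough back-of-the-envelope sketch. It assumes the hypothetical 30-second heartbeat interval and the 9,600-token ping size mentioned above; your actual interval and overhead may differ, and both function names are made up for illustration.

```python
def idle_heartbeat_tokens(idle_seconds, interval_seconds=30, tokens_per_ping=9_600):
    """Tokens consumed by heartbeats alone while you are idle.

    Assumes one ping per interval, each carrying the full overhead.
    """
    pings = idle_seconds // interval_seconds
    return pings * tokens_per_ping


def billed_range(visible_tokens, low_multiplier=3, high_multiplier=5):
    """Range of billed tokens if background tasks inflate usage by 3-5x."""
    return visible_tokens * low_multiplier, visible_tokens * high_multiplier


ten_minutes = idle_heartbeat_tokens(10 * 60)
one_hour = idle_heartbeat_tokens(60 * 60)
low, high = billed_range(100_000)

print(f"10 idle minutes: {ten_minutes:,} tokens")
print(f"1 idle hour:     {one_hour:,} tokens")
print(f"100K visible tokens -> {low:,} to {high:,} billed")
```

Under these assumptions, reading a long response for ten minutes costs more than the conversation that produced it.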

21.5 Million Tokens in One Day

One user reported burning 21.5 million tokens in a single day of normal use. That is roughly $65-215 in API costs depending on the model, gone in 24 hours.
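The dollar range follows directly from per-token pricing. The sketch below works backwards from the article's own numbers to the blended price they imply; it is derived arithmetic, not a quote from any provider's price list, and the helper names are illustrative.

```python
TOKENS_USED = 21_500_000


def api_cost(tokens, usd_per_million):
    """Cost in USD for a token count at a flat blended price."""
    return tokens / 1_000_000 * usd_per_million


def implied_price(cost_usd, tokens):
    """Blended USD price per million tokens implied by a cost and a token count."""
    return cost_usd / (tokens / 1_000_000)


low_price = implied_price(65, TOKENS_USED)
high_price = implied_price(215, TOKENS_USED)

print(f"Implied price: ${low_price:.2f} to ${high_price:.2f} per million tokens")
print(f"At $5 per million: ${api_cost(TOKENS_USED, 5):.2f} for the day")
```

Plug in your own model's blended rate to see what a day like this would cost you.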

What went wrong:

Long unbroken sessions. They never reset context, letting the snowball grow for hours.
Verbose SOUL.md. Their configuration file was 3,000+ tokens of detailed instructions, adding weight to every single request.
All skills enabled. Every skill definition shipped with every API call, whether they used that skill or not.
Background tasks running. Autocomplete and heartbeats fired continuously while they read responses and thought about their next message.
Tool-heavy workflow. Each tool invocation added results to context and triggered follow-up processing calls.

The user was not doing anything unusual. They were coding, asking questions, and running searches. The architecture just compounds costs in ways that are invisible until you check the bill.

Reports of 1-3 million tokens consumed within minutes of normal use are not uncommon. The users who notice are the ones watching their dashboards. Most people do not check until the invoice arrives.

5 Fixes to Cut Token Usage

1. Trim Your SOUL.md

Your SOUL.md file ships with every request. Every line costs tokens on every turn. Audit it ruthlessly.

Before: 1,800 tokens of detailed personality instructions, project history, and style guides.

After: 400 tokens of essential instructions only.

Savings: ~1,400 tokens per request. Over a 20-turn session that is 28,000 tokens from the conversation alone; with background tasks multiplying traffic by 3-5x, the total saved is roughly 84,000-140,000 tokens.

Keep only what changes OpenClaw’s behavior in ways you actually notice. Remove aspirational instructions that do not produce measurable differences in output quality.
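One way to audit the file is to estimate how many tokens each section costs and cut from the top of the list. The sketch below is a rough approximation: it assumes SOUL.md uses Markdown-style `#` headings and uses a four-characters-per-token rule of thumb for English text, which real tokenizers only loosely follow. The sample text is made up.

```python
CHARS_PER_TOKEN = 4  # rough rule of thumb for English text; real tokenizers differ


def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return round(len(text) / CHARS_PER_TOKEN)


def tokens_by_section(markdown):
    """Return (heading, estimated tokens) pairs, largest section first.

    Any line starting with '#' is treated as a heading, so '#' comments
    inside code blocks would be miscounted as headings.
    """
    sections = []
    heading, lines = "(before first heading)", []
    for line in markdown.splitlines():
        if line.startswith("#"):
            if lines:
                sections.append((heading, estimate_tokens("\n".join(lines))))
            heading, lines = line.lstrip("#").strip(), []
        else:
            lines.append(line)
    if lines:
        sections.append((heading, estimate_tokens("\n".join(lines))))
    return sorted(sections, key=lambda pair: pair[1], reverse=True)


# Made-up sample standing in for a real SOUL.md file.
sample = (
    "# Style\n"
    + "Be concise. " * 10
    + "\n# Project history\n"
    + "We migrated the service in 2023. " * 40
)

report = tokens_by_section(sample)
for heading, tokens in report:
    print(f"{tokens:>6}  {heading}")
```

To run it on your own file, read the text with `pathlib.Path(...).read_text()` from wherever your SOUL.md lives and pass it to `tokens_by_section`.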

2. Disable Unused Skills

Every skill definition adds 50-200 tokens to the system prompt. If you have 15 skills enabled but only use 3, you are paying for 12 useless skill descriptions, roughly 600-2,400 tokens, on every single API call.

Review your enabled skills and disable anything you have not used in the past week. You can always re-enable them when needed.
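The one-week rule is easy to apply if you keep even a rough note of when you last used each skill. The sketch below assumes you track that yourself; the skill names and dates are hypothetical, and nothing here reads from OpenClaw.

```python
from datetime import date, timedelta


def stale_skills(last_used, today, max_idle_days=7):
    """Return skills not used within the last `max_idle_days`, sorted by name.

    `last_used` maps skill name -> date of last use, or None if never used.
    """
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(
        name for name, used in last_used.items() if used is None or used < cutoff
    )


def wasted_tokens_per_call(unused_count, low=50, high=200):
    """Token range spent per request on unused skill definitions (50-200 each)."""
    return unused_count * low, unused_count * high


# Hypothetical usage notes; replace with your own.
last_used = {
    "file_search": date(2026, 4, 5),
    "web_lookup": date(2026, 4, 1),
    "code_execution": date(2026, 3, 20),
    "calendar": None,
}

to_disable = stale_skills(last_used, today=date(2026, 4, 6))
low, high = wasted_tokens_per_call(len(to_disable))

print("Candidates to disable:", ", ".join(to_disable))
print(f"Overhead per request: {low}-{high} tokens")
```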

3. Reset Sessions After 3-5 Turns

Context snowballing is the single biggest cost driver. The fix is simple: start a new session before the snowball gets heavy.

After 3-5 productive turns, or immediately after receiving a long response, open a new session. You lose conversation continuity but cut token costs for subsequent turns by 5-10x.

If you need to reference earlier context, copy the key information into your new session rather than carrying 100K tokens of history.
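The reset rule above boils down to two checks. Here is a minimal sketch of it as a function. The 5-turn limit comes from the guideline above; the 5,000-token cutoff for a "long response" is an assumption picked for illustration, so tune it to your own workflow.

```python
def should_reset(turns_so_far, last_response_tokens,
                 max_turns=5, long_response_tokens=5_000):
    """Decide whether to start a fresh session.

    Reset immediately after a long response, otherwise once the
    session reaches `max_turns` turns.
    """
    if last_response_tokens >= long_response_tokens:
        return True
    return turns_so_far >= max_turns


print(should_reset(turns_so_far=2, last_response_tokens=300))      # short session, short reply
print(should_reset(turns_so_far=2, last_response_tokens=120_000))  # one huge reply
print(should_reset(turns_so_far=5, last_response_tokens=300))      # turn limit reached
```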

4. Disable Background Features

Turn off autocomplete, heartbeat checks, and automatic title generation if your setup allows it. Check your OpenClaw configuration for settings related to:

Inline completions or suggestions
Session keep-alive or heartbeat intervals
Automatic conversation naming

Each disabled feature eliminates thousands of tokens per minute of background consumption.

5. Use /usage to Monitor in Real Time

Run /usage during your session to see current token consumption. Watch the numbers after each turn. If consumption is climbing faster than expected, you know context is snowballing and it is time to reset.

Make it a habit: check /usage every 3-4 messages. Catching a runaway session early prevents the worst cost spikes.
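If you jot down the context size each time you check, spotting a runaway session is a one-line comparison. The sketch below assumes you record the readings by hand; it does not parse any OpenClaw output, and the doubling threshold is an arbitrary choice for illustration.

```python
def snowball_alerts(readings, jump_factor=2.0):
    """Return 1-based positions of readings that grew by `jump_factor` or more.

    `readings` is a list of context sizes in tokens, in the order recorded.
    """
    alerts = []
    for index in range(1, len(readings)):
        previous, current = readings[index - 1], readings[index]
        if previous > 0 and current >= previous * jump_factor:
            alerts.append(index + 1)
    return alerts


# Readings taken from the snowball table earlier in this guide.
readings = [9_770, 10_000, 10_325, 10_865, 130_900]
alerts = snowball_alerts(readings)

print("Reset recommended after reading(s):", alerts)
```

Steady growth of a few hundred tokens per turn passes quietly; the single long response is what trips the alert.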

Try This Now

Run /status in your current OpenClaw session. Look at the context size. If it is over 100K tokens, start a new session immediately. You are paying 10x or more per message compared to a fresh session. A new session costs nothing. Staying in a bloated one costs you on every single turn.

How to Cut OpenClaw API Costs breaks down provider-level strategies for reducing per-token pricing.
OpenClaw Spending Limits shows how to set hard caps so runaway sessions cannot exceed your budget.
From $600 to $20 Monthly walks through a complete cost reduction case study.

Keep Your Token Costs Under Control

Token overhead is not going away. It is baked into how OpenClaw works. But the difference between a user who understands context mechanics and one who does not is easily 5-10x in monthly costs.

Trim your SOUL.md, disable unused skills, reset sessions regularly, kill background tasks, and monitor with /usage. Those five changes alone will cut most users’ consumption by 60-80%.
