Vine

Save Tokens

The biggest levers for cheaper, faster sessions.

What it is

Claude Code burns tokens on every message, every tool call, and every file read. The more context is loaded, the more expensive and the slower each response gets. These are the levers that actually move the number: no micro-optimizations, just the changes that make sessions measurably cheaper.


Keep CLAUDE.md small

CLAUDE.md ships with every single message, so stay under 100 lines. Only rules that apply to every interaction belong here. Project-specific rules go in a separate CLAUDE.md inside that project's folder.
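A line budget like this is easy to check automatically. The sketch below is one possible way to do it, not part of Claude Code itself: the function names are hypothetical, and the default of 100 simply mirrors the rule of thumb above.

```python
from pathlib import Path


def lines_over_budget(text: str, limit: int = 100) -> int:
    """Return how many lines the text exceeds the limit by (0 if within budget)."""
    line_count = len(text.splitlines())
    return max(0, line_count - limit)


def check_file(path: str, limit: int = 100) -> int:
    """Same check, reading the text from a file such as CLAUDE.md."""
    return lines_over_budget(Path(path).read_text(encoding="utf-8"), limit)
```

Running something like check_file("CLAUDE.md") before a commit would flag the file as soon as it grows past the budget. Line count is only a rough proxy for tokens, but it is cheap to measure.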

One CLAUDE.md per project

Claude Code auto-loads the CLAUDE.md from the current directory. That way Claude only gets the context that's actually relevant, not everything at once.

Structure your memory system

Memory files do NOT load automatically. Only MEMORY.md loads, and it serves as the index. Keep the index short (under 200 lines) and put details in separate files. No duplicates. Clean it out regularly.
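The "no duplicates" rule can be checked with a small script. This is a minimal sketch under assumptions: it treats each non-empty line as one memory entry and compares entries by exact text after trimming whitespace. The file names in the usage example are hypothetical.

```python
from collections import defaultdict


def find_duplicates(files: dict[str, str]) -> dict[str, list[str]]:
    """Map each entry that appears in more than one file to the files containing it.

    `files` maps a file name to that file's full text.
    """
    seen: dict[str, set[str]] = defaultdict(set)
    for name, text in files.items():
        for line in text.splitlines():
            entry = line.strip()
            if entry:  # skip blank lines
                seen[entry].add(name)
    return {entry: sorted(names) for entry, names in seen.items() if len(names) > 1}


# Usage with hypothetical memory files:
memory = {
    "MEMORY.md": "- uses pnpm\n- see build.md for build notes",
    "build.md": "- uses pnpm\n\n- node 20",
}
duplicates = find_duplicates(memory)
```

Here duplicates would point at "- uses pnpm", which lives in both files and can be deleted from one of them.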

Use /compact and /clear often

  • /compact: compresses your context so far. Rule of thumb: run it once a session passes about 30 messages.
  • /clear: fresh start. Use it when switching topics.

Subagents instead of main context

Every piece of research, exploration, or parallel analysis goes into a subagent. The subagent has its own context and only returns the result. Your main window stays clean.

Plan Mode before code

A short plan up front saves 3–5x the tokens the plan itself costs. Without a plan, Claude writes, corrects, and rewrites, and every pass costs tokens.

Grep before Read

Never load an entire file when you need one specific spot. Grep first, then read with offset/limit. On a 500-line file, reading a 50-line window instead of the whole thing cuts what you load by about 90%.
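To make the pattern concrete, here is a minimal sketch of "grep, then read a window" in plain Python. These helpers are hypothetical stand-ins for illustration, not Claude Code's actual tools, and they count lines as a rough proxy for tokens.

```python
import re


def grep_lines(text: str, pattern: str) -> list[int]:
    """Return the 1-based numbers of lines that match the regex pattern."""
    rx = re.compile(pattern)
    return [n for n, line in enumerate(text.splitlines(), start=1) if rx.search(line)]


def read_window(text: str, offset: int, limit: int) -> list[str]:
    """Return up to `limit` lines starting at 1-based line `offset`."""
    lines = text.splitlines()
    return lines[offset - 1 : offset - 1 + limit]


def fraction_saved(total_lines: int, window_lines: int) -> float:
    """Share of the file that is NOT loaded when only a window is read."""
    return 1 - window_lines / total_lines


# Usage: a synthetic 500-line file with the interesting spot at line 250.
text = "\n".join(f"line {i}" for i in range(1, 501))
hits = grep_lines(text, r"^line 250$")
window = read_window(text, offset=hits[0] - 10, limit=50)
saved = fraction_saved(500, len(window))
```

The search itself is cheap because it returns only line numbers. The expensive step, loading content into context, then touches 50 lines instead of 500.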

Sonnet vs Opus vs Haiku, on purpose

  • Haiku: mini tasks, quick summaries.
  • Sonnet: 90% of everything. Code, edits, research.
  • Opus: only for deep architecture calls, nasty bugs, or when Sonnet fails multiple times.

Switch with /model in chat.
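The routing rule above can be written down as a tiny decision function. This is only a sketch of the heuristic: the task labels are hypothetical, "multiple times" is assumed to mean two or more failed attempts, and the return values are tier names rather than real model identifiers.

```python
# Hypothetical task labels, chosen for illustration only.
OPUS_TASKS = {"architecture", "nasty_bug"}
HAIKU_TASKS = {"mini_task", "quick_summary"}


def pick_tier(task: str, failed_attempts: int = 0) -> str:
    """Pick a model tier for a task, escalating after repeated failures."""
    # Assumption: "fails multiple times" means at least two failed attempts.
    if task in OPUS_TASKS or failed_attempts >= 2:
        return "opus"
    if task in HAIKU_TASKS:
        return "haiku"
    # Default tier for code, edits, and research.
    return "sonnet"
```

The point of writing it out is the default branch: everything that is not explicitly tiny or explicitly hard lands on Sonnet, and Opus is reached only by exception.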

Pro tip

The biggest lever usually isn't model choice. It's context hygiene: CLAUDE.md under 100 lines, subagents instead of everything-in-main, /compact before it gets sluggish. Skip that and you pay 3–5x more, no matter which model you pick.


🚀 Inside the community I share my full setup, including prompt strategies that save tokens without hurting output. → Join the Claude Code Mastery Community

Updated regularly. Follow @vine.codes for more.
