A note on Context Windows


I break (and occasionally build) apps for Android. I sometimes work on backend as well. When I'm not coding, I can most likely be found watching a TV Show, a movie, playing video games or reading a book.

In long agent conversations, developers often lose track of how much context they have used. Tools like Claude Code and Codex have auto-compaction built in, but they compact too late. Opus 4.8 has a million-token context window, but retrieval quality falls off after about 200k tokens. Opus 4.7 was actually a huge regression here; 4.8 has a smaller drop-off. The bigger effect, though, is cost. This comes straight from the horse's mouth: longer context == more cost.
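To make the cost point concrete, here is a rough sketch of why one long conversation costs more than several short ones. It assumes the whole history is resent as input on every turn and ignores prompt caching and compaction. The function name and the token counts are made up for illustration; they are not measurements or real pricing.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int, base_tokens: int = 0) -> int:
    """Total input tokens sent over a conversation when every turn
    resends the whole history (assumes no caching and no compaction)."""
    total = 0
    context = base_tokens  # system prompt, tool descriptions, etc.
    for _ in range(turns):
        context += tokens_per_turn  # the history grows each turn
        total += context            # and the full context is sent again
    return total


# Hypothetical numbers: 40 turns of work at ~5k new tokens per turn.
one_long = cumulative_input_tokens(turns=40, tokens_per_turn=5_000)
# The same 40 turns split into four fresh 10-turn conversations.
four_short = 4 * cumulative_input_tokens(turns=10, tokens_per_turn=5_000)

print(one_long)    # 4100000
print(four_short)  # 1100000
```

Under these assumptions, total input grows roughly with the square of the number of turns, which is why resetting the conversation pays off so quickly.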

A few things that have worked for me:

  • Use /context or /status to see where you currently are.

  • Use /statusline to add context usage and other information to the bottom of your terminal.

  • /compact often, in both Codex and Claude Code

  • Start new conversations frequently (for example, one per bugfix). Use the /new or /clear command.

  • Keep a small AGENTS.md/CLAUDE.md file. Avoid repeating instructions.

  • Use fewer and smaller MCPs, skills and subagents. Every skill and subagent you have is injected into the initial prompt, inflating context. MCPs are loaded on demand, but their descriptions are still sent so the model can decide when to use them. By 3-4 turns you're already exceeding the optimal window. I have seen people use 20+ MCPs! Don't do that. Think about what needs to be scoped to the user vs the project.

  • Spawn subagents. They return small, concise results to your main agent, preventing context inflation. Also better retrieval! Most agents spawn them without your intervention.
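As a back-of-the-envelope check on the MCP point, the sketch below estimates how many turns fit under the ~200k mark mentioned earlier once fixed overhead is subtracted. Every number here (tokens per MCP, per turn, per instruction file) is a hypothetical placeholder, and `Overhead` and `turns_until_limit` are illustrative names, not part of any tool. Check your real numbers with /context.

```python
from dataclasses import dataclass


@dataclass
class Overhead:
    name: str
    tokens: int  # estimated tokens this item adds to every request


def turns_until_limit(overheads: list[Overhead], tokens_per_turn: int, limit: int) -> int:
    """Number of full turns that fit before the context passes `limit`."""
    fixed = sum(o.tokens for o in overheads)
    if fixed >= limit:
        return 0
    return (limit - fixed) // tokens_per_turn


# Hypothetical setups; all token counts are guesses for illustration.
heavy = [Overhead("AGENTS.md", 2_000), Overhead("skills", 5_000)]
heavy += [Overhead(f"mcp-{i}", 3_000) for i in range(20)]

lean = [Overhead("AGENTS.md", 1_000)]
lean += [Overhead(f"mcp-{i}", 3_000) for i in range(3)]

print(turns_until_limit(heavy, tokens_per_turn=30_000, limit=200_000))  # 4
print(turns_until_limit(lean, tokens_per_turn=30_000, limit=200_000))   # 6
```

The exact figures matter less than the shape: fixed overhead is paid on every request, so trimming it buys extra turns before quality and cost start to suffer.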
