A note on Context Windows

In long agent conversations, developers often lose track of how much context has been used. Tools like Claude Code or Codex have auto-compaction built in, but they compact too late. Opus 4.8 has a million-token context window, but retrieval quality falls off after about 200k. Opus 4.7 was actually a huge regression here; 4.8 has a smaller drop-off. The bigger effect, though, is the cost. This comes straight from the horse's mouth: longer context == more cost.
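A rough sketch of why cost climbs faster than conversation length. It assumes every request resends the full history so far, and it ignores prompt caching and output tokens, both of which change the real bill. The function name and the 2,000-token turn size are made up for illustration.

```python
def cumulative_input_tokens(turn_sizes, fixed_overhead=0):
    """Sum the input tokens sent over a whole conversation.

    Assumption: each request resends the entire history plus a fixed
    overhead (system prompt, tool descriptions). Prompt caching and
    output-token pricing are deliberately ignored.
    """
    total = 0
    context = fixed_overhead
    for size in turn_sizes:
        context += size   # history grows by this turn's tokens
        total += context  # the whole history is sent again
    return total


# Hypothetical workload: ten turns of 2,000 tokens each.
one_long = cumulative_input_tokens([2_000] * 10)
two_short = 2 * cumulative_input_tokens([2_000] * 5)
print(one_long, two_short)  # 110000 60000
```

Under these assumptions, the same ten turns cost roughly half as many input tokens when split across two fresh conversations, because total input grows with the square of the number of turns.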
A few things that have worked for me:
- Use /context or /status to see where you currently are.
- Use /statusline to add context and more information to the bottom of your terminal.
- /compact often, in both Codex and Claude Code.
- Start new conversations frequently (like one per bugfix). Use the /new or /clear command.
- Keep a small AGENTS.md/CLAUDE.md file. Avoid repeating instructions.
- Use fewer and smaller MCPs, skills and subagents. Every skill and subagent you have is injected into the initial prompt, inflating context. MCPs are loaded on demand, but their descriptions are still sent so the model can decide when to use them. By 3-4 turns you're already exceeding the optimal window. I have seen people use 20+ MCPs! Don't do that. Think about what needs to be scoped to the user vs the project.
- Spawn subagents. They return small, concise results to your main agent, preventing context inflation. Also better retrieval! Most agents spawn them without your intervention.
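To make the overhead point concrete, here is a back-of-the-envelope sketch of how many turns fit before the context passes the roughly 200k mark mentioned earlier. The helper name, the overhead figures, and the 20,000 tokens per turn are all hypothetical; use /context to get real numbers for your own setup.

```python
def turns_before_limit(fixed_overhead, tokens_per_turn, limit=200_000):
    """Count the full turns that fit before context exceeds `limit`.

    fixed_overhead: tokens present from the first request (instructions
    file, skill and subagent definitions, MCP descriptions).
    tokens_per_turn: average tokens each turn adds to the history.
    """
    if tokens_per_turn <= 0:
        raise ValueError("tokens_per_turn must be positive")
    remaining = limit - fixed_overhead
    return max(0, remaining // tokens_per_turn)


# Hypothetical setups: a lean config vs. one loaded with MCPs and skills.
lean = turns_before_limit(fixed_overhead=5_000, tokens_per_turn=20_000)
heavy = turns_before_limit(fixed_overhead=60_000, tokens_per_turn=20_000)
print(lean, heavy)  # 9 7
```

The exact numbers matter less than the shape: whatever is injected up front is paid on every turn and directly shortens the useful part of the conversation.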



