# A note on Context Windows

In long agent conversations, developers often lose track of how much context they have used. Tools like Claude Code and Codex have auto-compaction built in, but they compact too late. Opus 4.8 has a million-token context window, but retrieval quality falls off after about 200k tokens. Opus 4.7 was actually a huge regression here; the drop-off in 4.8 is smaller. The bigger effect, though, is cost. This comes straight from the horse's mouth: longer context == more cost.

![](https://cdn.hashnode.com/uploads/covers/617d6f4ed55bde5cb66815ee/484ab6e6-80bd-4b65-b183-d6a049903514.png align="center")
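To make the cost point concrete, here is a rough back-of-the-envelope sketch. It assumes every request resends the full conversation history as input, and it ignores prompt caching and output tokens, both of which change the real bill. The function name and the 5,000-tokens-per-turn figure are made up for illustration.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int, fixed_overhead: int = 0) -> int:
    """Total input tokens sent over a conversation where each request resends the whole history."""
    total = 0
    history = fixed_overhead  # system prompt, instruction files, tool descriptions
    for _ in range(turns):
        history += tokens_per_turn  # illustrative: new message, tool output, previous reply
        total += history            # the whole history goes out as input on this turn
    return total


# One 40-turn conversation vs. four fresh 10-turn conversations (same total number of turns).
one_long = cumulative_input_tokens(turns=40, tokens_per_turn=5_000)
four_short = 4 * cumulative_input_tokens(turns=10, tokens_per_turn=5_000)

print(one_long)    # 4100000
print(four_short)  # 1100000
```

Under these assumptions the single long conversation sends roughly 3.7x as many input tokens for the same number of turns, because the total grows quadratically with conversation length.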

A few things that have worked for me:

*   Use `/context` or `/status` to see where you currently are.
    
*   Use `/statusline` to show context usage and other information at the bottom of your terminal.
    

![](https://cdn.hashnode.com/uploads/covers/617d6f4ed55bde5cb66815ee/574eac92-888d-4b9c-b63f-b12ca1796ac3.png align="center")

*   Run `/compact` often, in both Codex and Claude Code.
    
*   Start new conversations frequently (for example, one per bugfix) with the `/new` or `/clear` command.
    
*   Keep a small `AGENTS.md`/`CLAUDE.md` file. Avoid repeating instructions.
    
*   Use fewer and smaller MCPs, skills, and subagents. Every skill and subagent you have is injected into the initial prompt, inflating context. MCPs are loaded on demand, but their descriptions are still sent so the model can decide when to use them. By 3-4 turns you're already exceeding the optimal window. I have seen people use 20+ MCPs! Don't do that. Think about what needs to be scoped to the user vs. the project.
    
*   Spawn subagents. They return a small, concise result to your main agent, preventing context inflation. You also get better retrieval! Most agents spawn them without your intervention.
    

![](https://cdn.hashnode.com/uploads/covers/617d6f4ed55bde5cb66815ee/624dea1a-6691-4d43-a816-8d1d4f6838e5.png align="center")
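To get a feel for how much the fixed part of the prompt weighs before you type anything, here is a minimal sketch that estimates the tokens taken up by instruction files and tool descriptions. It uses the common rule of thumb of about 4 characters per token, which is only an approximation and not a real tokenizer. The function names and the sample descriptions are hypothetical; the `/context` command mentioned above remains the authoritative number.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough heuristic for English text, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def fixed_overhead(instruction_files: list[str], tool_descriptions: list[str]) -> int:
    """Estimate tokens sent on every request before any conversation happens."""
    total = 0
    for name in instruction_files:
        path = Path(name)
        if path.is_file():  # skip files that do not exist in this project
            total += estimate_tokens(path.read_text(encoding="utf-8"))
    total += sum(estimate_tokens(d) for d in tool_descriptions)
    return total


# Hypothetical input: 20 MCP tools with ~300-character descriptions each.
descriptions = ["Hypothetical tool description. " * 10] * 20
print(fixed_overhead(["AGENTS.md", "CLAUDE.md"], descriptions))
```

Running this in a project root gives a ballpark for the overhead that every turn pays again, which is a useful sanity check before adding another MCP or growing the instruction file.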