A note on Context Windows

In long agent conversations, developers often lose track of how much context has been used. Tools like Claude Code and Codex have auto-compaction built in, but they compact too late. Opus 4.8 has a million-token context window, but retrieval quality falls off after about 200k tokens. Opus 4.7 was actually a big regression here; 4.8 drops off less. The bigger effect, though, is cost. This comes straight from the horse's mouth: longer context means more cost.
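To make the cost point concrete, here is a rough back-of-the-envelope sketch. It assumes the whole conversation is sent again as input on every turn and that the context grows by a fixed amount per turn. The function name and all the numbers are made up for illustration, and it ignores prompt caching, which lowers the price of repeated tokens but does not remove them.

```python
def cumulative_input_tokens(turns: int, base_tokens: int, tokens_per_turn: int) -> int:
    """Total input tokens sent over a conversation in which every turn
    resends the full history (hypothetical model, no caching discount)."""
    total = 0
    context = base_tokens  # system prompt, instructions file, tool descriptions
    for _ in range(turns):
        context += tokens_per_turn  # new message, tool output and reply
        total += context            # the whole context goes in again
    return total


# Hypothetical numbers: 10k tokens of baseline, 5k tokens added per turn.
one_long = cumulative_input_tokens(turns=40, base_tokens=10_000, tokens_per_turn=5_000)
four_short = 4 * cumulative_input_tokens(turns=10, base_tokens=10_000, tokens_per_turn=5_000)

print(f"one 40-turn conversation:   {one_long:,} input tokens")
print(f"four 10-turn conversations: {four_short:,} input tokens")
```

Under these assumptions the single long conversation sends three times as many input tokens as the same 40 turns split into four fresh conversations, because the total grows roughly with the square of the number of turns.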

A few things that have worked for me:

Use /context or /status to see where you currently are.
Use /statusline to show context usage and other information at the bottom of your terminal.

Run /compact often, in both Codex and Claude Code.
Start new conversations frequently (like one per bugfix). Use the /new or /clear command.
Keep a small AGENTS.md/CLAUDE.md file. Avoid repeating instructions.
Use fewer and smaller MCPs, skills and subagents. Every skill and subagent you have is injected into the initial prompt, inflating context. MCP tools are called on demand, but their descriptions are still sent so the model can decide when to use them. By turn 3 or 4 you can already be past the optimal window. I have seen people use 20+ MCPs! Don't do that. Think about what needs to be scoped to the user vs the project.
Spawn subagents. They return a small, concise result to your main agent, which prevents context inflation and also improves retrieval. Most agents spawn them without your intervention.
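
If you want a feel for how much context is gone before you type anything, a small estimate like the one below can help. This is only a sketch: the 4-characters-per-token ratio is a rough assumption (real tokenizers differ), and the item names and sizes are hypothetical stand-ins for your own instructions file and tool descriptions.

```python
CHARS_PER_TOKEN = 4  # assumption: rough average for English text; real tokenizers differ


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def baseline_overhead(always_loaded: dict[str, str]) -> list[tuple[str, int]]:
    """Estimate tokens for each always-loaded item, largest first."""
    sizes = [(name, estimate_tokens(text)) for name, text in always_loaded.items()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)


# Hypothetical inputs: placeholders for an instructions file and two sets of
# MCP tool descriptions. Replace them with the real text from your setup.
always_loaded = {
    "AGENTS.md": "x" * 4_000,
    "mcp:issue-tracker tool descriptions": "y" * 12_000,
    "mcp:browser tool descriptions": "z" * 2_000,
}

report = baseline_overhead(always_loaded)
total = sum(tokens for _, tokens in report)

for name, tokens in report:
    print(f"{name}: ~{tokens} tokens")
print(f"total before the first message: ~{total} tokens")
```

Sorting largest first makes it obvious which MCP or file to trim or move from user scope to project scope. Where your tool has /context, prefer its numbers over this estimate.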
