10. Context Management¶

Domain 5 (15%) — Context Management & Reliability

Context window fundamentals — What fills context and how to think about it

Context = system prompt + all messages (user + assistant) + tool results
Position matters: most important info goes last (recency bias)
Verbose tool output fills context fast → always trim and extract structured facts

Warning

Don't confuse context capacity with context attention. Even well under capacity, the model may not reliably track preferences scattered across many turns.

Four context management strategies — Know which to use when, and what each trades off

Strategy	Description
Sliding window	Keep only the most recent N turns. Simple. Works for short conversations. Weakness: older context is completely gone.
Progressive summarization	Summarize older turns, keep last 5–8 verbatim. Best general-purpose approach. Loses numerical precision.
Structured state objects	Maintain a JSON object capturing current state. More reliable for iteratively-refined preferences.
Retrieval-based (RAG)	Store extracted facts in a DB, retrieve on demand. For precision-dependent recall — exact numbers, stats. Summaries lose this.

Compaction (Agent SDK) — What happens when context nears its limit

When context nears limit, SDK runs automatic compaction
Yields a SystemMessage with subtype: "compact_boundary" before and after
Manual strategy: use scratchpad files for long sessions; subagent delegation offloads context

Pattern	When to use
Extract structured facts	Always — never pass raw verbose tool output
Scratchpad files	Long multi-step sessions to preserve intermediate state
Subagent delegation (Task tool)	Offload context-heavy subtasks to isolated agents
Explore subagent	Verbose discovery (file listing, grep) — returns summary only
`/compact` command	Claude Code: manually compact a long session

Information provenance (multi-agent) — Handling conflicting sources

Always maintain claim → source mappings
Conflicting sources: preserve both values with attribution (never pick one arbitrarily)
Add conflict_detected: true boolean to structured output when sources disagree
Coverage gaps: annotate final output with what couldn't be verified