Claude Opus 4.7 — API Guide
This page documents Claude Opus 4.7 (claude-opus-4-7): breaking changes from Opus 4.6, migration guidance, and example requests.
Key breaking changes (Opus 4.6 → 4.7)
Sampling parameters removed (causes 400)
temperature, top_p, and top_k have been removed in Opus 4.7. Sending any of these fields to 4.7 returns HTTP 400. Remove them from request bodies when migrating.
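One way to apply this during migration is to filter the request body before sending it. The sketch below is a minimal illustration: the helper name strip_sampling_params and the sample request are hypothetical, and only the three field names come from this page.

```python
# Field names are taken from the list above; the helper itself is hypothetical.
REMOVED_SAMPLING_PARAMS = ("temperature", "top_p", "top_k")


def strip_sampling_params(request: dict) -> dict:
    """Return a copy of `request` without the sampling fields removed in Opus 4.7."""
    return {k: v for k, v in request.items() if k not in REMOVED_SAMPLING_PARAMS}


# Illustrative 4.6-style request body.
old_request = {
    "model": "claude-opus-4-7",
    "max_tokens": 1024,
    "temperature": 0.7,
    "top_p": 0.9,
    "messages": [{"role": "user", "content": "Hello"}],
}
new_request = strip_sampling_params(old_request)
```

The helper returns a new dict instead of mutating its input, so the original request stays available for logging or comparison.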
budget_tokens fully removed
- Using thinking: {type: "enabled", budget_tokens: N} is invalid in 4.7 and will return 400. Use thinking: {type: "adaptive"} instead. Note: any 4.6 escape hatch that relied on budget_tokens does not apply in 4.7.
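The rewrite described above can be sketched as a small transformation on the request body. The function name migrate_thinking is hypothetical, and the sketch assumes the request is held as a plain dict before it is passed to the client.

```python
import copy


def migrate_thinking(request: dict) -> dict:
    """Return a copy of `request` with a 4.6-style thinking config made adaptive."""
    migrated = copy.deepcopy(request)
    thinking = migrated.get("thinking")
    if isinstance(thinking, dict) and (
        thinking.get("type") == "enabled" or "budget_tokens" in thinking
    ):
        adaptive = {"type": "adaptive"}
        # Keep an explicit display setting if the caller already chose one.
        if "display" in thinking:
            adaptive["display"] = thinking["display"]
        migrated["thinking"] = adaptive
    return migrated


# Illustrative 4.6-style request body.
old_request = {
    "model": "claude-opus-4-7",
    "max_tokens": 16000,
    "thinking": {"type": "enabled", "budget_tokens": 10000},
}
migrated_request = migrate_thinking(old_request)
```

Requests without a thinking field pass through unchanged.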
Assistant prefill removed
- Relying on the last assistant message as prefill will return 400 in 4.7. Use output_config.format (structured outputs) or put required context into the system prompt.
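A simple pre-flight check can catch leftover prefill before a request is sent. This is a sketch only: the function name is hypothetical, and it assumes messages are plain dicts with a role key.

```python
def ends_with_assistant_prefill(messages: list) -> bool:
    """True when the final message has role 'assistant', i.e. a prefill attempt."""
    return bool(messages) and messages[-1].get("role") == "assistant"


# Illustrative conversations.
prefilled = [
    {"role": "user", "content": "List three colors as JSON."},
    {"role": "assistant", "content": "{"},  # 4.6-style prefill
]
plain = [{"role": "user", "content": "List three colors as JSON."}]
```

A client could run this check and raise a descriptive error locally instead of waiting for the 400.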
Thinking content hidden by default (silent change)
thinking blocks still stream events, but their text is empty unless explicitly requested. To see thinking text, set thinking: {type: "adaptive", display: "summarized"}. Default display is omitted, which can look like a long pause in the UI even though no error occurs.
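To avoid the apparent pause, a UI can distinguish "thinking happened but text was omitted" from "no thinking at all". The sketch below assumes content blocks are available as dicts with a type key and a thinking text key; that shape and the function name are assumptions for illustration, not something this page specifies.

```python
def thinking_status(content_blocks: list) -> str:
    """Classify a response's thinking blocks for display purposes."""
    texts = [
        block.get("thinking", "")
        for block in content_blocks
        if block.get("type") == "thinking"
    ]
    if not texts:
        return "none"
    if any(text.strip() for text in texts):
        return "visible"
    return "omitted"  # blocks exist but carry no text


# Illustrative block lists (shape assumed, see lead-in).
hidden = [{"type": "thinking", "thinking": ""}, {"type": "text", "text": "Done."}]
shown = [{"type": "thinking", "thinking": "Compare both options."}]
```

When the status is "omitted", the UI can show a progress indicator instead of an empty panel.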
Caching and prefix rules (Opus 4.7)
- Minimum cacheable prefix: 4096 tokens. Content shorter than this will be silently not cached.
- Cache hit rules: strict prefix match — any byte-level change invalidates following content.
- Rendering order matters: tools and system message order affects cache breakpoints. Using the last system block as a breakpoint can cache tools + system together.
- Up to 4 breakpoints are allowed per request.
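The breakpoint placement described above can be sketched on a plain request dict. The cache_control marker with type "ephemeral" follows the Messages API prompt-caching convention as I understand it; treat that field, the helper names, and the sample request as assumptions to verify against the current API reference.

```python
import copy

MAX_BREAKPOINTS = 4  # per-request limit stated above


def with_system_breakpoint(request: dict) -> dict:
    """Return a copy of `request` with a cache breakpoint on the last system block."""
    marked = copy.deepcopy(request)
    system = marked.get("system")
    if isinstance(system, list) and system:
        system[-1]["cache_control"] = {"type": "ephemeral"}  # assumed marker shape
    return marked


def count_breakpoints(request: dict) -> int:
    """Count cache_control markers on tools, system blocks, and message content blocks."""
    blocks = list(request.get("tools", []))
    system = request.get("system")
    if isinstance(system, list):
        blocks += system
    for message in request.get("messages", []):
        content = message.get("content")
        if isinstance(content, list):
            blocks += content
    return sum(1 for b in blocks if isinstance(b, dict) and "cache_control" in b)


# Illustrative request body.
request = {
    "model": "claude-opus-4-7",
    "max_tokens": 1024,
    "tools": [{"name": "search", "description": "Search docs", "input_schema": {"type": "object"}}],
    "system": [
        {"type": "text", "text": "You are a support assistant."},
        {"type": "text", "text": "Long, stable reference material goes here."},
    ],
    "messages": [{"role": "user", "content": "Hello"}],
}
cached_request = with_system_breakpoint(request)
```

Checking count_breakpoints(cached_request) <= MAX_BREAKPOINTS before sending keeps the request within the stated limit.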
Common silent invalidation pitfalls:
- Injecting dynamic values into system (e.g., datetime.now()): non-deterministic content invalidates the cache.
- Using json.dumps() without sort_keys=True may change byte order and invalidate the cache.
- Changing the tools list mid-flow or switching models invalidates caches.
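The json.dumps() pitfall is easy to reproduce with the standard library. The sketch below shows two dicts with the same content but different insertion order; the helper name stable_json is hypothetical.

```python
import json


def stable_json(value) -> str:
    """Serialize with sorted keys and fixed separators so equal data gives equal bytes."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


# Same content, different key insertion order.
a = {"name": "search", "limit": 10}
b = {"limit": 10, "name": "search"}

default_differs = json.dumps(a) != json.dumps(b)  # insertion order leaks into the output
stable_matches = stable_json(a) == stable_json(b)
```

Using one serializer for everything embedded in tools or system keeps the prefix bytes identical across requests.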
Check usage.cache_read_input_tokens in responses; if it stays 0 across repeated requests that share the same prefix, something is silently invalidating the cache.
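This check can be automated over a run of responses. The sketch assumes each usage record has been converted to a dict containing cache_read_input_tokens; the function name and the min_requests threshold are hypothetical.

```python
def cache_never_hit(usages: list, min_requests: int = 3) -> bool:
    """True when at least `min_requests` usage records all show zero cache reads."""
    if len(usages) < min_requests:
        return False  # too few samples to draw a conclusion
    return all(u.get("cache_read_input_tokens", 0) == 0 for u in usages)


# Illustrative usage records from consecutive requests sharing a prefix.
healthy = [
    {"cache_read_input_tokens": 0},     # first request has nothing to read yet
    {"cache_read_input_tokens": 5200},
    {"cache_read_input_tokens": 5200},
]
suspicious = [{"cache_read_input_tokens": 0}] * 3
```

A zero on the first request alone is expected, which is why the helper waits for several samples before flagging.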
Migration examples
Bad example (Opus 4.6 style — will return 400 on 4.7):
python
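# Old (Opus 4.6 style): this request shape returns 400 on 4.7
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16000,
    temperature=0.7,  # removed in 4.7
    top_p=0.9,  # removed in 4.7
    messages=[...],
    thinking={"type": "enabled", "budget_tokens": 10000},  # removed in 4.7
)
Correct (Opus 4.7):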
python
# New (Opus 4.7)
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    output_config={"effort": "xhigh"},
    messages=[...],
    thinking={"type": "adaptive", "display": "summarized"},
)
JavaScript (old → new):
javascript
// Old (4.6): invalid on 4.7
await client.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 16000,
  temperature: 0.7, // removed in 4.7
  messages: [...],
  thinking: { type: "enabled", budget_tokens: 10000 }, // removed
});
// New (4.7)
await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  output_config: { effort: "xhigh", format: "json" },
  messages: [...],
  thinking: { type: "adaptive", display: "summarized" },
});
Assistant prefill: recommended alternatives
- If you previously relied on the last assistant message as a prefill, switch to:
  - output_config.format / output_config.json_schema to enforce structured outputs; or
  - Put constraints/context in the system prompt (avoid non-deterministic content to preserve cacheability).
Example (requesting JSON via output_config):
json
{
  "model": "claude-opus-4-7",
  "max_tokens": 1024,
  "messages": [{"role": "user", "content": "Return result per schema"}],
  "output_config": {
    "format": "json",
    "json_schema": {"type": "object", "properties": {"summary": {"type": "string"}}}
  }
}
Notes
- Before upgrading, scan and remove temperature, top_p, top_k, and budget_tokens from your client code and test streaming thinking behavior.
- Tune system/tool organization to match Opus 4.7 caching constraints (4096-token prefix, strict byte-level matching).
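The scan suggested in the first note can be approximated with a regular expression over source text. This is a rough sketch: the function name is hypothetical, and a plain text match will also flag unrelated uses of these names (comments, other libraries), so results need manual review.

```python
import re

# Parameter names listed in the note above.
REMOVED = ("temperature", "top_p", "top_k", "budget_tokens")
PATTERN = re.compile(r"\b(" + "|".join(REMOVED) + r")\b")


def find_removed_params(source: str) -> list:
    """Return (line_number, parameter) pairs for each removed parameter found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


# Illustrative client code to scan.
sample = "\n".join([
    'client.messages.create(',
    '    model="claude-opus-4-6",',
    '    temperature=0.7,',
    '    thinking={"type": "enabled", "budget_tokens": 10000},',
    ')',
])
hits = find_removed_params(sample)
```

Running the same function over each file in a repository gives a checklist of call sites to update.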