Claude Opus 4.7 — API Guide

This page documents Claude Opus 4.7 (claude-opus-4-7): breaking changes from Opus 4.6, migration guidance, and example requests.

Key breaking changes (Opus 4.6 → 4.7)

  1. Sampling parameters removed (causes 400)

    • temperature, top_p, and top_k have been removed in Opus 4.7. Sending any of these fields to 4.7 returns HTTP 400. Remove them from request bodies when migrating.

  2. budget_tokens fully removed

    • thinking:{type:"enabled",budget_tokens:N} is invalid in 4.7 and returns 400. Use thinking:{type:"adaptive"} instead. Any 4.6 escape hatch that relied on budget_tokens does not apply in 4.7.

  3. Assistant prefill removed

    • Ending messages with an assistant turn to prefill the response returns 400 in 4.7. Use output_config.format (structured outputs) or put the required context into the system prompt.

  4. Thinking content hidden by default (silent change)

    • Thinking blocks still appear in the stream, but their text is empty unless explicitly requested. To receive thinking text, set thinking:{type:"adaptive",display:"summarized"}. The default display mode is omitted, so a UI that renders thinking text can appear to pause for a long time even though no error occurs.
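
The four changes above can be applied mechanically to a request dictionary before it is sent. The following is a minimal sketch, not an SDK feature: migrate_request and its show_thinking flag are hypothetical names introduced here, and the helper only covers the fields this guide lists.

```python
import copy

# Fields this guide says Opus 4.7 rejects with HTTP 400.
REMOVED_SAMPLING_FIELDS = ("temperature", "top_p", "top_k")


def migrate_request(params: dict, show_thinking: bool = False) -> dict:
    """Return a copy of a 4.6-style request dict adjusted for Opus 4.7.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    migrated = copy.deepcopy(params)
    migrated["model"] = "claude-opus-4-7"

    # Change 1: drop the removed sampling parameters.
    for field in REMOVED_SAMPLING_FIELDS:
        migrated.pop(field, None)

    # Change 2: replace budgeted thinking with adaptive thinking.
    thinking = migrated.get("thinking")
    if isinstance(thinking, dict):
        if thinking.get("type") == "enabled":
            thinking = {"type": "adaptive"}
        # Change 4: opt back in to visible thinking text when the UI needs it.
        if show_thinking:
            thinking["display"] = "summarized"
        migrated["thinking"] = thinking

    # Change 3: a trailing assistant message is a prefill. Flag it instead of
    # guessing a replacement, since the fix depends on the use case.
    messages = migrated.get("messages") or []
    if messages and messages[-1].get("role") == "assistant":
        raise ValueError("last message is an assistant prefill; use "
                         "output_config.format or the system prompt instead")

    return migrated
```

The helper raises on prefill rather than rewriting it, because choosing between structured outputs and a system-prompt instruction is a design decision the caller has to make.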

Caching and prefix rules (Opus 4.7)

  • Minimum cacheable prefix: 4096 tokens. Shorter prefixes are silently left uncached.
  • Cache hit rules: strict prefix match. Any byte-level change invalidates the cache from that point onward.
  • Rendering order matters: tools are rendered before system content, which is rendered before messages. Placing a breakpoint on the last system block therefore caches tools and system together.
  • Up to 4 breakpoints are allowed per request.
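
As an illustration of the breakpoint placement described above, the sketch below marks the last system block and checks the four-breakpoint limit. It assumes breakpoints are expressed with the Messages API's cache_control: {"type": "ephemeral"} field, which this guide does not name explicitly; build_cached_request and count_breakpoints are hypothetical helpers.

```python
MAX_BREAKPOINTS = 4  # per-request limit stated in this guide


def count_breakpoints(request: dict) -> int:
    """Count blocks carrying a cache_control marker in tools, system, and messages."""
    blocks = list(request.get("tools", []))
    system = request.get("system", [])
    if isinstance(system, list):
        blocks.extend(system)
    for message in request.get("messages", []):
        content = message.get("content")
        if isinstance(content, list):
            blocks.extend(content)
    return sum(1 for b in blocks if isinstance(b, dict) and "cache_control" in b)


def build_cached_request(tools, system_blocks, messages,
                         model="claude-opus-4-7", max_tokens=1024) -> dict:
    """Build a request whose last system block is a cache breakpoint.

    Because tools are rendered before system content, this single breakpoint
    covers both. Assumes the cache_control field shape (see lead-in).
    """
    system = [dict(block) for block in system_blocks]  # copy; do not mutate input
    if system:
        system[-1]["cache_control"] = {"type": "ephemeral"}
    request = {
        "model": model,
        "max_tokens": max_tokens,
        "tools": list(tools),
        "system": system,
        "messages": list(messages),
    }
    if count_breakpoints(request) > MAX_BREAKPOINTS:
        raise ValueError("more than 4 cache breakpoints in one request")
    return request
```

Keep volatile content (timestamps, per-user data) in messages after the breakpoint so the cached prefix stays byte-identical between requests.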

Common silent invalidation pitfalls:

  • Injecting dynamic values into system (e.g., datetime.now()): content that differs between requests breaks the prefix match.
  • Calling json.dumps() without sort_keys=True: key order, and therefore the serialized bytes, can differ between requests.
  • Changing the tools list mid-flow or switching models: either one invalidates existing caches.
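
The json.dumps() pitfall is easy to reproduce. A minimal sketch, with stable_json as a hypothetical helper name: two dictionaries with the same content but different insertion order serialize to different bytes unless the keys are sorted.

```python
import json


def stable_json(value) -> str:
    """Serialize with sorted keys and fixed separators so equal data yields equal bytes."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


# Same content, different insertion order.
a = {"name": "search", "limit": 10}
b = {"limit": 10, "name": "search"}

print(json.dumps(a) == json.dumps(b))    # False: byte output follows insertion order
print(stable_json(a) == stable_json(b))  # True: sorted keys give identical bytes
```

Use the same serializer everywhere content is embedded in tools or system, since a single differing byte ends the cached prefix.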

Check usage.cache_read_input_tokens in responses; if it stays at 0 across repeated requests that share a prefix, something is silently invalidating the cache.
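
One way to automate that check is to collect the usage object from recent responses and flag a run with no cache reads. This is a sketch under assumptions: cache_never_hit and the min_requests threshold are hypothetical, and usage is treated as a plain dict as it would appear in the raw JSON response.

```python
def cache_never_hit(usages, min_requests: int = 3) -> bool:
    """Return True if enough responses were seen and none of them read from cache.

    `usages` is a list of usage dicts taken from consecutive responses that
    were expected to share a cached prefix.
    """
    reads = [u.get("cache_read_input_tokens") or 0 for u in usages]
    return len(reads) >= min_requests and all(r == 0 for r in reads)


recent = [
    {"input_tokens": 5200, "cache_read_input_tokens": 0},
    {"input_tokens": 5200, "cache_read_input_tokens": 0},
    {"input_tokens": 5200, "cache_read_input_tokens": 0},
]
if cache_never_hit(recent):
    print("no cache reads: look for dynamic content in tools or system")
```

The first request for a given prefix always reports 0 because it writes the cache, which is why the sketch waits for several responses before raising a flag.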

Migration examples

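Old request (Opus 4.6 style; the same parameters return 400 once the model is switched to claude-opus-4-7):

python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Old (Opus 4.6): temperature, top_p, and budget_tokens are rejected with 400 on 4.7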
message = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16000,
    temperature=0.7,
    top_p=0.9,
    messages=[...],
    thinking={"type":"enabled","budget_tokens":10000},
)

Correct (Opus 4.7):

python
# New (Opus 4.7)
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    output_config={"effort":"xhigh"},
    messages=[...],
    thinking={"type":"adaptive","display":"summarized"},
)

JavaScript (old → new):

javascript
// Old (4.6): temperature and budget_tokens are invalid on 4.7
await client.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 16000,
  temperature: 0.7, // removed in 4.7
  messages: [...],
  thinking: { type: "enabled", budget_tokens: 10000 }, // removed
});

// New (4.7)
await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 16000,
  output_config: { effort: "xhigh", format: "json" },
  messages: [...],
  thinking: { type: "adaptive", display: "summarized" },
});

  • If you previously relied on the last assistant message as a prefill, switch to one of the following:
    • Use output_config.format / output_config.json_schema to enforce structured outputs; or
    • Put constraints and context in the system prompt (avoid non-deterministic content to preserve cacheability).

Example (requesting JSON via output_config):

json
{
  "model": "claude-opus-4-7",
  "max_tokens": 1024,
  "messages": [{"role":"user","content":"Return result per schema"}],
  "output_config": {
    "format": "json",
    "json_schema": {"type":"object","properties":{"summary":{"type":"string"}}}
  }
}
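
On the client side, the structured reply still has to be pulled out of the response. The sketch below assumes the response body has a content list of typed blocks, with the JSON document in the first block of type "text" and any thinking blocks appearing before it; parse_structured_reply is a hypothetical helper and the sample response is illustrative, not captured output.

```python
import json


def parse_structured_reply(response: dict) -> dict:
    """Return the parsed JSON from the first text block of a response body.

    Skips non-text blocks (for example thinking blocks). Assumes the response
    shape described in the lead-in.
    """
    for block in response.get("content", []):
        if block.get("type") == "text":
            return json.loads(block["text"])
    raise ValueError("response contains no text block")


# Illustrative response body, not real API output.
sample = {
    "content": [
        {"type": "thinking", "thinking": ""},
        {"type": "text", "text": "{\"summary\": \"ok\"}"},
    ]
}
result = parse_structured_reply(sample)
print(result["summary"])
```

json.loads raises json.JSONDecodeError on malformed text, so callers that cannot rely on schema enforcement should catch it.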

Notes

  • Before upgrading, scan your client code for temperature, top_p, top_k, and budget_tokens, remove them, and test how your UI handles streamed thinking blocks.
  • Organize tools and system content to fit the Opus 4.7 caching constraints (4096-token minimum prefix, strict byte-level matching).
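
The pre-upgrade scan can be scripted. A minimal sketch, assuming a plain-text search is good enough for a first pass: find_removed_params and scan_tree are hypothetical helpers, and the file suffixes are an arbitrary default. Matches in comments or unrelated code will also be reported and need manual review.

```python
import re
from pathlib import Path

# Parameter names this guide lists as removed in Opus 4.7.
REMOVED = re.compile(r"\b(temperature|top_p|top_k|budget_tokens)\b")


def find_removed_params(text: str):
    """Return (line_number, parameter_name) pairs for each match in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in REMOVED.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


def scan_tree(root, suffixes=(".py", ".js", ".ts")):
    """Map each matching source file under root to its list of hits."""
    results = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            hits = find_removed_params(text)
            if hits:
                results[str(path)] = hits
    return results
```

Calling scan_tree(".") from a project root returns a dictionary of file paths to hits, which can be printed or fed into a CI check.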

This documentation is licensed under CC BY-SA 4.0.