<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Migration on RockB</title><link>https://baeseokjae.github.io/tags/migration/</link><description>Recent content in Migration on RockB</description><image><title>RockB</title><url>https://baeseokjae.github.io/images/og-default.png</url><link>https://baeseokjae.github.io/images/og-default.png</link></image><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sat, 25 Apr 2026 22:03:49 +0000</lastBuildDate><atom:link href="https://baeseokjae.github.io/tags/migration/index.xml" rel="self" type="application/rss+xml"/><item><title>Claude Opus 4.7 budget_tokens Removal: Migration from Extended Thinking</title><link>https://baeseokjae.github.io/posts/claude-opus-4-7-budget-tokens-breaking-change-2026/</link><pubDate>Sat, 25 Apr 2026 22:03:49 +0000</pubDate><guid>https://baeseokjae.github.io/posts/claude-opus-4-7-budget-tokens-breaking-change-2026/</guid><description>How to fix the 400 error from Claude Opus 4.7&amp;#39;s budget_tokens removal and migrate to adaptive thinking in Python and TypeScript.</description><content:encoded><![CDATA[<p>Claude Opus 4.7, released April 16, 2026, silently removed <code>budget_tokens</code> from its extended thinking API. Any code that passes <code>budget_tokens</code> to Opus 4.7 receives an immediate <code>400 Bad Request</code> error. The fix is a four-step migration: switch to <code>adaptive</code> thinking type, replace <code>budget_tokens</code> with the <code>effort</code> parameter, update agentic loops to use <code>task_budget</code>, and strip <code>temperature</code>, <code>top_p</code>, and <code>top_k</code>. This guide walks through each step with exact before/after code.</p>
<h2 id="what-changed-in-claude-opus-47-budget_tokens-is-gone">What Changed in Claude Opus 4.7: budget_tokens Is Gone</h2>
<p>Claude Opus 4.7 removed <code>budget_tokens</code> entirely from the extended thinking configuration, replacing it with an adaptive thinking system that automatically allocates reasoning compute based on task complexity. The change affects every application that previously used <code>thinking: { type: &quot;enabled&quot;, budget_tokens: N }</code> to control how much the model &ldquo;thinks&rdquo; before responding. Opus 4.7 also removes the <code>temperature</code>, <code>top_p</code>, and <code>top_k</code> parameters, three additional fields that 4.6 silently accepted but 4.7 now rejects with 400 errors. Pricing remains unchanged at $5/M input tokens and $25/M output tokens, and the model shows a 13% coding benchmark lift over Opus 4.6 on Anthropic&rsquo;s internal 93-task evaluation. For teams that upgrade by changing only the model string, these breaking changes arrive without warning in production: there is no deprecation header or soft-failure mode in the API response, only the hard 400.</p>
<h3 id="why-anthropic-removed-budget_tokens">Why Anthropic Removed budget_tokens</h3>
<p>Anthropic&rsquo;s internal evaluations showed that adaptive thinking — where the model dynamically decides how much reasoning to apply — outperforms a fixed <code>budget_tokens</code> cap. With a hard cap of, say, 8,000 tokens, the model either runs out of reasoning budget mid-thought on a hard problem or wastes compute over-reasoning about a trivial one. Adaptive mode removes the constraint and lets the model match reasoning depth to actual task difficulty, which produced better benchmark results across coding and agentic workloads.</p>
<h2 id="the-400-error-explained-why-your-extended-thinking-code-breaks">The 400 Error Explained: Why Your Extended Thinking Code Breaks</h2>
<p>The 400 error from Claude Opus 4.7 is a strict API validation rejection — not a quota error, rate limit, or content policy violation. It occurs because <code>budget_tokens</code> is no longer a recognized field in the thinking configuration object, and Anthropic&rsquo;s API now returns a hard validation error rather than silently ignoring unknown fields. If you upgraded from Opus 4.6 to 4.7 by changing only the model ID string in your config, every request using <code>thinking.budget_tokens</code> will fail immediately with a message like <code>{&quot;error&quot;:{&quot;type&quot;:&quot;invalid_request_error&quot;,&quot;message&quot;:&quot;Unknown field: budget_tokens in thinking configuration&quot;}}</code>. The same validation failure applies to requests that include <code>temperature</code>, <code>top_p</code>, or <code>top_k</code> at the top level of the request body. Importantly, Anthropic did not introduce a deprecation warning period — Opus 4.6 accepted these fields, Opus 4.7 rejects them with no soft-failure mode in between. Teams running automated model upgrades via version aliases experienced instant production breakage.</p>
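<p>Before touching individual call sites, you can guard at the request boundary with a small sanitizer that rewrites a legacy 4.6-style request into the 4.7 shape. A minimal sketch; the helper name and the dict-in, dict-out design are illustrative, not part of the SDK:</p>

```python
def migrate_request_params(params: dict) -> dict:
    """Convert an Opus 4.6 request dict into one that Opus 4.7 accepts.

    Pure function: returns a new dict, so it is easy to unit-test and
    drop in front of client.messages.create(**migrate_request_params(p)).
    """
    migrated = dict(params)

    # budget_tokens and type="enabled" are both rejected by Opus 4.7.
    if "thinking" in migrated:
        thinking = dict(migrated["thinking"])
        thinking.pop("budget_tokens", None)
        if thinking.get("type") == "enabled":
            thinking["type"] = "adaptive"
        migrated["thinking"] = thinking

    # temperature, top_p, and top_k now return 400 with adaptive thinking.
    for key in ("temperature", "top_p", "top_k"):
        migrated.pop(key, None)

    # Update the model ID as part of the same change.
    if migrated.get("model") == "claude-opus-4-6":
        migrated["model"] = "claude-opus-4-7"

    return migrated
```

<p>Calling it as <code>client.messages.create(**migrate_request_params(params))</code> keeps existing call sites untouched while you migrate them properly in the steps below.</p>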
<h3 id="finding-all-affected-call-sites">Finding All Affected Call Sites</h3>
<p>Before migrating, locate every place in your codebase that uses these removed fields:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span><span style="color:#75715e"># Find budget_tokens usage</span>
</span></span><span style="display:flex;"><span>rg <span style="color:#e6db74">&#34;budget_tokens&#34;</span> --type py --type ts --type js -n
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># Find removed sampling parameters</span>
</span></span><span style="display:flex;"><span>rg <span style="color:#e6db74">&#34;temperature|top_p|top_k&#34;</span> --type py --type ts --type js -n | grep -v <span style="color:#e6db74">&#34;\.md&#34;</span>
</span></span></code></pre></div><h2 id="step-1--switch-to-adaptive-thinking">Step 1 — Switch to Adaptive Thinking</h2>
<p>Adaptive thinking is the replacement for the <code>enabled</code> thinking type with a fixed token budget. The migration changes one field in your thinking configuration object. The <code>adaptive</code> type signals that the model should dynamically allocate reasoning compute, removing the need to predict how much thinking a given task requires. In Python SDK terms, you replace <code>AnthropicThinking(type=&quot;enabled&quot;, budget_tokens=8000)</code> with <code>AnthropicThinking(type=&quot;adaptive&quot;)</code>. In raw JSON API terms, you replace <code>{&quot;type&quot;: &quot;enabled&quot;, &quot;budget_tokens&quot;: 8000}</code> with <code>{&quot;type&quot;: &quot;adaptive&quot;}</code>. The model ID should be updated to <code>claude-opus-4-7</code> or its alias. Note that <code>thinking.type = &quot;enabled&quot;</code> is also rejected on Opus 4.7 — only <code>&quot;adaptive&quot;</code> and <code>&quot;none&quot;</code> are valid values. If you want to disable extended thinking entirely on Opus 4.7, pass <code>{&quot;type&quot;: &quot;none&quot;}</code>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#75715e"># Before (Opus 4.6)</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> anthropic
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>client <span style="color:#f92672">=</span> anthropic<span style="color:#f92672">.</span>Anthropic()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>response <span style="color:#f92672">=</span> client<span style="color:#f92672">.</span>messages<span style="color:#f92672">.</span>create(
</span></span><span style="display:flex;"><span>    model<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;claude-opus-4-6&#34;</span>,
</span></span><span style="display:flex;"><span>    max_tokens<span style="color:#f92672">=</span><span style="color:#ae81ff">16000</span>,
</span></span><span style="display:flex;"><span>    thinking<span style="color:#f92672">=</span>{
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;type&#34;</span>: <span style="color:#e6db74">&#34;enabled&#34;</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;budget_tokens&#34;</span>: <span style="color:#ae81ff">8000</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    messages<span style="color:#f92672">=</span>[{<span style="color:#e6db74">&#34;role&#34;</span>: <span style="color:#e6db74">&#34;user&#34;</span>, <span style="color:#e6db74">&#34;content&#34;</span>: <span style="color:#e6db74">&#34;Explain quantum entanglement.&#34;</span>}]
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># After (Opus 4.7)</span>
</span></span><span style="display:flex;"><span>response <span style="color:#f92672">=</span> client<span style="color:#f92672">.</span>messages<span style="color:#f92672">.</span>create(
</span></span><span style="display:flex;"><span>    model<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;claude-opus-4-7&#34;</span>,
</span></span><span style="display:flex;"><span>    max_tokens<span style="color:#f92672">=</span><span style="color:#ae81ff">16000</span>,
</span></span><span style="display:flex;"><span>    thinking<span style="color:#f92672">=</span>{
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;type&#34;</span>: <span style="color:#e6db74">&#34;adaptive&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    messages<span style="color:#f92672">=</span>[{<span style="color:#e6db74">&#34;role&#34;</span>: <span style="color:#e6db74">&#34;user&#34;</span>, <span style="color:#e6db74">&#34;content&#34;</span>: <span style="color:#e6db74">&#34;Explain quantum entanglement.&#34;</span>}]
</span></span><span style="display:flex;"><span>)
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#75715e">// Before (Opus 4.6)
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">import</span> <span style="color:#a6e22e">Anthropic</span> <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@anthropic-ai/sdk&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">client</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">Anthropic</span>();
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">response</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">client</span>.<span style="color:#a6e22e">messages</span>.<span style="color:#a6e22e">create</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">model</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;claude-opus-4-6&#34;</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">max_tokens</span>: <span style="color:#66d9ef">16000</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">thinking</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">type</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;enabled&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">budget_tokens</span>: <span style="color:#66d9ef">8000</span>,
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">messages</span><span style="color:#f92672">:</span> [{ <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;user&#34;</span>, <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Explain quantum entanglement.&#34;</span> }],
</span></span><span style="display:flex;"><span>});
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// After (Opus 4.7)
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">response</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">await</span> <span style="color:#a6e22e">client</span>.<span style="color:#a6e22e">messages</span>.<span style="color:#a6e22e">create</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">model</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;claude-opus-4-7&#34;</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">max_tokens</span>: <span style="color:#66d9ef">16000</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">thinking</span><span style="color:#f92672">:</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">type</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;adaptive&#34;</span>,
</span></span><span style="display:flex;"><span>  },
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">messages</span><span style="color:#f92672">:</span> [{ <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;user&#34;</span>, <span style="color:#a6e22e">content</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;Explain quantum entanglement.&#34;</span> }],
</span></span><span style="display:flex;"><span>});
</span></span></code></pre></div><h2 id="step-2--replace-budget_tokens-with-the-effort-parameter">Step 2 — Replace budget_tokens with the Effort Parameter</h2>
<p>The <code>effort</code> parameter is the new mechanism for controlling how much reasoning Opus 4.7 applies when adaptive thinking is enabled. It replaces <code>budget_tokens</code> as the user-facing control for reasoning depth, swapping a numeric token count for a named level that the model interprets relative to the task. The five levels are <code>low</code>, <code>medium</code>, <code>high</code>, <code>xhigh</code>, and <code>max</code>. Anthropic recommends <code>xhigh</code> as the default for coding and agentic tasks, and <code>medium</code> for summarization or classification where deep reasoning adds latency without benefit. Unlike <code>budget_tokens</code>, <code>effort</code> is advisory — the model may allocate more or less compute than the level suggests depending on task signals. You cannot combine <code>effort</code> with <code>budget_tokens</code>: the <code>budget_tokens</code> field no longer exists in the thinking object, and including it triggers the same 400 error.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#75715e"># Opus 4.7 with effort parameter</span>
</span></span><span style="display:flex;"><span>response <span style="color:#f92672">=</span> client<span style="color:#f92672">.</span>messages<span style="color:#f92672">.</span>create(
</span></span><span style="display:flex;"><span>    model<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;claude-opus-4-7&#34;</span>,
</span></span><span style="display:flex;"><span>    max_tokens<span style="color:#f92672">=</span><span style="color:#ae81ff">16000</span>,
</span></span><span style="display:flex;"><span>    thinking<span style="color:#f92672">=</span>{
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;type&#34;</span>: <span style="color:#e6db74">&#34;adaptive&#34;</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;effort&#34;</span>: <span style="color:#e6db74">&#34;xhigh&#34;</span>   <span style="color:#75715e"># low | medium | high | xhigh | max</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    messages<span style="color:#f92672">=</span>[{<span style="color:#e6db74">&#34;role&#34;</span>: <span style="color:#e6db74">&#34;user&#34;</span>, <span style="color:#e6db74">&#34;content&#34;</span>: <span style="color:#e6db74">&#34;Write a merge sort in Rust with tests.&#34;</span>}]
</span></span><span style="display:flex;"><span>)
</span></span></code></pre></div><h3 id="effort-level-mapping-guide">Effort Level Mapping Guide</h3>
<p>Use this table as a starting point for migrating your <code>budget_tokens</code> values to effort levels:</p>
<table>
  <thead>
      <tr>
          <th>Old budget_tokens</th>
          <th>Recommended effort</th>
          <th>Use case</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>&lt; 2,000</td>
          <td><code>low</code></td>
          <td>Classification, routing, simple Q&amp;A</td>
      </tr>
      <tr>
          <td>2,000–5,000</td>
          <td><code>medium</code></td>
          <td>Summarization, structured extraction</td>
      </tr>
      <tr>
          <td>5,000–12,000</td>
          <td><code>high</code></td>
          <td>Multi-step reasoning, code review</td>
      </tr>
      <tr>
          <td>12,000–20,000</td>
          <td><code>xhigh</code></td>
          <td>Complex coding, agentic tasks</td>
      </tr>
      <tr>
          <td>&gt; 20,000</td>
          <td><code>max</code></td>
          <td>Research, exhaustive analysis</td>
      </tr>
  </tbody>
</table>
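<p>The mapping above is mechanical enough to script. A sketch that encodes the table as a helper (the function name and exact thresholds are just a transcription of the rows, not an official mapping):</p>

```python
def effort_for_budget(budget_tokens: int) -> str:
    """Map a legacy budget_tokens value to the recommended effort level,
    following the migration table above."""
    if budget_tokens < 2_000:
        return "low"
    if budget_tokens <= 5_000:
        return "medium"
    if budget_tokens <= 12_000:
        return "high"
    if budget_tokens <= 20_000:
        return "xhigh"
    return "max"
```

<p>Treat the output as a starting point and tune per workload; because <code>effort</code> is advisory, small boundary differences (say, 5,000 vs 5,001) are unlikely to matter in practice.</p>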
<h2 id="step-3--migrate-agentic-loops-with-task_budget">Step 3 — Migrate Agentic Loops with task_budget</h2>
<p><code>task_budget</code> is a new advisory parameter for agentic loop use cases that replaces the pattern of passing <code>budget_tokens</code> to each individual API call. In extended thinking on Opus 4.6, teams would often set a per-call <code>budget_tokens</code> to prevent a multi-turn agent from consuming unlimited compute. Opus 4.7 introduces <code>task_budget</code> as a softer control that signals the model how much total thinking budget it has across the full agentic loop, rather than capping each individual turn. The minimum value is 20,000 tokens. Because it is advisory rather than a hard cap, the model can slightly exceed the budget if stopping mid-thought would produce a worse result. This is intentional — Anthropic found that hard mid-thought truncation was a significant source of degraded output quality in agentic contexts.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#75715e"># Agentic loop with task_budget</span>
</span></span><span style="display:flex;"><span>response <span style="color:#f92672">=</span> client<span style="color:#f92672">.</span>messages<span style="color:#f92672">.</span>create(
</span></span><span style="display:flex;"><span>    model<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;claude-opus-4-7&#34;</span>,
</span></span><span style="display:flex;"><span>    max_tokens<span style="color:#f92672">=</span><span style="color:#ae81ff">16000</span>,
</span></span><span style="display:flex;"><span>    thinking<span style="color:#f92672">=</span>{
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;type&#34;</span>: <span style="color:#e6db74">&#34;adaptive&#34;</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;effort&#34;</span>: <span style="color:#e6db74">&#34;xhigh&#34;</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;task_budget&#34;</span>: <span style="color:#ae81ff">80000</span>   <span style="color:#75715e"># advisory total tokens for the full loop, min 20000</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    messages<span style="color:#f92672">=</span>conversation_history
</span></span><span style="display:flex;"><span>)
</span></span></code></pre></div><h3 id="task_budget-vs-max_tokens">task_budget vs max_tokens</h3>
<table>
  <thead>
      <tr>
          <th>Parameter</th>
          <th>Type</th>
          <th>What it limits</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>max_tokens</code></td>
          <td>Hard cap</td>
          <td>Total output tokens per call</td>
      </tr>
      <tr>
          <td><code>task_budget</code></td>
          <td>Advisory</td>
          <td>Total thinking tokens across the agentic loop</td>
      </tr>
      <tr>
          <td><code>effort</code></td>
          <td>Advisory level</td>
          <td>Per-call reasoning depth signal</td>
      </tr>
  </tbody>
</table>
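<p>Because both <code>task_budget</code> and <code>effort</code> are advisory, a production agent loop should still enforce its own ceiling rather than trust the model to stop. A sketch of a client-side stop condition; the overrun tolerance and the assumption that thinking tokens are reflected in the response&rsquo;s <code>usage.output_tokens</code> are illustrative:</p>

```python
def within_task_budget(tokens_used: int, task_budget: int,
                       overrun_factor: float = 1.25) -> bool:
    """Client-side guard for an agentic loop.

    task_budget is advisory, so the model may exceed it slightly rather
    than truncate mid-thought; allow a tolerance, then stop the loop.
    """
    return tokens_used < task_budget * overrun_factor


# Illustrative loop shape (network calls omitted):
#
# tokens_used = 0
# while not done and within_task_budget(tokens_used, 80_000):
#     response = client.messages.create(
#         model="claude-opus-4-7",
#         max_tokens=16000,
#         thinking={"type": "adaptive", "effort": "xhigh",
#                   "task_budget": 80_000},
#         messages=conversation_history,
#     )
#     tokens_used += response.usage.output_tokens  # assumed accounting
```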
<h2 id="step-4--remove-temperature-top_p-and-top_k">Step 4 — Remove temperature, top_p, and top_k</h2>
<p>Claude Opus 4.7 rejects <code>temperature</code>, <code>top_p</code>, and <code>top_k</code> at the request body level when adaptive thinking is enabled. These parameters were silently accepted in Opus 4.6 even when extended thinking was active (the model effectively ignored them during thinking mode). Opus 4.7 enforces the constraint with a hard 400 error. If your codebase passes these parameters conditionally — for example, setting <code>temperature=0</code> for reproducibility — you must strip them from requests that use adaptive thinking. For non-thinking requests on Opus 4.7, these parameters may still be available; check the API docs for the current state. The safest migration strategy is to remove all three from your thinking-mode code paths unconditionally.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#75715e"># Before — these cause 400 on Opus 4.7 with adaptive thinking</span>
</span></span><span style="display:flex;"><span>response <span style="color:#f92672">=</span> client<span style="color:#f92672">.</span>messages<span style="color:#f92672">.</span>create(
</span></span><span style="display:flex;"><span>    model<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;claude-opus-4-7&#34;</span>,
</span></span><span style="display:flex;"><span>    max_tokens<span style="color:#f92672">=</span><span style="color:#ae81ff">16000</span>,
</span></span><span style="display:flex;"><span>    temperature<span style="color:#f92672">=</span><span style="color:#ae81ff">0.7</span>,    <span style="color:#75715e"># ❌ 400 error</span>
</span></span><span style="display:flex;"><span>    top_p<span style="color:#f92672">=</span><span style="color:#ae81ff">0.9</span>,          <span style="color:#75715e"># ❌ 400 error  </span>
</span></span><span style="display:flex;"><span>    top_k<span style="color:#f92672">=</span><span style="color:#ae81ff">50</span>,           <span style="color:#75715e"># ❌ 400 error</span>
</span></span><span style="display:flex;"><span>    thinking<span style="color:#f92672">=</span>{<span style="color:#e6db74">&#34;type&#34;</span>: <span style="color:#e6db74">&#34;adaptive&#34;</span>},
</span></span><span style="display:flex;"><span>    messages<span style="color:#f92672">=</span>[<span style="color:#f92672">...</span>]
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># After — clean request for Opus 4.7 adaptive thinking</span>
</span></span><span style="display:flex;"><span>response <span style="color:#f92672">=</span> client<span style="color:#f92672">.</span>messages<span style="color:#f92672">.</span>create(
</span></span><span style="display:flex;"><span>    model<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;claude-opus-4-7&#34;</span>,
</span></span><span style="display:flex;"><span>    max_tokens<span style="color:#f92672">=</span><span style="color:#ae81ff">16000</span>,
</span></span><span style="display:flex;"><span>    thinking<span style="color:#f92672">=</span>{<span style="color:#e6db74">&#34;type&#34;</span>: <span style="color:#e6db74">&#34;adaptive&#34;</span>, <span style="color:#e6db74">&#34;effort&#34;</span>: <span style="color:#e6db74">&#34;high&#34;</span>},
</span></span><span style="display:flex;"><span>    messages<span style="color:#f92672">=</span>[<span style="color:#f92672">...</span>]
</span></span><span style="display:flex;"><span>)
</span></span></code></pre></div><h2 id="full-beforeafter-code-examples-python-and-typescript">Full Before/After Code Examples (Python and TypeScript)</h2>
<p>The full migration consolidates all four steps into a single diff per language. These examples show a realistic production pattern: a helper function that wraps the Anthropic client and handles model-specific configuration, so changing from 4.6 to 4.7 requires updating one function rather than every call site. The Python example uses the <code>anthropic</code> SDK; the TypeScript example uses <code>@anthropic-ai/sdk</code>. Both examples show the complete parameter set including <code>task_budget</code> for agentic contexts. After migrating, run a parallel test that sends the same prompt to both 4.6 and 4.7 and compares output quality: the behavioral changes behind the 13% coding benchmark lift mean outputs may differ even for semantically equivalent requests, so validate that your downstream systems handle the new output format correctly.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#75715e"># migration_helper.py</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> anthropic
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>client <span style="color:#f92672">=</span> anthropic<span style="color:#f92672">.</span>Anthropic()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">create_thinking_request</span>(
</span></span><span style="display:flex;"><span>    prompt: str,
</span></span><span style="display:flex;"><span>    model: str <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;claude-opus-4-7&#34;</span>,
</span></span><span style="display:flex;"><span>    max_tokens: int <span style="color:#f92672">=</span> <span style="color:#ae81ff">16000</span>,
</span></span><span style="display:flex;"><span>    effort: str <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;xhigh&#34;</span>,
</span></span><span style="display:flex;"><span>    task_budget: int <span style="color:#f92672">|</span> <span style="color:#66d9ef">None</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">None</span>,
</span></span><span style="display:flex;"><span>    conversation_history: list <span style="color:#f92672">|</span> <span style="color:#66d9ef">None</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">None</span>,
</span></span><span style="display:flex;"><span>) <span style="color:#f92672">-&gt;</span> anthropic<span style="color:#f92672">.</span>types<span style="color:#f92672">.</span>Message:
</span></span><span style="display:flex;"><span>    thinking_config: dict <span style="color:#f92672">=</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;type&#34;</span>: <span style="color:#e6db74">&#34;adaptive&#34;</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;effort&#34;</span>: effort,
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> task_budget <span style="color:#f92672">is</span> <span style="color:#f92672">not</span> <span style="color:#66d9ef">None</span>:
</span></span><span style="display:flex;"><span>        thinking_config[<span style="color:#e6db74">&#34;task_budget&#34;</span>] <span style="color:#f92672">=</span> max(task_budget, <span style="color:#ae81ff">20000</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    messages <span style="color:#f92672">=</span> conversation_history <span style="color:#f92672">or</span> []
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> prompt:
</span></span><span style="display:flex;"><span>        messages <span style="color:#f92672">=</span> messages <span style="color:#f92672">+</span> [{<span style="color:#e6db74">&#34;role&#34;</span>: <span style="color:#e6db74">&#34;user&#34;</span>, <span style="color:#e6db74">&#34;content&#34;</span>: prompt}]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> client<span style="color:#f92672">.</span>messages<span style="color:#f92672">.</span>create(
</span></span><span style="display:flex;"><span>        model<span style="color:#f92672">=</span>model,
</span></span><span style="display:flex;"><span>        max_tokens<span style="color:#f92672">=</span>max_tokens,
</span></span><span style="display:flex;"><span>        thinking<span style="color:#f92672">=</span>thinking_config,
</span></span><span style="display:flex;"><span>        messages<span style="color:#f92672">=</span>messages,
</span></span><span style="display:flex;"><span>        <span style="color:#75715e"># No temperature, top_p, or top_k</span>
</span></span><span style="display:flex;"><span>    )
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-typescript" data-lang="typescript"><span style="display:flex;"><span><span style="color:#75715e">// migrationHelper.ts
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> <span style="color:#a6e22e">Anthropic</span> <span style="color:#66d9ef">from</span> <span style="color:#e6db74">&#34;@anthropic-ai/sdk&#34;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> <span style="color:#a6e22e">client</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">Anthropic</span>();
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">interface</span> <span style="color:#a6e22e">ThinkingRequestOptions</span> {
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">prompt</span>: <span style="color:#66d9ef">string</span>;
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">model?</span>: <span style="color:#66d9ef">string</span>;
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">maxTokens?</span>: <span style="color:#66d9ef">number</span>;
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">effort</span><span style="color:#f92672">?:</span> <span style="color:#e6db74">&#34;low&#34;</span> <span style="color:#f92672">|</span> <span style="color:#e6db74">&#34;medium&#34;</span> <span style="color:#f92672">|</span> <span style="color:#e6db74">&#34;high&#34;</span> <span style="color:#f92672">|</span> <span style="color:#e6db74">&#34;xhigh&#34;</span> <span style="color:#f92672">|</span> <span style="color:#e6db74">&#34;max&#34;</span>;
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">taskBudget?</span>: <span style="color:#66d9ef">number</span>;
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">conversationHistory?</span>: <span style="color:#66d9ef">Anthropic.MessageParam</span>[];
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">async</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">createThinkingRequest</span>({
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">prompt</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">model</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;claude-opus-4-7&#34;</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">maxTokens</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">16000</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">effort</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;xhigh&#34;</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">taskBudget</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#a6e22e">conversationHistory</span> <span style="color:#f92672">=</span> [],
</span></span><span style="display:flex;"><span>}<span style="color:#f92672">:</span> <span style="color:#a6e22e">ThinkingRequestOptions</span>)<span style="color:#f92672">:</span> <span style="color:#a6e22e">Promise</span>&lt;<span style="color:#f92672">Anthropic.Message</span>&gt; {
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">thinkingConfig</span>: <span style="color:#66d9ef">Record</span>&lt;<span style="color:#f92672">string</span>, <span style="color:#a6e22e">unknown</span>&gt; <span style="color:#f92672">=</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">type</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;adaptive&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">effort</span>,
</span></span><span style="display:flex;"><span>  };
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">if</span> (<span style="color:#a6e22e">taskBudget</span> <span style="color:#f92672">!==</span> <span style="color:#66d9ef">undefined</span>) {
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">thinkingConfig</span>.<span style="color:#a6e22e">task_budget</span> <span style="color:#f92672">=</span> Math.<span style="color:#a6e22e">max</span>(<span style="color:#a6e22e">taskBudget</span>, <span style="color:#ae81ff">20000</span>);
</span></span><span style="display:flex;"><span>  }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">const</span> <span style="color:#a6e22e">messages</span>: <span style="color:#66d9ef">Anthropic.MessageParam</span>[] <span style="color:#f92672">=</span> [
</span></span><span style="display:flex;"><span>    ...<span style="color:#a6e22e">conversationHistory</span>,
</span></span><span style="display:flex;"><span>    { <span style="color:#a6e22e">role</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#34;user&#34;</span>, <span style="color:#a6e22e">content</span>: <span style="color:#66d9ef">prompt</span> },
</span></span><span style="display:flex;"><span>  ];
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>  <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">client</span>.<span style="color:#a6e22e">messages</span>.<span style="color:#a6e22e">create</span>({
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">model</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">max_tokens</span>: <span style="color:#66d9ef">maxTokens</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">thinking</span>: <span style="color:#66d9ef">thinkingConfig</span> <span style="color:#66d9ef">as</span> <span style="color:#a6e22e">Anthropic</span>.<span style="color:#a6e22e">ThinkingConfigParam</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">messages</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// No temperature, top_p, or top_k
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>  });
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h2 id="performance-and-cost-impact-of-the-migration">Performance and Cost Impact of the Migration</h2>
<p>Migrating to Opus 4.7 adaptive thinking affects performance and cost in three distinct ways: benchmark quality, tokenizer changes, and efficiency features. On the positive side, Anthropic reports a 13% lift on internal coding benchmarks over Opus 4.6, and adaptive thinking produces better reasoning outcomes than fixed <code>budget_tokens</code> in Anthropic&rsquo;s own evaluations. Token pricing is unchanged at $5/M input and $25/M output. However, Opus 4.7 ships with an updated tokenizer that may count 1.0–1.35x the tokens of 4.6 for the same content, depending on code density and language, so your token spend may rise even with identical prompts and outputs. On the efficiency side, prompt caching offers up to 90% savings on cached input tokens, and batch processing offers a 50% discount for non-real-time workloads. For agentic loops, <code>task_budget</code> replaces the blunt approach of hard-capping each call, which reduces the overhead of restarting interrupted reasoning chains.</p>
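<p>A quick back-of-envelope check makes the tokenizer risk concrete. The sketch below uses the $5/$25 pricing from this post and treats the 1.0–1.35x range as a bound to test against; the multiplier for your actual workload has to be measured, not assumed.</p>

```python
# Rough monthly-spend estimate under the Opus 4.7 tokenizer change.
# Pricing from this post: $5/M input, $25/M output (unchanged in 4.7).
# The 1.0-1.35x multiplier is a published range, not your workload's
# actual ratio -- measure before trusting the upper bound.

INPUT_PRICE_PER_M = 5.00
OUTPUT_PRICE_PER_M = 25.00

def monthly_cost(input_tokens: int, output_tokens: int,
                 tokenizer_multiplier: float = 1.0) -> float:
    """Dollar cost for one month of traffic at a given token multiplier."""
    scaled_in = input_tokens * tokenizer_multiplier
    scaled_out = output_tokens * tokenizer_multiplier
    return (scaled_in * INPUT_PRICE_PER_M
            + scaled_out * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: 100M input / 20M output tokens per month.
baseline = monthly_cost(100_000_000, 20_000_000)          # $1,000 on 4.6
worst_case = monthly_cost(100_000_000, 20_000_000, 1.35)  # $1,350 upper bound
```

<p>The same prompts can therefore cost up to 35% more after the model-string switch, which is worth budgeting for before the first invoice arrives, not after.</p>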
<h3 id="cost-comparison-table">Cost Comparison Table</h3>
<table>
  <thead>
      <tr>
          <th>Factor</th>
          <th>Opus 4.6</th>
          <th>Opus 4.7</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Input pricing</td>
          <td>$5/M tokens</td>
          <td>$5/M tokens (unchanged)</td>
      </tr>
      <tr>
          <td>Output pricing</td>
          <td>$25/M tokens</td>
          <td>$25/M tokens (unchanged)</td>
      </tr>
      <tr>
          <td>Tokenizer</td>
          <td>v4.6</td>
          <td>v4.7 (1.0–1.35x token count)</td>
      </tr>
      <tr>
          <td>Prompt caching</td>
          <td>Available</td>
          <td>Up to 90% savings</td>
      </tr>
      <tr>
          <td>Batch processing</td>
          <td>Available</td>
          <td>50% savings</td>
      </tr>
      <tr>
          <td>Coding benchmark</td>
          <td>Baseline</td>
          <td>+13% over 4.6</td>
      </tr>
  </tbody>
</table>
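<p>The caching and batch rows in the table compound with base pricing. The sketch below models effective input cost under a cache hit rate; note that 90% is the maximum savings on cached reads, so the real number depends on your hit rate, and treating the batch discount as stackable with caching is an assumption made here for illustration.</p>

```python
# Effective input rate per million tokens under prompt caching.
# The 90%-off cached reads and 50%-off batch figures are the ceilings
# from the table; treating them as exact and stackable is an assumption.

BASE_INPUT_PER_M = 5.00
CACHE_READ_DISCOUNT = 0.90
BATCH_DISCOUNT = 0.50

def effective_input_rate(cache_hit_rate: float, batch: bool = False) -> float:
    """Blended $/M input rate for a given fraction of cache hits."""
    cached = cache_hit_rate * BASE_INPUT_PER_M * (1 - CACHE_READ_DISCOUNT)
    uncached = (1 - cache_hit_rate) * BASE_INPUT_PER_M
    rate = cached + uncached
    return rate * (1 - BATCH_DISCOUNT) if batch else rate

# At an 80% cache hit rate: 0.8 * $0.50 + 0.2 * $5.00 = $1.40/M,
# versus $5.00/M with no caching at all.
```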
<h2 id="testing-your-migration-and-validating-behavior">Testing Your Migration and Validating Behavior</h2>
<p>Testing an Opus 4.7 migration requires more than confirming the 400 errors are gone. Adaptive thinking can allocate significantly different amounts of compute than your previous <code>budget_tokens</code> values, which means reasoning depth and output style may shift even for prompts that worked well before. The recommended test strategy is a three-phase approach: first, run a smoke test that confirms all migrated endpoints return 200; second, run a parallel comparison that sends identical prompts to both Opus 4.6 and 4.7 and logs the thinking block token counts and response lengths; third, evaluate output quality against your domain-specific acceptance criteria. Pay particular attention to tasks where you previously used low <code>budget_tokens</code> values (under 2,000) to control costs — adaptive thinking may allocate substantially more compute for the same prompts, which is good for quality but requires monitoring your token spend during the first week after migration. Use Anthropic&rsquo;s usage API to track per-request thinking token consumption while you tune <code>effort</code> levels.</p>
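<p>Phase two, the parallel comparison, can be sketched as a small harness like the one below. The client is injected so tests can stub it; the <code>claude-opus-4-6</code> model string and the 8,000-token legacy budget are placeholders to adjust for your setup. The smoke test for phase one follows.</p>

```python
# Phase-two harness: same prompt to 4.6 and 4.7, usage logged side
# by side. Model IDs and the legacy budget value are illustrative.

def compare_models(client, prompt: str, max_tokens: int = 4096) -> dict:
    configs = {
        "claude-opus-4-6": {"type": "enabled", "budget_tokens": 8000},
        "claude-opus-4-7": {"type": "adaptive", "effort": "medium"},
    }
    results = {}
    for model, thinking in configs.items():
        resp = client.messages.create(
            model=model,
            max_tokens=max_tokens,
            thinking=thinking,
            messages=[{"role": "user", "content": prompt}],
        )
        results[model] = {
            "output_tokens": resp.usage.output_tokens,
            "text_chars": sum(len(b.text) for b in resp.content
                              if b.type == "text"),
        }
    return results
```

<p>Run this over a sample of real production prompts and diff the two columns before flipping traffic; large swings in <code>output_tokens</code> are the adaptive allocation at work.</p>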
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#75715e"># Smoke test for migration validation</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> anthropic
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> pytest
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>client <span style="color:#f92672">=</span> anthropic<span style="color:#f92672">.</span>Anthropic()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">test_opus_47_adaptive_thinking</span>():
</span></span><span style="display:flex;"><span>    response <span style="color:#f92672">=</span> client<span style="color:#f92672">.</span>messages<span style="color:#f92672">.</span>create(
</span></span><span style="display:flex;"><span>        model<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;claude-opus-4-7&#34;</span>,
</span></span><span style="display:flex;"><span>        max_tokens<span style="color:#f92672">=</span><span style="color:#ae81ff">4096</span>,
</span></span><span style="display:flex;"><span>        thinking<span style="color:#f92672">=</span>{<span style="color:#e6db74">&#34;type&#34;</span>: <span style="color:#e6db74">&#34;adaptive&#34;</span>, <span style="color:#e6db74">&#34;effort&#34;</span>: <span style="color:#e6db74">&#34;medium&#34;</span>},
</span></span><span style="display:flex;"><span>        messages<span style="color:#f92672">=</span>[{<span style="color:#e6db74">&#34;role&#34;</span>: <span style="color:#e6db74">&#34;user&#34;</span>, <span style="color:#e6db74">&#34;content&#34;</span>: <span style="color:#e6db74">&#34;What is 2+2?&#34;</span>}]
</span></span><span style="display:flex;"><span>    )
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> response<span style="color:#f92672">.</span>stop_reason <span style="color:#f92672">in</span> (<span style="color:#e6db74">&#34;end_turn&#34;</span>, <span style="color:#e6db74">&#34;max_tokens&#34;</span>)
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># Verify thinking block present</span>
</span></span><span style="display:flex;"><span>    thinking_blocks <span style="color:#f92672">=</span> [b <span style="color:#66d9ef">for</span> b <span style="color:#f92672">in</span> response<span style="color:#f92672">.</span>content <span style="color:#66d9ef">if</span> b<span style="color:#f92672">.</span>type <span style="color:#f92672">==</span> <span style="color:#e6db74">&#34;thinking&#34;</span>]
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">assert</span> all(b<span style="color:#f92672">.</span>thinking <span style="color:#66d9ef">for</span> b <span style="color:#f92672">in</span> thinking_blocks)  <span style="color:#75715e"># blocks may be absent entirely for trivial tasks</span>
</span></span><span style="display:flex;"><span>    print(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#34;Thinking characters emitted: </span><span style="color:#e6db74">{</span>sum(len(b<span style="color:#f92672">.</span>thinking) <span style="color:#66d9ef">for</span> b <span style="color:#f92672">in</span> thinking_blocks)<span style="color:#e6db74">}</span><span style="color:#e6db74">&#34;</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">test_no_budget_tokens_accepted</span>():
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">with</span> pytest<span style="color:#f92672">.</span>raises(anthropic<span style="color:#f92672">.</span>BadRequestError):
</span></span><span style="display:flex;"><span>        client<span style="color:#f92672">.</span>messages<span style="color:#f92672">.</span>create(
</span></span><span style="display:flex;"><span>            model<span style="color:#f92672">=</span><span style="color:#e6db74">&#34;claude-opus-4-7&#34;</span>,
</span></span><span style="display:flex;"><span>            max_tokens<span style="color:#f92672">=</span><span style="color:#ae81ff">4096</span>,
</span></span><span style="display:flex;"><span>            thinking<span style="color:#f92672">=</span>{<span style="color:#e6db74">&#34;type&#34;</span>: <span style="color:#e6db74">&#34;enabled&#34;</span>, <span style="color:#e6db74">&#34;budget_tokens&#34;</span>: <span style="color:#ae81ff">5000</span>},
</span></span><span style="display:flex;"><span>            messages<span style="color:#f92672">=</span>[{<span style="color:#e6db74">&#34;role&#34;</span>: <span style="color:#e6db74">&#34;user&#34;</span>, <span style="color:#e6db74">&#34;content&#34;</span>: <span style="color:#e6db74">&#34;test&#34;</span>}]
</span></span><span style="display:flex;"><span>        )
</span></span></code></pre></div><h2 id="faq">FAQ</h2>
<p>The following questions cover the most common migration blockers developers encounter when upgrading from Claude Opus 4.6 to 4.7. The short version: <code>budget_tokens</code> is gone, <code>temperature</code>/<code>top_p</code>/<code>top_k</code> cause 400 errors with adaptive thinking, and <code>effort</code> plus optional <code>task_budget</code> are the replacement controls. The migration typically takes under an hour for a single-model codebase — most of the time is spent finding all call sites with <code>rg</code> and running parallel smoke tests to confirm behavior. If you hit a 400 error on Opus 4.7, the error message will name the invalid field explicitly, which makes debugging straightforward once you know that validation is now strict. Adaptive thinking on Opus 4.7 is not a drop-in replacement in behavior — output quality and reasoning depth will shift, so budget one to two days for output validation against your domain acceptance criteria even after the API errors are resolved.</p>
<h3 id="does-claude-opus-47-support-budget_tokens-at-all">Does Claude Opus 4.7 support budget_tokens at all?</h3>
<p>No. <code>budget_tokens</code> was fully removed in Opus 4.7 and passing it in the thinking configuration returns an immediate <code>400 Bad Request</code> error. There is no compatibility mode, alias, or fallback. The replacement is the <code>effort</code> parameter (<code>low</code>, <code>medium</code>, <code>high</code>, <code>xhigh</code>, <code>max</code>) combined with <code>thinking.type = &quot;adaptive&quot;</code>.</p>
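<p>When porting many call sites at once, a heuristic mapping from old <code>budget_tokens</code> values to <code>effort</code> levels can serve as a starting point. The thresholds below are illustrative guesses, not Anthropic guidance; tune them against your phase-two comparison results.</p>

```python
# Heuristic starting point: legacy budget_tokens -> Opus 4.7 effort.
# The numeric thresholds are illustrative, not official guidance.

def budget_to_effort(budget_tokens: int) -> str:
    if budget_tokens < 4_000:
        return "low"
    if budget_tokens < 12_000:
        return "medium"
    if budget_tokens < 32_000:
        return "high"
    if budget_tokens < 64_000:
        return "xhigh"
    return "max"
```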
<h3 id="will-my-opus-46-code-work-on-opus-47-if-i-just-change-the-model-string">Will my Opus 4.6 code work on Opus 4.7 if I just change the model string?</h3>
<p>Only if your code does not use <code>thinking.budget_tokens</code>, <code>temperature</code>, <code>top_p</code>, or <code>top_k</code> with extended thinking enabled. If it uses any of these, changing the model string alone will break your app with 400 errors immediately. Run <code>rg &quot;budget_tokens|temperature|top_p|top_k&quot;</code> across your codebase before switching model IDs.</p>
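<p>If <code>rg</code> is not available in your CI image, the same audit fits in a few lines of standard-library Python. The pattern mirrors the <code>rg</code> invocation above and will over-match (any mention of <code>temperature</code>, not just API calls), so treat hits as candidates for review rather than confirmed breakage.</p>

```python
# Stdlib fallback for the rg audit: list every line mentioning a
# parameter that now returns 400 on Opus 4.7. Over-matches by design.
import re
from pathlib import Path

PATTERN = re.compile(r"budget_tokens|temperature|top_p|top_k")

def find_call_sites(root, exts=(".py", ".ts")):
    """Return (path, line_number, line) for each match under root."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in exts:
            continue
        for lineno, line in enumerate(
                path.read_text(errors="ignore").splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```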
<h3 id="what-is-the-minimum-value-for-task_budget">What is the minimum value for task_budget?</h3>
<p>The minimum valid <code>task_budget</code> value is 20,000 tokens. Passing a value below this threshold causes a validation error. The parameter is advisory — the model may slightly exceed the budget rather than truncate a reasoning chain mid-thought.</p>
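<p>That floor is easy to enforce before a request ever leaves your code, mirroring the <code>Math.max</code> clamp in the TypeScript helper earlier. Whether to clamp silently or fail loudly is a policy choice; the sketch below supports both.</p>

```python
# Guard the task_budget floor client-side. The 20,000-token minimum
# is the API validation rule described above.

TASK_BUDGET_MIN = 20_000

def validate_task_budget(value: int, clamp: bool = False) -> int:
    """Raise on sub-minimum budgets, or raise them to the floor."""
    if value >= TASK_BUDGET_MIN:
        return value
    if clamp:
        return TASK_BUDGET_MIN
    raise ValueError(
        f"task_budget must be >= {TASK_BUDGET_MIN}, got {value}")
```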
<h3 id="does-effort-replace-budget_tokens-exactly-or-does-it-behave-differently">Does effort replace budget_tokens exactly, or does it behave differently?</h3>
<p>Effort behaves differently. <code>budget_tokens</code> was a hard numeric cap that the model could not exceed. <code>effort</code> is an advisory signal — the model interprets it as a hint about reasoning depth but retains discretion over actual token allocation. This means results are less predictable in token count but generally better in quality, especially for tasks where the optimal reasoning depth was hard to predict in advance.</p>
<h3 id="can-i-mix-adaptive-thinking-with-streaming-on-opus-47">Can I mix adaptive thinking with streaming on Opus 4.7?</h3>
<p>Yes. Adaptive thinking on Opus 4.7 is compatible with streaming. Thinking blocks are streamed as <code>thinking_delta</code> events before the main <code>text_delta</code> events, the same pattern as extended thinking in Opus 4.6. The only change in streaming behavior is that you no longer see a predictable thinking block size since the model allocates compute adaptively.</p>
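<p>Handling that ordering takes only a small accumulator. The events below are simplified dicts standing in for SDK stream events; the real SDK yields typed objects, so adapt the field access to your SDK version.</p>

```python
# Split a stream into thinking text and answer text. Dicts stand in
# for SDK event objects; field names are simplified for illustration.

def collect_stream(events):
    thinking, text = [], []
    for event in events:
        if event["type"] == "thinking_delta":
            thinking.append(event["delta"])
        elif event["type"] == "text_delta":
            text.append(event["delta"])
    return {"thinking": "".join(thinking), "text": "".join(text)}

# With adaptive thinking the thinking portion may be empty for
# trivial prompts -- never assume a thinking_delta arrives first.
```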
]]></content:encoded></item></channel></rss>