<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Open Source LLMs on RockB</title><link>https://baeseokjae.github.io/tags/open-source-llms/</link><description>Recent content in Open Source LLMs on RockB</description><image><title>RockB</title><url>https://baeseokjae.github.io/images/og-default.png</url><link>https://baeseokjae.github.io/images/og-default.png</link></image><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sun, 21 Jun 2026 10:00:00 +0000</lastBuildDate><atom:link href="https://baeseokjae.github.io/tags/open-source-llms/index.xml" rel="self" type="application/rss+xml"/><item><title>Claude Fable 5 Alternatives: Best Models to Use After the Export Ban in 2026</title><link>https://baeseokjae.github.io/posts/claude-fable-5-alternatives-2026/</link><pubDate>Sun, 21 Jun 2026 10:00:00 +0000</pubDate><guid>https://baeseokjae.github.io/posts/claude-fable-5-alternatives-2026/</guid><description>Claude Fable 5 and Mythos 5 were shut down globally on June 12, 2026 under US export controls. Here are the best proprietary, open source, and developer...</description><content:encoded><![CDATA[<p>Claude Fable 5 launched on June 9, 2026 as Anthropic&rsquo;s first publicly available Mythos-class model — 1M-token context, 80.3% on SWE-Bench Pro, and the most capable reasoning model ever shipped at its price point. Three days later, the US Commerce Department ordered it shut down for all foreign nationals under the Export Administration Regulations. Anthropic pulled both Fable 5 and Mythos 5 globally within 90 minutes.</p>
<p>If you built on Fable 5 or were planning to, you now need an alternative. Here is everything you need to make that decision.</p>
<hr>
<h2 id="quick-decision-matrix">Quick Decision Matrix</h2>
<table>
  <thead>
      <tr>
          <th>Your Situation</th>
          <th>Best Alternative</th>
          <th>Why</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>You want the closest Anthropic drop-in</td>
          <td>Claude Opus 4.8</td>
          <td>Same provider, same SDK, $5/$25 per M tokens</td>
      </tr>
      <tr>
          <td>You need the strongest coding model</td>
          <td>GPT-5.5 via Codex CLI</td>
          <td>82.1% SWE-Bench Pro, parallel agents, GitHub integration</td>
      </tr>
      <tr>
          <td>You need the largest context window</td>
          <td>Gemini 3.1 Pro</td>
          <td>2M tokens, $1.50/M input, free tier available</td>
      </tr>
      <tr>
          <td>You want maximum cost efficiency</td>
          <td>Gemini 3.5 Flash</td>
          <td>$1.50/M input, 68% better token efficiency than predecessor</td>
      </tr>
      <tr>
          <td>You are outside the US and want frontier</td>
          <td>Grok 5 (xAI)</td>
          <td>No export restrictions, competitive reasoning</td>
      </tr>
      <tr>
          <td>You want open source / self-hosted</td>
          <td>Qwen 3.6 Plus or Mistral Medium 3.5</td>
          <td>70-98% cost savings vs proprietary</td>
      </tr>
      <tr>
          <td>You need a free dev tool alternative</td>
          <td>Gemini CLI</td>
          <td>Free (1K req/day), 2M context, Google Search grounding</td>
      </tr>
      <tr>
          <td>You want model flexibility</td>
          <td>OpenAI Codex or OpenCode</td>
          <td>Multi-provider, no lock-in</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="tier-1-proprietary-frontier-alternatives">Tier 1: Proprietary Frontier Alternatives</h2>
<h3 id="claude-opus-48--the-direct-replacement">Claude Opus 4.8 — The Direct Replacement</h3>
<p>If you want to change as little as possible, Opus 4.8 is your answer. It is the same Anthropic API, the same SDK patterns, and the same 1M-token context window — just at the Opus tier instead of Mythos.</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Fable 5</th>
          <th>Opus 4.8</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Context window</td>
          <td>1M tokens</td>
          <td>1M tokens</td>
      </tr>
      <tr>
          <td>Max output</td>
          <td>128K tokens</td>
          <td>64K tokens</td>
      </tr>
      <tr>
          <td>Input price</td>
          <td>$10/M tokens</td>
          <td>$5/M tokens</td>
      </tr>
      <tr>
          <td>Output price</td>
          <td>$50/M tokens</td>
          <td>$25/M tokens</td>
      </tr>
      <tr>
          <td>SWE-Bench Pro</td>
          <td>80.3%</td>
          <td>69.2%</td>
      </tr>
      <tr>
          <td>Availability</td>
          <td>Banned globally</td>
          <td>Everywhere Anthropic operates</td>
      </tr>
  </tbody>
</table>
<p>The migration is trivial: replace <code>claude-fable-5</code> with <code>claude-opus-4-8</code> in your API calls. The gap on SWE-Bench Pro is real — about 11 points — but for most production workloads (document analysis, summarization, code review, customer support), Opus 4.8 is more than capable. You pay half the price for roughly 85% of the capability.</p>
<p><strong>The catch:</strong> Opus 4.8 does not match Fable 5 on agentic coding or long-horizon reasoning tasks. If your workflow depends on multi-day autonomous coding sessions, you will need to move up to Tier 1B.</p>
<h3 id="gpt-55--the-coding-leader">GPT-5.5 — The Coding Leader</h3>
<p>OpenAI&rsquo;s GPT-5.5 is the strongest coding model available after the Fable 5 ban. It scores 82.1% on SWE-Bench Pro — slightly ahead of Fable 5&rsquo;s 80.3% — and it is available globally with no export restrictions.</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Value</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Pricing</td>
          <td>$5/M input, $30/M output</td>
      </tr>
      <tr>
          <td>Context window</td>
          <td>~256K tokens</td>
      </tr>
      <tr>
          <td>SWE-Bench Pro</td>
          <td>82.1%</td>
      </tr>
      <tr>
          <td>Best access method</td>
          <td>OpenAI Codex CLI or API</td>
      </tr>
      <tr>
          <td>Availability</td>
          <td>Global (no export ban)</td>
      </tr>
  </tbody>
</table>
<p>GPT-5.5 excels at structured coding tasks, test generation, and bug fixing. Its token efficiency is meaningfully better than Fable 5 — Fable 5&rsquo;s &ldquo;Adaptive Thinking&rdquo; mode can burn tokens on reasoning traces even when you do not need them, while GPT-5.5 is more predictable in its token consumption.</p>
<p>For agentic coding, pair GPT-5.5 with the <a href="/posts/openai-codex-cli-guide-2026/">OpenAI Codex CLI</a>, which supports parallel agents with Git worktrees, GitHub issue-to-PR automation, and scheduled background tasks. This combination is arguably more productive than Fable 5 ever was for software engineering workflows.</p>
<h3 id="gemini-31-pro--the-context-king">Gemini 3.1 Pro — The Context King</h3>
<p>Google&rsquo;s Gemini 3.1 Pro has the largest context window of any frontier model at 2M tokens — double Fable 5&rsquo;s. If your workload involves processing entire codebases, massive document corpora, or long-running agentic sessions, this is your model.</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Value</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Pricing</td>
          <td>$1.50/M input (API), free tier via Gemini CLI</td>
      </tr>
      <tr>
          <td>Context window</td>
          <td>2M tokens</td>
      </tr>
      <tr>
          <td>Availability</td>
          <td>Global</td>
      </tr>
      <tr>
          <td>Best access method</td>
          <td>Gemini CLI (free, 1K req/day) or Vertex AI</td>
      </tr>
  </tbody>
</table>
<p>At $1.50 per million input tokens, Gemini 3.1 Pro is roughly 7× cheaper than Fable 5 on input and 3× cheaper than Opus 4.8. The free Gemini CLI tier gives you 1,000 requests per day, which is enough for most individual developers. The tradeoff: it trails on hard reasoning benchmarks (GPQA, ARC-AGI-2) compared to GPT-5.5 and Fable 5.</p>
<h3 id="gemini-35-flash--the-cost-champion">Gemini 3.5 Flash — The Cost Champion</h3>
<p>If your priority is maximum throughput at minimum cost, Gemini 3.5 Flash is the best deal in frontier AI. At $1.50/M input tokens with 68% better token efficiency than its predecessor, it handles high-volume inference workloads at a fraction of the cost of any Anthropic or OpenAI model.</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Value</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Pricing</td>
          <td>$1.50/M input</td>
      </tr>
      <tr>
          <td>Context window</td>
          <td>1M tokens</td>
      </tr>
      <tr>
          <td>Token efficiency</td>
          <td>68% improvement over previous Flash tier</td>
      </tr>
      <tr>
          <td>Best for</td>
          <td>High-volume coding assistants, document pipelines, customer-facing chatbots</td>
      </tr>
  </tbody>
</table>
<p>Gemini 3.5 Flash does not compete on hard benchmarks — it trails on Humanity&rsquo;s Last Exam and ARC-AGI-2 — but for the 90% of production workloads that do not need frontier reasoning, it is the most cost-effective choice on the market.</p>
<h3 id="grok-5--the-non-us-frontier-option">Grok 5 — The Non-US Frontier Option</h3>
<p>xAI&rsquo;s Grok 5 is available globally with no US export restrictions. It is a competitive frontier model for coding and reasoning, particularly for developers outside the US who cannot rely on Anthropic or OpenAI infrastructure.</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>Value</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Pricing</td>
          <td>Competitive with GPT-5.5</td>
      </tr>
      <tr>
          <td>Availability</td>
          <td>Global, no export restrictions</td>
      </tr>
      <tr>
          <td>Best for</td>
          <td>Non-US developers needing frontier capability</td>
      </tr>
      <tr>
          <td>Access</td>
          <td>xAI API</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="tier-2-open-source-alternatives">Tier 2: Open Source Alternatives</h2>
<p>Open source models have closed the gap substantially. They can be self-hosted on your own hardware or accessed through third-party API providers at 70-98% lower cost than proprietary models.</p>
<table>
  <thead>
      <tr>
          <th>Model</th>
          <th>Provider</th>
          <th>Approx. API Cost</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>GLM-5.1</td>
          <td>Zhipu AI</td>
          <td>$0.30-$1.50/M tokens</td>
          <td>Strong coding + reasoning</td>
      </tr>
      <tr>
          <td>Qwen 3.6 Plus</td>
          <td>Alibaba Cloud</td>
          <td>$0.30-$1.50/M tokens</td>
          <td>Best agentic capabilities in open source</td>
      </tr>
      <tr>
          <td>Mistral Medium 3.5</td>
          <td>Mistral AI</td>
          <td>$0.30-$1.50/M tokens</td>
          <td>EU-based, strong for privacy-sensitive workloads</td>
      </tr>
      <tr>
          <td>Kimi K2.6</td>
          <td>Moonshot AI</td>
          <td>Fraction of proprietary</td>
          <td>Competitive with Opus 4.8 on coding</td>
      </tr>
      <tr>
          <td>MiMo V2.5 Pro</td>
          <td>12Labs</td>
          <td>Fraction of proprietary</td>
          <td>Multimodal capabilities</td>
      </tr>
      <tr>
          <td>MiniMax M3</td>
          <td>MiniMax</td>
          <td>Fraction of proprietary</td>
          <td>Strong long-context performance</td>
      </tr>
  </tbody>
</table>
<p><strong>When to go open source:</strong></p>
<ul>
<li>Your workload is high-volume and predictable — the cost savings compound quickly</li>
<li>You need data privacy and want to self-host</li>
<li>You are outside the US and want to avoid any future export restriction risk</li>
<li>Your team can invest in prompt engineering and model tuning</li>
</ul>
<p><strong>When to stay proprietary:</strong></p>
<ul>
<li>You need frontier-level reasoning for complex agentic tasks</li>
<li>Your team has no ML infrastructure for self-hosting</li>
<li>The 70-98% cost savings are real, but so are the capability gaps on hard benchmarks</li>
</ul>
<hr>
<h2 id="tier-3-developer-tools-claude-code-alternatives">Tier 3: Developer Tools (Claude Code Alternatives)</h2>
<p>If you were using Claude Code with Fable 5, here are the best tool-level alternatives:</p>
<table>
  <thead>
      <tr>
          <th>Tool</th>
          <th>Type</th>
          <th>Best For</th>
          <th>Pricing</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>OpenAI Codex</td>
          <td>App + CLI + VS Code</td>
          <td>Parallel agents, skills, automations, GitHub CI/CD</td>
          <td>$20/mo Pro or API</td>
      </tr>
      <tr>
          <td>Gemini CLI</td>
          <td>Terminal CLI</td>
          <td>Free tier, 2M context, Google Search grounding</td>
          <td>Free (1K req/day)</td>
      </tr>
      <tr>
          <td>Cursor</td>
          <td>IDE</td>
          <td>Background agents, visual diffs, multi-model</td>
          <td>$20/mo Pro</td>
      </tr>
      <tr>
          <td>OpenCode</td>
          <td>App + CLI</td>
          <td>Model flexibility, BYOK, zero markup</td>
          <td>$5-45/mo</td>
      </tr>
      <tr>
          <td>Aider</td>
          <td>CLI</td>
          <td>Budget-friendly, local models via Ollama</td>
          <td>Free (open source)</td>
      </tr>
  </tbody>
</table>
<p>OpenAI Codex is the strongest Claude Code alternative after the Fable 5 ban. It supports parallel agents running on isolated Git worktrees, scheduled automations, and GitHub issue-to-PR integration. If you are migrating a Claude Code-based workflow, Codex is the most feature-complete replacement.</p>
<p>Gemini CLI is the best free option. Its 2M-token context window and Google Search grounding make it useful for research and long-document tasks, and 1,000 free requests per day covers most individual use cases.</p>
<hr>
<h2 id="migration-runbook">Migration Runbook</h2>
<h3 id="if-you-are-a-developer-api-user">If you are a developer (API user):</h3>
<ol>
<li><strong>Replace model identifiers:</strong> Change <code>claude-fable-5</code> to <code>claude-opus-4-8</code> in all API calls. This is the fastest path back to working code.</li>
<li><strong>Evaluate GPT-5.5:</strong> If your workflow depends on Fable 5&rsquo;s coding accuracy, test GPT-5.5. The API is global, the SDK is mature, and SWE-Bench Pro scores slightly exceed Fable 5&rsquo;s.</li>
<li><strong>Consider cost optimization:</strong> If you were paying $10/$50 for Fable 5, Opus 4.8 ($5/$25) saves 50% and Gemini 3.5 Flash ($1.50/M) saves 85% on input tokens. Do not default to the most expensive model for every task.</li>
<li><strong>Implement multi-provider routing:</strong> Use <a href="https://litellm.ai">LiteLLM</a> or a similar abstraction layer so you can swap providers without code changes. The Fable 5 shutdown proved that any model can disappear with zero notice.</li>
<li><strong>Pin model versions:</strong> Do not use <code>latest</code> aliases. Explicit version strings prevent auto-upgrade from pulling in a restricted or deprecated model.</li>
</ol>
<h3 id="if-you-are-an-enterprise-customer">If you are an enterprise customer:</h3>
<ol>
<li><strong>Audit your team&rsquo;s exposure:</strong> Map which team members are foreign nationals. The &ldquo;deemed export&rdquo; rule applies to sharing controlled technology with non-US persons inside the US.</li>
<li><strong>Build a fallback pipeline:</strong> Configure automatic failover from Mythos-class models to Opus-tier or GPT-5.5. Model availability is not guaranteed.</li>
<li><strong>Evaluate Gemini 3.1 Pro for long-context workloads:</strong> At $1.50/M input and 2M tokens, it changes the economics of large-scale document processing.</li>
<li><strong>Monitor restoration progress:</strong> As of June 19, President Trump signaled a softened stance, and Anthropic updated its privacy policy to add government-ID collection — a likely technical step toward US-only restoration. No timeline has been announced.</li>
</ol>
<hr>
<h2 id="faq">FAQ</h2>
<p><strong>Q: Will Claude Fable 5 come back?</strong>
A: Likely yes, but initially US-only. Trump told Axios on June 19 he no longer views Anthropic as a security threat, and Anthropic&rsquo;s updated privacy policy (effective July 8) adds government-ID and biometric data collection — a prerequisite for nationality-based access control. Trading market Kalshi priced roughly 57% probability of restoration before July 1 as of June 18. However, export control negotiations typically move in weeks to months, not days.</p>
<p><strong>Q: Can H-1B visa holders still use Claude?</strong>
A: Yes. Only Fable 5 and Mythos 5 are subject to the export controls. Claude Opus 4.8, Sonnet 4.6, and Haiku 4.5 remain fully available to all users including foreign nationals. If you are on a visa and were using Fable 5, migrate to <code>claude-opus-4-8</code> immediately.</p>
<p><strong>Q: Do VPNs work to access Fable 5?</strong>
A: No. Anthropic&rsquo;s eligibility check is account-based (billing address, payment method, Trust &amp; Safety signals), not IP-based. A VPN gets you to the login screen, not to Fable 5 access. Attempting to circumvent the restriction puts your Anthropic account at risk.</p>
<p><strong>Q: Which alternative is closest to Fable 5&rsquo;s capabilities?</strong>
A: For coding: GPT-5.5 (82.1% vs 80.3% SWE-Bench Pro). For general reasoning and long context: Gemini 3.1 Pro (2M tokens, $1.50/M input). For direct Anthropic compatibility: Claude Opus 4.8 ($5/$25 per M tokens).</p>
<p><strong>Q: Are open source models a viable replacement for production?</strong>
A: For cost-sensitive, high-volume, or privacy-constrained workloads, yes. GLM-5.1 and Qwen 3.6 Plus are within striking distance of Opus 4.8 on coding benchmarks at 70-98% lower cost. For frontier agentic tasks requiring multi-day autonomous reasoning, proprietary models remain ahead.</p>
<p><strong>Q: How should I prepare for future export bans?</strong>
A: Build model-agnostic abstractions now. Use LiteLLM or a provider interface that accepts model identifiers as configuration parameters. Pin explicit version strings. Implement automated fallback pipelines. The Fable 5 shutdown was the first — it will not be the last.</p>
<hr>
<p><em>Last updated: June 21, 2026. Fable 5 and Mythos 5 were banned on June 12, 2026. Restoration prospects are evolving. Check <a href="https://status.anthropic.com">status.anthropic.com</a> for the latest.</em></p>
]]></content:encoded></item></channel></rss>