
Qwen 3 Full Model Lineup Guide 2026: 0.6B to 235B with Dual-Mode Thinking
Qwen 3 is Alibaba's open-source LLM family, first released in 2025, spanning six dense models (0.6B to 32B) and two MoE models (30B-A3B, 235B-A22B). All models run in both thinking and non-thinking modes, are licensed Apache 2.0, and were trained on 36 trillion tokens across 119 languages.

What Is Qwen 3? Alibaba's Biggest Open-Source LLM Family Yet

Qwen 3 is a family of open-weight large language models developed by Alibaba's Qwen team, spanning ultra-lightweight 0.6B edge models up to the 235B-parameter MoE flagship that competes head-to-head with GPT-4o and Gemini 2.5 Pro. Unlike previous generations, which separated chat models from reasoning models, every Qwen 3 model ships with a built-in dual-mode thinking system: flip a soft switch in your prompt and the same model either engages in deep chain-of-thought reasoning or returns a fast response like a traditional assistant.

Trained on 36 trillion tokens across 119 languages and dialects (up from 29 in Qwen 2.5), the family covers code, math, STEM reasoning, and multilingual tasks under a single Apache 2.0 license. The flagship Qwen3-235B-A22B scores 95.6 on ArenaHard and 2056 Elo on CodeForces, outperforming DeepSeek-R1 on 17 of 23 benchmarks. For developers, this is the first open-source family in which one model can genuinely replace both a reasoning specialist and a general-purpose chat model. ...
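The "soft switch" can be driven two ways: a template-level toggle (the `enable_thinking` argument Qwen's chat template accepts via `tokenizer.apply_chat_template` in Hugging Face transformers) or a per-turn tag appended to the user message. The helper below is a minimal sketch of the per-turn form; the `/think` and `/no_think` tags follow Qwen's published convention, but the function name and payload shape are illustrative assumptions, so check the model card for your exact checkpoint.

```python
def with_thinking_switch(user_message: str, thinking: bool) -> str:
    """Append Qwen 3's per-turn soft switch tag to a user message.

    /think requests deep chain-of-thought reasoning for this turn;
    /no_think requests a fast, direct answer. The per-turn tag
    overrides the template-level enable_thinking default.
    """
    tag = "/think" if thinking else "/no_think"
    return f"{user_message.rstrip()} {tag}"


# Build a chat payload in the OpenAI-compatible format that servers
# such as vLLM, SGLang, or Ollama expose for Qwen 3 (illustrative).
messages = [
    {
        "role": "user",
        "content": with_thinking_switch(
            "Prove that the square root of 2 is irrational.",
            thinking=True,
        ),
    },
]
```

To disable reasoning globally instead of per turn, pass `enable_thinking=False` to `apply_chat_template` when formatting the prompt; the per-turn tags then let individual messages opt back in or out.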