Image Generation

FLUX.1 is a 12-billion parameter rectified flow transformer from Black Forest Labs that outperforms Stable Diffusion XL on photorealism, text rendering, and prompt adherence, with the [schnell] variant available under Apache 2.0 for commercial use. This guide covers everything you need to integrate, fine-tune, and deploy FLUX.1 in production.

What Is FLUX.1? Architecture and Why It Dominates Open-Source Image Generation

Black Forest Labs released FLUX.1 in August 2024. The company was founded by the original Stable Diffusion researchers after they left Stability AI. Unlike earlier diffusion models built on a UNet backbone, FLUX.1 uses a transformer-based architecture with bidirectional attention across text and image tokens simultaneously, which enables better prompt adherence and more coherent multi-subject compositions.

The model achieves state-of-the-art Elo scores on image quality leaderboards, beating Midjourney v6 and DALL-E 3 in independent benchmarks for photorealism, anatomical accuracy, and typographic rendering. Black Forest Labs released FLUX.1 [schnell] under the Apache 2.0 license, making it the only open-weight tier that permits commercial use, while [dev] uses a non-commercial research license. By October 2025, MLCommons had added FLUX.1 as an official training benchmark in MLPerf, signaling its industrial adoption.

The architecture's key innovation is its hybrid multimodal attention, which lets the model learn the correlation between image patches and text tokens jointly rather than conditioning image generation on a fixed text embedding. This translates to better multi-subject scene generation and reliable text-in-image rendering, both of which previous open-source models struggled with. ...
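The joint attention idea described above can be illustrated with a toy sketch. This is not FLUX.1's actual implementation: it assumes a single attention head, uses the token vectors directly as queries, keys, and values (no learned projections, no positional encoding), and the function name `joint_attention` and the sample vectors are hypothetical. It only shows the core point that text and image tokens share one sequence, so every token can attend to every other token in both directions.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(text_tokens, image_tokens):
    """Single-head self-attention over the concatenated text + image sequence.

    Toy assumption: queries, keys, and values are the raw token vectors.
    Returns the attended outputs and the full attention weight matrix.
    """
    tokens = text_tokens + image_tokens  # one shared sequence
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, weights = [], []
    for q in tokens:
        # Scaled dot-product score of this token against every token
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of all token vectors (text and image alike)
        outputs.append(
            [sum(w[j] * tokens[j][d] for j in range(len(tokens))) for d in range(dim)]
        )
    return outputs, weights


# Hypothetical 2-dimensional embeddings: 2 text tokens, 3 image patches
text = [[1.0, 0.0], [0.0, 1.0]]
image = [[1.0, 1.0], [0.5, -0.5], [-1.0, 0.0]]

outputs, weights = joint_attention(text, image)
n_text = len(text)

# Share of the first image patch's attention that lands on text tokens
text_mass = sum(weights[n_text][:n_text])
print(round(text_mass, 3))
```

Because the weight matrix covers the full concatenated sequence, image patches draw on text tokens and text tokens draw on image patches in the same pass. A model conditioned on a fixed text embedding has no equivalent path for the text representation to be updated by the image.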