Engineering Benchmarks

LinearB’s 2026 Software Engineering Benchmarks Report analyzed 8.1 million pull requests from 4,800+ organizations across 42 countries and found a clear, alarming pattern: agentic AI PRs wait 5.3x longer for review than unassisted human PRs. AI tools generate code faster, but review capacity has not kept pace, creating a bottleneck that erases most of the speed gains.

What the LinearB 2026 Benchmarks Actually Measured (8.1M PRs, 4,800 Orgs)

The LinearB 2026 Software Engineering Benchmarks Report is one of the largest empirical studies of engineering team performance published this year. It draws on 8.1 million pull requests submitted between January and December 2025 from 4,800 organizations in 42 countries, spanning startups to Fortune 500 enterprises. The report tracks 20 distinct metrics across the entire software delivery lifecycle, and introduces 3 new AI-specific metrics to address the gap left by traditional DORA measurements. These new metrics capture PR Pickup Time by code origin (AI-generated, AI-assisted, or unassisted), code quality scores per PR type, and acceptance rates segmented by generation method.

The dataset is large enough to establish statistically significant benchmarks at the 25th, 50th, and 75th percentile tiers, which LinearB labels Developing, Core, and Elite. The 2026 edition is the first to reveal that AI origin of a PR is now the single most predictive variable for PR Pickup Time: more predictive than team size, tech stack, or deployment frequency. ...
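
To make the segmentation above concrete, here is a minimal sketch of how a team might compute PR Pickup Time percentiles by code origin from its own data. It is not LinearB's methodology: the record layout, the origin labels, the function names, and every timestamp are made up for illustration, and pickup time is assumed here to mean the hours between a PR being opened and its first review.

```python
from collections import defaultdict
from datetime import datetime
from statistics import quantiles

# Hypothetical records: (origin, opened_at, first_review_at).
# All values are invented for illustration, not taken from the report.
SAMPLE_PRS = [
    ("unassisted", "2025-03-03T09:00", "2025-03-03T10:00"),
    ("unassisted", "2025-03-03T09:00", "2025-03-03T11:00"),
    ("unassisted", "2025-03-03T09:00", "2025-03-03T12:00"),
    ("unassisted", "2025-03-03T09:00", "2025-03-03T13:00"),
    ("unassisted", "2025-03-03T09:00", "2025-03-03T14:00"),
    ("agentic", "2025-03-03T09:00", "2025-03-03T14:00"),
    ("agentic", "2025-03-03T09:00", "2025-03-03T19:00"),
    ("agentic", "2025-03-03T09:00", "2025-03-04T00:00"),
    ("agentic", "2025-03-03T09:00", "2025-03-04T05:00"),
    ("agentic", "2025-03-03T09:00", "2025-03-04T10:00"),
]


def pickup_hours(opened_at: str, first_review_at: str) -> float:
    """Hours between a PR being opened and its first review (assumed definition)."""
    opened = datetime.fromisoformat(opened_at)
    reviewed = datetime.fromisoformat(first_review_at)
    return (reviewed - opened).total_seconds() / 3600


def pickup_percentiles(prs):
    """Group pickup times by origin and return the 25th/50th/75th percentiles."""
    by_origin = defaultdict(list)
    for origin, opened_at, first_review_at in prs:
        by_origin[origin].append(pickup_hours(opened_at, first_review_at))

    result = {}
    for origin, hours in by_origin.items():
        # quantiles() needs at least two data points per group.
        p25, p50, p75 = quantiles(hours, n=4, method="inclusive")
        result[origin] = {"p25": p25, "p50": p50, "p75": p75}
    return result


if __name__ == "__main__":
    for origin, stats in pickup_percentiles(SAMPLE_PRS).items():
        print(origin, stats)
```

Comparing the `p50` values of two origins gives one possible "waits N times longer" ratio. Whether the report's 5.3x figure compares medians, means, or another statistic is not stated in the text above, so treat any such ratio from this sketch as a local measurement rather than a reproduction of the benchmark.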