ProjectDiscovery Neo Review: Nuclei-Based AI Pentest Agent That Found 66 Exploitable Vulnerabilities

ProjectDiscovery Neo is an autonomous AI security engineer that runs real exploit chains, not just detection passes. In a three-application benchmark spanning banking, healthcare, and insurance targets, Neo confirmed 66 exploitable vulnerabilities — the highest count of any tool tested — including 24 findings that no other scanner or agent caught.

What Is ProjectDiscovery Neo? (The Nuclei-Powered AI Security Engineer)

ProjectDiscovery Neo is an autonomous penetration testing platform built on the Nuclei toolchain, designed to behave like a senior security engineer: it plans attack chains, executes exploits, validates impact, and returns proof packs that your team can replay. Unlike traditional scanners that flag potential issues, Neo confirms whether a vulnerability is actually exploitable before reporting it. The platform launched commercially at RSAC 2026 in March after ProjectDiscovery won the RSAC 2025 Innovation Sandbox — the highest-profile pre-launch validation any AI security startup has received. Underneath Neo sits Nuclei, the open-source engine that has completed over 10 billion scans and is maintained by a community of 100,000+ security engineers with 9,000+ YAML templates covering CVEs, misconfigurations, and custom attack patterns. Neo takes this attack-pattern library — which no new AI security startup can replicate overnight — and wraps it inside an agentic loop powered by Claude Opus 4.5, running 30+ agent-native security tools inside isolated sandboxes. The result is a tool that combines breadth (every CVE template Nuclei ships) with depth (multi-step reasoning to chain vulnerabilities into working exploits).

How Neo Works: The Autonomous Pentest Loop Explained

Neo operates on a perceive–plan–act loop that mirrors how a human penetration tester approaches a target, but runs continuously without fatigue. When you give Neo a target — a URL, a repository, or an API spec — it begins with reconnaissance using Nuclei’s fingerprinting templates to map the attack surface. It then formulates a testing strategy, prioritizing high-value attack paths like authentication bypass, injection points, and sensitive data exposure. The agent dispatches exploit attempts using Nuclei templates plus custom code it writes on the fly inside sandboxes, then evaluates whether each attempt succeeded. Critically, if an initial approach fails, Neo dynamically changes strategy — a behavior observed in hands-on testing where the agent tried multiple authentication bypass techniques before finding one that worked. When exploitation succeeds, Neo captures full evidence: HTTP request/response pairs, extracted data, and a replayable proof pack that your security or engineering team can verify independently without needing to re-run the tool. This proof-first architecture is what separates Neo from tools that return CVSS scores without evidence. A finding in Neo’s output comes with a working exploit, not a probability estimate. The isolated sandbox environment means these exploit chains run safely without touching production state, making Neo suitable for continuous testing on live targets.
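The loop described above can be sketched in a few lines of Python. This is only an illustration of the plan, act, evaluate, and fall-back pattern, not Neo's implementation: the `Attempt` and `ProofPack` types, the `run_loop` function, the two simulated techniques, and the target URL are all hypothetical, and nothing here touches the network.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Attempt:
    technique: str
    request: str
    response: str
    succeeded: bool


@dataclass
class ProofPack:
    target: str
    technique: str
    evidence: list[Attempt] = field(default_factory=list)


# A technique takes a target and returns the recorded attempt (hypothetical type).
Technique = Callable[[str], Attempt]


def run_loop(target: str, techniques: dict[str, Technique]) -> Optional[ProofPack]:
    """Try each planned technique in order; stop at the first confirmed exploit."""
    history: list[Attempt] = []
    for name, technique in techniques.items():  # plan: dict order is the priority
        attempt = technique(target)              # act
        history.append(attempt)                  # perceive: record the outcome
        if attempt.succeeded:
            # Only a confirmed exploit becomes a finding; all attempts are kept as evidence.
            return ProofPack(target=target, technique=name, evidence=history)
    return None  # nothing confirmed, so nothing is reported


# Simulated techniques against a hypothetical target (no real requests are sent).
def default_credentials(target: str) -> Attempt:
    return Attempt("default_credentials", f"POST {target}/login admin:admin", "401", False)


def jwt_none_algorithm(target: str) -> Attempt:
    return Attempt("jwt_none_algorithm", f"GET {target}/admin alg=none", "200", True)


pack = run_loop(
    "https://staging.example.test",
    {"default_credentials": default_credentials, "jwt_none_algorithm": jwt_none_algorithm},
)
```

In this toy run the first technique fails, the loop moves on to the second, and the returned proof pack records both attempts, which mirrors the "change strategy, then capture evidence" behavior the review describes.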

Benchmark Results Deep Dive: Breaking Down the 66 Exploitable Vulnerabilities

Neo’s public benchmark covered three full-stack AI-generated applications in banking, healthcare, and insurance, verticals where vulnerability impact maps directly to regulatory and financial risk. The benchmark methodology required full exploit confirmation, not just detection: a finding counted only if Neo could demonstrate actual impact. Neo confirmed 66 exploitable vulnerabilities in total, the highest count of any tool in the comparison set, which included Semgrep, Snyk, CodeQL, and several AI security platforms. The 24 findings no other tool caught are the most commercially significant result: these were real vulnerabilities in production-representative codebases that an entire class of existing tools missed. Among the unique findings were an arbitrary refund vulnerability (a business logic flaw that traditional pattern-matching scanners cannot detect) and password hash exposure through an API endpoint. Traditional static analysis tools like Semgrep and CodeQL excel at known vulnerability classes but struggle with multi-step business logic flaws that require understanding application state across multiple requests. Neo’s agentic approach, which chains reconnaissance → injection → impact verification, is designed precisely for this gap. For security teams running quarterly manual pentests, the implication is clear: Neo can surface entire categories of vulnerabilities between pentests at a fraction of the cost.
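A "findings no other tool caught" figure is a set difference: one tool's confirmed findings minus the union of everyone else's. The sketch below shows that calculation on toy data. The tool names and finding identifiers are made up for illustration and are not the benchmark's actual results.

```python
def unique_findings(results: dict[str, set[str]], tool: str) -> set[str]:
    """Findings reported by `tool` that no other tool in `results` reported."""
    others = set().union(*(found for name, found in results.items() if name != tool))
    return results[tool] - others


# Toy data only: identifiers are illustrative, not taken from the benchmark.
results = {
    "agent": {"sqli-login", "refund-arbitrary-amount", "password-hash-in-api"},
    "sast_a": {"sqli-login", "hardcoded-secret"},
    "sast_b": {"sqli-login"},
}

only_agent = unique_findings(results, "agent")
```

The same function applied to each tool in turn gives a per-tool uniqueness count, which is a more informative comparison than raw totals when tools overlap heavily.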

What the 24 Unique Findings Tell Us

The 24 findings no other tool caught reveal a structural limitation in signature-based security tools. Business logic vulnerabilities — arbitrary price manipulation, privilege escalation through parameter tampering, state confusion attacks — require an attacker to understand how the application works before they can identify what to break. Nuclei templates encode attack patterns, but Neo’s Claude Opus 4.5 backbone adds the reasoning layer that connects those patterns into working multi-step exploits. When Neo encounters a checkout endpoint, it doesn’t just run a price-injection template; it first maps the authentication flow, identifies session handling, tests discount code logic, and then attempts manipulation at each layer. This is closer to how a human tester works than how any scanner works.
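To see why this class of flaw escapes signature matching, consider a toy version of an arbitrary refund bug. Everything below is hypothetical: `ToyShop` is an in-memory stand-in for a payment API, not any application from the benchmark. Each individual call looks well-formed, and the flaw only shows up when the sequence of calls is checked against the resulting state.

```python
class ToyShop:
    """In-memory stand-in for a payment API with a deliberate logic flaw."""

    def __init__(self) -> None:
        self.balance = 0                    # money the shop holds, in cents
        self.orders: dict[int, int] = {}    # order id -> amount paid

    def pay(self, order_id: int, amount: int) -> None:
        self.orders[order_id] = amount
        self.balance += amount

    def refund(self, order_id: int, amount: int) -> bool:
        if order_id not in self.orders:
            return False
        # Flaw: the requested amount is never compared with the amount paid.
        self.balance -= amount
        return True


def confirm_arbitrary_refund(shop: ToyShop) -> bool:
    """Multi-step check: pay a small amount, then request a much larger refund."""
    shop.pay(order_id=1, amount=1_000)
    accepted = shop.refund(order_id=1, amount=50_000)
    # Impact counts as confirmed only if the shop actually lost money.
    return accepted and shop.balance < 0


confirmed = confirm_arbitrary_refund(ToyShop())
```

The check asserts on impact (a negative balance) rather than on the shape of any single request, which is the distinction between confirming exploitability and matching a pattern.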

Real CVE Discoveries: 22 CVEs Across 12 Projects (Including Faraday SSRF)

Neo autonomously discovered 22 confirmed CVEs across 12 open source projects — findings that earned Neo the status of a credited security researcher, not just a scanning tool. The most documented example is an SSRF (Server-Side Request Forgery) vulnerability in Faraday, a widely used HTTP client library in the Ruby ecosystem. Neo identified that Faraday’s URL handling could be manipulated to make the application issue requests to internal network addresses, a class of vulnerability that enables cloud metadata theft, internal service enumeration, and in some architectures, full SSRF-to-RCE escalation. The Faraday finding matters because the library is embedded in tens of thousands of active Ruby deployments — the blast radius of an unpatched SSRF extends far beyond any single application. Neo’s CVE discovery workflow illustrates what autonomous AI pentesting looks like at research quality: it doesn’t stop at confirming the vulnerability exists, it traces the impact chain to understand exploitability and documents findings in a format that satisfies CVE disclosure requirements. For security teams, the CVE track record is a proof point that Neo’s autonomous agent isn’t just running pre-written templates — it’s capable of discovering novel vulnerabilities in widely deployed software. The 22 CVEs across 12 projects were found in software with tens of thousands of active deployments, meaning these weren’t obscure research targets; they were real software that real organizations run in production today.
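For readers less familiar with SSRF, the core question is whether a user-influenced URL can point at an internal address. The sketch below is a generic illustration of that check using Python's standard library. It is not the Faraday issue or its fix, and the function name `targets_internal_address` is made up for this example. It only inspects literal IP addresses; a production guard would also resolve hostnames and re-check after redirects.

```python
import ipaddress
from urllib.parse import urlsplit


def targets_internal_address(url: str) -> bool:
    """Return True if the URL's host is a literal IP in a non-public range."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A hostname rather than an IP literal; a real check would resolve it first.
        return False
    return address.is_private or address.is_loopback or address.is_link_local


# 169.254.169.254 is the link-local address commonly used for cloud metadata services.
metadata_hit = targets_internal_address("http://169.254.169.254/latest/meta-data/")
public_hit = targets_internal_address("http://93.184.216.34/")
```

An SSRF finding is essentially a demonstration that a request like the first one can be triggered through the application when a check of this kind is missing or can be bypassed.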

Why the Faraday SSRF Discovery Matters for AppSec Teams

The Faraday SSRF is a concrete illustration of AI-driven vulnerability research outpacing traditional disclosure timelines. Human researchers typically discover SSRF vulnerabilities during manual code reviews or targeted testing engagements — processes that take days to weeks. Neo identified the Faraday SSRF autonomously as part of a broader scan, without a human researcher specifying “look for SSRF in Ruby HTTP clients.” This unsupervised discovery mode is what makes Neo strategically interesting for organizations that depend on third-party libraries: Neo can continuously scan your dependency graph for newly exploitable patterns without requiring a pentest engagement to be scheduled in advance.

Neo vs. Competitors: Pentera, Horizon3 NodeZero, XBOW, and Burp Suite

Neo competes in the autonomous penetration testing platform category alongside Pentera, Horizon3 NodeZero, XBOW, and Escape. Each tool takes a different architectural approach that shapes its coverage and best-fit use case. Pentera and NodeZero are network-focused autonomous pentest platforms optimized for internal network segmentation testing, lateral movement, and Active Directory attack paths; they excel in enterprise infrastructure environments but are not primarily designed for web application and API testing. Neo, built on Nuclei’s web-focused template library, has deeper coverage for HTTP-layer attacks including OWASP Top 10, business logic flaws, and API security issues. XBOW positions itself as an AI-native web application security tool with a similar proof-based approach to Neo, but lacks the Nuclei ecosystem’s 9,000+ template library and the operational history of more than 10 billion scans behind it. Escape focuses specifically on API security testing with strong GraphQL coverage. Burp Suite remains the professional manual testing standard: it provides the deepest control for a skilled human tester but requires significant expertise and time to match the coverage Neo delivers autonomously. For teams with dedicated security engineers who run manual pentests, Neo is a force multiplier rather than a replacement: it handles continuous scanning and CVE template coverage while human testers focus on complex business logic and architecture-level findings. For teams without in-house security expertise, Neo provides pentest-quality findings without requiring a full-time pentester on staff.

Tool	Focus Area	Proof-Based	Nuclei-Backed	Best For
Neo	Web/API/AppSec	Yes	Yes	Continuous AppSec, CVE coverage
Pentera	Network/Infrastructure	Yes	No	Internal network, AD testing
NodeZero	Network/Infrastructure	Yes	No	Enterprise lateral movement
XBOW	Web/API	Yes	No	AI-native web testing
Escape	API/GraphQL	Partial	No	API-first organizations
Burp Suite	Web/API (manual)	Manual	No	Expert manual testing

Enterprise Features, Pricing Model, and How to Get Access

Neo launched commercially in March 2026 with a usage-based pricing model built around tokens and infrastructure consumption — a departure from the flat per-application pricing that traditional PTaaS vendors like Cobalt, Synack, and Bishop Fox use. Traditional full PTaaS with manual testing typically runs $5,999+ per application per year; Neo’s token-based model allows teams to run continuous testing at significantly lower cost for initial deployments, though costs scale with target complexity and scan frequency. Enterprise features in the Neo platform include replayable proof packs for every confirmed finding, integration with GitHub and GitLab for shift-left testing in CI/CD pipelines, isolated sandbox execution to prevent production impact during live testing, and a dashboard that tracks vulnerability status across multiple target applications over time. The proof pack feature is particularly valuable for compliance-driven organizations: instead of a static PDF pentest report, Neo produces machine-readable evidence that can be attached to a Jira ticket, reviewed in a pull request, or submitted to a compliance auditor. Access to Neo is through ProjectDiscovery’s platform at projectdiscovery.io — the company offers a waitlist and direct sales engagement for enterprise deployments. For teams already using Nuclei open source, Neo represents a cloud-managed upgrade path that adds the agentic reasoning layer without requiring teams to build and maintain their own orchestration infrastructure.
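The comparison between flat and usage-based pricing comes down to simple arithmetic, sketched below. The $5,999 figure is the PTaaS floor quoted above. The per-scan cost is a pure placeholder, since no Neo token rates are given here, and real usage costs would vary with target complexity rather than stay constant per scan.

```python
FLAT_PTAAS_USD = 5_999  # per application per year, the floor quoted in this review


def annual_usage_cost(scans_per_year: int, cost_per_scan_usd: int) -> int:
    """Yearly spend under a usage model, assuming a constant cost per scan."""
    return scans_per_year * cost_per_scan_usd


def break_even_scans(cost_per_scan_usd: int, flat_usd: int = FLAT_PTAAS_USD) -> int:
    """Largest number of scans per year that still costs no more than the flat fee."""
    return flat_usd // cost_per_scan_usd


PLACEHOLDER_COST = 40  # assumed USD per scan, NOT a published Neo price

weekly = annual_usage_cost(52, PLACEHOLDER_COST)    # one scan per week
daily = annual_usage_cost(365, PLACEHOLDER_COST)    # one scan per day
limit = break_even_scans(PLACEHOLDER_COST)
```

Under this placeholder, weekly scanning stays well below the flat fee while daily scanning exceeds it. Plugging in measured per-scan costs from a pilot gives the actual break-even point for a given application.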

Shift-Left Security: Running Neo in CI/CD

Neo’s GitHub and GitLab integration enables a security testing pattern that traditional pentesting cannot support: testing every pull request against a staging environment before code merges. A typical CI/CD security gate with Neo runs in minutes — Nuclei’s parallel template execution means that a 9,000-template scan completes far faster than a human tester could review the same code. When Neo finds a confirmed vulnerability in a PR, it creates a finding with full evidence, blocking the merge until the issue is resolved. This shifts security left in the software delivery lifecycle — catching exploitable vulnerabilities at the point where they’re cheapest to fix, before they reach production.
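A merge gate of this kind usually reduces to one decision: exit non-zero when a confirmed finding meets a severity threshold. The sketch below shows that decision in isolation. The JSON shape (`id`, `severity`, `confirmed`) and the function name `gate_exit_code` are assumptions for illustration, not Neo's export format or integration API.

```python
import json

SEVERITY_RANK = {"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}


def gate_exit_code(findings_json: str, threshold: str = "high") -> int:
    """Return 1 (block the merge) if any confirmed finding meets the threshold."""
    findings = json.loads(findings_json)
    blocking = [
        f for f in findings
        if f.get("confirmed") and SEVERITY_RANK[f["severity"]] >= SEVERITY_RANK[threshold]
    ]
    for finding in blocking:
        print(f"BLOCKING: {finding['id']} ({finding['severity']})")
    return 1 if blocking else 0


# Hypothetical scan output for a pull request.
sample = json.dumps([
    {"id": "idor-invoice", "severity": "high", "confirmed": True},
    {"id": "verbose-banner", "severity": "info", "confirmed": True},
    {"id": "possible-xss", "severity": "high", "confirmed": False},
])

exit_code = gate_exit_code(sample)
```

In a pipeline, a wrapper script would pass this value to `sys.exit`, so the CI job fails only on confirmed findings at or above the chosen severity and unconfirmed or informational results do not block merges.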

Limitations: Where Neo Falls Short and What to Watch

Neo’s documented limitations matter for enterprise buyers evaluating deployment at scale. Hands-on testing has shown that Neo excels on single targets and small-to-medium assessments but may struggle with environments containing 200+ servers or resource-heavy infrastructure targets. This isn’t surprising for a tool designed around web application and API security — Nuclei’s template architecture is optimized for HTTP-layer testing, not infrastructure-scale network scanning. The token-based pricing model, while flexible, creates cost unpredictability for teams running continuous testing at high frequency across large application portfolios. Unlike flat-rate PTaaS pricing, token costs scale with scan complexity and depth, which means a thorough test of a complex application with many endpoints and authentication states will cost more than a shallow scan of a simple API. AI reasoning errors are a third limitation to monitor: while Claude Opus 4.5 provides strong multi-step reasoning, no LLM-based system eliminates false positives entirely. Neo’s proof-based approach reduces false positives significantly compared to signature scanners, but the requirement to review proof packs for each finding adds analyst time to the workflow. Finally, Neo’s coverage is strongest for web applications and APIs — organizations with significant mobile, desktop, or network infrastructure security requirements will still need specialized tools to complement Neo’s coverage.

When Not to Choose Neo

Neo is not the right tool for network penetration testing, Active Directory security assessments, or physical security evaluations. If your primary security concern is lateral movement inside a corporate network or privilege escalation in Windows domains, Pentera or NodeZero will provide better coverage. Neo is also not a compliance scanner — it doesn’t produce PCI DSS, SOC 2, or ISO 27001 compliance reports. If your team needs compliance-mapped findings rather than raw vulnerability evidence, you’ll need to map Neo’s output to compliance frameworks manually or use a dedicated GRC platform alongside it.

Should Your Team Use Neo? Verdict and Recommendations

Neo is the most compelling AI pentesting platform for web application and API security in 2026. The benchmark result — 66 confirmed exploitable vulnerabilities including 24 findings no other tool caught — is not a marketing claim; it reflects a structural advantage that the Nuclei ecosystem provides over tools built from scratch. The 22 autonomous CVE discoveries demonstrate that Neo is operating at research quality, not just scan quality. For development teams that ship web applications or APIs, Neo provides a continuous security testing capability that would otherwise require either a full-time security engineer or expensive quarterly pentests. The CI/CD integration makes it practical to adopt a shift-left security posture without restructuring engineering workflows. For security teams running existing manual pentest programs, Neo is a force multiplier: deploy it for continuous coverage between scheduled engagements, and redirect human testers toward architecture review and complex business logic testing where AI agents still struggle. The primary caveats are cost predictability for large-scale deployments and the 200+ server limitation for infrastructure-heavy environments. Teams should pilot Neo on a representative subset of their application portfolio before committing to full continuous coverage, to calibrate token costs against finding value before scaling.

FAQ

What makes ProjectDiscovery Neo different from traditional vulnerability scanners like Nessus or Qualys?

Traditional scanners like Nessus and Qualys detect the presence of known vulnerability signatures: they tell you a vulnerability might exist. Neo confirms that a vulnerability is actually exploitable by running real exploit chains and returning proof packs with full evidence. This proof-first approach reduces the high false-positive rates that slow down remediation workflows in traditional scanning programs.

Does Neo replace manual penetration testing?

Neo automates the coverage layer of penetration testing — running known vulnerability patterns, CVE templates, and exploit chains continuously and at scale. It does not replace the judgment of a skilled human tester for architecture review, complex business logic attacks, or social engineering assessments. Most security teams use Neo to extend continuous coverage between scheduled manual pentests, not to eliminate the manual engagement entirely.

What programming languages and frameworks does Neo support?

Neo’s testing surface is application behavior at the HTTP/API layer, which means it is language- and framework-agnostic. Whether your application is built in Python, Java, Ruby, Node.js, or Go, Neo tests the deployed endpoints rather than the source code. For source code analysis, Nuclei also supports code-scanning templates, but Neo’s primary strength is black-box and grey-box testing of deployed web applications and APIs.

How does Neo handle false positives?

Neo’s proof-based architecture is specifically designed to minimize false positives. A finding appears in Neo’s output only when the agent has confirmed exploitability — not just detected a potential issue. In practice, this means Neo’s finding lists are shorter than those from traditional scanners but significantly higher in actionability. Security teams report that nearly all Neo findings require remediation, compared to the 30-50% false positive rates common in signature-based scanners.

What is Neo’s pricing and how does it compare to PTaaS vendors?

Neo uses a token-based pricing model tied to scan complexity and infrastructure consumption, rather than flat per-application fees. Traditional PTaaS with manual testing typically starts at $5,999+ per application per year. Neo’s token model is generally more cost-effective for teams running continuous testing at moderate frequency, but costs scale with target complexity. ProjectDiscovery offers enterprise pricing through direct sales for organizations running large application portfolios.
