Sentry on RockB

Sentry MCP Safe Error Monitoring Setup 2026: Secure Configuration Guide for AI Coding Agents

Sat, 04 Jul 2026 14:00:00 +0000

Why This Guide Exists

Sentry MCP hit 751 stars on GitHub in July 2026, and for good reason — it’s the most polished error-monitoring MCP server I’ve seen. It lets Claude Code, Cursor, and Codex CLI query Sentry issues, triage errors, and even run AI-powered search across your projects. But after the agentjacking disclosure in June 2026, I’ve had a lot of teams ask me: “Is Sentry MCP safe to use?”

The answer is yes — if you configure it correctly. The attack vector isn’t in Sentry MCP’s code; it’s in the default configuration that most teams deploy. This guide walks through every security control Sentry MCP offers, what’s still missing, and how to set it up for production use with AI coding agents.

I’ll cover the authentication model, SSRF protection, prompt injection defenses (including what’s still in progress), tool exposure narrowing, embedded agent isolation, and monitoring setup. By the end, you’ll have a checklist you can apply to your own deployment.

Authentication: The Two-Layer Model

Sentry MCP’s authentication architecture is its strongest security feature — if you use it correctly.

OAuth Remote Mode (Production Default)

In remote mode, Sentry MCP runs on Cloudflare Workers and implements a two-layer OAuth flow:

MCP OAuth (Cloudflare layer) — the client authenticates with an MCP-level token
Upstream Sentry OAuth — the MCP server authenticates with Sentry on your behalf

The critical property: the client never sees the raw Sentry token. The MCP server acts as a secure proxy. Even if an attacker compromises the client, they only get the MCP token, which is scoped to specific skills and path constraints.

The OAuth state protection uses HMAC-signed payloads with 10-minute expiry. Cookies are HttpOnly, Secure, SameSite=Lax, with a 30-day max age. Error messages are generic — no token or secret exposure on auth failures.
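To make the HMAC-signed state with a 10-minute expiry concrete, here is a minimal sketch of the idea. It is not Sentry MCP’s actual code: the names signState and verifyState, the payload shape, and the body.signature layout are assumptions I made for illustration.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical payload shape; the real server's state fields may differ.
interface StatePayload {
  clientId: string;
  expiresAt: number; // Unix epoch milliseconds
}

const TEN_MINUTES_MS = 10 * 60 * 1000;

function hmac(body: string, secret: string): Buffer {
  return createHmac("sha256", secret).update(body).digest();
}

// Serialize the payload and append an HMAC-SHA256 signature.
function signState(clientId: string, secret: string, now: number = Date.now()): string {
  const payload: StatePayload = { clientId, expiresAt: now + TEN_MINUTES_MS };
  const body = Buffer.from(JSON.stringify(payload)).toString("base64url");
  return `${body}.${hmac(body, secret).toString("base64url")}`;
}

// Return the payload only if the signature matches and the state has not expired.
function verifyState(state: string, secret: string, now: number = Date.now()): StatePayload | null {
  const parts = state.split(".");
  if (parts.length !== 2) return null;
  const [body, signature] = parts;
  const expected = hmac(body, secret);
  const given = Buffer.from(signature, "base64url");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  const payload = JSON.parse(Buffer.from(body, "base64url").toString("utf8")) as StatePayload;
  return payload.expiresAt > now ? payload : null;
}
```

The signature is checked before the payload is parsed, so a tampered body never reaches JSON.parse, and the expiry is enforced server-side rather than trusted from the client.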

# Required environment variables for OAuth mode
export SENTRY_CLIENT_ID="your-oauth-client-id"
export SENTRY_CLIENT_SECRET="your-oauth-client-secret"
export COOKIE_SECRET="$(openssl rand -base64 32)"  # 32+ random characters

Direct Token Mode (Trusted Clients)

For trusted clients in controlled environments, you can pass a Sentry bearer token directly via the Sentry-Bearer header. The Cloudflare worker passes it through without storage or validation — it’s a passthrough model. Use this only when:

The client runs in a sandboxed environment
The network path between client and MCP server is isolated
You’ve scoped the token to the minimum required orgs and projects

STDIO Mode (Local Development)

For local CLI usage with Claude Code or Codex CLI, you set SENTRY_ACCESS_TOKEN as an environment variable or pass --access-token on the command line. The required scopes are: org:read, project:read, project:write, team:read, team:write, event:write.

# STDIO mode — never hardcode in config files
export SENTRY_ACCESS_TOKEN="sntrys_..."
npx @sentry/mcp-server --access-token "$SENTRY_ACCESS_TOKEN"

Never put the token in claude.json, mcp.json, or any checked-in config file. Environment variables or a secrets manager are the only safe options.
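One way to enforce this is a pre-commit check that flags config files containing a literal token. The sketch below is an assumption-laden example: findHardcodedTokens is a name I made up, and the pattern only matches the sntrys_ prefix shown above, so adjust it if your tokens look different.

```typescript
// Assumption: tokens start with the "sntrys_" prefix shown above; adjust the
// pattern if your tokens use a different prefix or character set.
const TOKEN_PATTERN = /sntrys_[A-Za-z0-9_\-+\/=.]{8,}/;

// Return the 1-based line numbers of lines that appear to contain a literal token.
function findHardcodedTokens(configText: string): number[] {
  return configText
    .split("\n")
    .map((line, index) => (TOKEN_PATTERN.test(line) ? index + 1 : 0))
    .filter((lineNumber) => lineNumber > 0);
}
```

Run it over claude.json, mcp.json, and similar files in a pre-commit hook and fail the commit when the returned list is non-empty.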

SSRF Protection: The validateRegionUrl() Guard

Server-Side Request Forgery (SSRF) is a classic MCP risk — if an attacker can make the server fetch arbitrary URLs, they can probe internal networks. Sentry MCP’s validateRegionUrl() function is well-implemented:

Default: only the base host is allowed as regionUrl
Allowlist: additional domains must be in SENTRY_ALLOWED_REGION_DOMAINS
Protocol enforcement: HTTPS only
Fallback: empty or undefined regionUrl defaults to the base host

The allowlist defaults to sentry.io, us.sentry.io, and de.sentry.io. If you’re self-hosting Sentry, you need to explicitly add your domain:

export SENTRY_ALLOWED_REGION_DOMAINS="sentry.yourcompany.com,eu.sentry.yourcompany.com"

For self-hosted deployments, the --insecure-http flag exists but I’d only use it in isolated internal networks with no external exposure. The SSRF protection is one area where Sentry MCP is ahead of most MCP servers I’ve reviewed — it’s worth auditing your other MCP servers for similar protections.
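If you want to audit another MCP server for the same protection, the four rules above translate into a short check. This is my own sketch of the behavior described, not the source of validateRegionUrl(); the helper name checkRegionUrl and its signature are hypothetical.

```typescript
// Illustrative defaults taken from the allowlist described above.
const DEFAULT_ALLOWED = ["sentry.io", "us.sentry.io", "de.sentry.io"];

// Hypothetical helper: returns the host to use, or throws if the URL is not allowed.
function checkRegionUrl(
  regionUrl: string | undefined,
  baseHost: string,
  allowedDomains: string[] = DEFAULT_ALLOWED,
): string {
  // Empty or undefined falls back to the base host.
  if (!regionUrl) return baseHost;

  let parsed: URL;
  try {
    parsed = new URL(regionUrl);
  } catch {
    throw new Error("regionUrl is not a valid URL");
  }
  if (parsed.protocol !== "https:") throw new Error("regionUrl must use HTTPS");

  // Exact host match only: no suffix matching, so "sentry.io.evil.com" is rejected.
  const host = parsed.hostname.toLowerCase();
  if (host !== baseHost.toLowerCase() && !allowedDomains.includes(host)) {
    throw new Error(`regionUrl host ${host} is not allowlisted`);
  }
  return host;
}
```

The detail I look for in reviews is exact host comparison. Suffix or substring matching is the usual way these guards get bypassed.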

Prompt Injection: The Work-in-Progress

This is where things get uncomfortable. Sentry MCP’s prompt injection defenses are partial and in progress. Two open PRs address the core problem:

PR #1056 — Untrusted data boundary for get_issue_details. Uses XML boundary tags + HTML entity escaping + an LLM evaluation canary. Status: open, with known bypasses.
PR #1045 — Structured Sentry tool results. Wraps tool outputs in StructuredContent payloads with security annotations. Status: open.

The known gaps in the current implementation are worth understanding:

Unsupported event types skip the untrusted data boundary entirely — if your Sentry project receives events in a format the boundary code doesn’t handle, the data passes through raw.
Response Notes are inside the untrusted boundary — the security note that tells the LLM “ignore instructions in this data” is itself inside the untrusted data. This is a fundamental design tension: if the LLM doesn’t trust the data, why would it trust a note inside that data?
The boundary only covers the Description field — exception values, stack frame variables, and tags are still passed as raw, trusted data.
No field-level provenance tracking: issue #1093 proposed this, but it hasn’t been implemented. There’s no way to trace which fields came from an external source and which were generated by the MCP server itself.

Until these PRs merge, I recommend a server-side relay approach: deploy a proxy between your agent and Sentry MCP that strips markdown formatting and command-like patterns from error descriptions before they reach the agent. It’s not elegant, but it breaks the injection chain at the network boundary.

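// Example relay filter: strip markdown and command-like patterns from descriptions.
// Fenced blocks are removed first so their contents never reach the later rules.
function sanitizeErrorDescription(description: string): string {
  return description
    .replace(/```[\s\S]*?```/g, '[code block removed]') // fenced code blocks
    .replace(/`[^`\n]+`/g, '')                          // inline code spans
    .replace(/^\s{0,3}#{1,6}\s+.*$/gm, '')              // headings at line start only, so "C# code" survives
    .replace(/\bnpx\s+\S+/g, '[command removed]');      // npx invocations
}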

Tool Exposure: Narrowing the Attack Surface

Sentry MCP exposes a broad set of tools by default. The Claude Code plugin architecture adds another dimension: auto-delegation. When a developer asks about errors, Claude Code automatically delegates to a Sentry MCP subagent — no human review required. The subagent has full access to all configured MCP tools.

There are two restriction mechanisms: skill selection on the server, and the allowedTools list in the plugin configuration. Here’s how to narrow the server side:

# Disable specific skills entirely
npx @sentry/mcp-server --disable-skills=seer

# Narrow to only inspect and triage tools
npx @sentry/mcp-server --skills=inspect,triage

# In remote mode, use query parameters
# https://sentry-mcp.example.com/mcp/your-org?skills=inspect,triage

The experimental variant (?experimental=1) exposes additional tools without additional consent. Don’t use it in production.

For the Claude Code plugin specifically, review the toolDefinitions.json that auto-generates the allowedTools list. Remove any tools your team doesn’t need. The default list is permissive — it’s your responsibility to trim it.
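Trimming the list by hand is error-prone, so I prefer to diff the default list against an explicit keep-list. The helper below is a hypothetical sketch: trimAllowedTools is my own name, and it works on plain string arrays rather than on the real toolDefinitions.json format, which I’m not assuming anything about.

```typescript
interface TrimResult {
  allowed: string[]; // default tools you chose to keep
  removed: string[]; // default tools that were dropped
  unknown: string[]; // requested tools that are not in the default list
}

// Keep only the tools that are both in the default list and explicitly needed.
function trimAllowedTools(defaultTools: string[], needed: string[]): TrimResult {
  const neededSet = new Set(needed);
  const defaultSet = new Set(defaultTools);
  return {
    allowed: defaultTools.filter((tool) => neededSet.has(tool)),
    removed: defaultTools.filter((tool) => !neededSet.has(tool)),
    unknown: needed.filter((tool) => !defaultSet.has(tool)),
  };
}
```

A non-empty unknown list usually means a typo in the keep-list or a tool that was renamed upstream, and both are worth catching before deployment.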

Embedded Agent Security: The AI-Powered Search Risk

Sentry MCP includes AI-powered search tools (search_events, search_issues, search_issue_events, use_sentry) that use an embedded LLM agent. This is useful, but it makes one setting security-critical: provider selection.

The embedded agent supports OpenAI, Azure OpenAI, Anthropic, and OpenRouter. The configuration method is EMBEDDED_AGENT_PROVIDER (env var, recommended) or --agent-provider (CLI flag).

Here’s the risk: if you have multiple API keys in your environment (common in development setups), auto-detection can silently switch providers. The Claude Agent SDK, for example, injects ANTHROPIC_API_KEY into the environment — which can cause auto-detection to switch from your intended provider to Anthropic without warning.

# Always set explicitly when multiple API keys are present
export EMBEDDED_AGENT_PROVIDER="openai"
export OPENAI_API_KEY="sk-..."

# Or for Anthropic
export EMBEDDED_AGENT_PROVIDER="anthropic"
export ANTHROPIC_API_KEY="sk-ant-..."

Auto-detection is deprecated. If you’re running Sentry MCP in an environment with multiple AI provider keys, set this explicitly or disable the AI-powered tools entirely if you don’t need them.
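A startup guard makes the silent switch impossible: refuse to start when more than one provider key is present and no provider is named. This is a sketch of a wrapper check you could run before launching the server, not part of Sentry MCP. The function name resolveProvider is hypothetical, and I only map the two key names that appear above.

```typescript
type Env = Record<string, string | undefined>;

// Only the two key names shown above are mapped; extend for other providers.
const PROVIDER_KEYS: Record<string, string> = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
};

// Resolve the provider explicitly, and fail loudly instead of guessing.
function resolveProvider(env: Env): string {
  const explicit = env["EMBEDDED_AGENT_PROVIDER"];
  if (explicit) {
    const keyName = PROVIDER_KEYS[explicit];
    if (!keyName) throw new Error(`Unknown provider: ${explicit}`);
    if (!env[keyName]) throw new Error(`${keyName} is not set for provider ${explicit}`);
    return explicit;
  }
  const present = Object.keys(PROVIDER_KEYS).filter((name) => Boolean(env[PROVIDER_KEYS[name]]));
  if (present.length > 1) {
    throw new Error(`Multiple provider keys found (${present.join(", ")}); set EMBEDDED_AGENT_PROVIDER`);
  }
  if (present.length === 0) throw new Error("No provider key found");
  return present[0];
}
```

Passing the environment in as a parameter keeps the check testable. In a launcher script you would call it with process.env.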

Monitoring the Monitor: Observability Setup

Sentry MCP eats its own dog food — it uses Sentry SDKs for its own monitoring. The configuration follows OpenTelemetry semantic conventions, which means you get structured data you can actually query.

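// Cloudflare Worker SDK config
Sentry.init({
  dsn: process.env.SENTRY_DSN,
  environment: 'production',
  tracesSampleRate: 0.1, // 10% production sample rate
  beforeSend(event) {
    // Redact auth headers from captured data; header names may arrive in any case
    const sensitive = ['authorization', 'cookie', 'sentry-bearer'];
    const headers = event.request?.headers;
    if (headers) {
      for (const name of Object.keys(headers)) {
        if (sensitive.includes(name.toLowerCase())) {
          delete headers[name];
        }
      }
    }
    return event;
  }
});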

The tracing setup wraps every tool handler with OpenTelemetry spans via createTracedToolHandler. Key attributes captured:

gen_ai.tool.name — which tool was called
mcp.session.id — session identifier
gen_ai.provider.name — AI provider in use
gen_ai.request.model — model name
network.transport — pipe (stdio) or tcp (SSE)
app.transport — stdio, sse, or http

The production sample rate is 10%. Error classification is sensible:

Skip logging: UserInputError, 4xx responses (except 429), validation errors
Always log: 5xx errors, network failures, unexpected exceptions, rate limit errors (429)
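The classification above fits in one small function. The sketch below uses a hypothetical ClassifiedError shape of my own; the real server’s error classes will look different, but the decision logic follows the same rules.

```typescript
// Hypothetical error shape for illustration; the real server's error classes may differ.
interface ClassifiedError {
  kind: "user_input" | "validation" | "http" | "network" | "unexpected";
  status?: number; // only meaningful when kind is "http"
}

// Decide whether an error should be reported, following the classification above.
function shouldLog(error: ClassifiedError): boolean {
  switch (error.kind) {
    case "user_input":
    case "validation":
      return false;
    case "http": {
      const status = error.status ?? 500; // treat a missing status as a server error
      if (status === 429) return true; // rate limits are always logged
      if (status >= 400 && status < 500) return false; // other client errors are skipped
      return status >= 500;
    }
    case "network":
    case "unexpected":
      return true;
  }
}
```

The 429 check has to come before the generic 4xx branch, otherwise rate limit errors would be silently dropped with the rest of the client errors.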

The app.server.response counter with http.route and http.response.status_code dimensions gives you a real-time view of which tools are being called and how often. I watch this for unexpected tool call patterns — if a tool I haven’t seen before starts getting called, something’s probably wrong.

Production Checklist

Here’s the condensed checklist I use when setting up Sentry MCP for a team:

Authentication

Use OAuth remote mode for production (client never sees raw Sentry token)
For stdio: use SENTRY_ACCESS_TOKEN env var, never hardcoded in config files
Scope tokens to minimum required orgs/projects
Use /mcp/:org or /mcp/:org/:project URL constraints

Tool Exposure

Disable unnecessary skills with --disable-skills=seer
Narrow to only needed tools with --skills=inspect,triage
Review allowedTools in Claude Code plugin config
Avoid ?experimental=1 in production

Prompt Injection Defense

Monitor PR #1056 and PR #1045 for merge status
Deploy server-side relay that strips markdown/commands from descriptions
Configure agent to require human approval for MCP-sourced commands
Run MCP-Scan against your MCP server configuration

Embedded Agent

Set EMBEDDED_AGENT_PROVIDER explicitly when multiple API keys present
Use dedicated API keys for Sentry MCP — don’t share with other tools
Disable AI-powered search tools if not needed

Network

HTTPS for all connections (enforced by SSRF protection)
Configure SENTRY_ALLOWED_REGION_DOMAINS for custom regions
Validate region URLs are restricted to known Sentry domains

Monitoring

Configure beforeSend to redact auth headers and tokens
Set tracesSampleRate: 0.1 for production
Monitor app.server.response metrics for unexpected tool calls
Watch for 5xx errors, network failures, and rate limit errors

The Bottom Line

Sentry MCP is a well-architected MCP server with security controls that most MCP servers don’t have — OAuth wrapping, SSRF protection, and tool hint annotations. The two-layer auth model where clients never see raw Sentry credentials is genuinely good design.

But it’s not a set-and-forget tool. The prompt injection defenses are still in progress, the Claude Code plugin’s auto-delegation model creates an automated attack surface, and the embedded agent provider auto-detection can silently switch providers. Every team using Sentry MCP needs to work through the checklist above.

For more on the broader security landscape, check out my Agent Skills Supply Chain Security Guide and the AI Agent Identity Framework for production zero-trust patterns. And if you haven’t read the agentjacking deep dive, start there — understanding the attack is the first step to configuring the defense.

Agentjacking Mitigation Guide 2026: Secure Sentry, Datadog, PagerDuty, and Jira for Coding Agents

Sat, 04 Jul 2026 12:00:00 +0000

Your coding agent trusts the tools it reads. That trust is the vulnerability.

When an attacker poisons a Sentry error report, a Datadog monitor alert, a PagerDuty incident, or a Jira ticket description with hidden prompt injection payloads, your agent doesn’t know the difference between a legitimate instruction and a hijack attempt. I’ve spent the last few months digging into this attack surface across the four most common integrations teams wire up to Claude Code, Cursor, and Codex. Here’s what I found and exactly how to fix it.

What Is Agentjacking and Why Should You Care?

Agentjacking is the exploitation of AI coding agents through poisoned tool outputs. The core problem is structural: agents treat the data they receive from integrated tools as trusted context. When Sentry returns an error report, the agent reads the exception message, stack frame variables, and tags — and if any of those fields contain injected instructions, the agent may follow them.

This isn’t theoretical. Invariant Labs demonstrated MCP Tool Poisoning Attacks against Anthropic, OpenAI, Zapier, and Cursor in early 2025. The OWASP Top 10 for Agentic Applications 2026 — built with input from over 100 industry experts — lists prompt injection and tool misuse as top-tier risks. Darktrace’s 2026 survey found that 92% of security professionals are concerned about AI agent impact. And 19.5% of CISOs in the State of AI Agent Security 2026 report had already experienced an AI-agent-related security incident.

The attack surface is real, and it’s growing. Gartner predicts that by the end of 2026, 40% of enterprise applications will include task-specific AI agents. If you’re running coding agents today, you need a mitigation strategy for the tools they connect to.

The Four Critical Integration Risks

Each tool has a different attack vector, but the mitigation patterns are consistent. Let me walk through each one.

Sentry MCP: Fake Error Reports

Sentry’s MCP server lets agents query error events, stack traces, and performance data. The attack vector is straightforward: an attacker injects a fake error report into a Sentry project the agent monitors. The exception value, stack frame variables, tags, or event description contain a prompt injection payload. The agent reads the error, follows the injected instructions, and executes destructive commands.

The Sentry team has been responsive: PR #1056 proposes XML untrusted data boundary tags around the Description field. But I’ve found three bypass patterns in testing:

Unsupported event types: events in a format the boundary code doesn’t handle skip the wrapper entirely
Response Notes enclosed inside the boundary: the wrapper wraps the entire response, so Notes that should sit outside end up inside
Only Description is covered: exception values, stack frame variables, tags, breadcrumbs, and extra data fields pass through raw

Mitigations for Sentry:

Apply untrusted data boundary wrapping to ALL Sentry event fields, not just Description
Use read-only API tokens scoped to minimal Sentry projects
Implement a tool-call approval queue for any Sentry-triggered write operations
Strip HTML/XML tags and control characters from Sentry event output before agent processing
Add LLM eval canary tests that verify prompt-injection resistance on every Sentry MCP deployment

I covered the full attack walkthrough in the Agentjacking Sentry MCP Attack Guide.

Datadog: Poisoned Monitor Alerts and Logs

Datadog integrations typically use API keys or MCP servers to query monitors, dashboards, and logs. An attacker who can create a monitor alert or inject a log entry with a crafted message can hijack any agent that reads that data.

Datadog supports scoping on application keys: you can restrict a key to specific read-only permissions. The problem is that most teams don’t. They reuse the same unscoped, admin-level keys for agent integrations that they use for their CI/CD pipelines.

Mitigations for Datadog:

Create scoped Datadog application keys with read-only permissions for agent integrations, and never reuse admin-level keys
Restrict application key permissions to specific dashboards and monitors only
Apply input sanitization to all Datadog event and monitor data before agent processing
Use Datadog’s restriction policies to limit which data agents can access
Implement separate Datadog API keys per agent identity for audit trail
Rotate Datadog API keys every 90 days minimum

PagerDuty: Crafted Incident Payloads

PagerDuty’s REST API and MCP integrations let agents query incidents, acknowledge alerts, and modify on-call schedules. An attacker who creates a fake incident with a crafted title or description can inject instructions that the agent follows.

PagerDuty supports read-only API tokens and scoped OAuth, which is good. But MCP integrations may not enforce field-level untrusted data boundaries on incident and alert data. The incident title, description, and custom details fields all pass through to the agent’s context.

Mitigations for PagerDuty:

Use PagerDuty read-only API tokens for agent integrations — never use account-level tokens
Scope API tokens to specific services and minimal permission sets
Apply untrusted data boundary wrapping to all PagerDuty incident and alert data
Implement human-in-the-loop approval for any PagerDuty write operations (acknowledge, resolve, create incidents)
Use PagerDuty’s audit logs to monitor agent-initiated actions
Rotate PagerDuty API tokens every 90 days

Jira: Injected Ticket Descriptions and Comments

Jira is the most dangerous integration because it’s the most write-heavy. Agents read issue descriptions, comments, and custom fields — and they create, update, and transition issues. An attacker who can create a Jira ticket with an injected description can hijack any agent that reads it.

Jira’s API token model is user-scoped with no granular permission model beyond project-level permissions. If your agent uses a personal account’s API token, it inherits everything that account can do. Basic auth is deprecated in favor of API tokens, but the permission model hasn’t improved.

Mitigations for Jira:

Create dedicated Jira service accounts with minimal project permissions for agent integrations
Use OAuth 2.0 (3LO) with scoped permissions instead of API tokens where possible
Apply untrusted data boundary wrapping to all Jira field data (description, comments, custom fields)
Implement tool-call approval queue for any Jira write operations (create, update, transition issues)
Restrict agent access to specific Jira projects only
Enable Jira audit logging and monitor for unusual agent activity patterns
Never use personal Jira accounts for agent integrations — always use service accounts

API Token Hygiene for Agent Integrations

Across all four tools, the single highest-impact change you can make is fixing your API token strategy. Here’s what I’ve found works in practice:

Dedicated tokens per agent. Every agent gets its own API token. No sharing between agents, no sharing between agents and humans, no sharing between agents and CI/CD pipelines. When you rotate a token, you only affect one agent.

Read-only by default. Start with read-only tokens. Grant write access only when you have a specific use case that requires it, and scope that write access to the minimum resources needed.

Automatic rotation. Set a 90-day maximum token lifetime. Most platforms support token expiry natively. If yours doesn’t, add a calendar reminder and a script that rotates tokens on schedule.

Secrets management. Store tokens in a secrets manager — Vault, AWS Secrets Manager, or 1Password. Never in code, never in config files, never in environment variables that get logged. I’ve seen too many tokens leak through CI/CD logs and debug output.

Token tagging. Tag every token with metadata: purpose, owner, expiry date, and the agent identity it belongs to. This makes lifecycle management and audits much easier.
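Tagging pays off once you automate the rotation check. Here is a minimal sketch, assuming you keep token metadata in your own inventory; the TokenRecord shape and the function name are hypothetical and not tied to any platform’s API.

```typescript
// Hypothetical inventory record built from the tags described above.
interface TokenRecord {
  purpose: string;
  owner: string;
  agentId: string;
  createdAt: Date;
}

const MAX_AGE_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Return the tokens whose age has reached the rotation limit.
function tokensDueForRotation(
  tokens: TokenRecord[],
  now: Date,
  maxAgeDays: number = MAX_AGE_DAYS,
): TokenRecord[] {
  return tokens.filter(
    (token) => (now.getTime() - token.createdAt.getTime()) / MS_PER_DAY >= maxAgeDays,
  );
}
```

Run this on a schedule and open a ticket for each returned record. The owner and agentId tags tell you who to notify and which agent a rotation will affect.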

Untrusted Data Boundaries: Your First Line of Defense

The most effective technical control is wrapping all external tool output in explicit untrusted data boundary markers. The pattern looks like this:


  sentry_mcp
  12345
  
    Error: Connection refused on port 5432
    Stack trace: ...

The agent’s system prompt should instruct it to treat anything inside the boundary tags as potentially malicious input, not as instructions. This is the same pattern the Sentry MCP PR #1056 implements, but you need to apply it to ALL fields, not just the Description.

Sanitization techniques:

Strip HTML and XML tags from tool output before it reaches the agent
Remove control characters and Unicode direction overrides
Filter known injection patterns (e.g., “ignore previous instructions”, “system prompt”)
Truncate excessively long fields that could hide payloads
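Here is a minimal sketch that combines the boundary wrapper with three of the sanitization steps above: tag stripping, control-character removal, and truncation. The tag name untrusted_tool_output, the 2000-character cap, and both function names are placeholders I chose for the example. Pattern filtering for phrases like “ignore previous instructions” is left out on purpose, since it needs tuning per deployment.

```typescript
// The tag name and length cap are placeholders chosen for this sketch.
const BOUNDARY_TAG = "untrusted_tool_output";
const MAX_FIELD_LENGTH = 2000;

// Strip markup, control characters, and direction overrides, then truncate.
function sanitizeField(value: string): string {
  return value
    .replace(/<[^>]*>/g, "") // HTML/XML tags, including attempts to close the boundary early
    .replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g, "") // control chars except tab, LF, CR
    .replace(/[\u202A-\u202E\u2066-\u2069]/g, "") // Unicode direction overrides and isolates
    .slice(0, MAX_FIELD_LENGTH);
}

// Wrap every field of a tool result, not just the description.
function wrapUntrusted(source: string, fields: Record<string, string>): string {
  const body = Object.entries(fields)
    .map(([name, value]) => `${name}: ${sanitizeField(value)}`)
    .join("\n");
  return `<${BOUNDARY_TAG} source="${source}">\n${body}\n</${BOUNDARY_TAG}>`;
}
```

Stripping tags before wrapping matters: without it, a payload containing the closing boundary tag could end the untrusted region early and place the rest of its text outside it.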

LLM eval canary tests. For every deployment, run automated tests that verify boundary integrity. Create a test Sentry event with an injection payload in each field type, feed it through your sanitization pipeline, and verify the agent doesn’t follow the injected instruction. If the test fails, your boundaries have a bypass.

Known bypass patterns to watch for:

Unsupported event types that skip the wrapper entirely
Nested boundaries that confuse the parser
Encoding tricks (Unicode normalization, HTML entities, base64)
Fields that the wrapper developer forgot to cover

Human-in-the-Loop Approval Queues

Boundaries can be bypassed. That’s why you need a second line of defense: approval queues for high-risk tool calls.

Risk level classification:

Low (read-only queries) — auto-approve. Reading a Sentry event, querying a Datadog dashboard, listing Jira issues.
Medium (issue updates, incident acknowledgments) — conditional approval. Auto-approve if the change matches expected patterns, flag for review if it doesn’t.
High (deletes, infrastructure changes, financial operations) — require human approval every time.
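The three risk levels map naturally onto a routing function in the tool-call layer. The sketch below is one way to express it. The action names are hypothetical categories, so you would map your own tool names onto them, and matchesExpectedPattern stands in for whatever rule you use to recognize routine changes.

```typescript
type RiskLevel = "low" | "medium" | "high";
type Decision = "auto_approve" | "human_review";
// Hypothetical action categories; map each real tool call onto one of these.
type Action = "read" | "update" | "acknowledge" | "delete" | "infrastructure" | "financial";

// Classify an action according to the three levels above.
function classifyRisk(action: Action): RiskLevel {
  switch (action) {
    case "read":
      return "low";
    case "update":
    case "acknowledge":
      return "medium";
    case "delete":
    case "infrastructure":
    case "financial":
      return "high";
  }
}

// Route a tool call: low risk passes, medium passes only when it looks routine,
// and high risk always goes to a human.
function decide(action: Action, matchesExpectedPattern: boolean): Decision {
  const risk = classifyRisk(action);
  if (risk === "low") return "auto_approve";
  if (risk === "medium" && matchesExpectedPattern) return "auto_approve";
  return "human_review";
}
```

Keeping the classification in code rather than in the prompt means an injected instruction cannot talk its way into a lower risk level.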

Structured diffs in the approval UI. When an agent proposes a change, show the reviewer exactly what will change. A diff view for Jira issue updates. A before/after for PagerDuty incident resolution. The reviewer should be able to verify the change in seconds.

Rejection feedback loops. When a reviewer rejects an action, feed the rejection reason back into the agent’s context. The agent can then propose an alternative path. This turns rejections into learning opportunities rather than dead ends.

Track these metrics:

Approval items per day
Approval rate (what percentage of requests are approved)
Median review time
Stale items (requests that haven’t been reviewed in > 1 hour)

Least Privilege Architecture for Coding Agents

The WorkOS containment paper got this right: prompt injection may still occur, but the blast radius should be bounded by permissions, not detection. Design your agents as untrusted workers operating inside a policy-controlled perimeter.

Every tool call is an authorization event. Don’t check permissions once at startup. Validate on every single request. The agent’s identity, the tool being called, the resource being accessed, and the action being performed should all be checked against a policy.

Put policy outside the prompt. Prompts are not durable security boundaries. An attacker who successfully injects instructions can override any security rules in the system prompt. The policy must live in the runtime — the tool-call router, the API gateway, the authorization layer.

Separate identities per environment. Your dev agent should use different API tokens than your staging agent, which should use different tokens than your production agent. This limits blast radius and makes audit trails meaningful.

Deny-by-default. Agents can only access explicitly permitted resources. If you haven’t configured access to a Jira project, the agent can’t read it. If you haven’t granted write access to a Datadog dashboard, the agent can’t modify it.
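A deny-by-default check that runs on every tool call can be very small. This sketch assumes a flat list of allow rules held in the tool-call router. The PolicyRule and ToolRequest shapes are my own illustration, not a specific product’s policy format.

```typescript
// Hypothetical allow rule: one agent, one tool, one resource, a set of actions.
interface PolicyRule {
  agentId: string;
  tool: string;
  resource: string;
  actions: string[];
}

interface ToolRequest {
  agentId: string;
  tool: string;
  resource: string;
  action: string;
}

// Deny by default: a request passes only if some rule explicitly allows it.
function isAllowed(request: ToolRequest, rules: PolicyRule[]): boolean {
  return rules.some(
    (rule) =>
      rule.agentId === request.agentId &&
      rule.tool === request.tool &&
      rule.resource === request.resource &&
      rule.actions.includes(request.action),
  );
}
```

There is no deny rule type here by design. Anything not listed is refused, so a forgotten rule fails closed instead of open.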

I covered the identity and access control layer in more detail in the AI Agent Identity Framework guide.

Monitoring and Detection

Even with all the above controls in place, you need to detect when something goes wrong.

Log all agent tool calls. Every call should record: the agent identity, the tool called, the resource accessed, the action performed, the timestamp, and whether it was approved or rejected. Store this in a centralized logging system.

Anomaly detection. Set up alerts for:

Agent calling tools it doesn’t normally use
Agent operating outside its normal hours
Agent making an unusual volume of calls
Agent making failed approval attempts (potential injection probe)

Dashboards. Create a dashboard showing agent activity across all integrated tools. I recommend tracking: calls per agent per hour, approval rate over time, top tools called, top resources accessed, and error rate.

Circuit breakers. If an agent makes N failed approval attempts in T minutes, pause the agent automatically. This stops an active injection attack from continuing to probe for bypasses.
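The N-failures-in-T-minutes rule is a sliding window per agent. Here is a minimal in-memory sketch. The class name and its methods are hypothetical, and a real deployment would persist state and alert a human when it trips.

```typescript
// Pauses an agent after maxFailures failed approvals within windowMs.
class ApprovalCircuitBreaker {
  private readonly maxFailures: number;
  private readonly windowMs: number;
  private readonly failures = new Map<string, number[]>();
  private readonly paused = new Set<string>();

  constructor(maxFailures: number, windowMs: number) {
    this.maxFailures = maxFailures;
    this.windowMs = windowMs;
  }

  // Record a failed approval attempt and trip the breaker if the window is full.
  recordFailure(agentId: string, now: number = Date.now()): void {
    const recent = (this.failures.get(agentId) ?? []).filter(
      (timestamp) => now - timestamp < this.windowMs,
    );
    recent.push(now);
    this.failures.set(agentId, recent);
    if (recent.length >= this.maxFailures) this.paused.add(agentId);
  }

  isPaused(agentId: string): boolean {
    return this.paused.has(agentId);
  }

  // Resuming is a deliberate human action; it also clears the failure history.
  resume(agentId: string): void {
    this.paused.delete(agentId);
    this.failures.delete(agentId);
  }
}
```

A tripped breaker stays tripped until someone calls resume. Letting it reset on a timer would give an attacker a predictable retry schedule.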

Regular audit reviews. Every month, review the agent activity logs. Look for patterns that don’t match expected behavior. Revoke tokens that haven’t been used in 90 days. Update permission scopes based on actual usage.

Putting It All Together: A Mitigation Checklist

Here’s the actionable checklist I use when securing a new agent deployment. It is ordered by impact and effort.

Week 1 — High Impact, Low Effort:

Create dedicated read-only API tokens for each agent integration
Store tokens in a secrets manager, not in code or config files
Set 90-day token rotation
Tag tokens with purpose, owner, and expiry metadata

Week 2 — High Impact, Medium Effort:

Apply untrusted data boundary wrapping to all tool output fields
Implement input sanitization (strip HTML/XML, control characters)
Add LLM eval canary tests for boundary integrity
Test known bypass patterns (unsupported event types, encoding tricks)

Week 3 — Medium Impact, Medium Effort:

Implement tool-call approval queue for write operations
Define risk levels and auto-approve rules
Set up structured diffs in approval UI
Configure rejection feedback loops

Week 4 — Medium Impact, Higher Effort:

Create dedicated service accounts per agent per environment
Implement deny-by-default access policies
Set up centralized agent activity logging
Configure anomaly detection alerts and circuit breakers
Schedule monthly audit reviews

FAQ

What is agentjacking? Agentjacking is an attack where malicious instructions are injected into the data that AI coding agents read from integrated tools like Sentry, Datadog, PagerDuty, and Jira. The agent treats the poisoned data as trusted context and follows the injected instructions, potentially executing destructive actions.

Which coding agents are vulnerable to agentjacking? Any agent that reads external tool output is potentially vulnerable. This includes Claude Code, Cursor, GitHub Copilot, Codex CLI, and custom agent frameworks that integrate with observability and project management tools via MCP servers or REST APIs.

Can untrusted data boundaries be bypassed? Yes. Known bypass patterns include unsupported event types that skip the wrapper, nested boundaries that confuse the parser, and encoding tricks like Unicode normalization and HTML entities. Regular LLM eval canary tests are essential to catch bypasses.

Should I use API tokens or OAuth for agent integrations? OAuth 2.0 with scoped permissions is preferred where available, because it supports granular permission scoping and token revocation. API tokens are a reasonable fallback, but they should be read-only, scoped to minimal resources, rotated every 90 days, and stored in a secrets manager.

How do I detect an active agentjacking attack? Monitor for unusual agent behavior: calls to tools the agent doesn’t normally use, operation outside normal hours, unusual call volume, and a spike in failed approval attempts. Set up circuit breakers that pause the agent after N failed attempts in T minutes.

Agentjacking Sentry MCP Attack Guide 2026: How Fake Errors Hijack Claude Code, Cursor, and Codex

Sat, 04 Jul 2026 12:00:00 +0000

What Is Agentjacking?

In June 2026, researchers at Tenet Security disclosed a new attack class they called agentjacking — and it’s the most practical AI agent supply chain attack I’ve seen in production. The premise is deceptively simple: an attacker injects a malicious error event into your Sentry project, and when your AI coding agent (Claude Code, Cursor, or OpenAI Codex CLI) reads that event via the Sentry MCP server, it executes the attacker’s embedded payload with your system privileges.

No phishing. No malware. No prior server access. Just one HTTP POST request and a publicly visible Sentry DSN.

The numbers are sobering: Tenet Security reported an 85% exploitation success rate across all three major coding agents, tested against 100+ consenting organizations. They identified 2,388 organizations with publicly exposed Sentry DSNs — the only prerequisite for the attack. And here’s the part that keeps me up at night: 0% detection by EDR, WAF, firewall, VPN, or IAM in tested environments.

How Agentjacking Works: The 6-Step Attack Chain

Let me walk through the exact mechanics, because understanding the chain is the only way to defend against it.

Step 1: DSN Discovery

Sentry DSNs follow the pattern https://<public-key>@o<org-id>.ingest.sentry.io/<project-id>.

They’re documented as safe to embed in public browser JavaScript — a design decision that predates AI agents by years. Attackers find them through:

GitHub code search — DSNs in public repos, often in .env.example files or frontend configs
Browser JS source maps — embedded in minified bundles shipped to production
Shodan / Censys — scanning for Sentry ingest endpoints with known patterns

Tenet Security found 2,388 organizations with exposed DSNs. Of those, 71 were in the Tranco top-1M — major production sites.
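The same discovery technique works defensively: scan your own repos and built bundles for DSNs before someone else does. The sketch below assumes the common DSN shape of a 32-character hex public key, a host, and a numeric project ID. The pattern and the findDsns name are mine, and legacy or unusual DSN formats may not match.

```typescript
// Assumed DSN shape: https://<32 hex chars>@<host>/<numeric project id>.
const DSN_PATTERN = /https:\/\/[0-9a-f]{32}@[a-z0-9.-]+\/\d+/gi;

// Return the distinct DSN-like strings found in a blob of text.
function findDsns(text: string): string[] {
  return Array.from(new Set(text.match(DSN_PATTERN) ?? []));
}
```

Any DSN this finds in a public artifact should be treated as known to attackers. Make sure no coding agent reads that project’s events as trusted input.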

Step 2: Crafting the Fake Error Event

The attacker creates a Sentry error event that looks legitimate but contains a malicious payload in the event description. The key trick: the event uses markdown formatting with a ## Resolution section that includes a shell command disguised as a fix instruction.

Conceptually, the injected event body pairs a plausible error message with a ## Resolution section whose “fix” is an npx command. That command fetches and executes an attacker-controlled npm package, which contains the exfiltration logic.

Step 3: Injection via Single HTTP POST

The attacker sends the crafted event to Sentry’s public ingest endpoint using the exposed DSN. Sentry accepts it — the DSN is write-only by design, and Sentry has no mechanism to verify the authenticity of submitted events. The event appears in the project’s error dashboard alongside legitimate errors.

Step 4: Developer Triggers the Agent

This is the human factor that makes the attack work. A developer opens their Sentry dashboard, sees a new error, and tells their AI coding agent:

“Fix the Sentry errors”

Or they paste the error into Claude Code / Cursor / Codex CLI and ask for a fix. This is normal developer behavior — we do this dozens of times a day.

Step 5: MCP Hands Payload to Agent as Trusted Context

The agent queries the Sentry MCP server for error details. The MCP server returns the attacker’s crafted event — markdown, ## Resolution section, and all. The agent treats this as trusted context because it came from an authorized MCP tool.

This is the critical architectural flaw: MCP servers are trust boundaries, but nobody treats them as such. The MCP protocol provides no mechanism for data provenance, content integrity, or instruction separation. The agent has no way to distinguish “this is error data” from “this is an instruction to execute.”

Step 6: Agent Executes Attacker’s Code

The agent reads the ## Resolution section, interprets it as a fix instruction, and runs the npx command. The attacker’s npm package executes with the agent’s full system privileges.

In 85% of test cases, the agent ran the command without asking for confirmation. The remaining 15% weren’t stopped by a security control: the agent asked “are you sure you want to run this?”, the developer said yes, and the payload executed anyway.

What Gets Stolen

Once the payload runs, it exfiltrates everything the agent has access to — which is everything the developer has access to:

AWS credentials — ~/.aws/config and ~/.aws/credentials
GitHub tokens — ~/.config/gh/hosts.yml, environment variables
Docker credentials — ~/.docker/config.json
SSH keys — ~/.ssh/id_rsa and authorized keys
npm tokens — ~/.npmrc
Environment variables — database URLs, API keys, secrets
Git credentials — stored in the local git config

The exfiltration happens over HTTPS to an attacker-controlled endpoint. No EDR flags it because it’s just curl or wget sending data — the same tools developers use legitimately all day.
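A quick way to see your own exposure is to check which of the files listed above are present in the account your agent runs under. This is a minimal sketch: `exposed_credentials` is a hypothetical helper, it only covers the file paths named in the list, and it does not look at environment variables or credential helpers.

```python
from pathlib import Path

# Paths from the list above, relative to the user's home directory.
CREDENTIAL_PATHS = [
    ".aws/config",
    ".aws/credentials",
    ".config/gh/hosts.yml",
    ".docker/config.json",
    ".ssh/id_rsa",
    ".npmrc",
]


def exposed_credentials(home: Path) -> list[str]:
    """Return the listed credential files that exist under `home`."""
    return [rel for rel in CREDENTIAL_PATHS if (home / rel).is_file()]
```

Running `exposed_credentials(Path.home())` on a developer laptop and on the sandbox you intend to use for agents makes the difference between the two environments visible.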

Why Traditional Security Fails: The Authorized Intent Chain

This is the most unsettling part. Tenet Security calls it the Authorized Intent Chain:

Developer authorized the agent → authorized
Agent authorized the MCP server → authorized
MCP server returned data from Sentry → authorized
Agent executed a command based on that data → authorized

Every link in the chain is authorized. There’s nothing for EDR, WAF, IAM, or VPN to flag. The attack lives entirely in the logic layer — between what the system is supposed to do and what the attacker can make it do.

System prompt instructions to “distrust external data” don’t help. Researchers tested this explicitly — agents still executed the payload 85% of the time. The agent’s training and design prioritize being helpful over being cautious, and the MCP protocol gives it no tools to distinguish data from instructions.

Sentry’s Response: “Technically Not Defensible”

Sentry acknowledged the disclosure on June 3, 2026. Their response was honest and controversial: they declined root-cause remediation, calling it “technically not defensible” at the Sentry level. They deployed a content filter that blocks only the specific PoC payload string from the disclosure — a narrow band-aid that doesn’t address the class of attack.

I understand their position. Sentry’s ingest endpoint is designed to accept arbitrary data from any client. That’s the whole point of error reporting. Adding content validation would break legitimate use cases where error messages contain code snippets, stack traces, or commands. The vulnerability isn’t in Sentry — it’s in the trust model between MCP servers and AI agents.

But passing responsibility upstream to model vendors without a coordinated fix means thousands of teams remain exposed.

Beyond Sentry: The MCP Trust Problem

Agentjacking isn’t a Sentry bug. It’s an MCP ecosystem vulnerability. The same attack vector applies to any MCP server that surfaces external, attacker-controllable data:

Datadog — inject fake monitor alerts with embedded commands
PagerDuty — craft incident descriptions with malicious payloads
Jira / GitHub Issues / Linear — file a ticket with markdown containing shell commands

Elastic Security Labs found that 43% of MCP implementations in their sample contained command injection flaws. 30% of MCP servers permitted unrestricted URL fetching. The MCP protocol itself lacks:

Data provenance — no way to trace where data originated
Content integrity — no mechanism to verify data hasn’t been tampered with
Instruction separation — no distinction between “data about an error” and “instructions for fixing it”

This is a class of vulnerability, not a single CVE. We’re going to see variants of agentjacking for years.

Defense Guide: Practical Mitigations

Here’s what I’ve found actually works in production, ordered by impact:

1. Audit and Rotate Exposed DSNs

Run a scan across your GitHub orgs for exposed Sentry DSNs. The pattern is straightforward:

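# Search for Sentry DSNs in your repos, including regional hosts like ingest.us.sentry.io
# (bash: the braces must stay unquoted so they expand to one --include per extension)
grep -rnE "ingest(\.[a-z]+)?\.sentry\.io" --include=*.{js,ts,jsx,tsx,env,json,yaml,yml} .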

For any exposed DSN, rotate it in the Sentry dashboard immediately. Add DSN patterns to your pre-commit hooks and secret scanning tools.
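If you want the audit as a script you can run in CI, here is a minimal Python sketch. The regex is my assumption about what a hosted DSN looks like (a hex public key, an optional regional label, a numeric project ID), so it will miss self-hosted DSNs. `find_dsns` and `SKIP_DIRS` are hypothetical names.

```python
import re
from pathlib import Path

# Hosted DSN shape: https://<hex key>@<org>.ingest[.<region>].sentry.io/<project>
DSN_RE = re.compile(r"https://[0-9a-f]+@[\w.-]*ingest(?:\.[a-z]+)?\.sentry\.io/\d+")

# Directories whose contents are not your own source code.
SKIP_DIRS = {".git", "node_modules"}


def find_dsns(root: Path) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, DSN) for every DSN under root."""
    hits = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if not path.is_file() or SKIP_DIRS & set(rel.parts):
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file; nothing to report
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in DSN_RE.finditer(line):
                hits.append((rel.as_posix(), lineno, match.group(0)))
    return hits
```

Failing the build when `find_dsns(Path("."))` returns anything is a blunt but workable policy for server-side repos. Browser bundles necessarily ship a DSN, so those need an explicit exception.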

2. Disable Sentry MCP or Require Human Approval

This is the single most effective mitigation. If you don’t need Sentry MCP, disable it. If you do need it, configure your agent to require human approval before executing any command sourced from MCP data.

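In Claude Code, keep shell commands behind a confirmation prompt: stay in the default permission mode (selected with --permission-mode) and don’t add allow rules that let Bash commands run unattended. In Cursor, disable automatic MCP tool execution in settings. For Codex CLI, review the MCP configuration and remove the Sentry server.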
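The policy itself is simple enough to state in a few lines. The sketch below is not how any of these agents implement approvals; `ProposedCommand`, `SAFE_BINARIES`, and the `derived_from_mcp` flag are hypothetical, and knowing whether MCP output influenced a command is the hard part that a real harness would have to solve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedCommand:
    argv: tuple[str, ...]
    derived_from_mcp: bool  # True if any MCP tool output influenced this command


# Hypothetical allowlist of harmless commands that may run unattended.
SAFE_BINARIES = {"ls", "pwd"}


def needs_human_approval(cmd: ProposedCommand) -> bool:
    """Require approval for anything MCP-derived or outside the allowlist."""
    if cmd.derived_from_mcp:
        return True
    return not cmd.argv or cmd.argv[0] not in SAFE_BINARIES
```

Note that the MCP check comes first: even an allowlisted binary needs a human in the loop when the idea to run it came from tool output.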

3. Apply Least-Privilege to Agent Environments

Run your AI coding agents in sandboxed environments with limited permissions. This is where tools like Cloudflare Temporary Accounts (60-minute sandboxed sessions, announced the same week as the disclosure) and containerized development environments make sense.

The principle: your agent should never have direct access to production AWS keys or SSH credentials. If the agent needs to deploy, it should go through a CI/CD pipeline with its own access controls.
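One cheap piece of this is launching the agent with a scrubbed environment so that tokens and database URLs in your shell never reach it. This is a minimal sketch: `ALLOWED_VARS` is a hypothetical allowlist you would tune for your tooling, and it does nothing about credential files on disk, which still need a container or a separate user account.

```python
import os
import subprocess

# Hypothetical allowlist: the only variables the agent process gets to see.
ALLOWED_VARS = ("PATH", "HOME", "LANG", "TERM")


def scrubbed_env(env: dict[str, str]) -> dict[str, str]:
    """Keep only allowlisted variables, dropping tokens, keys, and database URLs."""
    return {k: v for k, v in env.items() if k in ALLOWED_VARS}


def run_agent(argv: list[str]) -> int:
    """Launch the agent command with the scrubbed environment and return its exit code."""
    result = subprocess.run(argv, env=scrubbed_env(dict(os.environ)), check=False)
    return result.returncode
```

An allowlist is the safer default here. A denylist of known secret names will always miss the one variable somebody added last week.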

4. Server-Side Sentry Relay

Instead of letting client SDKs send events directly to Sentry’s public ingest endpoint, route them through a server-side relay that you control, and have it strip markdown and command-like content from error descriptions before the events are stored and later served to the agent by the MCP server. This breaks the injection chain at the network boundary.
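The filtering step might look something like the sketch below. It is a denylist illustration under my own assumptions: `sanitize_description` is a hypothetical function, the command list is deliberately short, and a determined attacker can phrase a payload that slips past it. It also has the cost mentioned earlier, since legitimate errors that quote commands get mangled too.

```python
import re

HEADING_RE = re.compile(r"^[ \t]{0,3}#{1,6}[ \t]+", re.MULTILINE)
BACKTICK_RE = re.compile(r"`{1,3}")
# Package runners and downloaders followed by arguments, anywhere in a line.
COMMAND_RE = re.compile(r"\b(?:npx|curl|wget)\s+[^\n]*")


def sanitize_description(text: str) -> str:
    """Neutralize markdown structure and command-like content in an event field."""
    text = BACKTICK_RE.sub("", text)  # drop inline and fenced code markers
    text = HEADING_RE.sub("", text)   # flatten "## Resolution" to plain text
    return COMMAND_RE.sub("[command removed by relay]", text)
```

Applied to an event whose description ends in a `## Resolution` heading followed by an npx line, this leaves the error message intact and replaces the command with a marker that is obvious to both the developer and the agent.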

5. Pre-Commit Secret Scanning

Add Sentry DSN patterns to your pre-commit hooks and CI pipeline:

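# .pre-commit-config.yaml
# Local pygrep hook: the commit fails if any staged text file matches the regex.
repos:
  - repo: local
    hooks:
      - id: forbid-sentry-dsn
        name: Block committed Sentry DSNs
        language: pygrep
        entry: 'ingest(\.[a-z]+)?\.sentry\.io'
        types: [text]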

6. Monitor for Agentjacking Indicators

Watch for unexpected npx executions, unusual outbound HTTPS connections from developer machines, and Sentry events with suspicious ## Resolution sections. Agent Beacon (an open-source telemetry tool that emerged alongside the disclosure) can help detect anomalous agent behavior.
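For the first indicator, a small filter over recorded command lines goes a long way. The sketch assumes you already collect command lines from developer machines (shell audit logs, process accounting, or an endpoint agent). `unexpected_npx` and `KNOWN_NPX_PACKAGES` are hypothetical names, and scoped packages such as @scope/pkg are always flagged by this simple version.

```python
import shlex

# Hypothetical allowlist of packages your team legitimately runs through npx.
KNOWN_NPX_PACKAGES = {"eslint", "prettier"}


def unexpected_npx(command_lines: list[str]) -> list[str]:
    """Return the command lines that run npx with a package outside the allowlist."""
    flagged = []
    for line in command_lines:
        try:
            argv = shlex.split(line)
        except ValueError:
            argv = line.split()  # unbalanced quotes: fall back to a naive split
        if "npx" not in argv:
            continue
        after = argv[argv.index("npx") + 1:]
        # The first non-flag argument is the package (optionally with @version).
        package = next((a for a in after if not a.startswith("-")), None)
        if package is None or package.split("@")[0] not in KNOWN_NPX_PACKAGES:
            flagged.append(line)
    return flagged
```

Feeding it a day of command history and alerting on any non-empty result is noisy at first, but the allowlist settles quickly for most teams.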

The Bigger Picture

Agentjacking is the first major demonstration of a problem I’ve been worried about since MCP was introduced: we’re connecting AI agents to external data sources without any trust boundary between them. The MCP protocol treats all data from a tool as equally trustworthy, and the agent treats all MCP-sourced data as context for action.

This isn’t fixable with a single patch. It requires:

Protocol-level changes to MCP for data provenance and instruction separation
Agent-level changes to treat MCP-sourced commands as untrusted by default
Organizational changes to audit and restrict what agents can access

For related reading on AI agent supply chain risks, check out my Agent Skills Supply Chain Security Guide and the PraisonAI Cross-Origin Agent Execution Vulnerability analysis. For a broader look at agent identity and access control, the AI Agent Identity Framework guide covers production patterns for zero-trust agent environments.

Bottom Line

Agentjacking works because it exploits the gap between what developers think their agents are doing and what agents actually do. Every link in the chain is authorized — that’s the whole point. The fix isn’t a better firewall or a smarter EDR. It’s treating MCP servers as untrusted data sources, sandboxing agent environments, and requiring human approval for any command sourced from external data.

Check your DSNs today. Disable Sentry MCP if you don’t absolutely need it. And start thinking about agent security as a trust boundary problem, not a malware problem — because the next variant of this attack won’t use Sentry at all.