// case comparison

mcp server compromise vs prompt injection

an LLM produced an unexpected response or tool call. prompt injection is adversarial input (user prompt · RAG chunk · uploaded attachment) bending an honest model running against an honest server. server compromise is a malicious or compromised MCP server feeding tampered tool results or definitions back to a clean model: the model is faithful and the user input is honest, but the server pipeline lies. attempt-log pattern matching resolves prompt injection; server audit-log diffing resolves server compromise.

primary tools · side by side

ordered entry points from the case-type taxonomy. highlighted rows appear in both case types' editorial tool lists.

case a

MCP server compromise

the MCP (Model Context Protocol) server itself is the failure locus — leaked server credentials, impersonated server identity, server-side tool-definition tampering, or permission escalation in the server's tool-grant ledger. evidence is the server audit log, the client-invocation trail showing what the LLM thinks it called vs what the server actually executed, the tool-call attribution graph, and the OAuth scope grant ledger. distinct from ai-agent-runaway (agent did this with a benign server) and llm-prompt-injection (input bent the model · server was clean). a compromised server can fool both honest models and honest agents.

  1. mcp model context protocol server audit log forensic analyzer: drop mcp server audit log · parse tool calls + resource accesses + auth · runs locally
  2. mcp client invocation log forensic analyzer: drop mcp client invocation log · parse server calls + arguments + responses · runs locally
  3. mcp server permission escalation detector: drop mcp server audit log · detect over-permissioned tool exposure · runs locally
  4. mcp tool call graph reconstructor: drop mcp client + server log set · reconstruct tool-call dependency graph · runs locally
  5. mcp prompt injection via tool result detector: drop mcp server tool result log · detect injection payloads in tool responses · runs locally
  6. anthropic mcp claude tool call attribution tool: drop claude tool call log · attribute each tool call to model decision · runs locally
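
the comparison named in the case a description, what the LLM thinks it called vs what the server actually executed, can be sketched as a join of the two logs. this is a minimal illustration, not how any tool listed above works: the `Call` record, its `call_id` correlation field, and the finding labels are all hypothetical, and real client and server logs would first need parsing and argument canonicalization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    call_id: str    # hypothetical correlation id assumed to appear in both logs
    tool: str       # tool name as recorded by that side
    arguments: str  # arguments as a canonicalized JSON string (assumption)


def diff_invocations(client_calls, server_calls):
    """Return (call_id, finding) pairs where the two logs disagree."""
    client = {c.call_id: c for c in client_calls}
    server = {c.call_id: c for c in server_calls}
    findings = []
    for call_id in sorted(client.keys() | server.keys()):
        c, s = client.get(call_id), server.get(call_id)
        if c is None:
            # the server executed something the client never recorded requesting
            findings.append((call_id, "server-only"))
        elif s is None:
            # the client recorded a call that the server audit log does not show
            findings.append((call_id, "client-only"))
        elif (c.tool, c.arguments) != (s.tool, s.arguments):
            # same call id, but a different tool or different arguments
            findings.append((call_id, "mismatch"))
    return findings
```

an empty result only says the two logs agree with each other; it does not rule out a server that falsifies its own audit log, which is why the OAuth scope grant ledger and attribution graph remain separate evidence.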

editorial overlap

3 tools mapped to both case types in the editorial taxonomy — useful when the investigation spans both surfaces.

lean toward…

disambiguation signals derived from case-type descriptions and common practitioner confusion points.

lean toward mcp server compromise if you see…

  • MCP server-side tool-result tampering — the result returned from the server contains content the underlying data source did not produce · server log shows tool-result rewrite between data layer and client response
  • server tool-definition drift or capability-list change between client sessions without an attributable admin event — model received different tool semantics on different calls
  • OAuth scope grant ledger on the server shows escalation or token-handler edit · server impersonation signal in TLS pinning or server-identity attestation logs
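
the tool-definition drift signal above can be checked by fingerprinting the tool list captured in each client session and comparing the fingerprints. a minimal sketch, assuming each captured definition is a JSON-like dict with a unique `name` key; the function names and the snapshot format are hypothetical.

```python
import hashlib
import json


def fingerprint(tool_defs):
    """Map each tool name to the SHA-256 of its canonical JSON definition."""
    return {
        d["name"]: hashlib.sha256(
            # sort_keys gives a stable byte string regardless of key order
            json.dumps(d, sort_keys=True, separators=(",", ":")).encode("utf-8")
        ).hexdigest()
        for d in tool_defs
    }


def definition_drift(before, after):
    """Compare two session snapshots of tool definitions."""
    old, new = fingerprint(before), fingerprint(after)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(n for n in old.keys() & new.keys() if old[n] != new[n]),
    }
```

drift on its own is not evidence of compromise. each reported change still has to be matched against admin events in the server audit log, and only the unattributed ones lean toward server compromise.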

lean toward prompt injection if you see…

  • matched injection pattern in user-turn logs · jailbreak template cluster · adversarial turn sequence in LLM attempt logs driving model output against a server whose audit log is clean
  • indirect-injection carrier artifact — uploaded doc · RAG chunk · MCP tool result containing override text whose source was the underlying data, not server-side rewrite
  • guardrail bypass score anomaly, system-prompt exfil attempt, or red-team eval log showing the input stream — not the server pipeline — was the attack vector
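
the two signal lists can be combined for tool results: match override text in what the client received, then ask whether the underlying data source already contained it. a minimal sketch under stated assumptions: the three patterns are illustrative only and not a vetted detection set, and `source_text` assumes the investigator can recover what the data layer actually produced.

```python
import re

# illustrative patterns only (assumption); real jailbreak clusters are far broader
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"reveal (your|the) system prompt", re.I),
    re.compile(r"you are now\b", re.I),
]


def has_injection(text):
    """True if any illustrative pattern matches the text."""
    return any(p.search(text) for p in INJECTION_PATTERNS)


def classify_tool_result(source_text, delivered_text):
    """source_text: what the data layer produced; delivered_text: what the client received."""
    if not has_injection(delivered_text):
        return "no-match"
    if has_injection(source_text):
        # override text was already in the underlying data: indirect injection carrier
        return "lean-prompt-injection"
    # override text appeared between the data layer and the client response
    return "lean-server-compromise"
```

the returned labels are leanings, not verdicts; a pattern miss says nothing about payloads the pattern list does not cover.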