// case comparison

mcp server compromise vs agent runaway

an LLM integration produced an unexpected tool call. agent runaway is autonomous re-planning by an agent against a clean server — the failure is in the agent's tool-call chain. server compromise is the MCP server itself being malicious or impersonated — the server returns tampered tool definitions or executes operations the client never requested. wrong call sends you to MCP trace reconstruction when the server log shows tool-definition drift, or vice versa.

primary tools · side by side

ordered entry points from the case-type taxonomy. highlighted rows appear in both case types' editorial tool lists.

case a

MCP server compromise

the MCP (Model Context Protocol) server itself is the failure locus — leaked server credentials, impersonated server identity, server-side tool-definition tampering, or permission escalation in the server's tool-grant ledger. evidence is the server audit log, the client-invocation trail showing what the LLM thinks it called vs what the server actually executed, the tool-call attribution graph, and the OAuth scope grant ledger. distinct from ai-agent-runaway (agent did this with a benign server) and llm-prompt-injection (input bent the model · server was clean). a compromised server can fool both honest models and honest agents.

  1. 01mcp model context protocol server audit log forensic analyzerdrop mcp server audit log · parse tool calls + resource accesses + auth · runs locally
  2. 02mcp client invocation log forensic analyzerdrop mcp client invocation log · parse server calls + arguments + responses · runs locally
  3. 03mcp server permission escalation detectordrop mcp server audit log · detect over-permissioned tool exposure · runs locally
  4. 04mcp tool call graph reconstructordrop mcp client + server log set · reconstruct tool-call dependency graph · runs locally
  5. 05mcp prompt injection via tool result detectordrop mcp server tool result log · detect injection payloads in tool responses · runs locally
  6. 06anthropic mcp claude tool call attribution tooldrop claude tool call log · attribute each tool call to model decision · runs locally

editorial overlap

lean toward…

disambiguation signals derived from case-type descriptions and common practitioner confusion points.

lean toward mcp server compromise if you see…

  • MCP server audit-log shows tool-definition or capability-list drift between sessions · server signature changes mid-session · server-side handler invoked an operation no client requested
  • client-invocation log diverges from server-side execution log — the client thinks it called tool X, the server actually ran tool Y · attribution graph breaks at the server boundary
  • OAuth scope grant ledger on the server shows escalation without a corresponding client grant event · server tool-permission edit by an unattributable principal

lean toward agent runaway if you see…

  • agent prompt-vs-action divergence on a server whose audit log shows stable tool-definitions, consistent signature, and clean per-session capability list
  • tool-call execution trace on the agent side anchored to one session · server log shows the agent's requests arrived faithfully and the server executed exactly what was asked
  • agent persistence mechanism (cron · webhook · scheduled lambda) added by the agent against a server with no tampered-definition or impersonation signal
ready