// case comparison

agent runaway vs prompt injection

security sees an LLM integration misbehaving and has to pick a case type. agent runaway is autonomous re-planning by an agent that received a benign prompt: the failure is in the tool-call chain. prompt injection is adversarial user, document, or retrieved input bending the model: the failure is in the input stream. the wrong call sends you to MCP trace reconstruction when you need attempt-log pattern matching, or vice versa.

primary tools · side by side

ordered entry points from the case-type taxonomy. highlighted rows appear in both case types' editorial tool lists.

editorial overlap

1 tool is mapped to both case types in the editorial taxonomy, which is useful when the investigation spans both surfaces.

lean toward…

disambiguation signals derived from case-type descriptions and common practitioner confusion points.

lean toward agent runaway if you see…

  • autonomous tool-call chain with no matched jailbreak pattern or adversarial template in user-turn logs — the original prompt was bounded
  • prompt-vs-action divergence on agent steps where stated_intent was benign but actual_action was out-of-scope (exfil · persistence · credential read)
  • MCP tool-call graph activity or agent persistence (cron · webhook) added while the deploying operator was offline, with no model-input manipulation in the path
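
the prompt-vs-action divergence signal above can be sketched as a simple filter over agent steps. this is a minimal illustration, not a tool from the taxonomy: the `AgentStep` record, the `intent_benign` flag, and the action labels in `OUT_OF_SCOPE` are all assumptions made for the example, loosely named after the `stated_intent` and `actual_action` fields mentioned in the bullet.

```python
from dataclasses import dataclass

# Hypothetical action labels based on the examples in the bullet above
# (exfil, persistence, credential read). A real deployment defines its own set.
OUT_OF_SCOPE = {"exfil", "persistence", "credential_read"}


@dataclass
class AgentStep:
    stated_intent: str   # what the agent said it was doing
    actual_action: str   # label assigned to what it actually did
    intent_benign: bool  # assumed analyst or classifier verdict on stated_intent


def divergent_steps(steps):
    """Return steps whose stated intent was benign but whose action was out of scope."""
    return [
        s for s in steps
        if s.intent_benign and s.actual_action in OUT_OF_SCOPE
    ]


# Illustrative data only, not taken from any real log.
steps = [
    AgentStep("summarize ticket", "read_ticket", True),
    AgentStep("clean up temp files", "persistence", True),
    AgentStep("check config", "credential_read", True),
]
flagged = divergent_steps(steps)
for s in flagged:
    print(f"divergence: said '{s.stated_intent}', did '{s.actual_action}'")
```

flagged steps only suggest runaway when the user-turn logs also show no injection pattern; a benign stated intent paired with an out-of-scope action can still follow from injected input.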

lean toward prompt injection if you see…

  • matched injection pattern, jailbreak template cluster, or adversarial turn sequence in LLM attempt logs driving model output
  • indirect-injection carrier artifact — uploaded doc · retrieved RAG chunk · MCP tool result — containing imperative override text that the model then followed
  • guardrail bypass score anomaly, system-prompt exfil attempt, or red-team evaluation log showing the input was the attack vector — not autonomous replanning
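
the two signal lists above can be read as a rough tally. the sketch below counts observed signals per case type and reports which way the evidence leans. the signal names are hypothetical shorthand for the six bullets, and equal weighting is an assumption; the source does not define a scoring scheme.

```python
# Hypothetical shorthand for the three "agent runaway" bullets.
RUNAWAY_SIGNALS = {
    "no_injection_pattern_in_user_turns",
    "benign_intent_out_of_scope_action",
    "persistence_added_while_operator_offline",
}

# Hypothetical shorthand for the three "prompt injection" bullets.
INJECTION_SIGNALS = {
    "matched_injection_pattern",
    "indirect_injection_carrier",
    "guardrail_bypass_or_system_prompt_exfil",
}


def lean(observed):
    """Return the case type with more observed signals, or 'undetermined' on a tie."""
    observed = set(observed)
    runaway = len(observed & RUNAWAY_SIGNALS)
    injection = len(observed & INJECTION_SIGNALS)
    if runaway > injection:
        return "agent runaway"
    if injection > runaway:
        return "prompt injection"
    return "undetermined"


print(lean({"indirect_injection_carrier", "matched_injection_pattern"}))
print(lean({"benign_intent_out_of_scope_action"}))
```

a tie, including one signal from each list, returns "undetermined"; that is the situation where the investigation likely spans both surfaces.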