// investigation guide

phishing campaign investigation — methodology

a phishing campaign is never just one suspicious email. expect multiple waves, shared kit infrastructure, rotating domains, and a mix of link and attachment lures. your job is to extract stable iocs, fingerprint the kit by deobfuscating its javascript, and give the soc a normalized blocklist before the next wave reaches users who already clicked on wave one.

what evidence exists and how fast it dies

artifact                         | volatility          | time to loss
original .eml / .msg per wave    | persistent if saved | destroyed if users delete or auto-purge runs
landing page html + kit js       | volatile at host    | hours; kits rotate on compromise
shortener redirect chains        | rolling             | links expire or get flagged within days
mail gateway quarantine export   | rolling             | retention varies 7–30 days
user click telemetry (proxy/dns) | rolling             | 30–90 days typical
domain whois at triage time      | persistent          | privacy redaction increases over time

the first 10 minutes

  1. pull all reported messages as .eml. do not forward them: forwarding rewrites the original headers.
  2. export mail gateway quarantine for the reported subject lines and sender domains — 14 days back.
  3. search org mail for the same subject, attachment hash, or url host across all mailboxes.
  4. preserve one landing page capture (html + js) before the host goes offline.
  5. expand every shortener url found — record final host and path.
  6. identify users who clicked — proxy or dns logs if available.
  7. reset credentials for clickers who submitted the lure form.
  8. block kit domains and shortener targets at proxy — not just the first reported url.
  9. open ticket with domain registrar / hosting abuse if infrastructure is fresh.
  10. begin the path below.

the path

  1. phishing header analyzer

    input: campaign .eml set. surfaces spf/dmarc failures, reply-to redirects, and authentication gaps across waves.
    why first: campaign scope starts with headers. one wave's iocs mean nothing if you miss the second sender domain.
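a minimal sketch of the header pass in step 1, assuming python's standard email package and assuming your receiving gateway stamps an authentication-results header on each message. the function name header_findings and the keys it returns are hypothetical, not the output of any tool named in this guide.

```python
from email import message_from_string
from email.utils import parseaddr


def header_findings(raw_eml: str) -> dict:
    """Summarize authentication results and reply-to redirection for one message."""
    msg = message_from_string(raw_eml)

    # A message can carry several Authentication-Results headers; join them all.
    auth = " ".join(msg.get_all("Authentication-Results", [])).lower()

    from_domain = parseaddr(msg.get("From", ""))[1].lower().rpartition("@")[2]
    reply_domain = parseaddr(msg.get("Reply-To", ""))[1].lower().rpartition("@")[2]

    return {
        "from_domain": from_domain,
        "spf_fail": "spf=fail" in auth or "spf=softfail" in auth,
        "dmarc_fail": "dmarc=fail" in auth,
        # Replies routed to a different domain than the visible sender.
        "reply_to_redirect": bool(reply_domain) and reply_domain != from_domain,
    }
```

run it over every .eml in the set and group results by from_domain: a second sender domain shows up as a second group.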

  2. phishing url email extractor

    input: html + text parts from lure messages. extracts href vs display mismatches, tracking pixels, and hidden links.
    why second: the click url is rarely the visible anchor text, so extract before you block the wrong domain.
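the href vs display check in step 2 can be sketched with python's standard html.parser. this is an illustration only: the class LinkAudit and the function mismatched_links are hypothetical names, and the sketch covers anchor mismatches, not tracking pixels or hidden links.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkAudit(HTMLParser):
    """Collect (href, visible text) pairs for every anchor in an html part."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def mismatched_links(html: str) -> list[tuple[str, str]]:
    """Return anchors whose visible text looks like a url on a different host."""
    parser = LinkAudit()
    parser.feed(html)
    parser.close()
    out = []
    for href, text in parser.links:
        if "." not in text or " " in text:
            continue  # visible text does not look like a url
        shown = text if "://" in text else "http://" + text
        shown_host = urlparse(shown).hostname
        href_host = urlparse(href).hostname
        if shown_host and href_host and shown_host != href_host:
            out.append((href, text))
    return out
```

the href in each returned pair is the value to block, not the text the user saw.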

  3. email attachment scanner

    input: mime attachments from the campaign. flags double extensions, mime type mismatches, and embedded archives.
    why third: attachment lures run parallel to link lures in the same campaign.
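a rough sketch of two of the step 3 checks, double extensions and declared type vs filename, using only the standard library. the RISKY_EXT set is an assumption chosen for illustration, and attachment_flags and scan_message are hypothetical names. embedded archive inspection is not covered here.

```python
import mimetypes
from email import message_from_bytes
from pathlib import PurePosixPath

# Assumed list of final extensions worth flagging; tune for your environment.
RISKY_EXT = {".exe", ".js", ".scr", ".hta", ".lnk", ".iso", ".html", ".htm"}


def attachment_flags(filename: str, declared_type: str) -> list[str]:
    """Flag a double extension and a declared mime type that contradicts the name."""
    flags = []
    suffixes = [s.lower() for s in PurePosixPath(filename).suffixes]
    if len(suffixes) >= 2 and suffixes[-1] in RISKY_EXT:
        flags.append("double-extension")
    guessed, _ = mimetypes.guess_type(filename)
    if guessed and guessed != declared_type.lower():
        flags.append("mime-mismatch")
    return flags


def scan_message(raw_eml: bytes) -> dict[str, list[str]]:
    """Apply attachment_flags to every named part of one message."""
    msg = message_from_bytes(raw_eml)
    results = {}
    for part in msg.walk():
        name = part.get_filename()
        if name:
            results[name] = attachment_flags(name, part.get_content_type())
    return results
```

a filename check alone is weak evidence. comparing the declared type against the file's magic bytes would be a stronger test, at the cost of decoding each payload.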

  4. url unshortener chain

    input: shortened urls from messages. expands bit.ly / tinyurl chains to the landing host (offline mock trace in fixtures).
    why fourth: shorteners hide the kit domain until you expand the chain.
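an offline trace like the one in step 4 can be sketched as a walk over a recorded redirect map, with no network access. the map format, the function name expand_chain, and every url in the usage note are assumptions for illustration, not the fixture's actual contents.

```python
def expand_chain(url: str, redirects: dict[str, str], max_hops: int = 10) -> list[str]:
    """Follow recorded redirects from url; stop on a loop or after max_hops."""
    chain = [url]
    seen = {url}
    while chain[-1] in redirects and len(chain) <= max_hops:
        nxt = redirects[chain[-1]]
        if nxt in seen:
            break  # redirect loop, keep what was traced so far
        chain.append(nxt)
        seen.add(nxt)
    return chain
```

usage, with made-up hops: given a map of `{"https://bit.ly/3xAmpl": "https://tinyurl.com/nw-reset", "https://tinyurl.com/nw-reset": "https://login.kit-host.example/m365/index.php"}`, the last element of the returned chain is the landing url. record its host and path, and keep the intermediate hops as iocs too.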

  5. domain reputation check

    input: harvested domains and subdomains. scores lookalikes, fresh registrations, and dga-shaped names.
    why fifth: infrastructure pivoting separates a one-off phish from a registered campaign kit.
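offline heuristics of the kind step 5 describes might look like the sketch below. every threshold here is an assumption picked for illustration (similarity of at least 0.8, registration age of 30 days or less, entropy above 3.5 bits on labels of 12 or more characters), and score_domain is a hypothetical name. the registration date has to come from your own whois capture.

```python
import math
from collections import Counter
from datetime import date
from difflib import SequenceMatcher
from typing import Optional


def shannon_entropy(label: str) -> float:
    """Bits per character; random-looking labels score higher."""
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in Counter(label).values())


def score_domain(domain: str, protected: list[str],
                 registered: Optional[date] = None,
                 today: Optional[date] = None) -> list[str]:
    """Return heuristic flags for one domain; thresholds are illustrative."""
    flags = []
    # Inspect every label except the last one (the tld).
    labels = [l for l in domain.lower().split(".")[:-1] if l]

    for label in labels:
        for brand in protected:
            ratio = SequenceMatcher(None, label, brand).ratio()
            # Close to a protected name but not identical to it.
            if 0.8 <= ratio < 1.0 and "lookalike" not in flags:
                flags.append("lookalike")

    if registered is not None:
        age = ((today or date.today()) - registered).days
        if 0 <= age <= 30:
            flags.append("fresh-registration")

    if any(len(l) >= 12 and shannon_entropy(l) > 3.5 for l in labels):
        flags.append("dga-shaped")
    return flags
```

treat the flags as triage hints, not verdicts: short dictionary-word domains pass all three checks and can still host a kit.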

  6. ioc extractor

    input: mixed campaign dump (headers, html, attachments, and landing page captures).
    why sixth: builds the master ioc list before dedupe and blocklist export.
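a small regex-based sketch of the extraction in step 6. the patterns are deliberately loose assumptions (the ipv4 pattern does not check octet ranges, and only sha-256 hashes are matched), and extract_iocs is a hypothetical name.

```python
import re
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
SHA256_RE = re.compile(r"\b[0-9a-fA-F]{64}\b")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_iocs(text: str) -> dict[str, list[str]]:
    """Pull urls, their hosts, sha-256 hashes, and ipv4 addresses from a text dump."""
    # Trim sentence punctuation that often trails a url in prose.
    urls = {u.rstrip(".,;)") for u in URL_RE.findall(text)}
    domains = {urlparse(u).hostname for u in urls}
    return {
        "urls": sorted(urls),
        "domains": sorted(d for d in domains if d),
        "sha256": sorted({h.lower() for h in SHA256_RE.findall(text)}),
        "ipv4": sorted(set(IPV4_RE.findall(text))),
    }
```

the output is a raw list: defanged entries such as hxxp:// urls are not matched here and need refanging first.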

  7. ioc deduplicator normalizer

    input: raw ioc list from the prior step. dedupes urls, domains, and hashes, and normalizes defanged forms.
    why seventh: campaign triage produces duplicate hosts across waves, so normalize before sharing with soc.
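normalization as described in step 7 can be sketched as refang, canonicalize, then order-preserving dedupe. the defang forms handled (hxxp, [.], (.), [:]) are an assumed subset, and refang, normalize, and dedupe are hypothetical names.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def refang(ioc: str) -> str:
    """Undo common defanging: hxxp(s), [.], (.), and [:]."""
    s = ioc.strip()
    s = s.replace("[.]", ".").replace("(.)", ".").replace("[:]", ":")
    return re.sub(r"^hxxp", "http", s, flags=re.IGNORECASE)


def normalize(ioc: str) -> str:
    """Lowercase hosts and hashes; keep url paths case-sensitive."""
    s = refang(ioc)
    if "://" in s:
        parts = urlsplit(s)
        return urlunsplit(
            (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
        )
    # Bare domain or hash: lowercase and drop a trailing root dot.
    return s.lower().rstrip(".")


def dedupe(raw: list[str]) -> list[str]:
    """Normalize every entry and drop repeats, keeping first-seen order."""
    return list(dict.fromkeys(normalize(i) for i in raw))
```

url fragments are dropped on purpose in this sketch, since they never reach the server and so do not help a proxy block.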

  8. javascript deobfuscator

    input: kit loader js from the landing page or an attachment. strips obfuscation layers on credential harvest scripts.
    why last: kit fingerprinting ties waves to the same kit, and likely the same actor, even when domains rotate.
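one obfuscation layer from step 8 can be sketched as static string decoding, followed by a hash of the whitespace-normalized result for comparison across waves. this handles only \xNN escapes and String.fromCharCode calls with literal numbers, which is an assumption about the kit. real loaders often stack more layers. strip_layer and kit_fingerprint are hypothetical names, and the script is never executed.

```python
import hashlib
import re


def strip_layer(js: str) -> str:
    """Decode \\xNN escapes and literal String.fromCharCode(...) calls in place."""
    js = re.sub(r"\\x([0-9a-fA-F]{2})", lambda m: chr(int(m.group(1), 16)), js)

    def _from_char_code(m):
        codes = [int(c) for c in m.group(1).split(",")]
        return '"' + "".join(chr(c) for c in codes) + '"'

    return re.sub(
        r"String\.fromCharCode\(\s*((?:\d+\s*,\s*)*\d+)\s*\)", _from_char_code, js
    )


def kit_fingerprint(js: str) -> str:
    """sha-256 of the decoded script with runs of whitespace collapsed."""
    normalized = re.sub(r"\s+", " ", strip_layer(js)).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

two captures that differ only in formatting yield the same fingerprint. captures that differ in embedded domains will not, so compare the decoded text as well before concluding the kits differ.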

common false leads

  • one reported email equals one attacker: campaigns run parallel domains and templates.
  • blocking the display url is enough: users click the href, not the visible text.
  • spf passed so it is safe: spf validates the envelope sender (return-path), not the from header users see. check dmarc alignment as well.
  • no attachment means no malware: link-only credential harvest is the dominant pattern.
  • the kit host is the only ioc: shorteners and redirectors rotate faster than landing pages.

what we can tell you, what we can't

we can tell you:

  • header-level authentication failures across a message set
  • url/display mismatches and tracking artifacts in html parts
  • attachment mime mismatches and suspicious extensions
  • expanded shortener chains from captured urls (offline in browser tools)
  • deduplicated ioc lists and deobfuscated kit javascript patterns

we can't tell you:

  • live domain reputation from external feeds — offline heuristics only
  • who clicked in your org without your proxy/dns exports
  • attribution to a named apt — intel and law enforcement territory
  • guaranteed block efficacy — soc must deploy iocs in your stack

handing it off

  • soc / mail team: normalized ioc csv, kit js hash, blocked domains, quarantine search queries.
  • identity team: clicker list, reset status, mfa re-enrollment for compromised accounts.
  • law enforcement / IC3: representative .eml set, landing page capture, list of kit domains.

further reading

reference investigation

synthetic fixture northwind-phishing-campaign — two-wave m365 + apple id lure with shorteners, mime mismatch attachment, and obfuscated kit js, seed northwind-phishing-campaign:v1. compare output via npm run check:flagship.

proof page: /forensics/proof/northwind-phishing-campaign · fixture download: evidence zip · case playbook: case type tools
