phishing campaign investigation — methodology
a phishing campaign investigation covers more than one suspicious email: multiple waves, shared kit infrastructure, rotating domains, and a mix of link and attachment lures. your job is to extract stable iocs, fingerprint the kit by deobfuscating its javascript, and give soc a normalized blocklist before the next wave reaches users who already clicked on wave one.
what evidence exists and how fast it dies
| artifact | volatility | time to loss |
|---|---|---|
| original .eml / .msg per wave | persistent if saved | destroyed if users delete or auto-purge runs |
| landing page html + kit js | volatile at host | hours — kits rotate on compromise |
| shortener redirect chains | rolling | links expire or get flagged within days |
| mail gateway quarantine export | rolling | retention varies 7–30 days |
| user click telemetry (proxy/dns) | rolling | 30–90 days typical |
| domain whois at triage time | persistent | privacy redaction increases over time |
the first 10 minutes
- pull all reported messages as .eml — do not forward.
- export mail gateway quarantine for the reported subject lines and sender domains — 14 days back.
- search org mail for the same subject, attachment hash, or url host across all mailboxes.
- preserve one landing page capture (html + js) before the host goes offline.
- expand every shortener url found — record final host and path.
- identify users who clicked — proxy or dns logs if available.
- reset credentials for clickers who submitted the lure form.
- block kit domains and shortener targets at proxy — not just the first reported url.
- open ticket with domain registrar / hosting abuse if infrastructure is fresh.
- begin the path below.
the path
1. phishing header analyzer
campaign .eml set. surfaces spf/dmarc failures, reply-to redirects, and authentication gaps across waves.
why first: campaign scope starts with headers. one wave's iocs mean nothing if you miss the second sender domain.
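a minimal sketch of the header pass, assuming the receiving gateway stamped an Authentication-Results header on each message. the function names (`header_findings`, `sender_domains`) are hypothetical and are not the tool's implementation; only the python standard library is used.

```python
import re
from email import message_from_string
from email.utils import parseaddr

def domain_of(header_value):
    """Lowercased domain of an address header, or '' when absent or malformed."""
    addr = parseaddr(header_value or "")[1]
    return addr.rpartition("@")[2].lower() if "@" in addr else ""

def header_findings(raw_eml):
    """Summarize authentication verdicts and reply-to redirection for one message."""
    msg = message_from_string(raw_eml)
    # Authentication-Results is added by the receiving gateway; it may be missing.
    auth = " ".join(msg.get_all("Authentication-Results") or []).lower()
    verdicts = dict(re.findall(r"\b(spf|dkim|dmarc)=(\w+)", auth))
    from_domain = domain_of(msg.get("From"))
    reply_domain = domain_of(msg.get("Reply-To"))
    return {
        "from_domain": from_domain,
        "spf": verdicts.get("spf", "missing"),
        "dkim": verdicts.get("dkim", "missing"),
        "dmarc": verdicts.get("dmarc", "missing"),
        "reply_to_redirect": bool(reply_domain) and reply_domain != from_domain,
    }

def sender_domains(raw_messages):
    """Distinct From domains across a campaign set, to catch a second sender."""
    return sorted({header_findings(raw)["from_domain"] for raw in raw_messages})
```

running `sender_domains` over every wave's .eml text gives the list of sender domains to scope before any blocking decision.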
2. phishing url email extractor
html + text parts from lure messages. extracts href vs display mismatches, tracking pixels, and hidden links.
why second: the click url is rarely the visible anchor text, so extract before you block the wrong domain.
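the href-versus-display comparison can be sketched with the standard library html parser. this is an illustration under assumptions, not the extractor itself: `LinkAudit` and `mismatches` are hypothetical names, and a tracking pixel is assumed to be an image declared 0 or 1 pixels wide and high.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkAudit(HTMLParser):
    """Collects (href, visible text) pairs and tiny images from an html part."""
    def __init__(self):
        super().__init__()
        self.links = []
        self.pixels = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "a" and a.get("href"):
            self._href, self._text = a["href"], []
        elif tag == "img" and a.get("width") in ("0", "1") and a.get("height") in ("0", "1"):
            self.pixels.append(a.get("src", ""))

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

def host(url):
    """Hostname of a url, tolerating text that omits the scheme."""
    return (urlparse(url if "://" in url else "//" + url).hostname or "").lower()

def mismatches(html):
    """Return (links whose visible host differs from the href host, pixel urls)."""
    parser = LinkAudit()
    parser.feed(html)
    parser.close()
    flagged = []
    for href, text in parser.links:
        # Only compare when the visible text itself looks like a url or host.
        shown = host(text) if "." in text and " " not in text else ""
        if shown and shown != host(href):
            flagged.append((text, href))
    return flagged, parser.pixels
```

the href host, not the displayed one, is what belongs on the blocklist.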
3. email attachment scanner
mime attachments from the campaign. flags double extensions, mime/type mismatch, and embedded archives.
why third: attachment lures run parallel to link lures in the same campaign.
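a hedged sketch of the attachment checks follows. the extension table, the risky-extension set, and the flag names are assumptions made for the example. the zip check looks only for the `PK\x03\x04` local-file signature, so it will also fire on zip-based formats such as office documents.

```python
import os
from email import message_from_string

# Assumed extension-to-MIME table; extend it for the file types your org sees.
EXPECTED_MIME = {
    ".pdf": "application/pdf",
    ".html": "text/html",
    ".htm": "text/html",
    ".zip": "application/zip",
    ".txt": "text/plain",
}
RISKY_FINAL_EXT = {".exe", ".scr", ".js", ".vbs", ".hta", ".lnk", ".html", ".htm"}

def attachment_flags(raw_eml):
    """Return (filename, declared_type, flags) for each named MIME part."""
    findings = []
    for part in message_from_string(raw_eml).walk():
        name = part.get_filename()
        if not name:
            continue
        flags = []
        stem, ext = os.path.splitext(name.lower())
        # e.g. invoice.pdf.html: a risky final extension hiding behind a benign one.
        if ext in RISKY_FINAL_EXT and os.path.splitext(stem)[1]:
            flags.append("double-extension")
        declared = part.get_content_type()
        expected = EXPECTED_MIME.get(ext)
        if expected and expected != declared:
            flags.append("mime-mismatch")
        payload = part.get_payload(decode=True) or b""
        if payload.startswith(b"PK\x03\x04") and ext != ".zip":
            flags.append("zip-content")
        findings.append((name, declared, flags))
    return findings
```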
4. url unshortener chain
shortened urls from messages. expands bit.ly / tinyurl chains to the landing host (offline mock trace in fixtures).
why fourth: shorteners hide the kit domain until you expand the chain.
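an offline trace can be sketched as a walk over a recorded redirect map, so nothing touches the live shortener. the map contents and the names `MOCK_REDIRECTS`, `expand_chain`, and `landing` are hypothetical; in practice the map would come from Location headers captured at triage time.

```python
from urllib.parse import urlparse

# Hypothetical recorded redirects (source url -> Location header value).
MOCK_REDIRECTS = {
    "https://bit.ly/3xNw1nd": "https://tinyurl.com/nw-reset",
    "https://tinyurl.com/nw-reset": "https://login.northwind-secure.example/m365/index.html",
}

def expand_chain(url, redirects, max_hops=10):
    """Follow recorded redirects from url, stopping on loops or after max_hops."""
    chain = [url]
    seen = {url}
    while chain[-1] in redirects and len(chain) <= max_hops:
        nxt = redirects[chain[-1]]
        if nxt in seen:  # redirect loop, stop rather than spin
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain

def landing(chain):
    """Final host and path, the pair to record for the blocklist."""
    final = urlparse(chain[-1])
    return final.hostname, final.path
```

every intermediate hop in the returned chain is also worth recording, since redirectors can be reused across waves.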
5. domain reputation check
harvested domains and subdomains. scores lookalikes, fresh registration, and dga-shaped names.
why fifth: infrastructure pivoting separates one-off phish from a registered campaign kit.
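the three heuristics can be approximated offline as below. every threshold here (edit distance of at most 2, 30 days for "fresh", entropy above 3.5 bits on labels of 12 or more characters) is an assumption chosen for illustration, and the sketch scores only the leftmost label, so it expects a registrable domain rather than a deep subdomain.

```python
import math
from collections import Counter
from datetime import date

def edit_distance(a, b):
    """Levenshtein distance using two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def shannon_entropy(label):
    """Bits per character; random-looking labels score higher."""
    counts = Counter(label)
    return -sum(n / len(label) * math.log2(n / len(label)) for n in counts.values())

def score_domain(domain, brands, registered=None, today=None):
    """Return heuristic flags for one domain; registered is a whois creation date."""
    label = domain.lower().split(".")[0]
    flags = []
    # Close to a watched brand name but not identical to it.
    if any(0 < edit_distance(label, brand) <= 2 for brand in brands):
        flags.append("lookalike")
    if registered is not None and ((today or date.today()) - registered).days < 30:
        flags.append("fresh-registration")
    if len(label) >= 12 and shannon_entropy(label) > 3.5:
        flags.append("dga-shaped")
    return flags
```

for example, `score_domain("micr0soft.example", ["microsoft"])` flags a lookalike because one substitution separates the two labels.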
6. ioc extractor
mixed campaign dump: headers, html, attachments, and landing page captures.
why sixth: builds the master ioc list before dedupe and blocklist export.
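a rough sketch of regex-based extraction over a mixed text dump. the patterns are assumptions, not the tool's rules: they accept defanged forms such as `hxxps` and `[.]`, and the domain pattern will also match file names like `index.html`, so its output needs review before it reaches a blocklist.

```python
import re

# Live or defanged http(s) urls, up to the first quote, bracket, or whitespace.
URL_RE = re.compile(r"\bh(?:tt|xx)ps?://[^\s\"'<>]+", re.I)
# Dotted names, with either '.' or the defanged '[.]' as separator.
DOMAIN_RE = re.compile(r"\b(?:[a-z0-9-]+(?:\[\.\]|\.))+[a-z]{2,}\b", re.I)
# 64 hex chars (sha256 length) or 32 hex chars (md5 length).
HASH_RE = re.compile(r"\b[a-fA-F0-9]{64}\b|\b[a-fA-F0-9]{32}\b")

def extract_iocs(text):
    """Return raw, un-deduplicated matches grouped by ioc type."""
    return {
        "urls": URL_RE.findall(text),
        "domains": DOMAIN_RE.findall(text),
        "hashes": HASH_RE.findall(text),
    }
```

the output is deliberately raw: duplicates and defanged variants are left in place for the normalizer step.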
7. ioc deduplicator normalizer
raw ioc list from prior step. dedupes urls, domains, hashes, and normalizes defanged forms.
why seventh: campaign triage produces duplicate hosts across waves, so normalize before sharing with soc.
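normalization before dedupe might look like the sketch below. the choices are assumptions for the example: defanged forms are refanged, scheme and host are lowercased, url fragments and userinfo are dropped, and path case is preserved because paths can be case-sensitive.

```python
import re
from urllib.parse import urlsplit, urlunsplit

def refang(value):
    """Undo common defanging: [.] or (.) -> '.', [:] -> ':', hxxp -> http."""
    v = value.strip()
    v = v.replace("[.]", ".").replace("(.)", ".").replace("[:]", ":")
    return re.sub(r"hxxp", "http", v, flags=re.I)

def normalize(value):
    """Canonical form of a url, domain, or hash string."""
    v = refang(value)
    if "://" in v:
        parts = urlsplit(v)
        host = (parts.hostname or "").lower()
        netloc = host if parts.port is None else f"{host}:{parts.port}"
        return urlunsplit((parts.scheme.lower(), netloc, parts.path or "/", parts.query, ""))
    # Domains and hex hashes compare case-insensitively.
    return v.lower().rstrip(".")

def dedupe(values):
    """Normalized values in first-seen order, duplicates removed."""
    seen, out = set(), []
    for value in values:
        n = normalize(value)
        if n and n not in seen:
            seen.add(n)
            out.append(n)
    return out
```

keeping first-seen order makes it easier to trace an entry back to the wave that produced it.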
8. javascript deobfuscator
kit loader js from landing page or attachment. strips obfuscation layers on credential harvest scripts.
why last: kit fingerprinting confirms the same actor across waves even when domains rotate.
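to illustrate layer peeling without executing untrusted code, the sketch below statically decodes three common string encodings (hex escapes, `String.fromCharCode` with literal arguments, and `atob` with a literal argument) and hashes the result. it is an assumption-laden toy, not a javascript parser: `peel` and `kit_fingerprint` are hypothetical names, and real kits often need a proper ast-based tool.

```python
import base64
import hashlib
import re

def _hex(match):
    return chr(int(match.group(1), 16))

def _charcodes(match):
    decoded = "".join(chr(int(n)) for n in re.findall(r"\d+", match.group(1)))
    return '"' + decoded + '"'

def _atob(match):
    try:
        return '"' + base64.b64decode(match.group(1)).decode("utf-8") + '"'
    except (ValueError, UnicodeDecodeError):
        return match.group(0)  # leave undecodable literals untouched

def peel(js, max_rounds=5):
    """Repeatedly decode literal string encodings until nothing changes."""
    for _ in range(max_rounds):
        out = re.sub(r"\\x([0-9a-fA-F]{2})", _hex, js)
        out = re.sub(r"String\.fromCharCode\(([\d,\s]+)\)", _charcodes, out)
        out = re.sub(r"atob\(\s*[\"']([A-Za-z0-9+/=]+)[\"']\s*\)", _atob, out)
        if out == js:
            break
        js = out
    return js

def kit_fingerprint(js):
    """sha256 over the peeled source with whitespace collapsed."""
    normalized = re.sub(r"\s+", " ", peel(js)).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

two landing pages on different domains that yield the same fingerprint are a strong hint of a shared kit, though a match on this simplified hash is not proof of a shared actor.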
common false leads
- one reported email equals one attacker — campaigns run parallel domains and templates.
- blocking the display url is enough — users click the href, not the visible text.
- spf passed so it is safe — spf validates envelope, not the from header users see.
- no attachment means no malware — link-only credential harvest is the dominant pattern.
- the kit host is the only ioc — shorteners and redirectors rotate faster than landing pages.
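the spf false lead above can be made concrete by comparing the envelope domain that passed spf with the from header domain. this sketch assumes the gateway records `smtp.mailfrom` in Authentication-Results, and it checks exact equality only; dmarc's relaxed alignment compares organizational domains, which needs a public suffix list and is out of scope here. `spf_alignment` is a hypothetical name.

```python
import re
from email import message_from_string
from email.utils import parseaddr

def spf_alignment(raw_eml):
    """Compare the spf-validated envelope domain with the visible From domain.

    Returns None when no spf=pass with smtp.mailfrom is recorded.
    """
    msg = message_from_string(raw_eml)
    auth = " ".join(msg.get_all("Authentication-Results") or []).lower()
    match = re.search(r"spf=pass\b.*?smtp\.mailfrom=([^\s;]+)", auth, re.S)
    if match is None:
        return None
    # smtp.mailfrom may be a full address or a bare domain.
    envelope_domain = match.group(1).rpartition("@")[2]
    header_domain = parseaddr(msg.get("From", ""))[1].rpartition("@")[2].lower()
    return {
        "envelope_domain": envelope_domain,
        "header_from_domain": header_domain,
        "aligned": envelope_domain == header_domain,
    }
```

a message can show spf=pass for a bulk sender's own domain while displaying a different brand in the from header, which is why a pass alone says little.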
what we can tell you, what we can't
we can tell you:
- header-level authentication failures across a message set
- url/display mismatches and tracking artifacts in html parts
- attachment mime mismatches and suspicious extensions
- expanded shortener chains from captured urls (offline in browser tools)
- deduplicated ioc lists and deobfuscated kit javascript patterns
we can't tell you:
- live domain reputation from external feeds — offline heuristics only
- who clicked in your org without your proxy/dns exports
- attribution to a named apt — intel and law enforcement territory
- guaranteed block efficacy — soc must deploy iocs in your stack
handing it off
- soc / mail team: normalized ioc csv, kit js hash, blocked domains, quarantine search queries.
- identity team: clicker list, reset status, mfa re-enrollment for compromised accounts.
- law enforcement / IC3: representative .eml set, landing page capture, list of kit domains.
further reading
reference investigation
synthetic fixture northwind-phishing-campaign — two-wave m365 + apple id lure with shorteners, mime mismatch attachment, and obfuscated kit js, seed northwind-phishing-campaign:v1. compare output via npm run check:flagship.
proof page: /forensics/proof/northwind-phishing-campaign · fixture download: evidence zip · case playbook: case type tools