AI-generated content dispute — methodology
an ai-content dispute is not a deepfake investigation and not a plagiarism scan. it is a fight over authorship: did this letter, image, or module come from a human or from a generator? evidence lives in PNG tEXt chunks, ComfyUI workflow JSON, IDE attribution headers, and chat export timelines. metadata gets stripped on upload. heuristics disagree. the first job is to preserve originals before anyone re-saves them through Instagram, Slack, or a CMS.
what evidence exists and how fast it dies
| artifact | volatility | how and when it is lost |
|---|---|---|
| original PNG/JPEG before platform upload | persistent if saved | generation chunks destroyed on first social re-export |
| ComfyUI workflow JSON + embedded PNG workflow tEXt | persistent | lost if only the flattened render is kept |
| A1111 parameters tEXt chunk (seed, model, sampler) | persistent in original file | stripped within minutes of Discord/Slack upload |
| IDE / Copilot attribution in source files | persistent in git | removed if someone rebases or squashes before you pull |
| LLM chat export (ChatGPT / Claude conversation JSON) | persistent if exported | account deletion or retention expiry — vendor-dependent |
| git commit history for disputed module | persistent | force-push or history rewrite destroys prior attribution |
| platform CDN copy (Twitter, Instagram, LinkedIn) | persistent but degraded | metadata already gone · recompression immediate |
the first 10 minutes
- stop re-sharing the disputed files through chat apps, email, or CMS — every hop strips metadata.
- collect originals from the claimant, the accuser, and any shared drive — not screenshots of screenshots.
- sha-256 hash every file at collection time. write down who handed you what and when.
- export LLM chat logs if either party admits using ChatGPT, Claude, or Copilot — JSON export, not copy-paste.
- pull git history for disputed code before anyone force-pushes or amends commits.
- download platform CDN copies separately — they are degraded but timestamped.
- photograph or archive the dispute thread (email, Slack, ticket) showing when each asset was submitted.
- do not run disputed images through any editor, converter, or “optimize” pipeline.
- separate text, image, and code artifacts — each modality has a different tool path.
- begin the path below on the highest-fidelity originals you have.
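the hashing and hand-off notes in the list above can be sketched in Python. this is a minimal sketch under stated assumptions, not the tooling behind this page: the function names, the JSON-lines log format, and the record fields are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size blocks so large originals never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def log_entry(path: Path, received_from: str) -> dict:
    """Build one preservation-log record (field names are illustrative assumptions)."""
    return {
        "file": path.name,
        "sha256": sha256_file(path),
        "size_bytes": path.stat().st_size,
        "received_from": received_from,
        "collected_at_utc": datetime.now(timezone.utc).isoformat(),
    }


def append_to_log(log_path: Path, entry: dict) -> None:
    """Append the record as one JSON line; earlier lines are never rewritten."""
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry, sort_keys=True) + "\n")
```

an append-only log is a deliberate choice here: each hand-off adds a line, so the order of collection stays visible and no earlier record has to be edited.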
the path
1. ai generated text fingerprint analyzer
drop the disputed letter, contract clause, or campaign copy as plain text. surfaces low burstiness, transition-phrase density, and structural tells common in LLM output.
why first: text disputes arrive first and are easiest to preserve, but stylistic signals alone are not proof. establish a baseline before you touch images or code.
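to make "burstiness" and "transition-phrase density" concrete, here is one possible way to compute them. the definitions are assumptions: burstiness is taken as the coefficient of variation of sentence length, and the phrase list is hypothetical. how the analyzer itself measures these is not documented on this page.

```python
import re
import statistics

# Hypothetical phrase list for illustration only.
TRANSITIONS = ("moreover", "furthermore", "in conclusion", "additionally", "however")


def sentence_lengths(text: str) -> list[int]:
    """Split on ., !, ? and return the word count of each non-empty sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Coefficient of variation of sentence length: population stdev / mean.

    Near 0 means uniform sentence lengths; higher means more variation.
    """
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    return statistics.pstdev(lengths) / statistics.mean(lengths)


def transition_density(text: str) -> float:
    """Transition phrases per 100 words, matched case-insensitively on word boundaries."""
    words = text.split()
    if not words:
        return 0.0
    lowered = text.lower()
    hits = sum(len(re.findall(rf"\b{re.escape(p)}\b", lowered)) for p in TRANSITIONS)
    return 100.0 * hits / len(words)
```

both numbers describe style only. a low burstiness value flags text for a closer look and says nothing on its own about who wrote it.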
2. ai generated image provenance analyzer
drop original PNG/JPEG files before any re-export. scans chunk structure, embedded parameters, and cross-artifact consistency across the image set.
why second: a single image often carries generation metadata even when the surrounding narrative claims human authorship.
3. ai generated image metadata stripper detector
compare the social-platform export against the claimed-original file. flags recompression, missing generation chunks, and strip-and-repost patterns.
why third: actors strip PNG tEXt chunks before posting, so the stripped export is often the only version in circulation. you need to prove metadata was removed, not that it never existed.
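the original-versus-export comparison can be illustrated by listing tEXt keywords in both files and taking the difference. this is a sketch under assumptions: both files are PNG (a JPEG export would need a different reader), only tEXt chunks are inspected (not iTXt or zTXt), CRCs are not verified, and the function names are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data: bytes):
    """Yield (chunk_type, payload) for each chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[offset:offset + 8])
        payload = data[offset + 8:offset + 8 + length]
        yield ctype.decode("latin-1"), payload
        offset += 12 + length
        if ctype == b"IEND":
            break


def text_keywords(data: bytes) -> set[str]:
    """Keywords of all tEXt chunks (the keyword is the bytes before the first NUL)."""
    return {
        payload.split(b"\x00", 1)[0].decode("latin-1")
        for ctype, payload in iter_chunks(data)
        if ctype == "tEXt"
    }


def stripped_keywords(original: bytes, export: bytes) -> set[str]:
    """Keywords present in the claimed original but absent from the platform export."""
    return text_keywords(original) - text_keywords(export)
```

a non-empty result such as `{"parameters"}` shows the export lost a chunk that the claimed original carries. it does not show who removed it or when.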
4. ai generated code provenance analyzer
drop disputed source files (.ts, .py, .js). surfaces Copilot/GitHub attribution markers, generic handler stubs, and TODO-density patterns.
why fourth: code disputes hinge on IDE attribution headers and structural sameness, which is a separate question from whether the logic is correct.
5. stable diffusion generation metadata extractor
drop PNGs with A1111-style parameters tEXt chunks. extracts seed, sampler, steps, cfg, model checkpoint name.
why fifth: SD parameters in the file are hard to forge convincingly; seed + model + sampler alignment is stronger than any stylistic guess.
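as an illustration of what extraction involves, the sketch below parses the text of a parameters chunk once it has been read out of the PNG. it assumes the commonly seen A1111 layout: prompt lines, an optional "Negative prompt:" section, and a final settings line beginning with "Steps:". the function name and the returned dictionary shape are hypothetical.

```python
import re

# "Key: value" pairs on the settings line; a quoted value may contain commas.
_PAIR = re.compile(r'\s*([\w ]+):\s*("(?:\\.|[^"\\])*"|[^,]*)(?:,|$)')


def parse_parameters(text: str) -> dict:
    """Split A1111-style parameters text into prompt, negative prompt, and settings."""
    lines = text.strip().split("\n")
    # Assumption: the settings line is last and starts with "Steps:".
    settings_line = lines[-1] if lines and lines[-1].startswith("Steps:") else ""
    body = lines[:-1] if settings_line else lines

    prompt_lines, negative_lines, in_negative = [], [], False
    for line in body:
        if line.startswith("Negative prompt:"):
            in_negative = True
            line = line[len("Negative prompt:"):]
        (negative_lines if in_negative else prompt_lines).append(line.strip())

    # Values are kept as strings so nothing is silently reinterpreted.
    settings = {k.strip(): v.strip().strip('"') for k, v in _PAIR.findall(settings_line)}
    return {
        "prompt": "\n".join(prompt_lines).strip(),
        "negative_prompt": "\n".join(negative_lines).strip(),
        "settings": settings,
    }
```

for example, `parse_parameters(chunk_text)["settings"].get("Seed")` returns the seed as a string, ready to compare against another render or a config file.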
6. comfyui workflow forensic analyzer
drop standalone workflow JSON or PNGs with embedded workflow tEXt. parses node graph, CLIP prompts, checkpoint refs, and render chain.
why sixth: ComfyUI embeds the full generation graph, so prompts, models, and node wiring survive even when the poster claims manual photography.
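a minimal sketch of pulling prompts and checkpoint references out of a ComfyUI graph. it assumes the API-style "prompt" JSON, a mapping of node id to `class_type` and `inputs`; the UI-style workflow export uses a different layout and is not handled here. the function name and return shape are hypothetical.

```python
import json
from collections import Counter


def summarize_prompt_graph(raw_json: str) -> dict:
    """Summarize an API-format ComfyUI graph: {node_id: {"class_type", "inputs"}}."""
    graph = json.loads(raw_json)
    prompts, checkpoints = [], []
    for node_id, node in sorted(graph.items()):
        inputs = node.get("inputs", {})
        # A text input can also be a link to another node (a list); keep literal strings only.
        if node.get("class_type") == "CLIPTextEncode" and isinstance(inputs.get("text"), str):
            prompts.append((node_id, inputs["text"]))
        # Checkpoint loaders name their model file in "ckpt_name".
        if isinstance(inputs.get("ckpt_name"), str):
            checkpoints.append(inputs["ckpt_name"])
    node_types = Counter(node.get("class_type") for node in graph.values())
    return {"prompts": prompts, "checkpoints": checkpoints, "node_types": dict(node_types)}
```

the checkpoint names returned here are the values to cross-check against any A1111 parameters or WebUI config in the same evidence set.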
7. automatic1111 artifact forensic extractor
drop A1111 config.ini, PNG parameters chunks, or WebUI export bundles. correlates checkpoint names, extension paths, and generation settings.
why seventh: config files and PNG parameters cross-validate. a portrait tagged chenVision_v20 in the PNG should match the WebUI config on disk.
8. gan fingerprint detector
drop synthetic headshots or avatars with no metadata. runs FFT/grid heuristics for GAN upsampling artifacts when provenance chunks are absent.
why last: when metadata is stripped and no workflow exists, frequency-domain tells are the fallback. they are honest, probabilistic, not definitive.
common false leads
- low burstiness proves AI authorship: formal writing, ESL authors, and edited corporate copy all score “LLM-like.” stylistic signals are triage, not verdict.
- no metadata means human-created: stripping is trivial, so a missing parameters chunk proves nothing about authorship. it counts as evidence of stripping only when a pre-upload original still carries the chunk.
- the social export is the original: platform recompression removes tEXt chunks on the first upload. always ask for the pre-upload file.
- Copilot attribution means the whole file is AI: attribution headers mark assisted sections; humans edit around them constantly.
- GAN fingerprint positive means deepfake: FFT grid artifacts also appear in upscaled phone photos and aggressive JPEG recompression. treat as one signal among many.
- matching seed across two images means copy-paste fraud: shared seeds can be intentional (batch renders from the same prompt session). context matters.
what we can tell you, what we can't
we can tell you:
- whether PNG files contain Stable Diffusion parameters, ComfyUI workflow JSON, or A1111 generation chunks
- whether a social export shows metadata stripping vs a claimed original
- structural and attribution markers in disputed source code
- text stylistic signals consistent with LLM output (with explicit confidence limits)
- frequency-domain heuristics suggestive of GAN upsampling when metadata is absent
- cross-artifact consistency — model names in PNG parameters matching WebUI config on disk
we can't tell you:
- definitively prove human authorship — absence of AI markers ≠ human wrote it
- attribute content to a specific person — that is legal and contextual, not forensic
- recover metadata stripped before you received the file — only prove it was stripped
- determine contract breach, copyright infringement, or employment outcome — counsel territory
- detect every generator — new models and custom fine-tunes outpace heuristic classifiers
handing it off
- outside counsel: sha-256 hashes, preservation log, tool output JSON, originals vs platform-degraded copies, timeline of when each party submitted each asset.
- platform abuse / trust & safety: stripped-metadata proof, CDN URL, account identifiers, generation-parameter extracts if the platform hosts the disputed render.
- HR / labor mediator: code provenance report, git history export, IDE attribution markers — not stylistic text scores alone.
- law enforcement: only if fraud or identity impersonation crosses into criminal territory — full artifact set, chain of custody, no scrubbed copies.
- expert witness / dedicated media forensics lab: when the dispute goes to litigation and heuristic browser output needs independent validation under Daubert/Frye standards.
further reading
reference investigation
synthetic fixture chen-ai-content-dispute — Chen Media campaign authenticity dispute, seed chen-ai-content-dispute:v1. bundle includes an LLM-like dispute letter, A1111 portrait with parameters tEXt, metadata-stripped social export, ComfyUI workflow JSON + embedded render, Copilot-marked TypeScript module, WebUI config, and GAN-grid synthetic headshot.
proof page: /forensics/proof/chen-ai-content-dispute · fixture download: evidence zip · case playbook: case type tools