A revenue dashboard drops 18% overnight. The pipeline is ‘green.’ The lineage graph looks right. Query history shows the job ran successfully. Yet you still can’t answer the only question leadership cares about: what changed—and can we prove it?

Traditional lineage is built for discovery: it shows what depends on what. Incidents demand evidence: what exactly ran, on which versions, with which logic and checks, and what blast radius that run created. Graphs show paths; incidents require proof.

This article proposes a practical, vendor-neutral standard you can implement with tools you already have: Minimum Incident Lineage (MIL). MIL is not a lineage UI. It’s a run-level evidence schema—the smallest set of fields that makes incidents replayable, auditable, and fast to triage, without storing raw data.

Why ‘lineage’ isn’t enough during incidents

During an incident, these questions matter more than dependency paths:

  1. What exactly ran, and when?
  2. On which input versions, producing which output version?
  3. With which logic, execution plan, and quality checks?
  4. What blast radius did that run create, and who owns the asset?

MIL targets these incident questions, not just dependency questions.

MIL in one sentence

Minimum Incident Lineage (MIL) is the minimal run-level evidence you must capture for each dataset published to reproduce, triage, and audit a data incident—without storing raw data.

MIL design principles

  1. Replayable: evidence reconstructs input versions → transform → output version.
  2. Minimal: if it’s heavy, teams won’t emit it consistently.
  3. Safe by default: store proof, not payload (hashes/IDs/buckets).
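Principle 3 can be made concrete with a small sketch: instead of logging a sensitive value or raw row, emit a stable hash that still supports equality checks across runs. The helper name and truncation length below are illustrative, not part of the schema.

```python
import hashlib

def value_proof(value: str) -> str:
    """Hash a value so runs can be compared for equality without storing the payload."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return "sha256:" + digest[:16]  # truncated for readability; keep full digests in production
```

Two runs that saw the same value produce the same proof, so you can confirm "the input didn't change" without ever persisting the input itself.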

The MIL schema: the minimum 12 fields

A) Run identity and timing

  1. mil_run_id—globally unique run identifier (orchestrator run + task + attempt)
  2. timestamp_start
  3. timestamp_end
  4. asset_id—stable catalog identifier (not just schema.table)
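As a sketch, mil_run_id can be composed mechanically from orchestrator coordinates. The format below mirrors the Airflow-style example later in this article; the helper itself is hypothetical.

```python
def make_mil_run_id(system: str, dag: str, run: str, task: str, attempt: int) -> str:
    # Globally unique: orchestrator + DAG + logical run + task + attempt number
    return f"{system}:dag={dag},run={run},task={task},try={attempt}"
```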

B) Input/output version evidence

  1. input_asset_versions[]—upstream (asset_id, version) pairs (snapshots/commits)
  2. output_asset_version—immutable version produced
  3. schema_fingerprint—hash of output schema (cols + types + order)
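One way to compute schema_fingerprint, assuming the output schema is available as an ordered list of (name, type) pairs:

```python
import hashlib

def schema_fingerprint(columns) -> str:
    # Column order is part of the canonical form, so reordering columns changes the hash
    canonical = "|".join(f"{name}:{dtype}" for name, dtype in columns)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because order and types feed the hash, silent drift such as a type widening or a column reorder shows up as a fingerprint change even when the column names look identical.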

C) Transformation and execution evidence

  1. transform_fingerprint—hash of logic (normalized SQL/dbt hash/Spark code hash)
  2. execution_fingerprint—plan/config hash (warehouse plan hash, Spark physical plan hash, key params)
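For SQL transforms, a minimal normalization (whitespace and case) before hashing keeps cosmetic edits from looking like logic changes. This is a sketch; real implementations typically hash a parsed AST or a dbt-compiled artifact instead.

```python
import hashlib
import re

def transform_fingerprint(sql: str) -> str:
    # Collapse whitespace and lowercase so formatting-only edits hash identically.
    # Caveat: lowercasing also affects string literals; AST-level hashing avoids that.
    normalized = re.sub(r"\s+", " ", sql.strip()).lower()
    return "sha256:" + hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```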

D) Quality, governance, and safety gates

  1. dq_gate_status—PASS | WARN | FAIL + dq_ruleset_version
  2. policy_tags_applied—tags at publish time (classification/masking/retention)

E) Ownership and impact

  1. owner_ref—team/on-call reference
  2. blast_radius—dependents count + tier/severity bucket

If you want a strict ‘12,’ make blast_radius the 12th field and enforce owner_ref via your catalog. In practice, most teams keep both because they remove the two biggest sources of incident latency: “Who owns this?” and “Who is impacted?”
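blast_radius can be derived from the dependency graph you already have. The sketch below counts transitive dependents with a breadth-first walk; the severity thresholds are placeholders to tune per team.

```python
from collections import deque

def blast_radius(asset_id: str, downstream: dict) -> dict:
    """Count transitive dependents of an asset. downstream maps asset -> direct dependents."""
    seen, queue = set(), deque([asset_id])
    while queue:
        for dep in downstream.get(queue.popleft(), []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    # Placeholder severity bucketing; tune thresholds to your own tiers
    tier = "SEV2" if len(seen) >= 10 else "SEV3"
    return {"dependents_count": len(seen), "tier": tier}
```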

A safe MIL event example (no raw data)

{
  "mil_run_id": "airflow:dag=rev_mart,run=2026-01-25T09:00Z,task=build,try=1",
  "asset_id": "catalog:dataset:rev_mart_v2",
  "timestamp_start": "2026-01-25T09:00:03Z",
  "timestamp_end": "2026-01-25T09:07:41Z",
  "input_asset_versions": [
    {"asset_id": "catalog:dataset:orders", "version": "iceberg:snap=8841201"},
    {"asset_id": "catalog:dataset:customers", "version": "iceberg:snap=220993"}
  ],
  "output_asset_version": "iceberg:snap=9912103",
  "schema_fingerprint": "sha256:9b3c…",
  "transform_fingerprint": "git:dbt_model_hash=3f2a…",
  "execution_fingerprint": "warehouse:plan_hash=aa81…",
  "dq_gate_status": {"status": "WARN", "ruleset_version": "dq:v7"},
  "policy_tags_applied": ["pii:none", "masking:standard"],
  "owner_ref": "oncall:data-platform:rev-marts",
  "blast_radius": {"dependents_count": 37, "tier": "SEV2"},
  "publish_action": "PUBLISHED",
  "change_context": "pr:github:org/repo#1842"
}
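Before publishing, an event like the one above can be gated on completeness. This sketch checks the thirteen fields from the schema section; publish_action and change_context are optional extras in the example.

```python
REQUIRED_MIL_FIELDS = (
    "mil_run_id", "asset_id", "timestamp_start", "timestamp_end",
    "input_asset_versions", "output_asset_version", "schema_fingerprint",
    "transform_fingerprint", "execution_fingerprint", "dq_gate_status",
    "policy_tags_applied", "owner_ref", "blast_radius",
)

def missing_mil_fields(event: dict) -> list:
    # Return required fields absent from the event so the publisher can refuse to emit it
    return [f for f in REQUIRED_MIL_FIELDS if f not in event]
```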

Where MIL fields come from (implementation blueprint)

You can source MIL from systems you already run:

  1. Orchestrator (e.g., Airflow): mil_run_id, timestamp_start, timestamp_end
  2. Catalog: asset_id, owner_ref, policy_tags_applied
  3. Table format snapshots (e.g., Iceberg): input_asset_versions[], output_asset_version
  4. Version control and dbt artifacts: transform_fingerprint
  5. Warehouse or Spark query plans: execution_fingerprint
  6. Data quality framework: dq_gate_status
  7. Lineage/dependency graph: blast_radius

Walkthrough: ‘Revenue dropped overnight’ solved using MIL

Pipelines are green, but the dashboard is down 18% after the 9 AM refresh.

  1. Query MIL for rev_mart_v2 and pull the latest mil_run_id.
  2. Compare transform_fingerprint to last known good: logic change vs not.
  3. Check schema_fingerprint for silent drift that alters joins.
  4. Compare input_asset_versions[] to isolate the upstream change quickly.
  5. Check dq_gate_status (and publish_action, if present) for enforcement gaps.
  6. Use blast_radius to set severity and notify impacted teams.
  7. If snapshots exist, roll back to the prior output_asset_version with confidence.

Failure modes MIL catches that dependency lineage often misses

  1. Silent schema drift that changes join behavior (caught by schema_fingerprint)
  2. Logic changes deployed without any change to the graph shape (caught by transform_fingerprint)
  3. An upstream snapshot swap behind the same table name (caught by input_asset_versions[])
  4. A WARN or FAIL quality gate that still published (caught by dq_gate_status and publish_action)
  5. Plan or config changes behind identical SQL (caught by execution_fingerprint)

What MIL is (and isn’t)

MIL is an evidence trail for publishers, not a replacement for your lineage UI. You can still draw dependency graphs, but MIL gives each node a verifiable run card you can inspect during triage.

MIL is also not ‘logging everything for observability.’ It’s the minimum that lets you answer incident questions quickly and defend your conclusions in a postmortem.

Where MIL lives

Most teams store MIL events in a small append-only store that's easy to query during incidents.

The only hard requirement: MIL records must be immutable and queryable by asset_id and time.
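Assuming events are stored with ISO-8601 UTC timestamps (as in the example earlier), the query requirement is just a filter plus a sort; lexicographic comparison is safe because the format is fixed-width.

```python
def query_mil(events, asset_id: str, start: str, end: str) -> list:
    # ISO-8601 UTC strings sort lexicographically, so string comparison works here
    return sorted(
        (e for e in events
         if e["asset_id"] == asset_id and start <= e["timestamp_start"] <= end),
        key=lambda e: e["timestamp_start"],
    )
```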

Implementation tips that keep MIL minimal

  1. Emit one MIL event per publish, at publish time, from the same process that writes the output.
  2. Store hashes, IDs, and buckets; never raw rows or payloads.
  3. Reuse identifiers your orchestrator, catalog, and table format already produce rather than minting new ones.

MIL readiness checklist

  1. Every published dataset emits all MIL fields, every run.
  2. Fingerprints (schema, transform, execution) are deterministic and comparable across runs.
  3. MIL records are immutable and queryable by asset_id and time.
  4. owner_ref resolves to a real on-call, and blast_radius reflects a real dependents count.

Conclusion: MIL is ‘lineage as evidence’

Lineage helps you navigate systems. MIL helps you prove what happened. If you can answer ‘what changed?’ with evidence in under five minutes, you’ve built operational trust—not just lineage.