Solution Atlas

Auditors want column-level lineage and we cannot produce it

A financial-services CDO is preparing for a regulatory review. The auditor has asked for column-level lineage of all regulatory reporting figures from source through transformation to consumption. The current catalogue is partial, lineage is informal, and the data team is spending weeks reconstructing flows.

Trigger
Regulator request; lineage cannot be produced on demand.
Good outcome
Purview Data Governance live across Fabric, Databricks, and Snowflake; column-level lineage end-to-end; data products with named owners.
Diagnostic discovery

Signals this story fits

Observable cues that confirm the conversation belongs here.

  • Regulator requesting column-level lineage
  • Multiple data platforms (Fabric, Databricks, Snowflake) with separate catalogues
  • Catalogue partial; lineage informal
  • Data products without named owners
  • Audit cycle approaching

Questions to ask

Open-ended, SPIN-style — each one has a reason it matters.

  1. Which data platforms are live today, and what catalogues do they have?

    Why: Sizes the federation surface area.

  2. Which regulatory framework drives the lineage requirement: DORA, BCBS 239, GDPR, or a sector-specific rule?

    Why: Different regulators want different lineage depth.

  3. Who owns the data products today?

    Why: Without named owners, federation is plumbing without accountability.

  4. What's your column-level lineage story today?

    Why: "Informal" or "ask the data team" is the most common answer, and it confirms the starting point.

  5. What's the audit timeline, and what artefacts does the regulator demand?

    Why: Sharpens the deliverable.

Baseline → target architecture

TOGAF-style gap framing — what we typically see today, and what the proposed end state looks like. The gap between them is the engagement.

Baseline architecture

Fragmented catalogues per platform: Fabric has its native catalogue, Databricks uses Unity Catalog or the Hive metastore, and Snowflake has Horizon. Lineage is informal. Data products have no named owners. Audit response is reactive.

Typical concerns

  • Lineage cannot be produced on demand
  • Catalogue coverage partial
  • Data products without accountability
  • Audit cycle consumes weeks of preparation
  • Cross-platform queries lack traceability

Capability gaps

  • Federated catalogue across platforms
  • Column-level lineage end-to-end
  • Data products with named owners
  • Continuous audit-readiness
  • Regulatory framework mapping

Target architecture

Microsoft Purview Data Governance federates Fabric, Databricks (Unity Catalog), and Snowflake (Horizon). Column-level lineage produced on demand for any regulator-relevant data product. Data products with named owners across all platforms. Continuous audit-readiness via Purview Insights.
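
To make "column-level lineage end-to-end" concrete, here is a minimal sketch that models lineage as a directed graph of column-to-column edges and traces a reporting figure back to its sources. The platform, table, and column names are hypothetical, and this is not how Purview stores or exposes lineage; it only illustrates the shape of the artefact an auditor is likely to ask for.

```python
from collections import defaultdict

# Each edge reads "target column is derived from source column".
# All names are hypothetical, for illustration only.
EDGES = [
    ("snowflake.raw.trades.notional", "databricks.silver.exposures.exposure_eur"),
    ("snowflake.raw.fx_rates.rate", "databricks.silver.exposures.exposure_eur"),
    ("databricks.silver.exposures.exposure_eur", "fabric.reporting.capital_report.total_exposure"),
]


def build_upstream_index(edges):
    """Map each target column to the set of columns it is derived from."""
    index = defaultdict(set)
    for source, target in edges:
        index[target].add(source)
    return index


def trace_upstream(column, index):
    """Return every column that feeds `column`, directly or indirectly."""
    seen = set()
    stack = [column]
    while stack:
        current = stack.pop()
        for parent in index.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def source_columns(column, index):
    """Upstream columns with no parents of their own, i.e. the original sources."""
    return {c for c in trace_upstream(column, index) if not index.get(c)}
```

Calling `source_columns("fabric.reporting.capital_report.total_exposure", build_upstream_index(EDGES))` returns the two Snowflake columns, which is the cross-platform trace the regulator request describes.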

Key capabilities

  • Federated catalogue across platforms
  • Column-level lineage end-to-end
  • Data products with named owners
  • Continuous audit-readiness
  • Regulatory framework mapping

Enabling SKUs

Resolved in the ‘Recommended cards’ section below.

Architecture decisions

Each decision is offered as explicit options with trade-offs — Hohpe's “selling options” principle. A safe default is noted where one exists.

  1. Decision 1. Purview tier: basic vs Data Estate Insights

    Basic

    When it fits: Catalogue + lineage as the deliverable; audit posture sufficient.

    Trade-offs: No estate-wide insights, no automation around stewardship.

    Data Estate Insights

    When it fits: Mature governance programme; need stewardship + curation metrics.

    Trade-offs: Higher consumption cost.

    Default recommendation: Basic for the first 12 months; add Insights when stewardship is operational.

  2. Decision 2. Federation depth: metadata-only vs full lineage scan

    Metadata-only

    When it fits: Quick onboarding; lineage from existing source systems sufficient.

    Trade-offs: Lineage gaps inside platforms.

    Full lineage scan

    When it fits: Regulator demands column-level lineage end-to-end.

    Trade-offs: Scan cost per asset; tuning required.

    Default recommendation: Full lineage scan on regulator-relevant data products; metadata-only elsewhere.

  3. Decision 3. Data product ownership: domain-owned vs central data team

    Domain-owned

    When it fits: Data mesh ambition; mature domain teams.

    Trade-offs: Quality uneven; harder to enforce baseline.

    Central data team

    When it fits: Early-maturity programme; data team has capacity.

    Trade-offs: Bottleneck; less domain knowledge embedded.

    Default recommendation: Domain-owned with central stewardship via the BI champions community.
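
The default for the federation-depth decision (full lineage scan on regulator-relevant products, metadata-only elsewhere) can be sketched as a simple planning rule. This is an illustration only: the `DataProduct` fields and the labels `full_lineage` and `metadata_only` are hypothetical names chosen for this sketch, not Purview settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    name: str
    platform: str
    regulator_relevant: bool  # assumption: flagged during discovery


def scan_depth(product: DataProduct) -> str:
    """Apply the default: deep scans only where a regulator needs lineage."""
    return "full_lineage" if product.regulator_relevant else "metadata_only"


def plan_scans(products):
    """Group product names by the scan depth the default rule assigns."""
    plan = {"full_lineage": [], "metadata_only": []}
    for product in products:
        plan[scan_depth(product)].append(product.name)
    return plan
```

Keeping the rule explicit makes the cost trade-off visible: every product that moves into the `full_lineage` group adds per-asset scan cost and tuning effort.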

Low-risk trial — proof of value

60-day Purview rollout — federate three platforms on five critical products

8 weeks

Purview Data Governance provisioned. Federate Fabric + one Databricks workspace + one Snowflake account. Column-level lineage on five critical data products. Named owners assigned. First audit-readiness report produced.

Success criteria

  • Catalogue coverage above 60% on the trial scope
  • Column-level lineage demonstrable end-to-end for five products
  • Named owners assigned and signed off
  • Audit-readiness report demonstrates regulator-required artefacts
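
The quantitative criteria above can be checked mechanically at the end of the trial. The sketch below assumes that catalogue coverage means catalogued assets divided by all assets in the trial scope, which the playbook does not define; the function and parameter names are hypothetical. The audit-readiness report is a qualitative criterion and is not modelled.

```python
def catalogue_coverage(catalogued_assets: int, total_assets: int) -> float:
    """Share of in-scope assets that appear in the catalogue (assumed definition)."""
    if total_assets <= 0:
        raise ValueError("total_assets must be positive")
    return catalogued_assets / total_assets


def trial_passes(
    catalogued_assets: int,
    total_assets: int,
    products_with_lineage: int,
    products_with_owner: int,
    products_in_scope: int = 5,
    coverage_threshold: float = 0.60,
) -> bool:
    """True when coverage is above the threshold and every product has lineage and an owner."""
    coverage_ok = catalogue_coverage(catalogued_assets, total_assets) > coverage_threshold
    lineage_ok = products_with_lineage >= products_in_scope
    owners_ok = products_with_owner >= products_in_scope
    return coverage_ok and lineage_ok and owners_ok
```

Note that the coverage test is strict: exactly 60% does not pass, matching the wording "above 60%".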

Investment: Purview Data Governance at ~€0.37 per capacity unit per hour for the Data Map, plus consumption-based scanning. Estimated ~€2–5k/month for the trial scope.
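
As a rough worked example of the Data Map part of that estimate, the sketch below multiplies the quoted hourly rate by an assumed 730 hours per month (8,760 hours per year divided by 12). The number of capacity units and the monthly budget are assumptions to be replaced with the customer's own figures; scanning is consumption-based and is treated here only as whatever remains of the budget.

```python
HOURS_PER_MONTH = 730          # assumption: average month, 8760 / 12
RATE_EUR_PER_CU_HOUR = 0.37    # approximate rate quoted in the playbook


def monthly_data_map_cost(
    capacity_units: int,
    rate: float = RATE_EUR_PER_CU_HOUR,
    hours: int = HOURS_PER_MONTH,
) -> float:
    """Always-on Data Map cost for one month, in euros."""
    return capacity_units * rate * hours


def scanning_budget_left(monthly_budget: float, capacity_units: int) -> float:
    """Budget remaining for consumption-based scanning after the Data Map."""
    return monthly_budget - monthly_data_map_cost(capacity_units)
```

Under these assumptions one capacity unit running all month costs about €270, so at the low end of the estimate (€2,000/month) roughly €1,730 would remain for scanning.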

Proof metrics

  • Lineage produced on demand for five critical products
  • Catalogue coverage trending up monthly
  • Time-to-produce-lineage-report under 1 day
  • Audit response time reduced by 50%+

Recommended cards

The SKUs and capabilities most likely to be part of the solution, with the editorial rationale for each in the context of this story. Add the ones that fit your situation.
