Design and implement enterprise data lineage programs. Map end-to-end data flows, support regulatory impact analysis, document transformation logic, and select lineage tooling for complex data estates.
The Data Lineage and Provenance Specialist is an AI assistant for data governance teams, data architects, and compliance professionals who need to understand, document, and manage where data comes from, how it moves, and how it is transformed across a complex enterprise data estate. Data lineage is no longer optional — regulators, auditors, and AI governance frameworks increasingly require organizations to demonstrate exactly how their data flows and where their analytical outputs originate. This assistant makes lineage tractable at enterprise scale.
This assistant helps users design lineage programs that capture the right level of detail for their use cases. It explains the distinction between technical lineage (the column-level data flow through systems and transformation code) and business lineage (the conceptual flow of data entities across business processes), and helps users design a lineage approach that serves both regulatory and operational needs. It generates lineage scope definition frameworks, metadata capture strategy designs, and lineage documentation standards that teams can apply consistently across a complex data estate.
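To make the technical versus business distinction concrete, here is a minimal Python sketch of one way the two levels might relate: column-level edges are recorded as technical lineage, then rolled up into entity-level flows. The table names, the `ENTITY_OF` mapping, and the `ColumnEdge` type are all hypothetical and chosen only for illustration; a real program would source them from a metadata catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnEdge:
    """One technical-lineage hop between two columns ("system.table.column")."""
    source: str
    target: str
    transform: str  # free-text note on the transformation logic


# Hypothetical mapping from physical tables to business entities.
ENTITY_OF = {
    "crm.customers": "Customer",
    "dwh.dim_customer": "Customer",
    "core.loans": "Loan",
    "dwh.fct_exposure": "Credit Exposure",
}

# Hypothetical column-level (technical) lineage.
TECHNICAL = [
    ColumnEdge("crm.customers.id", "dwh.dim_customer.customer_id", "rename"),
    ColumnEdge("core.loans.balance", "dwh.fct_exposure.exposure_amt", "sum by customer"),
    ColumnEdge("dwh.dim_customer.customer_id", "dwh.fct_exposure.customer_id", "join key"),
]


def table_of(column: str) -> str:
    """Strip the column name, leaving "system.table"."""
    return column.rsplit(".", 1)[0]


def business_lineage(edges):
    """Roll column-level edges up into entity-to-entity flows."""
    flows = set()
    for edge in edges:
        src = ENTITY_OF[table_of(edge.source)]
        dst = ENTITY_OF[table_of(edge.target)]
        if src != dst:  # hops within one entity are not business-level flows
            flows.add((src, dst))
    return flows
```

Calling `business_lineage(TECHNICAL)` collapses the three column hops into two entity-level flows, Customer to Credit Exposure and Loan to Credit Exposure, which is roughly the view a business stakeholder would expect to see.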
Several regulatory regimes carry lineage-relevant requirements, including BCBS 239, GDPR Article 30, CCPA, and Solvency II. For these use cases, the assistant helps users understand what lineage documentation is required, how to structure lineage evidence for regulatory examination, and how to perform data impact assessments that trace which systems and reports are affected by a change in a source data field.
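A data impact assessment of this kind can be sketched as a downstream traversal of a lineage graph. The Python example below is a minimal illustration under stated assumptions: the field names in `EDGES` are invented, and the `report.` prefix convention for identifying reports is hypothetical rather than a feature of any particular lineage tool.

```python
from collections import defaultdict, deque

# Hypothetical column-level lineage edges: (upstream field, downstream field).
EDGES = [
    ("core.loans.balance", "staging.loans.balance"),
    ("staging.loans.balance", "dwh.fct_exposure.exposure_amt"),
    ("dwh.fct_exposure.exposure_amt", "report.risk_summary.total_exposure"),
    ("dwh.fct_exposure.exposure_amt", "report.capital_adequacy.rwa_input"),
    ("crm.customers.email", "marketing.contacts.email"),
]


def downstream(edges, changed_field):
    """Return every field reachable from changed_field (breadth-first)."""
    children = defaultdict(list)
    for source, target in edges:
        children[source].append(target)

    seen = set()
    queue = deque([changed_field])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen and child != changed_field:
                seen.add(child)
                queue.append(child)
    return seen


def affected_reports(edges, changed_field):
    """Filter the downstream set to fields that live in a report (assumed prefix)."""
    return sorted(
        field for field in downstream(edges, changed_field)
        if field.startswith("report.")
    )
```

Here `affected_reports(EDGES, "core.loans.balance")` returns the two report fields fed by the loan balance, while a change to `crm.customers.email` touches no report at all. In practice the edge list would come from the lineage tool's metadata store rather than a hard-coded list.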
The assistant is familiar with the major lineage tooling landscape including OpenLineage, Apache Atlas, Collibra, Alation, Microsoft Purview, and dbt's built-in lineage capabilities, and helps users select and configure tools appropriate to their pipeline architecture — whether built on Spark, dbt, Airflow, or proprietary ETL platforms. It generates lineage tool evaluation criteria, integration architecture designs, and lineage data model documentation standards.
Ideal users include data governance architects designing enterprise lineage programs, data engineering teams instrumenting lineage in modern data stacks, compliance teams preparing for regulatory examination, and analytics engineering teams building trustworthy, documented data transformation layers.