Data Pipeline Implementation Engineer

Build and integrate data pipelines using ETL/ELT tools, Apache Airflow, dbt, Spark, and cloud data warehouse connectors for analytics and operations.

Data Pipeline Implementation Engineer is an AI assistant for data engineers, analytics engineers, and platform teams who design, build, and maintain the pipelines that move and transform data across an organization's systems. Without reliable pipelines, data warehouses go stale, dashboards show wrong numbers, and machine learning models train on bad data. This assistant helps you build the infrastructure that keeps data flowing correctly.

The assistant covers the full data engineering stack: ingestion tools like Fivetran, Airbyte, and Stitch; orchestration platforms like Apache Airflow, Prefect, and Dagster; transformation frameworks like dbt (data build tool); processing engines like Apache Spark and Flink; and destination systems including Snowflake, BigQuery, Databricks, Redshift, and Azure Synapse. It helps you design both batch and streaming pipeline architectures appropriate for your data volumes and latency requirements.

For new pipeline implementation, the assistant helps you design source-to-destination data flows, select the right ingestion strategy (full load vs. incremental, CDC-based vs. API polling), write dbt models and tests, configure Airflow DAGs, and set up data quality checks and alerting. It advises on schema design, partitioning strategies, and data modeling patterns including Kimball dimensional modeling and the Data Vault approach.
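As a rough illustration of the full load vs. incremental choice mentioned above, here is a minimal Python sketch of an incremental load driven by a high-watermark timestamp. The function name `incremental_batch` and the `updated_at` field are assumptions made for this example, not part of any specific tool; a real pipeline would typically push this filter down into the source query.

```python
from datetime import datetime, timezone


def incremental_batch(rows, last_watermark):
    """Return the rows changed since last_watermark, plus the new watermark.

    `rows` is a list of dicts with a hypothetical `updated_at` datetime field.
    A full load would return every row instead of applying this filter.
    """
    fresh = [r for r in rows if r["updated_at"] > last_watermark]
    # If nothing changed, keep the old watermark so the next run starts
    # from the same point.
    new_watermark = max((r["updated_at"] for r in fresh), default=last_watermark)
    return fresh, new_watermark


if __name__ == "__main__":
    source = [
        {"id": 1, "updated_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
        {"id": 2, "updated_at": datetime(2024, 1, 3, tzinfo=timezone.utc)},
    ]
    watermark = datetime(2024, 1, 2, tzinfo=timezone.utc)
    batch, watermark = incremental_batch(source, watermark)
    print([r["id"] for r in batch], watermark.isoformat())
```

The strict `>` comparison assumes timestamps are unique enough that no row shares the watermark value; sources with coarse timestamps often need `>=` plus deduplication on a key.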

For troubleshooting, the assistant helps diagnose pipeline failures, data freshness issues, duplicate records, schema drift, and performance degradation. It helps you write data reconciliation queries, set up row count and null rate monitoring, and build alerting logic for pipeline health.
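The row count and null rate monitoring described above can be sketched in a few lines of Python. This is an illustrative example only: the function names, the `email` column, and the 5% threshold are assumptions chosen for the sketch, and in practice these checks would usually run as SQL against the warehouse or as dbt tests.

```python
def null_rate(rows, column):
    """Fraction of rows where `column` is missing or None (0.0 for no rows)."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if r.get(column) is None)
    return missing / len(rows)


def check_health(source_count, target_rows, column, max_null_rate=0.05):
    """Return a list of alert messages; an empty list means the checks passed."""
    alerts = []
    # Reconciliation: the target should hold as many rows as the source sent.
    if source_count != len(target_rows):
        alerts.append(
            f"row count mismatch: source={source_count} target={len(target_rows)}"
        )
    # Null rate: flag the column if too many values are missing.
    rate = null_rate(target_rows, column)
    if rate > max_null_rate:
        alerts.append(
            f"null rate for {column} is {rate:.2%}, above {max_null_rate:.2%}"
        )
    return alerts


if __name__ == "__main__":
    loaded = [{"email": "a@example.com"}, {"email": None}]
    for message in check_health(3, loaded, "email"):
        print(message)
```

Returning messages rather than raising lets the caller decide whether a failed check should block the pipeline or only notify someone.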

This assistant is suited to data engineering teams building a modern data stack, analytics teams taking ownership of their own transformation layer, and organizations migrating from legacy ETL tools to cloud-native pipelines. It is intended to speed up implementation, reduce pipeline failures, and help teams adopt software engineering best practices (version control, testing, documentation) in their data work.
