Provision and manage cloud data infrastructure on AWS, GCP, or Azure using Terraform or Pulumi — including data lakes, warehouses, compute clusters, and IAM for data platforms.
Modern data platforms run on cloud infrastructure, and a data engineering team that manages that infrastructure ad hoc works very differently from one whose cloud resources are properly provisioned, version-controlled, and cost-optimized. Infrastructure as Code for data platforms requires both cloud expertise and a deep understanding of data engineering workloads: the resource patterns of Spark clusters differ from those of web servers, and the IAM requirements of a data lake differ from those of a web application.
The Cloud Data Platform Infrastructure Engineer helps you design and implement cloud infrastructure specifically for data engineering workloads. It covers Terraform and Pulumi for IaC, AWS data services (S3, Glue, EMR, Redshift, Kinesis, Lake Formation), GCP data services (BigQuery, Cloud Storage, Dataproc, Pub/Sub, Composer), and Azure data services (ADLS Gen2, Synapse, HDInsight, Event Hubs, Data Factory). It designs infrastructure with data engineers in mind: object storage bucket policies for lakehouse access patterns, VPC configuration for Spark cluster egress, and least-privilege IAM roles for pipeline service accounts.
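As a rough illustration of what least-privilege scoping for a pipeline service account can look like on AWS, the sketch below builds a read-only S3 policy document limited to a single data lake prefix. The function name, bucket name, and prefix are hypothetical, chosen only for this example; in practice the resulting JSON would be attached to a role through Terraform or Pulumi rather than printed.

```python
import json


def pipeline_read_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy document granting read-only access to one prefix.

    `bucket` and `prefix` are illustrative inputs, not names from any real
    platform. The policy deliberately avoids wildcard actions.
    """
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, so it is narrowed
                # with a condition on the requested key prefix.
                "Sid": "ListOnlyThisPrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Object reads are narrowed through the resource ARN itself.
                "Sid": "ReadObjectsUnderPrefix",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and prefix, for demonstration only.
    policy = pipeline_read_policy("example-data-lake", "curated/sales")
    print(json.dumps(policy, indent=2))
```

A pipeline that also writes output would need a separate statement (for example `s3:PutObject` on its own output prefix), which keeps read and write scopes independently auditable.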
This role also covers cost optimization for data infrastructure — a critical concern given the scale of data workloads. It advises on spot/preemptible instance strategies for Spark clusters, storage tiering policies for cold data, warehouse compute auto-suspension configuration, and resource tagging for cost allocation.
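To make the storage tiering idea concrete, here is a minimal sketch that builds one S3 lifecycle rule moving cold data to cheaper storage classes over time. The function name, default day thresholds, and prefix are assumptions for illustration; the dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts inside `LifecycleConfiguration["Rules"]`, but nothing here calls AWS.

```python
def tiering_rule(prefix: str, ia_after_days: int = 30, archive_after_days: int = 90) -> dict:
    """Build one S3 lifecycle rule that tiers objects under `prefix`.

    The thresholds are illustrative defaults, not recommendations. S3 expects
    objects to stay in Standard for at least 30 days before a lifecycle
    transition to STANDARD_IA, so smaller values are rejected here.
    """
    if not 30 <= ia_after_days < archive_after_days:
        raise ValueError("expected 30 <= ia_after_days < archive_after_days")
    clean = prefix.strip("/")
    return {
        "ID": f"tier-{clean.replace('/', '-')}",
        "Filter": {"Prefix": f"{clean}/"},
        "Status": "Enabled",
        "Transitions": [
            # Infrequently accessed data moves to a cheaper class first ...
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            # ... and later to archival storage.
            {"Days": archive_after_days, "StorageClass": "GLACIER"},
        ],
    }


if __name__ == "__main__":
    # Hypothetical prefix holding raw event data.
    print(tiering_rule("raw/events"))
```

The same thresholds could be exposed as Terraform or Pulumi variables so that each dataset declares its own retention profile.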
You can bring plans for a new data platform on cloud infrastructure and receive a complete Terraform module structure, resource definitions, variable schemas, and a deployment runbook. You can also bring existing infrastructure with cost or reliability problems and receive an audit with prioritized remediation steps.
Ideal for data engineers who also manage infrastructure, platform engineers building internal data infrastructure, and teams adopting IaC practices for the first time on cloud data services.