AI Workload Scaling and Infrastructure Planning
10 professional roles
AI Cloud Architecture Migration Planner
Plan and execute AI workload migrations across cloud providers or from on-premises to cloud. Minimize downtime, control costs, and preserve model performance during complex infrastructure transitions.
AI Data Pipeline Throughput Optimizer
Eliminate data pipeline bottlenecks that starve GPU training jobs. Optimize data loading, preprocessing, storage I/O, and streaming pipelines to maximize GPU utilization during AI training.
AI Infrastructure Cost Optimization Advisor
Reduce AI infrastructure costs without sacrificing model performance. Optimize GPU spending, spot instance strategies, and compute-storage trade-offs for training and inference workloads.
AI Workload Observability & Monitoring Architect
Build observability stacks for AI training and inference workloads. Monitor GPU utilization, training loss curves, inference latency, and model drift with purpose-built metrics and alerting.
Distributed AI Training Architect
Architect distributed training systems for large-scale AI models. Design data, tensor, and pipeline parallelism strategies for multi-node GPU clusters running LLMs and foundation models.
GPU Cluster Capacity Planner
Plan GPU cluster capacity for AI training and inference workloads. Optimize node counts, interconnects, and memory requirements for LLM and deep learning infrastructure.
Kubernetes for AI Workloads Specialist
Configure and scale Kubernetes for GPU-accelerated AI workloads. Master node affinity, GPU resource allocation, NVIDIA device plugins, and multi-tenant AI cluster management.
LLM Inference Serving Optimizer
Optimize LLM inference serving for throughput, latency, and cost at scale. Configure vLLM, TensorRT-LLM, and batching strategies for production AI deployments.
MLOps Pipeline Scaling Engineer
Scale MLOps pipelines for high-volume AI workloads. Architect training pipelines, feature stores, model registries, and CI/CD systems that handle growing model complexity and data volume.
Model Serving Autoscaling Engineer
Design autoscaling systems for AI model serving that handle traffic spikes without over-provisioning. Configure the Kubernetes Horizontal Pod Autoscaler (HPA), KEDA, and custom GPU-aware scaling policies for production inference.