Auto-Scaling Policy Architect

Design reactive and predictive auto-scaling policies for cloud workloads, covering HPA, VPA, KEDA, AWS ASGs, and target-tracking strategies.

Auto-Scaling Policy Architect is an AI assistant for cloud and platform engineers who need to move beyond manual scaling and implement intelligent, automated resource adjustment policies. Poorly tuned auto-scaling is a common cause of both performance degradation during traffic spikes and unnecessary cloud spending during quiet periods. This assistant helps teams design policies that respond accurately to real demand signals.

The assistant covers the full spectrum of auto-scaling mechanisms: Kubernetes Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA), KEDA event-driven scaling, AWS Auto Scaling Groups with target tracking and step scaling, Azure VMSS scaling policies, and GCP Managed Instance Group autoscaling. It helps users choose the right mechanism for their workload type and traffic pattern, then generates the corresponding configuration.

When given workload characteristics (such as request latency targets, CPU and memory baselines, event queue depths, or business traffic patterns), the assistant designs scaling policies with appropriate cooldown periods, scale-in and scale-out thresholds, minimum and maximum replica counts, and stabilization windows. It explains the trade-offs between reactive (metric-based) and predictive (scheduled or ML-based) scaling approaches and recommends the right combination for each use case.
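
To make the threshold and replica-bound reasoning concrete, here is a minimal Python sketch of the replica calculation that the Kubernetes HPA documents: desired replicas are the current count scaled by the ratio of observed to target metric, rounded up, with no change when the ratio sits within a tolerance (0.1 by default in Kubernetes). The function name and parameters are illustrative assumptions, not part of any assistant output, and the sketch deliberately ignores stabilization windows and cooldowns.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the HPA replica calculation (illustrative helper).

    desired = ceil(current_replicas * current_metric / target_metric),
    skipped when the ratio is within `tolerance` of 1.0, then clamped
    to the [min_replicas, max_replicas] range.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to target: hold steady to avoid needless churn.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Never go below the floor or above the ceiling set by the policy.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods at 90% average CPU against a 60% target.
    print(desired_replicas(4, 90, 60, min_replicas=2, max_replicas=10))
```

Working through a few values like this can help when choosing a target: a lower target leaves more headroom for spikes but raises the steady-state replica count.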

Users can expect outputs including annotated HPA/VPA YAML manifests, KEDA ScaledObject definitions, AWS Auto Scaling policy JSON, scaling threshold recommendations with rationale, and guidance on combining multiple scaling dimensions safely. The assistant also helps diagnose flapping, thrashing, or sluggish scaling behavior by analyzing policy parameters.
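
As a rough illustration of what diagnosing flapping can involve, the sketch below counts how often a replica-count history reverses direction (a scale-out followed by a scale-in, or the reverse). The helper name and the idea of sampling replica counts at fixed intervals are assumptions made for this example; many reversals in a short window would suggest that thresholds are too tight or stabilization windows too short.

```python
def count_reversals(replica_history):
    """Count direction changes in a sequence of replica counts.

    Hypothetical diagnostic helper: a high count over a short window
    is a hint that the policy is flapping.
    """
    reversals = 0
    last_direction = 0  # +1 = scaled out, -1 = scaled in, 0 = none yet
    for prev, curr in zip(replica_history, replica_history[1:]):
        direction = (curr > prev) - (curr < prev)
        if direction == 0:
            continue  # No change between samples; ignore.
        if last_direction != 0 and direction != last_direction:
            reversals += 1
        last_direction = direction
    return reversals


if __name__ == "__main__":
    # Replica counts sampled once per minute (made-up values).
    print(count_reversals([3, 5, 3, 5, 3]))
```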

The assistant is well suited to teams launching new services, migrating from static provisioning to elastic infrastructure, or tuning existing scaling policies that are causing SLA violations or budget overruns. It brings structured scaling expertise to any cloud-native workload.
