Database Failover and Switchover Engineer

Plan and execute database failover and switchover procedures for MySQL, PostgreSQL, Oracle Data Guard, and SQL Server Always On with minimal downtime and data loss.

A database failover is one of the highest-stakes operations in any production environment. Whether responding to an unplanned primary failure or executing a planned switchover for maintenance, the difference between a smooth promotion and a data-loss incident often comes down to preparation, procedure clarity, and a thorough understanding of the replication state at the moment of the switch. The Database Failover and Switchover Engineer is an AI assistant built to help teams prepare, execute, and recover from these critical events safely.

This assistant helps DBAs, SREs, and platform engineers design and document failover and switchover procedures for the major database engines and high-availability frameworks. It covers MySQL with MHA (Master High Availability Manager), Orchestrator, and ProxySQL; PostgreSQL with Patroni, repmgr, and pg_auto_failover; Oracle with Data Guard DGMGRL switchover and failover commands; and SQL Server with Always On Availability Group failover via T-SQL and PowerShell. It also addresses managed cloud HA: RDS Multi-AZ, Aurora failover, Cloud SQL HA, and Azure SQL Failover Groups.

For each platform, the assistant generates step-by-step runbooks for both planned switchover (graceful promotion designed for zero data loss) and unplanned failover (emergency promotion with an assessment of data-loss risk). The pre-failover checklist covers verifying replication synchronization state, identifying the most current replica, checking for open long-running transactions, and assessing connection pool drain requirements. Post-failover steps include re-pointing replicas, verifying the VIP or DNS update, fencing the old primary to prevent split-brain, and monitoring the new primary under load.
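The pre-failover checks described above can be sketched in engine-neutral Python. This is a minimal illustration, not a real tool's API: `ReplicaStatus`, `pick_promotion_candidate`, and `preflight` are hypothetical names, and the integer `applied_position` stands in for whatever replication coordinate the engine exposes (a PostgreSQL LSN, a MySQL GTID set, and so on), which in practice must be read from the engine itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ReplicaStatus:
    """Hypothetical snapshot of one replica, gathered by your own tooling."""
    name: str
    applied_position: int  # assumed: a monotonically increasing replay position
    is_healthy: bool


def pick_promotion_candidate(replicas: List[ReplicaStatus]) -> Optional[ReplicaStatus]:
    """Return the healthy replica that has applied the most changes, if any."""
    healthy = [r for r in replicas if r.is_healthy]
    if not healthy:
        return None
    return max(healthy, key=lambda r: r.applied_position)


def preflight(
    primary_position: int,
    replicas: List[ReplicaStatus],
    long_running_txns: int,
    planned: bool,
) -> Tuple[Optional[ReplicaStatus], List[str]]:
    """Collect findings before promotion; an empty list means no finding was raised."""
    findings: List[str] = []
    candidate = pick_promotion_candidate(replicas)
    if candidate is None:
        findings.append("no healthy replica available for promotion")
        return None, findings

    lag = primary_position - candidate.applied_position
    if lag > 0:
        if planned:
            # A planned switchover can wait for the replica to catch up.
            findings.append(f"{candidate.name} is {lag} positions behind; wait for catch-up")
        else:
            # An emergency failover cannot wait, so record the exposure instead.
            findings.append(f"promoting {candidate.name} risks losing up to {lag} positions")

    if planned and long_running_txns > 0:
        findings.append(f"{long_running_txns} long-running transactions still open on the primary")

    return candidate, findings


if __name__ == "__main__":
    replicas = [
        ReplicaStatus("replica-a", 1000, True),
        ReplicaStatus("replica-b", 1200, True),
        ReplicaStatus("replica-c", 1500, False),  # furthest ahead but unhealthy, so skipped
    ]
    candidate, findings = preflight(1300, replicas, long_running_txns=2, planned=True)
    print(candidate.name if candidate else None, findings)
```

The sketch treats a planned switchover and an unplanned failover differently on purpose: the first can block until lag reaches zero, while the second can only report how much data is at risk. Connection pool draining and fencing are left out because they depend on the proxy and infrastructure in use.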

Ideal users include DBAs who need formal runbook documentation, SREs building automated failover pipelines, infrastructure engineers preparing disaster recovery drills, and teams that have never tested failover and need to understand what the procedure actually entails before a crisis forces the issue.
