Overview
The kit uses two components for database management:
- CloudNativePG - PostgreSQL operator for running databases in Kubernetes
- Atlas - Schema migration tool that runs as a Kubernetes Job with Argo CD sync wave ordering
Once PlanetScale releases its updated Terraform provider (Jan 2026), I plan to replace CloudNativePG with it as the recommended approach for hosting application databases.
Architecture
Set sync-wave: -1 (or lower) on resources that must run first.
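
To illustrate the ordering, here is a minimal sketch of a migration Job carrying an Argo CD sync-wave annotation. The manifest path, the Job name, the image, and the `myapp-cluster-app` secret with a `uri` key are all assumptions made for the example, not values taken from the kit.

```shell
# Hypothetical manifest location; adjust to your service layout.
MANIFEST_DIR="${MANIFEST_DIR:-services/myapp/k8s}"
mkdir -p "$MANIFEST_DIR"

cat > "$MANIFEST_DIR/migration-job.yaml" <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: myapp-migrate            # hypothetical name
  annotations:
    # Argo CD applies lower waves first, so -1 runs before the default wave 0.
    argocd.argoproj.io/sync-wave: "-1"
spec:
  backoffLimit: 3
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: atlas
          # Hypothetical image: Atlas plus the migrations copied to /migrations.
          image: registry.example.com/myapp-migrations:latest
          args: ["migrate", "apply", "--dir", "file:///migrations", "--url", "$(DATABASE_URL)"]
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: myapp-cluster-app   # assumed CloudNativePG app secret
                  key: uri
EOF
```

The application's own resources can stay at the default wave (0), so they sync only after the Job has completed.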
CloudNativePG Clusters
View Existing Clusters
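
One way to list clusters is to query the CloudNativePG `Cluster` custom resource with kubectl. The helper name `list_pg_clusters` is hypothetical; the sketch assumes kubectl is pointed at the right cluster.

```shell
# List CloudNativePG Cluster resources.
# usage: list_pg_clusters [namespace]   (no argument = all namespaces)
list_pg_clusters() {
  if [ -n "${1:-}" ]; then
    kubectl get clusters.postgresql.cnpg.io --namespace "$1"
  else
    kubectl get clusters.postgresql.cnpg.io --all-namespaces
  fi
}
```

Call `list_pg_clusters` for every namespace, or `list_pg_clusters myapp` for a single one.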
Create a New Cluster
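
As a starting point, a minimal `Cluster` manifest might look like the sketch below. The file path, cluster name, instance count, storage size, and database/owner names are assumptions for illustration; the name `myapp-cluster` is chosen to match the service names used later in this document.

```shell
# Hypothetical manifest location; adjust to your service layout.
MANIFEST_DIR="${MANIFEST_DIR:-services/myapp/k8s}"
mkdir -p "$MANIFEST_DIR"

cat > "$MANIFEST_DIR/cluster.yaml" <<'EOF'
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: myapp-cluster
spec:
  instances: 3          # one primary plus two replicas
  storage:
    size: 10Gi          # illustrative size
  bootstrap:
    initdb:
      database: app     # database created at bootstrap
      owner: app        # role that owns it
EOF
```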
Add a Cluster resource to your service’s Kubernetes manifests.
Access the Database
Connection Strings
CloudNativePG creates services for different access patterns:

| Service | Purpose |
|---|---|
| myapp-cluster-rw | Read-write (primary only) |
| myapp-cluster-ro | Read-only (replicas) |
| myapp-cluster-r | Any instance |
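
The service names above can be turned into connection strings using standard in-cluster DNS (`<service>.<namespace>.svc`). This is a sketch: the `pg_url` helper, the credentials, the `myapp` namespace, and the `app` database are hypothetical, and port 5432 is assumed.

```shell
# Build a PostgreSQL URL for one of the CloudNativePG services.
# usage: pg_url <user> <password> <service> <namespace> <database>
pg_url() {
  printf 'postgresql://%s:%s@%s.%s.svc:5432/%s\n' "$1" "$2" "$3" "$4" "$5"
}

pg_url app s3cret myapp-cluster-rw myapp app   # writes go to the primary
pg_url app s3cret myapp-cluster-ro myapp app   # reads can go to replicas
```

Passwords containing characters such as `@` or `/` would need percent-encoding before being placed in the URL.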
Atlas Migrations
Migration File Structure
Migrations live in services/{service}/migrations/.
Create a New Migration
1. Write the migration SQL
   Create a new file with a timestamp prefix, then write your migration.
2. Test locally
   Run the migration against your local database.
3. Commit and deploy
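
The three steps above can be sketched as follows. The service name `myapp`, the table, and the helper names `test_locally` and `commit_migration` are hypothetical. The sketch assumes the Atlas CLI and git are installed; `atlas migrate hash` is included because Atlas checks the directory against its `atlas.sum` file.

```shell
# Hypothetical service; the layout follows services/{service}/migrations/.
MIGRATIONS_DIR="${MIGRATIONS_DIR:-services/myapp/migrations}"
mkdir -p "$MIGRATIONS_DIR"

# Step 1: create a timestamp-prefixed file and write the SQL.
VERSION="$(date -u +%Y%m%d%H%M%S)"
MIGRATION_FILE="$MIGRATIONS_DIR/${VERSION}_create_users.sql"
cat > "$MIGRATION_FILE" <<'SQL'
-- Idempotent DDL: re-running is harmless if the table already exists.
CREATE TABLE IF NOT EXISTS users (
  id         bigserial PRIMARY KEY,
  email      text NOT NULL UNIQUE,
  created_at timestamptz NOT NULL DEFAULT now()
);
SQL

# Step 2: refresh atlas.sum, then apply against a local database.
# usage: test_locally <database-url>
test_locally() {
  atlas migrate hash --dir "file://$MIGRATIONS_DIR"
  atlas migrate apply --dir "file://$MIGRATIONS_DIR" --url "$1"
}

# Step 3: commit the migration together with the updated atlas.sum.
commit_migration() {
  git add "$MIGRATIONS_DIR"
  git commit -m "Add create_users migration"
}
```

For example, `test_locally "postgres://postgres:postgres@localhost:5432/app?sslmode=disable"` applies pending migrations to a local instance (the URL is a placeholder).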
Migration Job
Migrations run as a Kubernetes Job before the application starts. Argo CD sync waves ensure proper ordering.
Atlas Configuration
Configure Atlas in atlas.hcl.
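
A minimal `atlas.hcl` might look like the sketch below. The environment name `local`, the `DATABASE_URL` variable, and the file location are assumptions; the kit's real configuration may define other environments.

```shell
# Hypothetical service directory.
SERVICE_DIR="${SERVICE_DIR:-services/myapp}"
mkdir -p "$SERVICE_DIR"

cat > "$SERVICE_DIR/atlas.hcl" <<'EOF'
# "local" is a hypothetical environment name.
env "local" {
  # Target database, read from the environment at run time.
  url = getenv("DATABASE_URL")

  migration {
    # Directory of versioned SQL files, relative to where atlas runs.
    dir = "file://migrations"
  }
}
EOF
```

With this file in place, `atlas migrate apply --env local` picks up the URL and migration directory from the named environment instead of command-line flags.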
Common Operations
View Migration Status
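
Migration status can be checked with `atlas migrate status`, which compares the migration directory with the revisions recorded in the target database. The wrapper name `migration_status` is hypothetical.

```shell
# Show applied and pending migrations.
# usage: migration_status <migrations-dir> <database-url>
migration_status() {
  atlas migrate status --dir "file://$1" --url "$2"
}
```

For example: `migration_status services/myapp/migrations "$DATABASE_URL"`.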
Rollback a Migration
Atlas doesn’t support automatic rollbacks. To roll back:
- Create a new “down” migration that reverses the changes
- Deploy the rollback migration
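
A "down" migration is an ordinary forward migration with a newer timestamp. The sketch below reverses a hypothetical earlier migration that added a `nickname` column; the table, column, and file names are illustrative only.

```shell
# Hypothetical service; the layout follows services/{service}/migrations/.
MIGRATIONS_DIR="${MIGRATIONS_DIR:-services/myapp/migrations}"
mkdir -p "$MIGRATIONS_DIR"

# The newer timestamp makes this file sort after the migration it reverses.
VERSION="$(date -u +%Y%m%d%H%M%S)"
ROLLBACK_FILE="$MIGRATIONS_DIR/${VERSION}_revert_add_users_nickname.sql"
cat > "$ROLLBACK_FILE" <<'SQL'
-- Reverses a hypothetical earlier migration that ran:
--   ALTER TABLE users ADD COLUMN nickname text;
ALTER TABLE users DROP COLUMN IF EXISTS nickname;
SQL
```

After adding the file, run `atlas migrate hash` on the directory so `atlas.sum` matches, then commit and deploy as with any other migration. Dropping a column discards its data, so a backup beforehand is prudent.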
Backup and Restore
CloudNativePG supports continuous backup to S3.
Scale Replicas
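
The number of instances is controlled by `spec.instances` on the `Cluster` resource. As a sketch, a merge patch can change it directly; the helper name `scale_pg_cluster` is hypothetical.

```shell
# Set spec.instances (one primary plus N-1 replicas).
# usage: scale_pg_cluster <namespace> <cluster> <instances>
scale_pg_cluster() {
  kubectl patch clusters.postgresql.cnpg.io "$2" --namespace "$1" \
    --type merge --patch "{\"spec\":{\"instances\":$3}}"
}
```

If Argo CD manages the Cluster with self-heal enabled, a manual patch is likely to be reverted on the next sync; changing `instances` in the committed manifest is the durable route.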
Failover
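
Manual promotion can be done with the `cnpg` kubectl plugin. This is a sketch assuming the plugin is installed; the helper name `promote_replica` and the instance name in the usage example are hypothetical.

```shell
# Show cluster state, then promote the given instance to primary.
# usage: promote_replica <namespace> <cluster> <instance-pod>
promote_replica() {
  kubectl cnpg status "$2" --namespace "$1"
  kubectl cnpg promote "$2" "$3" --namespace "$1"
}
```

For example: `promote_replica myapp myapp-cluster myapp-cluster-2`.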
CloudNativePG automatically handles failover. A replica can also be promoted manually.
Troubleshooting
Migration job fails
- Check job logs.
- Common issues:
  - Database not ready (cluster still initializing)
  - Invalid SQL syntax
  - Missing permissions
- Retry the migration.
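
The log check and retry above can be sketched as two small helpers. The function names are hypothetical, and deleting the Job so that the next Argo CD sync recreates it is an assumption about how retries work in this setup.

```shell
# Show recent logs from the migration Job's pod.
# usage: migration_logs <namespace> <job-name>
migration_logs() {
  kubectl logs --namespace "$1" "job/$2" --tail=100
}

# Delete the failed Job so the next sync can recreate it.
# usage: retry_migration <namespace> <job-name>
retry_migration() {
  kubectl delete job --namespace "$1" "$2"
}
```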
Database connection refused
- Check cluster status.
- Verify the service exists.
- Check pod readiness.
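
The three checks above can be sketched as one helper. The name `diagnose_connection` is hypothetical; the sketch assumes the read-write service is named `<cluster>-rw` and that pods carry the `cnpg.io/cluster` label.

```shell
# Check the Cluster resource, its read-write service, and its pods.
# usage: diagnose_connection <namespace> <cluster>
diagnose_connection() {
  kubectl get clusters.postgresql.cnpg.io --namespace "$1" "$2"
  kubectl get service --namespace "$1" "$2-rw"
  kubectl get pods --namespace "$1" --selector "cnpg.io/cluster=$2"
}
```

For example: `diagnose_connection myapp myapp-cluster`. A pod whose READY column shows 0/1 is not yet accepting connections.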
Cluster stuck in “Setting up primary”
- Check operator logs.
- Verify storage class exists.
- Check PVC status.
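
These checks can be sketched as a helper. It assumes the operator runs as `deployment/cnpg-controller-manager` in the `cnpg-system` namespace, which is the default for a manifest install and may differ for Helm installs; the function name is hypothetical.

```shell
# Inspect operator logs, available storage classes, and the cluster's PVCs.
# usage: diagnose_stuck_cluster <namespace> <cluster>
diagnose_stuck_cluster() {
  kubectl logs --namespace cnpg-system deployment/cnpg-controller-manager --tail=100
  kubectl get storageclass
  kubectl get pvc --namespace "$1" --selector "cnpg.io/cluster=$2"
}
```

A PVC that stays in Pending usually points to a missing or misnamed storage class.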
High latency queries
- Check connection pooling (consider PgBouncer)
- Review slow query logs
- Add indexes via a new migration
Best Practices
- Always test migrations locally before deploying
- Make migrations idempotent when possible (IF NOT EXISTS, IF EXISTS)
- Avoid breaking changes - add columns as nullable, then backfill
- Use transactions for multi-statement migrations
- Back up before major changes in production
- Monitor replication lag with CloudNativePG metrics