Tags: Infrastructure, GCP, Supabase, Auto-scaling, Docker

Auto-Scaling Supabase Studio on GCP with Managed Instance Groups

How we deploy Supabase Studio on GCP with auto-scaling, auto-healing, and zero-downtime updates using Managed Instance Groups and Container-Optimized OS.

January 9, 2026
10 min read
Pulore Team

In previous posts, we covered the database, networking, and security layers of our self-hosted Supabase setup. Now let's talk about the application layer: how we deploy and scale Supabase Studio.

Our goals:

  • Auto-scaling — Handle traffic spikes without manual intervention
  • Auto-healing — Replace unhealthy instances automatically
  • Zero-downtime updates — Deploy new versions without interruption
  • Cost efficiency — Scale down when traffic is low

What is Supabase Studio?

Supabase Studio is the web-based admin interface for managing your Supabase/PostgreSQL database. It includes:

  • SQL Editor — Run queries directly
  • Table Editor — Visual database management
  • API Documentation — Auto-generated from your schema
  • Auth Management — User administration

It's open source and runs as a Docker container, making it perfect for self-hosting.

The architecture

┌─────────────────────────────────────────────────────┐
│                 Global Load Balancer                │
│                 (SSL Termination)                   │
└───────────────────────┬─────────────────────────────┘
                        │
┌───────────────────────▼─────────────────────────────┐
│              Regional Instance Group                 │
│    ┌──────────────┐    ┌──────────────┐            │
│    │   Zone A     │    │   Zone B     │            │
│    │  ┌────────┐  │    │  ┌────────┐  │            │
│    │  │   VM   │  │    │  │   VM   │  │            │
│    │  │ Studio │  │    │  │ Studio │  │            │
│    │  └────────┘  │    │  └────────┘  │            │
│    └──────────────┘    └──────────────┘            │
└─────────────────────────────────────────────────────┘
                        │
                        ▼
┌─────────────────────────────────────────────────────┐
│              Cloud SQL (Private IP)                  │
└─────────────────────────────────────────────────────┘

Instance Template

The foundation of auto-scaling is the instance template. It defines what each VM looks like:

this.instanceTemplate = new gcp.compute.InstanceTemplate(
  `${resourceName}-template`,
  {
    namePrefix: `${resourceName}-template-`,
    machineType: machineType,
    region: region,
    tags: ["http-server", "allow-ssh"],
 
    disks: [
      {
        sourceImage: "projects/cos-cloud/global/images/family/cos-stable",
        autoDelete: true,
        boot: true,
        diskSizeGb: 50,
        diskType: "hyperdisk-balanced",
      },
    ],
 
    networkInterfaces: [
      {
        network: args.networking.vpc.id,
        subnetwork: args.networking.subnet.id,
        // No external IP - instances are private
      },
    ],
 
    serviceAccount: {
      email: this.serviceAccount.email,
      scopes: ["cloud-platform"],
    },
 
    metadataStartupScript: startupScript,
 
    metadata: {
      "google-logging-enabled": "true",
      "google-monitoring-enabled": "true",
    },
 
    shieldedInstanceConfig: {
      enableSecureBoot: false, // COS needs this off for Docker
      enableVtpm: true,
      enableIntegrityMonitoring: true,
    },
  },
  {
    parent: this,
    dependsOn: [args.database.instance, args.database.database, args.database.user],
  },
);

Key decisions

Container-Optimized OS

We use cos-stable (Container-Optimized OS). It's:

  • Minimal and secure
  • Optimized for running Docker
  • Auto-updates the OS
  • Maintained by Google

No external IP

Instances don't have public IPs. They access the internet through Cloud NAT (for pulling Docker images) and are accessed through the load balancer.
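
For reference, the Cloud NAT piece (covered in the networking post) boils down to roughly the following — a minimal standalone sketch with placeholder names, not our actual networking component:

import * as gcp from "@pulumi/gcp";

// Standalone sketch: in our stack the VPC and region come from the networking component.
const region = "us-central1";
const vpc = new gcp.compute.Network("studio-vpc", { autoCreateSubnetworks: false });

// Cloud NAT needs a Cloud Router on the same VPC and region.
const router = new gcp.compute.Router("studio-router", {
  region: region,
  network: vpc.id,
});

// NAT gives the private instances outbound access (Docker image pulls, OS updates)
// without assigning them external IPs.
new gcp.compute.RouterNat("studio-nat", {
  region: region,
  router: router.name,
  natIpAllocateOption: "AUTO_ONLY",
  sourceSubnetworkIpRangesToNat: "ALL_SUBNETWORKS_ALL_IP_RANGES",
});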

Hyperdisk Balanced

For the boot disk, we use hyperdisk-balanced. It offers better IOPS than standard persistent disk, which helps with container startup times.

Dependencies

The template depends on the database being ready. No point starting Studio if it can't connect to PostgreSQL.

The startup script

This is where the magic happens. When a VM boots, it runs this script to configure and start Supabase Studio:

#!/bin/bash
set -euo pipefail
 
exec > >(tee /var/log/supabase-startup.log) 2>&1
echo "Starting Supabase Studio setup at $(date)"
 
# Wait for Docker
for i in {1..30}; do
    if docker info > /dev/null 2>&1; then
        break
    fi
    echo "Waiting for Docker... ($i/30)"
    sleep 2
done
 
# Environment variables (replaced at deploy time)
DB_HOST="__DB_HOST__"
DB_NAME="__DB_NAME__"
DB_PASSWORD="__DB_PASSWORD__"
AUTH_USER="__AUTH_USER__"
AUTH_PASSWORD="__AUTH_PASSWORD__"
ANON_KEY="__ANON_KEY__"
SERVICE_ROLE_KEY="__SERVICE_ROLE_KEY__"
JWT_SECRET="__JWT_SECRET__"
 
# Create network for containers
docker network create supabase-network 2>/dev/null || true
 
# Start postgres-meta (API for database introspection)
docker run -d \
    --name postgres-meta \
    --restart always \
    --network supabase-network \
    -p 127.0.0.1:8080:8080 \
    -e "PG_META_HOST=0.0.0.0" \
    -e "PG_META_PORT=8080" \
    -e "PG_META_DB_HOST=${DB_HOST}" \
    -e "PG_META_DB_PORT=5432" \
    -e "PG_META_DB_NAME=${DB_NAME}" \
    -e "PG_META_DB_USER=postgres" \
    -e "PG_META_DB_PASSWORD=${DB_PASSWORD}" \
    supabase/postgres-meta:latest
 
# Wait for postgres-meta
for i in {1..30}; do
    if curl -s http://localhost:8080/health > /dev/null; then
        break
    fi
    echo "Waiting for postgres-meta... ($i/30)"
    sleep 2
done
 
# Start Supabase Studio
docker run -d \
    --name studio \
    --restart always \
    --network supabase-network \
    -p 3000:3000 \
    -e "STUDIO_PG_META_URL=http://postgres-meta:8080" \
    -e "SUPABASE_URL=http://localhost:8000" \
    -e "SUPABASE_PUBLIC_URL=http://localhost:8000" \
    -e "SUPABASE_ANON_KEY=${ANON_KEY}" \
    -e "SUPABASE_SERVICE_KEY=${SERVICE_ROLE_KEY}" \
    -e "AUTH_JWT_SECRET=${JWT_SECRET}" \
    -e "DEFAULT_ORGANIZATION_NAME=Acme Corp" \
    -e "DEFAULT_PROJECT_NAME=Production" \
    -e "NEXT_PUBLIC_SITE_URL=http://localhost:3000" \
    -e "NEXT_ANALYTICS_BACKEND_PROVIDER=postgres" \
    supabase/studio:latest
 
echo "Supabase Studio setup complete at $(date)"

Secret injection

Notice the __DB_HOST__, __DB_PASSWORD__, etc. placeholders. These are replaced at deployment time:

const startupScript = pulumi.all([args.secrets, args.database.privateIp]).apply(([s, dbHost]) => {
  return startupScriptTemplate
    .replace(/__DB_HOST__/g, dbHost)
    .replace(/__DB_NAME__/g, s.infra.postgresDb)
    .replace(/__DB_PASSWORD__/g, s.infra.postgresPassword)
    .replace(/__AUTH_USER__/g, s.studio.authUser)
    .replace(/__AUTH_PASSWORD__/g, s.studio.authPassword)
    .replace(/__ANON_KEY__/g, s.studio.anonKey)
    .replace(/__SERVICE_ROLE_KEY__/g, s.studio.serviceRoleKey)
    .replace(/__JWT_SECRET__/g, s.studio.jwtSecret);
});

This keeps secrets out of source control while still delivering them to each instance at boot. Note that the rendered startup script does end up in the instance template's metadata, so access to the Compute Engine console and metadata APIs should be locked down accordingly.

Two containers

We run two containers:

  1. postgres-meta — Provides the API that Studio uses for database introspection
  2. studio — The actual Supabase Studio web UI

They communicate over a Docker network.

Health checks

Health checks are critical for auto-healing. If an instance fails the health check, it gets replaced.

this.healthCheck = new gcp.compute.HealthCheck(
  `${resourceName}-health-check`,
  {
    name: `${resourceName}-health-check`,
    checkIntervalSec: 10,
    timeoutSec: 5,
    healthyThreshold: 2,
    unhealthyThreshold: 3,
    httpHealthCheck: {
      port: 3000,
      requestPath: "/api/platform/profile",
    },
  },
  { parent: this },
);

Configuration explained

Setting            | Value | Meaning
checkIntervalSec   | 10    | Check every 10 seconds
timeoutSec         | 5     | Request must respond within 5 seconds
healthyThreshold   | 2     | 2 consecutive successes = healthy
unhealthyThreshold | 3     | 3 consecutive failures = unhealthy

The health endpoint

We hit /api/platform/profile on port 3000. This endpoint:

  • Returns 200 if Studio is running and connected to the database
  • Returns an error if something's wrong

Choosing the right health endpoint matters. A simple / might return 200 even if the database connection is broken. We want to verify the full stack is working.
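
As a quick sanity check, a small script like this — hypothetical, run from a host that can reach the instances over the private network, not part of the deployment — exercises the same endpoint and timeout the health check uses:

// Hypothetical smoke test against the Studio health endpoint.
// STUDIO_URL is a placeholder; instances are private, so run this over the VPN.
const base = process.env.STUDIO_URL ?? "http://10.0.0.10:3000";

async function checkHealth(): Promise<void> {
  const res = await fetch(`${base}/api/platform/profile`, {
    signal: AbortSignal.timeout(5000), // mirror the health check's 5-second timeout
  });
  if (!res.ok) {
    throw new Error(`Unhealthy: HTTP ${res.status}`);
  }
  console.log("Studio responded and can reach the database");
}

checkHealth().catch((err) => {
  console.error(err);
  process.exit(1);
});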

Managed Instance Group

The instance group manages the VMs:

this.instanceGroupManager = new gcp.compute.RegionInstanceGroupManager(
  `${resourceName}-mig`,
  {
    name: `${resourceName}-mig`,
    region: region,
    baseInstanceName: resourceName,
    targetSize: minInstances,
    distributionPolicyZones: [`${region}-a`, `${region}-b`],
 
    versions: [{ instanceTemplate: this.instanceTemplate.selfLinkUnique }],
 
    namedPorts: [{ name: "http", port: 3000 }],
 
    autoHealingPolicies: {
      healthCheck: this.healthCheck.id,
      initialDelaySec: 300,
    },
 
    updatePolicy: {
      type: "PROACTIVE",
      minimalAction: "REPLACE",
      maxSurgeFixed: 2,
      maxUnavailableFixed: 0,
      replacementMethod: "SUBSTITUTE",
    },
  },
  { parent: this },
);

Regional distribution

We spread instances across two zones (region-a and region-b). If one zone has issues, the other keeps serving traffic.

Auto-healing

The autoHealingPolicies configuration:

  • Uses our health check to monitor instances
  • initialDelaySec: 300 — Give new instances 5 minutes to finish booting before auto-healing is allowed to replace them
  • Unhealthy instances are automatically terminated and replaced

Update policy

The updatePolicy controls how new versions roll out:

Setting             | Value      | Effect
type                | PROACTIVE  | Apply updates immediately
minimalAction       | REPLACE    | Create new instances (don't just restart)
maxSurgeFixed       | 2          | Create up to 2 extra instances during the update
maxUnavailableFixed | 0          | Never drop below the target instance count
replacementMethod   | SUBSTITUTE | Delete old, create new (vs. recreate in place)

This gives us zero-downtime deployments:

  1. New instances are created with the new template
  2. Once healthy, traffic shifts to them
  3. Old instances are terminated

Autoscaler

The autoscaler adjusts instance count based on load:

this.autoscaler = new gcp.compute.RegionAutoscaler(
  `${resourceName}-autoscaler`,
  {
    name: `${resourceName}-autoscaler`,
    region: region,
    target: this.instanceGroupManager.id,
    autoscalingPolicy: {
      minReplicas: minInstances,
      maxReplicas: maxInstances,
      cooldownPeriod: 60,
      cpuUtilization: { target: 0.7 },
    },
  },
  { parent: this },
);

Scaling policy

  • minReplicas — Never go below this (1 for dev, 2 for prod)
  • maxReplicas — Never exceed this (keeps costs bounded)
  • cooldownPeriod — Wait 60 seconds between scaling decisions
  • cpuUtilization.target — Scale up when average CPU exceeds 70%
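
If scale-in ever feels too aggressive, the autoscaling policy also accepts a scaleInControl block that caps how fast instances are removed. We don't use it today; this sketch (with a placeholder MIG reference) just shows the shape:

import * as gcp from "@pulumi/gcp";

// Placeholder self-link; in our stack this is instanceGroupManager.id.
const migSelfLink =
  "projects/my-project/regions/us-central1/instanceGroupManagers/studio-mig";

new gcp.compute.RegionAutoscaler("studio-autoscaler-sketch", {
  region: "us-central1",
  target: migSelfLink,
  autoscalingPolicy: {
    minReplicas: 2,
    maxReplicas: 6,
    cooldownPeriod: 60,
    cpuUtilization: { target: 0.7 },
    // Remove at most one instance per 10-minute window when scaling in.
    scaleInControl: {
      maxScaledInReplicas: { fixed: 1 },
      timeWindowSec: 600,
    },
  },
});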

Environment differences

Setting     | Dev | Prod
minReplicas | 1   | 2
maxReplicas | 1   | 2

For dev, we fix at 1 instance (cost savings). For prod, we ensure at least 2 for high availability.

Backend service

The backend service connects the load balancer to the instance group:

this.backendService = new gcp.compute.BackendService(
  `${resourceName}-backend`,
  {
    name: `${resourceName}-backend`,
    protocol: "HTTP",
    portName: "http",
    timeoutSec: 30,
    healthChecks: this.healthCheck.id,
    securityPolicy: securityPolicy?.selfLink,
    backends: [
      {
        group: this.instanceGroupManager.instanceGroup,
        balancingMode: "UTILIZATION",
        capacityScaler: 1.0,
      },
    ],
    logConfig: { enable: true, sampleRate: 1.0 },
  },
  { parent: this },
);

Connection draining

When an instance is being removed, GCP connection draining gives in-flight requests time to finish before the instance is deregistered. Note that timeoutSec: 30 is the per-request backend timeout (how long the load balancer waits for a response), not the draining window — that is configured separately via connectionDrainingTimeoutSec.

Logging

logConfig: { enable: true, sampleRate: 1.0 } logs every request. In production, you might reduce sampleRate to save on logging costs.
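
Both knobs — the draining grace period and the log sampling rate — can be set explicitly on the backend service. A sketch with illustrative values and placeholder resource references:

import * as gcp from "@pulumi/gcp";

// Placeholder references; in our stack these come from the health check and MIG above.
const healthCheckId = "projects/my-project/global/healthChecks/studio-health-check";
const instanceGroup =
  "projects/my-project/regions/us-central1/instanceGroups/studio-mig";

new gcp.compute.BackendService("studio-backend-sketch", {
  protocol: "HTTP",
  portName: "http",
  timeoutSec: 30,                   // per-request backend response timeout
  connectionDrainingTimeoutSec: 60, // grace period for in-flight requests on deregistering instances
  healthChecks: healthCheckId,
  backends: [
    {
      group: instanceGroup,
      balancingMode: "UTILIZATION",
      capacityScaler: 1.0,
    },
  ],
  // Log 10% of requests instead of all of them to cut logging costs.
  logConfig: { enable: true, sampleRate: 0.1 },
});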

Load balancer

We covered the load balancer in detail in the infrastructure post. Key points:

  • Global IP — Single static IP for DNS
  • SSL termination — Google-managed certificate
  • HTTP to HTTPS redirect — All traffic forced to HTTPS

this.loadBalancer = new LoadBalancer(
  `${resourceName}-lb`,
  {
    name: resourceName,
    domain: args.domain,
    backendService: this.backendService,
  },
  { parent: this },
);

Putting it all together

Here's the complete service component:

const supabaseStudio = new SupabaseStudioService("supabase-studio", {
  name: "supabase-studio",
  projectId: config.projectId,
  region: config.region,
  domain: `db.${config.environment}.example.com`,
  networking: networking,
  database: database,
  secrets: secrets,
  machineType: "c4d-standard-2",
  minInstances: config.environment === "prod" ? 2 : 1,
  maxInstances: config.environment === "prod" ? 2 : 1,
  vpnPublicIp: vpn?.publicIp,
});

Machine type

We use c4d-standard-2:

  • 2 vCPUs
  • 8 GB RAM
  • General-purpose C4D series (AMD-based)

This is plenty for Supabase Studio. The containers are lightweight.

Deployment workflow

When we run pulumi up:

  1. Template changes? If the startup script or machine config changed, a new template is created
  2. Rolling update — New instances are created with the new template
  3. Health check passes — Traffic shifts to new instances
  4. Old instances terminated — Previous version instances are deleted

The whole process takes 5-10 minutes and requires no manual intervention.

Monitoring

Built-in metrics

With google-monitoring-enabled: true, we get:

  • CPU utilization
  • Memory usage
  • Disk I/O
  • Network traffic

Custom health checks

For deeper monitoring, we could add:

  • Database connection latency
  • Container startup time
  • API response times

Logging

Container logs go to Cloud Logging via google-logging-enabled: true. We can:

  • Search and filter logs
  • Set up alerts on error patterns
  • Export to BigQuery for analysis
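
Alerting on error patterns usually starts from a log-based metric. A sketch — the filter is illustrative and depends on how the COS logging agent labels container output:

import * as gcp from "@pulumi/gcp";

// Hypothetical log-based metric counting error-level entries from the Studio VMs.
new gcp.logging.Metric("studio-container-errors", {
  name: "studio-container-errors",
  filter: [
    'resource.type="gce_instance"',
    "severity>=ERROR",
    'logName:"supabase"', // illustrative; match whatever log name the agent actually emits
  ].join(" AND "),
  metricDescriptor: {
    metricKind: "DELTA",
    valueType: "INT64",
  },
});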

Common issues and fixes

1. Startup too slow

If instances take too long to start, they might fail health checks during boot. Solutions:

  • Increase initialDelaySec in auto-healing policy
  • Optimize Docker image pulls (use regional mirrors)
  • Pre-pull images in a base image

2. Memory pressure

If instances run out of memory:

  • Increase machine type
  • Add swap (not ideal but works)
  • Optimize container memory limits

3. Database connection pool exhaustion

Each Studio instance opens connections to the database. With many instances:

  • Monitor max_connections in PostgreSQL
  • Consider connection pooling (PgBouncer)
  • Limit concurrent Studio instances
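
A quick way to keep an eye on connection usage is a small check against pg_stat_activity — a sketch using the node-postgres client and assuming DATABASE_URL points at the Cloud SQL private IP:

import { Client } from "pg";

// Sketch: report current connection usage against the server's max_connections.
async function checkConnections(): Promise<void> {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  try {
    const { rows } = await client.query(
      `SELECT (SELECT count(*)::int FROM pg_stat_activity) AS current,
              current_setting('max_connections')::int AS max`,
    );
    const current: number = rows[0].current;
    const max: number = rows[0].max;
    console.log(`Connections in use: ${current}/${max}`);
    if (current > max * 0.8) {
      console.warn("Over 80% of max_connections — consider PgBouncer");
    }
  } finally {
    await client.end();
  }
}

checkConnections().catch(console.error);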

4. Slow health check response

If the health endpoint is slow:

  • Check database connection latency
  • Increase health check timeout
  • Optimize the health endpoint

Cost optimization

Right-size instances

Start small. Monitor CPU and memory. Scale up only if needed.

Use committed use discounts

For predictable workloads, committed use discounts save 20-50%.

Preemptible/Spot instances

For dev environments, preemptible instances cost 60-80% less. They can be terminated with 30 seconds notice, but for non-production, that's usually fine.
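
In Pulumi, that's just a scheduling block on the instance template. A dev-only sketch, with a placeholder network (we'd reuse the real VPC and subnet):

import * as gcp from "@pulumi/gcp";

// Dev-only sketch: same shape as our template, but provisioned as Spot capacity.
new gcp.compute.InstanceTemplate("studio-dev-spot-template", {
  machineType: "c4d-standard-2",
  disks: [
    {
      sourceImage: "projects/cos-cloud/global/images/family/cos-stable",
      autoDelete: true,
      boot: true,
      diskSizeGb: 50,
      diskType: "hyperdisk-balanced",
    },
  ],
  networkInterfaces: [{ network: "default" }], // placeholder; use the real VPC/subnet
  scheduling: {
    provisioningModel: "SPOT",
    preemptible: true,        // Spot VMs are preemptible
    automaticRestart: false,  // preemptible VMs cannot auto-restart
    instanceTerminationAction: "STOP",
  },
});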

What's next

In the final post of this series, we'll preview our plans for multi-region read replicas — bringing the database closer to users around the world.


Need help deploying auto-scaling applications on GCP? Get in touch — we've built infrastructure for everything from MVPs to enterprise workloads.
