Infrastructure · PostgreSQL · GCP · Multi-region · Prisma · Scaling

Multi-Region PostgreSQL on GCP: Our Plan for Global Scale

A preview of how we're planning to add read replicas across regions for lower latency worldwide. Includes our architecture plans and Prisma integration strategy.

January 10, 2026
8 min read
Pulore Team

This is the final post in our series on self-hosting Supabase on GCP. Throughout this series, we've covered:

  1. Why we self-host
  2. Infrastructure as Code with Pulumi
  3. Cloud SQL Enterprise Plus
  4. Security with VPN & Cloud Armor
  5. Auto-scaling Supabase Studio

Now we're looking ahead. Our client's application is growing internationally, and users in the US and Australia are experiencing higher latency than those in Europe (where our primary database lives).

This post outlines our plan for adding read replicas across regions. We haven't implemented this yet, but we're sharing our research and architecture decisions.

The problem: Latency across continents

Our primary database is in europe-west2 (London). Here's the typical latency for a database query:

| User Location | Round-trip Latency |
|---------------|--------------------|
| London        | ~5ms               |
| New York      | ~80ms               |
| Sydney        | ~280ms              |

For a typical page load with 5-10 database queries, that adds up fast: even two or three queries that run sequentially mean 400ms+ of database latency alone for Australian users. Not great.
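
Before committing to replicas, it's worth confirming where the time actually goes. Here's a minimal sketch for measuring database round-trip latency, assuming a PrismaClient pointed at the primary; run it from compute in each region you care about:

// Minimal sketch: median round-trip time for a trivial query.
// SELECT 1 isolates network latency from actual query cost.
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

async function measureRoundTrip(samples = 10): Promise<number> {
  const timings: number[] = [];
  for (let i = 0; i < samples; i++) {
    const start = performance.now();
    await prisma.$queryRaw`SELECT 1`;
    timings.push(performance.now() - start);
  }
  timings.sort((a, b) => a - b);
  return timings[Math.floor(samples / 2)]; // median, in milliseconds
}

measureRoundTrip().then((ms) => console.log(`Median DB round trip: ${ms.toFixed(1)}ms`));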

The solution: Read replicas

Cloud SQL supports cross-region read replicas. The architecture:

                    ┌─────────────────────────────┐
                    │         Primary DB          │
                    │       (europe-west2)        │
                    │       Writes + Reads        │
                    └──────────────┬──────────────┘
                                   │
                 ┌─────────────────┼───────────────────┐
                 │                 │                   │
                 ▼                 ▼                   ▼
          ┌──────────────┐ ┌──────────────┐ ┌──────────────────────┐
          │   Replica    │ │   Replica    │ │       Replica        │
          │  (us-east1)  │ │  (us-west1)  │ │(australia-southeast1)│
          │    Reads     │ │    Reads     │ │        Reads         │
          └──────────────┘ └──────────────┘ └──────────────────────┘

Key points:

  • Writes go to primary — All mutations hit the primary in London
  • Reads can go to replicas — Queries can use the nearest replica
  • Automatic replication — Cloud SQL replicates to each replica asynchronously (typically under 1 second of lag)

Our planned regions

Based on our user distribution:

| Region               | Location       | Purpose                   |
|----------------------|----------------|---------------------------|
| europe-west2         | London         | Primary (writes + reads)  |
| us-east1             | South Carolina | Replica (US East Coast)   |
| australia-southeast1 | Sydney         | Replica (APAC)            |

We might add us-west1 (Oregon) later if West Coast latency becomes an issue.

Infrastructure changes

Read replica resource

In Pulumi, read replicas are separate DatabaseInstance resources that reference the primary:

// Pseudocode - not yet implemented
const usEastReplica = new gcp.sql.DatabaseInstance(`${resourceName}-replica-us-east`, {
  name: `${resourceName}-replica-us-east`,
  masterInstanceName: primaryInstance.name,
  region: "us-east1",
  databaseVersion: "POSTGRES_18",
 
  replicaConfiguration: {
    failoverTarget: false, // Not for automatic failover
  },
 
  settings: {
    tier: "db-perf-optimized-N-4", // Can be smaller than primary
    edition: "ENTERPRISE_PLUS",
    availabilityType: "ZONAL", // Replicas don't need regional HA
 
    ipConfiguration: {
      ipv4Enabled: false,
      privateNetwork: usEastVpc.id,
    },
 
    dataCacheConfig: {
      dataCacheEnabled: true, // Data Cache helps replicas too
    },
  },
});

Cross-region networking

Each replica needs private connectivity. Options:

  1. VPC Peering — Connect separate per-region VPCs
  2. Single VPC — GCP VPC networks are global, so one network can hold a subnet in every region (Shared VPC if it also needs to span projects)
  3. Private Service Connect — GCP-managed private connectivity

We're leaning toward the single-VPC approach for simplicity. One VPC with subnets in each region keeps the networking straightforward.
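
As a rough Pulumi sketch of that approach (resource names and CIDR ranges are placeholders, not our final plan; the private services access peering Cloud SQL needs is omitted):

// Hypothetical sketch: one global VPC with an explicit subnet per region
import * as gcp from "@pulumi/gcp";

const network = new gcp.compute.Network("global-vpc", {
  autoCreateSubnetworks: false, // define subnets per region ourselves
});

const regions = ["europe-west2", "us-east1", "australia-southeast1"];

const subnets = regions.map(
  (region, i) =>
    new gcp.compute.Subnetwork(`subnet-${region}`, {
      network: network.id,
      region,
      ipCidrRange: `10.${i}.0.0/20`, // placeholder ranges
      privateIpGoogleAccess: true,
    })
);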

Replica sizing

Replicas can be smaller than the primary:

  • Primary: db-perf-optimized-N-8 (8 vCPU, 64 GB)
  • Replicas: db-perf-optimized-N-4 (4 vCPU, 32 GB)

Read replicas handle less load (no writes), and we can always scale up if needed.

Application layer: Routing reads to replicas

This is where it gets interesting. The application needs to:

  1. Route writes to the primary
  2. Route reads to the nearest replica
  3. Handle replication lag gracefully

Option 1: Prisma's read replica support

Prisma (our ORM of choice) supports read replicas through its @prisma/extension-read-replicas client extension:

// Read replica support via the @prisma/extension-read-replicas client extension
import { PrismaClient } from "@prisma/client";
import { readReplicas } from "@prisma/extension-read-replicas";

const prisma = new PrismaClient().$extends(
  readReplicas({
    url: [
      process.env.DATABASE_URL_US_EAST!,   // us-east1 replica
      process.env.DATABASE_URL_AUSTRALIA!, // australia-southeast1 replica
    ],
  })
);

// Usage in application code
const users = await prisma.user.findMany(); // read — routed to a replica

const newUser = await prisma.user.create({
  // write — always goes to the primary (DATABASE_URL)
  data: { name: "Alice" },
});

// Reads that must see the latest writes can be forced to the primary
const fresh = await prisma.$primary().user.findMany();

The extension handles:

  • Automatic routing — reads go to a replica, writes and interactive transactions go to the primary
  • A separate client (and connection pool) per replica URL
  • An explicit $primary() escape hatch for reads that must be fresh

Option 2: Manual routing with connection selection

For more control, we could manually select connections:

import { PrismaClient } from "@prisma/client";

const primaryDb = new PrismaClient({ datasources: { db: { url: PRIMARY_URL } } });
const replicaDb = new PrismaClient({ datasources: { db: { url: REPLICA_URL } } });

// Writes always go through the primary client
await primaryDb.user.create({ data: { name: "Alice" } });

// Reads go through the replica client
const users = await replicaDb.user.findMany();

More verbose but gives explicit control.

Option 3: Smart routing based on user location

For APIs, we could route based on the incoming request's region:

function getDatabaseClient(request: Request) {
  const region = getRequestRegion(request); // From CF-IPCountry header, etc.
 
  switch (region) {
    case "US":
      return usEastPrisma;
    case "AU":
      return australiaPrisma;
    default:
      return primaryPrisma;
  }
}

This ensures users hit the closest replica.
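
The getRequestRegion helper above is hypothetical; here's a minimal sketch, assuming a Cloudflare-style CF-IPCountry header is present on incoming requests:

// Hypothetical helper: map the request's country to the nearest replica group
function getRequestRegion(request: Request): "US" | "AU" | "EU" {
  const country = request.headers.get("CF-IPCountry") ?? "";

  if (["US", "CA", "MX", "BR"].includes(country)) return "US";
  if (["AU", "NZ", "SG", "JP"].includes(country)) return "AU";
  return "EU"; // fall back to the primary region
}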

Handling replication lag

Read replicas have some delay (typically under 1 second, but can be more under load). This matters for "read-after-write" scenarios:

// User creates a post
await prisma.post.create({ data: { title: "Hello" } });
 
// Immediately fetch their posts
const posts = await prisma.post.findMany({ where: { authorId: userId } });
// ⚠️ New post might not be visible yet if reading from replica

Solutions

1. Read-your-writes consistency

After a write, temporarily route that user's reads to the primary:

async function createPost(userId: string, data: PostData) {
  await primaryDb.post.create({ data });
 
  // Set a short TTL flag
  await cache.set(`read-primary:${userId}`, true, { ttl: 5 }); // 5 seconds
}
 
async function getPosts(userId: string) {
  const shouldReadPrimary = await cache.get(`read-primary:${userId}`);
  const db = shouldReadPrimary ? primaryDb : replicaDb;
 
  return db.post.findMany({ where: { authorId: userId } });
}

2. Explicit primary reads for sensitive operations

Some queries should always hit primary:

// Dashboard showing user's just-created data
const myPosts = await primaryDb.post.findMany({
  where: { authorId: currentUser.id },
});
 
// Public listing can use replica (slightly stale is fine)
const publicPosts = await replicaDb.post.findMany({
  where: { published: true },
  take: 20,
});

3. Accept eventual consistency

For many use cases, a 1-second delay is fine:

  • Product listings
  • Blog posts
  • Search results
  • Analytics dashboards

Document which operations tolerate staleness and which don't.
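
One lightweight way to document it is to make each read site declare its tolerance, so the routing choice is visible in code review. A sketch, assuming the primaryDb and replicaDb clients from Option 2 (the invoice model is illustrative):

// Sketch: every read declares whether stale data is acceptable
type Consistency = "fresh" | "stale-ok";

function dbFor(consistency: Consistency) {
  return consistency === "fresh" ? primaryDb : replicaDb;
}

// Public listing: a second of staleness is fine
const publicPosts = await dbFor("stale-ok").post.findMany({ where: { published: true } });

// Billing page: must reflect the latest writes
const invoices = await dbFor("fresh").invoice.findMany({ where: { userId: currentUser.id } });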

Monitoring replication lag

Cloud SQL exposes replication lag metrics. We'd set up alerts for:

  • Lag exceeds 5 seconds — Warning, investigate
  • Lag exceeds 30 seconds — Critical, potential replication issues
  • Replica unavailable — Failover reads to primary

// Pseudocode for health check
async function checkReplicaHealth(replicaUrl: string) {
  const lag = await getReplicationLag(replicaUrl);
 
  if (lag > 30) {
    logger.error(`Replica lag critical: ${lag}s`);
    metrics.increment("replica.lag.critical");
    return false;
  }
 
  return true;
}
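
The getReplicationLag helper is the piece we still have to build. One option is to ask the replica itself, since Postgres exposes the last replay timestamp; another is Cloud SQL's replication lag metric in Cloud Monitoring. A sketch of the first approach (in practice we'd reuse a long-lived client per replica rather than constructing one per check):

import { PrismaClient } from "@prisma/client";

// Sketch: estimate lag by asking the replica when it last replayed a transaction
async function getReplicationLag(replicaUrl: string): Promise<number> {
  const replicaDb = new PrismaClient({ datasources: { db: { url: replicaUrl } } });
  try {
    const rows = await replicaDb.$queryRaw<Array<{ lag_seconds: number | null }>>`
      SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))::float8 AS lag_seconds
    `;
    // pg_last_xact_replay_timestamp() is NULL until the replica has replayed something
    return rows[0]?.lag_seconds ?? 0;
  } finally {
    await replicaDb.$disconnect();
  }
}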

Cost considerations

Adding replicas increases costs:

| Component                | Monthly Cost (est.) |
|--------------------------|---------------------|
| Primary (N-8)            | $800-1000           |
| US East Replica (N-4)    | $400-500            |
| Australia Replica (N-4)  | $400-500            |
| Cross-region egress      | $50-100             |

Total: roughly 2x the database cost. But if latency is causing user churn, it's worth it.

Optimizing costs

  • Right-size replicas — Start with N-2, scale up as needed
  • Consider cascading replicas — US West replicates from US East, not the primary (sketched below)
  • Pause unused replicas — If Australia traffic is low at night, stop the replica
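
In Pulumi, a cascading replica looks like any other replica except that masterInstanceName points at an existing replica instead of the primary. A hedged sketch, reusing the usEastReplica resource from earlier (we haven't verified this configuration yet):

// Hypothetical sketch: us-west1 cascades from the us-east1 replica,
// so cross-Atlantic replication traffic is paid for only once
const usWestReplica = new gcp.sql.DatabaseInstance(`${resourceName}-replica-us-west`, {
  name: `${resourceName}-replica-us-west`,
  masterInstanceName: usEastReplica.name, // cascade from the nearby replica
  region: "us-west1",
  databaseVersion: "POSTGRES_18",
  settings: {
    tier: "db-perf-optimized-N-2", // start small, scale up if needed
    edition: "ENTERPRISE_PLUS",
    availabilityType: "ZONAL",
  },
});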

Deployment considerations

DNS and service discovery

Each replica needs to be discoverable:

# Environment variables per region
DATABASE_URL_PRIMARY=postgresql://primary.internal:5432/db
DATABASE_URL_US_EAST=postgresql://replica-us-east.internal:5432/db
DATABASE_URL_AUSTRALIA=postgresql://replica-aus.internal:5432/db

Connection pooling

Each replica should have its own connection pool. With Prisma, this happens automatically. With other tools, configure PgBouncer per replica.
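
Pool sizes can also be set on the connection strings themselves; Prisma reads connection_limit and pool_timeout from URL query parameters, so something like this caps each replica's pool (the numbers are illustrative):

# Illustrative per-connection-string pool sizing
DATABASE_URL_PRIMARY=postgresql://primary.internal:5432/db?connection_limit=20
DATABASE_URL_US_EAST=postgresql://replica-us-east.internal:5432/db?connection_limit=10&pool_timeout=10
DATABASE_URL_AUSTRALIA=postgresql://replica-aus.internal:5432/db?connection_limit=10&pool_timeout=10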

Failover strategy

If a replica fails:

  1. Route reads to primary temporarily
  2. Alert on-call
  3. Investigate and restore replica

Replicas are for performance, not availability. The primary handles all traffic if replicas are down.
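
Step 1 could be as simple as a wrapper that checks replica health before handing out a read client and falls back to the primary otherwise. A sketch, assuming the primaryDb and replicaDb clients plus the checkReplicaHealth and metrics helpers from earlier:

// Sketch: degrade reads to the primary while a replica is unhealthy
async function readClient(replicaUrl: string) {
  const healthy = await checkReplicaHealth(replicaUrl);
  if (!healthy) {
    metrics.increment("replica.fallback_to_primary");
    return primaryDb;
  }
  return replicaDb;
}

// Usage: reads transparently fall back during replica incidents
const db = await readClient(process.env.DATABASE_URL_US_EAST!);
const posts = await db.post.findMany({ where: { published: true } });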

What we're still figuring out

Edge deployments

With replicas in multiple regions, should we also deploy the application in those regions? Options:

  • Single-region app — App in Europe, queries cross-region to replicas (still adds network hop)
  • Multi-region app — App instances near each replica (lower latency, more complexity)
  • Edge functions — Cloudflare Workers/Vercel Edge connecting to nearest replica

We're researching the tradeoffs.

Prisma Data Proxy

Prisma's managed connection pooler (originally the Data Proxy, since superseded by Prisma Accelerate) handles pooling and could simplify multi-region routing. We're evaluating whether it's worth the additional dependency.

Write latency for remote users

Even with read replicas, writes still go to London. An Australian user creating content faces ~280ms latency. Options:

  • Accept it — Writes are less frequent than reads
  • Async writes — Queue locally, process async
  • Multi-primary — Complex, usually not worth it for our scale

Timeline

We're planning to implement this in phases:

  1. Research complete (now) — Architecture decisions made
  2. Proof of concept — Single replica in us-east1
  3. Production rollout — Add replicas, update application
  4. Monitoring — Observe latency improvements, tune as needed

We'll update this post with implementation details once we've built it.

Key takeaways

If you're considering multi-region PostgreSQL:

  1. Measure first — Is latency actually your bottleneck?
  2. Start with one replica — Add complexity incrementally
  3. Plan for consistency — Read-after-write needs special handling
  4. Use built-in tooling — Prisma, PgBouncer, etc. handle the hard parts
  5. Monitor replication lag — It's the key metric for replica health

Multi-region databases aren't simple, but for global applications, they're often necessary. The key is understanding the tradeoffs and implementing incrementally.


Building a global application and need help with multi-region architecture? Get in touch — we love solving these kinds of infrastructure challenges.
