Cloud SQL PostgreSQL 18 Enterprise Plus: A Complete Setup Guide
When we decided to self-host Supabase on GCP, the database choice was straightforward: Cloud SQL with PostgreSQL 18 on the Enterprise Plus tier.
This post covers everything we learned setting it up — the configuration decisions, the gotchas, and the features that make Enterprise Plus worth considering.
Why Enterprise Plus?
GCP offers two Cloud SQL editions:
| Feature | Enterprise | Enterprise Plus |
|---|---|---|
| Availability SLA | 99.95% | 99.99% |
| Max instance size | 96 vCPUs / 624 GB RAM | 128 vCPUs / 864 GB RAM |
| Data Cache | ❌ | ✅ |
| Point-in-time recovery | ✅ | ✅ |
| Read replicas | ✅ | ✅ |
| Private networking | ✅ | ✅ |
The killer feature for us was Data Cache.
What is Data Cache?
Data Cache is an in-memory caching layer that sits in front of your database storage. It automatically caches frequently accessed data, reducing disk I/O and improving query latency.
Think of it as a built-in Redis layer, but:
- You don't manage it
- It's automatically populated based on access patterns
- It's included in the Enterprise Plus pricing
- It works transparently with your existing queries
For read-heavy workloads (which most web applications are), this is a significant performance boost without any code changes.
Our database configuration
Here's how we configure Cloud SQL in Pulumi:
```typescript
this.instance = new gcp.sql.DatabaseInstance(
`${resourceName}-db`,
{
name: `${resourceName}-db`,
databaseVersion: "POSTGRES_18",
region: region,
deletionProtection: deletionProtection,
settings: {
tier: tier,
edition: "ENTERPRISE_PLUS",
availabilityType: availabilityType,
// Storage
diskSize: diskSize,
diskType: "PD_SSD",
diskAutoresize: true,
diskAutoresizeLimit: 1000,
// Data Cache
dataCacheConfig: dataCacheEnabled ? { dataCacheEnabled: true } : undefined,
// Private IP
ipConfiguration: {
ipv4Enabled: false,
privateNetwork: args.network.id,
enablePrivatePathForGoogleCloudServices: true,
},
// Backups
backupConfiguration: {
enabled: true,
startTime: "03:00",
pointInTimeRecoveryEnabled: true,
transactionLogRetentionDays: 7,
backupRetentionSettings: {
retainedBackups: backupRetainedDays,
retentionUnit: "COUNT",
},
},
// Maintenance
maintenanceWindow: {
day: 7, // Sunday
hour: 4, // 4 AM UTC
updateTrack: "stable",
},
// Database flags
databaseFlags: [
{ name: "max_connections", value: maxConnections.toString() },
{ name: "cloudsql.iam_authentication", value: "on" },
],
// Query monitoring
insightsConfig: {
queryInsightsEnabled: true,
queryPlansPerMinute: 5,
queryStringLength: 1024,
recordApplicationTags: true,
recordClientAddress: true,
},
},
},
{
parent: this,
dependsOn: [args.privateVpcConnection],
},
);
```
Let's break down each section.
Instance tiers and sizing
Cloud SQL uses a tiered naming convention for machine types. For Enterprise Plus, the performance-optimized tiers are what you want:
```
db-perf-optimized-N-{vCPUs}
```
Common configurations:
| Tier | vCPUs | Memory | Use Case |
|---|---|---|---|
| db-perf-optimized-N-2 | 2 | 16 GB | Small dev environments |
| db-perf-optimized-N-4 | 4 | 32 GB | Dev/staging, light production |
| db-perf-optimized-N-8 | 8 | 64 GB | Production workloads |
| db-perf-optimized-N-16 | 16 | 128 GB | High-traffic applications |
| db-perf-optimized-N-32 | 32 | 256 GB | Enterprise workloads |
We use db-perf-optimized-N-4 for development (32 GB RAM is plenty for testing) and db-perf-optimized-N-8 for production.
The memory matters
With Enterprise Plus and Data Cache enabled, more memory means more cache. If your working dataset fits in memory, you'll see dramatically lower query latencies.
Storage configuration
```typescript
diskSize: diskSize,
diskType: "PD_SSD",
diskAutoresize: true,
diskAutoresizeLimit: 1000,
```
SSD storage
Always use PD_SSD. The IOPS and latency improvements over standard persistent disk are significant for database workloads. The cost difference is minor compared to the performance gain.
Auto-resize
Enable diskAutoresize to prevent running out of space. The database will automatically expand storage when it reaches 90% capacity.
We set diskAutoresizeLimit: 1000 (1 TB) as a safety cap. This prevents runaway costs if something goes wrong with data growth.
Warning: Disk size can only increase, never decrease. Start with a reasonable minimum and let it grow.
Availability and redundancy
```typescript
availabilityType: "REGIONAL", // or "ZONAL"
```
ZONAL (Single-zone)
- Instance runs in one zone
- Cheaper
- Downtime during zone failures
- Good for: Development, non-critical workloads
REGIONAL (Multi-zone)
- Primary instance + standby in different zones
- Automatic failover (usually under 60 seconds)
- Required for 99.99% SLA
- Good for: Production workloads
We use ZONAL for dev (to save costs) and REGIONAL for prod (to maximize availability).
Private networking
This is crucial for security. We never expose the database to the public internet.
```typescript
ipConfiguration: {
ipv4Enabled: false,
privateNetwork: args.network.id,
enablePrivatePathForGoogleCloudServices: true,
},
```
How it works
- VPC Peering: Cloud SQL creates a peering connection to your VPC
- Private IP: The database gets an internal IP address in your network
- No public exposure: ipv4Enabled: false ensures no public IP is assigned
The setup requirement
Before creating a Cloud SQL instance with private IP, you need to establish the private services connection:
```typescript
// Reserve an IP range for Cloud SQL
this.privateIpRange = new gcp.compute.GlobalAddress(`${resourceName}-private-ip-range`, {
purpose: "VPC_PEERING",
addressType: "INTERNAL",
prefixLength: 16,
network: this.vpc.id,
});
// Create the peering connection
this.privateVpcConnection = new gcp.servicenetworking.Connection(`${resourceName}-private-vpc-connection`, {
network: this.vpc.id,
service: "servicenetworking.googleapis.com",
reservedPeeringRanges: [this.privateIpRange.name],
});
```
The key detail: Cloud SQL depends on this connection. In Pulumi, we express this explicitly:
```typescript
{
dependsOn: [args.privateVpcConnection],
}
```
Without this dependency, Pulumi might try to create the database before the VPC peering is ready, and you'll get cryptic networking errors.
Private path for Google services
```typescript
enablePrivatePathForGoogleCloudServices: true,
```
This allows other Google services (like Cloud Run and GKE) to access the database through private networking without going through the public internet.
Backup configuration
Backups are non-negotiable for production databases.
```typescript
backupConfiguration: {
enabled: true,
startTime: "03:00", // 3 AM UTC
pointInTimeRecoveryEnabled: true,
transactionLogRetentionDays: 7,
backupRetentionSettings: {
retainedBackups: 35,
retentionUnit: "COUNT",
},
},
```
Daily automated backups
startTime: "03:00" schedules backups at 3 AM UTC. Choose a time when your database has low activity.
Point-in-time recovery (PITR)
When enabled, Cloud SQL continuously writes transaction logs to storage. You can restore to any point within the retention window — not just the daily backup.
transactionLogRetentionDays: 7 means you can restore to any moment in the last week.
Backup retention
We keep 35 daily backups (about 5 weeks). This gives us enough history to recover from issues that aren't noticed immediately.
Tip: Test your restore process regularly. Backups are worthless if you can't restore from them.
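One low-friction way to do that is to clone the instance to a point in time and run your checks against the copy. Here's a minimal sketch, assuming the pulumi-gcp provider's clone support on DatabaseInstance (the timestamp and names are illustrative; verify the field names against your provider version):
```typescript
// Hypothetical restore drill: clone the primary into a throwaway instance
// at a specific point in time, verify the data, then delete the clone.
const restoreDrill = new gcp.sql.DatabaseInstance(`${resourceName}-restore-drill`, {
  name: `${resourceName}-restore-drill`,
  region: region,
  clone: {
    sourceInstanceName: this.instance.name,
    pointInTime: "2025-06-01T02:30:00Z", // any moment inside the PITR window
  },
  deletionProtection: false, // the clone is disposable
});
```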
Maintenance windows
```typescript
maintenanceWindow: {
day: 7, // Sunday
hour: 4, // 4 AM UTC
updateTrack: "stable",
},
```
GCP applies security patches and updates during maintenance windows. By specifying day and hour, you control when this happens.
We choose Sunday at 4 AM UTC — lowest traffic for most business applications.
Update tracks
- stable: receives updates after they've been tested in production by other users
- canary: receives updates earlier (useful for testing)
Stick with stable for production.
Database flags
```typescript
databaseFlags: [
{ name: "max_connections", value: maxConnections.toString() },
{ name: "cloudsql.iam_authentication", value: "on" },
],
```
max_connections
The default is often too low for production. We set this to 200, but adjust based on your needs:
- Connection poolers like PgBouncer can reduce the connections you need
- Each connection uses memory, so don't set this arbitrarily high
- Monitor pg_stat_activity to see actual usage
IAM authentication
cloudsql.iam_authentication enables authentication using GCP IAM users and service accounts. This is useful for:
- Service accounts connecting from Cloud Run, GKE, etc.
- Developers connecting via Cloud SQL Auth Proxy
- Reducing password management overhead
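If you go this route, the database-side user is one more resource. A sketch using the pulumi-gcp gcp.sql.User resource with a hypothetical service account (Cloud SQL expects the email without the .gserviceaccount.com suffix):
```typescript
// IAM database user for a service account; requires the
// cloudsql.iam_authentication flag shown above.
const iamDbUser = new gcp.sql.User(
  `${resourceName}-iam-user`,
  {
    instance: this.instance.name,
    // "my-app@my-project.iam.gserviceaccount.com" with the suffix dropped:
    name: "my-app@my-project.iam",
    type: "CLOUD_IAM_SERVICE_ACCOUNT",
  },
  { parent: this },
);
```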
Query Insights
```typescript
insightsConfig: {
queryInsightsEnabled: true,
queryPlansPerMinute: 5,
queryStringLength: 1024,
recordApplicationTags: true,
recordClientAddress: true,
},
```
Query Insights is free and provides invaluable debugging information:
- Top queries by CPU, latency, and IO
- Query plans for slow queries
- Historical trends
We enable all options. The overhead is minimal, and the visibility is worth it.
Creating the database and user
The Cloud SQL instance is just the server. You also need a database and user:
```typescript
this.database = new gcp.sql.Database(
`${resourceName}-database`,
{
name: args.databaseName,
instance: this.instance.name,
charset: "UTF8",
collation: "en_US.UTF8",
},
{ parent: this },
);
this.user = new gcp.sql.User(
`${resourceName}-db-user`,
{
name: "postgres",
instance: this.instance.name,
password: args.password,
},
{ parent: this },
);
```
Character set and collation
UTF8 with en_US.UTF8 collation is the safe default for most applications. If you have specific locale requirements, adjust accordingly.
The postgres user
We use the default postgres superuser. For enhanced security, you could:
- Create a separate application user with limited privileges
- Disable the postgres user after setup
- Use IAM authentication for application access
For now, password authentication with secrets management (via Infisical) works well for our use case.
Environment presets
Rather than duplicating configuration, we use presets:
```typescript
export const DatabasePresets = {
dev: {
tier: "db-perf-optimized-N-4",
edition: "ENTERPRISE_PLUS",
availabilityType: "ZONAL" as const,
diskSize: 250,
dataCacheEnabled: true,
deletionProtection: false,
backupRetainedDays: 35,
},
prod: {
tier: "db-perf-optimized-N-8",
edition: "ENTERPRISE_PLUS",
availabilityType: "REGIONAL" as const,
diskSize: 250,
dataCacheEnabled: true,
deletionProtection: true,
backupRetainedDays: 35,
},
};
```
Key differences:
| Setting | Dev | Prod | Why |
|---|---|---|---|
| Tier | N-4 (4 vCPU) | N-8 (8 vCPU) | Prod handles more load |
| Availability | ZONAL | REGIONAL | Prod needs HA |
| Deletion protection | Off | On | Prevent accidental deletion in prod |
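For completeness, this is roughly how a preset gets wired in; the CloudSqlDatabase component name and its argument shape are illustrative, not verbatim from our codebase:
```typescript
// Illustrative wiring: pick a preset from the current stack and spread it
// into the database component's arguments.
const env = pulumi.getStack() === "prod" ? "prod" : "dev";

const db = new CloudSqlDatabase("app-db", {
  ...DatabasePresets[env],
  region: "europe-west1",   // example region
  network: vpc,             // the VPC from the networking stack
  privateVpcConnection,     // the service networking connection from earlier
  databaseName: "app",
  password: dbPassword,     // sourced from your secrets manager
});
```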
Connecting to the database
From within GCP
Applications running in the same VPC can connect directly using the private IP:
```typescript
const connectionString = pulumi.interpolate`postgresql://postgres:${password}@${database.privateIp}:5432/${dbName}`;
```
From outside GCP (development)
For local development, use the Cloud SQL Auth Proxy:
```bash
# Install the proxy
gcloud components install cloud-sql-proxy
# Connect (this creates a local socket)
cloud-sql-proxy --port 5432 PROJECT:REGION:INSTANCE_NAME
```
Then connect as if the database were local:
```bash
psql -h localhost -U postgres -d mydb
```
Via VPN
Our setup includes a VPN, so developers can:
- Connect to the VPN
- Access the database directly via private IP
This is convenient for tools that don't support Cloud SQL Auth Proxy.
Common gotchas
1. VPC peering takes time
The private VPC connection can take 5-10 minutes to establish. If your pulumi up seems stuck on the database creation, this is likely why.
2. Connection limits with serverless
Cloud Run and similar services can create many connections quickly. Use a connection pooler (PgBouncer, Prisma's built-in pooling) or Cloud SQL's built-in connection pooling.
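For Node services, keeping the per-instance pool small goes a long way. A sketch using node-postgres; the DATABASE_URL variable and the limits are assumptions to adapt to your workload:
```typescript
import { Pool } from "pg";

// A deliberately small pool: many Cloud Run instances multiplied by a large
// pool exhausts max_connections quickly.
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 5,                      // a few connections per instance is usually enough
  idleTimeoutMillis: 30_000,   // release idle connections promptly
  connectionTimeoutMillis: 5_000,
});

export async function query<T>(text: string, params: unknown[] = []): Promise<T[]> {
  const result = await pool.query(text, params);
  return result.rows as T[];
}
```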
3. Disk can only grow
Once you increase disk size, you can't decrease it. Start smaller and let auto-resize handle growth.
4. Maintenance can cause brief downtime
Even with REGIONAL availability, maintenance updates can cause brief connection interruptions (typically a few seconds). Design your application to handle database reconnection gracefully.
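In practice, "gracefully" mostly means retrying with backoff around transient connection errors. A minimal sketch (tune the attempt count, delays, and which errors you treat as retryable):
```typescript
// Retry a database operation a few times with exponential backoff.
// Intended for transient connection errors around failovers and maintenance.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // 250 ms, 500 ms, 1 s, ...
      await new Promise((resolve) => setTimeout(resolve, 250 * 2 ** i));
    }
  }
  throw lastError;
}

// Usage: await withRetry(() => pool.query("SELECT 1"));
```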
5. Backups aren't free
Backup storage costs money. 35 daily backups of a large database adds up. Monitor your storage costs.
Monitoring and alerts
Beyond Query Insights, set up alerts for:
- CPU utilization > 80% sustained
- Memory utilization > 90%
- Disk utilization > 80%
- Connection count approaching max_connections
- Replication lag (if using read replicas)
GCP Cloud Monitoring makes this straightforward.
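If your alerting lives in Pulumi as well, each item above is a small alert policy. Here's a sketch for the first one (sustained CPU above 80%), assuming the pulumi-gcp gcp.monitoring.AlertPolicy resource; the other metrics follow the same pattern with a different metric.type filter:
```typescript
// Alert when Cloud SQL CPU utilization stays above 80% for 10 minutes.
const cpuAlert = new gcp.monitoring.AlertPolicy("db-cpu-high", {
  displayName: "Cloud SQL CPU > 80%",
  combiner: "OR",
  conditions: [
    {
      displayName: "CPU utilization above 80% for 10 minutes",
      conditionThreshold: {
        filter:
          'resource.type = "cloudsql_database" AND ' +
          'metric.type = "cloudsql.googleapis.com/database/cpu/utilization"',
        comparison: "COMPARISON_GT",
        thresholdValue: 0.8,
        duration: "600s",
        aggregations: [{ alignmentPeriod: "60s", perSeriesAligner: "ALIGN_MEAN" }],
      },
    },
  ],
  notificationChannels: [], // add your notification channel IDs here
});
```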
Next steps
In upcoming posts, we'll cover:
- Read replicas: Setting up cross-region replicas for global deployments
- Performance tuning: PostgreSQL configuration for specific workloads
- Monitoring deep dive: Setting up comprehensive observability
Need help with your database architecture? Get in touch — we've deployed and optimized databases for applications at every scale.
