{"id":4116,"date":"2025-10-07T09:00:00","date_gmt":"2025-10-07T16:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/startups\/blog\/?p=4116"},"modified":"2026-03-03T17:27:58","modified_gmt":"2026-03-04T01:27:58","slug":"openai-and-postgresql-scaling-with-microsoft-azure","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/startups\/blog\/openai-and-postgresql-scaling-with-microsoft-azure\/","title":{"rendered":"OpenAI and PostgreSQL: Scaling with Microsoft Azure"},"content":{"rendered":"\n

OpenAI, the company behind ChatGPT and other breakthrough AI models, is known for pushing technological boundaries. But one surprising part of OpenAI\u2019s story is how much it leans on a tried-and-true technology: PostgreSQL. Postgres is the backbone of OpenAI\u2019s most critical systems. In this blog, we\u2019ll explore OpenAI\u2019s PostgreSQL journey with Microsoft Azure<\/a>\u2014the challenges faced, the solutions implemented, and the impressive results achieved. More importantly, we\u2019ll distill lessons you can use to scale your database.<\/p>\n\n\n\n

\n
Innovate with a fully managed, AI-ready PostgreSQL database<\/a><\/div>\n<\/div>\n\n\n\n

The beginning: Initial architecture focused on simplicity<\/h2>\n\n\n\n
\n\t
\n\t\t

\n\t\t\tWhat is PostgreSQL?\t\t<\/h2>\n\t\t

\n\t\t\t\t\t\t\t\n\t\t\t\t\t\tLearn more\t\t\t\t\t\t\t\u2197<\/a>\n\t\t\t\t\t<\/p>\n\t<\/div>\n<\/div>\n\n\n\n

From early on, OpenAI used Azure Database for PostgreSQL<\/a>, which spared the team from low-level database maintenance while providing important features like automated backups and high availability. The architecture was initially simple: one primary Postgres instance handled writes, with multiple read-only replicas to shoulder the heavy read traffic. This classic primary-replica setup worked well through OpenAI\u2019s early growth.<\/p>\n\n\n\n
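In practice, a primary-replica split like this is enforced in the application's data-access layer: write statements go to the primary's endpoint, while read queries rotate across the replica endpoints. Here is a minimal, hypothetical sketch of that routing pattern (the class, heuristic, and hostnames are illustrative, not OpenAI's actual code):

```python
import itertools

class PostgresRouter:
    """Route writes to the primary and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Cycle through replicas round-robin so read load is spread evenly.
        self._replica_cycle = itertools.cycle(replicas)

    def endpoint_for(self, sql):
        # Simple heuristic: only SELECT statements may be served by a replica;
        # everything else (INSERT/UPDATE/DELETE/DDL) must hit the primary.
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._replica_cycle)
        return self.primary

# Hypothetical endpoints for illustration only.
router = PostgresRouter(
    primary="primary.example.internal",
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)
router.endpoint_for("SELECT id FROM users")        # served by a replica
router.endpoint_for("INSERT INTO users VALUES (1)")  # served by the primary
```

A real deployment would route connections rather than individual statements, but the principle is the same: keep the primary's capacity for writes and let replicas absorb read traffic.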

For read-intensive workloads, this single-shard approach was a big win. Read scalability was excellent, thanks to dozens of replicas that the team could add as needed. Each replica is a live copy of the primary, so spreading out read queries among them allowed OpenAI to serve millions of users with low latency. Geographic distribution of replicas even enabled snappy read performance for users around the world. It\u2019s a showcase of how cloud-managed Postgres can scale out reads efficiently.<\/p>\n\n\n\n
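Geographic read routing can be as simple as mapping a user's region to the nearest replica endpoint. A hedged sketch with made-up region names and hostnames (real deployments typically rely on DNS- or traffic-manager-based routing rather than a static table):

```python
# Hypothetical region-to-replica map; every name here is illustrative.
REPLICAS_BY_REGION = {
    "us-west": "replica-usw.example.net",
    "eu-west": "replica-euw.example.net",
    "ap-east": "replica-ape.example.net",
}
DEFAULT_REPLICA = "replica-usw.example.net"

def replica_for(user_region):
    """Serve reads from the replica geographically closest to the user."""
    return REPLICAS_BY_REGION.get(user_region, DEFAULT_REPLICA)
```

Serving reads near the user trims network round-trip time, which is often a larger share of perceived latency than query execution itself.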

However, as usage of ChatGPT and other services grew, this design was tested to its limits. Because every write operation had to funnel through the single primary, write requests became a growing bottleneck, and as traffic surged, a few incidents occurred in which database performance affected OpenAI\u2019s services. These events were a wake-up call: the team needed new strategies to support read and write scale-out for its PostgreSQL workloads.<\/p>\n\n\n\n
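A common way to relieve pressure on a lone write primary is to coalesce many small writes into fewer, larger batches, cutting per-transaction overhead. A minimal sketch of that general technique (the API is illustrative, not OpenAI's implementation):

```python
class BatchedWriter:
    """Buffer small writes and flush them as larger batches, so the
    primary sees fewer, bigger transactions instead of a flood of tiny ones."""

    def __init__(self, flush_fn, batch_size=100):
        self._flush_fn = flush_fn    # e.g. runs one multi-row INSERT
        self._batch_size = batch_size
        self._pending = []
        self.flushes = 0             # how many batches have been sent

    def write(self, row):
        self._pending.append(row)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        if self._pending:
            self._flush_fn(self._pending)
            self._pending = []       # start a fresh buffer
            self.flushes += 1
```

With `batch_size=100`, a burst of 1,000 small writes reaches the primary as just 10 transactions; a production version would also flush on a timer so buffered rows are never held too long.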

Scaling\u00a0up with PostgreSQL on Azure as demand\u00a0grows<\/h2>\n\n\n\n

At POSETTE 2025<\/a>, OpenAI shared how their team scaled PostgreSQL to support ChatGPT and other mission-critical services. The Microsoft Azure Database for PostgreSQL team worked closely with OpenAI\u2019s engineers to push the service to new limits. The result was a series of upgrades and best practices that transformed the database layer into a resilient component of OpenAI\u2019s data platform.<\/p>\n\n\n\n

Let\u2019s break down the key strategies OpenAI used to scale and sharpen PostgreSQL, as shared in Bohan Zhang\u2019s talk<\/a>:<\/p>\n\n\n\n

1. Offloading and smoothing write workloads<\/strong><\/h3>\n\n\n\n

On a single database server, writes are often the hardest thing to scale. PostgreSQL\u2019s MVCC design leaves dead tuples behind on every update and delete, so heavy write loads can cause table and index bloat and degrade performance. OpenAI encountered exactly this. Their solution was to minimize the burden on the primary by:<\/p>\n\n\n\n