{"id":4116,"date":"2025-10-07T09:00:00","date_gmt":"2025-10-07T16:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/startups\/blog\/?p=4116"},"modified":"2026-03-03T17:27:58","modified_gmt":"2026-03-04T01:27:58","slug":"openai-and-postgresql-scaling-with-microsoft-azure","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/startups\/blog\/openai-and-postgresql-scaling-with-microsoft-azure\/","title":{"rendered":"OpenAI and PostgreSQL: Scaling with Microsoft Azure"},"content":{"rendered":"\n
OpenAI, the company behind ChatGPT and other breakthrough AI models, is known for pushing technological boundaries. But one surprising part of OpenAI\u2019s story is how much it leans on a tried-and-true technology: PostgreSQL. Postgres is the backbone of OpenAI\u2019s most critical systems. In this blog, we\u2019ll explore OpenAI\u2019s PostgreSQL journey with Microsoft Azure<\/a>\u2014the challenges faced, the solutions implemented, and the impressive results achieved. More importantly, we\u2019ll distill lessons you can use to scale your database.<\/p>\n\n\n\n \n\t\t\t\t\t\t\t\n\t\t\t\t\t\tLearn more\t\t\t\t\t\t\t\u2197<\/a>\n\t\t\t\t\t<\/p>\n\t<\/div>\n<\/div>\n\n\n\n From early on, OpenAI used Azure Database for PostgreSQL<\/a>, which spared the team from low-level database maintenance while providing important features like automated backups and high availability. The architecture was initially simple: one primary Postgres instance handled writes, with multiple read-only replicas to shoulder the heavy read traffic. This classic primary-replica setup worked well through OpenAI\u2019s early growth.<\/p>\n\n\n\n For read-intensive workloads, this single-shard approach was a big win. Read scalability was excellent, thanks to dozens of replicas that the team could add as needed. Each replica is a live copy of the primary, so spreading out read queries among them allowed OpenAI to serve millions of users with low latency. Geographic distribution of replicas even enabled snappy read performance for users around the world. It\u2019s a showcase of how cloud-managed Postgres can scale out reads efficiently.<\/p>\n\n\n\n However, as usage of ChatGPT and other services grew, the limits of this design were tested. Write requests became a growing bottleneck. All write operations had to funnel into the single primary database. As traffic surged, a few incidents occurred where database performance affected OpenAI\u2019s services. 
These events were a wake-up call to adopt new strategies for scaling out both reads and writes in their PostgreSQL workloads.<\/p>\n\n\n\n At POSETTE 2025<\/a>, OpenAI shared how their team scaled PostgreSQL to support ChatGPT and other mission-critical services. The Microsoft Azure Database for PostgreSQL team worked closely with OpenAI\u2019s engineers to push the service to new limits. The result was a series of upgrades and best practices that transformed the database layer into a resilient component of OpenAI\u2019s data platform.<\/p>\n\n\n\n Let\u2019s break down the key strategies OpenAI used to scale and sharpen PostgreSQL, as shared in Bohan Zhang\u2019s talk<\/a>:<\/a><\/p>\n\n\n\n On a single database server, writes are often the hardest to scale. PostgreSQL\u2019s multiversion concurrency control (MVCC) writes a new row version on every update, which can introduce table and index bloat and performance issues under heavy write loads. OpenAI encountered exactly this. Their solution was to minimize the burden on the primary by:<\/p>\n\n\n\n These optimizations paid off by keeping the primary database lean and efficient.<\/p>\n\n\n\n With write pressure under control, OpenAI focused on optimizing read-heavy workloads, which form the bulk of ChatGPT\u2019s traffic. Key steps included:<\/p>\n\n\n\n After all these efforts, the OpenAI team went from fighting fires to feeling in control.<\/p>\n\n\n\n Scaling isn\u2019t only about raw performance; it\u2019s also about maintaining stability and uptime. 
OpenAI implemented processes to ensure that pushing the limits of PostgreSQL wouldn\u2019t compromise reliability:<\/p>\n\n\n\n All these measures contributed to a robust PostgreSQL setup with cloud-grade reliability.<\/p>\n\n\n\n OpenAI\u2019s journey with Azure Database for PostgreSQL has produced meaningful outcomes for their business, illustrating just how far a startup can go with a well-architected relational database in the cloud:<\/p>\n\n\n\n OpenAI\u2019s PostgreSQL setup is handling a workload that few companies have ever seen, and yet it\u2019s doing so on a foundation of open-source technology and cloud services that any startup can use. This kind of scale was once thought to require exotic databases or enormous engineering teams, but OpenAI achieved it with a small team focused on systematic, pragmatic optimizations. In Bohan Zhang\u2019s words, \u201cAfter all the optimization we did, we are super happy with Postgres right now for our read-heavy workloads.\u201d<\/p>\n\n\n\n By using Azure Database for PostgreSQL, OpenAI benefited from a service built for high-scale, mission-critical workloads. Azure Database for PostgreSQL provided several advantages that complemented OpenAI\u2019s engineering work.<\/p>\n\n\n\n Azure made it straightforward to add replicas on demand. Learning from OpenAI\u2019s workload evolution, the Azure Database for PostgreSQL team developed the elastic clusters<\/a> feature, now available in preview, which enabled the OpenAI team to scale horizontally through row-based and schema-based sharding. The Azure team also introduced the cascading read replicas<\/a> capability, likewise in preview, which lets users create additional read replicas from an existing one. 
This helped the OpenAI team scale read workloads more efficiently across regions.<\/p>\n\n\n\n As Bohan Zhang, a member of OpenAI\u2019s infrastructure team, highlighted, \u201cAt OpenAI, we utilize an unsharded architecture with one writer and multiple readers, demonstrating that PostgreSQL can scale gracefully under massive read loads.\u201d<\/p>\n\n\n\n Additional Azure advantages included:<\/p>\n\n\n\n Azure Database for PostgreSQL provided a reliable canvas on which OpenAI executed these optimizations. If you\u2019re a startup, using a managed database means you get enterprise readiness out of the box, so you can devote your energy to product innovation and the specific tuning that your use case needs.<\/p>\n\n\n\n OpenAI\u2019s success with Azure Database for PostgreSQL is a story of resilience and innovation. It shines a light on what\u2019s possible when a startup pairs a powerful cloud platform with smart engineering. This balance of old and new is often a winning formula: you innovate where it differentiates you, and you rely on well-established solutions with proven reliability for things like databases. Here are some key takeaways for startup developers and technical decision makers looking to replicate this success:<\/p>\n\n\n\n If you\u2019re feeling inspired to supercharge your own startup\u2019s data layer, a great way to begin is by learning more about Azure Database for PostgreSQL<\/a> and how to use it effectively.<\/p>\n\n\n\nThe beginning: Initial architecture focused on simplicity<\/h2>\n\n\n\n
\n\t\t\tWhat is PostgreSQL?\t\t<\/h2>\n\t\t
Scaling\u00a0up with PostgreSQL on Azure as demand\u00a0grows<\/h2>\n\n\n\n
1. Offloading and smoothing write workloads<\/strong><\/h3>\n\n\n\n
\n
2. Scaling reads with replicas and smart query routing<\/strong><\/h3>\n\n\n\n
\n
3. Schema governance and safeguards<\/strong><\/h3>\n\n\n\n
\n
The result: PostgreSQL at scale<\/h2>\n\n\n\n
\n
Why Azure Database for PostgreSQL was key<\/h2>\n\n\n\n
Ease of scaling and replication<\/strong><\/h3>\n\n\n\n
\n
Making Postgres work for you<\/h2>\n\n\n\n
\n