{"id":18105,"date":"2016-12-16T10:00:21","date_gmt":"2016-12-16T18:00:21","guid":{"rendered":"https:\/\/blogs.technet.microsoft.com\/dataplatforminsider\/?p=18105"},"modified":"2024-01-22T22:50:40","modified_gmt":"2024-01-23T06:50:40","slug":"sql-server-on-linux-how-introduction","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2016\/12\/16\/sql-server-on-linux-how-introduction\/","title":{"rendered":"SQL Server on Linux: How? Introduction"},"content":{"rendered":"
This post was authored by Scott Konersmann, Partner Engineering Manager, SQL Server; Slava Oks, Partner Group Engineering Manager, SQL Server; and Tobias Ternstrom, Principal Program Manager, SQL Server.<\/em><\/p>\n We first announced<\/a> SQL Server on Linux in March, and recently released the first public preview of SQL Server on Linux<\/a> (SQL Server v.Next CTP1) at the Microsoft Connect(); conference. We\u2019ve been pleased to see the positive reaction from our customers and the community; in the two weeks following the release, there were more than 21,000 downloads of the preview. A lot of you are curious to hear more about how we made SQL Server run on Linux (and some of you have already figured out and posted interesting articles about part of the story with \u201cDrawbridge\u201d). We decided to kick off a blog series to share technical details about this very topic, starting with an introduction to the journey of offering SQL Server on Linux. Hopefully you will find it as interesting as we do!<\/p>\n Making SQL Server run on Linux involves introducing what is known as a Platform Abstraction Layer (\u201cPAL\u201d) into SQL Server. This layer is used to gather all operating system or platform specific code in one place and allow the rest of the codebase to stay operating system agnostic. Because of SQL Server\u2019s long history on a single operating system, Windows, it never needed a PAL. In fact, the SQL Server database engine codebase has many references to libraries that are popular on Windows to provide various functionality. In bringing SQL Server to Linux, we set strict requirements for ourselves to bring the full functional, performance, and scale value of the SQL Server RDBMS to Linux. This includes the ability for an application that works great on SQL Server on Windows to work equally well against SQL Server on Linux. 
Given these requirements, and the fact that the existing SQL Server OS dependencies would make it very hard to provide a highly capable version of SQL Server outside of Windows in a reasonable time, we decided to marry parts of the Microsoft Research (MSR) project Drawbridge<\/a> with SQL Server\u2019s existing platform layer, SQL Server Operating System<\/a> (SOS), to create what we call the SQLPAL. The Drawbridge project provided an abstraction between the underlying operating system and the application for the purposes of secure containers, and SOS provided robust memory management, thread scheduling, and IO services. Creating SQLPAL enabled the existing Windows dependencies to be used on Linux with the help of parts of the Drawbridge design focused on OS abstraction, while leaving the key OS services to SOS. We are also changing the SQL Server database engine code to bypass the Windows libraries and call directly into SQLPAL for resource-intensive functionality.<\/p>\n SQL Server is Microsoft\u2019s flagship database product, with close to 30 years of development behind it. At a high level, the list below represents our requirements as we designed the solution to make the SQL Server RDBMS available on multiple platforms:<\/p>\n To make SQL Server support multiple platforms, the engineering task is essentially to remove or abstract away its dependencies on Windows. As you can imagine, after decades of development against a single operating system, there are plenty of OS-specific dependencies across the code base. In addition, the code base is huge. 
There are tens of millions of lines of code in SQL Server.<\/p>\n SQL Server depends on various libraries, and on their functions and semantics, that are commonly used in Windows development and fall into three categories:<\/p>\n You can think of these as core library functions; most of them have nothing to do with the operating system kernel and only execute in user mode.<\/p>\n While SQL Server has dependencies on both Win32 and the Windows kernel, the most complex dependency is that of Windows application libraries that have been added over the years in order to provide new functionality.\u00a0 Here are some examples:<\/p>\n These dependencies are the biggest challenge for us to overcome to meet our goals of bringing the same value and having a very high level of compatibility between SQL Server on Windows and Linux. As an example, re-implementing something like SQLXML would take a significant amount of time, would run a high risk of not providing the same semantics as before, and could potentially break applications. The option of completely removing these dependencies would mean we must also remove the functionality they provide from SQL Server on Linux. If the dependencies were edge cases that affected only a few customer-visible features, we could have considered it. As it turns out, removing them would force us to remove tons of features from SQL Server on Linux, which would go against our goals around compatibility and value across operating systems.<\/p>\n We could take the approach of doing this re-implementation piecemeal, bringing value little by little. While this would be possible, it would also go against the requirements because it would mean that there would be a significant gap between SQL Server on Linux and Windows for years. The resolution lies in the right platform abstraction layer.<\/p>\n Software that is supported across multiple operating systems always has an implementation of some sort of Platform Abstraction Layer (PAL). 
The PAL is responsible for abstracting the calls and semantics of the underlying operating system and its libraries from the software itself. The next couple of sections consider some of the technology that we investigated as solutions to building a PAL for SQL Server.<\/p>\n In the SQL Server 2005 release, a platform layer was created between the SQL Server engine and Windows called the SQL Operating System (SOS). This layer was responsible for user mode thread scheduling, memory management, and synchronization (see SQLOS<\/a> for reference).\u00a0 A key reason for the creation of SOS was that it allowed for a centralized set of low level management and diagnostics functionality to be provided to customers and support (a subset of Dynamic Management Views\/DMVs and Extended Events\/XEvents).\u00a0 This layer allowed us to minimize the number of system calls involved in scheduling execution by running non-preemptively and letting SQL Server do its own resource management.\u00a0 While SOS improved performance and greatly helped supportability and debugging, it did not provide a proper abstraction layer from the OS dependencies described above, i.e. Windows semantics were carried through SOS and exposed to the database engine.<\/p>\n <\/a><\/p>\n In the scenario where we would completely remove the dependencies on the underlying operating system from the database engine, the best option was to grow SOS into a proper Platform Abstraction Layer (PAL).\u00a0 All<\/b> the calls to Windows APIs would be routed through a new set of equivalent APIs in SOS and a new host extension layer would be added on the bottom of SOS that would interact with the operating system. 
While this would resolve the system call dependencies, it would not help with the dependencies on the higher-level libraries.<\/p>\n Drawbridge was a Microsoft Research project (see Drawbridge<\/a> for reference) that focused on drastically reducing the virtualization resource overhead incurred when hosting many Virtual Machines on the same hardware.\u00a0 The research involved two ideas.\u00a0 The first idea was a \u201cpicoprocess\u201d which consists of an empty address space, a monitor process that interacts with the host operating system on behalf of the picoprocess, and a kernel driver that allows a driver to populate the address space at startup and implements a host Application Binary Interface (ABI) that allows the picoprocess to interact with the host.\u00a0 The second idea was a user mode Library OS, sometimes referred to as LibOS.\u00a0 Drawbridge provided a working Windows Library OS that could be used to run Windows programs on a Windows host.\u00a0 This Library OS implements a subset of the 1500+ Win32 and NT ABIs and stubs the rest to either succeed or fail depending on the type of call.<\/p>\n <\/a><\/p>\n Our needs didn\u2019t align with the original goals of the Drawbridge research.\u00a0 For instance, the picoprocess idea isn\u2019t something needed for moving SQL Server to other platforms.\u00a0 However, there were a couple of synergies that stood out:<\/p>\n There were also some risk and reward tradeoffs:<\/p>\nIntroduction<\/span><\/a><\/h4>\n
Summary<\/h4>\n
Requirements for supporting Linux<\/span><\/a><\/h4>\n
\n
\n
\n
Building a PAL<\/h2>\n
SQL Operating System (SOS or SQLOS)<\/h4>\n
Drawbridge<\/span><\/a><\/h4>\n
\n
\n