{"id":1443,"date":"2014-02-18T09:00:00","date_gmt":"2014-02-18T17:00:00","guid":{"rendered":""},"modified":"2024-01-22T22:52:14","modified_gmt":"2024-01-23T06:52:14","slug":"data-intensive-applications-in-the-cloud-computing-world","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/sql-server\/blog\/2014\/02\/18\/data-intensive-applications-in-the-cloud-computing-world\/","title":{"rendered":"Data-intensive Applications in the Cloud Computing World"},"content":{"rendered":"
Building data-intensive applications in emerging cloud computing environments is fundamentally different and more exciting.\u00a0 The levels of scale, reliability, and performance are as challenging as anything we have previously seen.\u00a0 Databases are still prevalent in design, but new patterns and storage options need to be considered, as well.<\/p>\n
To provide a little context, I have developed and supported database software for over 30 years.\u00a0 I started with IMS\/DLI and CICS\/VSAM, then quickly moved to DB2 while it was still in beta (System R).\u00a0 I became a pretty hard-core RDBMS expert with 11 years of DB2 experience and over 20 years of Microsoft SQL Server experience.\u00a0 I have been involved with some of the largest RDBMS projects in the world. (Example: a reliable, large-scale application in Europe that is available 24\/7 and is designed to process up to 500,000 batch requests\/sec, which equates to more than 4 million SQL statements\/sec.)\u00a0 Before the emerging era of cloud computing, my database thinking was all about scale-up computing with transaction latency measured in a few microseconds and I\/O throughput measured in many GB\/sec.<\/p>\n
For the past 18 months, my team has worked with customers to build applications on the Windows Azure platform.\u00a0 We\u2019ve learned a lot about scale-out distributed computing\u2014composing applications and solutions using different sets of services and resources while exploiting cloud platform fundamentals such as scalability, availability, and manageability. \u00a0We\u2019ve also learned that developing data-intensive applications against a set of online services is very different from writing traditional client\/server applications.<\/i><\/p>\n
In the remainder of this post, I take a high-level look at the role of databases in cloud-based applications.<\/p>\n
The design pattern for, and use of, a database in a cloud-based application are different and, generally, expanded.\u00a0 You still need to store the persisted database transactions of the traditional RDBMS application. \u00a0And, because a cloud-based application uses distributed computing resources and is held to higher standards for reliability, performance, and manageability, you also need to capture extensive telemetry data about the entire application\u2014if you want to build a great cloud application.<\/p>\n
When we first started working with customers writing cloud-based data-intensive applications, most would use a relational database like Windows Azure SQL Database for all data storage, including telemetry data.\u00a0 This was to be expected: developers often use the tools they are most familiar with, T-SQL provides a quick and well-known interface to get data in and out of the database, and relational databases generally take care of threading and concurrency for developers.\u00a0 At the time, the default thinking was that most data belongs in a traditional relational database where data is always stateful and carries atomic transactional properties.\u00a0 However, in the distributed cloud computing environment, scale will likely come from the implementation of stateless as well as stateful data properties.\u00a0 The new paradigm shifts us away from the use of a traditional RDBMS for all data.<\/p>\n
An Example<\/b><\/p>\n
Let me show you an example of a cloud application where multiple data stores are used.\u00a0 The architecture is for an online gaming experience.\u00a0 This application is designed to manage several thousand concurrent users and can scale out at several points, as needed.\u00a0 After the diagram, I will explain the different functions of the application, the type of data store used for each function, and why that particular type of data store is used for that function.\u00a0 As I describe each function, I will refer to the corresponding number in the diagram.<\/p>\n
\u00a0 Function: Login and Initialize Profile <\/b>\u2014 Looking at the bottom of the diagram, you see three users; let\u2019s start there.\u00a0 These users log in and their sessions are assigned to a Windows Azure web role (#1).\u00a0 The web role hosts them while they are active on the system.\u00a0 The first step is to authenticate them and bring their profiles into Windows Azure Cache (#5).\u00a0 Their profiles are stored in Windows Azure SQL Database (#2).\u00a0 Complex queries retrieve profile data from the SQL Database by joining data (game history, scores, activity, etc.) from multiple relational tables, and the results are stored in the Cache.\u00a0 A relational database is best suited for this type of persistent storage and complex query activity.\u00a0 The user profile also needs to be updated when information changes, so the same types of complex transactions are required to write the information back to the SQL Database.\u00a0 This is a good example of where a traditional relational database is best utilized as your data store.<\/p>\n Function: Play Games and Perform Online Activities <\/b>\u2014 After the users are logged in and their profiles are in Cache, they can start the gaming experience.\u00a0 As you might expect, the gaming experience runs entirely in Cache (#5).\u00a0 The Cache is a high-performance data store and is the obvious place to store active game data. \u00a0Because Cache is non-durable, leaderboard, profile, and friend information is pushed out to other data stores for persistence.<\/p>\n Function: Documenting and Updating Activities <\/b>\u2014 All active game activity is recorded while the users are playing, and this activity data must be durable during play even though it changes constantly.\u00a0 Activity data is stored in a Queue (#4).\u00a0 This is a durable Queue, so, unlike data in the Cache, activity data is not lost if an outage takes place. 
\u00a0Data stored in the Queue is handled by \u201cactivity processors\u201d (hosted as Windows Azure worker roles) that process the data, carry out application logic, and persist results and history.<\/p>\n Function:\u00a0 Activity History <\/b>\u2014 For each user, all activity is stored and kept as history.\u00a0 Activity history is persisted on a periodic basis from the active Queue (#4) to a NoSQL store (#3).\u00a0 This NoSQL store is a table in the Windows Azure Table service.\u00a0 The table is used because the data is mostly write-only, must grow easily in place, and has little need for complex query activity against it.\u00a0 So, the Table service is the best store for this type of data activity.<\/p>\n Function: Friend Interaction and Leaderboard <\/b>\u2014 While users are playing, they can communicate and interact with friends (other users) in the system.\u00a0 They also might want to keep tabs on the leaderboard.\u00a0 Friend and leaderboard data changes often but not constantly, so this data is best stored in an Azure SQL Database (#7).\u00a0 The relational database is updated often, and the \u201ccache tasks\u201d role continuously queries the SQL Database for the latest information and ensures the active cache (#5) is always updated with the latest leaderboard and friend information.<\/p>\n Function: Data Warehouse <\/b>\u2014 All user profile data and activity data is stored in a data warehouse (#6) for reporting purposes.\u00a0 Unstructured data from the Azure Table service is stored in Hadoop (Windows Azure HDInsight), and structured data is stored in a relational data warehouse (Azure SQL Database).<\/p>\n Summary<\/b><\/p>\n In summary, you can see that this single application uses five different data storage options:\u00a0 Windows Azure Cache, Azure SQL Database, Azure Queue service, Azure Table service, and Azure HDInsight (Hadoop).\u00a0 Each type of store was chosen because it represents the best option for 
the transactional needs of the operation being executed:<\/p>\n If this were an on-premises application, you could have used multiple data stores, too, but the overhead of procuring, installing, and configuring all of these sources adds time and cost to your solution.\u00a0 As the diagram below suggests, with only a couple of clicks in the Windows Azure portal, you can have any of these data sources installed, configured, and up and running.<\/p>\n My world has changed.\u00a0 I am no longer just a relational database developer.\u00a0 The Windows Azure platform and Microsoft cloud services make it easy to use the best data store for whatever task I am trying to accomplish\u2014and, in many cases, this means using several different types of data stores. For more information about our data platform vision and the future of data-intensive applications on the Windows Azure platform, see Quentin Clark\u2019s blog, \u201cWhat Drives Microsoft\u2019s Data Platform Vision?<\/a>\u201d<\/p>\n Mark Souza Building data-intensive applications in emerging cloud computing environments is fundamentally different and more exciting.\u00a0 The levels of scale, reliability, and performance are as challenging as anything we have previously seen.\u00a0 Databases are still prevalent in design, but new patterns and storage options need to be considered, as 
well.<\/p>\n","protected":false},"author":1457,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ep_exclude_from_search":false,"_classifai_error":"","_classifai_text_to_speech_error":"","footnotes":""},"post_tag":[],"product":[],"content-type":[2445],"topic":[],"coauthors":[],"class_list":["post-1443","post","type-post","status-publish","format-standard","hentry","content-type-thought-leadership","review-flag-1593580427-503","review-flag-1-1593580431-15","review-flag-2-1593580436-981","review-flag-3-1593580441-293","review-flag-4-1593580446-456","review-flag-5-1593580452-31","review-flag-6-1593580457-144","review-flag-7-1593580462-294","review-flag-alway-1593580309-407","review-flag-new-1593580247-437","review-flag-on-pr-1593580815-813"],"yoast_head":"\n
<\/a><\/p>\n\n
<\/a><\/p>\n
\nGeneral Manager
\nWindows Azure Customer Advisory Team<\/p>\n","protected":false},"excerpt":{"rendered":"