{"id":212075,"date":"2015-06-01T09:00:53","date_gmt":"2015-06-01T16:00:53","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/project\/dsoap-distributed-social-analytics-platform\/"},"modified":"2023-03-30T12:01:13","modified_gmt":"2023-03-30T19:01:13","slug":"dsoap-distributed-social-analytics-platform","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/dsoap-distributed-social-analytics-platform\/","title":{"rendered":"DSoAP – Distributed Social Analytics Platform"},"content":{"rendered":"
The Distributed Social Analytics Platform (DSoAP)<\/strong> project is focused on the \u201cHuge Data\u201d problem in social policy research caused by the breadth of data involved. Using aggregate social media data to investigate and validate social issues (such as employment, health and fiscal policy) requires analyzing many months or years of data. DSoAP is applying intelligent compaction, pre-indexing and distribution of data across a server cluster to achieve responsive query times for online data exploration.<\/p>\n Twitter<\/strong> is much more than just cat pictures and what people eat for lunch: it is a treasure trove of data about people\u2019s life events, experiences, and opinions.<\/p>\n Recent research has started to look at how to use broader aggregate data to investigate and validate social issues such as employment, health and fiscal policy. A defining characteristic of this type of social policy research is the time span and breadth of data involved. While most tweet analysis concentrates on a short sliding time window on the order of hours or days, extracting meaningful social policy trends typically involves looking at many months or even years of data.<\/p>\n With ~500 million new tweets (~2-3 TB) being added to the Twitter data corpus daily, creating systems that can efficiently handle that massive volume of data is a challenging task. In the DSoAP project, we are working on solutions for this \u201chuge data\u201d problem by applying intelligent compaction, pre-indexing and distribution of data across a cluster of machines to achieve reasonable query times for online data exploration.<\/p>\n","protected":false},"excerpt":{"rendered":" The Distributed Social Analytics Platform (DSoAP) project is focused on the \u201cHuge Data\u201d problem in social policy research caused by the breadth of data involved. 
Using aggregate social media data to investigate and validate social issues (such as employment, health and fiscal policy) requires analyzing many months or years of data. DSoAP is applying intelligent […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13563,13555,13547],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-212075","msr-project","type-msr-project","status-publish","hentry","msr-research-area-data-platform-analytics","msr-research-area-search-information-retrieval","msr-research-area-systems-and-networking","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"2015-06-01","related-publications":[312656,166370],"related-downloads":[377063,234669],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[{"type":"user_nicename","display_name":"Lidong Zhou","user_id":32673,"people_section":"Group 1","alias":"lidongz"},{"type":"user_nicename","display_name":"Emre Kiciman","user_id":31739,"people_section":"Group 1","alias":"emrek"},{"type":"user_nicename","display_name":"Scott Counts","user_id":31471,"people_section":"Group 
1","alias":"counts"}],"msr_research_lab":[199565],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/212075"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":6,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/212075\/revisions"}],"predecessor-version":[{"id":932181,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/212075\/revisions\/932181"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=212075"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=212075"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=212075"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=212075"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=212075"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}