{"id":165106,"date":"2013-07-01T00:00:00","date_gmt":"2013-07-01T00:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/gestalt-unifying-fault-localization-for-networked-systems\/"},"modified":"2018-10-16T20:41:43","modified_gmt":"2018-10-17T03:41:43","slug":"gestalt-unifying-fault-localization-for-networked-systems","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/gestalt-unifying-fault-localization-for-networked-systems\/","title":{"rendered":"Gestalt: Unifying fault localization for networked systems"},"content":{"rendered":"
\n

Researchers have proposed many algorithms for localizing faults in networked systems, but it is unclear which algorithm is best suited for a given network; the performance of these algorithms differs markedly for different networks. We develop a framework that can explain these differences by anatomizing the algorithms into their basic choices and analyzing these choices with respect to six defining characteristics of real networks. Our analysis also reveals that no existing algorithm simultaneously provides good localization accuracy and low computational overhead. Based on our insights, we develop a new algorithm called Gestalt. To perform well across a range of networks, Gestalt combines the best choices of existing algorithms with a new method for exploring the space of possible faults that is both low-overhead and robust to noise. We apply Gestalt to three real, diverse networks: an email network, a peer-to-peer messaging system, and an ISP network. In each case, Gestalt has either significantly higher localization accuracy or an order-of-magnitude faster running time. For example, when applied to Lync, Gestalt localizes faults with the same accuracy as Sherlock while reducing fault localization time from days to 23 seconds on a single system.<\/p>\n<\/div>\n

<\/p>\n","protected":false},"excerpt":{"rendered":"

Researchers have proposed many algorithms for localizing faults in networked systems, but it is unclear which algorithm is best suited for a given network; the performance of these algorithms differs markedly for different networks. We develop a framework that can explain these differences by anatomizing the algorithms into their basic choices and analyzing these choices […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13547],"msr-publication-type":[193718],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-165106","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-systems-and-networking","msr-locale-en_us"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2013-07-01","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"MSR-TR-2013-65","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"205355","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"tr2013-gestalt.pdf","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/tr2013-gestalt
.pdf","id":205355,"label_id":0}],"msr_related_uploader":"","msr_attachments":[{"id":205355,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/tr2013-gestalt.pdf"}],"msr-author-ordering":[{"type":"text","value":"Radhika Niranjan Mysore","user_id":0,"rest_url":false},{"type":"user_nicename","value":"ratul","user_id":33351,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=ratul"},{"type":"text","value":"Amin Vahdat","user_id":0,"rest_url":false},{"type":"user_nicename","value":"varghese","user_id":34496,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=varghese"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[144899,550641],"msr_project":[170534],"publication":[],"video":[],"download":[],"msr_publication_type":"techreport","related_content":{"projects":[{"ID":170534,"post_title":"NetMedic: Detailed and Understandable Network Diagnosis","post_name":"netmedic-detailed-and-understandable-network-diagnosis","post_type":"msr-project","post_date":"2010-08-19 11:13:25","post_modified":"2024-10-02 16:41:33","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/netmedic-detailed-and-understandable-network-diagnosis\/","post_excerpt":"NetMedic helps operators perform detailed diagnosis in computer networks. It diagnoses not only generic faults (e.g., performance-related) but also application-specific faults (e.g., error codes). It identifies culprits at a fine granularity such as a process or firewall configuration. Our work focuses on both the algorithmic aspects of detailed diagnosis as well as the important task of explaining diagnostic reasoning to the operator. 
Talks Detailed and understandable network diagnosis University of Wisconsin, Nov 2009; Georgia…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/170534"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/165106"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/165106\/revisions"}],"predecessor-version":[{"id":529669,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/165106\/revisions\/529669"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=165106"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=165106"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=165106"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=165106"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=165106"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=165106"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=165106"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/
v2\/msr-platform?post=165106"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=165106"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=165106"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=165106"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=165106"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=165106"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=165106"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=165106"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=165106"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}