{"id":664530,"date":"2020-06-05T07:01:30","date_gmt":"2020-06-05T14:01:30","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=664530"},"modified":"2021-03-17T16:13:38","modified_gmt":"2021-03-17T23:13:38","slug":"identifying-linked-incidents-in-large-scale-online-service-systems","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/identifying-linked-incidents-in-large-scale-online-service-systems\/","title":{"rendered":"Identifying linked incidents in large-scale online service systems"},"content":{"rendered":"
In large-scale online service systems, incidents occur frequently due to a variety of causes, from updates of software and hardware to changes in operation environment. These incidents could significantly degrade system\u2019s availability and customers\u2019 satisfaction. Some incidents are linked because they are duplicate or inter-related. The linked incidents can greatly help on-call engineers find mitigation solutions and identify the root causes. In this work, we investigate the incidents and their links in a real-world incident management (IcM) system of a company providing large online services. Based on the identified indicators of linked incidents, we further propose LiDAR(Linked Incident identification with DAta-driven Representation), a deep learning based approach to incident linking. More specifically, we incorporate the textual description of incidents and structural information extracted from historical linked incidents to identify possible links among a large number of incidents. To show the effectiveness of our method, we apply our method to a real-world IcM system and find that our method outperforms other state-of-the-art methods.<\/p>\n","protected":false},"excerpt":{"rendered":"
In large-scale online service systems, incidents occur frequently due to a variety of causes, from updates of software and hardware to changes in operation environment. These incidents could significantly degrade system\u2019s availability and customers\u2019 satisfaction. Some incidents are linked because they are duplicate or inter-related. The linked incidents can greatly help on-call engineers find mitigation […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13561,13556,13563,13560],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[246694,246691,248116,246658,253696,248326],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-664530","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-algorithms","msr-research-area-artificial-intelligence","msr-research-area-data-platform-analytics","msr-research-area-programming-languages-software-engineering","msr-locale-en_us","msr-field-of-study-artificial-intelligence","msr-field-of-study-computer-science","msr-field-of-study-data-science","msr-field-of-study-deep-learning","msr-field-of-study-incident-management-itsm","msr-field-of-study-software"],"msr_publishername":"ACM","msr_edition":"","msr_affiliation":"","msr_published_date":"2020-11-7","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2020\/09\/fse20main-p978-p-6ba9c67-submitted.pdf","id":"695616","title":"fse20main-p978-p-6ba9c67-submitted","label_id":"243109","label":0},{"type":"doi","viewUrl":"false","id":"false","title":"10.1145\/3368089.3409768","label_id":"243106","label":0}],"msr_related_uploader":"","msr_attachments":[{"id":664533,"url":"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2020\/06\/placeholder5.docx"}],"msr-author-ordering":[{"type":"text","value":"Yujun Chen","user_id":0,"rest_url":false},{"type":"text","value":"Xian Yang","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Hang Dong","user_id":39687,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Hang Dong"},{"type":"text","value":"Xiaoting He","user_id":0,"rest_url":false},{"type":"text","value":"Hongyu Zhang","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Qingwei Lin","user_id":33318,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Qingwei Lin"},{"type":"text","value":"Junjie Chen","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Pu Zhao","user_id":38886,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Pu Zhao"},{"type":"user_nicename","value":"Yu Kang","user_id":39381,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Yu Kang"},{"type":"user_nicename","value":"Feng Gao","user_id":31797,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Feng Gao"},{"type":"text","value":"Zhangwei Xu","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Dongmei Zhang","user_id":31665,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Dongmei Zhang"}],"msr_impact_theme":[],"msr_research_lab":[199560],"msr_event":[],"msr_group":[714577],"msr_project":[],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","related_content":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/664530"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":4,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/664530\/revisions"}],"predecessor-version":[{"id":734551,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/664530\/revisions\/734551"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=664530"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=664530"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=664530"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=664530"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=664530"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=664530"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=664530"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=664530"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=664530"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=664530"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=664530"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=664530"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=664530"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=664530"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=664530"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=664530"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}