Efficient Customer Issue Triage via Linking with System Incidents

ESEC/FSE 2020 Industry

In cloud service systems, customers report the service issues they encounter to cloud service providers. Quick troubleshooting of a Customer-reported Issue (CI) is critical. To this end, each CI should be assigned to its responsible team accurately and in a timely manner.

Our industrial experience shows that linking CIs with detected system incidents can help CI triage. In particular, our empirical study on 7 real cloud service systems shows that, with additional information about system incidents, customer issue triage can be accelerated by 13.1X on average. To improve the efficiency of customer issue triage, in this paper we propose LinkCM, a learning-based approach to automatically Link Customer-reported issues to Monitor-reported system incidents. LinkCM incorporates a novel learning-based model that effectively extracts related information from the two resources, together with a transfer learning strategy that helps LinkCM achieve better performance without a huge amount of data. The experimental results indicate that LinkCM achieves more accurate link prediction than its two variants. Furthermore, case studies demonstrate how LinkCM can help the customer issue triage procedure in real production cloud service systems.