DeepMerge: Learning to Merge Programs
- Elizabeth Dinella
- Todd Mytkowicz
- Alexey Svyatkovskiy
- Christian Bird
- Mayur Naik
- Shuvendu Lahiri
In collaborative software development, program merging is the mechanism to integrate changes from multiple programmers. Merge algorithms in modern version control systems report a conflict when changes interfere textually. Merge conflicts require manual intervention and frequently stall modern continuous integration pipelines. Prior work found that, although resolving conflicts is costly, a large majority of resolutions involve re-arranging text without writing any new code. Inspired by this observation, we propose the first data-driven approach to resolve merge conflicts with a machine learning model. We realize our approach in a tool, DeepMerge, that uses a novel combination of (i) an edit-aware embedding of merge inputs and (ii) a variation of pointer networks to construct resolutions from input segments. We also propose an algorithm to localize manual resolutions in a resolved file and employ it to curate a ground-truth dataset comprising 8,719 non-trivial resolutions in JavaScript programs. Our evaluation shows that, on a held-out test set, DeepMerge can predict correct resolutions for 37% of non-trivial merges, compared to only 4% by a state-of-the-art semistructured merge technique. Furthermore, on the subset of merges with up to 3 lines (comprising 24% of the total dataset), DeepMerge can predict correct resolutions with 78% accuracy.
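To make the observation about re-arranged text concrete, the following minimal Python sketch shows how a resolution can be expressed as a sequence of indices ("pointers") into the lines of the two conflicting versions. It is an illustration only, not the DeepMerge model: the function names `input_segments`, `pointers_for`, and `build_resolution` and the example conflict are hypothetical, and the indices that a learned pointer network would predict are here recovered by a simple lookup.

```python
from typing import List, Optional, Sequence


def input_segments(a_lines: Sequence[str], b_lines: Sequence[str]) -> List[str]:
    """Pool the lines of both conflicting versions into one indexable list."""
    return list(a_lines) + list(b_lines)


def pointers_for(segments: Sequence[str], resolution: Sequence[str]) -> Optional[List[int]]:
    """Return indices into `segments` that reproduce `resolution`.

    Returns None when the resolution contains a line that appears in
    neither input, i.e. when it is not a pure re-arrangement.
    """
    pointers = []
    for line in resolution:
        if line not in segments:
            return None
        pointers.append(segments.index(line))  # first matching line
    return pointers


def build_resolution(segments: Sequence[str], pointers: Sequence[int]) -> List[str]:
    """Construct a resolution by copying the pointed-to input lines in order."""
    return [segments[i] for i in pointers]


# Hypothetical conflict: each side added a different import after a shared one.
a = ["import a from 'a';", "import c from 'c';"]
b = ["import a from 'a';", "import b from 'b';"]
pool = input_segments(a, b)

# A manual resolution that keeps both additions, in alphabetical order.
resolved = ["import a from 'a';", "import b from 'b';", "import c from 'c';"]
ptrs = pointers_for(pool, resolved)
```

In this example every line of the manual resolution already exists in one of the two inputs, so the whole resolution is captured by a short list of indices; a resolution that introduces a new line has no such encoding, and `pointers_for` returns `None` for it.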