{"id":418838,"date":"2017-08-04T10:08:59","date_gmt":"2017-08-04T17:08:59","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=418838"},"modified":"2017-08-04T10:08:59","modified_gmt":"2017-08-04T17:08:59","slug":"program-repairs-programs-achieve-78-3-percent-precision-automated-program-repair","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/program-repairs-programs-achieve-78-3-percent-precision-automated-program-repair\/","title":{"rendered":"Program that repairs programs: how to achieve 78.3 percent precision in automated program repair"},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-418847\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/custom-code-3-1260x539.jpg\" alt=\"\" width=\"1260\" height=\"539\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/custom-code-3-1260x539.jpg 1260w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/custom-code-3-1260x539-300x128.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/custom-code-3-1260x539-768x329.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/custom-code-3-1260x539-1024x438.jpg 1024w\" sizes=\"(max-width: 1260px) 100vw, 1260px\" \/><\/p>\n<p><em>By <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/lisu\/#\">Lily Sun<\/a>, Research Program Manager of Microsoft Research Asia<\/em><\/p>\n<p>In February 2017, Microsoft and Cambridge University announced a <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/deepcoder-learning-write-programs\/\">DeepCoder algorithm<\/a> that produces programs from problem inputs\/outputs. DeepCoder, which operates on a novel yet greatly simplified programming language, cannot handle complex problems\u2014general programming languages are still too hard for DeepCoder to master. So, currently, programmers don\u2019t have to worry about being replaced by machines.<\/p>\n<p>But programmers have plenty of other worries, including programming bugs. Could machines assist programmers by taking over the task of bug fixes?<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-419009 aligncenter\" style=\"border: 1px solid black;\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/debugging_bothers_more_than_coding_1200x.png\" alt=\"\" width=\"1200\" height=\"669\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/debugging_bothers_more_than_coding_1200x.png 1200w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/debugging_bothers_more_than_coding_1200x-300x167.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/debugging_bothers_more_than_coding_1200x-768x428.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/debugging_bothers_more_than_coding_1200x-1024x571.png 1024w\" sizes=\"(max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p>To test this idea, researchers from Peking University, Microsoft Research Asia (MSRA) and University of Electronic Science and Technology of China (UESTC) have developed a new approach, called Accurate Condition System (ACS), to automatically repair defects in software systems without human intervention.<\/p>\n<p>For example, consider the following piece of code from Apache Math, which is used to calculate the least common multiplier from two numbers. This piece of code uses Math.abs to ensure the return value is positive. However, code defects may still return negative results for some input values.<\/p>\n<pre style=\"padding-left: 30px;\">int lcm=Math.abs(mulAndCheck(a\/gdc(a,b), b));\r\n return lcm;<\/pre>\n<p>The root cause of this error is that there is one more negative number than there are positive numbers in the range of signed integers. As a result, when the value passed to Math.abs is Integer.MIN_VALUE, Math.abs cannot convert the input into a positive number, causing a negative return. A correct implementation should throw ArithmeticException() at such cases.<\/p>\n<p>We could create a test to capture this fault. The input of the test is a=Integer.MIN_VALUE and b=1, and the expected output is to throw ArithmeticException. Obviously, the program fails the test because no exception will be thrown.<\/p>\n<p>But when we pass this program and the corresponding tests to ACS, the following path is generated, which precisely repairs the defect:<\/p>\n<pre style=\"padding-left: 30px;\">int lcm=Math.abs(mulAndCheck(a\/gdc(a,b), b));\r\n + if (lcm == Integer.MIN_VALUE) {\r\n + \u00a0throw new ArithmeticException();\r\n + }\r\n return lcm;<\/pre>\n<p>This latest approach stems from a legacy of historic program repair approaches. Since Genprog in 2009, there have been many different approaches offered to repair programs. However, these techniques have a significant problem: typically only a very small portion of generated patches are correct, that is, low precision in patch generation. This is because traditional program repair approaches aim for passing all the tests. However, in real-world software systems, there are only a limited number of tests, and passing the tests does not mean that the program is correct.<\/p>\n<p>For example, current approaches may generate the following patch for the above program:<\/p>\n<pre style=\"padding-left: 30px;\">\u00a0int lcm=Math.abs(mulAndCheck(a\/gdc(a,b), b));\r\n + if (b == 1) {\r\n + \u00a0throw new ArithmeticException();\r\n + }\r\n return lcm;<\/pre>\n<p>Or even the following patch:<\/p>\n<pre style=\"padding-left: 30px;\">\u00a0int lcm=Math.abs(mulAndCheck(a\/gdc(a,b), b));\r\n - return lcm;\r\n + throw new ArithmeticException();<\/pre>\n<p>All these patches can pass the test, but are far from being correct. In fact, we can easily construct hundreds of patches to pass the test in this example. Yet few would be of high precision.<\/p>\n<p>Based on the findings of Prof. Martin Rinard\u2019s group at MIT, revealed in an ISSTA 2015 paper, the precisions of mainstream program repair approaches are less than 10 percent. Though some improved approaches have been proposed, such as Prophet and Angelix, the precision of these approaches remains less than 40 percent. In other words, the patches generated by these approaches are mostly incorrect, rendering these approaches difficult to use in practice.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-418859 aligncenter\" style=\"border: 1px solid black;\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/ACS-aims-to-repair-programs-with-high-precision.jpg\" alt=\"\" width=\"554\" height=\"293\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/ACS-aims-to-repair-programs-with-high-precision.jpg 554w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/ACS-aims-to-repair-programs-with-high-precision-300x159.jpg 300w\" sizes=\"(max-width: 554px) 100vw, 554px\" \/><\/p>\n<p>The precision of ACS is a significant improvement over previous approaches. Based on the result over the Defects4J benchmark, ACS produced 23 patches, of which 18 (almost 80 percent) are correct\u2014a significantly better result than existing approaches. In addition, the number of defects correctly repaired by ACS is also the highest among the approaches evaluated on Defects4J.<\/p>\n<p>The key to ACS\u2019s high precision is the use of multiple information sources, especially the \u201cbig code\u201d existing on the Internet. Compared with existing techniques, ACS uses three new types of information sources.<\/p>\n<ul>\n<li>First, researchers noticed that the principle of locality holds on variable uses, and apply such information to sort the variables to be used in the patches.<\/li>\n<li>Second, ACS uses natural language analysis techniques to analyze Javadoc, and then uses the information in Javadoc to filter incorrect patches.<\/li>\n<li>Last, and most importantly, ACS performs statistical analysis on the open-source program on the Internet, discovers the conditional probabilities of the operations over the variables and further generates correct patches.<\/li>\n<\/ul>\n<p>In the above example, ACS first learns from the failed test that a conditional check throwing ArithmeticException is missing, and then uses the principle of locality to determine that variable lcm should be used in the conditional check. Lastly, based on the conditional probability, it determines \u201c==Integer.MIN_VALUE\u201d should be applied to lcm, generating the whole patch.<\/p>\n<div id=\"attachment_419012\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-419012\" class=\"wp-image-419012 size-full\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/microsoft_research_asia_1200x800.jpg\" alt=\"\" width=\"1200\" height=\"800\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/microsoft_research_asia_1200x800.jpg 1200w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/microsoft_research_asia_1200x800-300x200.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/microsoft_research_asia_1200x800-768x512.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/microsoft_research_asia_1200x800-1024x683.jpg 1024w\" sizes=\"(max-width: 1200px) 100vw, 1200px\" \/><p id=\"caption-attachment-419012\" class=\"wp-caption-text\">Left: Shi Han, lead researcher, Microsoft Research Asia; Middle: Lily Sun, Research Program Manager, Microsoft Research Asia; Right: Prof. Yingfei Xiong, Peking University.<\/p><\/div>\n<p>The paper describing ACS \u201c<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/precise-condition-synthesis-program-repair\/\">Precise Condition Synthesis for Program Repair<\/a>\u201d has been published at ICSE 2017. The authors include Yingfei Xiong, Jie Wang, Guang Huang and Lu Zhang from Peking University, Runfa Yan from UESTC and <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/shihan\/#\">Shi Han<\/a> from Microsoft Research Asia.<\/p>\n<p><strong>Related<\/strong>:<\/p>\n<ul>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/precise-condition-synthesis-program-repair\/\">Precise Condition Synthesis for Program Repair<\/a><\/li>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/deepcoder-learning-write-programs\/\">DeepCoder: Learning to Write Programs<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>By Lily Sun, Research Program Manager of Microsoft Research Asia In February 2017, Microsoft and Cambridge University announced a DeepCoder algorithm that produces programs from problem inputs\/outputs. DeepCoder, which operates on a novel yet greatly simplified programming language, cannot handle complex problems\u2014general programming languages are still too hard for DeepCoder to master. So, currently, programmers [&hellip;]<\/p>\n","protected":false},"author":36509,"featured_media":418847,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"footnotes":""},"categories":[194488],"tags":[],"research-area":[13560],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-418838","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-program-languages-and-software-engineering","msr-research-area-programming-languages-software-engineering","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[199560],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[144847],"related-projects":[],"related-events":[],"related-researchers":[],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"411\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/custom-code-3-1260x539.jpg\" class=\"img-object-cover\" alt=\"\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/custom-code-3-1260x539.jpg 1260w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/custom-code-3-1260x539-300x128.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/custom-code-3-1260x539-768x329.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2017\/08\/custom-code-3-1260x539-1024x438.jpg 1024w\" sizes=\"(max-width: 960px) 100vw, 960px\" \/>","byline":"","formattedDate":"August 4, 2017","formattedExcerpt":"By Lily Sun, Research Program Manager of Microsoft Research Asia In February 2017, Microsoft and Cambridge University announced a DeepCoder algorithm that produces programs from problem inputs\/outputs. DeepCoder, which operates on a novel yet greatly simplified programming language, cannot handle complex problems\u2014general programming languages are&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/418838"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/36509"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=418838"}],"version-history":[{"count":16,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/418838\/revisions"}],"predecessor-version":[{"id":419063,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/418838\/revisions\/419063"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/418847"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=418838"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=418838"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=418838"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=418838"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=418838"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=418838"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=418838"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=418838"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=418838"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=418838"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=418838"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}