{"id":418838,"date":"2017-08-04T10:08:59","date_gmt":"2017-08-04T17:08:59","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=418838"},"modified":"2017-08-04T10:08:59","modified_gmt":"2017-08-04T17:08:59","slug":"program-repairs-programs-achieve-78-3-percent-precision-automated-program-repair","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/program-repairs-programs-achieve-78-3-percent-precision-automated-program-repair\/","title":{"rendered":"Program that repairs programs: how to achieve 78.3 percent precision in automated program repair"},"content":{"rendered":"

\"\"<\/p>\n

By Lily Sun<\/a>, Research Program Manager of Microsoft Research Asia<\/em><\/p>\n

In February 2017, Microsoft and Cambridge University announced a DeepCoder algorithm<\/a> that produces programs from problem inputs\/outputs. DeepCoder, which operates on a novel yet greatly simplified programming language, cannot handle complex problems\u2014general programming languages are still too hard for DeepCoder to master. So, currently, programmers don\u2019t have to worry about being replaced by machines.<\/p>\n

But programmers have plenty of other worries, including programming bugs. Could machines assist programmers by taking over the task of bug fixes?<\/p>\n

\"\"<\/p>\n

To test this idea, researchers from Peking University, Microsoft Research Asia (MSRA) and University of Electronic Science and Technology of China (UESTC) have developed a new approach, called Accurate Condition System (ACS), to automatically repair defects in software systems without human intervention.<\/p>\n

For example, consider the following piece of code from Apache Math, which is used to calculate the least common multiplier from two numbers. This piece of code uses Math.abs to ensure the return value is positive. However, code defects may still return negative results for some input values.<\/p>\n

int lcm=Math.abs(mulAndCheck(a\/gdc(a,b), b));\r\n return lcm;<\/pre>\n

The root cause of this error is that there is one more negative number than there are positive numbers in the range of signed integers. As a result, when the value passed to Math.abs is Integer.MIN_VALUE, Math.abs cannot convert the input into a positive number, causing a negative return. A correct implementation should throw ArithmeticException() at such cases.<\/p>\n

We could create a test to capture this fault. The input of the test is a=Integer.MIN_VALUE and b=1, and the expected output is to throw ArithmeticException. Obviously, the program fails the test because no exception will be thrown.<\/p>\n

But when we pass this program and the corresponding tests to ACS, the following path is generated, which precisely repairs the defect:<\/p>\n

int lcm=Math.abs(mulAndCheck(a\/gdc(a,b), b));\r\n + if (lcm == Integer.MIN_VALUE) {\r\n + \u00a0throw new ArithmeticException();\r\n + }\r\n return lcm;<\/pre>\n

This latest approach stems from a legacy of historic program repair approaches. Since Genprog in 2009, there have been many different approaches offered to repair programs. However, these techniques have a significant problem: typically only a very small portion of generated patches are correct, that is, low precision in patch generation. This is because traditional program repair approaches aim for passing all the tests. However, in real-world software systems, there are only a limited number of tests, and passing the tests does not mean that the program is correct.<\/p>\n

For example, current approaches may generate the following patch for the above program:<\/p>\n

\u00a0int lcm=Math.abs(mulAndCheck(a\/gdc(a,b), b));\r\n + if (b == 1) {\r\n + \u00a0throw new ArithmeticException();\r\n + }\r\n return lcm;<\/pre>\n

Or even the following patch:<\/p>\n

\u00a0int lcm=Math.abs(mulAndCheck(a\/gdc(a,b), b));\r\n - return lcm;\r\n + throw new ArithmeticException();<\/pre>\n

All these patches can pass the test, but are far from being correct. In fact, we can easily construct hundreds of patches to pass the test in this example. Yet few would be of high precision.<\/p>\n

Based on the findings of Prof. Martin Rinard\u2019s group at MIT, revealed in an ISSTA 2015 paper, the precisions of mainstream program repair approaches are less than 10 percent. Though some improved approaches have been proposed, such as Prophet and Angelix, the precision of these approaches remains less than 40 percent. In other words, the patches generated by these approaches are mostly incorrect, rendering these approaches difficult to use in practice.<\/p>\n

\"\"<\/p>\n

The precision of ACS is a significant improvement over previous approaches. Based on the result over the Defects4J benchmark, ACS produced 23 patches, of which 18 (almost 80 percent) are correct\u2014a significantly better result than existing approaches. In addition, the number of defects correctly repaired by ACS is also the highest among the approaches evaluated on Defects4J.<\/p>\n

The key to ACS\u2019s high precision is the use of multiple information sources, especially the \u201cbig code\u201d existing on the Internet. Compared with existing techniques, ACS uses three new types of information sources.<\/p>\n