{"id":168252,"date":"2015-04-01T00:00:00","date_gmt":"2015-04-01T00:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/the-impact-of-tangled-code-changes-on-defect-prediction-models\/"},"modified":"2018-10-16T21:14:47","modified_gmt":"2018-10-17T04:14:47","slug":"the-impact-of-tangled-code-changes-on-defect-prediction-models","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/the-impact-of-tangled-code-changes-on-defect-prediction-models\/","title":{"rendered":"The impact of tangled code changes on defect prediction models"},"content":{"rendered":"
When interacting with source control management system, developers often commit unrelated or loosely related code changes in a single transaction. When analyzing version histories, such tangled changes will make all changes to all modules appear related, possibly compromising the resulting analyses through noise and bias. In an investigation of five open-source Java projects, we found between 7 % and 20 % of all bug fixes to consist of multiple tangled changes. Using a multi-predictor approach to untangle changes, we show that on average at least 16.6 % of all source files are incorrectly associated with bug reports. These incorrect bug file associations seem to not significantly impact models classifying source files to have at least one bug or no bugs. But our experiments show that untangling tangled code changes can result in more accurate regression bug prediction models when compared to models trained and tested on tangled bug datasets\u2014in our experiments, the statistically significant accuracy improvements lies between 5 % and 200 %. We recommend better change organization to limit the impact of tangled changes.<\/p>\n","protected":false},"excerpt":{"rendered":"
When interacting with source control management system, developers often commit unrelated or loosely related code changes in a single transaction. When analyzing version histories, such tangled changes will make all changes to all modules appear related, possibly compromising the resulting analyses through noise and bias. In an investigation of five open-source Java projects, we found […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[{"type":"user_nicename","value":"kimh"}],"msr_publishername":"Springer US","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"Empirical Software Engineering","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"Empirical Software Engineering","msr_number":"","msr_organization":"","msr_pages_string":"1-34","msr_page_range_start":"1","msr_page_range_end":"34","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2015-04-01","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"http:\/\/dx.doi.org\/10.1007\/s10664-015-9376-6","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":2015,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13563,13560,13555],"msr-publication-type":[193715],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-168252","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-data-platform-analytics","msr-research-area-programming-languages-software-engineering","msr-research-area-search-information-retrieval","msr-locale-en_us"],"msr_publishername":"Springer US","msr_edition":"Empirical Software Engineering","msr_affiliation":"","msr_published_date":"2015-04-01","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"1-34","msr_chapter":"","msr_isbn":"","msr_journal":"Empirical Software Engineering","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"http:\/\/dx.doi.org\/10.1007\/s10664-015-9376-6","msr_doi":"","msr_publication_uploader":[{"type":"url","title":"http:\/\/dx.doi.org\/10.1007\/s10664-015-9376-6","viewUrl":false,"id":false,"label_id":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[{"id":0,"url":"http:\/\/dx.doi.org\/10.1007\/s10664-015-9376-6"}],"msr-author-ordering":[{"type":"user_nicename","value":"kimh","user_id":32548,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=kimh"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[144812,144860],"msr_project":[170993],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"article","related_content":{"projects":[{"ID":170993,"post_title":"Tools for Software Engineers","post_name":"tools-for-software-engineers","post_type":"msr-project","post_date":"2012-06-29 06:20:39","post_modified":"2021-11-12 09:09:39","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/tools-for-software-engineers\/","post_excerpt":"The mission of Microsoft's One Engineering System (formerly known as Tools for Software Engineers) team is to enable the world's best product engineering teams with world-class tools and systems that help them ship products their customers love. 1ES provides tools and services to cover the full spectrum of the engineering life-cycle, from the developer desktop to product deployment. 1ES focuses on engineering solutions that mitigate the unique scale challenges that Microsoft teams face, both in…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/170993"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/168252","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/168252\/revisions"}],"predecessor-version":[{"id":534151,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/168252\/revisions\/534151"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=168252"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=168252"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=168252"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=168252"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=168252"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=168252"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=168252"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=168252"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=168252"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=168252"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=168252"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=168252"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=168252"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}