Predicting Defective Lines Using a Model-Agnostic Technique (Journal-First)
Thu 27 May 2021 22:40 - 23:00 at Blended Sessions Room 3 - 3.1.3. Defect Prediction: Automation #2
Defect prediction models are proposed to help a team prioritize the source code files that need Software Quality Assurance (SQA) based on their likelihood of having defects. However, developers may waste effort inspecting a whole file when only a small fraction of its source code lines are defective. Indeed, we find that as little as 1%-3% of the lines of a file are defective. Hence, in this work, we propose a novel framework (called LINE-DP) to identify defective lines using a model-agnostic technique, i.e., an Explainable AI technique that provides information about why the model makes a given prediction. Broadly speaking, LINE-DP first builds a file-level defect model using code token features. Then, LINE-DP uses a state-of-the-art model-agnostic technique (i.e., LIME) to identify risky tokens, i.e., code tokens that lead the file-level defect model to predict that the file will be defective. Finally, the lines that contain risky tokens are predicted as defective lines. Through a case study of 32 releases of nine Java open source systems, our evaluation results show that LINE-DP achieves an average recall of 0.61, a false alarm rate of 0.47, a top 20% LOC recall of 0.27, and an initial false alarm of 16, which are statistically better than six baseline approaches. Our evaluation also shows that LINE-DP requires an average computation time of 10 seconds, including model construction and defective line identification time. In addition, we find that 63% of the defective lines that can be identified by LINE-DP are related to common defects (e.g., argument change, condition change). These results suggest that LINE-DP can effectively identify defective lines that contain common defects while requiring a smaller amount of inspection effort and a manageable computation cost. This paper is an important step towards line-level defect prediction by leveraging a model-agnostic technique.
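The three steps named in the abstract (file-level model on token features, risky-token identification, line flagging) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy files, the smoothed log-odds model, the `top_k` cutoff, and all function names are assumptions made for illustration, and a simple leave-one-token-out probability drop stands in for LIME.

```python
import math
import re
from collections import Counter

TOKEN = re.compile(r"[A-Za-z_]\w*")

def tokenize(text):
    """Split source text into identifier-like code tokens."""
    return TOKEN.findall(text)

def train_file_model(files, labels):
    """Step 1 (assumed model): per-token log-odds from file-level labels,
    using Laplace-smoothed token-presence counts."""
    defective, clean = Counter(), Counter()
    for lines, label in zip(files, labels):
        (defective if label else clean).update(set(tokenize("\n".join(lines))))
    n_def = sum(labels)
    n_clean = len(labels) - n_def
    return {
        tok: math.log((defective[tok] + 1) / (n_def + 2))
        - math.log((clean[tok] + 1) / (n_clean + 2))
        for tok in set(defective) | set(clean)
    }

def predict_proba(weights, tokens):
    """Probability that a file with these tokens is defective."""
    score = sum(weights.get(tok, 0.0) for tok in set(tokens))
    return 1.0 / (1.0 + math.exp(-score))

def risky_tokens(weights, lines, top_k=3):
    """Step 2 (stand-in for LIME): rank tokens by how much the predicted
    probability drops when the token is removed from the file."""
    tokens = tokenize("\n".join(lines))
    base = predict_proba(weights, tokens)
    drops = {
        tok: base - predict_proba(weights, [t for t in tokens if t != tok])
        for tok in set(tokens)
    }
    ranked = sorted(drops.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tok for tok, drop in ranked[:top_k] if drop > 1e-9]

def flag_lines(lines, risky):
    """Step 3: flag lines containing risky tokens, most risky tokens first."""
    risky = set(risky)
    flagged = []
    for number, line in enumerate(lines, start=1):
        hits = sum(1 for tok in tokenize(line) if tok in risky)
        if hits:
            flagged.append((number, hits))
    return sorted(flagged, key=lambda item: (-item[1], item[0]))

# Hypothetical training data: each file is a list of lines, label 1 = defective.
train_files = [
    ["total = compute(a, b)", "cache.unsafe_put(key, total)"],
    ["total = compute(a, b)", "cache.put(key, total)"],
    ["value = legacy_parse(text)", "return value"],
    ["value = parse(text)", "return value"],
]
train_labels = [1, 0, 1, 0]

new_file = [
    "value = legacy_parse(text)",
    "total = compute(value, b)",
    "cache.unsafe_put(key, total)",
    "return total",
]

weights = train_file_model(train_files, train_labels)
file_probability = predict_proba(weights, tokenize("\n".join(new_file)))
risky = risky_tokens(weights, new_file)
flagged = flag_lines(new_file, risky)
print(file_probability, risky, flagged)
```

In this toy setting only the tokens that appear exclusively in defective training files raise the file-level prediction, so only the lines containing them are flagged. A closer reproduction would swap the stand-in scoring for an actual LIME explainer over a trained classifier.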
Thu 27 May (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
10:00 - 11:00 | 3.1.3. Defect Prediction: Automation #2 (Journal-First Papers) at Blended Sessions Room 3 +12h
Chair(s): Robert Feldt (Chalmers | University of Gothenburg, Blekinge Institute of Technology)
10:00 (20m) Paper | Revisiting Supervised and Unsupervised Methods for Effort-Aware Cross-Project Defect Prediction (Journal-First). Chao Ni (Zhejiang University), Xin Xia (Huawei Software Engineering Application Technology Lab), David Lo (Singapore Management University), Xiang Chen (Nantong University), Qing Gu (Nanjing University)
10:20 (20m) Paper | Ammonia: an Approach for Deriving Project-Specific Bug Patterns (Journal-First). Yoshiki Higo (Osaka University), Shinpei Hayashi (Tokyo Institute of Technology), Hideaki Hata (Shinshu University), Mei Nagappan (University of Waterloo)
10:40 (20m) Paper | Predicting Defective Lines Using a Model-Agnostic Technique (Journal-First). Supatsara Wattanakriengkrai (Nara Institute of Science and Technology), Patanamon Thongtanunam (University of Melbourne), Kla Tantithamthavorn (Monash University), Hideaki Hata (Shinshu University), Kenichi Matsumoto (Nara Institute of Science and Technology)