A Differential Testing Approach for Evaluating Abstract Syntax Tree Mapping Algorithms | Technical Track
Thu 27 May 2021 23:50 - 00:10 at Blended Sessions Room 1 - 3.2.1. Programming: Code Analysis Algorithms
Abstract syntax tree (AST) mapping algorithms are widely used to analyze changes in source code. Despite the foundational role of AST mapping algorithms, little effort has been made to evaluate the accuracy of AST mapping algorithms, i.e., the extent to which an algorithm captures the evolution of code. We observe that a program element often has only one best-mapped program element. Based on this observation, we propose a hierarchical approach to automatically compare the similarity of mapped statements and tokens by different algorithms. By performing the comparison, we determine if each of the compared algorithms generates inaccurate mappings for a statement or its tokens. We invite 12 external experts to determine if three commonly used AST mapping algorithms generate accurate mappings for a statement and its tokens for 200 statements. Based on the experts’ feedback, we observe that our approach achieves a precision of 0.98–1.00 and a recall of 0.65–0.75. Furthermore, we conduct a large-scale study with a dataset of ten Java projects containing a total of 263,165 file revisions. Our approach determines that GumTree, MTDiff and IJM generate inaccurate mappings for 20%–29%, 25%–36% and 21%–30% of the file revisions, respectively. Our experimental results show that state-of-the-art AST mapping algorithms still need improvements.
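The core idea of the abstract, that a statement usually has only one best-mapped counterpart, so competing algorithms can be checked against each other, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `similarity` and `flag_inaccurate`, the use of `difflib.SequenceMatcher` as the similarity measure, and the toy token lists are all assumptions made for illustration, and the hierarchical statement-then-token comparison described in the paper is reduced here to a single statement-level check.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def similarity(src_tokens: List[str], dst_tokens: Optional[List[str]]) -> float:
    """Token-level similarity in [0, 1]; an unmapped statement scores 0.

    SequenceMatcher is an assumed stand-in for whatever similarity
    measure a real evaluation would use.
    """
    if dst_tokens is None:
        return 0.0
    return SequenceMatcher(None, src_tokens, dst_tokens).ratio()


def flag_inaccurate(
    src_tokens: List[str],
    mappings: Dict[str, Optional[List[str]]],
) -> List[str]:
    """Return the algorithms whose mapped statement is strictly less
    similar to the source statement than the best mapping found by any
    of the compared algorithms (hypothetical decision rule)."""
    scores = {name: similarity(src_tokens, dst) for name, dst in mappings.items()}
    best = max(scores.values())
    return sorted(name for name, score in scores.items() if score < best)


# Hypothetical example data: the algorithm names are only labels here,
# and the mapped statements are invented, not results from the study.
src = ["int", "x", "=", "a", "+", "b", ";"]
mappings = {
    "GumTree": ["int", "x", "=", "a", "+", "c", ";"],
    "MTDiff": ["return", "x", ";"],
    "IJM": ["int", "x", "=", "a", "+", "c", ";"],
}
print(flag_inaccurate(src, mappings))  # prints ['MTDiff']
```

Because the check only compares algorithms with one another, it cannot flag a statement that every algorithm maps equally badly, which is one plausible reason a differential approach would trade recall for high precision.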
Thu 27 May | Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
11:50 - 13:10 | 3.2.1. Programming: Code Analysis Algorithms | Journal-First Papers / Technical Track / SEIP - Software Engineering in Practice | at Blended Sessions Room 1 +12h | Chair(s): Giuseppe Scanniello (University of Basilicata)
11:50 | 20m | Paper | A Differential Testing Approach for Evaluating Abstract Syntax Tree Mapping Algorithms | Technical Track | Yuanrui Fan (College of Computer Science and Technology, Zhejiang University), Xin Xia (Huawei Software Engineering Application Technology Lab), David Lo (Singapore Management University), Ahmed E. Hassan (School of Computing, Queen's University), Yuan Wang (Huawei Sweden Research Center), Shanping Li (Zhejiang University)
12:10 | 20m | Paper | InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees | Technical Track | Nghi D. Q. Bui (Singapore Management University, Singapore), Yijun Yu (The Open University, UK), Lingxiao Jiang (Singapore Management University)
12:30 | 20m | Paper | Modular Tree Network for Source Code Representation Learning | Journal-First Papers | Wenhan Wang (Peking University), Ge Li (Peking University), Sijie Shen (Peking University), Xin Xia (Huawei Software Engineering Application Technology Lab), Zhi Jin (Peking University)
12:50 | 20m | Paper | Case Study on Data-driven Deployment of Program Analysis on an Open Tools Stack | SEIP - Software Engineering in Practice | Anton Ljungberg (Lund University), David Åkerman (Axis Communications), Emma Söderberg (Lund University), Gustaf Lundh (Axis Communications), Jon Sten (Axis Communications), Luke Church (University of Cambridge; Lund University; Lark Systems)
23:50 - 01:10 | 3.2.1. Programming: Code Analysis Algorithms | SEIP - Software Engineering in Practice / Journal-First Papers / Technical Track | at Blended Sessions Room 1
23:50 | 20m | Paper | A Differential Testing Approach for Evaluating Abstract Syntax Tree Mapping Algorithms | Technical Track | Yuanrui Fan (College of Computer Science and Technology, Zhejiang University), Xin Xia (Huawei Software Engineering Application Technology Lab), David Lo (Singapore Management University), Ahmed E. Hassan (School of Computing, Queen's University), Yuan Wang (Huawei Sweden Research Center), Shanping Li (Zhejiang University)
00:10 | 20m | Paper | InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees | Technical Track | Nghi D. Q. Bui (Singapore Management University, Singapore), Yijun Yu (The Open University, UK), Lingxiao Jiang (Singapore Management University)
00:30 | 20m | Paper | Modular Tree Network for Source Code Representation Learning | Journal-First Papers | Wenhan Wang (Peking University), Ge Li (Peking University), Sijie Shen (Peking University), Xin Xia (Huawei Software Engineering Application Technology Lab), Zhi Jin (Peking University)
00:50 | 20m | Paper | Case Study on Data-driven Deployment of Program Analysis on an Open Tools Stack | SEIP - Software Engineering in Practice | Anton Ljungberg (Lund University), David Åkerman (Axis Communications), Emma Söderberg (Lund University), Gustaf Lundh (Axis Communications), Jon Sten (Axis Communications), Luke Church (University of Cambridge; Lund University; Lark Systems)