On the Lack of Consensus Among Technical Debt Detection Tools (ICSE 2021 - SEIP - Software Engineering in Practice)

Who

Jason Lefever, Yuanfang Cai, Humberto Cervantes, Rick Kazman, Hongzhou Fang

Track

ICSE 2021 SEIP - Software Engineering in Practice

Time Zone

The program is currently displayed in (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Wed 26 May 2021 16:40 - 17:00 at Blended Sessions Room 3 - 2.4.3. Observational Studies: Different Domains Chair(s): Daniela Damian
Thu 27 May 2021 04:40 - 05:00 at Blended Sessions Room 3 - 2.4.3. Observational Studies: Different Domains

Abstract

A vigorous and growing set of technical debt analysis tools has been developed in recent years, both research tools and industrial products, such as Structure 101, SonarQube, and DV8. Each of these tools identifies problematic files using its own definitions and measures. But to what extent do these tools agree with each other in terms of the files that they identify as problematic? If the top-ranked files reported by these tools are largely consistent, then we can be confident in using any of these tools. Otherwise, a problem of accuracy arises. In this paper, we report the results of an empirical study analyzing 10 projects using multiple tools. Our results show that: 1) These tools report very different results even for the most common measures, such as size, complexity, file cycles, and package cycles. 2) These tools also differ dramatically in terms of the set of problematic files they identify, since each implements its own definition of “problematic”. After normalizing by size, the most problematic file sets that the tools identify barely overlap. 3) Code-based measures, other than size and complexity, do not even moderately correlate with a file’s change-proneness or error-proneness. In contrast, co-change-related measures performed better. Our results suggest that, to identify files with true technical debt (those that experience excessive changes or bugs), co-change information must be considered. Code-based measures are largely ineffective at pinpointing true debt. Finally, this study reveals the need for the community to create benchmarks and data sets to assess the accuracy of software analysis tools in terms of commonly used measures.

Link to Preprint

https://arxiv.org/abs/2103.04506

Jason Lefever

Drexel University

United States

Yuanfang Cai

Drexel University

United States

Humberto Cervantes

UAM Iztapalapa

Mexico

Rick Kazman

University of Hawai‘i at Mānoa

Hongzhou Fang

Drexel University

United States

Session Program

Wed 26 May
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

16:05 - 17:00	2.4.3. Observational Studies: Different Domains
Journal-First Papers / NIER - New Ideas and Emerging Results / SEIP - Software Engineering in Practice
at Blended Sessions Room 3 +12h
Chair(s): Daniela Damian, University of Victoria

16:05 15m Paper	Two Elements of Pair Programming Skill (NIER - New Ideas and Emerging Results). Franz Zieris (Freie Universität Berlin), Lutz Prechelt (Freie Universität Berlin)
16:20 20m Paper	The best laid plans or lack thereof: Security decision-making of different stakeholder groups (Journal-First Papers). Benjamin Shreeve (University of Bristol), Joseph Hallett (University of Bristol), Matthew Edwards (University of Bristol), Kopo M. Ramokapane (University of Bristol), Richard Atkins (City of London Police), Awais Rashid (University of Bristol, UK)
16:40 20m Paper	On the Lack of Consensus Among Technical Debt Detection Tools (SEIP - Software Engineering in Practice). Jason Lefever (Drexel University), Yuanfang Cai (Drexel University), Humberto Cervantes (UAM Iztapalapa), Rick Kazman (University of Hawai‘i at Mānoa), Hongzhou Fang (Drexel University)

Thu 27 May
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

04:05 - 05:00	2.4.3. Observational Studies: Different Domains
SEIP - Software Engineering in Practice / Journal-First Papers / NIER - New Ideas and Emerging Results
at Blended Sessions Room 3

04:05 15m Paper	Two Elements of Pair Programming Skill (NIER - New Ideas and Emerging Results). Franz Zieris (Freie Universität Berlin), Lutz Prechelt (Freie Universität Berlin)
04:20 20m Paper	The best laid plans or lack thereof: Security decision-making of different stakeholder groups (Journal-First Papers). Benjamin Shreeve (University of Bristol), Joseph Hallett (University of Bristol), Matthew Edwards (University of Bristol), Kopo M. Ramokapane (University of Bristol), Richard Atkins (City of London Police), Awais Rashid (University of Bristol, UK)
04:40 20m Paper	On the Lack of Consensus Among Technical Debt Detection Tools (SEIP - Software Engineering in Practice). Jason Lefever (Drexel University), Yuanfang Cai (Drexel University), Humberto Cervantes (UAM Iztapalapa), Rick Kazman (University of Hawai‘i at Mānoa), Hongzhou Fang (Drexel University)