Studying the Usage of Text-To-Text Transfer Transformer to Support Code-Related Tasks (Technical Track)
Thu 27 May 2021 22:40 - 23:00 at Blended Sessions Room 2 - 3.1.2. Deep Neural Networks: Supporting SE Tasks #2
Deep learning (DL) techniques are gaining more and more attention in the software engineering community. They have been used to support several code-related tasks, such as automatic bug fixing and code comment generation. Recent studies in the Natural Language Processing (NLP) field have shown that the Text-To-Text Transfer Transformer (T5) architecture can achieve state-of-the-art performance for a variety of NLP tasks. The basic idea behind T5 is to first pre-train a model on a large and generic dataset using a self-supervised task (e.g., filling masked words in sentences). Once the model is pre-trained, it is fine-tuned on smaller and specialized datasets, each one related to a specific task (e.g., language translation, sentence classification). In this paper, we empirically investigate how the T5 model performs when pre-trained and fine-tuned to support code-related tasks. We pre-train a T5 model on a dataset composed of natural language English text and source code. Then, we fine-tune such a model by reusing datasets from four previous works that used DL techniques to: (i) fix bugs, (ii) inject code mutants, (iii) generate assert statements, and (iv) generate code comments. We compare the performance of this single model with the results reported in the four original papers proposing DL-based solutions for those four tasks. We show that our T5 model, exploiting additional data for the self-supervised pre-training phase, can achieve performance improvements over the four baselines.
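The self-supervised pre-training objective described above can be illustrated with a minimal sketch of T5-style span corruption: random tokens in a sequence are replaced by sentinel markers in the input, and the target asks the model to reconstruct exactly the masked-out pieces. This is a simplified, illustrative version (single-token spans, whitespace tokenization); the function name and example snippet are our own, not from the paper.

```python
import random

def span_corrupt(tokens, mask_ratio=0.15, seed=0):
    """Illustrative T5-style span corruption (simplified to
    single-token spans): masked tokens become sentinel markers
    in the input; the target lists each sentinel followed by
    the token it replaced, closed by a final sentinel."""
    rng = random.Random(seed)
    n_mask = max(1, int(len(tokens) * mask_ratio))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    inp, tgt, sent = [], [], 0
    for i, tok in enumerate(tokens):
        if i in positions:
            inp.append(f"<extra_id_{sent}>")          # sentinel in input
            tgt.extend([f"<extra_id_{sent}>", tok])   # sentinel + answer in target
            sent += 1
        else:
            inp.append(tok)
    tgt.append(f"<extra_id_{sent}>")                  # closing sentinel
    return " ".join(inp), " ".join(tgt)

# Pre-training example on source code (a toy Java snippet):
code = "public int sum ( int a , int b ) { return a + b ; }".split()
inp, tgt = span_corrupt(code)
print(inp)
print(tgt)
```

After pre-training on such pairs, fine-tuning simply swaps in task-specific input/target pairs (e.g., buggy code as input, fixed code as target), keeping the same text-to-text interface.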
Thu 27 May (displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna)
10:00 - 11:00 | 3.1.2. Deep Neural Networks: Supporting SE Tasks #2 (SEIP - Software Engineering in Practice / Journal-First Papers / Technical Track) at Blended Sessions Room 2 (mirrored +12h). Chair(s): Sira Vegas, Universidad Politecnica de Madrid
10:00 (20m, Paper) | NNStreamer: Efficient and Agile Development of On-Device AI Systems (SEIP - Software Engineering in Practice). MyungJoo Ham, Jijoong Moon, Jaeyun Jung, Hyoungjoo Ahn, Wook Song, Sangjung Woo, Parichay Kapoor, Dongju Chae, Gichan Jang, Yongjoo Ahn, Jihoon Lee (Samsung Electronics), Geunsik Lim (Samsung Research, Samsung Electronics). Pre-print. Media attached.
10:20 (20m, Paper) | Deep Learning Based Program Generation from Requirements Text: Are We There Yet? (Journal-First Papers). Hui Liu, Mingzhu Shen, Jiaqi Zhu (Beijing Institute of Technology), Nan Niu (University of Cincinnati), Ge Li, Lu Zhang (Peking University). Link to publication. DOI. Pre-print. Media attached.
10:40 (20m, Paper) | Studying the Usage of Text-To-Text Transfer Transformer to Support Code-Related Tasks (Technical Track). Antonio Mastropaolo (Università della Svizzera italiana), Simone Scalabrino (University of Molise), Nathan Cooper, David Nader Palacio, Denys Poshyvanyk (William & Mary), Rocco Oliveto (University of Molise), Gabriele Bavota (Software Institute, USI Università della Svizzera italiana). Pre-print. Media attached.
22:00 - 23:00 | 3.1.2. Deep Neural Networks: Supporting SE Tasks #2 (Technical Track / SEIP - Software Engineering in Practice / Journal-First Papers) at Blended Sessions Room 2. Mirror of the 10:00 - 11:00 session, same program:
22:00 (20m, Paper) | NNStreamer: Efficient and Agile Development of On-Device AI Systems (SEIP - Software Engineering in Practice)
22:20 (20m, Paper) | Deep Learning Based Program Generation from Requirements Text: Are We There Yet? (Journal-First Papers)
22:40 (20m, Paper) | Studying the Usage of Text-To-Text Transfer Transformer to Support Code-Related Tasks (Technical Track)