Anomalicious: Automated Detection of Anomalous and Potentially Malicious Commits on GitHub (ICSE 2021 - SEIP - Software Engineering in Practice)

Who

Danielle Gonzalez, Thomas Zimmermann, Patrice Godefroid, Max Schaefer

Track

ICSE 2021 SEIP - Software Engineering in Practice

Time Zone

The program is currently displayed in (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Thu 27 May 2021 16:50 - 17:10 at Blended Sessions Room 2 - 3.4.2. Security Vulnerabilities: From 3rd Parties' Code. Chair(s): Jeff Carver
Fri 28 May 2021 04:50 - 05:10 at Blended Sessions Room 2 - 3.4.2. Security Vulnerabilities: From 3rd Parties' Code

Abstract

Security is critical to the adoption of open source software (OSS), yet few automated solutions currently exist to help detect and prevent malicious contributions from infecting open source repositories. On GitHub, a primary host of OSS, repositories contain not only code but also a wealth of commit-related and contextual metadata. What if this metadata could be used to automatically identify malicious OSS contributions?

In this work, we show how to use only commit logs and repository metadata to automatically detect anomalous and potentially malicious commits. We identify and evaluate several relevant factors which can be automatically computed from this data, such as the modification of sensitive files, outlier change properties, or a lack of trust in the commit’s author. Our tool, Anomalicious, automatically computes these factors and considers them holistically using a rule-based decision model. In an evaluation on a data set of 15 malware-infected repositories, Anomalicious showed promising results and identified 53.33% of malicious commits, while flagging less than 1% of commits for most repositories. Additionally, the tool found other interesting anomalies that are not related to malicious commits in an analysis of repositories with no known malicious commits.
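To make the idea of combining factors in a rule-based decision model concrete, here is a minimal sketch in Python. It is not the Anomalicious implementation: the `Commit` fields, the sensitive-file patterns, the z-score outlier test, the trust heuristic, and every threshold are assumptions chosen for illustration, and the abstract does not specify how the actual tool computes or weighs its factors.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List, Tuple

# Hypothetical path fragments treated as "sensitive"; not taken from the paper.
SENSITIVE_PATTERNS = ("package.json", ".github/workflows/", "setup.py", ".npmrc")


@dataclass
class Commit:
    """Illustrative commit record built from commit logs and repository metadata."""
    sha: str
    author: str
    files: Tuple[str, ...]
    lines_changed: int


def touches_sensitive_file(commit: Commit) -> bool:
    """Factor 1: the commit modifies a file matching a sensitive pattern."""
    return any(pat in path for path in commit.files for pat in SENSITIVE_PATTERNS)


def is_outlier_change(commit: Commit, history: List[Commit], z_threshold: float = 2.0) -> bool:
    """Factor 2: the change size is far from the repository's historical mean (assumed z-score test)."""
    sizes = [c.lines_changed for c in history]
    if len(sizes) < 2:
        return False  # too little history to judge
    spread = pstdev(sizes)
    if spread == 0:
        return commit.lines_changed != sizes[0]
    return abs(commit.lines_changed - mean(sizes)) / spread > z_threshold


def is_untrusted_author(commit: Commit, history: List[Commit], min_prior_commits: int = 3) -> bool:
    """Factor 3: the author has few prior commits in this repository (assumed trust heuristic)."""
    prior = sum(1 for c in history if c.author == commit.author)
    return prior < min_prior_commits


def flag_commit(commit: Commit, history: List[Commit], min_factors: int = 2):
    """Rule: flag the commit when at least `min_factors` factors trigger together."""
    factors = {
        "sensitive_file": touches_sensitive_file(commit),
        "outlier_change": is_outlier_change(commit, history),
        "untrusted_author": is_untrusted_author(commit, history),
    }
    triggered = [name for name, hit in factors.items() if hit]
    return len(triggered) >= min_factors, triggered
```

Requiring several factors to trigger together, rather than any single one, is one plausible way to keep the share of flagged commits low; the threshold `min_factors` is a hypothetical knob, not a parameter described in the abstract.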

Link to Preprint

https://arxiv.org/abs/2103.03846

Danielle Gonzalez

Rochester Institute of Technology

United States

Thomas Zimmermann

Microsoft Research

United States

Patrice Godefroid

Microsoft Research, USA

United States

Max Schaefer

GitHub, Inc.

United Kingdom

Session Program

Thu 27 May
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

16:30 - 17:30	3.4.2. Security Vulnerabilities: From 3rd Parties' Code (Technical Track / SEIP - Software Engineering in Practice / Journal-First Papers) at Blended Sessions Room 2 +12h. Chair(s): Jeff Carver (University of Alabama)

16:30 20m Paper		An Empirical Study of C++ Vulnerabilities in Crowd-Sourced Code Examples (Journal-First Papers). Morteza Verdi (Shiraz University), Ashkan Sami (Shiraz University), Jafar Akhondali (Shiraz University), Foutse Khomh (Polytechnique Montréal), Gias Uddin (University of Calgary, Canada), Alireza Karami Motlagh (Shiraz University)
16:50 20m Paper		Anomalicious: Automated Detection of Anomalous and Potentially Malicious Commits on GitHub (SEIP - Software Engineering in Practice). Danielle Gonzalez (Rochester Institute of Technology), Thomas Zimmermann (Microsoft Research), Patrice Godefroid (Microsoft Research, USA), Max Schaefer (GitHub, Inc.)
17:10 20m Paper		Why Security Defects Go Unnoticed during Code Reviews? A Case-Control Study of the Chromium OS Project (Technical Track). Rajshakhar Paul (Wayne State University), Asif Kamal Turzo (Wayne State University), Amiangshu Bosu (Wayne State University)

Fri 28 May
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

04:30 - 05:30	3.4.2. Security Vulnerabilities: From 3rd Parties' Code (Technical Track / Journal-First Papers / SEIP - Software Engineering in Practice) at Blended Sessions Room 2

04:30 20m Paper		An Empirical Study of C++ Vulnerabilities in Crowd-Sourced Code Examples (Journal-First Papers). Morteza Verdi (Shiraz University), Ashkan Sami (Shiraz University), Jafar Akhondali (Shiraz University), Foutse Khomh (Polytechnique Montréal), Gias Uddin (University of Calgary, Canada), Alireza Karami Motlagh (Shiraz University)
04:50 20m Paper		Anomalicious: Automated Detection of Anomalous and Potentially Malicious Commits on GitHub (SEIP - Software Engineering in Practice). Danielle Gonzalez (Rochester Institute of Technology), Thomas Zimmermann (Microsoft Research), Patrice Godefroid (Microsoft Research, USA), Max Schaefer (GitHub, Inc.)
05:10 20m Paper		Why Security Defects Go Unnoticed during Code Reviews? A Case-Control Study of the Chromium OS Project (Technical Track). Rajshakhar Paul (Wayne State University), Asif Kamal Turzo (Wayne State University), Amiangshu Bosu (Wayne State University)