Towards Mining OSS Skills from GitHub Activity
Thu 12 May 2022 13:15 - 13:20 at ICSE room 5-odd hours - Mining Software Repositories 6 Chair(s): Sonia Haiduc
Fri 27 May 2022 09:25 - 09:30 at Room 301+302 - Papers 16: Mining Software Repositories 1 Chair(s): Grace Lewis
Fri 27 May 2022 13:30 - 15:00 at Ballroom Gallery - Posters 3
Open source software (OSS) development relies on diverse skill sets. Each skill is vital to OSS: software developers use problem-solving skills to build new features, software maintainers use organizational skills to manage an OSS project, and communication skills help everyone collaborate effectively. However, to our knowledge, there are no tools that detect OSS-related skills. In this paper, we present a novel method for detecting OSS skills and prototype it in a tool called Disko. Our approach relies on identifying relevant signals, which are measurable activities or cues associated with a skill. Our tool detects how contributors 1) teach others to be involved in OSS projects, 2) show commitment towards an OSS project, 3) have knowledge in specific programming languages, and 4) are familiar with OSS practices. We then evaluate the tool by administering a survey to 455 OSS contributors. We demonstrate that Disko yields promising results: it detects the presence of these OSS skills with precision scores between 77% and 97%. We also find that over 54% of participants would display their high-proficiency skills. Our approach can be used to transform existing OSS experiences, such as identifying potential collaborators, matching mentors to mentees, and assigning project roles. Given the positive results and the potential impact of our tool, we outline future research that considers the opportunities of interpreting and sharing OSS skills.
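The signal-based idea in the abstract, together with the precision figure it reports, can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the `Activity` fields, the signal thresholds, and the `detect_skills` and `precision` helpers are assumptions made for illustration and are not taken from Disko or the paper. The sketch treats a skill as detected when at least one of its signals fires, and computes precision as the share of detections that contributors confirm.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Activity:
    """Hypothetical summary of one contributor's activity (not Disko's schema)."""
    merged_pull_requests: int
    review_comments: int
    months_active: int


# A signal is a measurable cue: a predicate over a contributor's activity.
Signal = Callable[[Activity], bool]

# Illustrative signals and thresholds (assumptions, not values from the paper).
SIGNALS: Dict[str, List[Signal]] = {
    "shows commitment": [lambda a: a.months_active >= 12],
    "familiar with OSS practices": [
        lambda a: a.merged_pull_requests >= 5,
        lambda a: a.review_comments >= 10,
    ],
}


def detect_skills(activity: Activity) -> Set[str]:
    """Return every skill for which at least one signal fires."""
    return {
        skill
        for skill, signals in SIGNALS.items()
        if any(signal(activity) for signal in signals)
    }


def precision(detected: List[bool], confirmed: List[bool]) -> float:
    """Precision = confirmed detections / all detections (0.0 if none)."""
    true_positives = sum(d and c for d, c in zip(detected, confirmed))
    all_positives = sum(detected)
    return true_positives / all_positives if all_positives else 0.0
```

In this sketch, `confirmed` stands in for survey answers: each entry records whether a contributor agreed that they have the skill, so precision measures how often a detection matches self-report.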
Thu 12 May | Displayed time zone: Eastern Time (US & Canada)
13:00 - 14:00 | Mining Software Repositories 6 | NIER - New Ideas and Emerging Results / Journal-First Papers / SEIP - Software Engineering in Practice / Technical Track at ICSE room 5-odd hours Chair(s): Sonia Haiduc Florida State University | ||
13:00 5mTalk | A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits Journal-First Papers Steffen Herbold TU Clausthal, Alexander Trautsch University of Göttingen, Benjamin Ledel TU Clausthal, Alireza Aghamohammadi Sharif University of Technology, Taher A Ghaleb University of Ottawa, Kuljit Kaur Chahal Guru Nanak Dev University, Tim Bossenmaier Karlsruhe Institute of Technology (KIT), Bhaveet Nagaria Brunel University London, Philip Makedonski University of Goettingen, Matin Nili Ahmadabadi University of Tehran, Kristof Szabados Ericsson Hungary Ltd., Helge Spieker Simula Research Laboratory, Norway, Matej Madeja Technical University of Košice, Nathaniel G. Hoy Brunel University London, Valentina Lenarduzzi University of Oulu, Shangwen Wang National University of Defense Technology, Gema Rodríguez-Pérez University of British Columbia (UBC), Ricardo Colomo-Palacios Østfold University College, Roberto Verdecchia Vrije Universiteit Amsterdam, Paramvir Singh The University of Auckland, Yihao Qin, Debasish Chakroborti University of Saskatchewan, Willard Davis IBM, Vijay Walunj University of Missouri-Kansas City, Hongjun Wu National University of Defense Technology, Diego Marcilio USI Università della Svizzera italiana, Omar Alam Trent University, Abdullah Aldaeej Imam Abdulrahman Bin Faisal University, Idan Amit The Hebrew University, Burak Turhan University of Oulu, Simon Eismann University of Würzburg, Anna-Katharina Wickert TU Darmstadt, Germany, Ivano Malavolta Vrije Universiteit Amsterdam, Matúš Sulír Technical University of Košice, Fatemeh Hendijani Fard University of British Columbia, Austin Henley University of Tennessee, Efstratios Kourtzanidis University of Macedonia, Eray Tüzün Bilkent University, Christoph Treude University of Melbourne, Simin Maleki Shamasbi Independent Researcher, Ivan Pashchenko University of Trento, Marvin Wyrich University of Stuttgart, James C. Davis Purdue University, USA, Alexander Serebrenik Eindhoven University of Technology, Ella Albrecht University of Goettingen, Ethem Utku Aktas Softtech Inc., Daniel Strüber Chalmers | University of Gothenburg / Radboud University, Johannes Erbel University of Goettingen Pre-print Media Attached | ||
13:05 5mTalk | On Using Stack Overflow Comment-Edit Pairs to Recommend Code Maintenance Changes Journal-First Papers Link to publication DOI Pre-print Media Attached | ||
13:10 5mTalk | An Exploratory Study on the Repeatedly Shared External Links on Stack Overflow Journal-First Papers Jiakun Liu Zhejiang University, Haoxiang Zhang Huawei, Xin Xia Huawei Software Engineering Application Technology Lab, David Lo Singapore Management University, Ying Zou Queen's University, Kingston, Ontario, Ahmed E. Hassan Queen's University, Shanping Li Zhejiang University Link to publication DOI Media Attached | ||
13:15 5mTalk | Towards Mining OSS Skills from GitHub Activity NIER - New Ideas and Emerging Results Jenny T. Liang University of Washington, Thomas Zimmermann Microsoft Research, Denae Ford Microsoft Research DOI Pre-print Media Attached | ||
13:20 5mTalk | Bug Tracking Process Smells In Practice SEIP - Software Engineering in Practice DOI Pre-print Media Attached | ||
13:25 5mTalk | Manas: Mining Software Repositories to Assist AutoML Technical Track Giang Nguyen Iowa State University, Md Johirul Islam Iowa State University, Rangeet Pan Iowa State University, USA, Hridesh Rajan Iowa State University DOI Pre-print Media Attached |
Fri 27 May | Displayed time zone: Eastern Time (US & Canada)
09:00 - 10:30 | Papers 16: Mining Software Repositories 1 | NIER - New Ideas and Emerging Results / Technical Track / Journal-First Papers / SEIP - Software Engineering in Practice at Room 301+302 Chair(s): Grace Lewis Carnegie Mellon Software Engineering Institute | ||
09:00 5mTalk | Post2Vec: Learning Distributed Representations of Stack Overflow Posts Journal-First Papers Bowen Xu Singapore Management University, Thong Hoang Singapore Management University, Singapore, Abhishek Sharma Veracode, Inc., Yang Chengran Singapore Management University, Xin Xia Huawei Software Engineering Application Technology Lab, David Lo Singapore Management University Link to publication DOI Pre-print | ||
09:05 5mTalk | Assisting Example-based API Misuse Detection via Complementary Artificial Examples Journal-First Papers Maxime Lamothe Polytechnique Montréal, Heng Li Polytechnique Montréal, Weiyi Shang Concordia University Link to publication DOI Pre-print Media Attached | ||
09:10 5mTalk | What happens in my code reviews? An investigation on automatically classifying review changes Journal-First Papers Enrico Fregnan University of Zurich, Switzerland, Fernando Petrulio University of Zurich, Linda Di Geronimo University of Zurich, Switzerland, Alberto Bacchelli University of Zurich Link to publication Pre-print Media Attached | ||
09:15 5mTalk | Bus Factor In Practice SEIP - Software Engineering in Practice Elgun Jabrayilzade Bilkent University, Mikhail Evtikhiev JetBrains Research, Eray Tüzün Bilkent University, Vladimir Kovalenko JetBrains Research Pre-print Media Attached | ||
09:20 5mTalk | A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits Journal-First Papers Steffen Herbold TU Clausthal, Alexander Trautsch University of Göttingen, Benjamin Ledel TU Clausthal, Alireza Aghamohammadi Sharif University of Technology, Taher A Ghaleb University of Ottawa, Kuljit Kaur Chahal Guru Nanak Dev University, Tim Bossenmaier Karlsruhe Institute of Technology (KIT), Bhaveet Nagaria Brunel University London, Philip Makedonski University of Goettingen, Matin Nili Ahmadabadi University of Tehran, Kristof Szabados Ericsson Hungary Ltd., Helge Spieker Simula Research Laboratory, Norway, Matej Madeja Technical University of Košice, Nathaniel G. Hoy Brunel University London, Valentina Lenarduzzi University of Oulu, Shangwen Wang National University of Defense Technology, Gema Rodríguez-Pérez University of British Columbia (UBC), Ricardo Colomo-Palacios Østfold University College, Roberto Verdecchia Vrije Universiteit Amsterdam, Paramvir Singh The University of Auckland, Yihao Qin, Debasish Chakroborti University of Saskatchewan, Willard Davis IBM, Vijay Walunj University of Missouri-Kansas City, Hongjun Wu National University of Defense Technology, Diego Marcilio USI Università della Svizzera italiana, Omar Alam Trent University, Abdullah Aldaeej Imam Abdulrahman Bin Faisal University, Idan Amit The Hebrew University, Burak Turhan University of Oulu, Simon Eismann University of Würzburg, Anna-Katharina Wickert TU Darmstadt, Germany, Ivano Malavolta Vrije Universiteit Amsterdam, Matúš Sulír Technical University of Košice, Fatemeh Hendijani Fard University of British Columbia, Austin Henley University of Tennessee, Efstratios Kourtzanidis University of Macedonia, Eray Tüzün Bilkent University, Christoph Treude University of Melbourne, Simin Maleki Shamasbi Independent Researcher, Ivan Pashchenko University of Trento, Marvin Wyrich University of Stuttgart, James C. Davis Purdue University, USA, Alexander Serebrenik Eindhoven University of Technology, Ella Albrecht University of Goettingen, Ethem Utku Aktas Softtech Inc., Daniel Strüber Chalmers | University of Gothenburg / Radboud University, Johannes Erbel University of Goettingen Pre-print Media Attached | ||
09:25 5mTalk | Towards Mining OSS Skills from GitHub Activity NIER - New Ideas and Emerging Results Jenny T. Liang University of Washington, Thomas Zimmermann Microsoft Research, Denae Ford Microsoft Research DOI Pre-print Media Attached | ||
09:30 5mTalk | Bug Tracking Process Smells In Practice SEIP - Software Engineering in Practice DOI Pre-print Media Attached | ||
09:35 5mTalk | Manas: Mining Software Repositories to Assist AutoML Technical Track Giang Nguyen Iowa State University, Md Johirul Islam Iowa State University, Rangeet Pan Iowa State University, USA, Hridesh Rajan Iowa State University DOI Pre-print Media Attached |
13:30 - 15:00 | Posters 3 at Ballroom Gallery | ||
13:30 90mTalk | Investigating User Perceptions of Conversational Agents for Software-related Exploratory Web Search NIER - New Ideas and Emerging Results Matthew Frazier University of Delaware, Shaayal Kumar University of Delaware, Kostadin Damevski Virginia Commonwealth University, Lori Pollock University of Delaware DOI Pre-print Media Attached | ||
13:30 90mTalk | Bots for Pull Requests: The Good, the Bad, and the Promising Technical Track Mairieli Wessel Delft University of Technology, Ahmad Abdellatif Concordia University, Igor Wiese Federal University of Technology - Paraná (UTFPR), Tayana Conte Universidade Federal do Amazonas, Emad Shihab Concordia University, Marco Gerosa Northern Arizona University, USA, Igor Steinmacher Federal University of Technology - Paraná / Northern Arizona University Pre-print | ||
13:30 90mTalk | Post2Vec: Learning Distributed Representations of Stack Overflow Posts Journal-First Papers Bowen Xu Singapore Management University, Thong Hoang Singapore Management University, Singapore, Abhishek Sharma Veracode, Inc., Yang Chengran Singapore Management University, Xin Xia Huawei Software Engineering Application Technology Lab, David Lo Singapore Management University Link to publication DOI Pre-print | ||
13:30 90mTalk | Detecting Interpersonal Conflict in Issues and Code Review: Cross Pollinating Open- and Closed-Source Approaches SEIS - Software Engineering in Society Huilian Sophie Qiu Carnegie Mellon University, USA, Bogdan Vasilescu Carnegie Mellon University, USA, Christian Kästner Carnegie Mellon University, Carolyn Egelman Google, Ciera Jaspan, Emerson Murphy-Hill Google Pre-print Media Attached | ||
13:30 90mPoster | Poster: Comprehensive Comparisons of Embedding Approaches for Cryptographic API Completion Posters Ya Xiao Virginia Tech, Salman Ahmed Virginia Polytechnic Institute and State University, Xinyang Ge Microsoft Research, Bimal Viswanath Virginia Tech, Na Meng Virginia Tech, Daphne Yao Virginia Tech | ||
13:30 90mTalk | Semantic Image Fuzzing of AI Perception Systems Technical Track Trey Woodlief University of Virginia, Sebastian Elbaum University of Virginia, Kevin Sullivan University of Virginia DOI Pre-print Media Attached | ||
13:30 90m | To Disengage or Not to Disengage: A Look at Contributor Disengagement in Open Source Software SRC - ACM Student Research Competition Philip Gray New College of Florida | ||
13:30 90mTalk | Hashing It Out: A Survey of Programmers’ Cannabis Usage, Perception, and Motivation Technical Track Madeline Endres University of Michigan, Kevin Boehnke University of Michigan, Westley Weimer University of Michigan DOI Pre-print Media Attached | ||
13:30 90mTalk | Bus Factor In Practice SEIP - Software Engineering in Practice Elgun Jabrayilzade Bilkent University, Mikhail Evtikhiev JetBrains Research, Eray Tüzün Bilkent University, Vladimir Kovalenko JetBrains Research Pre-print Media Attached | ||
13:30 90mTalk | Garbage Collection Makes Rust Easier to Use: A Randomized Controlled Trial of the Bronze Garbage Collector (Nominated for Distinguished Paper) Technical Track Michael Coblenz University of Maryland at College Park, Michelle Mazurek University of Maryland, Michael Hicks University of Maryland at College Park DOI Pre-print Media Attached | ||
13:30 90mTalk | Learning and Programming Challenges of Rust: A Mixed-Methods Study Technical Track Shuofei Zhu The Pennsylvania State University, Ziyi Zhang University of Wisconsin–Madison, Boqin Qin China Telecom Cloud Computing Corporation, Aiping Xiong The Pennsylvania State University, Linhai Song Pennsylvania State University, USA DOI Pre-print Media Attached | ||
13:30 90mTalk | Better Modeling the Programming World with Code Concept Graphs-augmented Multi-modal Learning NIER - New Ideas and Emerging Results Martin Weyssow DIRO, Université de Montréal, Houari Sahraoui Université de Montréal, Bang Liu DIRO & Mila, Université de Montréal Pre-print Media Attached | ||
13:30 90mTalk | Defect Reduction Planning (using TimeLIME) Journal-First Papers Authorizer link Pre-print Media Attached | ||
13:30 90mDemonstration | Gamekins: Gamifying Software Testing in Jenkins DEMO - Demonstrations DOI Pre-print Media Attached | ||
13:30 90mTalk | How Do I Refactor This? An Empirical Study on Refactoring Trends and Topics in Stack Overflow Journal-First Papers Anthony Peruma Rochester Institute of Technology, Steven Simmons Rochester Institute of Technology, Eman Abdullah AlOmar Stevens Institute of Technology, Christian D. Newman Rochester Institute of Technology, Mohamed Wiem Mkaouer Rochester Institute of Technology, Ali Ouni ETS Montreal, University of Quebec Link to publication DOI Pre-print Media Attached | ||
13:30 90mTalk | Lessons Learnt on Reproducibility in Machine Learning Based Android Malware Detection Journal-First Papers Nadia Daoudi SnT, University of Luxembourg, Kevin Allix University of Luxembourg, Tegawendé F. Bissyandé SnT, University of Luxembourg, Jacques Klein University of Luxembourg Link to publication Pre-print Media Attached | ||
13:30 90m | Mu2: Using Mutation Analysis to Guide Mutation-Based Fuzzing SRC - ACM Student Research Competition Isabella Laybourn Carnegie Mellon Silicon Valley | ||
13:30 90mTalk | Emotions and Perceived Productivity of Software Developers at the Workplace Journal-First Papers Daniela Girardi University of Bari, Filippo Lanubile University of Bari, Nicole Novielli University of Bari, Alexander Serebrenik Eindhoven University of Technology Link to publication DOI Pre-print Media Attached | ||
13:30 90mPoster | CRustS: A Transpiler from Unsafe C to Safer Rust Posters Michael Ling Huawei Technologies Canada, Yijun Yu The Open University, UK, Haitao Wu Huawei Technologies Canada, Yuan Wang Huawei Sweden Research Center, James R. Cordy Queen's University, Ahmed E. Hassan Queen's University | ||
13:30 90mTalk | Multilingual training for Software Engineering Technical Track Toufique Ahmed University of California at Davis, Prem Devanbu Department of Computer Science, University of California, Davis DOI Pre-print Media Attached | ||
13:30 90mTalk | An Empirical Investigation on the Challenges Faced by Women in the Software Industry: A Case Study (SEIS-track Award) SEIS - Software Engineering in Society Bianca Trinkenreich Northern Arizona University, Ricardo Britto Ericsson / Blekinge Institute of Technology, Marco Gerosa Northern Arizona University, USA, Igor Steinmacher Federal University of Technology - Paraná / Northern Arizona University Pre-print Media Attached | ||
13:30 90mTalk | Using Deep Learning to Generate Complete Log Statements Technical Track Antonio Mastropaolo Università della Svizzera italiana, Luca Pascarella Università della Svizzera italiana (USI), Gabriele Bavota Software Institute, USI Università della Svizzera italiana Pre-print Media Attached | ||
13:30 90mTalk | Collaboration Challenges in Building ML-Enabled Systems: Communication, Documentation, Engineering, and Process (Distinguished Paper Award) Technical Track Nadia Nahar Carnegie Mellon University, Shurui Zhou University of Toronto, Grace Lewis Carnegie Mellon Software Engineering Institute, Christian Kästner Carnegie Mellon University Pre-print Media Attached | ||
13:30 90mTalk | Discovering Repetitive Code Changes in Python ML Systems Technical Track Malinda Dilhara University of Colorado Boulder, USA, Ameya Ketkar Oregon State University, USA, Nikhith Sannidhi University of Colorado Boulder, Danny Dig University of Colorado Boulder, USA DOI Pre-print Media Attached | ||
13:30 90mTalk | Towards Mining OSS Skills from GitHub Activity NIER - New Ideas and Emerging Results Jenny T. Liang University of Washington, Thomas Zimmermann Microsoft Research, Denae Ford Microsoft Research DOI Pre-print Media Attached | ||
13:30 90mTalk | EREBA: Black-box Energy Testing of Adaptive Neural Networks Technical Track Mirazul Haque UT Dallas, Yaswanth Yadlapalli University of Texas at Dallas, Wei Yang University of Texas at Dallas, Cong Liu University of Texas at Dallas, USA Pre-print Media Attached | ||
13:30 90mTalk | "Project smells" - Experiences in Analysing the Software Quality of ML Projects with mllint SEIP - Software Engineering in Practice Bart van Oort Delft University of Technology, Luís Cruz Delft University of Technology, Babak Loni ING Bank N.V., Arie van Deursen Delft University of Technology, Netherlands Pre-print Media Attached | ||
13:30 90mPoster | Improving Responsiveness of Android Activity Navigation via Genetic Improvement Posters |