-
An Empirical Study of API Misuses of Data-Centric Libraries
Authors:
Akalanka Galappaththi,
Sarah Nadi,
Christoph Treude
Abstract:
Developers rely on third-party library Application Programming Interfaces (APIs) when developing software. However, libraries typically come with assumptions and API usage constraints, whose violation results in API misuse. API misuses may result in crashes or incorrect behavior. Even though API misuse is a well-studied area, a recent study of API misuse of deep learning libraries showed that the nature of these misuses and their symptoms are different from misuses of traditional libraries, and as a result highlighted potential shortcomings of current misuse detection tools. We speculate that these observations may not be limited to deep learning API misuses but may stem from the data-centric nature of these APIs. Data-centric libraries often deal with diverse data structures, intricate processing workflows, and a multitude of parameters, which can make them inherently more challenging to use correctly. Therefore, understanding the potential misuses of these libraries is important to avoid unexpected application behavior. To this end, this paper contributes an empirical study of API misuses of five data-centric libraries that cover areas such as data processing, numerical computation, machine learning, and visualization. We identify misuses of these libraries by analyzing data from both Stack Overflow and GitHub. Our results show that many of the characteristics of API misuses observed for deep learning libraries extend to misuses of the data-centric library APIs we study. We also find that developers tend to misuse APIs from data-centric libraries, regardless of whether the API directive appears in the documentation. Overall, our work exposes the challenges of API misuse in data-centric libraries, rather than only focusing on deep learning libraries. Our collected misuses and their characterization lay groundwork for future research to help reduce misuses of these libraries.
Submitted 28 August, 2024;
originally announced August 2024.
-
Optimizing Large Language Model Hyperparameters for Code Generation
Authors:
Chetan Arora,
Ahnaf Ibn Sayeed,
Sherlock Licorish,
Fanyu Wang,
Christoph Treude
Abstract:
Large Language Models (LLMs), such as GPT models, are increasingly used in software engineering for various tasks, such as code generation, requirements management, and debugging. While automating these tasks has garnered significant attention, the impact of varying hyperparameters on code generation outcomes has not yet been studied systematically. This study aims to assess LLMs' code generation performance by exhaustively exploring the impact of various hyperparameters. Hyperparameters for LLMs are adjustable settings that affect the model's behaviour and performance. Specifically, we investigated how changes to four hyperparameters (temperature, top probability (top_p), frequency penalty, and presence penalty) affect code generation outcomes. We systematically adjusted all hyperparameters together, exploring every possible combination by making small increments to each hyperparameter at a time. This exhaustive approach was applied to 13 Python code generation tasks, yielding one of four outcomes for each hyperparameter combination: no output from the LLM, non-executable code, code that fails unit tests, or correct and functional code. We analysed these outcomes for a total of 14,742 generated Python code segments, focusing on correctness, to determine how the hyperparameters influence the LLM to arrive at each outcome. Using correlation coefficient and regression tree analyses, we ascertained which hyperparameters influence which aspect of the LLM. Our results indicate that optimal performance is achieved with a temperature below 0.5, top probability below 0.75, frequency penalty above -1 and below 1.5, and presence penalty above -1. We make our dataset and results available to facilitate replication.
Submitted 20 August, 2024;
originally announced August 2024.
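The exhaustive sweep described in this abstract can be pictured as a Cartesian product over the four hyperparameters. The sketch below is illustrative only: the grid bounds and step sizes are assumptions (the abstract does not state them), and `in_reported_range` simply encodes the ranges the abstract reports as performing best.

```python
from itertools import product


def frange(start, stop, step):
    """Inclusive list of evenly spaced floats, rounded to avoid drift."""
    n = int(round((stop - start) / step))
    return [round(start + i * step, 10) for i in range(n + 1)]


# Hypothetical grid: bounds and step sizes are assumptions, not the paper's.
GRID = {
    "temperature": frange(0.0, 2.0, 0.5),
    "top_p": frange(0.0, 1.0, 0.25),
    "frequency_penalty": frange(-2.0, 2.0, 1.0),
    "presence_penalty": frange(-2.0, 2.0, 1.0),
}


def all_combinations(grid):
    """Yield every hyperparameter combination as a dict."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def in_reported_range(cfg):
    """True if cfg falls inside the ranges the abstract reports as best."""
    return (
        cfg["temperature"] < 0.5
        and cfg["top_p"] < 0.75
        and -1 < cfg["frequency_penalty"] < 1.5
        and cfg["presence_penalty"] > -1
    )


configs = list(all_combinations(GRID))
kept = [cfg for cfg in configs if in_reported_range(cfg)]
```

In a real sweep, each configuration would be sent to the model once per task and the response classified into one of the four outcomes listed in the abstract; that step is omitted here because it depends on the model API in use.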
-
Can LLMs Replace Manual Annotation of Software Engineering Artifacts?
Authors:
Toufique Ahmed,
Premkumar Devanbu,
Christoph Treude,
Michael Pradel
Abstract:
Experimental evaluations of software engineering innovations, e.g., tools and processes, often include human-subject studies as a component of a multi-pronged strategy to obtain greater generalizability of the findings. However, human-subject studies in our field are challenging, due to the cost and difficulty of finding and employing suitable subjects, ideally, professional programmers with varying degrees of experience. Meanwhile, large language models (LLMs) have recently started to demonstrate human-level performance in several areas. This paper explores the possibility of substituting costly human subjects with much cheaper LLM queries in evaluations of code and code-related artifacts. We study this idea by applying six state-of-the-art LLMs to ten annotation tasks from five datasets created by prior work, such as judging the accuracy of a natural language summary of a method or deciding whether a code change fixes a static analysis warning. Our results show that replacing some human annotation effort with LLMs can produce inter-rater agreements equal or close to human-rater agreement. To help decide when and how to use LLMs in human-subject studies, we propose model-model agreement as a predictor of whether a given task is suitable for LLMs at all, and model confidence as a means to select specific samples where LLMs can safely replace human annotators. Overall, our work is the first step toward mixed human-LLM evaluations in software engineering.
Submitted 10 August, 2024;
originally announced August 2024.
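To make the idea of model-model agreement concrete, the sketch below computes a chance-corrected agreement score between pairs of annotators and averages it across models. Cohen's kappa is used here as an assumption; the abstract does not name the agreement statistic, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance from each rater's label frequencies.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters always use the same single label
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(annotations):
    """Average kappa over all pairs of annotators in a name -> labels dict."""
    pairs = list(combinations(annotations.values(), 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)
```

A low mean pairwise score across models on a task would, under the abstract's proposal, suggest that the task is a poor candidate for LLM annotation.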
-
An Empirical Study of Static Analysis Tools for Secure Code Review
Authors:
Wachiraphan Charoenwet,
Patanamon Thongtanunam,
Van-Thuan Pham,
Christoph Treude
Abstract:
Early identification of security issues in software development is vital to minimize their unanticipated impacts. Code review is a widely used manual analysis method that aims to uncover security issues along with other coding issues in software projects. While some studies suggest that automated static application security testing tools (SASTs) could enhance security issue identification, there is limited understanding of SAST's practical effectiveness in supporting secure code review. Moreover, most SAST studies rely on synthetic or fully vulnerable versions of the subject program, which may not accurately represent real-world code changes in the code review process.
To address this gap, we study C/C++ SASTs using a dataset of actual code changes that contributed to exploitable vulnerabilities. Beyond SAST's effectiveness, we quantify potential benefits when changed functions are prioritized by SAST warnings. Our dataset comprises 319 real-world vulnerabilities from 815 vulnerability-contributing commits (VCCs) in 92 C and C++ projects. The results reveal that a single SAST can produce warnings in vulnerable functions of 52% of VCCs. Prioritizing changed functions with SAST warnings can improve accuracy (i.e., by 12% in precision and 5.6% in recall) and reduce Initial False Alarm (lines of code in non-vulnerable functions inspected until the first vulnerable function) by 13%. Nevertheless, at least 76% of the warnings in vulnerable functions are irrelevant to the VCCs, and 22% of VCCs remain undetected due to limitations of SAST rules. Our findings highlight the benefits and the remaining gaps of SAST-supported secure code reviews and challenges that should be addressed in future work.
Submitted 16 July, 2024;
originally announced July 2024.
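The Initial False Alarm measure defined in this abstract (lines of code in non-vulnerable functions inspected before the first vulnerable one) is easy to sketch. The `ChangedFunction` type, the warning-count ordering, and the sample values below are assumptions for illustration, not the paper's implementation.

```python
from dataclasses import dataclass


@dataclass
class ChangedFunction:
    name: str
    loc: int          # lines of code in the function
    warnings: int     # number of SAST warnings raised in it
    vulnerable: bool  # ground truth from the vulnerability-contributing commit


def initial_false_alarm(ordered):
    """LOC of non-vulnerable functions inspected before the first vulnerable one."""
    inspected = 0
    for fn in ordered:
        if fn.vulnerable:
            return inspected
        inspected += fn.loc
    return None  # no vulnerable function in this review


def prioritize_by_warnings(functions):
    """Review order: most SAST warnings first (stable for ties)."""
    return sorted(functions, key=lambda fn: fn.warnings, reverse=True)


# Made-up example of one commit's changed functions.
changed = [
    ChangedFunction("parse_header", loc=40, warnings=0, vulnerable=False),
    ChangedFunction("log_event", loc=10, warnings=5, vulnerable=False),
    ChangedFunction("copy_payload", loc=25, warnings=3, vulnerable=True),
]
baseline = initial_false_alarm(changed)
prioritized = initial_false_alarm(prioritize_by_warnings(changed))
```

In this made-up commit the reviewer reads 50 lines of non-vulnerable code in the original order but only 10 after prioritization. The example also shows how a noisy warning in `log_event` still costs inspection effort.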
-
Contributing Back to the Ecosystem: A User Survey of NPM Developers
Authors:
Supatsara Wattanakriengkrai,
Christoph Treude,
Raula Gaikovina Kula
Abstract:
With the rise of the library ecosystem (such as NPM for JavaScript and PyPI for Python), a developer has access to a multitude of library packages that they can adopt as dependencies into their application. Prior work has found that these ecosystems form a complex web of dependencies, where sustainability issues of a single library can have widespread network effects. Due to the Open Source Software (OSS) nature of third-party libraries, there are rising concerns about the sustainability of these libraries. In a survey of 49 developers from the NPM ecosystem, we find that developers are more likely to maintain their own packages than to contribute to the ecosystem. Our results open up new avenues for tool support and research into how to sustain these ecosystems, especially for developers who depend on these libraries. We have made available the raw results of the survey at https://tinyurl.com/2p8sdmr3.
Submitted 30 June, 2024;
originally announced July 2024.
-
Documenting Ethical Considerations in Open Source AI Models
Authors:
Haoyu Gao,
Mansooreh Zahedi,
Christoph Treude,
Sarita Rosenstock,
Marc Cheong
Abstract:
Background: The development of AI-enabled software heavily depends on AI model documentation, such as model cards, due to different domain expertise between software engineers and model developers. From an ethical standpoint, AI model documentation conveys critical information on ethical considerations along with mitigation strategies for downstream developers to ensure the delivery of ethically compliant software. However, knowledge on such documentation practice remains scarce. Aims: The objective of our study is to investigate how developers document ethical aspects of open source AI models in practice, aiming at providing recommendations for future documentation endeavours. Method: We selected three sources of documentation on GitHub and Hugging Face, and developed a keyword set to identify ethics-related documents systematically. After filtering an initial set of 2,347 documents, we identified 265 relevant ones and performed thematic analysis to derive the themes of ethical considerations. Results: Six themes emerge, with the three largest ones being model behavioural risks, model use cases, and model risk mitigation. Conclusions: Our findings reveal that open source AI model documentation focuses on articulating ethical problem statements and use case restrictions. We further provide suggestions to various stakeholders for improving documentation practice regarding ethical considerations.
Submitted 2 July, 2024; v1 submitted 26 June, 2024;
originally announced June 2024.
-
Characterising Contributions that Coincide with Vulnerability Mitigation in NPM Libraries
Authors:
Ruksit Rojpaisarnkit,
Hathaichanok Damrongsiri,
Christoph Treude,
Ali Ouni,
Raula Gaikovina Kula
Abstract:
With the urgent need to secure supply chains among Open Source libraries, attention has focused on mitigating vulnerabilities detected in these libraries. Although awareness has improved recently, most studies still report delays in the mitigation process. This suggests that developers still have to deal with other contributions that occur during the period of fixing vulnerabilities, such as coinciding Pull Requests (PRs) and Issues, yet the impact of these contributions remains unclear. To characterize these contributions, we conducted a mixed-method empirical study to analyze NPM GitHub projects affected by 554 different vulnerability advisories, mining a total of 4,699 coinciding PRs and Issues. We believe that tool development and improved workload management for developers have the potential to create a more efficient and effective vulnerability mitigation process.
Submitted 17 June, 2024;
originally announced June 2024.
-
Qualitative Data Analysis in Software Engineering: Techniques and Teaching Insights
Authors:
Christoph Treude
Abstract:
Software repositories are rich sources of qualitative artifacts, including source code comments, commit messages, issue descriptions, and documentation. These artifacts offer many interesting insights when analyzed through quantitative methods, as outlined in the chapter on mining software repositories. This chapter shifts the focus towards interpreting these artifacts using various qualitative data analysis techniques. We introduce qualitative coding as an iterative process, which is crucial not only for educational purposes but also to enhance the credibility and depth of research findings. Various coding methods are discussed along with the strategic design of a coding guide to ensure consistency and accuracy in data interpretation. The chapter also discusses quality assurance in qualitative data analysis, emphasizing principles such as credibility, transferability, dependability, and confirmability. These principles are vital to ensure that the findings are robust and can be generalized in different contexts. By sharing best practices and lessons learned, we aim to equip all readers with the tools necessary to conduct rigorous qualitative research in the field of software engineering.
Submitted 12 June, 2024;
originally announced June 2024.
-
Prioritising GitHub Priority Labels
Authors:
James Caddy,
Christoph Treude
Abstract:
Communities on GitHub often use issue labels as a way of triaging issues by assigning them priority ratings based on how urgently they should be addressed. The labels used are determined by the repository contributors and not standardised by GitHub. This makes it difficult for priority-related reasoning across repositories for both researchers and contributors. Previous work shows interest in how issues are labelled and what the consequences for those labels are. For instance, some previous work has used clustering models and natural language processing to categorise labels without a particular emphasis on priority. With this publication, we introduce a unique data set of 812 manually categorised labels pertaining to priority; normalised and ranked as low-, medium-, or high-priority. To provide an example of how this data set could be used, we have created a tool for GitHub contributors that will create a list of the highest priority issues from the repositories to which they contribute. We have released the data set and the tool for anyone to use on Zenodo because we hope that this will help the open source community address high-priority issues more effectively and inspire other uses.
Submitted 17 May, 2024;
originally announced May 2024.
-
The Role of Code Proficiency in the Era of Generative AI
Authors:
Gregorio Robles,
Christoph Treude,
Jesus M. Gonzalez-Barahona,
Raula Gaikovina Kula
Abstract:
At the current pace of technological advancements, Generative AI models, including both Large Language Models and Large Multi-modal Models, are becoming integral to the developer workspace. However, challenges emerge due to the 'black box' nature of many of these models, where the processes behind their outputs are not transparent. This position paper advocates for a 'white box' approach to these generative models, emphasizing the necessity of transparency and understanding in AI-generated code to match the proficiency levels of human developers and better enable software maintenance and evolution. We outline a research agenda aimed at investigating the alignment between AI-generated code and developer skills, highlighting the importance of responsibility, security, legal compliance, creativity, and social value in software development. The proposed research questions explore the potential of white-box methodologies to ensure that software remains an inspectable, adaptable, and trustworthy asset in the face of rapid AI integration, setting a course for research that could shape the role of code proficiency into 2030 and beyond.
Submitted 8 April, 2024;
originally announced May 2024.
-
Towards the First Code Contribution: Processes and Information Needs
Authors:
Christoph Treude,
Marco A. Gerosa,
Igor Steinmacher
Abstract:
Newcomers to a software project must overcome many barriers before they can successfully place their first code contribution, and they often struggle to find information that is relevant to them. In this work, we argue that much of the information needed by newcomers already exists, albeit scattered among many different sources, and that many barriers can be addressed by automatically identifying, extracting, generating, summarizing, and presenting documentation that is specifically aimed and customized for newcomers. To gain a detailed understanding of the processes followed by newcomers and their information needs before making their first code contribution, we conducted an empirical study. Based on a survey with about 100 practitioners, grounded theory analysis, and validation interviews, we contribute a 16-step model for the processes followed by newcomers to a software project and we identify relevant information, along with individual and project characteristics that influence the relevancy of information types and sources. Our findings form an essential step towards automated tool support that provides relevant information to project newcomers in each step of their contribution processes.
Submitted 29 April, 2024;
originally announced April 2024.
-
Open Source Software Development Tool Installation: Challenges and Strategies For Novice Developers
Authors:
Larissa Salerno,
Christoph Treude,
Patanamon Thongtanunam
Abstract:
As the world of technology advances, so do the tools that software developers use to create new programs. In recent years, software development tools have become more popular, allowing developers to work more efficiently and produce higher-quality software. Still, installing such tools can be difficult for novice developers at the early stage of their careers, as they may face challenges such as compatibility issues (e.g., with operating systems). Therefore, this work aims to investigate the challenges novice developers face when installing software development tools. To this end, we analysed 24 live software installation sessions to observe the challenges novices encounter and to understand their actions, the strategies they apply, and the types of information sources they consult when encountering challenges. Our findings show that unclear documentation, such as installation instructions, and inadequate feedback during the installation process are common challenges faced by novice developers. Moreover, reformulating search queries and relying on non-official documentation were some of the strategies employed to overcome challenges. Based on our findings, we provide practical recommendations for tool vendors, tool users, and researchers.
Submitted 22 April, 2024;
originally announced April 2024.
-
The Impact of Sanctions on GitHub Developers and Activities
Authors:
Youmei Fan,
Ani Hovhannisyan,
Hideaki Hata,
Christoph Treude,
Raula Gaikovina Kula
Abstract:
The GitHub platform has fueled the creation of truly global software, enabling contributions from developers across various geographical regions of the world. As software becomes more entwined with global politics and social regulations, it becomes similarly subject to government sanctions. In 2019, GitHub restricted access to certain services for users in specific locations but rolled back these restrictions for some communities (e.g., the Iranian community) in 2021. We conducted a large-scale empirical study, collecting approximately 156 thousand user profiles and their 41 million activity points from 2008 to 2022, to understand the response of developers. Our results indicate that many of these targeted developers were able to navigate through the sanctions. Furthermore, once these sanctions were lifted, these developers opted to return to GitHub instead of withdrawing their contributions to the platform. The study indicates that platforms like GitHub play key roles in sustaining global contributions to Open Source Software.
Submitted 8 April, 2024;
originally announced April 2024.
-
LLM-Based Multi-Agent Systems for Software Engineering: Vision and the Road Ahead
Authors:
Junda He,
Christoph Treude,
David Lo
Abstract:
Integrating Large Language Models (LLMs) into autonomous agents marks a significant shift in the research landscape by offering cognitive abilities competitive with human planning and reasoning. This paper envisions the evolution of LLM-based Multi-Agent (LMA) systems in addressing complex and multi-faceted software engineering challenges. LMA systems introduce numerous benefits, including enhanced robustness through collaborative cross-examination, autonomous problem-solving, and scalable solutions to complex software projects. By examining the role of LMA systems in future software engineering practices, this vision paper highlights the potential applications and emerging challenges. We further point to specific opportunities for research and conclude with a research agenda with a set of research questions to guide future research directions.
Submitted 7 April, 2024;
originally announced April 2024.
-
Creative and Correct: Requesting Diverse Code Solutions from AI Foundation Models
Authors:
Scott Blyth,
Markus Wagner,
Christoph Treude
Abstract:
AI foundation models have the capability to produce a wide array of responses to a single prompt, a feature that is highly beneficial in software engineering to generate diverse code solutions. However, this advantage introduces a significant trade-off between diversity and correctness. In software engineering tasks, diversity is key to exploring design spaces and fostering creativity, but the practical value of these solutions is heavily dependent on their correctness. Our study systematically investigates this trade-off using experiments with HumanEval tasks, exploring various parameter settings and prompting strategies. We assess the diversity of code solutions using similarity metrics from the code clone community. The study identifies combinations of parameters and strategies that strike an optimal balance between diversity and correctness, situated on the Pareto front of this trade-off space. These findings offer valuable insights for software engineers on how to effectively use AI foundation models to generate code solutions that are diverse and accurate.
Submitted 19 March, 2024;
originally announced March 2024.
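The Pareto front mentioned in this abstract is the set of configurations that no other configuration beats on both diversity and correctness at once. The sketch below shows the idea on made-up scores; how diversity and correctness are actually measured is left to the paper, and the tuple layout is an assumption.

```python
def pareto_front(points):
    """Return the non-dominated (diversity, correctness) pairs.

    Higher is better on both axes. A point is dominated if another point is
    at least as good on both axes and strictly better on at least one.
    """
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and (q[0] > p[0] or q[1] > p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return front


# Made-up (diversity, correctness) scores for five hypothetical settings.
scores = [(0.2, 0.9), (0.5, 0.7), (0.4, 0.6), (0.8, 0.3), (0.7, 0.2)]
front = pareto_front(scores)
```

Here `(0.4, 0.6)` and `(0.7, 0.2)` drop out because another setting is better on both axes. The remaining three settings represent different trade-offs between the two goals.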
-
The Impact Of Bug Localization Based on Crash Report Mining: A Developers' Perspective
Authors:
Marcos Medeiros,
Uirá Kulesza,
Roberta Coelho,
Rodrigo Bonifácio,
Christoph Treude,
Eiji Adachi
Abstract:
Developers often use crash reports to understand the root cause of bugs. However, locating the buggy source code snippet from such information is a challenging task, mainly when the log database contains many crash reports. To mitigate this issue, recent research has proposed and evaluated approaches for grouping crash report data and using stack trace information to locate bugs. The effectiveness of such approaches has been evaluated mainly by comparing the candidate buggy code snippets with the actual changed code in bug-fix commits -- which happens in the context of retrospective repository mining studies. Therefore, the existing literature still lacks discussion of the use of such approaches in the daily life of a software company, which could explain the developers' perceptions of these approaches. In this paper, we report our experience of using an approach for grouping crash reports and finding buggy code on a weekly basis for 18 months, within three development teams in a software company. We grouped over 750,000 crash reports, opened over 130 issues, and collected feedback from 18 developers and team leaders. Among other results, we observe that the amount of system logs related to a crash report group is not the only criterion developers use to choose a candidate bug to be analyzed. Instead, other factors were considered, such as the need to deliver customer-prioritized features and the difficulty of solving complex crash reports (e.g., architectural debts). The approach investigated in this study correctly suggested the buggy file most of the time -- the approach's precision was around 80%. In this study, the developers also shared their perspectives on the usefulness of the suspicious files and methods extracted from crash reports to fix related bugs.
Submitted 15 March, 2024;
originally announced March 2024.
-
Smart HPA: A Resource-Efficient Horizontal Pod Auto-scaler for Microservice Architectures
Authors:
Hussain Ahmad,
Christoph Treude,
Markus Wagner,
Claudia Szabo
Abstract:
Microservice architectures have gained prominence in both academia and industry, offering enhanced agility, reusability, and scalability. To simplify scaling operations in microservice architectures, container orchestration platforms such as Kubernetes feature Horizontal Pod Auto-scalers (HPAs) designed to adjust the resources of microservices to accommodate fluctuating workloads. However, existing HPAs are not suitable for resource-constrained environments, as they make scaling decisions based on the individual resource capacities of microservices, leading to service unavailability and performance degradation. Furthermore, HPA architectures exhibit several issues, including inefficient data processing and a lack of coordinated scaling operations. To address these concerns, we propose Smart HPA, a flexible resource-efficient horizontal pod auto-scaler. It features a hierarchical architecture that integrates both centralized and decentralized architectural styles to leverage their respective strengths while addressing their limitations. We introduce resource-efficient heuristics that empower Smart HPA to exchange resources among microservices, facilitating effective auto-scaling of microservices in resource-constrained environments. Our experimental results show that Smart HPA outperforms the Kubernetes baseline HPA by reducing resource overutilization, overprovisioning, and underprovisioning while increasing resource allocation to microservice applications.
Submitted 26 February, 2024;
originally announced March 2024.
-
Enhancing Source Code Representations for Deep Learning with Static Analysis
Authors:
Xueting Guan,
Christoph Treude
Abstract:
Deep learning techniques applied to program analysis tasks such as code classification, summarization, and bug detection have seen widespread interest. Traditional approaches, however, treat programming source code as natural language text, which may neglect significant structural or semantic details. Additionally, most current methods of representing source code focus solely on the code, without considering beneficial additional context. This paper explores the integration of static analysis and additional context such as bug reports and design patterns into source code representations for deep learning models. We use the Abstract Syntax Tree-based Neural Network (ASTNN) method and augment it with additional context information obtained from bug reports and design patterns, creating an enriched source code representation that significantly enhances the performance of common software engineering tasks such as code classification and code clone detection. Utilizing existing open-source code data, our approach improves the representation and processing of source code, thereby improving task performance.
Submitted 14 February, 2024;
originally announced February 2024.
-
Generative AI for Pull Request Descriptions: Adoption, Impact, and Developer Interventions
Authors:
Tao Xiao,
Hideaki Hata,
Christoph Treude,
Kenichi Matsumoto
Abstract:
GitHub's Copilot for Pull Requests (PRs) is a promising service aiming to automate various developer tasks related to PRs, such as generating summaries of changes or providing complete walkthroughs with links to the relevant code. As this innovative technology gains traction in the Open Source Software (OSS) community, it is crucial to examine its early adoption and its impact on the development process. Additionally, it offers a unique opportunity to observe how developers respond when they disagree with the generated content. In our study, we employ a mixed-methods approach, blending quantitative analysis with qualitative insights, to examine 18,256 PRs in which parts of the descriptions were crafted by generative AI. Our findings indicate that: (1) Copilot for PRs, though in its infancy, is seeing a marked uptick in adoption. (2) PRs enhanced by Copilot for PRs require less review time and have a higher likelihood of being merged. (3) Developers using Copilot for PRs often complement the automated descriptions with their manual input. These results offer valuable insights into the growing integration of generative AI in software development.
Submitted 14 February, 2024;
originally announced February 2024.
-
Improving Automated Code Reviews: Learning from Experience
Authors:
Hong Yi Lin,
Patanamon Thongtanunam,
Christoph Treude,
Wachiraphan Charoenwet
Abstract:
Modern code review is a critical quality assurance process that is widely adopted in both industry and open source software environments. This process can help newcomers learn from the feedback of experienced reviewers; however, it often brings a large workload and stress to reviewers. To alleviate this burden, the field of automated code reviews aims to automate the process, teaching large language models to provide reviews on submitted code, just as a human would. A recent approach pre-trained and fine-tuned the code intelligent language model on a large-scale code review corpus. However, such techniques did not fully utilise quality reviews amongst the training data. Indeed, reviewers with a higher level of experience or familiarity with the code will likely provide deeper insights than the others. In this study, we set out to investigate whether higher-quality reviews can be generated from automated code review models that are trained based on an experience-aware oversampling technique. Through our quantitative and qualitative evaluation, we find that experience-aware oversampling can increase the correctness, level of information, and meaningfulness of reviews generated by the current state-of-the-art model without introducing new data. The results suggest that a vast amount of high-quality reviews are underutilised with current training strategies. This work sheds light on resource-efficient ways to boost automated code review models.
Submitted 6 February, 2024;
originally announced February 2024.
-
Encoding Version History Context for Better Code Representation
Authors:
Huy Nguyen,
Christoph Treude,
Patanamon Thongtanunam
Abstract:
With the exponential growth of AI tools that generate source code, understanding software has become crucial. When developers comprehend a program, they may refer to additional contexts to look for information, e.g. program documentation or historical code versions. Therefore, we argue that encoding this additional contextual information could also benefit code representation for deep learning. Recent papers incorporate contextual data (e.g. call hierarchy) into vector representation to address program comprehension problems. This motivates further studies to explore additional contexts, such as version history, to enhance models' understanding of programs. That is, insights from version history enable recognition of patterns in code evolution over time, recurring issues, and the effectiveness of past solutions. Our paper presents preliminary evidence of the potential benefit of encoding contextual information from the version history to predict code clones and perform code classification. We experiment with two representative deep learning models, ASTNN and CodeBERT, to investigate whether combining additional contexts with different aggregations may benefit downstream activities. The experimental result affirms the positive impact of combining version history into source code representation in all scenarios; however, to ensure the technique performs consistently, we need to conduct a holistic investigation on a larger code base using different combinations of contexts, aggregation, and models. Therefore, we propose a research agenda aimed at exploring various aspects of encoding additional context to improve code representation and its optimal utilisation in specific situations.
Submitted 6 February, 2024;
originally announced February 2024.
-
Going Viral: Case Studies on the Impact of Protestware
Authors:
Youmei Fan,
Dong Wang,
Supatsara Wattanakriengkrai,
Hathaichanok Damrongsiri,
Christoph Treude,
Hideaki Hata,
Raula Gaikovina Kula
Abstract:
Maintainers are now self-sabotaging their work in order to take political or economic stances, a practice referred to as "protestware". In this poster, we present our approach to understand how the discourse about such an attack went viral, how it is received by the community, and whether developers respond to the attack in a timely manner. We study two notable protestware cases, i.e., Colors.js and es5-ext, comparing with discussions of a typical security vulnerability as a baseline, i.e., Ua-parser, and perform a thematic analysis of more than two thousand protest-related posts to extract the different narratives when discussing protestware.
Submitted 29 January, 2024;
originally announced January 2024.
-
"My GitHub Sponsors profile is live!" Investigating the Impact of Twitter/X Mentions on GitHub Sponsors
Authors:
Youmei Fan,
Tao Xiao,
Hideaki Hata,
Christoph Treude,
Kenichi Matsumoto
Abstract:
GitHub Sponsors was launched in 2019, enabling donations to open-source software developers to provide financial support, as per GitHub's slogan: "Invest in the projects you depend on". However, a 2022 study on GitHub Sponsors found that only two-fifths of developers who were seeking sponsorship received a donation. The study found that, other than internal actions (such as offering perks to sponsors), developers had advertised their GitHub Sponsors profiles on social media, such as Twitter (also known as X). Therefore, in this work, we investigate the impact of tweets that contain links to GitHub Sponsors profiles on sponsorship, as well as their reception on Twitter/X. We further characterize these tweets to understand their context and find that (1) such tweets have the impact of increasing the number of sponsors acquired, (2) compared to other donation platforms such as Open Collective and Patreon, GitHub Sponsors has significantly fewer interactions but is more visible on Twitter/X, and (3) developers tend to contribute more to open-source software during the week of posting such tweets. Our findings are the first step toward investigating the impact of social media on obtaining funding to sustain open-source software.
Submitted 5 January, 2024;
originally announced January 2024.
-
APIDocBooster: An Extract-Then-Abstract Framework Leveraging Large Language Models for Augmenting API Documentation
Authors:
Chengran Yang,
Jiakun Liu,
Bowen Xu,
Christoph Treude,
Yunbo Lyu,
Junda He,
Ming Li,
David Lo
Abstract:
API documentation is often the most trusted resource for programming. Many approaches have been proposed to augment API documentation by summarizing complementary information from external resources such as Stack Overflow. Existing extractive-based summarization approaches excel in producing faithful summaries that accurately represent the source content without input length restrictions. Nevertheless, they suffer from inherent readability limitations. On the other hand, our empirical study on the abstractive-based summarization method, i.e., GPT-4, reveals that GPT-4 can generate coherent and concise summaries but presents limitations in terms of informativeness and faithfulness.
We introduce APIDocBooster, an extract-then-abstract framework that seamlessly fuses the advantages of both extractive (i.e., enabling faithful summaries without length limitation) and abstractive summarization (i.e., producing coherent and concise summaries). APIDocBooster consists of two stages: (1) Context-aware Sentence Section Classification (CSSC) and (2) UPdate SUMmarization (UPSUM). CSSC classifies API-relevant information collected from multiple sources into API documentation sections. UPSUM first generates extractive summaries distinct from the original API documentation and then generates abstractive summaries guided by extractive summaries through in-context learning.
To enable automatic evaluation of APIDocBooster, we construct the first dataset for API document augmentation. Our automatic evaluation results reveal that each stage in APIDocBooster outperforms its baselines by a large margin. Our human evaluation also demonstrates the superiority of APIDocBooster over GPT-4 and shows that it improves informativeness, relevance, and faithfulness by 13.89%, 15.15%, and 30.56%, respectively.
Submitted 10 January, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
"Add more config detail": A Taxonomy of Installation Instruction Changes
Authors:
Haoyu Gao,
Christoph Treude,
Mansooreh Zahedi
Abstract:
README files play an important role in providing installation-related instructions to software users and are widely used in open source software systems on platforms such as GitHub. However, these files often suffer from various documentation issues, leading to challenges in comprehension and potential errors in content. Despite their significance, there is a lack of systematic understanding regarding the documentation efforts invested in README files, especially in the context of installation-related instructions, which are crucial for users to start with a software project. To fill this research gap, we conducted a qualitative study, investigating 400 GitHub repositories with 1,163 README commits that focused on updates in installation-related sections. Our research revealed six major categories of changes in the README commits, namely pre-installation instructions, installation instructions, post-installation instructions, help information updates, document presentation, and external resource management. We further provide detailed insights into modification behaviours and offer examples of these updates. Based on our findings, we propose a README template tailored to cover the installation-related sections for documentation maintainers to reference when updating documents. We validate this template through an online survey, which shows that documentation readers generally find documents augmented with our template to be of better quality. We also provide recommendations to practitioners for maintaining their README files, as well as motivations for future research directions... (too long for arxiv)
Submitted 14 July, 2024; v1 submitted 5 December, 2023;
originally announced December 2023.
-
Toward Effective Secure Code Reviews: An Empirical Study of Security-Related Coding Weaknesses
Authors:
Wachiraphan Charoenwet,
Patanamon Thongtanunam,
Van-Thuan Pham,
Christoph Treude
Abstract:
Identifying security issues early is encouraged to reduce the latent negative impacts on software systems. Code review is a widely-used method that allows developers to manually inspect modified code, catching security issues during a software development cycle. However, existing code review studies often focus on known vulnerabilities, neglecting coding weaknesses, which can introduce real-world security issues that are more visible through code review. The practices of code reviews in identifying such coding weaknesses are not yet fully investigated.
To better understand this, we conducted an empirical case study in two large open-source projects, OpenSSL and PHP. Based on 135,560 code review comments, we found that reviewers raised security concerns in 35 out of 40 coding weakness categories. Surprisingly, some coding weaknesses related to past vulnerabilities, such as memory errors and resource management, were discussed less often than the vulnerabilities. Developers attempted to address raised security concerns in many cases (39%-41%), but a substantial portion was merely acknowledged (30%-36%), and some went unfixed due to disagreements about solutions (18%-20%). This highlights that coding weaknesses can slip through code review even when identified. Our findings suggest that reviewers can identify various coding weaknesses leading to security issues during code reviews. However, these results also reveal shortcomings in current code review practices, indicating the need for more effective mechanisms or support for increasing awareness of security issue management in code reviews.
Submitted 8 May, 2024; v1 submitted 27 November, 2023;
originally announced November 2023.
-
Application of Collaborative Learning Paradigms within Software Engineering Education: A Systematic Mapping Study
Authors:
Rita Garcia,
Christoph Treude,
Andrew Valentine
Abstract:
Collaboration is used in Software Engineering (SE) to develop software. Industry seeks SE graduates with collaboration skills to contribute to productive software development. SE educators can use Collaborative Learning (CL) to help students develop collaboration skills. This paper uses a Systematic Mapping Study (SMS) to examine the application of the CL educational theory in SE Education. The SMS identified 14 papers published between 2011 and 2022. We used qualitative analysis to classify the papers into four CL paradigms: Conditions, Effect, Interactions, and Computer-Supported Collaborative Learning (CSCL). We found a high interest in CSCL, with a shift in student interaction research to computer-mediated technologies. We discussed the 14 papers in depth, describing their goals and further analysing the CSCL research. Almost half the papers did not achieve the appropriate level of supporting evidence; however, calibrating the instruments presented could strengthen findings and support multiple CL paradigms, especially opportunities to learn at the social and community levels, where research was lacking. Though our results demonstrate limited CL educational theory applied in SE Education, we discuss future work to layer the theory on existing study designs for more effective teaching strategies.
Submitted 28 October, 2023;
originally announced October 2023.
-
Lessons from the Long Tail: Analysing Unsafe Dependency Updates across Software Ecosystems
Authors:
Supatsara Wattanakriengkrai,
Raula Gaikovina Kula,
Christoph Treude,
Kenichi Matsumoto
Abstract:
A risk in adopting third-party dependencies into an application is their potential to serve as a doorway for malicious code to be injected (most often unknowingly). While many initiatives from both industry and research communities focus on the most critical dependencies (i.e., those most depended upon within the ecosystem), little is known about whether the rest of the ecosystem suffers the same fate. Our vision is to promote and establish safer practises throughout the ecosystem. To motivate our vision, in this paper, we present preliminary data based on three representative samples from a population of 88,416 pull requests (PRs) and identify unsafe dependency updates (i.e., any pull request that risks being unsafe during runtime), which clearly shows that unsafe dependency updates are not limited to highly impactful libraries. To draw attention to the long tail, we propose a research agenda comprising six key research questions that further explore how to safeguard against these unsafe activities. This includes developing best practises to address unsafe dependency updates not only in top-tier libraries but throughout the entire ecosystem.
Submitted 8 September, 2023;
originally announced September 2023.
-
DevGPT: Studying Developer-ChatGPT Conversations
Authors:
Tao Xiao,
Christoph Treude,
Hideaki Hata,
Kenichi Matsumoto
Abstract:
This paper introduces DevGPT, a dataset curated to explore how software developers interact with ChatGPT, a prominent large language model (LLM). The dataset encompasses 29,778 prompts and responses from ChatGPT, including 19,106 code snippets, and is linked to corresponding software development artifacts such as source code, commits, issues, pull requests, discussions, and Hacker News threads. This comprehensive dataset is derived from shared ChatGPT conversations collected from GitHub and Hacker News, providing a rich resource for understanding the dynamics of developer interactions with ChatGPT, the nature of their inquiries, and the impact of these interactions on their work. DevGPT enables the study of developer queries, the effectiveness of ChatGPT in code generation and problem solving, and the broader implications of AI-assisted programming. By providing this dataset, the paper paves the way for novel research avenues in software engineering, particularly in understanding and improving the use of LLMs like ChatGPT by developers.
Submitted 13 February, 2024; v1 submitted 31 August, 2023;
originally announced September 2023.
-
Using the TypeScript compiler to fix erroneous Node.js snippets
Authors:
Brittany Reid,
Christoph Treude,
Markus Wagner
Abstract:
Most online code snippets do not run. This means that developers looking to reuse code from online sources must manually find and fix errors. We present an approach for automatically evaluating and correcting errors in Node.js code snippets: Node Code Correction (NCC). NCC leverages the ability of the TypeScript compiler to generate errors and inform code corrections through the combination of TypeScript's built-in codefixes, our own targeted fixes, and deletion of erroneous lines. Compared to existing approaches using linters, our findings suggest that NCC is capable of detecting a larger number of errors per snippet and more error types, and it is more efficient at fixing snippets. We find that 73.7% of the code snippets in NPM documentation have errors; with the use of NCC's corrections, this number was reduced to 25.1%. Our evaluation confirms that the use of the TypeScript compiler to inform code corrections is a promising strategy to aid in the reuse of code snippets from online sources.
Submitted 23 August, 2023;
originally announced August 2023.
-
Evaluating Transfer Learning for Simplifying GitHub READMEs
Authors:
Haoyu Gao,
Christoph Treude,
Mansooreh Zahedi
Abstract:
Software documentation captures detailed knowledge about a software product, e.g., code, technologies, and design. It plays an important role in the coordination of development teams and in conveying ideas to various stakeholders. However, software documentation can be hard to comprehend if it is written with jargon and complicated sentence structure. In this study, we explored the potential of text simplification techniques in the domain of software engineering to automatically simplify GitHub README files. We collected software-related pairs of GitHub README files consisting of 14,588 entries, aligned difficult sentences with their simplified counterparts, and trained a Transformer-based model to automatically simplify difficult versions. To mitigate the sparse and noisy nature of the software-related simplification dataset, we applied general text simplification knowledge to this field. Since many general-domain difficult-to-simple Wikipedia document pairs are already publicly available, we explored the potential of transfer learning by first training the model on the Wikipedia data and then fine-tuning it on the README data. Using automated BLEU scores and human evaluation, we compared the performance of different transfer learning schemes and the baseline models without transfer learning. The transfer learning model using the best checkpoint trained on a general topic corpus achieved the best performance, with a BLEU score of 34.68 and statistically significantly higher human annotation scores than the other schemes and baselines. We conclude that transfer learning is a promising direction to circumvent the lack of data and the drift style problem in simplifying software README files, and that it achieves a better trade-off between simplification and preservation of meaning.
Submitted 19 August, 2023;
originally announced August 2023.
-
Visually Analyzing Company-wide Software Service Dependencies: An Industrial Case Study
Authors:
Sebastian Baltes,
Brian Pfitzmann,
Thomas Kowark,
Christoph Treude,
Fabian Beck
Abstract:
Managing dependencies between software services is a crucial task for any company operating cloud applications. Visualizations can help to understand and maintain these complex dependencies. In this paper, we present a force-directed service dependency visualization and filtering tool that has been developed and used within SAP. The tool's use cases include guiding service retirement as well as understanding service deployment landscapes and their relationship to the company's organizational structure. We report how we built and adapted the tool under strict time constraints to address the requirements of our users. We further share insights on how we enabled internal adoption. For us, starting with a minimal viable visualization and then quickly responding to user feedback was essential for convincing users of the tool's value. The final version of the tool enabled users to visually understand company-wide service consumption, supporting data-driven decision making.
Submitted 22 August, 2023; v1 submitted 18 August, 2023;
originally announced August 2023.
-
Addressing Compiler Errors: Stack Overflow or Large Language Models?
Authors:
Patricia Widjojo,
Christoph Treude
Abstract:
Compiler error messages serve as an initial resource for programmers dealing with compilation errors. However, previous studies indicate that they often lack sufficient targeted information to resolve code issues. Consequently, programmers typically rely on their own research to fix errors. Historically, Stack Overflow has been the primary resource for such information, but recent advances in large language models offer alternatives. This study systematically examines 100 compiler error messages from three sources to determine the most effective approach for programmers encountering compiler errors. Factors considered include Stack Overflow search methods and the impact of model version and prompt phrasing when using large language models. The results reveal that GPT-4 outperforms Stack Overflow in explaining compiler error messages, the effectiveness of adding code snippets to Stack Overflow searches depends on the search method, and results for Stack Overflow differ significantly between Google and StackExchange API searches. Furthermore, GPT-4 surpasses GPT-3.5, with "How to fix" prompts yielding superior outcomes to "What does this error mean" prompts. These results offer valuable guidance for programmers seeking assistance with compiler error messages, underscoring the transformative potential of advanced large language models like GPT-4 in debugging and opening new avenues of exploration for researchers in AI-assisted programming.
Submitted 20 July, 2023;
originally announced July 2023.
-
Wait, wasn't that code here before? Detecting Outdated Software Documentation
Authors:
Wen Siang Tan,
Markus Wagner,
Christoph Treude
Abstract:
Encountering outdated documentation is not a rare occurrence for developers and users in the software engineering community. To ensure that software documentation is up-to-date, developers often have to manually check whether the documentation needs to be updated whenever changes are made to the source code. In our previous work, we proposed an approach to automatically detect outdated code element references in software repositories and found that more than a quarter of the 1000 most popular projects on GitHub contained at least one outdated reference. In this paper, we present a GitHub Actions tool that builds on our previous work's approach that GitHub developers can configure to automatically scan for outdated code element references in their GitHub project's documentation whenever a pull request is submitted.
Submitted 9 July, 2023;
originally announced July 2023.
-
Promises and Perils of Mining Software Package Ecosystem Data
Authors:
Raula Gaikovina Kula,
Katsuro Inoue,
Christoph Treude
Abstract:
The use of third-party packages is becoming increasingly popular and has led to the emergence of large software package ecosystems with a maze of inter-dependencies. Since the reliance on these ecosystems enables developers to reduce development effort and increase productivity, it has attracted the interest of researchers: understanding the infrastructure and dynamics of package ecosystems has given rise to approaches for better code reuse, automated updates, and the avoidance of vulnerabilities, to name a few examples. But the reality of these ecosystems also poses challenges to software engineering researchers, such as: How do we obtain the complete network of dependencies along with the corresponding versioning information? What are the boundaries of these package ecosystems? How do we consistently detect dependencies that are declared but not used? How do we consistently identify developers within a package ecosystem? How much of the ecosystem do we need to understand to analyse a single component? How well do our approaches generalise across different programming languages and package ecosystems? In this chapter, we review promises and perils of mining the rich data related to software package ecosystems available to software engineering researchers.
Submitted 28 May, 2023;
originally announced June 2023.
-
Ethical Considerations Towards Protestware
Authors:
Marc Cheong,
Raula Gaikovina Kula,
Christoph Treude
Abstract:
A key drawback to using an Open Source third-party library is the risk of introducing malicious attacks. In recent times, these threats have taken a new form, when maintainers turn their Open Source libraries into protestware. This is defined as software containing political messages delivered through these libraries, which can either be malicious or benign. Since developers are willing to freely open up their software to these libraries, much trust and responsibility are placed on the maintainers to ensure that the library does what it promises to do. Using different frameworks commonly used in AI ethics, we illustrate how an open-source maintainer's decision to protest is influenced by different stakeholders (viz., their membership in the OSS community, their personal views, financial motivations, social status, and moral viewpoints), making protestware a multifaceted and intricate matter.
Submitted 4 January, 2024; v1 submitted 27 May, 2023;
originally announced June 2023.
-
18 Million Links in Commit Messages: Purpose, Evolution, and Decay
Authors:
Tao Xiao,
Sebastian Baltes,
Hideaki Hata,
Christoph Treude,
Raula Gaikovina Kula,
Takashi Ishio,
Kenichi Matsumoto
Abstract:
Commit messages contain diverse and valuable types of knowledge in all aspects of software maintenance and evolution. Links are an example of such knowledge. Previous work on "9.6 million links in source code comments" showed that links are prone to decay, become outdated, and lack bidirectional traceability. We conducted a large-scale study of 18,201,165 links from commits in 23,110 GitHub repositories to investigate whether they suffer the same fate. Results show that referencing external resources is prevalent and that the most frequent domains other than github.com are the external domains of Stack Overflow and Google Code. Similarly, links serve as source code context to commit messages, with inaccessible links being frequent. Although repeatedly referencing links is rare (4%), 14% of links that are prone to evolve become unavailable over time; e.g., tutorials or articles and software homepages become unavailable over time. Furthermore, we find that 70% of the distinct links suffer from decay; the domains that occur the most frequently are related to Subversion repositories. We summarize that links in commits share the same fate as links in code, opening up avenues for future work.
Submitted 25 May, 2023;
originally announced May 2023.
-
The Impact of a Continuous Integration Service on the Delivery Time of Merged Pull Requests
Authors:
João Helis Bernardo,
Daniel Alencar da Costa,
Uirá Kulesza,
Christoph Treude
Abstract:
Continuous Integration (CI) is a software development practice that builds and tests software frequently (e.g., at every push). One main motivator to adopt CI is the potential to deliver software functionalities more quickly than not using CI. However, there is little empirical evidence to support that CI helps projects deliver software functionalities more quickly. Through the analysis of 162,653 pull requests (PRs) of 87 GitHub projects, we empirically study whether adopting a CI service (TravisCI) can quicken the time to deliver merged PRs. We complement our quantitative study by analyzing 450 survey responses from participants of 73 software projects. Our results reveal that adopting a CI service may not necessarily quicken the delivery of merged PRs. Instead, the pivotal benefit of a CI service is to improve the decision making on PR submissions, without compromising the quality or overloading the project's reviewers and maintainers. The automation provided by CI and the boost in developers' confidence are key advantages of adopting a CI service. Furthermore, open-source projects planning to attract and retain developers should consider the use of a CI service in their project, since CI is perceived to lower the contribution barrier while making contributors feel more confident and engaged in the project.
Submitted 25 May, 2023;
originally announced May 2023.
-
Barriers and Self-Efficacy: A Large-Scale Study on the Impact of OSS Courses on Student Perceptions
Authors:
Larissa Salerno,
Simone de França Tonhão,
Igor Steinmacher,
Christoph Treude
Abstract:
Open source software (OSS) development offers a unique opportunity for students in Software Engineering to experience and participate in large-scale software development; however, the impact of such courses on students' self-efficacy and the challenges faced by students are not well understood. This paper aims to address this gap by analyzing data from multiple instances of OSS development courses at universities in different countries and reporting on how students' self-efficacy changed as a result of taking the course, as well as the barriers and challenges faced by students.
Submitted 3 July, 2023; v1 submitted 28 April, 2023;
originally announced April 2023.
-
We Live in a Society: Motivators for Contributions in an OSS Ecosystem
Authors:
Supatsara Wattanakriengkrai,
Raula Gaikovina Kula,
Christoph Treude,
Kenichi Matsumoto
Abstract:
Due to the increasing number of attacks targeting open source library ecosystems, assisting maintainers has become a top priority. This is especially important since maintainers are usually overworked. Although the motivation of Open Source developers has been widely studied, the extent to which maintainers assist libraries that they depend on is unknown. Surveying NPM developers, our early results indicate a difference in motivation between maintaining their own library (i.e., more person driven), as opposed to professional factors (i.e., focus on skills and expertise) when contributing to the software ecosystem. Finally, our thematic analysis shows different motivations and barriers developers face when contributing to the ecosystem. These results show that developers have different motivations and barriers depending on the role they play when making contributions to the ecosystem.
Submitted 12 April, 2023;
originally announced April 2023.
-
Understanding the Role of Images on Stack Overflow
Authors:
Dong Wang,
Tao Xiao,
Christoph Treude,
Raula Gaikovina Kula,
Hideaki Hata,
Yasutaka Kamei
Abstract:
Images are increasingly being shared by software developers in diverse channels including question-and-answer forums like Stack Overflow. Although prior work has pointed out that these images are meaningful and provide complementary information compared to their associated text, how images are used to support questions is empirically unknown. To address this knowledge gap, in this paper we specifically conduct an empirical study to investigate (I) the characteristics of images, (II) the extent to which images are used in different question types, and (III) the role of images on receiving answers. Our results first show that user interface is the most common image content and undesired output is the most frequent purpose for sharing images. Moreover, these images essentially facilitate the understanding of 68% of sampled questions. Second, we find that discrepancy questions are relatively more frequent compared to those without images, but there are no significant differences observed in description length in all types of questions. Third, the quantitative results statistically validate that questions with images are more likely to receive accepted answers, but do not speed up the time to receive answers. Our work demonstrates the crucial role that images play by approaching the topic from a new angle and lays the foundation for future opportunities to use images to assist in tasks like generating questions and identifying question-relatedness.
Submitted 27 March, 2023;
originally announced March 2023.
-
Applying Information Theory to Software Evolution
Authors:
Adriano Torres,
Sebastian Baltes,
Christoph Treude,
Markus Wagner
Abstract:
Although information theory has found success in disciplines, the literature on its applications to software evolution is limited. We are still missing artifacts that leverage the data and tooling available to measure how the information content of a project can be a proxy for its complexity. In this work, we explore two definitions of entropy, one structural and one textual, and apply them to the historical progression of the commit history of 25 open source projects. We produce evidence that they generally are highly correlated. We also observed that they display weak and unstable correlations with other complexity metrics. Our preliminary investigation of outliers shows an unexpectedly high frequency of events where there is considerable change in the information content of the project, suggesting that such outliers may inform a definition of surprisal.
Submitted 26 April, 2023; v1 submitted 23 March, 2023;
originally announced March 2023.
-
Stop Words for Processing Software Engineering Documents: Do they Matter?
Authors:
Yaohou Fan,
Chetan Arora,
Christoph Treude
Abstract:
Stop words, which are considered non-predictive, are often eliminated in natural language processing tasks. However, the definition of uninformative vocabulary is vague, so most algorithms use general knowledge-based stop lists to remove stop words. There is an ongoing debate among academics about the usefulness of stop word elimination, especially in domain-specific settings. In this work, we investigate the usefulness of stop word removal in a software engineering context. To do this, we replicate and experiment with three software engineering research tools from related work. Additionally, we construct a corpus of software engineering domain-related text from 10,000 Stack Overflow questions and identify 200 domain-specific stop words using traditional information-theoretic methods. Our results show that the use of domain-specific stop words significantly improved the performance of research tools compared to the use of a general stop list and that 17 out of 19 evaluation measures showed better performance.
Online appendix: https://zenodo.org/record/7865748
Submitted 12 June, 2023; v1 submitted 18 March, 2023;
originally announced March 2023.
-
She Elicits Requirements and He Tests: Software Engineering Gender Bias in Large Language Models
Authors:
Christoph Treude,
Hideaki Hata
Abstract:
Implicit gender bias in software development is a well-documented issue, such as the association of technical roles with men. To address this bias, it is important to understand it in more detail. This study uses data mining techniques to investigate the extent to which 56 tasks related to software development, such as assigning GitHub issues and testing, are affected by implicit gender bias embedded in large language models. We systematically translated each task from English into a genderless language and back, and investigated the pronouns associated with each task. Based on translating each task 100 times in different permutations, we identify a significant disparity in the gendered pronoun associations with different tasks. Specifically, requirements elicitation was associated with the pronoun "he" in only 6% of cases, while testing was associated with "he" in 100% of cases. Additionally, tasks related to helping others had a 91% association with "he" while the same association for tasks related to asking coworkers was only 52%. These findings reveal a clear pattern of gender bias related to software development tasks and have important implications for addressing this issue both in the training of large language models and in broader society.
Submitted 17 March, 2023;
originally announced March 2023.
-
Towards Understanding the Open Source Interest in Gender-Related GitHub Projects
Authors:
Rita Garcia,
Christoph Treude,
Wendy La
Abstract:
The open-source community uses the GitHub platform to exchange and share software applications and services of interest. This paper aims to identify the open-source community's interest in gender-related projects on GitHub. Our findings create research opportunities and identify resources by the open-source community that promote diversity, equity, and inclusion. We use data mining to identify GitHub projects that focus on gender-related topics. We apply quantitative and qualitative methodologies to examine the projects' attributes and to classify them within a gender social structure and a gender bias taxonomy. We aim to understand the open-source community's efforts and interests in gender topics through active projects. In this paper, we report on a preponderance of projects focusing on specific gender topics and identify those with a narrow focus. We examine projects focusing on gender bias and how they address this non-inclusive behaviour. Results show a propensity of GitHub projects focusing on recognising and detecting an individual's gender and a dearth of projects concentrating on the cultural expectations placed on women and men. In the gender bias domain, the projects mainly focus on occupational biases. These findings raise opportunities to address the limited focus of GitHub on gender-related topics through developing projects that mitigate exclusive behaviours.
Submitted 16 March, 2023;
originally announced March 2023.
-
Socialz: Multi-Feature Social Fuzz Testing
Authors:
Francisco Zanartu,
Christoph Treude,
Markus Wagner
Abstract:
Online social networks have become an integral aspect of our daily lives and play a crucial role in shaping our relationships with others. However, bugs and glitches, even minor ones, can cause anything from frustrating problems to serious data leaks that can have far-reaching impacts on millions of users. To mitigate these risks, fuzz testing, a method of testing with randomised inputs, can provide increased confidence in the correct functioning of a social network. However, implementing traditional fuzz testing methods can be prohibitively difficult or impractical for programmers outside of the social network's development team. To tackle this challenge, we present Socialz, a novel approach to social fuzz testing that (1) characterises real users of a social network, (2) diversifies their interaction using evolutionary computation across multiple, non-trivial features, and (3) collects performance data as these interactions are executed. With Socialz, we aim to put social testing tools in everybody's hands, thereby improving the reliability and security of social networks used worldwide. In our study, we came across (1) one known limitation of the current GitLab CE and (2) 6,907 errors, of which 40.16% are beyond our debugging skills.
Submitted 4 July, 2024; v1 submitted 16 February, 2023;
originally announced February 2023.
-
Overcoming Challenges in DevOps Education through Teaching Methods
Authors:
Samuel Ferino,
Marcelo Fernandes,
Elder Cirilo,
Lucas Agnez,
Bruno Batista,
Uirá Kulesza,
Eduardo Aranha,
Christoph Treude
Abstract:
DevOps is a set of practices that deals with coordination between development and operation teams and ensures rapid and reliable new software releases that are essential in industry. DevOps education assumes the vital task of preparing new professionals in these practices using appropriate teaching methods. However, there are insufficient studies investigating teaching methods in DevOps. We performed an analysis based on interviews to identify teaching methods and their relationship with DevOps educational challenges. Our findings show that project-based learning and collaborative learning are emerging as the most relevant teaching methods.
Submitted 10 February, 2023;
originally announced February 2023.
-
Navigating Complexity in Software Engineering: A Prototype for Comparing GPT-n Solutions
Authors:
Christoph Treude
Abstract:
Navigating the diverse solution spaces of non-trivial software engineering tasks requires a combination of technical knowledge, problem-solving skills, and creativity. With multiple possible solutions available, each with its own set of trade-offs, it is essential for programmers to evaluate the various options and select the one that best suits the specific requirements and constraints of a project. Whether it is choosing from a range of libraries, weighing the pros and cons of different architecture and design solutions, or finding unique ways to fulfill user requirements, the ability to think creatively is crucial for making informed decisions that will result in efficient and effective software. However, the interfaces of current chatbot tools for programmers, such as OpenAI's ChatGPT or GitHub Copilot, are optimized for presenting a single solution, even for complex queries. While other solutions can be requested, they are not displayed by default and are not intuitive to access. In this paper, we present our work-in-progress prototype "GPTCompare", which allows programmers to visually compare multiple source code solutions generated by GPT-n models for the same programming-related query by highlighting their similarities and differences.
Submitted 28 January, 2023;
originally announced January 2023.
-
Detecting Outdated Code Element References in Software Repository Documentation
Authors:
Wen Siang Tan,
Markus Wagner,
Christoph Treude
Abstract:
Outdated documentation is a pervasive problem in software development, preventing effective use of software, and misleading users and developers alike. We posit that one possible reason why documentation becomes out of sync so easily is that developers are unaware of when their source code modifications render the documentation obsolete. Ensuring that the documentation is always in sync with the source code takes considerable effort, especially for large codebases. To address this situation, we propose an approach that can automatically detect code element references that survive in the documentation after all source code instances have been deleted. In this work, we analysed over 3,000 GitHub projects and found that most projects contain at least one outdated code element reference at some point in their history. We submitted GitHub issues to real-world projects containing outdated references detected by our approach, some of which have already led to documentation fixes. As an initiative toward keeping documentation in software repositories up-to-date, we have made our implementation available for developers to scan their GitHub projects for outdated code element references.
Submitted 2 December, 2022;
originally announced December 2022.
-
An Empirical Study of Package Management Issues via Stack Overflow
Authors:
Syful Islam,
Raula Gaikovina Kula,
Christoph Treude,
Bodin Chinthanet,
Takashi Ishio,
Kenichi Matsumoto
Abstract:
The package manager (PM) is crucial to most technology stacks, acting as a broker to ensure that a verified dependency package is correctly installed, configured, or removed from an application. Diversity in technology stacks has led to dozens of PMs with various features. While our recent study indicates that package management features of PM are related to end-user experiences, it is unclear what those issues are and what information is required to resolve them. In this paper, we have investigated PM issues faced by end-users through an empirical study of content on Stack Overflow (SO). We carried out a qualitative analysis of 1,131 questions and their accepted answer posts for three popular PMs (i.e., Maven, npm, and NuGet) to identify issue types, underlying causes, and their resolutions. Our results confirm that end-users struggle with PM tool usage (approximately 64-72%). We observe that most issues are raised by end-users due to lack of instructions and error messages from PM tools. In terms of issue resolution, we find that external link sharing is the most common practice to resolve PM issues. Additionally, we observe that links pointing to useful resources (i.e., official documentation websites, tutorials, etc.) are most frequently shared, indicating the potential for tool support and the ability to provide relevant information for PM end-users.
Submitted 22 November, 2022; v1 submitted 21 November, 2022;
originally announced November 2022.