
Showing 1–50 of 115 results for author: Treude, C

  1. An Empirical Study of API Misuses of Data-Centric Libraries

    Authors: Akalanka Galappaththi, Sarah Nadi, Christoph Treude

    Abstract: Developers rely on third-party library Application Programming Interfaces (APIs) when developing software. However, libraries typically come with assumptions and API usage constraints, whose violation results in API misuse. API misuses may result in crashes or incorrect behavior. Even though API misuse is a well-studied area, a recent study of API misuse of deep learning libraries showed that the…

    Submitted 28 August, 2024; originally announced August 2024.

  2. arXiv:2408.10577  [pdf, other]

    cs.SE

    Optimizing Large Language Model Hyperparameters for Code Generation

    Authors: Chetan Arora, Ahnaf Ibn Sayeed, Sherlock Licorish, Fanyu Wang, Christoph Treude

    Abstract: Large Language Models (LLMs), such as GPT models, are increasingly used in software engineering for various tasks, such as code generation, requirements management, and debugging. While automating these tasks has garnered significant attention, a systematic study on the impact of varying hyperparameters on code generation outcomes remains unexplored. This study aims to assess LLMs' code generation…

    Submitted 20 August, 2024; originally announced August 2024.

  3. arXiv:2408.05534  [pdf, other]

    cs.SE cs.HC cs.LG

    Can LLMs Replace Manual Annotation of Software Engineering Artifacts?

    Authors: Toufique Ahmed, Premkumar Devanbu, Christoph Treude, Michael Pradel

    Abstract: Experimental evaluations of software engineering innovations, e.g., tools and processes, often include human-subject studies as a component of a multi-pronged strategy to obtain greater generalizability of the findings. However, human-subject studies in our field are challenging, due to the cost and difficulty of finding and employing suitable subjects, ideally, professional programmers with varyi…

    Submitted 10 August, 2024; originally announced August 2024.

  4. arXiv:2407.12241  [pdf, other]

    cs.SE

    An Empirical Study of Static Analysis Tools for Secure Code Review

    Authors: Wachiraphan Charoenwet, Patanamon Thongtanunam, Van-Thuan Pham, Christoph Treude

    Abstract: Early identification of security issues in software development is vital to minimize their unanticipated impacts. Code review is a widely used manual analysis method that aims to uncover security issues along with other coding issues in software projects. While some studies suggest that automated static application security testing tools (SASTs) could enhance security issue identification, there i…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted by ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA) 2024

  5. arXiv:2407.00862  [pdf, other]

    cs.SE

    Contributing Back to the Ecosystem: A User Survey of NPM Developers

    Authors: Supatsara Wattanakriengkrai, Christoph Treude, Raula Gaikovina Kula

    Abstract: With the rise of the library ecosystem (such as NPM for JavaScript and PyPI for Python), a developer has access to a multitude of library packages that they can adopt as dependencies into their application. Prior work has found that these ecosystems form a complex web of dependencies, where sustainability issues of a single library can have widespread network effects. Due to the Open Source Softwar…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: Accepted at SERA2024

  6. arXiv:2406.18071  [pdf, other]

    cs.SE

    Documenting Ethical Considerations in Open Source AI Models

    Authors: Haoyu Gao, Mansooreh Zahedi, Christoph Treude, Sarita Rosenstock, Marc Cheong

    Abstract: Background: The development of AI-enabled software heavily depends on AI model documentation, such as model cards, due to different domain expertise between software engineers and model developers. From an ethical standpoint, AI model documentation conveys critical information on ethical considerations along with mitigation strategies for downstream developers to ensure the delivery of ethically c…

    Submitted 2 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: This paper is accepted by 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM'24)

  7. arXiv:2406.11362  [pdf, other]

    cs.SE

    Characterising Contributions that Coincide with Vulnerability Mitigation in NPM Libraries

    Authors: Ruksit Rojpaisarnkit, Hathaichanok Damrongsiri, Christoph Treude, Ali Ouni, Raula Gaikovina Kula

    Abstract: With the urgent need to secure supply chains among Open Source libraries, attention has focused on mitigating vulnerabilities detected in these libraries. Although awareness has improved recently, most studies still report delays in the mitigation process. This suggests that developers still have to deal with other contributions that occur during the period of fixing vulnerabilities, such as coinc…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 6 pages, 3 figures, 3 tables, 22nd IEEE/ACIS International Conference on Software Engineering, Management and Applications (SERA 2024)

    ACM Class: D.2.7; D.2.9

  8. arXiv:2406.08228  [pdf, ps, other]

    cs.SE

    Qualitative Data Analysis in Software Engineering: Techniques and Teaching Insights

    Authors: Christoph Treude

    Abstract: Software repositories are rich sources of qualitative artifacts, including source code comments, commit messages, issue descriptions, and documentation. These artifacts offer many interesting insights when analyzed through quantitative methods, as outlined in the chapter on mining software repositories. This chapter shifts the focus towards interpreting these artifacts using various qualitative da…

    Submitted 12 June, 2024; originally announced June 2024.

  9. Prioritising GitHub Priority Labels

    Authors: James Caddy, Christoph Treude

    Abstract: Communities on GitHub often use issue labels as a way of triaging issues by assigning them priority ratings based on how urgently they should be addressed. The labels used are determined by the repository contributors and not standardised by GitHub. This makes it difficult for priority-related reasoning across repositories for both researchers and contributors. Previous work shows interest in how…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 4 pages, 5 tables, 2 figures, appearing in PROMISE 2024

  10. arXiv:2405.01565  [pdf, other]

    cs.SE

    The Role of Code Proficiency in the Era of Generative AI

    Authors: Gregorio Robles, Christoph Treude, Jesus M. Gonzalez-Barahona, Raula Gaikovina Kula

    Abstract: At the current pace of technological advancements, Generative AI models, including both Large Language Models and Large Multi-modal Models, are becoming integral to the developer workspace. However, challenges emerge due to the 'black box' nature of many of these models, where the processes behind their outputs are not transparent. This position paper advocates for a 'white box' approach to these…

    Submitted 8 April, 2024; originally announced May 2024.

    Comments: submitted to Software Engineering 2030

  11. arXiv:2404.18677  [pdf, other]

    cs.SE

    Towards the First Code Contribution: Processes and Information Needs

    Authors: Christoph Treude, Marco A. Gerosa, Igor Steinmacher

    Abstract: Newcomers to a software project must overcome many barriers before they can successfully place their first code contribution, and they often struggle to find information that is relevant to them. In this work, we argue that much of the information needed by newcomers already exists, albeit scattered among many different sources, and that many barriers can be addressed by automatically identifying,…

    Submitted 29 April, 2024; originally announced April 2024.

  12. arXiv:2404.14637  [pdf, other]

    cs.SE

    Open Source Software Development Tool Installation: Challenges and Strategies For Novice Developers

    Authors: Larissa Salerno, Christoph Treude, Patanamon Thongtanunam

    Abstract: As the world of technology advances, so do the tools that software developers use to create new programs. In recent years, software development tools have become more popular, allowing developers to work more efficiently and produce higher-quality software. Still, installing such tools can be challenging for novice developers at the early stage of their careers, as they may face challenges, such a…

    Submitted 22 April, 2024; originally announced April 2024.

  13. arXiv:2404.05489  [pdf, other]

    cs.SE

    The Impact of Sanctions on GitHub Developers and Activities

    Authors: Youmei Fan, Ani Hovhannisyan, Hideaki Hata, Christoph Treude, Raula Gaikovina Kula

    Abstract: The GitHub platform has fueled the creation of truly global software, enabling contributions from developers across various geographical regions of the world. As software becomes more entwined with global politics and social regulations, it becomes similarly subject to government sanctions. In 2019, GitHub restricted access to certain services for users in specific locations but rolled back these…

    Submitted 8 April, 2024; originally announced April 2024.

  14. arXiv:2404.04834  [pdf, ps, other]

    cs.SE

    LLM-Based Multi-Agent Systems for Software Engineering: Vision and the Road Ahead

    Authors: Junda He, Christoph Treude, David Lo

    Abstract: Integrating Large Language Models (LLMs) into autonomous agents marks a significant shift in the research landscape by offering cognitive abilities competitive to human planning and reasoning. This paper envisions the evolution of LLM-based Multi-Agent (LMA) systems in addressing complex and multi-faceted software engineering challenges. LMA systems introduce numerous benefits, including enhanced r…

    Submitted 7 April, 2024; originally announced April 2024.

  15. Creative and Correct: Requesting Diverse Code Solutions from AI Foundation Models

    Authors: Scott Blyth, Markus Wagner, Christoph Treude

    Abstract: AI foundation models have the capability to produce a wide array of responses to a single prompt, a feature that is highly beneficial in software engineering to generate diverse code solutions. However, this advantage introduces a significant trade-off between diversity and correctness. In software engineering tasks, diversity is key to exploring design spaces and fostering creativity, but the pra…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 4 pages, FORGE 2024

    ACM Class: D.2.3

    Journal ref: AI Foundation Models and Software Engineering (FORGE '24), April 14, 2024, Lisbon, Portugal

  16. The Impact Of Bug Localization Based on Crash Report Mining: A Developers' Perspective

    Authors: Marcos Medeiros, Uirá Kulesza, Roberta Coelho, Rodrigo Bonifácio, Christoph Treude, Eiji Adachi

    Abstract: Developers often use crash reports to understand the root cause of bugs. However, locating the buggy source code snippet from such information is a challenging task, mainly when the log database contains many crash reports. To mitigate this issue, recent research has proposed and evaluated approaches for grouping crash report data and using stack trace information to locate bugs. The effectiveness…

    Submitted 15 March, 2024; originally announced March 2024.

  17. Smart HPA: A Resource-Efficient Horizontal Pod Auto-scaler for Microservice Architectures

    Authors: Hussain Ahmad, Christoph Treude, Markus Wagner, Claudia Szabo

    Abstract: Microservice architectures have gained prominence in both academia and industry, offering enhanced agility, reusability, and scalability. To simplify scaling operations in microservice architectures, container orchestration platforms such as Kubernetes feature Horizontal Pod Auto-scalers (HPAs) designed to adjust the resources of microservices to accommodate fluctuating workloads. However, existin…

    Submitted 26 February, 2024; originally announced March 2024.

    Journal ref: 2024 IEEE 21st International Conference on Software Architecture (ICSA)

  18. arXiv:2402.09557  [pdf]

    cs.SE

    Enhancing Source Code Representations for Deep Learning with Static Analysis

    Authors: Xueting Guan, Christoph Treude

    Abstract: Deep learning techniques applied to program analysis tasks such as code classification, summarization, and bug detection have seen widespread interest. Traditional approaches, however, treat programming source code as natural language text, which may neglect significant structural or semantic details. Additionally, most current methods of representing source code focus solely on the code, without…

    Submitted 14 February, 2024; originally announced February 2024.

  19. Generative AI for Pull Request Descriptions: Adoption, Impact, and Developer Interventions

    Authors: Tao Xiao, Hideaki Hata, Christoph Treude, Kenichi Matsumoto

    Abstract: GitHub's Copilot for Pull Requests (PRs) is a promising service aiming to automate various developer tasks related to PRs, such as generating summaries of changes or providing complete walkthroughs with links to the relevant code. As this innovative technology gains traction in the Open Source Software (OSS) community, it is crucial to examine its early adoption and its impact on the development p…

    Submitted 14 February, 2024; originally announced February 2024.

  20. Improving Automated Code Reviews: Learning from Experience

    Authors: Hong Yi Lin, Patanamon Thongtanunam, Christoph Treude, Wachiraphan Charoenwet

    Abstract: Modern code review is a critical quality assurance process that is widely adopted in both industry and open source software environments. This process can help newcomers learn from the feedback of experienced reviewers; however, it often brings a large workload and stress to reviewers. To alleviate this burden, the field of automated code reviews aims to automate the process, teaching large langua…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: Accepted by the 21st International Conference on Mining Software Repositories (MSR 24)

  21. Encoding Version History Context for Better Code Representation

    Authors: Huy Nguyen, Christoph Treude, Patanamon Thongtanunam

    Abstract: With the exponential growth of AI tools that generate source code, understanding software has become crucial. When developers comprehend a program, they may refer to additional contexts to look for information, e.g. program documentation or historical code versions. Therefore, we argue that encoding this additional contextual information could also benefit code representation for deep learning. Re…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 5 pages (plus 1 for references), 1 figure, 3 tables, paper was accepted to 21st International Conference on Mining Software Repositories (MSR 2024)

  22. arXiv:2401.16715  [pdf, ps, other]

    cs.SE

    Going Viral: Case Studies on the Impact of Protestware

    Authors: Youmei Fan, Dong Wang, Supatsara Wattanakriengkrai, Hathaichanok Damrongsiri, Christoph Treude, Hideaki Hata, Raula Gaikovina Kula

    Abstract: Maintainers are now self-sabotaging their work in order to take political or economic stances, a practice referred to as "protestware". In this poster, we present our approach to understand how the discourse about such an attack went viral, how it is received by the community, and whether developers respond to the attack in a timely manner. We study two notable protestware cases, i.e., Colors.js a…

    Submitted 29 January, 2024; originally announced January 2024.

  23. arXiv:2401.02755  [pdf, other]

    cs.SE

    "My GitHub Sponsors profile is live!" Investigating the Impact of Twitter/X Mentions on GitHub Sponsors

    Authors: Youmei Fan, Tao Xiao, Hideaki Hata, Christoph Treude, Kenichi Matsumoto

    Abstract: GitHub Sponsors was launched in 2019, enabling donations to open-source software developers to provide financial support, as per GitHub's slogan: "Invest in the projects you depend on". However, a 2022 study on GitHub Sponsors found that only two-fifths of developers who were seeking sponsorship received a donation. The study found that, other than internal actions (such as offering perks to spons…

    Submitted 5 January, 2024; originally announced January 2024.

  24. arXiv:2312.10934  [pdf, other]

    cs.SE

    APIDocBooster: An Extract-Then-Abstract Framework Leveraging Large Language Models for Augmenting API Documentation

    Authors: Chengran Yang, Jiakun Liu, Bowen Xu, Christoph Treude, Yunbo Lyu, Junda He, Ming Li, David Lo

    Abstract: API documentation is often the most trusted resource for programming. Many approaches have been proposed to augment API documentation by summarizing complementary information from external resources such as Stack Overflow. Existing extractive-based summarization approaches excel in producing faithful summaries that accurately represent the source content without input length restrictions. Neverthe…

    Submitted 10 January, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  25. arXiv:2312.03250  [pdf, other]

    cs.SE

    "Add more config detail": A Taxonomy of Installation Instruction Changes

    Authors: Haoyu Gao, Christoph Treude, Mansooreh Zahedi

    Abstract: README files play an important role in providing installation-related instructions to software users and are widely used in open source software systems on platforms such as GitHub. However, these files often suffer from various documentation issues, leading to challenges in comprehension and potential errors in content. Despite their significance, there is a lack of systematic understanding regar…

    Submitted 14 July, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: under submission to IEEE Transactions on Software Engineering

  26. arXiv:2311.16396  [pdf, other]

    cs.SE

    Toward Effective Secure Code Reviews: An Empirical Study of Security-Related Coding Weaknesses

    Authors: Wachiraphan Charoenwet, Patanamon Thongtanunam, Van-Thuan Pham, Christoph Treude

    Abstract: Identifying security issues early is encouraged to reduce the latent negative impacts on software systems. Code review is a widely-used method that allows developers to manually inspect modified code, catching security issues during a software development cycle. However, existing code review studies often focus on known vulnerabilities, neglecting coding weaknesses, which can introduce real-world…

    Submitted 8 May, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

  27. Application of Collaborative Learning Paradigms within Software Engineering Education: A Systematic Mapping Study

    Authors: Rita Garcia, Christoph Treude, Andrew Valentine

    Abstract: Collaboration is used in Software Engineering (SE) to develop software. Industry seeks SE graduates with collaboration skills to contribute to productive software development. SE educators can use Collaborative Learning (CL) to help students develop collaboration skills. This paper uses a Systematic Mapping Study (SMS) to examine the application of the CL educational theory in SE Education. The SM…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: 7 pages

  28. arXiv:2309.04197  [pdf, other]

    cs.SE

    Lessons from the Long Tail: Analysing Unsafe Dependency Updates across Software Ecosystems

    Authors: Supatsara Wattanakriengkrai, Raula Gaikovina Kula, Christoph Treude, Kenichi Matsumoto

    Abstract: A risk in adopting third-party dependencies into an application is their potential to serve as a doorway for malicious code to be injected (most often unknowingly). While many initiatives from both industry and research communities focus on the most critical dependencies (i.e., those most depended upon within the ecosystem), little is known about whether the rest of the ecosystem suffers the same…

    Submitted 8 September, 2023; originally announced September 2023.

  29. DevGPT: Studying Developer-ChatGPT Conversations

    Authors: Tao Xiao, Christoph Treude, Hideaki Hata, Kenichi Matsumoto

    Abstract: This paper introduces DevGPT, a dataset curated to explore how software developers interact with ChatGPT, a prominent large language model (LLM). The dataset encompasses 29,778 prompts and responses from ChatGPT, including 19,106 code snippets, and is linked to corresponding software development artifacts such as source code, commits, issues, pull requests, discussions, and Hacker News threads. Th…

    Submitted 13 February, 2024; v1 submitted 31 August, 2023; originally announced September 2023.

    Comments: MSR 2024 Mining Challenge Proposal

  30. arXiv:2308.12079  [pdf, other]

    cs.SE

    Using the TypeScript compiler to fix erroneous Node.js snippets

    Authors: Brittany Reid, Christoph Treude, Markus Wagner

    Abstract: Most online code snippets do not run. This means that developers looking to reuse code from online sources must manually find and fix errors. We present an approach for automatically evaluating and correcting errors in Node.js code snippets: Node Code Correction (NCC). NCC leverages the ability of the TypeScript compiler to generate errors and inform code corrections through the combination of Typ…

    Submitted 23 August, 2023; originally announced August 2023.

    Comments: Accepted in the 23rd IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM) 2023

  31. Evaluating Transfer Learning for Simplifying GitHub READMEs

    Authors: Haoyu Gao, Christoph Treude, Mansooreh Zahedi

    Abstract: Software documentation captures detailed knowledge about a software product, e.g., code, technologies, and design. It plays an important role in the coordination of development teams and in conveying ideas to various stakeholders. However, software documentation can be hard to comprehend if it is written with jargon and complicated sentence structure. In this study, we explored the potential of te…

    Submitted 19 August, 2023; originally announced August 2023.

    Comments: Accepted by ESEC/FSE 2023

  32. arXiv:2308.09637  [pdf, other]

    cs.SE

    Visually Analyzing Company-wide Software Service Dependencies: An Industrial Case Study

    Authors: Sebastian Baltes, Brian Pfitzmann, Thomas Kowark, Christoph Treude, Fabian Beck

    Abstract: Managing dependencies between software services is a crucial task for any company operating cloud applications. Visualizations can help to understand and maintain these complex dependencies. In this paper, we present a force-directed service dependency visualization and filtering tool that has been developed and used within SAP. The tool's use cases include guiding service retirement as well as un…

    Submitted 22 August, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

    Comments: 5 pages, 3 figures, 1 table, 11th IEEE Working Conference on Software Visualization (VISSOFT 2023)

  33. arXiv:2307.10793  [pdf, other]

    cs.SE

    Addressing Compiler Errors: Stack Overflow or Large Language Models?

    Authors: Patricia Widjojo, Christoph Treude

    Abstract: Compiler error messages serve as an initial resource for programmers dealing with compilation errors. However, previous studies indicate that they often lack sufficient targeted information to resolve code issues. Consequently, programmers typically rely on their own research to fix errors. Historically, Stack Overflow has been the primary resource for such information, but recent advances in larg…

    Submitted 20 July, 2023; originally announced July 2023.

  34. arXiv:2307.04291  [pdf, other]

    cs.SE

    Wait, wasn't that code here before? Detecting Outdated Software Documentation

    Authors: Wen Siang Tan, Markus Wagner, Christoph Treude

    Abstract: Encountering outdated documentation is not a rare occurrence for developers and users in the software engineering community. To ensure that software documentation is up-to-date, developers often have to manually check whether the documentation needs to be updated whenever changes are made to the source code. In our previous work, we proposed an approach to automatically detect outdated code elemen…

    Submitted 9 July, 2023; originally announced July 2023.

  35. arXiv:2306.10021  [pdf, other]

    cs.SE

    Promises and Perils of Mining Software Package Ecosystem Data

    Authors: Raula Gaikovina Kula, Katsuro Inoue, Christoph Treude

    Abstract: The use of third-party packages is becoming increasingly popular and has led to the emergence of large software package ecosystems with a maze of inter-dependencies. Since the reliance on these ecosystems enables developers to reduce development effort and increase productivity, it has attracted the interest of researchers: understanding the infrastructure and dynamics of package ecosystems has gi…

    Submitted 28 May, 2023; originally announced June 2023.

    Comments: Submitted as a Book Chapter

  36. arXiv:2306.10019  [pdf, ps, other]

    cs.CY cs.CR cs.SE

    Ethical Considerations Towards Protestware

    Authors: Marc Cheong, Raula Gaikovina Kula, Christoph Treude

    Abstract: A key drawback to using an Open Source third-party library is the risk of introducing malicious attacks. In recent times, these threats have taken a new form, when maintainers turn their Open Source libraries into protestware. This is defined as software containing political messages delivered through these libraries, which can either be malicious or benign. Since developers are willing to freely…

    Submitted 4 January, 2024; v1 submitted 27 May, 2023; originally announced June 2023.

    Comments: Under submission

  37. 18 Million Links in Commit Messages: Purpose, Evolution, and Decay

    Authors: Tao Xiao, Sebastian Baltes, Hideaki Hata, Christoph Treude, Raula Gaikovina Kula, Takashi Ishio, Kenichi Matsumoto

    Abstract: Commit messages contain diverse and valuable types of knowledge in all aspects of software maintenance and evolution. Links are an example of such knowledge. Previous work on "9.6 million links in source code comments" showed that links are prone to decay, become outdated, and lack bidirectional traceability. We conducted a large-scale study of 18,201,165 links from commits in 23,110 GitHub reposi…

    Submitted 25 May, 2023; originally announced May 2023.

    Journal ref: Empir Software Eng 28, 91 (2023)

  38. arXiv:2305.16365  [pdf, other]

    cs.SE

    The Impact of a Continuous Integration Service on the Delivery Time of Merged Pull Requests

    Authors: João Helis Bernardo, Daniel Alencar da Costa, Uirá Kulesza, Christoph Treude

    Abstract: Continuous Integration (CI) is a software development practice that builds and tests software frequently (e.g., at every push). One main motivator to adopt CI is the potential to deliver software functionalities more quickly than not using CI. However, there is little empirical evidence to support that CI helps projects deliver software functionalities more quickly. Through the analysis of 162,653…

    Submitted 25 May, 2023; originally announced May 2023.

  39. Barriers and Self-Efficacy: A Large-Scale Study on the Impact of OSS Courses on Student Perceptions

    Authors: Larissa Salerno, Simone de França Tonhão, Igor Steinmacher, Christoph Treude

    Abstract: Open source software (OSS) development offers a unique opportunity for students in Software Engineering to experience and participate in large-scale software development; however, the impact of such courses on students' self-efficacy and the challenges faced by students are not well understood. This paper aims to address this gap by analyzing data from multiple instances of OSS development courses…

    Submitted 3 July, 2023; v1 submitted 28 April, 2023; originally announced April 2023.

  40. arXiv:2304.05766  [pdf, other]

    cs.SE

    We Live in a Society: Motivators for Contributions in an OSS Ecosystem

    Authors: Supatsara Wattanakriengkrai, Raula Gaikovina Kula, Christoph Treude, Kenichi Matsumoto

    Abstract: Due to the increasing number of attacks targeting open source library ecosystems, assisting maintainers has become a top priority. This is especially important since maintainers are usually overworked. Although the motivation of Open Source developers has been widely studied, the extent to which maintainers assist libraries that they depend on is unknown. Surveying NPM developers, our early result…

    Submitted 12 April, 2023; originally announced April 2023.

  41. arXiv:2303.15684  [pdf, other]

    cs.SE

    Understanding the Role of Images on Stack Overflow

    Authors: Dong Wang, Tao Xiao, Christoph Treude, Raula Gaikovina Kula, Hideaki Hata, Yasutaka Kamei

    Abstract: Images are increasingly being shared by software developers in diverse channels including question-and-answer forums like Stack Overflow. Although prior work has pointed out that these images are meaningful and provide complementary information compared to their associated text, how images are used to support questions is empirically unknown. To address this knowledge gap, in this paper we specifi…

    Submitted 27 March, 2023; originally announced March 2023.

  42. arXiv:2303.13729  [pdf, other]

    cs.SE cs.IT

    Applying Information Theory to Software Evolution

    Authors: Adriano Torres, Sebastian Baltes, Christoph Treude, Markus Wagner

    Abstract: Although information theory has found success in disciplines, the literature on its applications to software evolution is limited. We are still missing artifacts that leverage the data and tooling available to measure how the information content of a project can be a proxy for its complexity. In this work, we explore two definitions of entropy, one structural and one textual, and apply it to the his…

    Submitted 26 April, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: 8 pages, 6 figures, Accepted at the NLBSE 2023 workshop

  43. arXiv:2303.10439  [pdf, other]

    cs.SE cs.CL

    Stop Words for Processing Software Engineering Documents: Do they Matter?

    Authors: Yaohou Fan, Chetan Arora, Christoph Treude

    Abstract: Stop words, which are considered non-predictive, are often eliminated in natural language processing tasks. However, the definition of uninformative vocabulary is vague, so most algorithms use general knowledge-based stop lists to remove stop words. There is an ongoing debate among academics about the usefulness of stop word elimination, especially in domain-specific settings. In this work, we inv…

    Submitted 12 June, 2023; v1 submitted 18 March, 2023; originally announced March 2023.

    Comments: Accepted for publication at the 2nd Intl. Workshop on NL-based Software Engineering (NLBSE 2023)

  44. arXiv:2303.10131  [pdf, ps, other]

    cs.SE cs.AI cs.CY cs.LG

    She Elicits Requirements and He Tests: Software Engineering Gender Bias in Large Language Models

    Authors: Christoph Treude, Hideaki Hata

    Abstract: Implicit gender bias in software development is a well-documented issue, such as the association of technical roles with men. To address this bias, it is important to understand it in more detail. This study uses data mining techniques to investigate the extent to which 56 tasks related to software development, such as assigning GitHub issues and testing, are affected by implicit gender bias embed…

    Submitted 17 March, 2023; originally announced March 2023.

    Comments: 6 pages, MSR 2023

  45. arXiv:2303.09727  [pdf, other]

    cs.SE

    Towards Understanding the Open Source Interest in Gender-Related GitHub Projects

    Authors: Rita Garcia, Christoph Treude, Wendy La

    Abstract: The open-source community uses the GitHub platform to exchange and share software applications and services of interest. This paper aims to identify the open-source community's interest in gender-related projects on GitHub. Our findings create research opportunities and identify resources by the open-source community that promote diversity, equity, and inclusion. We use data mining to identify Git…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: 16th International Conference on Cooperative and Human Aspects of Software Engineering (CHASE 2023)

  46. Socialz: Multi-Feature Social Fuzz Testing

    Authors: Francisco Zanartu, Christoph Treude, Markus Wagner

    Abstract: Online social networks have become an integral aspect of our daily lives and play a crucial role in shaping our relationships with others. However, bugs and glitches, even minor ones, can cause anything from frustrating problems to serious data leaks that can have far-reaching impacts on millions of users. To mitigate these risks, fuzz testing, a method of testing with randomised inputs, can provid…

    Submitted 4 July, 2024; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: to be published in GECCO 2024, July 14-18, 2024, Melbourne, VIC, Australia

  47. arXiv:2302.05564  [pdf, other]

    cs.SE

    Overcoming Challenges in DevOps Education through Teaching Methods

    Authors: Samuel Ferino, Marcelo Fernandes, Elder Cirilo, Lucas Agnez, Bruno Batista, Uirá Kulesza, Eduardo Aranha, Christoph Treude

    Abstract: DevOps is a set of practices that deals with coordination between development and operation teams and ensures rapid and reliable new software releases that are essential in industry. DevOps education assumes the vital task of preparing new professionals in these practices using appropriate teaching methods. However, there are insufficient studies investigating teaching methods in DevOps. We perfor…

    Submitted 10 February, 2023; originally announced February 2023.

    Comments: 11 pages, 3 figures, 2 tables

  48. arXiv:2301.12169  [pdf, other]

    cs.SE cs.HC

    Navigating Complexity in Software Engineering: A Prototype for Comparing GPT-n Solutions

    Authors: Christoph Treude

    Abstract: Navigating the diverse solution spaces of non-trivial software engineering tasks requires a combination of technical knowledge, problem-solving skills, and creativity. With multiple possible solutions available, each with its own set of trade-offs, it is essential for programmers to evaluate the various options and select the one that best suits the specific requirements and constraints of a proje…

    Submitted 28 January, 2023; originally announced January 2023.

  49. arXiv:2212.01479  [pdf, other]

    cs.SE

    Detecting Outdated Code Element References in Software Repository Documentation

    Authors: Wen Siang Tan, Markus Wagner, Christoph Treude

    Abstract: Outdated documentation is a pervasive problem in software development, preventing effective use of software, and misleading users and developers alike. We posit that one possible reason why documentation becomes out of sync so easily is that developers are unaware of when their source code modifications render the documentation obsolete. Ensuring that the documentation is always in sync with the s…

    Submitted 2 December, 2022; originally announced December 2022.

  50. An Empirical Study of Package Management Issues via Stack Overflow

    Authors: Syful Islam, Raula Gaikovina Kula, Christoph Treude, Bodin Chinthanet, Takashi Ishio, Kenichi Matsumoto

    Abstract: The package manager (PM) is crucial to most technology stacks, acting as a broker to ensure that a verified dependency package is correctly installed, configured, or removed from an application. Diversity in technology stacks has led to dozens of PMs with various features. While our recent study indicates that package management features of PM are related to end-user experiences, it is unclear wha…

    Submitted 22 November, 2022; v1 submitted 21 November, 2022; originally announced November 2022.