
Showing 1–47 of 47 results for author: Hata, H

  1. arXiv:2404.05489

    cs.SE

    The Impact of Sanctions on GitHub Developers and Activities

    Authors: Youmei Fan, Ani Hovhannisyan, Hideaki Hata, Christoph Treude, Raula Gaikovina Kula

    Abstract: The GitHub platform has fueled the creation of truly global software, enabling contributions from developers across various geographical regions of the world. As software becomes more entwined with global politics and social regulations, it becomes similarly subject to government sanctions. In 2019, GitHub restricted access to certain services for users in specific locations but rolled back these…

    Submitted 8 April, 2024; originally announced April 2024.

  2. Generative AI for Pull Request Descriptions: Adoption, Impact, and Developer Interventions

    Authors: Tao Xiao, Hideaki Hata, Christoph Treude, Kenichi Matsumoto

    Abstract: GitHub's Copilot for Pull Requests (PRs) is a promising service aiming to automate various developer tasks related to PRs, such as generating summaries of changes or providing complete walkthroughs with links to the relevant code. As this innovative technology gains traction in the Open Source Software (OSS) community, it is crucial to examine its early adoption and its impact on the development p…

    Submitted 14 February, 2024; originally announced February 2024.

  3. Quantifying and Characterizing Clones of Self-Admitted Technical Debt in Build Systems

    Authors: Tao Xiao, Zhili Zeng, Dong Wang, Hideaki Hata, Shane McIntosh, Kenichi Matsumoto

    Abstract: Self-Admitted Technical Debt (SATD) annotates development decisions that intentionally exchange long-term software artifact quality for short-term goals. Recent work explores the existence of SATD clones (duplicate or near duplicate SATD comments) in source code. Cloning of SATD in build systems (e.g., CMake and Maven) may propagate suboptimal design choices, threatening qualities of the build sys…

    Submitted 13 February, 2024; originally announced February 2024.

  4. arXiv:2401.16715

    cs.SE

    Going Viral: Case Studies on the Impact of Protestware

    Authors: Youmei Fan, Dong Wang, Supatsara Wattanakriengkrai, Hathaichanok Damrongsiri, Christoph Treude, Hideaki Hata, Raula Gaikovina Kula

    Abstract: Maintainers are now self-sabotaging their work in order to take political or economic stances, a practice referred to as "protestware". In this poster, we present our approach to understand how the discourse about such an attack went viral, how it is received by the community, and whether developers respond to the attack in a timely manner. We study two notable protestware cases, i.e., Colors.js a…

    Submitted 29 January, 2024; originally announced January 2024.

  5. arXiv:2401.02755

    cs.SE

    "My GitHub Sponsors profile is live!" Investigating the Impact of Twitter/X Mentions on GitHub Sponsors

    Authors: Youmei Fan, Tao Xiao, Hideaki Hata, Christoph Treude, Kenichi Matsumoto

    Abstract: GitHub Sponsors was launched in 2019, enabling donations to open-source software developers to provide financial support, as per GitHub's slogan: "Invest in the projects you depend on". However, a 2022 study on GitHub Sponsors found that only two-fifths of developers who were seeking sponsorship received a donation. The study found that, other than internal actions (such as offering perks to spons…

    Submitted 5 January, 2024; originally announced January 2024.

  6. arXiv:2309.15017

    cs.SE

    Studying the association between Gitcoin's issues and resolving outcomes

    Authors: Morakot Choetkiertikul, Arada Puengmongkolchaikit, Pandaree Chandra, Chaiyong Ragkitwetsakul, Rungroj Maipradit, Hideaki Hata, Thanwadee Sunetnanta, Kenichi Matsumoto

    Abstract: The development of open-source software (OSS) projects has usually been driven by collaboration among contributors and relies strongly on volunteering. Thus, allocating software practitioners (e.g., contributors) to a particular task is non-trivial and draws attention away from the development. Therefore, a number of bug bounty platforms have emerged to address this problem through bounty r…

    Submitted 26 September, 2023; originally announced September 2023.

  7. DevGPT: Studying Developer-ChatGPT Conversations

    Authors: Tao Xiao, Christoph Treude, Hideaki Hata, Kenichi Matsumoto

    Abstract: This paper introduces DevGPT, a dataset curated to explore how software developers interact with ChatGPT, a prominent large language model (LLM). The dataset encompasses 29,778 prompts and responses from ChatGPT, including 19,106 code snippets, and is linked to corresponding software development artifacts such as source code, commits, issues, pull requests, discussions, and Hacker News threads. Th…

    Submitted 13 February, 2024; v1 submitted 31 August, 2023; originally announced September 2023.

    Comments: MSR 2024 Mining Challenge Proposal

  8. 18 Million Links in Commit Messages: Purpose, Evolution, and Decay

    Authors: Tao Xiao, Sebastian Baltes, Hideaki Hata, Christoph Treude, Raula Gaikovina Kula, Takashi Ishio, Kenichi Matsumoto

    Abstract: Commit messages contain diverse and valuable types of knowledge in all aspects of software maintenance and evolution. Links are an example of such knowledge. Previous work on "9.6 million links in source code comments" showed that links are prone to decay, become outdated, and lack bidirectional traceability. We conducted a large-scale study of 18,201,165 links from commits in 23,110 GitHub reposi…

    Submitted 25 May, 2023; originally announced May 2023.

    Journal ref: Empir Software Eng 28, 91 (2023)

  9. arXiv:2305.03251

    cs.SE

    Meta-Maintanance for Dockerfiles: Are We There Yet?

    Authors: Takeru Tanaka, Hideaki Hata, Bodin Chinthanet, Raula Gaikovina Kula, Kenichi Matsumoto

    Abstract: Docker allows for the packaging of applications and dependencies, and its instructions are described in Dockerfiles. Nowadays, version pinning is recommended to avoid unexpected changes in the latest version of a package. However, version pinning in Dockerfiles is not yet fully realized (only 17k of the 141k Dockerfiles we analyzed), because of the difficulties caused by version pinning. To mainta…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: 10 pages

  10. arXiv:2303.15684

    cs.SE

    Understanding the Role of Images on Stack Overflow

    Authors: Dong Wang, Tao Xiao, Christoph Treude, Raula Gaikovina Kula, Hideaki Hata, Yasutaka Kamei

    Abstract: Images are increasingly being shared by software developers in diverse channels including question-and-answer forums like Stack Overflow. Although prior work has pointed out that these images are meaningful and provide complementary information compared to their associated text, how images are used to support questions is empirically unknown. To address this knowledge gap, in this paper we specifi…

    Submitted 27 March, 2023; originally announced March 2023.

  11. arXiv:2303.10131

    cs.SE cs.AI cs.CY cs.LG

    She Elicits Requirements and He Tests: Software Engineering Gender Bias in Large Language Models

    Authors: Christoph Treude, Hideaki Hata

    Abstract: Implicit gender bias in software development is a well-documented issue, such as the association of technical roles with men. To address this bias, it is important to understand it in more detail. This study uses data mining techniques to investigate the extent to which 56 tasks related to software development, such as assigning GitHub issues and testing, are affected by implicit gender bias embed…

    Submitted 17 March, 2023; originally announced March 2023.

    Comments: 6 pages, MSR 2023

  12. arXiv:2204.12712

    cs.SE

    Release as a Contract: A Concept of Meta-Maintenance for the Entire FLOSS Ecosystem

    Authors: Hideaki Hata

    Abstract: We advocate for a paradigm shift in supporting free/libre and open source software (FLOSS) ecosystem maintenance, from focusing on individual projects to monitoring a whole organic system of the entire FLOSS ecosystem, which we call software meta-maintenance. We discuss challenges of building a global source code management system, a global issue management system, and FLOSS human capital index, b…

    Submitted 27 April, 2022; originally announced April 2022.

    Comments: 4 pages

  13. arXiv:2204.06531

    cs.SE

    Software Supply Chain Map: How Reuse Networks Expand

    Authors: Hideaki Hata, Takashi Ishio

    Abstract: Clone-and-own is a typical code reuse approach because of its simplicity and efficiency. Cloned software components are maintained independently by a new owner. These clone-and-own operations can occur sequentially, that is, cloned components can be cloned again and owned by other new owners on the supply chain. In general, code reuse is not documented well; consequently, appropriate changes…

    Submitted 13 April, 2022; originally announced April 2022.

    Comments: 12 pages, 14 figures

  14. GitHub Sponsors: Exploring a New Way to Contribute to Open Source

    Authors: Naomichi Shimada, Tao Xiao, Hideaki Hata, Christoph Treude, Kenichi Matsumoto

    Abstract: GitHub Sponsors, launched in 2019, enables donations to individual open source software (OSS) developers. Financial support for OSS maintainers and developers is a major issue in terms of sustaining OSS projects, and the ability to donate to individuals is expected to support the sustainability of developers, projects, and community. In this work, we conducted a mixed-methods study of GitHub Spons…

    Submitted 11 February, 2022; originally announced February 2022.

    Comments: 12 pages, ICSE 2022

  15. arXiv:2109.02878

    cs.SE

    FixMe: A GitHub Bot for Detecting and Monitoring On-Hold Self-Admitted Technical Debt

    Authors: Saranphon Phaithoon, Supakarn Wongnil, Patiphol Pussawong, Morakot Choetkiertikul, Chaiyong Ragkhitwetsagul, Thanwadee Sunetnanta, Rungroj Maipradit, Hideaki Hata, Kenichi Matsumoto

    Abstract: Self-Admitted Technical Debt (SATD) is a special form of technical debt in which developers intentionally record their hacks in the code by adding comments for attention. Here, we focus on issue-related "On-hold SATD", where developers suspend proper implementation due to issues reported inside or outside the project. When the referenced issues are resolved, the On-hold SATD also need to be addres…

    Submitted 7 September, 2021; originally announced September 2021.

    Comments: 5 pages, ASE 2021

  16. arXiv:2104.05891

    cs.SE cs.DL

    Science-Software Linkage: The Challenges of Traceability between Scientific Knowledge and Software Artifacts

    Authors: Hideaki Hata, Jin L. C. Guo, Raula Gaikovina Kula, Christoph Treude

    Abstract: Although computer science papers are often accompanied by software artifacts, connecting research papers to their software artifacts and vice versa is not always trivial. First of all, there is a lack of well-accepted standards for how such links should be provided. Furthermore, the provided links, if any, often become outdated: they are affected by link rot when pre-prints are removed, when repos…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: 5 pages

  17. Characterizing and Mitigating Self-Admitted Technical Debt in Build Systems

    Authors: Tao Xiao, Dong Wang, Shane McIntosh, Hideaki Hata, Raula Gaikovina Kula, Takashi Ishio, Kenichi Matsumoto

    Abstract: Technical Debt is a metaphor used to describe the situation in which long-term software artifact quality is traded for short-term goals in software projects. In recent years, the concept of self-admitted technical debt (SATD) was proposed, which focuses on debt that is intentionally introduced and described by developers. Although prior work has made important observations about admitted technical…

    Submitted 2 November, 2021; v1 submitted 19 February, 2021; originally announced February 2021.

  18. arXiv:2102.06355

    cs.SE

    Same File, Different Changes: The Potential of Meta-Maintenance on GitHub

    Authors: Hideaki Hata, Raula Gaikovina Kula, Takashi Ishio, Christoph Treude

    Abstract: Online collaboration platforms such as GitHub have provided software developers with the ability to easily reuse and share code between repositories. With clone-and-own and forking becoming prevalent, maintaining these shared files is important, especially for keeping the most up-to-date version of reused code. Different to related work, we propose the concept of meta-maintenance -- i.e., tracking…

    Submitted 12 February, 2021; originally announced February 2021.

    Comments: 12 pages, ICSE 2021

  19. arXiv:2102.05230

    cs.SE cs.CY

    GitHub Discussions: An Exploratory Study of Early Adoption

    Authors: Hideaki Hata, Nicole Novielli, Sebastian Baltes, Raula Gaikovina Kula, Christoph Treude

    Abstract: Discussions is a new feature of GitHub for asking questions or discussing topics outside of specific Issues or Pull Requests. Before being available to all projects in December 2020, it had been tested on selected open source software projects. To understand how developers use this novel feature, how they perceive it, and how it impacts the development processes, we conducted a mixed-methods study…

    Submitted 30 September, 2021; v1 submitted 9 February, 2021; originally announced February 2021.

    Comments: 37 pages, Empirical Software Engineering

  20. arXiv:2102.01325

    cs.SE

    FLOSS != GitHub: A Case Study of Linux/BSD Perceptions from Microsoft's Acquisition of GitHub

    Authors: Raula Gaikovina Kula, Hideki Hata, Kenichi Matsumoto

    Abstract: In 2018, the software industry giant Microsoft made a move into the Open Source world by completing the acquisition of the mega Open Source platform GitHub. This acquisition was not without controversy, as it is well known that the free software communities include not only the ability to use software freely, but also the libre nature in Open Source Software. In this study, our aim is to explore th…

    Submitted 2 February, 2021; originally announced February 2021.

    Comments: 5 pages

  21. A Framework for Conditional Statement Technical Debt Identification and Description

    Authors: Abdulaziz Alhefdhi, Hoa Khanh Dam, Yusuf Sulistyo Nugroho, Hideaki Hata, Takashi Ishio, Aditya Ghose

    Abstract: Technical Debt occurs when development teams favour short-term operability over long-term stability. Since this places software maintainability at risk, technical debt requires early attention to avoid paying for accumulated interest. Most of the existing work focuses on detecting technical debt using code comments, known as Self-Admitted Technical Debt (SATD). However, there are many cases where…

    Submitted 13 October, 2022; v1 submitted 22 December, 2020; originally announced December 2020.

    Journal ref: Autom Softw Eng 29, 60 (2022)

  22. arXiv:2009.13113

    cs.SE

    Automated Identification of On-hold Self-admitted Technical Debt

    Authors: Rungroj Maipradit, Bin Lin, Csaba Nagy, Gabriele Bavota, Michele Lanza, Hideaki Hata, Kenichi Matsumoto

    Abstract: Modern software is developed under considerable time pressure, which implies that developers more often than not have to resort to compromises when it comes to code that is well written and code that just does the job. This has led over the past decades to the concept of "technical debt", a short-term hack that potentially generates long-term maintenance problems. Self-admitted technical debt (SAT…

    Submitted 28 September, 2020; originally announced September 2020.

    Comments: 11 pages, 20th IEEE International Working Conference on Source Code Analysis and Manipulation

  23. arXiv:2009.09130

    cs.SE

    How are Project-Specific Forums Utilized? A Study of Participation, Content, and Sentiment in the Eclipse Ecosystem

    Authors: Yusuf Sulistyo Nugroho, Syful Islam, Keitaro Nakasai, Ifraz Rehman, Hideaki Hata, Raula Gaikovina Kula, Meiyappan Nagappan, Kenichi Matsumoto

    Abstract: Although many software development projects have moved their developer discussion forums to generic platforms such as Stack Overflow, Eclipse has been steadfast in hosting their self-supported community forums. While recent studies show forums share similarities to generic communication channels, it is unknown how project-specific forums are utilized. In this paper, we analyze 832,058 forum thread…

    Submitted 5 August, 2021; v1 submitted 18 September, 2020; originally announced September 2020.

    Comments: 33 pages, 7 figures

  24. Predicting Defective Lines Using a Model-Agnostic Technique

    Authors: Supatsara Wattanakriengkrai, Patanamon Thongtanunam, Chakkrit Tantithamthavorn, Hideaki Hata, Kenichi Matsumoto

    Abstract: Defect prediction models are proposed to help a team prioritize source code areas files that need Software Quality Assurance (SQA) based on the likelihood of having defects. However, developers may waste their unnecessary effort on the whole file while only a small fraction of its source code lines are defective. Indeed, we find that as little as 1%-3% of lines of a file are defective. Hence, in thi…

    Submitted 8 September, 2020; originally announced September 2020.

  25. arXiv:2006.14185

    cs.GT

    Optimizing Affine Maximizer Auctions via Linear Programming: an Application to Revenue Maximizing Mechanism Design for Zero-Day Exploits Markets

    Authors: Mingyu Guo, Hideaki Hata, Ali Babar

    Abstract: Optimizing within the affine maximizer auctions (AMA) is an effective approach for revenue maximizing mechanism design. The AMA mechanisms are strategy-proof and individually rational (if the agents' valuations for the outcomes are nonnegative). Every AMA mechanism is characterized by a list of parameters. By focusing on the AMA mechanisms, we turn mechanism design into a value optimization proble…

    Submitted 25 June, 2020; originally announced June 2020.

    Journal ref: PRIMA 2017: Principles and Practice of Multi-Agent Systems

  26. arXiv:2006.14184

    cs.GT

    Revenue Maximizing Markets for Zero-Day Exploits

    Authors: Mingyu Guo, Hideaki Hata, Ali Babar

    Abstract: Markets for zero-day exploits (software vulnerabilities unknown to the vendor) have a long history and a growing popularity. We study these markets from a revenue-maximizing mechanism design perspective. We first propose a theoretical model for zero-day exploits markets. In our model, one exploit is being sold to multiple buyers. There are two kinds of buyers, which we call the defenders and the o…

    Submitted 25 June, 2020; originally announced June 2020.

    Journal ref: PRIMA 2016: Principles and Practice of Multi-Agent Systems

  27. Pandemic Programming: How COVID-19 affects software developers and how their organizations can help

    Authors: Paul Ralph, Sebastian Baltes, Gianisa Adisaputri, Richard Torkar, Vladimir Kovalenko, Marcos Kalinowski, Nicole Novielli, Shin Yoo, Xavier Devroey, Xin Tan, Minghui Zhou, Burak Turhan, Rashina Hoda, Hideaki Hata, Gregorio Robles, Amin Milani Fard, Rana Alkadhi

    Abstract: Context. As a novel coronavirus swept the world in early 2020, thousands of software developers began working from home. Many did so on short notice, under difficult and stressful conditions. Objective. This study investigates the effects of the pandemic on developers' wellbeing and productivity. Method. A questionnaire survey was created mainly from existing, validated scales and translated into…

    Submitted 20 July, 2020; v1 submitted 3 May, 2020; originally announced May 2020.

    Comments: 34 pages, 7 tables, 5 figures, to appear in Empirical Software Engineering

    Journal ref: Empirical Software Engineering, 2020

  28. arXiv:2004.00199

    cs.SE cs.DL

    GitHub Repositories with Links to Academic Papers: Public Access, Traceability, and Evolution

    Authors: Supatsara Wattanakriengkrai, Bodin Chinthanet, Hideaki Hata, Raula Gaikovina Kula, Christoph Treude, Jin Guo, Kenichi Matsumoto

    Abstract: Traceability between published scientific breakthroughs and their implementation is essential, especially in the case of open-source scientific software which implements bleeding-edge science in its code. However, aligning the link between GitHub repositories and academic papers can prove difficult, and the current practice of establishing and maintaining such links remains unknown. This paper inv…

    Submitted 3 October, 2021; v1 submitted 31 March, 2020; originally announced April 2020.

    Comments: 28 pages

  29. Ammonia: An Approach for Deriving Project-specific Bug Patterns

    Authors: Yoshiki Higo, Shinpei Hayashi, Hideaki Hata, Meiyappan Nagappan

    Abstract: Finding and fixing buggy code is an important and cost-intensive maintenance task, and static analysis (SA) is one of the methods developers use to perform it. SA tools warn developers about potential bugs by scanning their source code for commonly occurring bug patterns, thus giving those developers opportunities to fix the warnings (potential bugs) before they release the software. Typically, SA…

    Submitted 14 March, 2020; v1 submitted 27 January, 2020; originally announced January 2020.

    Comments: 28 pages, Empirical Software Engineering

    Journal ref: Empirical Software Engineering, 25(3):1951-1979, 2020

  30. Challenges for Inclusion in Software Engineering: The Case of the Emerging Papua New Guinean Society

    Authors: Raula Gaikovina Kula, Christoph Treude, Hideaki Hata, Sebastian Baltes, Igor Steinmacher, Marco Aurelio Gerosa, Winifred Kula Amini

    Abstract: Software plays a central role in modern societies, with its high economic value and potential for advancing societal change. In this paper, we characterise challenges and opportunities for a country progressing towards entering the global software industry, focusing on Papua New Guinea (PNG). By hosting a Software Engineering workshop, we conducted a qualitative study by recording talks (n=3), emp…

    Submitted 22 July, 2021; v1 submitted 31 October, 2019; originally announced November 2019.

    Comments: IEEE Software

    Journal ref: IEEE Software (2021)

  31. arXiv:1910.06932

    cs.SE cs.DL

    From Academia to Software Development: Publication Citations in Source Code Comments

    Authors: Akira Inokuchi, Yusuf Sulistyo Nugroho, Supatsara Wattanakriengkrai, Fumiaki Konishi, Hideaki Hata, Christoph Treude, Akito Monden, Kenichi Matsumoto

    Abstract: Academic publications have been evaluated in terms of their impact on research communities based on many metrics, such as the number of citations. On the other hand, the impact of academic publications on industry has been rarely studied. This paper investigates how academic publications contribute to software development by analyzing publication citations in source code comments in open source so…

    Submitted 1 May, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

    Comments: 33 pages

  32. Towards Generation of Visual Attention Map for Source Code

    Authors: Takeshi D. Itoh, Takatomi Kubo, Kiyoka Ikeda, Yuki Maruno, Yoshiharu Ikutani, Hideaki Hata, Kenichi Matsumoto, Kazushi Ikeda

    Abstract: Program comprehension is a dominant process in software development and maintenance. Experts are considered to comprehend the source code efficiently by directing their gaze, or attention, to important components in it. However, reflecting the importance of components is still a remaining issue in gaze behavior analysis for source code comprehension. Here we show a conceptual framework to compare…

    Submitted 13 August, 2019; v1 submitted 14 July, 2019; originally announced July 2019.

    Comments: 4 pages, 2 figures; APSIPA 2019 ACCEPTED

    Journal ref: APSIPA ASC (2019)

  33. arXiv:1907.04557

    cs.SE

    Identifying Algorithm Names in Code Comments

    Authors: Jakapong Klainongsuang, Yusuf Sulistyo Nugroho, Hideaki Hata, Bundit Manaskasemsak, Arnon Rungsawang, Pattara Leelaprute, Kenichi Matsumoto

    Abstract: For recent machine-learning-based tasks like API sequence generation, comment generation, and document generation, a large amount of data is needed. When software developers implement algorithms in code, we find that they often mention algorithm names in code comments. Code annotated with such algorithm names can be valuable data sources. In this paper, we propose an automatic method of algorithm na…

    Submitted 10 July, 2019; originally announced July 2019.

    Comments: 10 pages

  34. A Topological Analysis of Communication Channels for Knowledge Sharing in Contemporary GitHub Projects

    Authors: Jirateep Tantisuwankul, Yusuf Sulistyo Nugroho, Raula Gaikovina Kula, Hideaki Hata, Arnon Rungsawang, Pattara Leelaprute, Kenichi Matsumoto

    Abstract: With over 28 million developers, success of the GitHub collaborative platform is highlighted through an abundance of communication channels among contemporary software projects. Knowledge is broken into two forms and its sharing (through communication channels) can be described as externalization or combination by the SECI model. Such platforms have revolutionized the way developers work, introduc…

    Submitted 8 September, 2019; v1 submitted 9 May, 2019; originally announced May 2019.

    Comments: 30 pages

  35. arXiv:1904.12162

    cs.IR cs.CL

    Sentiment Classification using N-gram IDF and Automated Machine Learning

    Authors: Rungroj Maipradit, Hideaki Hata, Kenichi Matsumoto

    Abstract: We propose a sentiment classification method with a general machine learning framework. For feature representation, n-gram IDF is used to extract software-engineering-related, dataset-specific, positive, neutral, and negative n-gram expressions. For classifiers, an automated machine learning tool is used. In the comparison using publicly available datasets, our method achieved the highest F1 value…

    Submitted 25 May, 2019; v1 submitted 27 April, 2019; originally announced April 2019.

    Comments: 4 pages, IEEE Software

  36. arXiv:1903.06320

    cs.SE cs.AI

    Toward Imitating Visual Attention of Experts in Software Development Tasks

    Authors: Yoshiharu Ikutani, Nishanth Koganti, Hideaki Hata, Takatomi Kubo, Kenichi Matsumoto

    Abstract: Expert programmers' eye-movements during source code reading are valuable sources that are considered to be associated with their domain expertise. We advocate a vision of new intelligent systems incorporating expertise of experts for software development tasks, such as issue localization, comment generation, and code generation. We present a conceptual framework of neural autonomous agents based…

    Submitted 14 March, 2019; originally announced March 2019.

    Comments: 4 pages, EMIP 2019

  37. How Different Are Different diff Algorithms in Git?

    Authors: Yusuf Sulistyo Nugroho, Hideaki Hata, Kenichi Matsumoto

    Abstract: Automatic identification of the differences between two versions of a file is a common and basic task in several applications of mining code repositories. Git, a version control system, has a diff utility and users can select algorithms of diff from the default algorithm Myers to the advanced Histogram algorithm. From our systematic mapping, we identified three popular applications of diff in rece…

    Submitted 16 October, 2019; v1 submitted 6 February, 2019; originally announced February 2019.

    Comments: 38 pages, Empirical Software Engineering

  38. arXiv:1901.09511

    cs.SE

    Wait For It: Identifying "On-Hold" Self-Admitted Technical Debt

    Authors: Rungroj Maipradit, Christoph Treude, Hideaki Hata, Kenichi Matsumoto

    Abstract: Self-admitted technical debt refers to situations where a software developer knows that their current implementation is not optimal and indicates this using a source code comment. In this work, we hypothesize that it is possible to develop automated techniques to understand a subset of these comments in more detail, and to propose tool support that can help developers manage self-admitted technica…

    Submitted 21 October, 2019; v1 submitted 27 January, 2019; originally announced January 2019.

    Comments: 33 pages

  39. arXiv:1901.07440

    cs.SE

    9.6 Million Links in Source Code Comments: Purpose, Evolution, and Decay

    Authors: Hideaki Hata, Christoph Treude, Raula Gaikovina Kula, Takashi Ishio

    Abstract: Links are an essential feature of the World Wide Web, and source code repositories are no exception. However, despite their many undisputed benefits, links can suffer from decay, insufficient versioning, and lack of bidirectional traceability. In this paper, we investigate the role of links contained in source code comments from these perspectives. We conducted a large-scale study of around 9.6 mi…

    Submitted 15 February, 2019; v1 submitted 22 January, 2019; originally announced January 2019.

    Comments: 12 pages, ICSE 2019

  40. arXiv:1812.07170

    cs.SE

    Learning to Generate Corrective Patches using Neural Machine Translation

    Authors: Hideaki Hata, Emad Shihab, Graham Neubig

    Abstract: Bug fixing is generally a manually-intensive task. However, recent work has proposed the idea of automated program repair, which aims to repair (at least a subset of) bugs in different ways such as code mutation, etc. Following in the same line of work as automated bug repair, in this paper we aim to leverage past fixes to propose fixes of current/future bugs. Specifically, we propose Ratchet, a c…

    Submitted 3 July, 2019; v1 submitted 17 December, 2018; originally announced December 2018.

    Comments: 20 pages

  41. arXiv:1805.03844

    cs.SE cs.SI

    Human Capital in Software Engineering: A Systematic Mapping of Reconceptualized Human Aspect Studies

    Authors: Saya Onoue, Hideaki Hata, Raula Gaikovina Kula, Kenichi Matsumoto

    Abstract: The human capital invested into software development plays a vital role in the success of any software project. By human capital, we do not mean the individuals themselves, but rather the range of knowledge and skills (i.e., human aspects) invested to create value during development. However, there is still no consensus on how these broad terms of human aspects relate to the health of a project.…

    Submitted 10 May, 2018; originally announced May 2018.

    Comments: 35 pages

  42. arXiv:1803.04129

    cs.SE cs.CY

    Are Donation Badges Appealing? A Case Study of Developer Responses to Eclipse Bug Reports

    Authors: Keitaro Nakasai, Hideaki Hata, Kenichi Matsumoto

    Abstract: Eclipse, an open source software project, acknowledges its donors by presenting donation badges in its issue tracking system Bugzilla. However, the rewarding effect of this strategy is currently unknown. We applied a framework of causal inference to investigate relative promptness of developer response to bug reports with donation badges compared with bug reports without the badges, and estimated…

    Submitted 18 July, 2018; v1 submitted 12 March, 2018; originally announced March 2018.

    Comments: 4 pages, IEEE Software

  43. arXiv:1710.00446

    cs.SE

    Extracting Insights from the Topology of the JavaScript Package Ecosystem

    Authors: Nuttapon Lertwittayatrai, Raula Gaikovina Kula, Saya Onoue, Hideaki Hata, Arnon Rungsawang, Pattara Leelaprute, Kenichi Matsumoto

    Abstract: Software ecosystems have had a tremendous impact on computing and society, capturing the attention of businesses, researchers, and policy makers alike. Massive ecosystems like the JavaScript node package manager (npm) are evidence of how packages are readily available for use by software projects. Due to its high-dimension and complex properties, software ecosystem analysis has been limited. In thi…

    Submitted 1 October, 2017; originally announced October 2017.

    Comments: 10 pages, APSEC 2017

  44. arXiv:1709.10324

    cs.SE

    The Health and Wealth of OSS Projects: Evidence from Community Activities and Product Evolution

    Authors: Saya Onoue, Raula Gaikovina Kula, Hideaki Hata, Kenichi Matsumoto

    Abstract: Background: Understanding the condition of OSS projects is important to analyze features and predict the future of projects. In the field of demography and economics, health and wealth are considered to understand the condition of a country. Aim: In this paper, we apply this framework to OSS projects to understand the communities and the evolution of OSS projects from the perspectives of health an…

    Submitted 29 September, 2017; originally announced September 2017.

    Comments: Submitted 2017

  45. arXiv:1709.06224

    cs.SE

    Understanding the Heterogeneity of Contributors in Bug Bounty Programs

    Authors: Hideaki Hata, Mingyu Guo, M. Ali Babar

    Abstract: Background: While bug bounty programs are not new in software development, an increasing number of companies, as well as open source projects, rely on external parties to perform the security assessment of their software for reward. However, there is relatively little empirical knowledge about the characteristics of bug bounty program contributors. Aim: This paper aims to understand those contribu…

    Submitted 18 September, 2017; originally announced September 2017.

    Comments: 6 pages, ESEM 2017

  46. arXiv:1709.05768

    cs.SE

    Using High-Rising Cities to Visualize Performance in Real-Time

    Authors: Katsuya Ogami, Raula Gaikovina Kula, Hideaki Hata, Takashi Ishio, Kenichi Matsumoto

    Abstract: For developers concerned with a performance drop or improvement in their software, a profiler allows a developer to quickly search and identify bottlenecks and leaks that consume much execution time. Non real-time profilers analyze the history of already executed stack traces, while a real-time profiler outputs the results concurrently with the execution of software, so users can know the results…

    Submitted 17 September, 2017; originally announced September 2017.

    Comments: 10 pages, VISSOFT 2017, Artifact: https://github.com/sefield/high-rising-city-artifact

  47. arXiv:1709.05763

    cs.SE

    Bug or Not? Bug Report Classification Using N-Gram IDF

    Authors: Pannavat Terdchanakul, Hideaki Hata, Passakorn Phannachitta, Kenichi Matsumoto

    Abstract: Previous studies have found that a significant number of bug reports are misclassified between bugs and non-bugs, and that manually classifying bug reports is a time-consuming task. To address this problem, we propose a bug reports classification model with N-gram IDF, a theoretical extension of Inverse Document Frequency (IDF) for handling words and phrases of any length. N-gram IDF enables us to…

    Submitted 17 September, 2017; originally announced September 2017.

    Comments: 5 pages, ICSME 2017