MLR-predictor: a versatile and efficient computational framework for multi-label requirements classification

Summra Saleem; Muhammad Nabeel Asim; Ludger Van Elst; Markus Junker; Andreas Dengel

doi:10.3389/frai.2024.1481581

Front Artif Intell. 2024 Nov 27:7:1481581. doi: 10.3389/frai.2024.1481581. eCollection 2024.

Authors

Summra Saleem^#¹,², Muhammad Nabeel Asim^#², Ludger Van Elst², Markus Junker², Andreas Dengel¹,²

Affiliations

¹ Department of Computer Science, Rheinland Pfälzische Technische Universität, Kaiserslautern, Germany.
² German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany.

^# Contributed equally.

Abstract

Introduction: Requirements classification is an essential task in developing successful software because it helps ensure that all relevant aspects of users' needs are incorporated. It also aids in identifying project failure risks and helps achieve project milestones more comprehensively. Several machine learning predictors have been developed for binary or multi-class requirements classification. However, only a few predictors are designed for multi-label classification, and they are not practically useful because of their low predictive performance.

Method: MLR-Predictor uses the innovative OkapiBM25 model to transform requirements text into statistical vectors by computing informative word patterns. Moreover, the predictor transforms multi-label requirements classification data into a multi-class classification problem and uses a logistic regression classifier to categorize requirements. The performance of the proposed predictor is evaluated and compared with 123 machine learning and 9 deep learning-based predictive pipelines across three public benchmark requirements classification datasets using eight different evaluation measures.
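The two data transformations named in the Method can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the BM25 variant or its parameters, so the smoothed IDF form and the defaults `k1=1.5` and `b=0.75` are assumptions, as are the helper names `bm25_vectors` and `label_powerset` and the toy requirements.

```python
import math
from collections import Counter


def bm25_vectors(docs, k1=1.5, b=0.75):
    """Return (vocabulary, one BM25-weighted vector per tokenized document).

    Assumed variant: smoothed IDF, log((N - df + 0.5) / (df + 0.5) + 1).
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    vocab = sorted(doc_freq)
    index = {term: i for i, term in enumerate(vocab)}
    idf = {
        t: math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        for t, df in doc_freq.items()
    }
    vectors = []
    for d in docs:
        vec = [0.0] * len(vocab)
        # Length normalization: longer-than-average documents are damped.
        norm = k1 * (1.0 - b + b * len(d) / avg_len)
        for term, tf in Counter(d).items():
            vec[index[term]] = idf[term] * tf * (k1 + 1.0) / (tf + norm)
        vectors.append(vec)
    return vocab, vectors


def label_powerset(label_sets):
    """Map each distinct label combination to a single class id."""
    combo_to_class = {}
    classes = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in combo_to_class:
            combo_to_class[key] = len(combo_to_class)
        classes.append(combo_to_class[key])
    return classes, combo_to_class


# Hypothetical toy requirements and labels, for illustration only.
requirements = [
    "the system shall encrypt all stored passwords",
    "the system shall respond within two seconds",
    "the system shall encrypt traffic and respond within two seconds",
]
labels = [{"security"}, {"performance"}, {"security", "performance"}]

docs = [r.split() for r in requirements]
vocab, X = bm25_vectors(docs)
y, mapping = label_powerset(labels)
```

Here `X` holds one BM25 vector per requirement and `y` holds one class id per distinct label combination, so any multi-class classifier (for example, scikit-learn's `LogisticRegression`) could be trained on the pair. Inverting `mapping` recovers the label set behind each predicted class.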

Results: The large-scale experimental results demonstrate that the proposed MLR-Predictor outperforms the 123 adopted machine learning and 9 deep learning predictive pipelines, as well as the state-of-the-art requirements classification predictor. Specifically, in comparison to the state-of-the-art predictor, it achieves a 13% improvement in macro F1-measure on the PROMISE dataset, a 1% improvement on the EHR-binary dataset, and a 2.5% improvement on the EHR-multiclass dataset.

Discussion: As a case study, the generalizability of the proposed predictor is evaluated on software customer review classification data. In this context, the proposed predictor outperformed the state-of-the-art BERT language model by 1.4% in F1-score. These findings underscore the robustness and effectiveness of the proposed MLR-Predictor in various contexts, establishing its utility as a promising solution for the requirements classification task.

Keywords: OkapiBM25; data transformation; deep learning predictors; label powerset; machine learning classifiers; multi-label requirements; software requirements; swarm optimizer.

Grants and funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.