-
The Impact of Responsible AI Research on Innovation and Development
Authors:
Ali Akbar Septiandri,
Marios Constantinides,
Daniele Quercia
Abstract:
Translational research, especially in the fast-evolving field of Artificial Intelligence (AI), is key to converting scientific findings into practical innovations. In Responsible AI (RAI) research, translational impact is often viewed through various pathways, including research papers, blogs, news articles, and the drafting of forthcoming AI legislation (e.g., the EU AI Act). However, the real-world impact of RAI research remains an underexplored area. Our study aims to capture it through two pathways: patents and code repositories, both of which provide a rich and structured source of data. Using a dataset of 200,000 papers from 1980 to 2022 in AI and related fields, including Computer Vision, Natural Language Processing, and Human-Computer Interaction, we developed a Sentence-Transformers Deep Learning framework to identify RAI papers. This framework calculates the semantic similarity between paper abstracts and a set of RAI keywords, which are derived from NIST's AI Risk Management Framework, a framework that aims to enhance trustworthiness considerations in the design, development, use, and evaluation of AI products, services, and systems. We identified 1,747 RAI papers published in top venues such as CHI, CSCW, NeurIPS, FAccT, and AIES between 2015 and 2022. By analyzing these papers, we found that a small subset that goes into patents or repositories is highly cited, with the translational process taking between 1 year for repositories and up to 8 years for patents. Interestingly, impactful RAI research is not limited to top U.S. institutions; significant contributions also come from European and Asian institutions. Finally, the multidisciplinary nature of RAI papers, often incorporating knowledge from diverse fields of expertise, was evident as these papers tend to build on unconventional combinations of prior knowledge.
Submitted 19 August, 2024; v1 submitted 22 July, 2024;
originally announced July 2024.
-
WEIRD ICWSM: How Western, Educated, Industrialized, Rich, and Democratic is Social Computing Research?
Authors:
Ali Akbar Septiandri,
Marios Constantinides,
Daniele Quercia
Abstract:
Much of the research in social computing analyzes data from social media platforms, which may inherently carry biases. An overlooked source of such bias is the over-representation of WEIRD (Western, Educated, Industrialized, Rich, and Democratic) populations, which might not accurately mirror the global demographic diversity. We evaluated the dependence on WEIRD populations in research presented at the AAAI ICWSM conference, the only venue whose proceedings are fully dedicated to social computing research. We did so by analyzing 494 papers published from 2018 to 2022, which included full research papers, dataset papers and posters. After filtering out papers that analyze synthetic datasets or those lacking clear country of origin, we were left with 420 papers from which 188 participants in a crowdsourcing study with full manual validation extracted data for the WEIRD scores computation. This data was then used to adapt existing WEIRD metrics to be applicable for social media data. We found that 37% of these papers focused solely on data from Western countries. This percentage is significantly less than the percentages observed in research from CHI (76%) and FAccT (84%) conferences, suggesting a greater diversity of dataset origins within ICWSM. However, the studies at ICWSM still predominantly examine populations from countries that are more Educated, Industrialized, and Rich in comparison to those in FAccT, with a special note on the 'Democratic' variable reflecting political freedoms and rights. This points out the utility of social media data in shedding light on findings from countries with restricted political freedoms. Based on these insights, we recommend extensions of current "paper checklists" to include considerations about the WEIRD bias and call for the community to broaden research inclusivity by encouraging the use of diverse datasets from underrepresented regions.
Submitted 11 June, 2024; v1 submitted 4 June, 2024;
originally announced June 2024.
-
The Potential Impact of AI Innovations on U.S. Occupations
Authors:
Ali Akbar Septiandri,
Marios Constantinides,
Daniele Quercia
Abstract:
An occupation is comprised of interconnected tasks, and it is these tasks, not occupations themselves, that are affected by AI. To evaluate how tasks may be impacted, previous approaches utilized manual annotations or coarse-grained matching. Leveraging recent advancements in machine learning, we replace coarse-grained matching with more precise deep learning approaches. Introducing the AI Impact (AII) measure, we employ Deep Learning Natural Language Processing to automatically identify AI patents that may impact various occupational tasks at scale. Our methodology relies on a comprehensive dataset of 17,879 task descriptions and quantifies AI's potential impact through analysis of 24,758 AI patents filed with the United States Patent and Trademark Office (USPTO) between 2015 and 2022. Our results reveal that some occupations will potentially be impacted, and that impact is intricately linked to specific skills. These include not only routine tasks (codified as a series of steps), as previously thought, but also non-routine ones (e.g., diagnosing health conditions, programming computers, and tracking flight routes). However, AI's impact on labour is limited by the fact that some of the occupations affected are augmented rather than replaced (e.g., neurologists, software engineers, air traffic controllers), and the sectors affected are experiencing labour shortages (e.g., IT, Healthcare, Transport).
Submitted 30 July, 2024; v1 submitted 7 December, 2023;
originally announced December 2023.
-
WEIRD FAccTs: How Western, Educated, Industrialized, Rich, and Democratic is FAccT?
Authors:
Ali Akbar Septiandri,
Marios Constantinides,
Mohammad Tahaei,
Daniele Quercia
Abstract:
Studies conducted on Western, Educated, Industrialized, Rich, and Democratic (WEIRD) samples are considered atypical of the world's population and may not accurately represent human behavior. In this study, we aim to quantify the extent to which the ACM FAccT conference, the leading venue in exploring Artificial Intelligence (AI) systems' fairness, accountability, and transparency, relies on WEIRD samples. We collected and analyzed 128 papers published between 2018 and 2022, accounting for 30.8% of the overall proceedings published at FAccT in those years (excluding abstracts, tutorials, and papers without human-subject studies or clear country attribution for the participants). We found that 84% of the analyzed papers were exclusively based on participants from Western countries, particularly exclusively from the U.S. (63%). Only researchers who undertook the effort to collect data about local participants through interviews or surveys added diversity to an otherwise U.S.-centric view of science. Therefore, we suggest that researchers collect data from under-represented populations to obtain an inclusive worldview. To achieve this goal, scientific communities should champion data collection from such populations and enforce transparent reporting of data biases.
Submitted 10 May, 2023;
originally announced May 2023.
-
NusaCrowd: Open Source Initiative for Indonesian NLP Resources
Authors:
Samuel Cahyawijaya,
Holy Lovenia,
Alham Fikri Aji,
Genta Indra Winata,
Bryan Wilie,
Rahmad Mahendra,
Christian Wibisono,
Ade Romadhony,
Karissa Vincentio,
Fajri Koto,
Jennifer Santoso,
David Moeljadi,
Cahya Wirawan,
Frederikus Hudi,
Ivan Halim Parmonangan,
Ika Alfina,
Muhammad Satrio Wicaksono,
Ilham Firdausi Putra,
Samsul Rahmadani,
Yulianti Oenang,
Ali Akbar Septiandri,
James Jaya,
Kaustubh D. Dhole,
Arie Ardiyanti Suryani,
Rifki Afina Putri, et al. (22 additional authors not shown)
Abstract:
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for Indonesian languages, including opening access to previously non-public resources. Through this initiative, we have brought together 137 datasets and 118 standardized data loaders. The quality of the datasets has been assessed manually and automatically, and their value is demonstrated through multiple experiments. NusaCrowd's data collection enables the creation of the first zero-shot benchmarks for natural language understanding and generation in Indonesian and the local languages of Indonesia. Furthermore, NusaCrowd brings the creation of the first multilingual automatic speech recognition benchmark in Indonesian and the local languages of Indonesia. Our work strives to advance natural language processing (NLP) research for languages that are under-represented despite being widely spoken.
Submitted 21 July, 2023; v1 submitted 19 December, 2022;
originally announced December 2022.
-
Cost-Sensitive Machine Learning Classification for Mass Tuberculosis Verbal Screening
Authors:
Ali Akbar Septiandri,
Aditiawarman,
Roy Tjiong,
Erlina Burhan,
Anuraj Shankar
Abstract:
Score-based algorithms for tuberculosis (TB) verbal screening perform poorly, causing misclassification that leads to missed cases and unnecessary costly laboratory tests for false positives. We compared score-based classification defined by clinicians to machine learning classifiers such as SVM-RBF, logistic regression, and XGBoost. We restricted our analyses to data from adults, the population most affected by TB, and compared untuned, unweighted classifiers with cost-sensitive ones. Predictions were compared with the corresponding GeneXpert MTB/Rif results. After adjusting the weight of the positive class to 40 for XGBoost, we achieved 96.64% sensitivity and 35.06% specificity. As such, the sensitivity of our identifier increased by 1.26% while specificity increased by 13.19% in absolute value compared to the traditional score-based method defined by our clinicians. Our approach further demonstrated that only 2000 data points were sufficient to enable the model to converge. The results indicate that even with limited data we can devise a better method to identify TB suspects from verbal screening.
Submitted 14 November, 2020;
originally announced November 2020.
-
Human Blastocyst Classification after In Vitro Fertilization Using Deep Learning
Authors:
Ali Akbar Septiandri,
Ade Jamal,
Pritta Ameilia Iffanolida,
Oki Riayati,
Budi Wiweko
Abstract:
Embryo quality assessment after in vitro fertilization (IVF) is primarily done visually by embryologists. Variability among assessors, however, remains one of the main causes of the low success rate of IVF. This study aims to develop an automated embryo assessment based on a deep learning model. This study includes a total of 1084 images from 1226 embryos. The images were captured by an inverted microscope at day 3 after fertilization. The images were labelled based on Veeck criteria, which classify embryos into grades 1 to 5 based on the size of the blastomere and the grade of fragmentation. Our deep learning grading results were compared to the grading results from trained embryologists to evaluate the model performance. Our best model, obtained by fine-tuning a pre-trained ResNet50 on the dataset, reaches 91.79% accuracy. The model presented could be developed into an automated embryo assessment method in point-of-care settings.
Submitted 28 August, 2020;
originally announced August 2020.
-
UKARA 1.0 Challenge Track 1: Automatic Short-Answer Scoring in Bahasa Indonesia
Authors:
Ali Akbar Septiandri,
Yosef Ardhito Winatmoko
Abstract:
We describe our third-place solution to the UKARA 1.0 challenge on automated essay scoring. The task consists of a binary classification problem on two datasets: answers from two different questions. We ended up using two different models for the two datasets. For task A, we applied a random forest algorithm on features extracted using unigrams with latent semantic analysis (LSA). On the other hand, for task B, we only used logistic regression on TF-IDF features. Our model achieves an F1 score of 0.812.
Submitted 27 February, 2020;
originally announced February 2020.
-
Aspect and Opinion Term Extraction for Hotel Reviews using Transfer Learning and Auxiliary Labels
Authors:
Yosef Ardhito Winatmoko,
Ali Akbar Septiandri,
Arie Pratama Sutiono
Abstract:
Aspect and opinion term extraction is a critical step in Aspect-Based Sentiment Analysis (ABSA). Our study focuses on evaluating transfer learning using pre-trained BERT (Devlin et al., 2018) to classify tokens from hotel reviews in bahasa Indonesia. The primary challenge is the language informality of the review texts. By utilizing transfer learning from a multilingual model, we achieved up to 2% difference on token level F1-score compared to the state-of-the-art Bi-LSTM model with fewer training epochs (3 vs. 200 epochs). The fine-tuned model clearly outperforms the Bi-LSTM model on the entity level. Furthermore, we propose a method to include CRF with auxiliary labels as an output layer for the BERT-based models. The CRF addition further improves the F1-score for both token and entity level.
Submitted 1 October, 2020; v1 submitted 26 September, 2019;
originally announced September 2019.
-
Aspect and Opinion Terms Extraction Using Double Embeddings and Attention Mechanism for Indonesian Hotel Reviews
Authors:
Jordhy Fernando,
Masayu Leylia Khodra,
Ali Akbar Septiandri
Abstract:
Aspect and opinion term extraction from review texts is one of the key tasks in aspect-based sentiment analysis. To extract aspect and opinion terms from Indonesian hotel reviews, we adapt the double embeddings feature and attention mechanism that outperform the best system at SemEval 2015 and 2016. We conduct experiments using 4000 reviews to find the best configuration and show the influence of double embeddings and the attention mechanism on model performance. Using 1000 reviews for evaluation, we achieved F1-measures of 0.914 and 0.90 for aspect and opinion term extraction at the token and entity (term) levels, respectively.
Submitted 19 August, 2019; v1 submitted 13 August, 2019;
originally announced August 2019.
-
Predicting the Gender of Indonesian Names
Authors:
Ali Akbar Septiandri
Abstract:
We investigated a way to predict the gender of a name using a character-level Long Short-Term Memory network (char-LSTM). We compared our method with some conventional machine learning methods, namely Naive Bayes, logistic regression, and XGBoost with n-grams as the features. We evaluated the models on a dataset consisting of the names of Indonesian people. It is not common to use a family name as the surname in Indonesian culture, except in some ethnicities. Therefore, we inferred the gender from both full names and first names. The results show that we can achieve 92.25% accuracy from full names, while using first names only yields 90.65% accuracy. These results are better than the ones from applying the classical machine learning algorithms to n-grams.
Submitted 17 September, 2017; v1 submitted 22 July, 2017;
originally announced July 2017.