-
Measuring the Accuracy of Automatic Speech Recognition Solutions
Authors:
Korbinian Kuhn,
Verena Kersken,
Benedikt Reuter,
Niklas Egger,
Gottfried Zimmermann
Abstract:
For d/Deaf and hard of hearing (DHH) people, captioning is an essential accessibility tool. Significant developments in artificial intelligence (AI) mean that Automatic Speech Recognition (ASR) is now a part of many popular applications. This makes creating captions easy and broadly available, but transcription needs high levels of accuracy to be accessible. Scientific publications and industry report very low error rates, claiming AI has reached human parity or even outperforms manual transcription. At the same time, the DHH community reports serious issues with the accuracy and reliability of ASR. There seems to be a mismatch between technical innovations and the real-life experience of people who depend on transcription. Independent and comprehensive data is needed to capture the state of ASR. We measured the performance of eleven common ASR services with recordings of Higher Education lectures. We evaluated the influence of technical conditions like streaming, the use of vocabularies, and differences between languages. Our results show that accuracy ranges widely between vendors and for the individual audio samples. We also measured a significantly lower quality for streaming ASR, which is used for live events. Our study shows that despite the recent improvements of ASR, common services lack reliability in accuracy.
Submitted 29 August, 2024;
originally announced August 2024.
-
Beyond Levenshtein: Leveraging Multiple Algorithms for Robust Word Error Rate Computations And Granular Error Classifications
Authors:
Korbinian Kuhn,
Verena Kersken,
Gottfried Zimmermann
Abstract:
The Word Error Rate (WER) is the common measure of accuracy for Automatic Speech Recognition (ASR). Transcripts are usually pre-processed by substituting specific characters to account for non-semantic differences. As a result of this normalisation, information on the accuracy of punctuation or capitalisation is lost. We present a non-destructive, token-based approach using an extended Levenshtein distance algorithm to compute a robust WER and additional orthographic metrics. Transcription errors are also classified more granularly by existing string similarity and phonetic algorithms. An evaluation on several datasets demonstrates the practical equivalence of our approach compared to common WER computations. We also provide an exemplary analysis of derived use cases, such as a punctuation error rate, and a web application for interactive use and visualisation of our implementation. The code is available open-source.
Submitted 28 August, 2024;
originally announced August 2024.
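As a point of reference for the abstract above, the conventional WER it starts from can be sketched as a word-level Levenshtein distance divided by the number of reference words. This is a minimal illustration only, not the authors' extended token-based algorithm; the function name `word_error_rate` and the whitespace tokenisation are assumptions made for the example. No normalisation is applied, so differences in case or punctuation count as errors.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Plain WER: word-level edit distance / number of reference words.

    Hypothetical helper for illustration; tokens are split on whitespace
    and compared exactly (case and punctuation are NOT normalised).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because the comparison is exact, `word_error_rate("Hello", "hello")` counts a full error, which is the kind of orthographic difference that normalisation would otherwise hide.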
-
GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization
Authors:
Xuhai Xu,
Han Zhang,
Yasaman Sefidgar,
Yiyi Ren,
Xin Liu,
Woosuk Seo,
Jennifer Brown,
Kevin Kuehn,
Mike Merrill,
Paula Nurius,
Shwetak Patel,
Tim Althoff,
Margaret E. Morris,
Eve Riskin,
Jennifer Mankoff,
Anind K. Dey
Abstract:
Recent research has demonstrated the capability of behavior signals captured by smartphones and wearables for longitudinal behavior modeling. However, there is a lack of a comprehensive public dataset that serves as an open testbed for fair comparison among algorithms. Moreover, prior studies mainly evaluate algorithms using data from a single population within a short period, without measuring the cross-dataset generalizability of these algorithms. We present the first multi-year passive sensing datasets, containing over 700 user-years and 497 unique users' data collected from mobile and wearable sensors, together with a wide range of well-being metrics. Our datasets can support multiple cross-dataset evaluations of behavior modeling algorithms' generalizability across different users and years. As a starting point, we provide the benchmark results of 18 algorithms on the task of depression detection. Our results indicate that both prior depression detection algorithms and domain generalization techniques show potential but need further research to achieve adequate cross-dataset generalizability. We envision our multi-year datasets can support the ML community in developing generalizable longitudinal behavior modeling algorithms.
Submitted 4 March, 2023; v1 submitted 4 November, 2022;
originally announced November 2022.
-
Data Driven Prediction of Battery Cycle Life Before Capacity Degradation
Authors:
Anmol Singh,
Caitlin Feltner,
Jamie Peck,
Kurt I. Kuhn
Abstract:
Ubiquitous use of lithium-ion batteries across multiple industries presents an opportunity to explore cost saving initiatives as the price to performance ratio continually decreases in a competitive environment. Manufacturers using lithium-ion batteries ranging in applications from mobile phones to electric vehicles need to know how long batteries will last for a given service life. To understand this, expensive testing is required.
This paper uses the data and methods of Kristen A. Severson et al. to explore the methodologies that research team used, and presents another method for comparing predicted results with actual test data for battery capacity fade. The fundamental aim is to find out whether machine learning techniques can be trained on early life cycle data to accurately predict battery capacity over the battery life cycle. Results compare Gaussian Process Regression (GPR) with Elastic Net Regression (ENR) and highlight the key data features used from the extensive dataset of Severson et al.
Submitted 18 October, 2021;
originally announced October 2021.
-
A machine learning approach to galaxy properties: joint redshift-stellar mass probability distributions with Random Forest
Authors:
S. Mucesh,
W. G. Hartley,
A. Palmese,
O. Lahav,
L. Whiteway,
A. F. L. Bluck,
A. Alarcon,
A. Amon,
K. Bechtol,
G. M. Bernstein,
A. Carnero Rosell,
M. Carrasco Kind,
A. Choi,
K. Eckert,
S. Everett,
D. Gruen,
R. A. Gruendl,
I. Harrison,
E. M. Huff,
N. Kuropatkin,
I. Sevilla-Noarbe,
E. Sheldon,
B. Yanny,
M. Aguena,
S. Allam
, et al. (50 additional authors not shown)
Abstract:
We demonstrate that highly accurate joint redshift-stellar mass probability distribution functions (PDFs) can be obtained using the Random Forest (RF) machine learning (ML) algorithm, even with few photometric bands available. As an example, we use the Dark Energy Survey (DES), combined with the COSMOS2015 catalogue for redshifts and stellar masses. We build two ML models: one containing deep photometry in the $griz$ bands, and the second reflecting the photometric scatter present in the main DES survey, with carefully constructed representative training data in each case. We validate our joint PDFs for $10,699$ test galaxies by utilizing the copula probability integral transform and the Kendall distribution function, and their univariate counterparts to validate the marginals. Benchmarked against a basic set-up of the template-fitting code BAGPIPES, our ML-based method outperforms template fitting on all of our predefined performance metrics. In addition to accuracy, the RF is extremely fast, able to compute joint PDFs for a million galaxies in just under $6$ min with consumer computer hardware. Such speed enables PDFs to be derived in real time within analysis codes, solving potential storage issues. As part of this work we have developed GALPRO, a highly intuitive and efficient Python package to rapidly generate multivariate PDFs on-the-fly. GALPRO is documented and available for researchers to use in their cosmology and galaxy evolution studies.
Submitted 19 February, 2021; v1 submitted 10 December, 2020;
originally announced December 2020.
-
Machine Learning for Searching the Dark Energy Survey for Trans-Neptunian Objects
Authors:
B. Henghes,
O. Lahav,
D. W. Gerdes,
E. Lin,
R. Morgan,
T. M. C. Abbott,
M. Aguena,
S. Allam,
J. Annis,
S. Avila,
E. Bertin,
D. Brooks,
D. L. Burke,
A. Carnero Rosell,
M. Carrasco Kind,
J. Carretero,
C. Conselice,
M. Costanzi,
L. N. da Costa,
J. De Vicente,
S. Desai,
H. T. Diehl,
P. Doel,
S. Everett,
I. Ferrero
, et al. (34 additional authors not shown)
Abstract:
In this paper we investigate how implementing machine learning could improve the efficiency of the search for Trans-Neptunian Objects (TNOs) within Dark Energy Survey (DES) data when used alongside orbit fitting. The discovery of multiple TNOs that appear to show a similarity in their orbital parameters has led to the suggestion that one or more undetected planets, an as yet undiscovered "Planet 9", may be present in the outer Solar System. DES is well placed to detect such a planet and has already been used to discover many other TNOs. Here, we perform tests on eight different supervised machine learning algorithms, using a dataset consisting of simulated TNOs buried within real DES noise data. We found that the best performing classifier was the Random Forest which, when optimised, performed well at detecting the rare objects. We achieve an area under the receiver operating characteristic (ROC) curve, (AUC) $= 0.996 \pm 0.001$. After optimizing the decision threshold of the Random Forest, we achieve a recall of 0.96 while maintaining a precision of 0.80. Finally, by using the optimized classifier to pre-select objects, we are able to run the orbit-fitting stage of our detection pipeline five times faster.
Submitted 10 December, 2020; v1 submitted 27 September, 2020;
originally announced September 2020.
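The trade-off described in the abstract above, where moving a classifier's decision threshold changes recall and precision, can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' pipeline: the helper name `precision_recall_at` and the toy scores and labels are hypothetical and do not come from the paper.

```python
from typing import Sequence, Tuple


def precision_recall_at(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when score >= threshold counts as a positive.

    Hypothetical helper for illustration; labels are 1 (positive) or 0.
    """
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1  # correctly flagged object
        elif predicted_positive and label == 0:
            fp += 1  # false alarm
        elif not predicted_positive and label == 1:
            fn += 1  # missed object
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy classifier scores and ground-truth labels (made up for the example).
    toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
    toy_labels = [1, 0, 1, 0, 0]
    for t in (0.5, 0.35):
        print(t, precision_recall_at(toy_scores, toy_labels, t))
```

Lowering the threshold in the toy data recovers the missed positive, so recall rises; in general this comes at the cost of more false alarms, which is why a threshold is tuned rather than left at a default.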
-
How Does COVID-19 impact Students with Disabilities/Health Concerns?
Authors:
Han Zhang,
Paula Nurius,
Yasaman Sefidgar,
Margaret Morris,
Sreenithi Balasubramanian,
Jennifer Brown,
Anind K. Dey,
Kevin Kuehn,
Eve Riskin,
Xuhai Xu,
Jen Mankoff
Abstract:
The impact of COVID-19 on students has been enormous, with an increase in worries about fiscal and physical health, a rapid shift to online learning, and increased isolation. In addition to these changes, students with disabilities/health concerns may face accessibility problems with online learning or communication tools, and their stress may be compounded by additional risks such as financial stress or pre-existing conditions. To our knowledge, no one has looked specifically at the impact of COVID-19 on students with disabilities/health concerns. In this paper, we present data from a survey of 147 students with and without disabilities collected in late March to early April of 2020 to assess the impact of COVID-19 on these students' education and mental health. Our findings show that students with disabilities/health concerns were more concerned about classes going online than their peers without disabilities. In addition, students with disabilities/health concerns also reported that they have experienced more COVID-19 related adversities compared to their peers without disabilities/health concerns. We argue that students with disabilities/health concerns in higher education need confidence in the accessibility of the online learning tools that are becoming increasingly prevalent in higher education not only because of COVID-19 but also more generally. In addition, educational technologies will be more accessible if they consider the learning context, and are designed to provide a supportive, calm, and connecting learning environment.
Submitted 6 May, 2021; v1 submitted 11 May, 2020;
originally announced May 2020.