Showing 1–22 of 22 results for author: Hines, A

  1. arXiv:2408.16754

    cs.RO

    A compact neuromorphic system for ultra energy-efficient, on-device robot localization

    Authors: Adam D. Hines, Michael Milford, Tobias Fischer

    Abstract: Neuromorphic computing offers a transformative pathway to overcome the computational and energy challenges faced in deploying robotic localization and navigation systems at the edge. Visual place recognition, a critical component for navigation, is often hampered by the high resource demands of conventional systems, making them unsuitable for small-scale robotic platforms which still require to pe…

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: 28 pages, 4 main figures, 4 supplementary figures, 1 supplementary table, and 1 movie. Under review

  2. arXiv:2403.15336

    eess.AS cs.MM

    Dialogue Understandability: Why are we streaming movies with subtitles?

    Authors: Helard Becerra Martinez, Alessandro Ragano, Diptasree Debnath, Asad Ullah, Crisron Rudolf Lucas, Martin Walsh, Andrew Hines

    Abstract: Watching movies and TV shows with subtitles enabled is not simply down to audibility or speech intelligibility. A variety of evolving factors related to technological advances, cinema production and social behaviour challenge our perception and understanding. This study seeks to formalise and give context to these influential factors under a wider and novel term referred to as Dialogue Understanda…

    Submitted 22 March, 2024; originally announced March 2024.

  3. arXiv:2309.16284

    cs.SD eess.AS

    NOMAD: Unsupervised Learning of Perceptual Embeddings for Speech Enhancement and Non-matching Reference Audio Quality Assessment

    Authors: Alessandro Ragano, Jan Skoglund, Andrew Hines

    Abstract: This paper presents NOMAD (Non-Matching Audio Distance), a differentiable perceptual similarity metric that measures the distance of a degraded signal against non-matching references. The proposed method is based on learning deep feature embeddings via a triplet loss guided by the Neurogram Similarity Index Measure (NSIM) to capture degradation intensity. During inference, the similarity score bet…

    Submitted 19 January, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: Accepted for ICASSP 2024

  4. arXiv:2309.12763

    eess.AS cs.CL cs.SD

    Reduce, Reuse, Recycle: Is Perturbed Data better than Other Language augmentation for Low Resource Self-Supervised Speech Models

    Authors: Asad Ullah, Alessandro Ragano, Andrew Hines

    Abstract: Self-supervised representation learning (SSRL) has demonstrated superior performance than supervised models for tasks including phoneme recognition. Training SSRL models poses a challenge for low-resource languages where sufficient pre-training data may not be available. A common approach is cross-lingual pre-training. Instead, we propose to use audio augmentation techniques, namely: pitch variati…

    Submitted 28 June, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: Paper accepted at Interspeech 2024

  5. arXiv:2309.10225

    cs.RO

    VPRTempo: A Fast Temporally Encoded Spiking Neural Network for Visual Place Recognition

    Authors: Adam D. Hines, Peter G. Stratton, Michael Milford, Tobias Fischer

    Abstract: Spiking Neural Networks (SNNs) are at the forefront of neuromorphic computing thanks to their potential energy-efficiency, low latencies, and capacity for continual learning. While these capabilities are well suited for robotics tasks, SNNs have seen limited adaptation in this field thus far. This work introduces a SNN for Visual Place Recognition (VPR) that is both trainable within minutes and qu…

    Submitted 29 February, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: 8 pages, 3 figures, accepted to the IEEE International Conference on Robotics and Automation (ICRA) 2024

  6. arXiv:2306.08959

    cs.CY

    Statutory Professions in AI governance and their consequences for explainable AI

    Authors: Labhaoise NiFhaolain, Andrew Hines, Vivek Nallur

    Abstract: Intentional and accidental harms arising from the use of AI have impacted the health, safety and rights of individuals. While regulatory frameworks are being developed, there remains a lack of consensus on methods necessary to deliver safe AI. The potential for explainable AI (XAI) to contribute to the effectiveness of the regulation of AI is being increasingly examined. Regulation must include me…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: Accepted for publication at xAI-2023 conference

  7. arXiv:2211.07445

    eess.AS cs.SD q-bio.QM

    Exploring the Impact of Noise and Degradations on Heart Sound Classification Models

    Authors: Davoud Shariat Panah, Andrew Hines, Susan McKeever

    Abstract: The development of data-driven heart sound classification models has been an active area of research in recent years. To develop such data-driven models in the first place, heart sound signals need to be captured using a signal acquisition device. However, it is almost impossible to capture noise-free heart sound signals due to the presence of internal and external noises in most situations. Such…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: Submitted to Computers in Biology and Medicine Journal

  8. arXiv:2210.15310

    eess.AS cs.SD

    Learning Music Representations with wav2vec 2.0

    Authors: Alessandro Ragano, Emmanouil Benetos, Andrew Hines

    Abstract: Learning music representations that are general-purpose offers the flexibility to finetune several downstream tasks using smaller datasets. The wav2vec 2.0 speech representation model showed promising results in many downstream speech tasks, but has been less effective when adapted to music. In this paper, we evaluate whether pre-training wav2vec 2.0 directly on music data can be a better solution…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: Submitted to ICASSP 2023

  9. arXiv:2209.06358

    cs.SD cs.LG eess.AS

    Using Rater and System Metadata to Explain Variance in the VoiceMOS Challenge 2022 Dataset

    Authors: Michael Chinen, Jan Skoglund, Chandan K A Reddy, Alessandro Ragano, Andrew Hines

    Abstract: Non-reference speech quality models are important for a growing number of applications. The VoiceMOS 2022 challenge provided a dataset of synthetic voice conversion and text-to-speech samples with subjective labels. This study looks at the amount of variance that can be explained in subjective ratings of speech quality from metadata and the distribution imbalances of the dataset. Speech quality mo…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: Preprint; accepted for Interspeech 2022

  10. arXiv:2202.02454

    cs.NI cs.LG cs.MM

    Supervised Learning based QoE Prediction of Video Streaming in Future Networks: A Tutorial with Comparative Study

    Authors: Arslan Ahmad, Atif Bin Mansoor, Alcardo Alex Barakabitze, Andrew Hines, Luigi Atzori, Ray Walshe

    Abstract: The Quality of Experience (QoE) based service management remains key for successful provisioning of multimedia services in next-generation networks such as 5G/6G, which requires proper tools for quality monitoring, prediction and resource management where machine learning (ML) can play a crucial role. In this paper, we provide a tutorial on the development and deployment of the QoE measurement and…

    Submitted 3 January, 2022; originally announced February 2022.

    Journal ref: IEEE Communications Magazine, vol. 59, no. 11, pp. 88-94, November 2021

  11. AQP: An Open Modular Python Platform for Objective Speech and Audio Quality Metrics

    Authors: Jack Geraghty, Jiazheng Li, Alessandro Ragano, Andrew Hines

    Abstract: Audio quality assessment has been widely researched in the signal processing area. Full-reference objective metrics (e.g., POLQA, ViSQOL) have been developed to estimate the audio quality relying only on human rating experiments. To evaluate the audio quality of novel audio processing techniques, researchers constantly need to compare objective quality metrics. Testing different implementations of…

    Submitted 30 June, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

    Comments: 6 pages, 3 figures, accepted and presented at ACM MMSys22, June, 2022, Athlone, Ireland

    ACM Class: H.5.5; D.2.11; D.2.13

  12. More for Less: Non-Intrusive Speech Quality Assessment with Limited Annotations

    Authors: Alessandro Ragano, Emmanouil Benetos, Andrew Hines

    Abstract: Non-intrusive speech quality assessment is a crucial operation in multimedia applications. The scarcity of annotated data and the lack of a reference signal represent some of the main challenges for designing efficient quality assessment metrics. In this paper, we propose two multi-task models to tackle the problems above. In the first model, we first learn a feature representation with a degradat…

    Submitted 19 August, 2021; originally announced August 2021.

    Comments: Published in 2021 13th International Conference on Quality of Multimedia Experience (QoMEX)

  13. arXiv:2007.07032

    cs.MM

    QUALINET White Paper on Definitions of Immersive Media Experience (IMEx)

    Authors: Andrew Perkis, Christian Timmerer, Sabina Baraković, Jasmina Baraković Husić, Søren Bech, Sebastian Bosse, Jean Botev, Kjell Brunnström, Luis Cruz, Katrien De Moor, Andrea de Polo Saibanti, Wouter Durnez, Sebastian Egger-Lampl, Ulrich Engelke, Tiago H. Falk, Jesús Gutiérrez, Asim Hameed, Andrew Hines, Tanja Kojic, Dragan Kukolj, Eirini Liotou, Dragorad Milovanovic, Sebastian Möller, Niall Murray, Babak Naderi, et al. (19 additional authors not shown)

    Abstract: With the coming of age of virtual/augmented reality and interactive media, numerous definitions, frameworks, and models of immersion have emerged across different fields ranging from computer graphics to literary works. Immersion is oftentimes used interchangeably with presence as both concepts are closely related. However, there are noticeable interdisciplinary differences regarding definitions,…

    Submitted 24 November, 2020; v1 submitted 10 June, 2020; originally announced July 2020.

  14. arXiv:2006.14750

    cs.CY

    Could regulating the creators deliver trustworthy AI?

    Authors: Labhaoise Ni Fhaolain, Andrew Hines

    Abstract: Is a new regulated profession, such as Artificial Intelligence (AI) Architect who is responsible and accountable for AI outputs necessary to ensure trustworthy AI? AI is becoming all pervasive and is often deployed in everyday technologies, devices and services without our knowledge. There is heightened awareness of AI in recent years which has brought with it fear. This fear is compounded by the…

    Submitted 25 June, 2020; originally announced June 2020.

    Comments: To be published in The Second Workshop on Implementing Machine Ethics, Dublin, Ireland, 30 June 2020

  15. arXiv:2004.09584

    eess.AS cs.SD eess.SP

    ViSQOL v3: An Open Source Production Ready Objective Speech and Audio Metric

    Authors: Michael Chinen, Felicia S. C. Lim, Jan Skoglund, Nikita Gureev, Feargus O'Gorman, Andrew Hines

    Abstract: Estimation of perceptual quality in audio and speech is possible using a variety of methods. The combined v3 release of ViSQOL and ViSQOLAudio (for speech and audio, respectively) provides improvements upon previous versions, in terms of both design and usage. As an open source C++ library or binary with permissive licensing, ViSQOL can now be deployed beyond the research context into production…

    Submitted 20 April, 2020; originally announced April 2020.

    Comments: 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX)

  16. How Crisp is the Crease? A Subjective Study on Web Browsing Perception of Above-The-Fold

    Authors: Hamed Z. Jahromi, Declan T. Delaney, Andrew Hines

    Abstract: Quality of Experience (QoE) for various types of websites has gained significant attention in recent years. In order to design and evaluate websites, a metric that can estimate a user's experienced quality robustly for diverse content is necessary. SpeedIndex (SI) has been widely adopted to estimate perceived web page loading progress. It measures the speed of rendering pixels for the webpage that…

    Submitted 8 April, 2020; originally announced April 2020.

  17. arXiv:2003.11882

    eess.AS cs.SD

    Speech Quality Factors for Traditional and Neural-Based Low Bit Rate Vocoders

    Authors: Wissam A. Jassim, Jan Skoglund, Michael Chinen, Andrew Hines

    Abstract: This study compares the performances of different algorithms for coding speech at low bit rates. In addition to widely deployed traditional vocoders, a selection of recently developed generative-model-based coders at different bit rates are contrasted. Performance analysis of the coded speech is evaluated for different quality aspects: accuracy of pitch periods estimation, the word error rates for…

    Submitted 26 March, 2020; originally announced March 2020.

    Comments: 6 pages, 11 figures, conference

  18. arXiv:2003.11100

    cs.MM cs.CV cs.LG eess.IV

    How deep is your encoder: an analysis of features descriptors for an autoencoder-based audio-visual quality metric

    Authors: Helard Martinez, Andrew Hines, Mylene C. Q. Farias

    Abstract: The development of audio-visual quality assessment models poses a number of challenges in order to obtain accurate predictions. One of these challenges is the modelling of the complex interaction that audio and visual stimuli have and how this interaction is interpreted by human users. The No-Reference Audio-Visual Quality Metric Based on a Deep Autoencoder (NAViDAd) deals with this problem from a…

    Submitted 24 March, 2020; originally announced March 2020.

  19. You Drive Me Crazy! Interactive QoE Assessment for Telepresence Robot Control

    Authors: Hamed Z. Jahromi, Ivan Bartolec, Edwin Gamboa, Andrew Hines, Raimund Schatz

    Abstract: Telepresence robots (TPRs) are versatile, remotely controlled vehicles that enable physical presence and human-to-human interaction over a distance. Thanks to improving hardware and dropping price points, TPRs enjoy the growing interest in various industries and application domains. Still, a satisfying experience remains key for their acceptance and successful adoption, not only in terms of enabli…

    Submitted 24 March, 2020; originally announced March 2020.

  20. Audio Impairment Recognition Using a Correlation-Based Feature Representation

    Authors: Alessandro Ragano, Emmanouil Benetos, Andrew Hines

    Abstract: Audio impairment recognition is based on finding noise in audio files and categorising the impairment type. Recently, significant performance improvement has been obtained thanks to the usage of advanced deep learning models. However, feature robustness is still an unresolved issue and it is one of the main reasons why we need powerful deep learning architectures. In the presence of a variety of m…

    Submitted 24 March, 2020; v1 submitted 22 March, 2020; originally announced March 2020.

    Comments: This publication has been accepted in 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX)

  21. NAViDAd: A No-Reference Audio-Visual Quality Metric Based on a Deep Autoencoder

    Authors: Helard Martinez, M. C. Farias, A. Hines

    Abstract: The development of models for quality prediction of both audio and video signals is a fairly mature field. But, although several multimodal models have been proposed, the area of audio-visual quality prediction is still an emerging area. In fact, despite the reasonable performance obtained by combination and parametric metrics, currently there is no reliable pixel-based audio-visual quality metric…

    Submitted 4 February, 2020; v1 submitted 30 January, 2020; originally announced January 2020.

    Comments: 5 pages

    Journal ref: 2019 27th European Signal Processing Conference (EUSIPCO), IEEE, 2019, pp 1-5

  22. arXiv:1912.02802

    cs.NI cs.DC cs.MM eess.SY

    5G network slicing using SDN and NFV- A survey of taxonomy, architectures and future challenges

    Authors: Alcardo Alex Barakabitze, Arslan Ahmad, Rashid Mijumbi, Andrew Hines

    Abstract: In this paper, we provide a comprehensive review and updated solutions related to 5G network slicing using SDN and NFV. Firstly, we present 5G service quality and business requirements followed by a description of 5G network softwarization and slicing paradigms including essential concepts, history and different use cases. Secondly, we provide a tutorial of 5G network slicing technology enablers i…

    Submitted 5 December, 2019; originally announced December 2019.

    Comments: 40 pages, 22 figures, published in Computer Networks (Open Access)

    MSC Class: 68 (Computer Science)

    Journal ref: 2019