Showing 1–20 of 20 results for author: Junior, A C

  1. arXiv:2407.20989

    cs.SD cs.LG eess.AS

    Contrasting Deep Learning Models for Direct Respiratory Insufficiency Detection Versus Blood Oxygen Saturation Estimation

    Authors: Marcelo Matheus Gauy, Natalia Hitomi Koza, Ricardo Mikio Morita, Gabriel Rocha Stanzione, Arnaldo Candido Junior, Larissa Cristina Berti, Anna Sara Shafferman Levin, Ester Cerdeira Sabino, Flaviane Romani Fernandes Svartman, Marcelo Finger

    Abstract: We contrast the high effectiveness of state-of-the-art deep learning architectures designed for general audio classification tasks, refined for respiratory insufficiency (RI) detection and blood oxygen saturation (SpO$_2$) estimation and classification through automated audio analysis. Recently, multiple deep learning architectures have been proposed to detect RI in COVID patients through audio analys…

    Submitted 30 July, 2024; originally announced July 2024.

    Comments: 23 pages, 4 figures, in review at Journal of Biomedical Signal Processing and Control

  2. An Incremental MaxSAT-based Model to Learn Interpretable and Balanced Classification Rules

    Authors: Antônio Carlos Souza Ferreira Júnior, Thiago Alves Rocha

    Abstract: The increasing advancements in the field of machine learning have led to the development of numerous applications that effectively address a wide range of problems with accurate predictions. However, in certain cases, accuracy alone may not be sufficient. Many real-world problems also demand explanations and interpretability behind the predictions. One of the most popular interpretable models that…

    Submitted 29 April, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: 16 pages, 5 tables, submitted to BRACIS 2023 (Brazilian Conference on Intelligent Systems), accepted version published in Intelligent Systems, LNCS, vol 14195

    ACM Class: I.2.4; I.2.6

    Journal ref: Intelligent Systems (2023), LNCS, vol 14195 (pp. 227-242), Springer Nature

  3. arXiv:2401.02909

    cs.CL

    Introducing Bode: A Fine-Tuned Large Language Model for Portuguese Prompt-Based Task

    Authors: Gabriel Lino Garcia, Pedro Henrique Paiola, Luis Henrique Morelli, Giovani Candido, Arnaldo Cândido Júnior, Danilo Samuel Jodas, Luis C. S. Afonso, Ivan Rizzo Guilherme, Bruno Elias Penteado, João Paulo Papa

    Abstract: Large Language Models (LLMs) are increasingly bringing advances to Natural Language Processing. However, low-resource languages, those lacking extensive prominence in datasets for various NLP tasks, or where existing datasets are not as substantial, such as Portuguese, already obtain several benefits from LLMs, but not to the same extent. LLMs trained on multilingual datasets normally struggle to…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: 10 pages, 3 figures

  4. arXiv:2310.16148

    cs.CV cs.AI

    Yin Yang Convolutional Nets: Image Manifold Extraction by the Analysis of Opposites

    Authors: Augusto Seben da Rosa, Frederico Santos de Oliveira, Anderson da Silva Soares, Arnaldo Candido Junior

    Abstract: Computer vision in general has presented several advances, such as training optimizations and new architectures (pure attention, efficient block, vision language models, generative models, among others). These have improved performance in several tasks, such as classification, among others. However, the majority of these models focus on modifications that move away from realistic neuroscientific app…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 12 pages, 5 tables and 6 figures

    ACM Class: I.2.10

  5. arXiv:2306.13116

    cs.LG cs.AI

    A Machine Learning Pressure Emulator for Hydrogen Embrittlement

    Authors: Minh Triet Chau, João Lucas de Sousa Almeida, Elie Alhajjar, Alberto Costa Nogueira Junior

    Abstract: A recent alternative for hydrogen transportation as a mixture with natural gas is blending it into natural gas pipelines. However, hydrogen embrittlement of material is a major concern for scientists and gas installation designers to avoid process failures. In this paper, we propose a physics-informed machine learning model to predict the gas pressure on the pipes' inner wall. Despite its high-fid…

    Submitted 22 June, 2023; originally announced June 2023.

  6. arXiv:2306.10097

    eess.AS cs.AI cs.CL

    CML-TTS A Multilingual Dataset for Speech Synthesis in Low-Resource Languages

    Authors: Frederico S. Oliveira, Edresson Casanova, Arnaldo Cândido Júnior, Anderson S. Soares, Arlindo R. Galvão Filho

    Abstract: In this paper, we present CML-TTS, a recursive acronym for CML-Multi-Lingual-TTS, a new Text-to-Speech (TTS) dataset developed at the Center of Excellence in Artificial Intelligence (CEIA) of the Federal University of Goias (UFG). CML-TTS is based on Multilingual LibriSpeech (MLS) and adapted for training TTS models, consisting of audiobooks in seven languages: Dutch, French, German, Italian, Port…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: 12 pages, 5 figures, Accepted at the 25th International Conference on Text, Speech and Dialogue (TSD 2022)

  7. arXiv:2306.09979

    cs.SD cs.AI eess.AS

    Evaluation of Speech Representations for MOS prediction

    Authors: Frederico S. Oliveira, Edresson Casanova, Arnaldo Cândido Júnior, Lucas R. S. Gris, Anderson S. Soares, Arlindo R. Galvão Filho

    Abstract: In this paper, we evaluate feature extraction models for predicting speech quality. We also propose a model architecture to compare embeddings of supervised learning and self-supervised learning models with embeddings of speaker verification models to predict the metric MOS. Our experiments were performed on the VCC2018 dataset and a Brazilian-Portuguese dataset called BRSpeechMOS, which was creat…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: 12 pages, 4 figures, Accepted to the 26th International Conference of Text, Speech and Dialogue (TSD2023)

  8. arXiv:2305.14580

    cs.CL cs.AI

    Evaluating OpenAI's Whisper ASR for Punctuation Prediction and Topic Modeling of life histories of the Museum of the Person

    Authors: Lucas Rafael Stefanel Gris, Ricardo Marcacini, Arnaldo Candido Junior, Edresson Casanova, Anderson Soares, Sandra Maria Aluísio

    Abstract: Automatic speech recognition (ASR) systems play a key role in applications involving human-machine interactions. Despite their importance, ASR models for the Portuguese language proposed in the last decade have limitations in relation to the correct identification of punctuation marks in automatic transcriptions, which hinder the use of transcriptions by other systems, models, and even by humans.…

    Submitted 26 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

  9. arXiv:2304.04966

    cs.CV

    Computer Vision-Aided Intelligent Monitoring of Coffee: Towards Sustainable Coffee Production

    Authors: Francisco Eron, Muhammad Noman, Raphael Ricon de Oliveira, Deigo de Souza Marques, Rafael Serapilha Durelli, Andre Pimenta Freire, Antonio Chalfun Junior

    Abstract: Coffee, which is prepared from the ground roasted seeds of harvested coffee cherries, is one of the most consumed beverages and traded commodities globally. To manually monitor the coffee field regularly, and inform about plant and soil health, as well as estimate yield and harvesting time, is labor-intensive, time-consuming and error-prone. Some recent studies have developed sensors for estimating…

    Submitted 11 April, 2023; originally announced April 2023.

  10. arXiv:2211.14372

    eess.AS cs.CL cs.LG cs.SD

    Interpretability Analysis of Deep Models for COVID-19 Detection

    Authors: Daniel Peixoto Pinto da Silva, Edresson Casanova, Lucas Rafael Stefanel Gris, Arnaldo Candido Junior, Marcelo Finger, Flaviane Svartman, Beatriz Raposo, Marcus Vinícius Moreira Martins, Sandra Maria Aluísio, Larissa Cristina Berti, João Paulo Teixeira

    Abstract: During the outbreak of the COVID-19 pandemic, several research areas joined efforts to mitigate the damages caused by SARS-CoV-2. In this paper we present an interpretability analysis of a convolutional neural network based model for COVID-19 detection in audio. We investigate which features are important for the model's decision process, investigating spectrograms, F0, F0 standard deviation, sex and age.…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: 14 pages, 4 figures

  11. arXiv:2210.07852

    cs.CL cs.SD eess.AS

    Bringing NURC/SP to Digital Life: the Role of Open-source Automatic Speech Recognition Models

    Authors: Lucas Rafael Stefanel Gris, Arnaldo Candido Junior, Vinícius G. dos Santos, Bruno A. Papa Dias, Marli Quadros Leite, Flaviane Romani Fernandes Svartman, Sandra Aluísio

    Abstract: The NURC Project, which started in 1969 to study the cultured linguistic urban norm spoken in five Brazilian capitals, was responsible for compiling a large corpus for each capital. The digitized NURC/SP comprises 375 inquiries in 334 hours of recordings taken in the city of São Paulo. Although 47 inquiries have transcripts, there was no alignment between audio and transcription, and 328 inquiries were…

    Submitted 14 October, 2022; originally announced October 2022.

  12. arXiv:2206.01604

    cs.LG nlin.CD

    Non-Intrusive Reduced Models based on Operator Inference for Chaotic Systems

    Authors: João Lucas de Sousa Almeida, Arthur Cancellieri Pires, Klaus Feine Vaz Cid, Alberto Costa Nogueira Junior

    Abstract: This work explores the physics-driven machine learning technique Operator Inference (OpInf) for predicting the state of chaotic dynamical systems. OpInf provides a non-intrusive approach to infer approximations of polynomial operators in reduced space without having access to the full order operators appearing in discretized models. Datasets for the physics systems are generated using conventional…

    Submitted 21 September, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

    Comments: 16 pages, 37 figures, accepted for publication in the IEEE-TAI-PIML

  13. arXiv:2204.00618

    eess.AS cs.CL cs.SD

    ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion

    Authors: Edresson Casanova, Christopher Shulby, Alexander Korolev, Arnaldo Candido Junior, Anderson da Silva Soares, Sandra Aluísio, Moacir Antonelli Ponti

    Abstract: We explore cross-lingual multi-speaker speech synthesis and cross-lingual voice conversion applied to data augmentation for automatic speech recognition (ASR) systems in low/medium-resource scenarios. Through extensive experiments, we show that our approach permits the application of speech synthesis and voice conversion to improve ASR systems using only one target-language speaker during model tr…

    Submitted 20 May, 2023; v1 submitted 29 March, 2022; originally announced April 2022.

    Comments: This paper was accepted at INTERSPEECH 2023

  14. arXiv:2112.02418

    cs.SD cs.CL eess.AS

    YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

    Authors: Edresson Casanova, Julian Weber, Christopher Shulby, Arnaldo Candido Junior, Eren Gölge, Moacir Antonelli Ponti

    Abstract: YourTTS brings the power of a multilingual approach to the task of zero-shot multi-speaker TTS. Our method builds upon the VITS model and adds several novel modifications for zero-shot multi-speaker and multilingual training. We achieved state-of-the-art (SOTA) results in zero-shot multi-speaker TTS and results comparable to SOTA in zero-shot voice conversion on the VCTK dataset. Additionally, our…

    Submitted 30 April, 2023; v1 submitted 4 December, 2021; originally announced December 2021.

    Comments: An Erratum was added on the last page of this paper

    Journal ref: Proceedings of the 39th International Conference on Machine Learning, PMLR 162:2709-2720, 2022

  15. arXiv:2110.15731

    cs.CL cs.SD eess.AS

    CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese

    Authors: Arnaldo Candido Junior, Edresson Casanova, Anderson Soares, Frederico Santos de Oliveira, Lucas Oliveira, Ricardo Corso Fernandes Junior, Daniel Peixoto Pinto da Silva, Fernando Gorgulho Fayet, Bruno Baldissera Carlotto, Lucas Rafael Stefanel Gris, Sandra Maria Aluísio

    Abstract: Automatic speech recognition (ASR) is a complex and challenging task. In recent years, there have been significant advances in the area. In particular, for the Brazilian Portuguese (BP) language, there were about 376 hours publicly available for the ASR task until the second half of 2020. With the release of new datasets in early 2021, this number increased to 574 hours. The existing resources, however,…

    Submitted 18 November, 2021; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: This paper is under consideration at Language Resources and Evaluation (LREV)

  16. arXiv:2107.11414

    cs.CL

    Brazilian Portuguese Speech Recognition Using Wav2vec 2.0

    Authors: Lucas Rafael Stefanel Gris, Edresson Casanova, Frederico Santos de Oliveira, Anderson da Silva Soares, Arnaldo Candido Junior

    Abstract: Deep learning techniques have been shown to be efficient in various tasks, especially in the development of speech recognition systems, that is, systems that aim to transcribe an audio sentence into a sequence of written words. Despite the progress in the area, speech recognition can still be considered difficult, especially for languages lacking available data, such as Brazilian Portuguese (BP). In…

    Submitted 22 December, 2021; v1 submitted 23 July, 2021; originally announced July 2021.

  17. arXiv:2104.05557

    eess.AS cs.SD

    SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model

    Authors: Edresson Casanova, Christopher Shulby, Eren Gölge, Nicolas Michael Müller, Frederico Santos de Oliveira, Arnaldo Candido Junior, Anderson da Silva Soares, Sandra Maria Aluisio, Moacir Antonelli Ponti

    Abstract: In this paper, we propose SC-GlowTTS: an efficient zero-shot multi-speaker text-to-speech model that improves similarity for speakers unseen during training. We propose a speaker-conditional architecture that explores a flow-based decoder that works in a zero-shot scenario. As text encoders, we explore a dilated residual convolutional-based encoder, gated convolutional-based encoder, and transform…

    Submitted 15 June, 2021; v1 submitted 2 April, 2021; originally announced April 2021.

    Comments: Accepted at Interspeech 2021

  18. arXiv:2005.05144

    eess.AS cs.CL cs.LG

    TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese

    Authors: Edresson Casanova, Arnaldo Candido Junior, Christopher Shulby, Frederico Santos de Oliveira, João Paulo Teixeira, Moacir Antonelli Ponti, Sandra Maria Aluisio

    Abstract: Speech provides a natural way for human-computer interaction. In particular, speech synthesis systems are popular in different applications, such as personal assistants, GPS applications, screen readers and accessibility tools. However, not all languages are on the same level in terms of resources and systems for speech synthesis. This work consists of creating publicly available resources fo…

    Submitted 29 January, 2022; v1 submitted 11 May, 2020; originally announced May 2020.

  19. arXiv:2002.11213

    cs.CL cs.SD eess.AS

    Speech2Phone: A Novel and Efficient Method for Training Speaker Recognition Models

    Authors: Edresson Casanova, Arnaldo Candido Junior, Christopher Shulby, Frederico Santos de Oliveira, Lucas Rafael Stefanel Gris, Hamilton Pereira da Silva, Sandra Maria Aluisio, Moacir Antonelli Ponti

    Abstract: In this paper we present an efficient method for training models for speaker recognition using small or under-resourced datasets. This method requires less data than other SOTA (State-Of-The-Art) methods, e.g. the Angular Prototypical and GE2E loss functions, while achieving similar results to those methods. This is done using the knowledge of the reconstruction of a phoneme in the speaker's voice…

    Submitted 18 June, 2021; v1 submitted 25 February, 2020; originally announced February 2020.

    Comments: Submitted to BRACIS

  20. arXiv:1911.07673

    cs.DB cs.AI cs.IR

    Using Mapping Languages for Building Legal Knowledge Graphs from XML Files

    Authors: Ademar Crotti Junior, Fabrizio Orlandi, Declan O'Sullivan, Christian Dirschl, Quentin Reul

    Abstract: This paper presents our experience in building RDF knowledge graphs for an industrial use case in the legal domain. The information contained in legal information systems is often accessed through simple keyword interfaces and presented as a simple list of hits. In order to improve search accuracy one may avail of knowledge graphs, where the semantics of the data can be made explicit. Significant…

    Submitted 18 November, 2019; originally announced November 2019.

    Comments: Presented at the 2nd International Contextualized Knowledge Graphs Workshop (CKG'19) at the 18th International Semantic Web Conference (ISWC) 2019