Showing 1–22 of 22 results for author: Chou, J

Searching in archive cs.
  1. arXiv:2311.17686

    cs.CL cs.AI

    AviationGPT: A Large Language Model for the Aviation Domain

    Authors: Liya Wang, Jason Chou, Xin Zhou, Alex Tien, Diane M Baumgartner

    Abstract: The advent of ChatGPT and GPT-4 has captivated the world with large language models (LLMs), demonstrating exceptional performance in question-answering, summarization, and content generation. The aviation industry is characterized by an abundance of complex, unstructured text data, replete with technical jargon and specialized terminology. Moreover, labeled data for model building are scarce in th…

    Submitted 29 November, 2023; originally announced November 2023.

  2. arXiv:2311.09526

    cs.DC

    Towards Serverless Optimization with In-place Scaling

    Authors: Vincent Hsieh, Jerry Chou

    Abstract: Serverless computing has gained popularity due to its cost efficiency, ease of deployment, and enhanced scalability. However, in serverless environments, servers are initiated only after receiving a request, leading to increased response times. This delay is commonly known as the cold start problem. In this study, we explore the in-place scaling feature released in Kubernetes v1.27 and examine its…

    Submitted 15 November, 2023; originally announced November 2023.

  3. arXiv:2310.08715

    cs.CL cs.AI cs.SD eess.AS

    Toward Joint Language Modeling for Speech Units and Text

    Authors: Ju-Chieh Chou, Chung-Ming Chien, Wei-Ning Hsu, Karen Livescu, Arun Babu, Alexis Conneau, Alexei Baevski, Michael Auli

    Abstract: Speech and text are two major forms of human language. The research community has been focusing on mapping speech to text or vice versa for many years. However, in the field of language modeling, very little effort has been made to model them jointly. In light of this, we explore joint language modeling for speech units and text. Specifically, we compare different speech tokenizers to transform co…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: EMNLP findings 2023

  4. arXiv:2310.05919

    cs.CL eess.AS

    Few-Shot Spoken Language Understanding via Joint Speech-Text Models

    Authors: Chung-Ming Chien, Mingjiamei Zhang, Ju-Chieh Chou, Karen Livescu

    Abstract: Recent work on speech representation models jointly pre-trained with text has demonstrated the potential of improving speech representations by encoding speech and text in a shared space. In this paper, we leverage such shared representations to address the persistent challenge of limited data availability in spoken language understanding tasks. By employing a pre-trained speech-text model, we fin…

    Submitted 9 October, 2023; originally announced October 2023.

  5. arXiv:2309.08030

    eess.AS cs.CL cs.SD

    AV2Wav: Diffusion-Based Re-synthesis from Continuous Self-supervised Features for Audio-Visual Speech Enhancement

    Authors: Ju-Chieh Chou, Chung-Ming Chien, Karen Livescu

    Abstract: Speech enhancement systems are typically trained using pairs of clean and noisy speech. In audio-visual speech enhancement (AVSE), there is not as much ground-truth clean data available; most audio-visual datasets are collected in real-world environments with background noise and reverberation, hampering the development of AVSE. In this work, we introduce AV2Wav, a resynthesis-based audio-visual s…

    Submitted 8 April, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: extended version for the accepted paper at ICASSP 2024

  6. TaleStream: Supporting Story Ideation with Trope Knowledge

    Authors: Jean-Peïc Chou, Alexa F. Siu, Nedim Lipka, Ryan Rossi, Franck Dernoncourt, Maneesh Agrawala

    Abstract: Story ideation is a critical part of the story-writing process. It is challenging to support computationally due to its exploratory and subjective nature. Tropes, which are recurring narrative elements across stories, are essential in stories as they shape the structure of narratives and our understanding of them. In this paper, we propose to use tropes as an intermediate representation of stories…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: 12 pages, 6 figures, 3 tables

    ACM Class: D.2.2; H.1.2; H.5.2

  7. arXiv:2305.09556

    cs.CL

    Adapting Sentence Transformers for the Aviation Domain

    Authors: Liya Wang, Jason Chou, Dave Rouck, Alex Tien, Diane M Baumgartner

    Abstract: Learning effective sentence representations is crucial for many Natural Language Processing (NLP) tasks, including semantic search, semantic textual similarity (STS), and clustering. While multiple transformer models have been developed for sentence embedding learning, these models may not perform optimally when dealing with specialized domains like aviation, which has unique characteristics such…

    Submitted 29 November, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

  8. arXiv:2210.01986

    cs.LG eess.SP q-bio.NC

    MAtt: A Manifold Attention Network for EEG Decoding

    Authors: Yue-Ting Pan, Jing-Lun Chou, Chun-Shu Wei

    Abstract: Recognition of electroencephalographic (EEG) signals highly affect the efficiency of non-invasive brain-computer interfaces (BCIs). While recent advances of deep-learning (DL)-based EEG decoders offer improved performances, the development of geometric learning (GL) has attracted much attention for offering exceptional robustness in decoding noisy EEG data. However, there is a lack of studies on t…

    Submitted 4 October, 2022; originally announced October 2022.

  9. arXiv:2202.05711

    cs.DC

    Global Optimization of Data Pipelines in Heterogeneous Cloud Environments

    Authors: Erica Lin, Luna Xu, Suraj Bramhavar, Marco Montes de Oca, Sean Gorsky, Lingyun Yi, Arianna Groetsema, Jeffrey Chou

    Abstract: Modern production data processing and machine learning pipelines on the cloud are critical components for many cloud-based companies. These pipelines are typically composed of complex workflows represented by directed acyclic graphs (DAGs). Cloud environments are attractive to these workflows due to the wide range of choice with heterogeneous instances and prices that can provide the flexibility f…

    Submitted 11 February, 2022; originally announced February 2022.

    Comments: 13 pages

  10. arXiv:2111.04494

    cs.LG cs.AI

    Multi-Airport Delay Prediction with Transformers

    Authors: Liya Wang, Alex Tien, Jason Chou

    Abstract: Airport performance prediction with a reasonable look-ahead time is a challenging task and has been attempted by various prior research. Traffic, demand, weather, and traffic management actions are all critical inputs to any prediction model. In this paper, a novel approach based on Temporal Fusion Transformer (TFT) was proposed to predict departure and arrival delays simultaneously for multiple a…

    Submitted 4 November, 2021; originally announced November 2021.

  11. arXiv:2107.04734

    cs.CL cs.LG eess.AS

    Layer-wise Analysis of a Self-supervised Speech Representation Model

    Authors: Ankita Pasad, Ju-Chieh Chou, Karen Livescu

    Abstract: Recently proposed self-supervised learning approaches have been successful for pre-training speech representation models. The utility of these learned representations has been observed empirically, but not much has been studied about the type or extent of information encoded in the pre-trained representations themselves. Developing such insights can help understand the capabilities and limits of t…

    Submitted 3 December, 2022; v1 submitted 9 July, 2021; originally announced July 2021.

    Comments: Accepted to ASRU 2021. Code: https://github.com/ankitapasad/layerwise-analysis

  12. arXiv:2106.04624

    eess.AS cs.AI cs.LG cs.SD

    SpeechBrain: A General-Purpose Speech Toolkit

    Authors: Mirco Ravanelli, Titouan Parcollet, Peter Plantinga, Aku Rouhe, Samuele Cornell, Loren Lugosch, Cem Subakan, Nauman Dawalatabad, Abdelwahab Heba, Jianyuan Zhong, Ju-Chieh Chou, Sung-Lin Yeh, Szu-Wei Fu, Chien-Feng Liao, Elena Rastorgueva, François Grondin, William Aris, Hwidong Na, Yan Gao, Renato De Mori, Yoshua Bengio

    Abstract: SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to facilitate the research and development of neural speech processing technologies by being simple, flexible, user-friendly, and well-documented. This paper describes the core architecture designed to support several tasks of common interest, allowing users to naturally conceive, compare and share novel speech processing…

    Submitted 8 June, 2021; originally announced June 2021.

    Comments: Preprint

  13. An Incremental Dimensionality Reduction Method for Visualizing Streaming Multidimensional Data

    Authors: Takanori Fujiwara, Jia-Kai Chou, Shilpika, Panpan Xu, Liu Ren, Kwan-Liu Ma

    Abstract: Dimensionality reduction (DR) methods are commonly used for analyzing and visualizing multidimensional data. However, when data is a live streaming feed, conventional DR methods cannot be directly used because of their computational complexity and inability to preserve the projected data positions at previous time points. In addition, the problem becomes even more challenging when the dynamic data…

    Submitted 15 October, 2019; v1 submitted 10 May, 2019; originally announced May 2019.

    Comments: This is the author's version of the article that has been published in IEEE Transactions on Visualization and Computer Graphics. The final version of this record is available at: 10.1109/TVCG.2019.2934433

    ACM Class: I.3.8

  14. arXiv:1904.10937

    cs.LG stat.ML

    Generated Loss and Augmented Training of MNIST VAE

    Authors: Jason Chou

    Abstract: The variational autoencoder (VAE) framework is a popular option for training unsupervised generative models, featuring ease of training and latent representation of data. The objective function of VAE does not guarantee to achieve the latter, however, and failure to do so leads to a frequent failure mode called posterior collapse. Even in successful cases, VAEs often result in low-precision recons…

    Submitted 24 April, 2019; originally announced April 2019.

    ACM Class: I.2.6

  15. arXiv:1904.10446

    cs.LG stat.ML

    Generated Loss, Augmented Training, and Multiscale VAE

    Authors: Jason Chou, Gautam Hathi

    Abstract: The variational autoencoder (VAE) framework remains a popular option for training unsupervised generative models, especially for discrete data where generative adversarial networks (GANs) require workaround to create gradient for the generator. In our work modeling US postal addresses, we show that our discrete VAE with tree recursive architecture demonstrates limited capability of capturing field…

    Submitted 23 April, 2019; originally announced April 2019.

    ACM Class: I.2.6

  16. arXiv:1904.05742

    cs.LG cs.SD eess.AS stat.ML

    One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization

    Authors: Ju-chieh Chou, Cheng-chieh Yeh, Hung-yi Lee

    Abstract: Recently, voice conversion (VC) without parallel data has been successfully adapted to multi-target scenario in which a single model is trained to convert the input voice to many different speakers. However, such model suffers from the limitation that it can only convert the voice to the speakers in the training data, which narrows down the applicable scenario of VC. In this paper, we proposed a n…

    Submitted 22 August, 2019; v1 submitted 10 April, 2019; originally announced April 2019.

    Comments: Interspeech 2019

  17. arXiv:1904.04990

    cs.LG stat.ML

    Identifying Sub-Phenotypes of Acute Kidney Injury using Structured and Unstructured Electronic Health Record Data with Memory Networks

    Authors: Zhenxing Xu, Jingyuan Chou, Xi Sheryl Zhang, Yuan Luo, Tamara Isakova, Prakash Adekkanattu, Jessica S. Ancker, Guoqian Jiang, Richard C. Kiefer, Jennifer A. Pacheco, Luke V. Rasmussen, Jyotishman Pathak, Fei Wang

    Abstract: Acute Kidney Injury (AKI) is a common clinical syndrome characterized by the rapid loss of kidney excretory function, which aggravates the clinical severity of other diseases in a large number of hospitalized patients. Accurate early prediction of AKI can enable in-time interventions and treatments. However, AKI is highly heterogeneous, thus identification of AKI sub-phenotypes can lead to an impr…

    Submitted 22 December, 2019; v1 submitted 9 April, 2019; originally announced April 2019.

  18. arXiv:1809.06018

    cs.LG stat.ML

    Integrative Analysis of Patient Health Records and Neuroimages via Memory-based Graph Convolutional Network

    Authors: Xi Sheryl Zhang, Jingyuan Chou, Fei Wang

    Abstract: With the arrival of the big data era, more and more data are becoming readily available in various real-world applications and those data are usually highly heterogeneous. Taking computational medicine as an example, we have both Electronic Health Records (EHR) and medical images for each patient. For complicated diseases such as Parkinson's and Alzheimer's, both EHR and neuroimaging information a…

    Submitted 7 May, 2019; v1 submitted 17 September, 2018; originally announced September 2018.

  19. arXiv:1808.03113

    cs.SD eess.AS

    Rhythm-Flexible Voice Conversion without Parallel Data Using Cycle-GAN over Phoneme Posteriorgram Sequences

    Authors: Cheng-chieh Yeh, Po-chun Hsu, Ju-chieh Chou, Hung-yi Lee, Lin-shan Lee

    Abstract: Speaking rate refers to the average number of phonemes within some unit time, while the rhythmic patterns refer to duration distributions for realizations of different phonemes within different phonetic structures. Both are key components of prosody in speech, which is different for different speakers. Models like cycle-consistent adversarial network (Cycle-GAN) and variational auto-encoder (VAE)…

    Submitted 9 August, 2018; originally announced August 2018.

    Comments: 8 pages, 6 figures, Submitted to SLT 2018

  20. arXiv:1804.02812

    eess.AS cs.CL cs.SD

    Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations

    Authors: Ju-chieh Chou, Cheng-chieh Yeh, Hung-yi Lee, Lin-shan Lee

    Abstract: Recently, cycle-consistent adversarial network (Cycle-GAN) has been successfully applied to voice conversion to a different speaker without parallel data, although in those approaches an individual model is needed for each target speaker. In this paper, we propose an adversarial learning framework for voice conversion, with which a single model can be trained to convert the voice to many different…

    Submitted 24 June, 2018; v1 submitted 9 April, 2018; originally announced April 2018.

    Comments: Accepted to Interspeech 2018

  21. arXiv:1703.00797

    physics.med-ph cs.CV

    A Simple, Fast and Fully Automated Approach for Midline Shift Measurement on Brain Computed Tomography

    Authors: Huan-Chih Wang, Shih-Hao Ho, Furen Xiao, Jen-Hai Chou

    Abstract: Brain CT has become a standard imaging tool for emergent evaluation of brain condition, and measurement of midline shift (MLS) is one of the most important features to address for brain CT assessment. We present a simple method to estimate MLS and propose a new alternative parameter to MLS: the ratio of MLS over the maximal width of intracranial region (MLS/ICWMAX). Three neurosurgeons and our aut…

    Submitted 2 March, 2017; originally announced March 2017.

  22. arXiv:0710.4681

    cs.AR

    A Quality-of-Service Mechanism for Interconnection Networks in System-on-Chips

    Authors: Wolf-Dietrich Weber, Joe Chou, Ian Swarbrick, Drew Wingard

    Abstract: As Moore's Law continues to fuel the ability to build ever increasingly complex system-on-chips (SoCs), achieving performance goals is rising as a critical challenge to completing designs. In particular, the system interconnect must efficiently service a diverse set of data flows with widely ranging quality-of-service (QoS) requirements. However, the known solutions for off-chip interconnects su…

    Submitted 25 October, 2007; originally announced October 2007.

    Comments: Submitted on behalf of EDAA (http://www.edaa.com/)

    Journal ref: In Design, Automation and Test in Europe - DATE'05, Munich, Germany (2005)