
Showing 1–42 of 42 results for author: Tsai, T

Searching in archive cs.
  1. arXiv:2408.06817  [pdf, other]

    math.NT cs.DM math.CO

    Periodic minimum in the count of binomial coefficients not divisible by a prime

    Authors: Hsien-Kuei Hwang, Svante Janson, Tsung-Hsi Tsai

    Abstract: The summatory function of the number of binomial coefficients not divisible by a prime is known to exhibit regular periodic oscillations, yet identifying the less regularly behaved minimum of the underlying periodic functions has been open for almost all cases. We propose an approach to identify such minimum in some generality, solving particularly a previous conjecture of B. Wilson [Asymptotic be…

    Submitted 13 August, 2024; originally announced August 2024.

    MSC Class: 05A10; 11B65; 11B37; 39A23; 65Q30

  2. arXiv:2403.12024  [pdf, other]

    cs.CL

    Enhancing Taiwanese Hokkien Dual Translation by Exploring and Standardizing of Four Writing Systems

    Authors: Bo-Han Lu, Yi-Hsuan Lin, En-Shiun Annie Lee, Richard Tzong-Han Tsai

    Abstract: Machine translation focuses mainly on high-resource languages (HRLs), while low-resource languages (LRLs) like Taiwanese Hokkien are relatively under-explored. The study aims to address this gap by developing a dual translation model between Taiwanese Hokkien and both Traditional Mandarin Chinese and English. We employ a pre-trained LLaMA 2-7B model specialized in Traditional Mandarin Chinese to l…

    Submitted 14 May, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Accepted by LREC-COLING 2024 as a long oral paper

  3. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1110 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  4. arXiv:2402.01685  [pdf, other]

    cs.CL cs.AI cs.DB

    SMUTF: Schema Matching Using Generative Tags and Hybrid Features

    Authors: Yu Zhang, Mei Di, Haozheng Luo, Chenwei Xu, Richard Tzong-Han Tsai

    Abstract: We introduce SMUTF, a unique approach for large-scale tabular data schema matching (SM), which assumes that supervised learning does not affect performance in open-domain tasks, thereby enabling effective cross-domain matching. This system uniquely combines rule-based feature engineering, pre-trained language models, and generative large language models. In an innovative adaptation inspired by the…

    Submitted 6 February, 2024; v1 submitted 22 January, 2024; originally announced February 2024.

  5. arXiv:2401.16803  [pdf, other]

    cs.SD cs.LG eess.AS

    PBSCR: The Piano Bootleg Score Composer Recognition Dataset

    Authors: Arhan Jain, Alec Bunn, Austin Pham, TJ Tsai

    Abstract: This article motivates, describes, and presents the PBSCR dataset for studying composer recognition of classical piano music. Our goal was to design a dataset that facilitates large-scale research on composer recognition that is suitable for modern architectures and training practices. To achieve this goal, we utilize the abundance of sheet music images and rich metadata on IMSLP, use a previously…

    Submitted 5 August, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: 19 pages, 6 figures, to be published in Transactions of the International Society for Music Information Retrieval

  6. arXiv:2401.15879  [pdf, other]

    cs.LG stat.ML

    lil'HDoC: An Algorithm for Good Arm Identification under Small Threshold Gap

    Authors: Tzu-Hsien Tsai, Yun-Da Tsai, Shou-De Lin

    Abstract: Good arm identification (GAI) is a pure-exploration bandit problem in which a single learner outputs an arm as soon as it is identified as a good arm. A good arm is defined as an arm with an expected reward greater than or equal to a given threshold. This paper focuses on the GAI problem under a small threshold gap, which refers to the distance between the expected rewards of arms and the given th…

    Submitted 12 March, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

  7. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  8. arXiv:2311.05782  [pdf, other]

    cs.DC

    MPGemmFI: A Fault Injection Technique for Mixed Precision GEMM in ML Applications

    Authors: Bo Fang, Xinyi Li, Harvey Dam, Cheng Tan, Siva Kumar Sastry Hari, Timothy Tsai, Ignacio Laguna, Dingwen Tao, Ganesh Gopalakrishnan, Prashant Nair, Kevin Barker, Ang Li

    Abstract: Emerging deep learning workloads urgently need fast general matrix multiplication (GEMM). To meet such demand, one of the critical features of machine-learning-specific accelerators such as NVIDIA Tensor Cores, AMD Matrix Cores, and Google TPUs is the support of mixed-precision enabled GEMM. For DNN models, lower-precision FP data formats and computation offer acceptable correctness but significan…

    Submitted 9 November, 2023; originally announced November 2023.

  9. arXiv:2310.04799  [pdf, other]

    cs.CL

    Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages

    Authors: Shih-Cheng Huang, Pin-Zu Li, Yu-Chi Hsu, Kuang-Ming Chen, Yu Tung Lin, Shih-Kai Hsiao, Richard Tzong-Han Tsai, Hung-yi Lee

    Abstract: Recently, the development of open-source large language models (LLMs) has advanced rapidly. Nevertheless, due to data constraints, the capabilities of most open-source LLMs are primarily focused on English. To address this issue, we introduce the concept of $\textit{chat vector}$ to equip pre-trained language models with instruction following and human value alignment via simple model arithmetic.…

    Submitted 7 June, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

    Comments: ACL 2024 camera-ready version

  10. arXiv:2308.15118  [pdf, other]

    cs.CL

    Large Language Models on the Chessboard: A Study on ChatGPT's Formal Language Comprehension and Complex Reasoning Skills

    Authors: Mu-Tien Kuo, Chih-Chung Hsueh, Richard Tzong-Han Tsai

    Abstract: While large language models have made strides in natural language processing, their proficiency in complex reasoning tasks requiring formal language comprehension, such as chess, remains less investigated. This paper probes the performance of ChatGPT, a sophisticated language model by OpenAI in tackling such complex reasoning tasks, using chess as a case study. Through robust metrics examining bot…

    Submitted 29 August, 2023; originally announced August 2023.

  11. arXiv:2303.07154  [pdf, other]

    cs.LG stat.ML

    Differential Good Arm Identification

    Authors: Yun-Da Tsai, Tzu-Hsien Tsai, Shou-De Lin

    Abstract: This paper targets a variant of the stochastic multi-armed bandit problem called good arm identification (GAI). GAI is a pure-exploration bandit problem with the goal to output as many good arms using as few samples as possible, where a good arm is defined as an arm whose expected reward is greater than a given threshold. In this work, we propose DGAI - a differentiable good arm identification alg…

    Submitted 15 February, 2024; v1 submitted 13 March, 2023; originally announced March 2023.

  12. arXiv:2301.08937  [pdf, other]

    cs.CL cs.AI

    Exploring Methods for Building Dialects-Mandarin Code-Mixing Corpora: A Case Study in Taiwanese Hokkien

    Authors: Sin-En Lu, Bo-Han Lu, Chao-Yi Lu, Richard Tzong-Han Tsai

    Abstract: In natural language processing (NLP), code-mixing (CM) is a challenging task, especially when the mixed languages include dialects. In Southeast Asian countries such as Singapore, Indonesia, and Malaysia, Hokkien-Mandarin is the most widespread code-mixed language pair among Chinese immigrants, and it is also common in Taiwan. However, dialects such as Hokkien often have a scarcity of resources an…

    Submitted 21 January, 2023; originally announced January 2023.

    Comments: The paper was accepted by EMNLP 2022 findings

  13. arXiv:2210.10968  [pdf, other]

    cs.DS math.CO

    Identities and periodic oscillations of divide-and-conquer recurrences splitting at half

    Authors: Hsien-Kuei Hwang, Svante Janson, Tsung-Hsi Tsai

    Abstract: We study divide-and-conquer recurrences of the form \begin{equation*} f(n) = αf(\lfloor \tfrac n2\rfloor) + βf(\lceil \tfrac n2\rceil) + g(n) \qquad(n\ge2), \end{equation*} with $g(n)$ and $f(1)$ given, where $α,β\ge0$ with $α+β>0$; such recurrences appear often in analysis of computer algorithms, numeration systems, combinatorial sequences, and related areas. We show that the solution sat…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: 69 pages, 13 figures, 13 tables

    MSC Class: 68Q25; 39B12; 11B37; 11B83; 05A15; 05A16; 42A16 ACM Class: F.2.2; G.2.1; G.2.3

  14. arXiv:2206.07860  [pdf, other]

    cs.SD cs.LG eess.AS

    EPG2S: Speech Generation and Speech Enhancement based on Electropalatography and Audio Signals using Multimodal Learning

    Authors: Li-Chin Chen, Po-Hsun Chen, Richard Tzong-Han Tsai, Yu Tsao

    Abstract: Speech generation and enhancement based on articulatory movements facilitate communication when the scope of verbal communication is absent, e.g., in patients who have lost the ability to speak. Although various techniques have been proposed to this end, electropalatography (EPG), which is a monitoring technique that records contact between the tongue and hard palate during speech, has not been ad…

    Submitted 15 June, 2022; originally announced June 2022.

    Comments: Accepted by IEEE Signal Processing Letters

    Journal ref: IEEE Signal Processing Letters, vol. 29, p. 2582-2586, 2022

  15. arXiv:2205.03347  [pdf, other]

    cs.AI cs.RO

    Zhuyi: Perception Processing Rate Estimation for Safety in Autonomous Vehicles

    Authors: Yu-Shun Hsiao, Siva Kumar Sastry Hari, Michał Filipiuk, Timothy Tsai, Michael B. Sullivan, Vijay Janapa Reddi, Vasu Singh, Stephen W. Keckler

    Abstract: The processing requirement of autonomous vehicles (AVs) for high-accuracy perception in complex scenarios can exceed the resources offered by the in-vehicle computer, degrading safety and comfort. This paper proposes a sensor frame processing rate (FPR) estimation model, Zhuyi, that quantifies the minimum safe FPR continuously in a driving scenario. Zhuyi can be employed post-deployment as an onli…

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: 2022 Design Automation Conference (DAC), July 10-14, 2022, San Francisco

  16. arXiv:2203.07474  [pdf, other]

    cs.AR cs.LG

    Distributed On-Sensor Compute System for AR/VR Devices: A Semi-Analytical Simulation Framework for Power Estimation

    Authors: Jorge Gomez, Saavan Patel, Syed Shakib Sarwar, Ziyun Li, Raffaele Capoccia, Zhao Wang, Reid Pinkham, Andrew Berkovich, Tsung-Hsun Tsai, Barbara De Salvo, Chiao Liu

    Abstract: Augmented Reality/Virtual Reality (AR/VR) glasses are widely foreseen as the next generation computing platform. AR/VR glasses are a complex "system of systems" which must satisfy stringent form factor, computing-, power- and thermal- requirements. In this paper, we will show that a novel distributed on-sensor compute architecture, coupled with new semiconductor technologies (such as dense 3D-IC i…

    Submitted 14 March, 2022; originally announced March 2022.

    Comments: 6 pages, 5 figures, TinyML Research Symposium

  17. arXiv:2105.01899  [pdf, other]

    cs.LG cs.CV

    MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering

    Authors: Tsung Wei Tsai, Chongxuan Li, Jun Zhu

    Abstract: We present Mixture of Contrastive Experts (MiCE), a unified probabilistic clustering framework that simultaneously exploits the discriminative representations learned by contrastive learning and the semantic structures captured by a latent mixture model. Motivated by the mixture of experts, MiCE employs a gating function to partition an unlabeled dataset into subsets according to the latent semant…

    Submitted 5 May, 2021; originally announced May 2021.

    Comments: International Conference on Learning Representations (ICLR) 2021

  18. arXiv:2103.07403  [pdf, other]

    cs.RO cs.AI eess.SY

    Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles

    Authors: Zahra Ghodsi, Siva Kumar Sastry Hari, Iuri Frosio, Timothy Tsai, Alejandro Troccoli, Stephen W. Keckler, Siddharth Garg, Anima Anandkumar

    Abstract: Extracting interesting scenarios from real-world data as well as generating failure cases is important for the development and testing of autonomous systems. We propose efficient mechanisms to both characterize and generate testing scenarios using a state-of-the-art driving simulator. For any scenario, our method generates a set of possible driving paths and identifies all the possible safe drivin…

    Submitted 12 March, 2021; originally announced March 2021.

  19. Significant Otter: Understanding the Role of Biosignals in Communication

    Authors: Fannie Liu, Chunjong Park, Yu Jiang Tham, Tsung-Yu Tsai, Laura Dabbish, Geoff Kaufman, Andrés Monroy-Hernández

    Abstract: With the growing ubiquity of wearable devices, sensed physiological responses provide new means to connect with others. While recent research demonstrates the expressive potential for biosignals, the value of sharing these personal data remains unclear. To understand their role in communication, we created Significant Otter, an Apple Watch/iPhone app that enables romantic partners to share and res…

    Submitted 15 April, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: CHI Conference on Human Factors in Computing Systems (CHI '21), May 8--13, 2021, Yokohama, Japan

  20. arXiv:2010.12173  [pdf, other]

    eess.AS cs.SD

    A Cross-Verification Approach for Protecting World Leaders from Fake and Tampered Audio

    Authors: Mengyi Shan, TJ Tsai

    Abstract: This paper tackles the problem of verifying the authenticity of speech recordings from world leaders. Whereas previous work on detecting deep fake or tampered audio focus on scrutinizing an audio recording in isolation, we instead reframe the problem and focus on cross-verifying a questionable recording against trusted references. We present a method for cross-verifying a speech recording against…

    Submitted 23 October, 2020; originally announced October 2020.

    Comments: 5 pages, 4 figures, 1 table

  21. arXiv:2007.14587  [pdf, other]

    cs.CV cs.CL cs.LG

    Composer Style Classification of Piano Sheet Music Images Using Language Model Pretraining

    Authors: TJ Tsai, Kevin Ji

    Abstract: This paper studies composer style classification of piano sheet music images. Previous approaches to the composer classification task have been limited by a scarcity of data. We address this issue in two ways: (1) we recast the problem to be based on raw sheet music images rather than a symbolic music format, and (2) we propose an approach that can be trained on unlabeled data. Our approach first…

    Submitted 29 July, 2020; originally announced July 2020.

    Comments: 8 pages, 7 figures. Accepted paper at the International Society for Music Information Retrieval Conference (ISMIR) 2020

  22. arXiv:2007.14580  [pdf, other]

    cs.MM cs.SD eess.AS eess.IV

    Improved Handling of Repeats and Jumps in Audio-Sheet Image Synchronization

    Authors: Mengyi Shan, TJ Tsai

    Abstract: This paper studies the problem of automatically generating piano score following videos given an audio recording and raw sheet music images. Whereas previous works focus on synthetic sheet music where the data has been cleaned and preprocessed, we instead focus on developing a system that can cope with the messiness of raw, unprocessed sheet music PDFs from IMSLP. We investigate how well existing…

    Submitted 29 July, 2020; originally announced July 2020.

    Comments: 8 pages, 5 figures. Accepted paper at the International Society for Music Information Retrieval Conference (ISMIR) 2020

  23. arXiv:2007.14579  [pdf, other]

    cs.CV cs.IR eess.IV

    Camera-Based Piano Sheet Music Identification

    Authors: Daniel Yang, TJ Tsai

    Abstract: This paper presents a method for large-scale retrieval of piano sheet music images. Our work differs from previous studies on sheet music retrieval in two ways. First, we investigate the problem at a much larger scale than previous studies, using all solo piano sheet music images in the entire IMSLP dataset as a searchable database. Second, we use cell phone images of sheet music as our input quer…

    Submitted 28 July, 2020; originally announced July 2020.

    Comments: 8 pages, 3 figures, 2 tables. Accepted paper at the International Society for Music Information Retrieval Conference (ISMIR) 2020

  24. arXiv:2006.04984  [pdf, other]

    cs.DC cs.LG

    Making Convolutions Resilient via Algorithm-Based Error Detection Techniques

    Authors: Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, Stephen W. Keckler

    Abstract: The ability of Convolutional Neural Networks (CNNs) to accurately process real-time telemetry has boosted their use in safety-critical and high-performance computing systems. As such systems require high levels of resilience to errors, CNNs must execute correctly in the presence of hardware faults. Full duplication provides the needed assurance but incurs a prohibitive 100% overhead. Algorithmic t…

    Submitted 8 June, 2020; originally announced June 2020.

  25. arXiv:2005.01445  [pdf, other]

    cs.DC cs.AR

    Estimating Silent Data Corruption Rates Using a Two-Level Model

    Authors: Siva Kumar Sastry Hari, Paolo Rech, Timothy Tsai, Mark Stephenson, Arslan Zulfiqar, Michael Sullivan, Philip Shirvani, Paul Racunas, Joel Emer, Stephen W. Keckler

    Abstract: High-performance and safety-critical system architects must accurately evaluate the application-level silent data corruption (SDC) rates of processors to soft errors. Such an evaluation requires error propagation all the way from particle strikes on low-level state up to the program output. Existing approaches that rely on low-level simulations with fault injection cannot evaluate full application…

    Submitted 27 April, 2020; originally announced May 2020.

  26. arXiv:2004.13004  [pdf, other]

    cs.CR cs.CV cs.LG cs.RO

    ML-driven Malware that Targets AV Safety

    Authors: Saurabh Jha, Shengkun Cui, Subho S. Banerjee, Timothy Tsai, Zbigniew Kalbarczyk, Ravi Iyer

    Abstract: Ensuring the safety of autonomous vehicles (AVs) is critical for their mass deployment and public adoption. However, security attacks that violate safety constraints and cause accidents are a significant deterrent to achieving public trust in AVs, and that hinders a vendor's ability to deploy AVs. Creating a security hazard that results in a severe safety compromise (for example, an accident) is c…

    Submitted 12 June, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

    Comments: Accepted for DSN 2020

    Journal ref: 2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks

  27. arXiv:2004.11724  [pdf, other]

    cs.MM cs.SD eess.AS eess.IV

    Using Cell Phone Pictures of Sheet Music To Retrieve MIDI Passages

    Authors: TJ Tsai, Daniel Yang, Mengyi Shan, Thitaree Tanprasert, Teerapat Jenrungrot

    Abstract: This article investigates a cross-modal retrieval problem in which a user would like to retrieve a passage of music from a MIDI file by taking a cell phone picture of several lines of sheet music. This problem is challenging for two reasons: it has a significant runtime constraint since it is a user-facing application, and there is very little relevant training data containing cell phone images of…

    Submitted 21 April, 2020; originally announced April 2020.

    Comments: 13 pages, 8 figures, 3 tables. Accepted article in IEEE Transactions on Multimedia. arXiv admin note: text overlap with arXiv:2004.10347

  28. arXiv:2004.10391  [pdf, other]

    eess.AS cs.MM cs.SD eess.IV

    Towards Linking the Lakh and IMSLP Datasets

    Authors: TJ Tsai

    Abstract: This paper investigates the problem of matching a MIDI file against a large database of piano sheet music images. Previous sheet-audio and sheet-MIDI alignment approaches have primarily focused on a 1-to-1 alignment task, which is not a scalable solution for retrieval from large databases. We propose a method for scalable cross-modal retrieval that might be used to link the Lakh MIDI dataset with…

    Submitted 22 April, 2020; originally announced April 2020.

    Comments: 5 pages, 4 figures, 1 table. Accepted paper at the International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2020

  29. arXiv:2004.10347  [pdf, other]

    cs.MM cs.SD eess.AS eess.IV

    MIDI Passage Retrieval Using Cell Phone Pictures of Sheet Music

    Authors: Daniel Yang, Thitaree Tanprasert, Teerapat Jenrungrot, Mengyi Shan, TJ Tsai

    Abstract: This paper investigates a cross-modal retrieval problem in which a user would like to retrieve a passage of music from a MIDI file by taking a cell phone picture of a physical page of sheet music. While audio-sheet music retrieval has been explored by a number of works, this scenario is novel in that the query is a cell phone picture rather than a digital scan. To solve this problem, we introduce…

    Submitted 21 April, 2020; originally announced April 2020.

    Comments: 8 pages, 8 figures, 1 table. Accepted paper at the International Society for Music Information Retrieval Conference (ISMIR) 2019

  30. arXiv:2004.10345  [pdf, other]

    cs.MM cs.SD eess.AS eess.IV

    MIDI-Sheet Music Alignment Using Bootleg Score Synthesis

    Authors: Thitaree Tanprasert, Teerapat Jenrungrot, Meinard Mueller, T. J. Tsai

    Abstract: MIDI-sheet music alignment is the task of finding correspondences between a MIDI representation of a piece and its corresponding sheet music images. Rather than using optical music recognition to bridge the gap between sheet music and MIDI, we explore an alternative approach: projecting the MIDI data into pixel space and performing alignment in the image domain. Our method converts the MIDI data i…

    Submitted 21 April, 2020; originally announced April 2020.

    Comments: 8 pages, 6 figures, 1 table. Accepted paper at the International Society for Music Information Retrieval Conference (ISMIR) 2019

  31. arXiv:2002.09786  [pdf, other]

    cs.LG cs.CV stat.ML

    HarDNN: Feature Map Vulnerability Evaluation in CNNs

    Authors: Abdulrahman Mahmoud, Siva Kumar Sastry Hari, Christopher W. Fletcher, Sarita V. Adve, Charbel Sakr, Naresh Shanbhag, Pavlo Molchanov, Michael B. Sullivan, Timothy Tsai, Stephen W. Keckler

    Abstract: As Convolutional Neural Networks (CNNs) are increasingly being employed in safety-critical applications, it is important that they behave reliably in the face of hardware errors. Transient hardware errors may percolate undesirable state during execution, resulting in software-manifested errors which can adversely affect high-level decision making. This paper presents HarDNN, a software-directed ap…

    Submitted 25 February, 2020; v1 submitted 22 February, 2020; originally announced February 2020.

    Comments: 14 pages, 5 figures, a short version accepted for publication in First Workshop on Secure and Resilient Autonomy (SARA) co-located with MLSys2020

  32. arXiv:1907.01051  [pdf, other]

    cs.LG cs.SE stat.ML

    ML-based Fault Injection for Autonomous Vehicles: A Case for Bayesian Fault Injection

    Authors: Saurabh Jha, Subho S. Banerjee, Timothy Tsai, Siva K. S. Hari, Michael B. Sullivan, Zbigniew T. Kalbarczyk, Stephen W. Keckler, Ravishankar K. Iyer

    Abstract: The safety and resilience of fully autonomous vehicles (AVs) are of significant concern, as exemplified by several headline-making accidents. While AV development today involves verification, validation, and testing, end-to-end assessment of AV systems under accidental faults in realistic driving scenarios has been largely unexplored. This paper presents DriveFI, a machine learning-based fault inj…

    Submitted 1 July, 2019; originally announced July 2019.

    Comments: Accepted at 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks

  33. arXiv:1907.01024  [pdf, other]

    cs.SE

    Kayotee: A Fault Injection-based System to Assess the Safety and Reliability of Autonomous Vehicles to Faults and Errors

    Authors: Saurabh Jha, Timothy Tsai, Siva Hari, Michael Sullivan, Zbigniew Kalbarczyk, Stephen W. Keckler, Ravishankar K. Iyer

    Abstract: Fully autonomous vehicles (AVs), i.e., AVs with autonomy level 5, are expected to dominate road transportation in the near-future and contribute trillions of dollars to the global economy. The general public, government organizations, and manufacturers all have significant concern regarding resiliency and safety standards of the autonomous driving system (ADS) of AVs. In this work, we proposed an…

    Submitted 1 July, 2019; originally announced July 2019.

    Comments: Presented at Automotive Reliability and Testing (ART) 2018 colocated with International Testing Conference

  34. arXiv:1905.13305  [pdf, other]

    cs.CV cs.LG stat.ML

    Countering Noisy Labels By Learning From Auxiliary Clean Labels

    Authors: Tsung Wei Tsai, Chongxuan Li, Jun Zhu

    Abstract: We consider the learning from noisy labels (NL) problem which emerges in many real-world applications. In addition to the widely-studied synthetic noise in the NL literature, we also consider the pseudo labels in semi-supervised learning (Semi-SL) as a special case of NL. For both types of noise, we argue that the generalization performance of existing methods is highly coupled with the quality of…

    Submitted 12 September, 2019; v1 submitted 23 May, 2019; originally announced May 2019.

  35. Revised JNLPBA Corpus: A Revised Version of Biomedical NER Corpus for Relation Extraction Task

    Authors: Ming-Siang Huang, Po-Ting Lai, Richard Tzong-Han Tsai, Wen-Lian Hsu

    Abstract: The advancement of biomedical named entity recognition (BNER) and biomedical relation extraction (BRE) researches promotes the development of text mining in biological domains. As a cornerstone of BRE, robust BNER system is required to identify the mentioned NEs in plain texts for further relation extraction stage. However, the current BNER corpora, which play important roles in these tasks, paid…

    Submitted 29 January, 2019; originally announced January 2019.

    Comments: 17 pages

    Journal ref: Briefings in Bioinformatics, 2020, bbaa054

  36. Textual Analysis for Studying Chinese Historical Documents and Literary Novels

    Authors: Chao-Lin Liu, Guan-Tao Jin, Hongsu Wang, Qing-Feng Liu, Wen-Huei Cheng, Wei-Yun Chiu, Richard Tzong-Han Tsai, Yu-Chun Wang

    Abstract: We analyzed historical and literary documents in Chinese to gain insights into research issues, and overview our studies which utilized four different sources of text materials in this paper. We investigated the history of concepts and transliterated words in China with the Database for the Study of Modern China Thought and Literature, which contains historical documents about China between 1830 a…

    Submitted 11 October, 2015; originally announced October 2015.

    Comments: 11 pages, 7 figures, 2 tables, The Fourth ASE International Conference on Social Informatics

  37. Compressive Hyperspectral Imaging with Side Information

    Authors: Xin Yuan, Tsung-Han Tsai, Ruoyu Zhu, Patrick Llull, David Brady, Lawrence Carin

    Abstract: A blind compressive sensing algorithm is proposed to reconstruct hyperspectral images from spectrally-compressed measurements. The wavelength-dependent data are coded and then superposed, mapping the three-dimensional hyperspectral datacube to a two-dimensional image. The inversion algorithm learns a dictionary {\em in situ} from the measurements via global-local shrinkage priors. By using RGB imag…

    Submitted 22 February, 2015; originally announced February 2015.

    Comments: 20 pages, 21 figures. To appear in the IEEE Journal of Selected Topics Signal Processing

  38. arXiv:1409.4955  [pdf, other]

    math.PR cs.DS

    Probabilistic analysis of the (1+1)-evolutionary algorithm

    Authors: Hsien-Kuei Hwang, Alois Panholzer, Nicolas Rolin, Tsung-Hsi Tsai, Wei-Mei Chen

    Abstract: We give a detailed analysis of the cost used by the (1+1)-evolutionary algorithm. The problem has been approached in the evolutionary algorithm literature under various views, formulation and degree of rigor. Our asymptotic approximations for the mean and the variance represent the strongest of their kind. The approach we develop is also applicable to characterize the limit laws and is based on as…

    Submitted 17 September, 2014; originally announced September 2014.

    Comments: 53 pages with 8 figures and 4 appendices

    MSC Class: 60C05; 68W40 (Primary); 60F06; 65Q30 (Secondary)

  39. arXiv:1111.6224  [pdf, ps, other]

    cs.DS

    Threshold phenomena in k-dominant skylines of random samples

    Authors: Hsien-Kuei Hwang, Tsung-Hsi Tsai, Wei-Mei Chen

    Abstract: Skylines emerged as a useful notion in database queries for selecting representative groups in multivariate data samples for further decision making, multi-objective optimization or data processing, and the $k$-dominant skylines were naturally introduced to resolve the abundance of skylines when the dimensionality grows or when the coordinates are negatively correlated. We prove in this paper that…

    Submitted 26 November, 2011; originally announced November 2011.

    Comments: 38 pages, 4 figures

    MSC Class: 60C05; 68P15; 60F20; 68Q25; 82B26

  40. arXiv:0910.1392  [pdf, other]

    cs.DS cs.CG

    Simple, efficient maxima-finding algorithms for multidimensional samples

    Authors: Wei-Mei Chen, Hsien-Kuei Hwang, Tsung-Hsi Tsai

    Abstract: New algorithms are devised for finding the maxima of multidimensional point samples, one of the very first problems studied in computational geometry. The algorithms are very simple and easily coded and modified for practical needs. The expected complexity of some measures related to the performance of the algorithms is analyzed. We also compare the efficiency of the algorithms with a few major…

    Submitted 7 October, 2009; originally announced October 2009.

  41. arXiv:0805.0883  [pdf]

    cs.OH

    Portable Valve-less Peristaltic Micro-pump Design and Fabrication

    Authors: H. Yang, T. -H. Tsai, C. -C. Hu

    Abstract: This paper is to describe a design and fabrication method for a valve-less peristaltic micro-pump. The valve-less peristaltic micro-pump with three membrane chambers in a serial is actuated by three piezoelectric (PZT) actuators. With the fluidic flow design, liquid in the flow channel is pumped to a constant flow speed ranged from 0.4 to 0.48 mm/s. In term of the maximum flow rate of the micro-…

    Submitted 7 May, 2008; originally announced May 2008.

    Comments: Submitted on behalf of EDA Publishing Association (http://irevues.inist.fr/handle/2042/16838)

    Journal ref: In Symposium on Design, Test, Integration and Packaging of MEMS/MOEMS - DTIP 2008, Nice, France (2008)

  42. arXiv:math/0309285  [pdf, ps, other]

    math.NA astro-ph cs.CE cs.DS cs.IT math.CO

    An Algorithm for Optimal Partitioning of Data on an Interval

    Authors: Brad Jackson, Jeffrey D. Scargle, David Barnes, Sundararajan Arabhi, Alina Alt, Peter Gioumousis, Elyus Gwin, Paungkaew Sangtrakulcharoen, Linda Tan, Tun Tao Tsai

    Abstract: Many signal processing problems can be solved by maximizing the fitness of a segmented model over all possible partitions of the data interval. This letter describes a simple but powerful algorithm that searches the exponentially large space of partitions of $N$ data points in time $O(N^2)$. The algorithm is guaranteed to find the exact global optimum, automatically determines the model order (t…

    Submitted 9 April, 2004; v1 submitted 17 September, 2003; originally announced September 2003.

    Comments: 3 pages, 1 figure, submitted to IEEE Signal Processing Letters, revised version with added references

    MSC Class: 65C60