Showing 1–21 of 21 results for author: Tsai, R

Searching in archive cs.
  1. arXiv:2405.09096  [pdf, other]

    cs.LG cs.RO math.OC

    Optimizing Sensor Network Design for Multiple Coverage

    Authors: Lukas Taus, Yen-Hsi Richard Tsai

    Abstract: Sensor placement optimization methods have been studied extensively. They can be applied to a wide range of applications, including surveillance of known environments, optimal locations for 5G towers, and placement of missile defense systems. However, few works explore the robustness and efficiency of the resulting sensor network concerning sensor failure or adversarial attacks. This paper address…

    Submitted 20 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  2. arXiv:2403.12024  [pdf, other]

    cs.CL

    Enhancing Taiwanese Hokkien Dual Translation by Exploring and Standardizing of Four Writing Systems

    Authors: Bo-Han Lu, Yi-Hsuan Lin, En-Shiun Annie Lee, Richard Tzong-Han Tsai

    Abstract: Machine translation focuses mainly on high-resource languages (HRLs), while low-resource languages (LRLs) like Taiwanese Hokkien are relatively under-explored. The study aims to address this gap by developing a dual translation model between Taiwanese Hokkien and both Traditional Mandarin Chinese and English. We employ a pre-trained LLaMA 2-7B model specialized in Traditional Mandarin Chinese to l…

    Submitted 14 May, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Accepted by LREC-COLING 2024 as a long oral paper

  3. arXiv:2402.03021  [pdf, other]

    cs.LG math.NA

    Data-induced multiscale losses and efficient multirate gradient descent schemes

    Authors: Juncai He, Liangchen Liu, Yen-Hsi Richard Tsai

    Abstract: This paper investigates the impact of multiscale data on machine learning algorithms, particularly in the context of deep learning. A dataset is multiscale if its distribution shows large variations in scale across different directions. This paper reveals multiscale structures in the loss landscape, including its gradients and Hessians inherited from the data. Correspondingly, it introduces a nove…

    Submitted 6 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: 28 pages, 4 figures, submitted under review

    MSC Class: 65F10; 65F45; 68T07 ACM Class: G.1.6; I.2.6

  4. arXiv:2402.02304  [pdf, other]

    math.AP cs.LG

    Efficient Numerical Wave Propagation Enhanced By An End-to-End Deep Learning Model

    Authors: Luis Kaiser, Richard Tsai, Christian Klingenberg

    Abstract: Recent advances in wave modeling use sufficiently accurate fine solver outputs to train a neural network that enhances the accuracy of a fast but inaccurate coarse solver. In this paper we build upon the work of Nguyen and Tsai (2023) and present a novel unified system that integrates a numerical solver with a deep learning component into an end-to-end framework. In the proposed setting, we invest…

    Submitted 13 February, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  5. arXiv:2402.01685  [pdf, other]

    cs.CL cs.AI cs.DB

    SMUTF: Schema Matching Using Generative Tags and Hybrid Features

    Authors: Yu Zhang, Mei Di, Haozheng Luo, Chenwei Xu, Richard Tzong-Han Tsai

    Abstract: We introduce SMUTF, a unique approach for large-scale tabular data schema matching (SM), which assumes that supervised learning does not affect performance in open-domain tasks, thereby enabling effective cross-domain matching. This system uniquely combines rule-based feature engineering, pre-trained language models, and generative large language models. In an innovative adaptation inspired by the…

    Submitted 6 February, 2024; v1 submitted 22 January, 2024; originally announced February 2024.

  6. arXiv:2310.04799  [pdf, other]

    cs.CL

    Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages

    Authors: Shih-Cheng Huang, Pin-Zu Li, Yu-Chi Hsu, Kuang-Ming Chen, Yu Tung Lin, Shih-Kai Hsiao, Richard Tzong-Han Tsai, Hung-yi Lee

    Abstract: Recently, the development of open-source large language models (LLMs) has advanced rapidly. Nevertheless, due to data constraints, the capabilities of most open-source LLMs are primarily focused on English. To address this issue, we introduce the concept of $\textit{chat vector}$ to equip pre-trained language models with instruction following and human value alignment via simple model arithmetic.…

    Submitted 7 June, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

    Comments: ACL 2024 camera-ready version

  7. arXiv:2309.08545  [pdf, other]

    cs.LG cs.RO math.OC

    Efficient and robust Sensor Placement in Complex Environments

    Authors: Lukas Taus, Yen-Hsi Richard Tsai

    Abstract: We address the problem of efficient and unobstructed surveillance or communication in complex environments. On one hand, one wishes to use a minimal number of sensors to cover the environment. On the other hand, it is often important to consider solutions that are robust against sensor failure or adversarial attacks. This paper addresses these challenges of designing minimal sensor sets that achie…

    Submitted 15 September, 2023; originally announced September 2023.

  8. arXiv:2308.15118  [pdf, other]

    cs.CL

    Large Language Models on the Chessboard: A Study on ChatGPT's Formal Language Comprehension and Complex Reasoning Skills

    Authors: Mu-Tien Kuo, Chih-Chung Hsueh, Richard Tzong-Han Tsai

    Abstract: While large language models have made strides in natural language processing, their proficiency in complex reasoning tasks requiring formal language comprehension, such as chess, remains less investigated. This paper probes the performance of ChatGPT, a sophisticated language model by OpenAI, in tackling such complex reasoning tasks, using chess as a case study. Through robust metrics examining bot…

    Submitted 29 August, 2023; originally announced August 2023.

  9. arXiv:2307.04102  [pdf, other]

    stat.ML cs.LG math.OC stat.CO

    A generative flow for conditional sampling via optimal transport

    Authors: Jason Alfonso, Ricardo Baptista, Anupam Bhakta, Noam Gal, Alfin Hou, Isa Lyubimova, Daniel Pocklington, Josef Sajonz, Giulio Trigila, Ryan Tsai

    Abstract: Sampling conditional distributions is a fundamental task for Bayesian inference and density estimation. Generative models, such as normalizing flows and generative adversarial networks, characterize conditional distributions by learning a transport map that pushes forward a simple reference (e.g., a standard Gaussian) to a target distribution. While these approaches successfully describe many non-…

    Submitted 9 July, 2023; originally announced July 2023.

    Comments: 18 pages, 5 figures

  10. arXiv:2307.02478  [pdf, other]

    cs.LG math.DG

    Linear Regression on Manifold Structured Data: the Impact of Extrinsic Geometry on Solutions

    Authors: Liangchen Liu, Juncai He, Richard Tsai

    Abstract: In this paper, we study linear regression applied to data structured on a manifold. We assume that the data manifold is smooth and is embedded in a Euclidean space, and our objective is to reveal the impact of the data manifold's extrinsic geometry on the regression. Specifically, we analyze the impact of the manifold's curvatures (or higher order nonlinearity in the parameterization when the curv…

    Submitted 22 July, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

    Comments: 13 pages, 6 figures, accepted to TAGML23 workshop of ICML2023, to be published in PMLR

    MSC Class: 53Z50 62J05 (Primary) 65D18 68T07 (Secondary) ACM Class: G.1.2; G.4

  11. arXiv:2301.08937  [pdf, other]

    cs.CL cs.AI

    Exploring Methods for Building Dialects-Mandarin Code-Mixing Corpora: A Case Study in Taiwanese Hokkien

    Authors: Sin-En Lu, Bo-Han Lu, Chao-Yi Lu, Richard Tzong-Han Tsai

    Abstract: In natural language processing (NLP), code-mixing (CM) is a challenging task, especially when the mixed languages include dialects. In Southeast Asian countries such as Singapore, Indonesia, and Malaysia, Hokkien-Mandarin is the most widespread code-mixed language pair among Chinese immigrants, and it is also common in Taiwan. However, dialects such as Hokkien often have a scarcity of resources an…

    Submitted 21 January, 2023; originally announced January 2023.

    Comments: The paper was accepted by EMNLP 2022 findings

  12. arXiv:2206.07860  [pdf, other]

    cs.SD cs.LG eess.AS

    EPG2S: Speech Generation and Speech Enhancement based on Electropalatography and Audio Signals using Multimodal Learning

    Authors: Li-Chin Chen, Po-Hsun Chen, Richard Tzong-Han Tsai, Yu Tsao

    Abstract: Speech generation and enhancement based on articulatory movements facilitate communication when the scope of verbal communication is absent, e.g., in patients who have lost the ability to speak. Although various techniques have been proposed to this end, electropalatography (EPG), which is a monitoring technique that records contact between the tongue and hard palate during speech, has not been ad…

    Submitted 15 June, 2022; originally announced June 2022.

    Comments: Accepted by IEEE Signal Processing Letters

    Journal ref: IEEE Signal Processing Letters, vol. 29, p. 2582-2586, 2022

  13. arXiv:2203.00614  [pdf, other]

    cs.LG math.NA stat.ML

    Side Effects of Learning from Low-dimensional Data Embedded in a Euclidean Space

    Authors: Juncai He, Richard Tsai, Rachel Ward

    Abstract: The low-dimensional manifold hypothesis posits that the data found in many applications, such as those involving natural images, lie (approximately) on low-dimensional manifolds embedded in a high-dimensional Euclidean space. In this setting, a typical neural network defines a function that takes a finite number of vectors in the embedding space as input. However, one often needs to consider evalu…

    Submitted 4 February, 2023; v1 submitted 1 March, 2022; originally announced March 2022.

    Comments: 53 pages (11 pages for Appendix), 24 figures

  14. arXiv:2010.09001  [pdf, other]

    cs.AI math.OC

    Visibility Optimization for Surveillance-Evasion Games

    Authors: Louis Ly, Yen-Hsi Richard Tsai

    Abstract: We consider surveillance-evasion differential games, where a pursuer must try to constantly maintain visibility of a moving evader. The pursuer loses as soon as the evader becomes occluded. Optimal controls for the game can be formulated as a Hamilton-Jacobi-Isaacs equation. We use an upwind scheme to compute the feedback value function, corresponding to the end-game time of the differential game. Alth…

    Submitted 26 March, 2022; v1 submitted 18 October, 2020; originally announced October 2020.

  15. arXiv:1911.10737  [pdf, other]

    cs.CV cs.LG

    Nearest Neighbor Sampling of Point Sets using Rays

    Authors: Liangchen Liu, Louis Ly, Colin Macdonald, Yen-Hsi Richard Tsai

    Abstract: We propose a new framework for the sampling, compression, and analysis of distributions of point sets and other geometric objects embedded in Euclidean spaces. Our approach involves constructing a tensor called the RaySense sketch, which captures nearest neighbors from the underlying geometry of points along a set of rays. We explore various operations that can be performed on the RaySense sketch,…

    Submitted 13 September, 2023; v1 submitted 25 November, 2019; originally announced November 2019.

    Comments: 48 pages, 14 figures, accepted to Communications on Applied Mathematics and Computation (CAMC), Focused Issue in Honor of Prof. Stanley Osher on the Occasion of His 80th Birthday. Fixed typos and improved notation

    MSC Class: 68T09; 65D19 (Primary) 68T07; 65D40 (Secondary) ACM Class: G.0; G.2.3; G.3; I.2.10; I.4.7

  16. arXiv:1911.07394  [pdf, other]

    cs.RO

    Strategy Synthesis for Surveillance-Evasion Games with Learning-Enabled Visibility Optimization

    Authors: Suda Bharadwaj, Louis Ly, Bo Wu, Richard Tsai, Ufuk Topcu

    Abstract: This paper studies a two-player game with a quantitative surveillance requirement on an adversarial target moving in a discrete state space and a secondary objective to maximize short-term visibility of the environment. We impose the surveillance requirement as a temporal logic constraint. We then use a greedy approach to determine vantage points that optimize a notion of information gain, namely,…

    Submitted 17 November, 2019; originally announced November 2019.

  17. Revised JNLPBA Corpus: A Revised Version of Biomedical NER Corpus for Relation Extraction Task

    Authors: Ming-Siang Huang, Po-Ting Lai, Richard Tzong-Han Tsai, Wen-Lian Hsu

    Abstract: The advancement of biomedical named entity recognition (BNER) and biomedical relation extraction (BRE) research promotes the development of text mining in biological domains. As a cornerstone of BRE, a robust BNER system is required to identify the mentioned NEs in plain texts for the subsequent relation extraction stage. However, the current BNER corpora, which play important roles in these tasks, paid…

    Submitted 29 January, 2019; originally announced January 2019.

    Comments: 17 pages

    Journal ref: Briefings in Bioinformatics, 2020, bbaa054

  18. arXiv:1809.06025  [pdf, other]

    cs.LG math.OC stat.ML

    Greedy Algorithms for Sparse Sensor Placement via Deep Learning

    Authors: Louis Ly, Yen-Hsi Richard Tsai

    Abstract: We consider the exploration problem: an agent equipped with a depth sensor must map out a previously unknown environment using as few sensor measurements as possible. We propose an approach based on supervised learning of a greedy algorithm. We provide a bound on the optimality of the greedy algorithm using submodularity theory. Using a level set representation, we train a convolutional neural net…

    Submitted 26 March, 2022; v1 submitted 17 September, 2018; originally announced September 2018.

  19. Textual Analysis for Studying Chinese Historical Documents and Literary Novels

    Authors: Chao-Lin Liu, Guan-Tao Jin, Hongsu Wang, Qing-Feng Liu, Wen-Huei Cheng, Wei-Yun Chiu, Richard Tzong-Han Tsai, Yu-Chun Wang

    Abstract: We analyzed historical and literary documents in Chinese to gain insights into research issues, and overview our studies which utilized four different sources of text materials in this paper. We investigated the history of concepts and transliterated words in China with the Database for the Study of Modern China Thought and Literature, which contains historical documents about China between 1830 a…

    Submitted 11 October, 2015; originally announced October 2015.

    Comments: 11 pages, 7 figures, 2 tables, The Fourth ASE International Conference on Social Informatics

  20. arXiv:1504.06206  [pdf, other]

    cs.CV

    An Elastic Image Registration Approach for Wireless Capsule Endoscope Localization

    Authors: Isabel N. Figueiredo, Carlos Leal, Luís Pinto, Pedro N. Figueiredo, Richard Tsai

    Abstract: Wireless Capsule Endoscope (WCE) is an innovative imaging device that permits physicians to examine all the areas of the Gastrointestinal (GI) tract. It is especially important for the small intestine, where traditional invasive endoscopies cannot reach. Although WCE represents an extremely important advance in medical imaging, a major drawback that remains unsolved is the WCE precise location in…

    Submitted 23 April, 2015; originally announced April 2015.

  21. Automated polyp detection in colon capsule endoscopy

    Authors: Alexander V. Mamonov, Isabel N. Figueiredo, Pedro N. Figueiredo, Yen-Hsi Richard Tsai

    Abstract: Colorectal polyps are important precursors to colon cancer, a major health problem. Colon capsule endoscopy (CCE) is a safe and minimally invasive examination procedure, in which the images of the intestine are obtained via digital cameras on board a small capsule ingested by a patient. The video sequence is then analyzed for the presence of polyps. We propose an algorithm that relieves the lab…

    Submitted 27 March, 2014; v1 submitted 8 May, 2013; originally announced May 2013.

    Comments: 16 pages, 9 figures, 4 tables

    ACM Class: I.4.8

    Journal ref: IEEE Transactions on Medical Imaging 33(7):1488-1502, 2014