
Showing 1–50 of 64 results for author: Chiang, C

Searching in archive cs.

  1. arXiv:2407.05216  [pdf, other]

    cs.CL

    Large Language Model as an Assignment Evaluator: Insights, Feedback, and Challenges in a 1000+ Student Course

    Authors: Cheng-Han Chiang, Wei-Chih Chen, Chun-Yi Kuan, Chienchou Yang, Hung-yi Lee

    Abstract: Using large language models (LLMs) for automatic evaluation has become an important evaluation method in NLP research. However, it is unclear whether these LLM-based evaluators can be applied in real-world classrooms to assess student assignments. This empirical report shares how we use GPT-4 as an automatic assignment evaluator in a university course with 1,028 students. Based on student response…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: An empirical report of our course: Introduction to Generative AI 2024 Spring (https://speech.ee.ntu.edu.tw/~hylee/genai/2024-spring.php)

  2. arXiv:2406.19538  [pdf, other]

    cs.CL

    Context Matters: An Empirical Study of the Impact of Contextual Information in Temporal Question Answering Systems

    Authors: Dan Schumacher, Fatemeh Haji, Tara Grey, Niharika Bandlamudi, Nupoor Karnik, Gagana Uday Kumar, Jason Cho-Yu Chiang, Paul Rad, Nishant Vishwamitra, Anthony Rios

    Abstract: Large language models (LLMs) often struggle with temporal reasoning, crucial for tasks like historical event analysis and time-sensitive information retrieval. Despite advancements, state-of-the-art models falter in handling temporal information, especially when faced with irrelevant or noisy contexts. This paper addresses this gap by empirically examining the robustness of temporal question-answe…

    Submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2404.10528  [pdf, other]

    cs.MM

    AllTheDocks road safety dataset: A cyclist's perspective and experience

    Authors: Chia-Yen Chiang, Ruikang Zhong, Jennifer Ding, Joseph Wood, Stephen Bee, Mona Jaber

    Abstract: Active travel is an essential component in intelligent transportation systems. Cycling, as a form of active travel, shares the road space with motorised traffic which often affects the cyclists' safety and comfort and therefore peoples' propensity to uptake cycling instead of driving. This paper presents a unique dataset, collected by cyclists across London, that includes video footage, accelerome…

    Submitted 16 April, 2024; originally announced April 2024.

  4. arXiv:2403.17847  [pdf, other]

    cs.LG cs.AI

    Climate Downscaling: A Deep-Learning Based Super-resolution Model of Precipitation Data with Attention Block and Skip Connections

    Authors: Chia-Hao Chiang, Zheng-Han Huang, Liwen Liu, Hsin-Chien Liang, Yi-Chi Wang, Wan-Ling Tseng, Chao Wang, Che-Ta Chen, Ko-Chih Wang

    Abstract: Human activities accelerate consumption of fossil fuels and produce greenhouse gases, resulting in urgent issues today: global warming and the climate change. These indirectly cause severe natural disasters, plenty of lives suffering and huge losses of agricultural properties. To mitigate impacts on our lands, scientists are developing renewable, reusable, and clean energies and climatologists are…

    Submitted 26 March, 2024; originally announced March 2024.

  5. arXiv:2402.12786  [pdf, other]

    cs.CL eess.AS

    Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations

    Authors: Guan-Ting Lin, Cheng-Han Chiang, Hung-yi Lee

    Abstract: In spoken dialogue, even if two current turns are the same sentence, their responses might still differ when they are spoken in different styles. The spoken styles, containing paralinguistic and prosodic information, mark the most significant difference between text and speech modality. When using text-only LLMs to model spoken dialogue, text-only LLMs cannot give different responses based on the…

    Submitted 30 May, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL 2024

  6. arXiv:2402.05629  [pdf, other]

    cs.CL

    Merging Facts, Crafting Fallacies: Evaluating the Contradictory Nature of Aggregated Factual Claims in Long-Form Generations

    Authors: Cheng-Han Chiang, Hung-yi Lee

    Abstract: Long-form generations from large language models (LLMs) contain a mix of factual and non-factual claims, making evaluating factuality difficult. Prior works evaluate the factuality of a long paragraph by decomposing it into multiple facts, verifying those facts independently, and aggregating the results. Such methods assume that combining factual claims forms a factual paragraph. The above assumpt…

    Submitted 6 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: ACL 2024 Findings

  7. arXiv:2402.03988  [pdf, other]

    eess.AS cs.CL cs.SD

    REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR

    Authors: Liang-Hsuan Tseng, En-Pei Hu, Cheng-Han Chiang, Yuan Tseng, Hung-yi Lee, Lin-shan Lee, Shao-Hua Sun

    Abstract: Unsupervised automatic speech recognition (ASR) aims to learn the mapping between the speech signal and its corresponding textual transcription without the supervision of paired speech-text data. A word/phoneme in the speech signal is represented by a segment of speech signal with variable length and unknown boundary, and this segmental structure makes learning the mapping between speech and text…

    Submitted 28 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  8. arXiv:2402.01057  [pdf, other]

    cs.LG

    Expert Proximity as Surrogate Rewards for Single Demonstration Imitation Learning

    Authors: Chia-Cheng Chiang, Li-Cheng Lan, Wei-Fang Sun, Chien Feng, Cho-Jui Hsieh, Chun-Yi Lee

    Abstract: In this paper, we focus on single-demonstration imitation learning (IL), a practical approach for real-world applications where acquiring multiple expert demonstrations is costly or infeasible and the ground truth reward function is not available. In contrast to typical IL settings with multiple demonstrations, single-demonstration IL involves an agent having access to only one expert trajectory.…

    Submitted 7 July, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Published at ICML 2024. Code: https://github.com/stanl1y/tdil

  9. arXiv:2401.11467  [pdf, other]

    cs.CL

    Over-Reasoning and Redundant Calculation of Large Language Models

    Authors: Cheng-Han Chiang, Hung-yi Lee

    Abstract: Large language models (LLMs) can solve problems step-by-step. While this chain-of-thought (CoT) reasoning boosts LLMs' performance, it is unclear if LLMs \textit{know} when to use CoT and whether those CoT are always necessary to answer the question. This paper shows that LLMs tend to generate redundant calculations and reasoning on a manually constructed math QA dataset, GSM8K-Zero. GSM8K-Zero is…

    Submitted 20 March, 2024; v1 submitted 21 January, 2024; originally announced January 2024.

    Comments: EACL 2024 main conference paper. Camera-ready version

  10. arXiv:2312.06152  [pdf, other]

    hep-ph cs.LG hep-ex

    Improving the performance of weak supervision searches using transfer and meta-learning

    Authors: Hugues Beauchesne, Zong-En Chen, Cheng-Wei Chiang

    Abstract: Weak supervision searches have in principle the advantages of both being able to train on experimental data and being able to learn distinctive signal properties. However, the practical applicability of such searches is limited by the fact that successfully training a neural network via weak supervision can require a large amount of signal. In this work, we seek to create neural networks that can…

    Submitted 1 March, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: 20 pages, 7 figures, matches the published version

  11. arXiv:2311.10798  [pdf, other]

    cs.LG cs.AI cs.CV eess.IV

    INSPECT: A Multimodal Dataset for Pulmonary Embolism Diagnosis and Prognosis

    Authors: Shih-Cheng Huang, Zepeng Huo, Ethan Steinberg, Chia-Chun Chiang, Matthew P. Lungren, Curtis P. Langlotz, Serena Yeung, Nigam H. Shah, Jason A. Fries

    Abstract: Synthesizing information from multiple data sources plays a crucial role in the practice of modern medicine. Current applications of artificial intelligence in medicine often focus on single-modality data due to a lack of publicly available, multimodal medical datasets. To address this limitation, we introduce INSPECT, which contains de-identified longitudinal records from a large cohort of patien…

    Submitted 17 November, 2023; originally announced November 2023.

  12. arXiv:2310.16146  [pdf, other]

    cs.IR cs.AI cs.CL

    Clinfo.ai: An Open-Source Retrieval-Augmented Large Language Model System for Answering Medical Questions using Scientific Literature

    Authors: Alejandro Lozano, Scott L Fleming, Chia-Chun Chiang, Nigam Shah

    Abstract: The quickly-expanding nature of published medical literature makes it challenging for clinicians and researchers to keep up with and summarize recent, relevant findings in a timely manner. While several closed-source summarization tools based on large language models (LLMs) now exist, rigorous and systematic evaluations of their outputs are lacking. Furthermore, there is a paucity of high-quality…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: Preprint of an article published in Pacific Symposium on Biocomputing copyright 2024 World Scientific Publishing Co., Singapore, http://psb.stanford.edu/

  13. arXiv:2310.15211  [pdf, other]

    q-bio.QM cs.AI cs.LG q-bio.MN

    Modeling Path Importance for Effective Alzheimer's Disease Drug Repurposing

    Authors: Shunian Xiang, Patrick J. Lawrence, Bo Peng, ChienWei Chiang, Dokyoon Kim, Li Shen, Xia Ning

    Abstract: Recently, drug repurposing has emerged as an effective and resource-efficient paradigm for AD drug discovery. Among various methods for drug repurposing, network-based methods have shown promising results as they are capable of leveraging complex networks that integrate multiple interaction types, such as protein-protein interactions, to more effectively identify candidate drugs. However, existing…

    Submitted 27 October, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: 16 pages, 3 figures, 2 tables, 1 supplementary figure, 5 supplementary tables, Preprint of an article accepted for publication in Pacific Symposium on Biocomputing ©2023 World Scientific Publishing Co., Singapore, http://psb.stanford.edu/

  14. arXiv:2310.05657  [pdf, other]

    cs.CL

    A Closer Look into Automatic Evaluation Using Large Language Models

    Authors: Cheng-Han Chiang, Hung-yi Lee

    Abstract: Using large language models (LLMs) to evaluate text quality has recently gained popularity. Some prior works explore the idea of using LLMs for evaluation, while they differ in some details of the evaluation process. In this paper, we analyze LLM evaluation (Chiang and Lee, 2023) and G-Eval (Liu et al., 2023), and we discuss how those details in the evaluation process change how well the ratings g…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 findings (short paper). Code: https://github.com/d223302/A-Closer-Look-To-LLM-Evaluation/

  15. arXiv:2309.14774  [pdf, other]

    cs.LG cs.CL cs.CV cs.HC

    BLIP-Adapter: Parameter-Efficient Transfer Learning for Mobile Screenshot Captioning

    Authors: Ching-Yu Chiang, I-Hua Chang, Shih-Wei Liao

    Abstract: This study aims to explore efficient tuning methods for the screenshot captioning task. Recently, image captioning has seen significant advancements, but research in captioning tasks for mobile screens remains relatively scarce. Current datasets and use cases describing user behaviors within product screenshots are notably limited. Consequently, we sought to fine-tune pre-existing models for the s…

    Submitted 26 September, 2023; originally announced September 2023.

  16. arXiv:2309.08216  [pdf, other]

    cs.LG

    Unified Risk Analysis for Weakly Supervised Learning

    Authors: Chao-Kai Chiang, Masashi Sugiyama

    Abstract: Among the flourishing research of weakly supervised learning (WSL), we recognize the lack of a unified interpretation of the mechanism behind the weakly supervised scenarios, let alone a systematic treatment of the risk rewrite problem, a crucial step in the empirical risk minimization approach. In this paper, we introduce a framework providing a comprehensive understanding and a unified methodolo…

    Submitted 15 September, 2023; originally announced September 2023.

  17. arXiv:2308.14763  [pdf, other]

    eess.AS cs.CL cs.SD

    VoiceBank-2023: A Multi-Speaker Mandarin Speech Corpus for Constructing Personalized TTS Systems for the Speech Impaired

    Authors: Jia-Jyu Su, Pang-Chen Liao, Yen-Ting Lin, Wu-Hao Li, Guan-Ting Liou, Cheng-Che Kao, Wei-Cheng Chen, Jen-Chieh Chiang, Wen-Yang Chang, Pin-Han Lin, Chen-Yu Chiang

    Abstract: Services of personalized TTS systems for the Mandarin-speaking speech impaired are rarely mentioned. Taiwan started the VoiceBanking project in 2020, aiming to build a complete set of services to deliver personalized Mandarin TTS systems to amyotrophic lateral sclerosis patients. This paper reports the corpus design, corpus recording, data purging and correction for the corpus, and evaluations of…

    Submitted 27 August, 2023; originally announced August 2023.

    Comments: submitted to 26th International Conference of the ORIENTAL-COCOSDA

  18. arXiv:2308.14089  [pdf, other]

    cs.CL cs.AI cs.LG

    MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records

    Authors: Scott L. Fleming, Alejandro Lozano, William J. Haberkorn, Jenelle A. Jindal, Eduardo P. Reis, Rahul Thapa, Louis Blankemeier, Julian Z. Genkins, Ethan Steinberg, Ashwin Nayak, Birju S. Patel, Chia-Chun Chiang, Alison Callahan, Zepeng Huo, Sergios Gatidis, Scott J. Adams, Oluseyi Fayanju, Shreya J. Shah, Thomas Savage, Ethan Goh, Akshay S. Chaudhari, Nima Aghaeepour, Christopher Sharp, Michael A. Pfeffer, Percy Liang, et al. (5 additional authors not shown)

    Abstract: The ability of large language models (LLMs) to follow natural language instructions with human-level fluency suggests many opportunities in healthcare to reduce administrative burden and improve quality of care. However, evaluating LLMs on realistic text generation tasks for healthcare remains challenging. Existing question answering datasets for electronic health record (EHR) data fail to capture…

    Submitted 24 December, 2023; v1 submitted 27 August, 2023; originally announced August 2023.

  19. arXiv:2308.13229  [pdf, other]

    cs.CV

    ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking

    Authors: Cheng-Che Cheng, Min-Xuan Qiu, Chen-Kuo Chiang, Shang-Hong Lai

    Abstract: Multi-Camera Multi-Object Tracking (MC-MOT) utilizes information from multiple views to better handle problems with occlusion and crowded scenes. Recently, the use of graph-based approaches to solve tracking problems has become very popular. However, many current graph-based methods do not effectively utilize information regarding spatial and temporal consistency. Instead, they rely on single-came…

    Submitted 25 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV2023

  20. arXiv:2306.05083  [pdf, other]

    cs.CL

    Revealing the Blind Spot of Sentence Encoder Evaluation by HEROS

    Authors: Cheng-Han Chiang, Yung-Sung Chuang, James Glass, Hung-yi Lee

    Abstract: Existing sentence textual similarity benchmark datasets only use a single number to summarize how similar the sentence encoder's decision is to humans'. However, it is unclear what kind of sentence pairs a sentence encoder (SE) would consider similar. Moreover, existing SE benchmarks mainly consider sentence pairs with low lexical overlap, so it is unclear how the SEs behave when two sentences hav…

    Submitted 13 June, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: ACL 2023 repl4nlp (representation learning for NLP) workshop poster paper. Dataset at https://huggingface.co/datasets/dcml0714/Heros

  21. arXiv:2305.01937  [pdf, other]

    cs.CL cs.HC

    Can Large Language Models Be an Alternative to Human Evaluations?

    Authors: Cheng-Han Chiang, Hung-yi Lee

    Abstract: Human evaluation is indispensable and inevitable for assessing the quality of texts generated by machine learning models or written by humans. However, human evaluation is very difficult to reproduce and its quality is notoriously unstable, hindering fair comparisons among different natural language processing (NLP) models and algorithms. Recently, large language models (LLMs) have demonstrated ex…

    Submitted 3 May, 2023; originally announced May 2023.

    Comments: ACL 2023 main conference paper. Main content: 10 pages (including limitations). Appendix: 13 pages

  22. arXiv:2302.14407  [pdf, other]

    cs.LG math.ST stat.ML

    The Choice of Noninformative Priors for Thompson Sampling in Multiparameter Bandit Models

    Authors: Jongyeong Lee, Chao-Kai Chiang, Masashi Sugiyama

    Abstract: Thompson sampling (TS) has been known for its outstanding empirical performance supported by theoretical guarantees across various reward models in the classical stochastic multi-armed bandit problems. Nonetheless, its optimality is often restricted to specific priors due to the common observation that TS is fairly insensitive to the choice of the prior when it comes to asymptotic regret bounds. H…

    Submitted 12 December, 2023; v1 submitted 28 February, 2023; originally announced February 2023.

    Comments: 55 pages, TBA AAAI2024

  23. arXiv:2302.01544  [pdf, other]

    cs.LG math.ST stat.ML

    Optimality of Thompson Sampling with Noninformative Priors for Pareto Bandits

    Authors: Jongyeong Lee, Junya Honda, Chao-Kai Chiang, Masashi Sugiyama

    Abstract: In the stochastic multi-armed bandit problem, a randomized probability matching policy called Thompson sampling (TS) has shown excellent performance in various reward models. In addition to the empirical performance, TS has been shown to achieve asymptotic problem-dependent lower bounds in several models. However, its optimality has been mainly addressed under light-tailed or one-parameter models…

    Submitted 2 February, 2023; originally announced February 2023.

    Comments: 49 pages, a preprint

  24. arXiv:2212.12454  [pdf]

    cs.CL

    Generalizable Natural Language Processing Framework for Migraine Reporting from Social Media

    Authors: Yuting Guo, Swati Rajwal, Sahithi Lakamana, Chia-Chun Chiang, Paul C. Menell, Adnan H. Shahid, Yi-Chieh Chen, Nikita Chhabra, Wan-Ju Chao, Chieh-Ju Chao, Todd J. Schwedt, Imon Banerjee, Abeed Sarker

    Abstract: Migraine is a high-prevalence and disabling neurological disorder. However, information migraine management in real-world settings could be limited to traditional health information sources. In this paper, we (i) verify that there is substantial migraine-related chatter available on social media (Twitter and Reddit), self-reported by migraine sufferers; (ii) develop a platform-independent text cla…

    Submitted 23 December, 2022; originally announced December 2022.

    Comments: Accepted by AMIA 2023 Informatics Summit

  25. arXiv:2211.06770  [pdf, other]

    cs.CV cs.LG eess.IV

    MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning

    Authors: Andrey Ignatov, Anastasia Sycheva, Radu Timofte, Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen, Chia-Ming Cheng, Luc Van Gool

    Abstract: While neural networks-based photo processing solutions can provide a better image quality compared to the traditional ISP systems, their application to mobile devices is still very limited due to their very high computational complexity. In this paper, we present a novel MicroISP model designed specifically for edge devices, taking into account their computational and memory limitations. The propo…

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: text overlap with arXiv:2211.06263

  26. arXiv:2211.06263  [pdf, other]

    cs.CV cs.LG eess.IV

    PyNet-V2 Mobile: Efficient On-Device Photo Processing With Neural Networks

    Authors: Andrey Ignatov, Grigory Malivenko, Radu Timofte, Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen, Chia-Ming Cheng, Luc Van Gool

    Abstract: The increased importance of mobile photography created a need for fast and performant RAW image processing pipelines capable of producing good visual results in spite of the mobile camera sensor limitations. While deep learning-based approaches can efficiently solve this problem, their computational requirements usually remain too large for high-resolution on-device image processing. To address th…

    Submitted 8 November, 2022; originally announced November 2022.

  27. arXiv:2211.05256  [pdf, other]

    eess.IV cs.CV

    Power Efficient Video Super-Resolution on Mobile NPUs with Deep Learning, Mobile AI & AIM 2022 challenge: Report

    Authors: Andrey Ignatov, Radu Timofte, Cheng-Ming Chiang, Hsien-Kai Kuo, Yu-Syuan Xu, Man-Yu Lee, Allen Lu, Chia-Ming Cheng, Chih-Cheng Chen, Jia-Ying Yong, Hong-Han Shuai, Wen-Huang Cheng, Zhuang Jia, Tianyu Xu, Yijian Zhang, Long Bao, Heng Sun, Diankai Zhang, Si Gao, Shaoli Liu, Biao Wu, Xiaofeng Zhang, Chengjian Zheng, Kaidi Lu, Ning Wang, et al. (29 additional authors not shown)

    Abstract: Video super-resolution is one of the most popular tasks on mobile devices, being widely used for an automatic improvement of low-bitrate and low-resolution video streams. While numerous solutions have been proposed for this problem, they are usually quite computationally demanding, demonstrating low FPS rates and power efficiency on mobile devices. In this Mobile AI challenge, we address this prob…

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: text overlap with arXiv:2105.08826, arXiv:2105.07809, arXiv:2211.04470, arXiv:2211.03885

  28. arXiv:2210.06664  [pdf]

    eess.IV cs.AI cs.CV

    Are Macula or Optic Nerve Head Structures better at Diagnosing Glaucoma? An Answer using AI and Wide-Field Optical Coherence Tomography

    Authors: Charis Y. N. Chiang, Fabian Braeu, Thanadet Chuangsuwanich, Royston K. Y. Tan, Jacqueline Chua, Leopold Schmetterer, Alexandre Thiery, Martin Buist, Michaël J. A. Girard

    Abstract: Purpose: (1) To develop a deep learning algorithm to automatically segment structures of the optic nerve head (ONH) and macula in 3D wide-field optical coherence tomography (OCT) scans; (2) To assess whether 3D macula or ONH structures (or the combination of both) provide the best diagnostic power for glaucoma. Methods: A cross-sectional comparative study was performed which included wide-field sw…

    Submitted 12 October, 2022; originally announced October 2022.

    Comments: 23 pages, 5 figures

  29. arXiv:2210.02844  [pdf, other]

    cs.CL

    Are Synonym Substitution Attacks Really Synonym Substitution Attacks?

    Authors: Cheng-Han Chiang, Hung-yi Lee

    Abstract: In this paper, we explore the following question: Are synonym substitution attacks really synonym substitution attacks (SSAs)? We approach this question by examining how SSAs replace words in the original sentence and show that there are still unresolved obstacles that make current SSAs generate invalid adversarial samples. We reveal that four widely used word substitution methods generate a large…

    Submitted 7 May, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: Findings in ACL 2023. Major revisions compared with previous versions are made to incorporate the reviewers' suggestions. The modifications made are listed in Appendix A

  30. arXiv:2209.05978  [pdf, other]

    cs.LG cs.SD eess.AS

    A Distributed Acoustic Sensor System for Intelligent Transportation using Deep Learning

    Authors: Chia-Yen Chiang, Mona Jaber, Peter Hayward

    Abstract: Intelligent transport systems (ITS) are pivotal in the development of sustainable and green urban living. ITS is data-driven and enabled by the profusion of sensors ranging from pneumatic tubes to smart cameras. This work explores a novel data source based on optical fibre-based distributed acoustic sensors (DAS) for traffic analysis. Detecting the type of vehicle and estimating the occupancy of v…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: 9 pages, 4 figures

  31. arXiv:2206.06886  [pdf, ps, other]

    quant-ph cs.CC

    Space-efficient Quantization Method for Reversible Markov Chains

    Authors: Chen-Fu Chiang, Anirban Chowdhury, Pawel Wocjan

    Abstract: In a seminal paper, Szegedy showed how to construct a quantum walk $W(P)$ for any reversible Markov chain $P$ such that its eigenvector with eigenphase $0$ is a quantum sample of the limiting distribution of the random walk and its eigenphase gap is quadratically larger than the spectral gap of $P$. The standard construction of Szegedy's quantum walk requires an ancilla register of Hilbert-space d…

    Submitted 14 June, 2022; originally announced June 2022.

  32. arXiv:2204.04580  [pdf, other]

    cs.CL

    Re-Examining Human Annotations for Interpretable NLP

    Authors: Cheng-Han Chiang, Hung-yi Lee

    Abstract: Explanation methods in Interpretable NLP often explain the model's decision by extracting evidence (rationale) from the input texts supporting the decision. Benchmark datasets for rationales have been released to evaluate how good the rationale is. The ground truth rationales in these datasets are often human annotations obtained via crowd-sourced websites. Valuable as these datasets are, the deta…

    Submitted 9 April, 2022; originally announced April 2022.

    Comments: Explainable Agency in Artificial Intelligence Workshop, AAAI 2022

  33. arXiv:2204.04458  [pdf, other]

    cs.CL

    Understanding, Detecting, and Separating Out-of-Distribution Samples and Adversarial Samples in Text Classification

    Authors: Cheng-Han Chiang, Hung-yi Lee

    Abstract: In this paper, we study the differences and commonalities between statistically out-of-distribution (OOD) samples and adversarial (Adv) samples, both of which hurting a text classification model's performance. We conduct analyses to compare the two types of anomalies (OOD and Adv samples) with the in-distribution (ID) ones from three aspects: the input features, the hidden representations in each…

    Submitted 9 April, 2022; originally announced April 2022.

    Comments: Preprint. Work in progress

  34. arXiv:2204.00721  [pdf]

    astro-ph.EP cs.LG

    Identifying Exoplanets with Machine Learning Methods: A Preliminary Study

    Authors: Yucheng Jin, Lanyi Yang, Chia-En Chiang

    Abstract: The discovery of habitable exoplanets has long been a heated topic in astronomy. Traditional methods for exoplanet identification include the wobble method, direct imaging, gravitational microlensing, etc., which not only require a considerable investment of manpower, time, and money, but also are limited by the performance of astronomical telescopes. In this study, we proposed the idea of using m…

    Submitted 1 April, 2022; originally announced April 2022.

    Comments: 12 pages with 9 figures and 2 tables

    Journal ref: International Journal on Cybernetics & Informatics (IJCI) 11 (1/2), pp. 31-42, 2022

  35. arXiv:2202.06626  [pdf, other]

    eess.IV cs.CV cs.LG

    MuZero with Self-competition for Rate Control in VP9 Video Compression

    Authors: Amol Mandhane, Anton Zhernov, Maribeth Rauh, Chenjie Gu, Miaosen Wang, Flora Xue, Wendy Shang, Derek Pang, Rene Claus, Ching-Han Chiang, Cheng Chen, Jingning Han, Angie Chen, Daniel J. Mankowitz, Jackson Broshear, Julian Schrittwieser, Thomas Hubert, Oriol Vinyals, Timothy Mann

    Abstract: Video streaming usage has seen a significant rise as entertainment, education, and business increasingly rely on online video. Optimizing video compression has the potential to increase access and quality of content to users, and reduce energy use and costs overall. In this paper, we present an application of the MuZero algorithm to the challenge of video compression. Specifically, we target the p…

    Submitted 14 February, 2022; originally announced February 2022.

  36. arXiv:2109.03537  [pdf, other]

    cs.CL

    On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets

    Authors: Cheng-Han Chiang, Hung-yi Lee

    Abstract: Pre-training language models (LMs) on large-scale unlabeled text data makes the model much easier to achieve exceptional downstream performance than their counterparts directly trained on the downstream tasks. In this work, we study what specific traits in the pre-training data, other than the semantics, make a pre-trained LM superior to their counterparts trained from scratch on downstream tasks.…

    Submitted 18 February, 2022; v1 submitted 8 September, 2021; originally announced September 2021.

    Comments: AAAI 2022 main conference paper. 10 pages, 3 figures, 2 tables

  37. arXiv:2105.07809  [pdf, other]

    eess.IV cs.CV cs.LG

    Learned Smartphone ISP on Mobile NPUs with Deep Learning, Mobile AI 2021 Challenge: Report

    Authors: Andrey Ignatov, Cheng-Ming Chiang, Hsien-Kai Kuo, Anastasia Sycheva, Radu Timofte, Min-Hung Chen, Man-Yu Lee, Yu-Syuan Xu, Yu Tseng, Shusong Xu, Jin Guo, Chao-Hung Chen, Ming-Chun Hsyu, Wen-Chia Tsai, Chao-Wei Chen, Grigory Malivenko, Minsu Kwon, Myungje Lee, Jaeyoon Yoo, Changbeom Kang, Shinjo Wang, Zheng Shaolong, Hao Dejun, Xie Fen, Feng Zhuang, et al. (16 additional authors not shown)

    Abstract: As the quality of mobile cameras starts to play a crucial role in modern smartphones, more and more attention is now being paid to ISP algorithms used to improve various perceptual aspects of mobile photos. In this Mobile AI challenge, the target was to develop an end-to-end deep learning-based image signal processing (ISP) pipeline that can replace classical hand-crafted ISPs and achieve nearly r…

    Submitted 17 May, 2021; originally announced May 2021.

    Comments: Mobile AI 2021 Workshop and Challenges: https://ai-benchmark.com/workshops/mai/2021/

  38. arXiv:2101.09113  [pdf, ps, other]

    cs.LG

    Pareto GAN: Extending the Representational Power of GANs to Heavy-Tailed Distributions

    Authors: Todd Huster, Jeremy E. J. Cohen, Zinan Lin, Kevin Chan, Charles Kamhoua, Nandi Leslie, Cho-Yu Jason Chiang, Vyas Sekar

    Abstract: Generative adversarial networks (GANs) are often billed as "universal distribution learners", but precisely what distributions they can represent and learn is still an open question. Heavy-tailed distributions are prevalent in many different domains such as financial risk-assessment, physics, and epidemiology. We observe that existing GAN architectures do a poor job of matching the asymptotic beha…

    Submitted 22 January, 2021; originally announced January 2021.

  39. arXiv:2012.11995  [pdf, other]

    cs.CL

    Pre-Training a Language Model Without Human Language

    Authors: Cheng-Han Chiang, Hung-yi Lee

    Abstract: In this paper, we study how the intrinsic nature of pre-training data contributes to the fine-tuned downstream performance. To this end, we pre-train different transformer-based masked language models on several corpora with certain features, and we fine-tune those language models on GLUE benchmarks. We find that models pre-trained on unstructured data beat those trained directly from scratch on d…

    Submitted 22 December, 2020; originally announced December 2020.

    Comments: 9 pages, work in progress

  40. arXiv:2012.05339  [pdf, other]

    cs.LG cs.CV

    Neural Rate Control for Video Encoding using Imitation Learning

    Authors: Hongzi Mao, Chenjie Gu, Miaosen Wang, Angie Chen, Nevena Lazic, Nir Levine, Derek Pang, Rene Claus, Marisabel Hechtman, Ching-Han Chiang, Cheng Chen, Jingning Han

    Abstract: In modern video encoders, rate control is a critical component and has been heavily engineered. It decides how many bits to spend to encode each frame, in order to optimize the rate-distortion trade-off over all video frames. This is a challenging constrained planning problem because of the complex dependency among decisions for different video frames and the bitrate constraint defined at the end…

    Submitted 9 December, 2020; originally announced December 2020.

  41. arXiv:2012.02328  [pdf, other]

    cs.LG cs.DC

    MLPerf Mobile Inference Benchmark

    Authors: Vijay Janapa Reddi, David Kanter, Peter Mattson, Jared Duke, Thai Nguyen, Ramesh Chukka, Ken Shiring, Koan-Sin Tan, Mark Charlebois, William Chou, Mostafa El-Khamy, Jungwook Hong, Tom St. John, Cindy Trinh, Michael Buch, Mark Mazumder, Relia Markovic, Thomas Atta, Fatih Cakir, Masoud Charkhabi, Xiaodong Chen, Cheng-Ming Chiang, Dave Dexter, Terry Heo, Gunther Schmuelling, et al. (2 additional authors not shown)

    Abstract: This paper presents the first industry-standard open-source machine learning (ML) benchmark to allow performance and accuracy evaluation of mobile devices with different AI chips and software stacks. The benchmark draws from the expertise of leading mobile-SoC vendors, ML-framework providers, and model producers. It comprises a suite of models that operate with standard data sets, quality metrics…

    Submitted 6 April, 2022; v1 submitted 3 December, 2020; originally announced December 2020.

  42. arXiv:2010.08708  [pdf, other]

    cs.CV cs.CL

    Answer-checking in Context: A Multi-modal FullyAttention Network for Visual Question Answering

    Authors: Hantao Huang, Tao Han, Wei Han, Deep Yap, Cheng-Ming Chiang

    Abstract: Visual Question Answering (VQA) is challenging due to the complex cross-modal relations. It has received extensive attention from the research community. From the human perspective, to answer a visual question, one needs to read the question and then refer to the image to generate an answer. This answer will then be checked against the question and image again for the final confirmation. In this p…

    Submitted 16 October, 2020; originally announced October 2020.

    Comments: Accepted in ICPR2020

  43. Deep Learning based Automated Forest Health Diagnosis from Aerial Images

    Authors: Chia-Yen Chiang, Chloe Barnes, Plamen Angelov, Richard Jiang

    Abstract: Global climate change has had a drastic impact on our environment. Previous study showed that pest disaster occurred from global climate change may cause a tremendous number of trees died and they inevitably became a factor of forest fire. An important portent of the forest fire is the condition of forests. Aerial image-based forest analysis can give an early detection of dead trees and living tree…

    Submitted 16 October, 2020; originally announced October 2020.

    Comments: 16 pages

    ACM Class: I.4.6; I.4.9; I.2.6; J.2

    Journal ref: IEEE Access, vol. 8, pp. 144064-144076, 2020

  44. arXiv:2010.02480  [pdf, other]

    cs.CL

    Pretrained Language Model Embryology: The Birth of ALBERT

    Authors: Cheng-Han Chiang, Sung-Feng Huang, Hung-yi Lee

    Abstract: While behaviors of pretrained language models (LMs) have been thoroughly examined, what happened during pretraining is rarely studied. We thus investigate the developmental process from a set of randomly initialized parameters to a totipotent language model, which we refer to as the embryology of a pretrained language model. Our results show that ALBERT learns to reconstruct and predict tokens of…

    Submitted 28 October, 2020; v1 submitted 6 October, 2020; originally announced October 2020.

    Comments: Accepted to EMNLP 2020, short paper

  45. Trust-Region Method with Deep Reinforcement Learning in Analog Design Space Exploration

    Authors: Kai-En Yang, Chia-Yu Tsai, Hung-Hao Shen, Chen-Feng Chiang, Feng-Ming Tsai, Chung-An Wang, Yiju Ting, Chia-Shun Yeh, Chin-Tang Lai

    Abstract: This paper introduces new perspectives on analog design space search. To minimize the time-to-market, this endeavor better cast as constraint satisfaction problem than global optimization defined in prior arts. We incorporate model-based agents, contrasted with model-free learning, to implement a trust-region strategy. As such, simple feed-forward networks can be trained with supervised learning,…

    Submitted 2 December, 2021; v1 submitted 29 September, 2020; originally announced September 2020.

    Comments: 6 pages, 3 figures, 5 tables

  46. Becoming the Super Turker: Increasing Wages via a Strategy from High Earning Workers

    Authors: Saiph Savage, Chun-Wei Chiang, Susumu Saito, Carlos Toxtli, Jeffrey Bigham

    Abstract: Crowd markets have traditionally limited workers by not providing transparency information concerning which tasks pay fairly or which requesters are unreliable. Researchers believe that a key reason why crowd workers earn low wages is due to this lack of transparency. As a result, tools have been developed to provide more transparency within crowd markets to help workers. However, while most worke…

    Submitted 7 May, 2020; originally announced May 2020.

    Comments: 12 pages, 8 figures, The Web Conference 2020, ACM WWW 2020

    ACM Class: I.2.7

  47. arXiv:2005.05053  [pdf, other]

    q-bio.NC cs.LG cs.NE eess.SP stat.ML

    Low-Rank Nonlinear Decoding of $μ$-ECoG from the Primary Auditory Cortex

    Authors: Melikasadat Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, Alyson K. Fletcher, Sundeep Rangan, Michael Trumpis, Brinnae Bent, Chia-Han Chiang, Jonathan Viventi

    Abstract: This paper considers the problem of neural decoding from parallel neural measurements systems such as micro-electrocorticography ($μ$-ECoG). In systems with large numbers of array elements at very high sampling rates, the dimension of the raw measurement data may be large. Learning neural decoders for this high-dimensional data can be challenging, particularly when the number of training samples i…

    Submitted 6 May, 2020; originally announced May 2020.

    Comments: 4 pages, 3 figures

  48. arXiv:2004.12599  [pdf, other]

    cs.CV eess.IV

    Deploying Image Deblurring across Mobile Devices: A Perspective of Quality and Latency

    Authors: Cheng-Ming Chiang, Yu Tseng, Yu-Syuan Xu, Hsien-Kai Kuo, Yi-Min Tsai, Guan-Yu Chen, Koan-Sin Tan, Wei-Ting Wang, Yu-Chieh Lin, Shou-Yao Roy Tseng, Wei-Shiang Lin, Chia-Lin Yu, BY Shen, Kloze Kao, Chia-Ming Cheng, Hung-Jen Chen

    Abstract: Recently, image enhancement and restoration have become important applications on mobile devices, such as super-resolution and image deblurring. However, most state-of-the-art networks present extremely high computational complexity. This makes them difficult to be deployed on mobile devices with acceptable latency. Moreover, when deploying to different mobile devices, there is a large latency var…

    Submitted 27 April, 2020; originally announced April 2020.

    Comments: CVPR 2020 Workshop on New Trends in Image Restoration and Enhancement (NTIRE)

  49. arXiv:1908.02125  [pdf, other]

    eess.IV cs.CV cs.LG

    Architecture-aware Network Pruning for Vision Quality Applications

    Authors: Wei-Ting Wang, Han-Lin Li, Wei-Shiang Lin, Cheng-Ming Chiang, Yi-Min Tsai

    Abstract: Convolutional neural network (CNN) delivers impressive achievements in computer vision and machine learning field. However, CNN incurs high computational complexity, especially for vision quality applications because of large image resolution. In this paper, we propose an iterative architecture-aware pruning algorithm with adaptive magnitude threshold while cooperating with quality-metric measurem…

    Submitted 4 August, 2019; originally announced August 2019.

    Comments: Accepted to be Published in the 26th IEEE International Conference on Image Processing (ICIP 2019). Updated to contain the IEEE copyright notice

  50. TurkScanner: Predicting the Hourly Wage of Microtasks

    Authors: Susumu Saito, Chun-Wei Chiang, Saiph Savage, Teppei Nakano, Tetsunori Kobayashi, Jeffrey Bigham

    Abstract: Workers in crowd markets struggle to earn a living. One reason for this is that it is difficult for workers to accurately gauge the hourly wages of microtasks, and they consequently end up performing labor with little pay. In general, workers are provided with little information about tasks, and are left to rely on noisy signals, such as textual description of the task or rating of the requester.…

    Submitted 17 March, 2019; originally announced March 2019.

    Comments: Proceedings of the 28th International Conference on World Wide Web (WWW '19), San Francisco, CA, USA, May 13-17, 2019