Showing 1–50 of 144 results for author: Chang, D

Searching in archive cs.
  1. arXiv:2407.06537  [pdf, other]

    cs.CL cs.AI

    Efficient and Accurate Memorable Conversation Model using DPO based on sLLM

    Authors: Youngkyung Seo, Yoonseok Heo, Jun-Seok Koh, Du-Seoung Chang

    Abstract: In multi-session dialog system, it is essential to continuously update the memory as the session progresses. Simply accumulating memory can make it difficult to focus on the content of the conversation for inference due to the limited input sentence size. Therefore, efficient and accurate conversation model that is capable of managing memory to reflect the conversation history continuously is nece…

    Submitted 9 July, 2024; originally announced July 2024.

  2. arXiv:2407.03051  [pdf, other]

    cs.CL

    Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment

    Authors: Janghwan Lee, Seongmin Park, Sukjin Hong, Minsoo Kim, Du-Seong Chang, Jungwook Choi

    Abstract: The rapid advancement of large language models (LLMs) has facilitated their transformation into conversational chatbots that can grasp contextual nuances and generate pertinent sentences, closely mirroring human values through advanced techniques such as instruction tuning and reinforcement learning from human feedback (RLHF). However, the computational efficiency required for LLMs, achieved throu…

    Submitted 18 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

    Comments: ACL 2024 Main

  3. arXiv:2406.16758  [pdf, other]

    cs.CL

    Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters

    Authors: Euiin Yi, Taehyeon Kim, Hongseok Jeung, Du-Seong Chang, Se-Young Yun

    Abstract: Large language models (LLMs) have revolutionized natural language processing and broadened their applicability across diverse commercial applications. However, the deployment of these models is constrained by high inference time in multilingual settings. To mitigate this challenge, this paper explores a training recipe of an assistant model in speculative decoding, which are leveraged to draft and…

    Submitted 24 June, 2024; originally announced June 2024.

  4. arXiv:2406.16469  [pdf, other]

    cs.CL cs.CV

    Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration

    Authors: Yujin Baek, ChaeHun Park, Jaeseok Kim, Yu-Jung Heo, Du-Seong Chang, Jaegul Choo

    Abstract: To create culturally inclusive vision-language models (VLMs), the foremost requirement is developing a test benchmark that can diagnose the models' ability to respond to questions reflecting cultural elements. This paper addresses the necessity for such benchmarks, noting that existing research has relied on human annotators' manual efforts, which impedes diversity and efficiency. We propose a sem…

    Submitted 24 June, 2024; originally announced June 2024.

  5. arXiv:2406.13317  [pdf, other]

    cs.CV

    M4Fog: A Global Multi-Regional, Multi-Modal, and Multi-Stage Dataset for Marine Fog Detection and Forecasting to Bridge Ocean and Atmosphere

    Authors: Mengqiu Xu, Ming Wu, Kaixin Chen, Yixiang Huang, Mingrui Xu, Yujia Yang, Yiqing Feng, Yiying Guo, Bin Huang, Dongliang Chang, Zhenwei Shi, Chuang Zhang, Zhanyu Ma, Jun Guo

    Abstract: Marine fog poses a significant hazard to global shipping, necessitating effective detection and forecasting to reduce economic losses. In recent years, several machine learning (ML) methods have demonstrated superior detection accuracy compared to traditional meteorological methods. However, most of these works are developed on proprietary datasets, and the few publicly accessible datasets are oft…

    Submitted 19 June, 2024; originally announced June 2024.

  6. arXiv:2406.11813  [pdf, other]

    cs.CL

    How Do Large Language Models Acquire Factual Knowledge During Pretraining?

    Authors: Hoyeon Chang, Jinho Park, Seonghyeon Ye, Sohee Yang, Youngkyung Seo, Du-Seong Chang, Minjoon Seo

    Abstract: Despite the recent observation that large language models (LLMs) can store substantial factual knowledge, there is a limited understanding of the mechanisms of how they acquire factual knowledge through pretraining. This work addresses this gap by studying how LLMs acquire factual knowledge during pretraining. The findings reveal several important insights into the dynamics of factual knowledge ac…

    Submitted 17 June, 2024; originally announced June 2024.

    ACM Class: I.2.7

  7. arXiv:2406.08718  [pdf, other]

    cs.CL

    Enhancing Psychotherapy Counseling: A Data Augmentation Pipeline Leveraging Large Language Models for Counseling Conversations

    Authors: Jun-Woo Kim, Ji-Eun Han, Jun-Seok Koh, Hyeon-Tae Seo, Du-Seong Chang

    Abstract: We introduce a pipeline that leverages Large Language Models (LLMs) to transform single-turn psychotherapy counseling sessions into multi-turn interactions. While AI-supported online counseling services for individuals with mental disorders exist, they are often constrained by the limited availability of multi-turn training datasets and frequently fail to fully utilize therapists' expertise. Our p…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: IJCAI 2024 AI4Research workshop

  8. arXiv:2406.05963  [pdf, other]

    cs.CV cs.AI

    Solution for SMART-101 Challenge of CVPR Multi-modal Algorithmic Reasoning Task 2024

    Authors: Jinwoo Ahn, Junhyeok Park, Min-Jun Kim, Kang-Hyeon Kim, So-Yeong Sohn, Yun-Ji Lee, Du-Seong Chang, Yu-Jung Heo, Eun-Sol Kim

    Abstract: In this paper, the solution of HYU MLLAB KT Team to the Multimodal Algorithmic Reasoning Task: SMART-101 CVPR 2024 Challenge is presented. Beyond conventional visual question-answering problems, the SMART-101 challenge aims to achieve human-level multimodal understanding by tackling complex visio-linguistic puzzles designed for children in the 6-8 age group. To solve this problem, we suggest two m…

    Submitted 9 June, 2024; originally announced June 2024.

  9. arXiv:2406.02331  [pdf, other]

    cs.CL

    Translation Deserves Better: Analyzing Translation Artifacts in Cross-lingual Visual Question Answering

    Authors: ChaeHun Park, Koanho Lee, Hyesu Lim, Jaeseok Kim, Junmo Park, Yu-Jung Heo, Du-Seong Chang, Jaegul Choo

    Abstract: Building a reliable visual question answering (VQA) system across different languages is a challenging problem, primarily due to the lack of abundant samples for training. To address this challenge, recent studies have employed machine translation systems for the cross-lingual VQA task. This involves translating the evaluation samples into a source language (usually English) and using monolingual…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Findings Accepted

  10. arXiv:2405.19595  [pdf]

    cs.CV

    The RSNA Abdominal Traumatic Injury CT (RATIC) Dataset

    Authors: Jeffrey D. Rudie, Hui-Ming Lin, Robyn L. Ball, Sabeena Jalal, Luciano M. Prevedello, Savvas Nicolaou, Brett S. Marinelli, Adam E. Flanders, Kirti Magudia, George Shih, Melissa A. Davis, John Mongan, Peter D. Chang, Ferco H. Berger, Sebastiaan Hermans, Meng Law, Tyler Richards, Jan-Peter Grunz, Andreas Steven Kunz, Shobhit Mathur, Sandro Galea-Soler, Andrew D. Chung, Saif Afat, Chin-Chi Kuo, Layal Aweidah , et al. (15 additional authors not shown)

    Abstract: The RSNA Abdominal Traumatic Injury CT (RATIC) dataset is the largest publicly available collection of adult abdominal CT studies annotated for traumatic injuries. This dataset includes 4,274 studies from 23 institutions across 14 countries. The dataset is freely available for non-commercial use via Kaggle at https://www.kaggle.com/competitions/rsna-2023-abdominal-trauma-detection. Created for the…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 40 pages, 2 figures, 3 tables

  11. arXiv:2405.14017  [pdf, other]

    cs.CV

    MagicPose4D: Crafting Articulated Models with Appearance and Motion Control

    Authors: Hao Zhang, Di Chang, Fang Li, Mohammad Soleymani, Narendra Ahuja

    Abstract: With the success of 2D and 3D visual generative models, there is growing interest in generating 4D content. Existing methods primarily rely on text prompts to produce 4D content, but they often fall short of accurately defining complex or rare motions. To address this limitation, we propose MagicPose4D, a novel framework for refined control over both appearance and motion in 4D generation. Unlike…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: Project Page: https://boese0601.github.io/magicpose4d

  12. arXiv:2405.12235  [pdf]

    cs.LG q-bio.QM

    Hypergraph: A Unified and Uniform Definition with Application to Chemical Hypergraph

    Authors: Daniel T. Chang

    Abstract: The conventional definition of hypergraph has two major issues: (1) there is not a standard definition of directed hypergraph and (2) there is not a formal definition of nested hypergraph. To resolve these issues, we propose a new definition of hypergraph that unifies the concepts of undirected, directed and nested hypergraphs, and that is uniform in using hyperedge as a single construct for repre…

    Submitted 18 June, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.03623 by other authors

  13. arXiv:2405.06495  [pdf, other]

    cs.HC

    Storypark: Leveraging Large Language Models to Enhance Children Story Learning Through Child-AI collaboration Storytelling

    Authors: Lyumanshan Ye, Jiandong Jiang, Danni Chang, Pengfei Liu

    Abstract: Interactive storytelling has been widely adopted by educators in teaching activities of young children. Such a teaching method combines storytelling with active child participation, benefiting their expressive abilities, creative thinking, and understanding of stories. Interactive storytelling requires facilitators to unidirectionally narrate the story content and encourage children's participatio…

    Submitted 13 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

  14. arXiv:2404.16767  [pdf, other]

    cs.LG cs.CL cs.CV

    REBEL: Reinforcement Learning via Regressing Relative Rewards

    Authors: Zhaolin Gao, Jonathan D. Chang, Wenhao Zhan, Owen Oertell, Gokul Swamy, Kianté Brantley, Thorsten Joachims, J. Andrew Bagnell, Jason D. Lee, Wen Sun

    Abstract: While originally developed for continuous control problems, Proximal Policy Optimization (PPO) has emerged as the work-horse of a variety of reinforcement learning (RL) applications, including the fine-tuning of generative models. Unfortunately, PPO requires multiple heuristics to enable stable convergence (e.g. value networks, clipping), and is notorious for its sensitivity to the precise impleme…

    Submitted 29 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: New experimental results on general chat

  15. arXiv:2404.12734  [pdf, other]

    cs.CV

    DLoRA-TrOCR: Mixed Text Mode Optical Character Recognition Based On Transformer

    Authors: Da Chang, Yu Li

    Abstract: With the continuous development of Optical Character Recognition (OCR) and the expansion of application fields, text recognition in complex scenes has become a key challenge. Factors such as multiple fonts, mixed scenes and complex layouts seriously affect the recognition accuracy of traditional OCR models. Although OCR models based on deep learning have performed well in specific fields or simila…

    Submitted 23 April, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

  16. arXiv:2404.08513  [pdf, other]

    cs.LG cs.AI

    Adversarial Imitation Learning via Boosting

    Authors: Jonathan D. Chang, Dhruv Sreenivas, Yingbing Huang, Kianté Brantley, Wen Sun

    Abstract: Adversarial imitation learning (AIL) has stood out as a dominant framework across various imitation learning (IL) applications, with Discriminator Actor Critic (DAC) (Kostrikov et al., 2019) demonstrating the effectiveness of off-policy learning algorithms in improving sample efficiency and scalability to higher-dimensional observations. Despite DAC's empirical success, the original AIL objective…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 19 pages, 7 figures, 4 tables, 3 algorithms, ICLR 2024

  17. arXiv:2404.08495  [pdf, other]

    cs.LG cs.AI cs.CL

    Dataset Reset Policy Optimization for RLHF

    Authors: Jonathan D. Chang, Wenhao Zhan, Owen Oertell, Kianté Brantley, Dipendra Misra, Jason D. Lee, Wen Sun

    Abstract: Reinforcement Learning (RL) from Human Preference-based feedback is a popular paradigm for fine-tuning generative models, which has produced impressive models such as GPT-4 and Claude3 Opus. This framework often consists of two steps: learning a reward model from an offline preference dataset followed by running online RL to optimize the learned reward model. In this work, leveraging the idea of r…

    Submitted 16 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: 28 pages, 6 tables, 3 Figures, 3 Algorithms

  18. arXiv:2404.07947  [pdf, other]

    cs.DC cs.LG

    ExeGPT: Constraint-Aware Resource Scheduling for LLM Inference

    Authors: Hyungjun Oh, Kihong Kim, Jaemin Kim, Sungkyun Kim, Junyeol Lee, Du-seong Chang, Jiwon Seo

    Abstract: This paper presents ExeGPT, a distributed system designed for constraint-aware LLM inference. ExeGPT finds and runs with an optimal execution schedule to maximize inference throughput while satisfying a given latency constraint. By leveraging the distribution of input and output sequences, it effectively allocates resources and determines optimal execution configurations, including batch sizes and…

    Submitted 15 March, 2024; originally announced April 2024.

    Comments: Accepted to ASPLOS 2024 (summer cycle)

    Journal ref: 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 24 summer cycle), Volume 2, Nov 15, 2023 (Notification Date)

  19. arXiv:2404.03673  [pdf, other]

    cs.CV cs.AI cs.LG

    RL for Consistency Models: Faster Reward Guided Text-to-Image Generation

    Authors: Owen Oertell, Jonathan D. Chang, Yiyi Zhang, Kianté Brantley, Wen Sun

    Abstract: Reinforcement learning (RL) has improved guided image generation with diffusion models by directly optimizing rewards that capture image quality, aesthetics, and instruction following capabilities. However, the resulting generative policies inherit the same iterative sampling process of diffusion models that causes slow generation. To overcome this limitation, consistency models proposed learning…

    Submitted 22 June, 2024; v1 submitted 25 March, 2024; originally announced April 2024.

    Comments: 18 pages, 9 figures, 1 table

  20. arXiv:2404.01136  [pdf, ps, other]

    cs.IT

    Density Evolution Analysis of Generalized Low-density Parity-check Codes under a Posteriori Probability Decoder

    Authors: Dongxu Chang, Qingqing Peng, Guanghui Wang, Dawei Yin

    Abstract: In this study, the performance of generalized low-density parity-check (GLDPC) codes under the a posteriori probability (APP) decoder is analyzed. We explore the concentration, symmetry, and monotonicity properties of GLDPC codes under the APP decoder, extending the applicability of density evolution to GLDPC codes. We demonstrate that with an appropriate proportion of generalized constraint (GC)…

    Submitted 1 April, 2024; originally announced April 2024.

  21. arXiv:2404.00930  [pdf, other]

    cs.CL

    PSYDIAL: Personality-based Synthetic Dialogue Generation using Large Language Models

    Authors: Ji-Eun Han, Jun-Seok Koh, Hyeon-Tae Seo, Du-Seong Chang, Kyung-Ah Sohn

    Abstract: We present a novel end-to-end personality-based synthetic dialogue data generation pipeline, specifically designed to elicit responses from large language models via prompting. We design the prompts to generate more human-like dialogues considering real-world scenarios when users engage with chatbots. We introduce PSYDIAL, the first Korean dialogue dataset focused on personality-based dialogues, c…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: LREC-COLING 2024 Main

  22. arXiv:2403.19266  [pdf, other]

    cs.IT

    On the Performance of Low-complexity Decoders of LDPC and Polar Codes

    Authors: Qingqing Peng, Dawei Yin, Dongxu Chang, Yuan Li, Huazi Zhang, Guiying Yan, Guanghui Wang

    Abstract: Efficient decoding is crucial to high-throughput and low-power wireless communication scenarios. A theoretical analysis of the performance-complexity tradeoff toward low-complexity decoding is required for a better understanding of the fundamental limits in the above-mentioned scenarios. This study aims to explore the performance of decoders with complexity constraints. Specifically, we investigat…

    Submitted 3 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: text overlap with arXiv:2012.13378 by other authors

  23. arXiv:2403.09069  [pdf, other]

    cs.CV

    Dyadic Interaction Modeling for Social Behavior Generation

    Authors: Minh Tran, Di Chang, Maksim Siniukov, Mohammad Soleymani

    Abstract: Human-human communication is like a delicate dance where listeners and speakers concurrently interact to maintain conversational dynamics. Hence, an effective model for generating listener nonverbal behaviors requires understanding the dyadic context and interaction. In this paper, we present an effective framework for creating 3D facial motions in dyadic interactions. Existing work consider a lis…

    Submitted 17 July, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: The first two authors contribute equally. The paper is accepted by ECCV 2024. Project Page: https://boese0601.github.io/dim/ Code: https://github.com/Boese0601/Dyadic-Interaction-Modeling

  24. arXiv:2402.09587  [pdf, other]

    cs.CV

    DeepATLAS: One-Shot Localization for Biomedical Data

    Authors: Peter D. Chang

    Abstract: This paper introduces the DeepATLAS foundational model for localization tasks in the domain of high-dimensional biomedical data. Upon convergence of the proposed self-supervised objective, a pretrained model maps an input to an anatomically-consistent embedding from which any point or set of points (e.g., boxes or segmentations) may be identified in a one-shot or few-shot approach. As a representa…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 18 pages

  25. arXiv:2401.15691  [pdf, other]

    cs.LG

    One for all: A novel Dual-space Co-training baseline for Large-scale Multi-View Clustering

    Authors: Zisen Kong, Zhiqiang Fu, Dongxia Chang, Yiming Wang, Yao Zhao

    Abstract: In this paper, we propose a novel multi-view clustering model, named Dual-space Co-training Large-scale Multi-view Clustering (DSCMC). The main objective of our approach is to enhance the clustering performance by leveraging co-training in two distinct spaces. In the original space, we learn a projection matrix to obtain latent consistent anchor graphs from different views. This process involves c…

    Submitted 28 January, 2024; originally announced January 2024.

  26. arXiv:2401.11196  [pdf, ps, other]

    eess.SY cs.LG

    Machine learning based state observer for discrete time systems evolving on Lie groups

    Authors: Soham Shanbhag, Dong Eui Chang

    Abstract: In this paper, a machine learning based observer for systems evolving on manifolds is designed such that the state of the observer is restricted to the Lie group on which the system evolves. Conventional techniques involving machine learning based observers on systems evolving on Lie groups involve designing charts for the Lie group, training a machine learning based observer for each chart, and s…

    Submitted 20 January, 2024; originally announced January 2024.

  27. arXiv:2312.13016  [pdf, other]

    cs.CV

    DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis

    Authors: Yuming Gu, You Xie, Hongyi Xu, Guoxian Song, Yichun Shi, Di Chang, Jing Yang, Linjie Luo

    Abstract: We present DiffPortrait3D, a conditional diffusion model that is capable of synthesizing 3D-consistent photo-realistic novel views from as few as a single in-the-wild portrait. Specifically, given a single RGB input, we aim to synthesize plausible but consistent facial details rendered from novel camera views with retained both identity and facial expression. In lieu of time-consuming optimization…

    Submitted 19 March, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  28. arXiv:2311.16973  [pdf, other]

    cs.CV cs.AI cs.LG

    DemoFusion: Democratising High-Resolution Image Generation With No $$$

    Authors: Ruoyi Du, Dongliang Chang, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma

    Abstract: High-resolution image generation with Generative Artificial Intelligence (GenAI) has immense potential but, due to the enormous capital investment required for training, it is increasingly centralised to a few large corporations, and hidden behind paywalls. This paper aims to democratise high-resolution GenAI by advancing the frontier of high-resolution generation while remaining accessible to a b…

    Submitted 14 December, 2023; v1 submitted 23 November, 2023; originally announced November 2023.

    Comments: Project Page: https://ruoyidu.github.io/demofusion/demofusion.html

  29. arXiv:2311.12052  [pdf, other]

    cs.CV

    MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion

    Authors: Di Chang, Yichun Shi, Quankai Gao, Jessica Fu, Hongyi Xu, Guoxian Song, Qing Yan, Yizhe Zhu, Xiao Yang, Mohammad Soleymani

    Abstract: In this work, we propose MagicPose, a diffusion-based model for 2D human pose and facial expression retargeting. Specifically, given a reference image, we aim to generate a person's new images by controlling the poses and facial expressions while keeping the identity unchanged. To this end, we propose a two-stage training strategy to disentangle human motions and appearance (e.g., facial expressio…

    Submitted 5 May, 2024; v1 submitted 18 November, 2023; originally announced November 2023.

    Comments: Accepted by ICML 2024. MagicPose and MagicDance are the same project. Website: https://boese0601.github.io/magicdance/ Code: https://github.com/Boese0601/MagicDance

  30. arXiv:2310.11047   

    cs.HC

    The Impact of Gamified Auditory-Verbal Training for Hearing-Challenged Children at Intermediate and Advanced Rehabilitation Stages

    Authors: Yan Xiang, Zhen Zhang, Danni Chang, Lei Tu

    Abstract: Auditory-verbal training is essential for children with hearing challenges, and the gamification approach has become a promising direction for improving the rehabilitation experience and effect. However, the specific influence of the gamified training approach on participants at different rehabilitation stages has not been empirically studied. This paper is thusly intended to investigate the resea…

    Submitted 22 January, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: The analysis section requires refinement

  31. arXiv:2310.10054  [pdf, other]

    cs.CL cs.AI cs.LG

    NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models

    Authors: Jongwoo Ko, Seungjoon Park, Yujin Kim, Sumyeong Ahn, Du-Seong Chang, Euijai Ahn, Se-Young Yun

    Abstract: Structured pruning methods have proven effective in reducing the model size and accelerating inference speed in various network architectures such as Transformers. Despite the versatility of encoder-decoder models in numerous NLP tasks, the structured pruning methods on such models are relatively less explored compared to encoder-only models. In this study, we investigate the behavior of the struc…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Findings of the Association for Computational Linguistics: EMNLP 2023

  32. arXiv:2310.09779  [pdf]

    cs.HC

    Exploring the Correlation between Urban Microclimate Simulation and Urban Morphology: A Case Study in Yeongdeungpo-gu, Seoul

    Authors: Yan Xiang, Danni Chang, Jieli Cheng

    Abstract: Different social backgrounds and planning policies give rise to diverse urban morphologies. These morphologies influence urban microclimate factors and contribute to the formation of unique local microclimates, particularly in terms of outdoor temperature. In recent times, the heat island effect has gained increasing significance during the summer season. Therefore, this study aims to explore the…

    Submitted 15 October, 2023; originally announced October 2023.

  33. arXiv:2310.09778  [pdf]

    cs.HC

    Leveraging Urban Big Data for Informed Business Location Decisions: A Case Study of Starbucks in Tianhe District, Guangzhou City

    Authors: Yan Xiang, Danni Chang, Xuan Feng

    Abstract: With the development of the information age, cities provide a large amount of data that can be analyzed and utilized to facilitate the decision-making process. Urban big data and analytics are particularly valuable in the analysis of business location decisions, providing insight and supporting informed choices. By examining data relating to commercial locations, it becomes possible to analyze var…

    Submitted 15 October, 2023; originally announced October 2023.

  34. arXiv:2310.04407  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Policy-Gradient Training of Language Models for Ranking

    Authors: Ge Gao, Jonathan D. Chang, Claire Cardie, Kianté Brantley, Thorsten Joachims

    Abstract: Text retrieval plays a crucial role in incorporating factual knowledge for decision making into language processing pipelines, ranging from chat-based web search to question answering systems. Current state-of-the-art text retrieval models leverage pre-trained large language models (LLMs) to achieve competitive performance, but training LLM-based retrievers via typical contrastive losses requires…

    Submitted 6 October, 2023; originally announced October 2023.

  35. arXiv:2308.12380  [pdf, other]

    cs.CV

    FG-Net: Facial Action Unit Detection with Generalizable Pyramidal Features

    Authors: Yufeng Yin, Di Chang, Guoxian Song, Shen Sang, Tiancheng Zhi, Jing Liu, Linjie Luo, Mohammad Soleymani

    Abstract: Automatic detection of facial Action Units (AUs) allows for objective facial expression analysis. Due to the high cost of AU labeling and the limited size of existing benchmarks, previous AU detection methods tend to overfit the dataset, resulting in a significant performance loss when evaluated across corpora. To address this problem, we propose FG-Net for generalizable facial action unit detecti…

    Submitted 23 August, 2023; originally announced August 2023.

  36. arXiv:2308.10713  [pdf, other]

    cs.CV

    LibreFace: An Open-Source Toolkit for Deep Facial Expression Analysis

    Authors: Di Chang, Yufeng Yin, Zongjian Li, Minh Tran, Mohammad Soleymani

    Abstract: Facial expression analysis is an important tool for human-computer interaction. In this paper, we introduce LibreFace, an open-source toolkit for facial expression analysis. This open-source toolbox offers real-time and offline analysis of facial behavior through deep learning models, including facial action unit (AU) detection, AU intensity estimation, and facial expression recognition. To accomp…

    Submitted 23 August, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

    Comments: 10 pages, 5 figures. Accepted by WACV 2024 Round 1. (Application Track) Project Page: https://boese0601.github.io/libreface/

  37. arXiv:2308.06744  [pdf, other]

    cs.CL

    Token-Scaled Logit Distillation for Ternary Weight Generative Language Models

    Authors: Minsoo Kim, Sihwa Lee, Janghwan Lee, Sukjin Hong, Du-Seong Chang, Wonyong Sung, Jungwook Choi

    Abstract: Generative Language Models (GLMs) have shown impressive performance in tasks such as text generation, understanding, and reasoning. However, the large model size poses challenges for practical deployment. To solve this problem, Quantization-Aware Training (QAT) has become increasingly popular. However, current QAT methods for generative models have resulted in a noticeable loss of accuracy. To cou…

    Submitted 2 December, 2023; v1 submitted 13 August, 2023; originally announced August 2023.

    Comments: NeurIPS 2023 Camera Ready

  38. arXiv:2306.17089  [pdf]

    cs.LG cs.CL

    Concept-Oriented Deep Learning with Large Language Models

    Authors: Daniel T. Chang

    Abstract: Large Language Models (LLMs) have been successfully used in many natural-language tasks and applications including text generation and AI chatbots. They also are a promising new technology for concept-oriented deep learning (CODL). However, the prerequisite is that LLMs understand concepts and ensure conceptual consistency. We discuss these in this paper, as well as major uses of LLMs for CODL inc…

    Submitted 19 September, 2023; v1 submitted 29 June, 2023; originally announced June 2023.

  39. arXiv:2306.14649  [pdf, other]

    cs.NE

    CIMulator: A Comprehensive Simulation Platform for Computing-In-Memory Circuit Macros with Low Bit-Width and Real Memory Materials

    Authors: Hoang-Hiep Le, Md. Aftab Baig, Wei-Chen Hong, Cheng-Hsien Tsai, Cheng-Jui Yeh, Fu-Xiang Liang, I-Ting Huang, Wei-Tzu Tsai, Ting-Yin Cheng, Sourav De, Nan-Yow Chen, Wen-Jay Lee, Ing-Chao Lin, Da-Wei Chang, Darsen D. Lu

    Abstract: This paper presents a simulation platform, namely CIMulator, for quantifying the efficacy of various synaptic devices in neuromorphic accelerators for different neural network architectures. Nonvolatile memory devices, such as resistive random-access memory, ferroelectric field-effect transistor, and volatile static random-access memory devices, can be selected as synaptic devices. A multilayer pe…

    Submitted 26 June, 2023; originally announced June 2023.

  40. arXiv:2306.11816  [pdf, other]

    cs.LG cs.AI cs.CL

    Learning to Generate Better Than Your LLM

    Authors: Jonathan D. Chang, Kiante Brantley, Rajkumar Ramamurthy, Dipendra Misra, Wen Sun

    Abstract: Reinforcement learning (RL) has emerged as a powerful paradigm for fine-tuning Large Language Models (LLMs) for text generation. In particular, recent LLMs such as ChatGPT and GPT-4 can engage in fluent conversations with users after finetuning with RL. Capitalizing on key properties of text generation, we seek to investigate RL algorithms beyond general purpose algorithms like Proximal Policy Opt…

    Submitted 13 November, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: 23 pages, 5 figures, 7 tables, 4 algorithms

  41. arXiv:2304.04687  [pdf, other]

    cs.CV cs.HC

    Learning to Detect Touches on Cluttered Tables

    Authors: Norberto Adrian Goussies, Kenji Hata, Shruthi Prabhakara, Abhishek Amit, Tony Aube, Carl Cepress, Diana Chang, Li-Te Cheng, Horia Stefan Ciurdar, Mike Cleron, Chelsey Fleming, Ashwin Ganti, Divyansh Garg, Niloofar Gheissari, Petra Luna Grutzik, David Hendon, Daniel Iglesia, Jin Kim, Stuart Kyle, Chris LaRosa, Roman Lewkow, Peter F McDermott, Chris Melancon, Paru Nackeeran, Neal Norwitz , et al. (6 additional authors not shown)

    Abstract: We present a novel self-contained camera-projector tabletop system with a lamp form-factor that brings digital intelligence to our tables. We propose a real-time, on-device, learning-based touch detection algorithm that makes any tabletop interactive. The top-down configuration and learning-based algorithm makes our method robust to the presence of clutter, a main limitation of existing camera-pro…

    Submitted 10 April, 2023; originally announced April 2023.

  42. arXiv:2303.13762  [pdf, ps, other]

    cs.IT

    An Optimization Model for Offline Scheduling Policy of Low-density Parity-check Codes

    Authors: Dongxu Chang, Guanghui Wang, Guiying Yan, Dawei Yin

    Abstract: In this study, an optimization model for offline scheduling policy of low-density parity-check (LDPC) codes is proposed to improve the decoding efficiency of the belief propagation (BP). The optimization model uses the number of messages passed (NMP) as a metric to evaluate complexity, and two metrics, average entropy (AE), and gap to maximum a posteriori (GAP), to evaluate BP decoding performance…

    Submitted 23 March, 2023; originally announced March 2023.

  43. arXiv:2303.10590  [pdf, other]

    cs.CV

    Multi-modal Facial Action Unit Detection with Large Pre-trained Models for the 5th Competition on Affective Behavior Analysis in-the-wild

    Authors: Yufeng Yin, Minh Tran, Di Chang, Xinrui Wang, Mohammad Soleymani

    Abstract: Facial action unit detection has emerged as an important task within facial expression analysis, aimed at detecting specific pre-defined, objective facial expressions, such as lip tightening and cheek raising. This paper presents our submission to the Affective Behavior Analysis in-the-wild (ABAW) 2023 Competition for AU detection. We propose a multi-modal method for facial action unit detection w…

    Submitted 17 April, 2023; v1 submitted 19 March, 2023; originally announced March 2023.

    Comments: 8 pages, 7 figures, 5 tables

  44. arXiv:2303.02469  [pdf]

    cs.CL cs.LG

    Variational Quantum Classifiers for Natural-Language Text

    Authors: Daniel T. Chang

    Abstract: As part of the recent research effort on quantum natural language processing (QNLP), variational quantum sentence classifiers (VQSCs) have been implemented and supported in lambeq / DisCoPy, based on the DisCoCat model of sentence meaning. We discuss in some detail VQSCs, including category theory, DisCoCat for modeling sentence as string diagram, and DisCoPy for encoding string diagram as paramet…

    Submitted 4 March, 2023; originally announced March 2023.

  45. Improving Training Stability for Multitask Ranking Models in Recommender Systems

    Authors: Jiaxi Tang, Yoel Drori, Daryl Chang, Maheswaran Sathiamoorthy, Justin Gilmer, Li Wei, Xinyang Yi, Lichan Hong, Ed H. Chi

    Abstract: Recommender systems play an important role in many content platforms. While most recommendation research is dedicated to designing better models to improve user experience, we found that research on stabilizing the training for such models is severely under-explored. As recommendation models become larger and more sophisticated, they are more susceptible to training instability issues, i.e., loss…

    Submitted 15 June, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

    Comments: Accepted at KDD 2023; 12 pages

  46. arXiv:2302.01530  [pdf, other]

    cs.CL cs.AI cs.LG

    Revisiting Intermediate Layer Distillation for Compressing Language Models: An Overfitting Perspective

    Authors: Jongwoo Ko, Seungjoon Park, Minchan Jeong, Sukjin Hong, Euijai Ahn, Du-Seong Chang, Se-Young Yun

    Abstract: Knowledge distillation (KD) is a highly promising method for mitigating the computational problems of pre-trained language models (PLMs). Among various KD approaches, Intermediate Layer Distillation (ILD) has been a de facto standard KD method with its performance efficacy in the NLP field. In this paper, we find that existing ILD methods are prone to overfitting to training datasets, although the…

    Submitted 2 February, 2023; originally announced February 2023.

    Comments: The 17th Conference of the European Chapter of the Association for Computational Linguistics (Findings)

  47. arXiv:2211.17161  [pdf, other]

    cs.CV

    Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification

    Authors: Jijie Wu, Dongliang Chang, Aneeshan Sain, Xiaoxu Li, Zhanyu Ma, Jie Cao, Jun Guo, Yi-Zhe Song

    Abstract: The main challenge for fine-grained few-shot image classification is to learn feature representations with higher inter-class and lower intra-class variations, with a mere few labelled samples. Conventional few-shot learning methods however cannot be naively adopted for this fine-grained setting -- a quick pilot study reveals that they in fact push for the opposite (i.e., lower inter-class variati…

    Submitted 8 January, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: Accepted in AAAI-23

  48. arXiv:2211.11014  [pdf, other]

    cs.CL cs.AI

    Understanding and Improving Knowledge Distillation for Quantization-Aware Training of Large Transformer Encoders

    Authors: Minsoo Kim, Sihwa Lee, Sukjin Hong, Du-Seong Chang, Jungwook Choi

    Abstract: Knowledge distillation (KD) has been a ubiquitous method for model compression to strengthen the capability of a lightweight model with the transferred knowledge from the teacher. In particular, KD has been employed in quantization-aware training (QAT) of Transformer encoders like BERT to improve the accuracy of the student model with the reduced-precision weight parameters. However, little is und…

    Submitted 20 November, 2022; originally announced November 2022.

    Comments: EMNLP 2022 Main Track Long Paper

  49. arXiv:2211.05225  [pdf]

    quant-ph cs.LG

    Variational Quantum Kernels with Task-Specific Quantum Metric Learning

    Authors: Daniel T. Chang

    Abstract: Quantum kernel methods, i.e., kernel methods with quantum kernels, offer distinct advantages as a hybrid quantum-classical approach to quantum machine learning (QML), including applicability to Noisy Intermediate-Scale Quantum (NISQ) devices and usage for solving all types of machine learning problems. Kernel methods rely on the notion of similarity between points in a higher (possibly infinite) d…

    Submitted 26 November, 2022; v1 submitted 8 November, 2022; originally announced November 2022.

  50. arXiv:2211.04906  [pdf, other]

    cs.CV

    Cross-view Graph Contrastive Representation Learning on Partially Aligned Multi-view Data

    Authors: Yiming Wang, Dongxia Chang, Zhiqiang Fu, Jie Wen, Yao Zhao

    Abstract: Multi-view representation learning has developed rapidly over the past decades and has been applied in many fields. However, most previous works assumed that each view is complete and aligned. This leads to an inevitable deterioration in their performance when encountering practical problems such as missing or unaligned views. To address the challenge of representation learning on partially aligne…

    Submitted 8 November, 2022; originally announced November 2022.