
Showing 1–50 of 261 results for author: Han, G

  1. arXiv:2408.15592  [pdf, ps, other]

    cs.IT

    $r$-Minimal Codes with Respect to Rank Metric

    Authors: Yang Xu, Haibin Kan, Guangyue Han

    Abstract: In this paper, we propose and study $r$-minimal codes, a natural extension of minimal codes which have been extensively studied with respect to Hamming metric, rank metric and sum-rank metric. We first propose $r$-minimal codes in a general setting where the ambient space is a finite dimensional left module over a division ring and is supported on a lattice. We characterize minimal subcodes and…

    Submitted 28 August, 2024; originally announced August 2024.

  2. arXiv:2407.16574  [pdf, other]

    cs.CL

    TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback

    Authors: Eunseop Yoon, Hee Suk Yoon, SooHwan Eom, Gunsoo Han, Daniel Wontae Nam, Daejin Jo, Kyoung-Woon On, Mark A. Hasegawa-Johnson, Sungwoong Kim, Chang D. Yoo

    Abstract: Reinforcement Learning from Human Feedback (RLHF) leverages human preference data to train language models to align more closely with human essence. These human preference data, however, are labeled at the sequence level, creating a mismatch between sequence-level preference labels and tokens, which are autoregressively generated from the language model. Although several recent approaches have tri…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: ACL2024 Findings

  3. arXiv:2407.11291  [pdf, ps, other]

    math.RA

    Normal forms of elements in the Weyl algebra and Dixmier Conjecture

    Authors: Gang Han, Zhennan Pan, Yulin Chen

    Abstract: A result of A. Joseph says that any nilpotent or semisimple element $z$ in the Weyl algebra $A_1$ over some algebraically closed field $K$ of characteristic 0 has a normal form up to the action of the automorphism group of $A_1$. It is shown in this note that the normal form corresponds to some unique pair of integers $(k,n)$ with $k\ge n\ge 0$, and will be called the Joseph norm form of $z$. Simila…

    Submitted 15 July, 2024; originally announced July 2024.

  4. arXiv:2407.04064  [pdf, other]

    cs.RO

    Collision Avoidance for Multiple UAVs in Unknown Scenarios with Causal Representation Disentanglement

    Authors: Jiafan Zhuang, Zihao Xia, Gaofei Han, Boxi Wang, Wenji Li, Dongliang Wang, Zhifeng Hao, Ruichu Cai, Zhun Fan

    Abstract: Deep reinforcement learning (DRL) has achieved remarkable progress in online path planning tasks for multi-UAV systems. However, existing DRL-based methods often suffer from performance degradation when tackling unseen scenarios, since the non-causal factors in visual representations adversely affect policy learning. To address this issue, we propose a novel representation learning approach, i.e.,…

    Submitted 15 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

  5. arXiv:2407.04056  [pdf, other]

    cs.RO

    Robust Policy Learning for Multi-UAV Collision Avoidance with Causal Feature Selection

    Authors: Jiafan Zhuang, Gaofei Han, Zihao Xia, Boxi Wang, Wenji Li, Dongliang Wang, Zhifeng Hao, Ruichu Cai, Zhun Fan

    Abstract: In unseen and complex outdoor environments, collision avoidance navigation for unmanned aerial vehicle (UAV) swarms presents a challenging problem. It requires UAVs to navigate through various obstacles and complex backgrounds. Existing collision avoidance navigation methods based on deep reinforcement learning show promising performance but suffer from poor generalization abilities, resulting in…

    Submitted 15 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

  6. arXiv:2407.03231  [pdf]

    cond-mat.mtrl-sci cond-mat.str-el

    Dimensionality Engineering of Magnetic Anisotropy from Anomalous Hall Effect in Synthetic SrRuO3 Crystals

    Authors: Seung Gyo Jeong, Seong Won Cho, Sehwan Song, Jin Young Oh, Do Gyeom Jeong, Gyeongtak Han, Hu Young Jeong, Ahmed Yousef Mohamed, Woo-suk Noh, Sungkyun Park, Jong Seok Lee, Suyoun Lee, Young-Min Kim, Deok-Yong Cho, Woo Seok Choi

    Abstract: Magnetic anisotropy in atomically thin correlated heterostructures is essential for exploring quantum magnetic phases for next-generation spintronics. Whereas previous studies have mostly focused on van der Waals systems, here, we investigate the impact of dimensionality of epitaxially-grown correlated oxides down to the monolayer limit on structural, magnetic, and orbital anisotropies. By designi…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 23 pages

    Journal ref: published 2024

  7. arXiv:2406.02013  [pdf, other]

    cs.LG

    Mamba as Decision Maker: Exploring Multi-scale Sequence Modeling in Offline Reinforcement Learning

    Authors: Jiahang Cao, Qiang Zhang, Ziqing Wang, Jiaxu Wang, Hao Cheng, Yecheng Shao, Wen Zhao, Gang Han, Yijie Guo, Renjing Xu

    Abstract: Sequential modeling has demonstrated remarkable capabilities in offline reinforcement learning (RL), with Decision Transformer (DT) being one of the most notable representatives, achieving significant success. However, RL trajectories possess unique properties to be distinguished from the conventional sequence (e.g., text or audio): (1) local correlation, where the next states in RL are theoretica…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 16 pages, 5 figures

  8. arXiv:2405.18405  [pdf, other]

    cs.CV cs.AI

    WIDIn: Wording Image for Domain-Invariant Representation in Single-Source Domain Generalization

    Authors: Jiawei Ma, Yulei Niu, Shiyuan Huang, Guangxing Han, Shih-Fu Chang

    Abstract: Language has been useful in extending the vision encoder to data from diverse distributions without empirical discovery in training domains. However, as the image description is mostly at coarse-grained level and ignores visual details, the resulted embeddings are still ineffective in overcoming complexity of domains at inference time. We present a self-supervision framework WIDIn, Wording Images…

    Submitted 28 May, 2024; originally announced May 2024.

  9. arXiv:2405.07052  [pdf, other]

    cs.CL

    Length-Aware Multi-Kernel Transformer for Long Document Classification

    Authors: Guangzeng Han, Jack Tsao, Xiaolei Huang

    Abstract: Lengthy documents pose a unique challenge to neural language models due to substantial memory consumption. While existing state-of-the-art (SOTA) models segment long texts into equal-length snippets (e.g., 128 tokens per snippet) or deploy sparse attention networks, these methods have new challenges of context fragmentation and generalizability due to sentence boundaries and varying text lengths.…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: Accepted to SEM 2024

  10. arXiv:2405.06983  [pdf, other]

    cs.NI

    ISAC-Assisted Wireless Rechargeable Sensor Networks with Multiple Mobile Charging Vehicles

    Authors: Muhammad Umar Farooq Qaisar, Weijie Yuan, Paolo Bellavista, Guangjie Han, Adeel Ahmed

    Abstract: As IoT-based wireless sensor networks (WSNs) become more prevalent, the issue of energy shortages becomes more pressing. One potential solution is the use of wireless power transfer (WPT) technology, which is the key to building a new shape of wireless rechargeable sensor networks (WRSNs). However, efficient charging and scheduling are critical for WRSNs to function properly. Motivated by the fact…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: Accepted for publication in the Special Issue Q1'2024, "Integrating Sensing and Communication for Ubiquitous Internet of Things," IEEE Internet of Things Magazine

  11. arXiv:2404.13654  [pdf, other]

    cs.MA

    Multi-AUV Cooperative Underwater Multi-Target Tracking Based on Dynamic-Switching-enabled Multi-Agent Reinforcement Learning

    Authors: Shengbo Wang, Chuan Lin, Guangjie Han, Shengchao Zhu, Zhixian Li, Zhenyu Wang

    Abstract: With the rapid development of underwater communication, sensing, automation, robot technologies, autonomous underwater vehicle (AUV) swarms are gradually becoming popular and have been widely promoted in ocean exploration and underwater tracking or surveillance, etc. However, the complex underwater environment poses significant challenges for AUV swarm-based accurate tracking for the underwater mo…

    Submitted 22 April, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

  12. arXiv:2404.07950  [pdf, other]

    cs.CV cs.AI cs.LG

    Reinforcement Learning with Generalizable Gaussian Splatting

    Authors: Jiaxu Wang, Qiang Zhang, Jingkai Sun, Jiahang Cao, Gang Han, Wen Zhao, Weining Zhang, Yecheng Shao, Yijie Guo, Renjing Xu

    Abstract: An excellent representation is crucial for reinforcement learning (RL) performance, especially in vision-based reinforcement learning tasks. The quality of the environment representation directly influences the achievement of the learning task. Previous vision-based RL typically uses explicit or implicit ways to represent environments, such as images, points, voxels, and neural radiance fields. Ho…

    Submitted 5 August, 2024; v1 submitted 18 March, 2024; originally announced April 2024.

    Comments: 7 pages, 2 figures

  13. arXiv:2404.04863  [pdf]

    cond-mat.mtrl-sci

    Microscopic Insights into Fatigue Mechanism in Wurtzite Ferroelectric Al$_{0.65}$Sc$_{0.35}$N: Oxygen Infiltration Enabled Grain Amorphization Spanning Boundary to Bulk

    Authors: Ruiqing Wang, Danyang Yao, Jiuren Zhou, Yang Li, Zhi Jiang, Dongliang Chen, Xu Ran, Yu Gao, Zixuan Cheng, Yong Wang, Yan Liu, Yue Hao, Genquan Han

    Abstract: For the first time, the fatigue behavior involving external oxygen in highly Sc-doped AlN ferroelectric film was observed using transmission electron microscope techniques. Despite increasing the Sc composition in AlScN film contributes to reducing the device operation voltage, the inherent affinity of Sc for oxygen introduces instability in device performance. In this study, oxygen incorporation…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: 2 Pages, 7 figures

  14. arXiv:2404.04656  [pdf, other]

    cs.LG cs.AI cs.CL

    Binary Classifier Optimization for Large Language Model Alignment

    Authors: Seungjae Jung, Gunsoo Han, Daniel Wontae Nam, Kyoung-Woon On

    Abstract: Aligning Large Language Models (LLMs) to human preferences through preference optimization has been crucial but labor-intensive, necessitating for each prompt a comparison of both a chosen and a rejected text completion by evaluators. Recently, Kahneman-Tversky Optimization (KTO) has demonstrated that LLMs can be aligned using merely binary "thumbs-up" or "thumbs-down" signals on each prompt-compl…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: 18 pages, 9 figures

  15. arXiv:2404.04480  [pdf]

    cond-mat.mtrl-sci

    Possible charge density wave induced lattice distortion in ferromagnetic FeGe film

    Authors: Guangdong Nie, Guanghui Han, Erfa S. Z., Shijian Chen, Hao Ding, Fangdong Tang, Licong Peng, Young Sun, Deshun Hong

    Abstract: Binary compound FeGe hosts multiple structures, where skyrmion lattice emerges in the chiral B20 phase and antiferromagnet with charge density wave shows up in the hexagonal phase. Here, we synthesized monoclinic FeGe films which are ferromagnetic with Curie temperature as high as 800 K. By low temperature transmission electron microscope, lattice reconstructions in both real and reciprocal space…

    Submitted 5 April, 2024; originally announced April 2024.

  16. arXiv:2404.02838  [pdf, other]

    cs.AI

    I-Design: Personalized LLM Interior Designer

    Authors: Ata Çelen, Guo Han, Konrad Schindler, Luc Van Gool, Iro Armeni, Anton Obukhov, Xi Wang

    Abstract: Interior design allows us to be who we are and live how we want - each design is as unique as our distinct personality. However, it is not trivial for non-professionals to express and materialize this since it requires aligning functional and visual expectations with the constraints of physical space; this renders interior design a luxury. To make it more accessible, we present I-Design, a persona…

    Submitted 3 April, 2024; originally announced April 2024.

  17. arXiv:2403.13786  [pdf, other]

    cs.CL

    Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts

    Authors: Guangzeng Han, Weisi Liu, Xiaolei Huang, Brian Borsari

    Abstract: Automatic coding patient behaviors is essential to support decision making for psychotherapists during the motivational interviewing (MI), a collaborative communication intervention approach to address psychiatric issues, such as alcohol and drug addiction. While the behavior coding task has rapidly adapted machine learning to predict patient states during the MI sessions, lacking of domain-specif…

    Submitted 23 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: Accepted to IEEE ICHI 2024

  18. arXiv:2403.10945  [pdf, other]

    stat.ME stat.AP

    Zero-Inflated Stochastic Volatility Model for Disaggregated Inflation Data with Exact Zeros

    Authors: Geonhee Han, Kaoru Irie

    Abstract: The disaggregated time-series data for Consumer Price Index often exhibits frequent instances of exact zero price changes, stemming from measurement errors inherent in the data collection process. However, the currently prominent stochastic volatility model of trend inflation is designed for aggregate measures of price inflation, where exact zero price changes rarely occur. We propose a zero-infla…

    Submitted 16 March, 2024; originally announced March 2024.

  19. arXiv:2403.10492  [pdf, other]

    cs.CV

    Mitigating Dialogue Hallucination for Large Vision Language Models via Adversarial Instruction Tuning

    Authors: Dongmin Park, Zhaofang Qian, Guangxing Han, Ser-Nam Lim

    Abstract: Mitigating hallucinations of Large Vision Language Models (LVLMs) is crucial to enhance their reliability for general-purpose assistants. This paper shows that such hallucinations of LVLMs can be significantly exacerbated by preceding user-system dialogues. To precisely measure this, we first present an evaluation benchmark by extending popular multi-modal benchmark datasets with prepended halluci…

    Submitted 25 May, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

  20. arXiv:2402.18294  [pdf, other]

    cs.RO

    Whole-body Humanoid Robot Locomotion with Human Reference

    Authors: Qiang Zhang, Peter Cui, David Yan, Jingkai Sun, Yiqun Duan, Gang Han, Wen Zhao, Weining Zhang, Yijie Guo, Arthur Zhang, Renjing Xu

    Abstract: Recently, humanoid robots have made significant advances in their ability to perform challenging tasks due to the deployment of Reinforcement Learning (RL), however, the inherent complexity of humanoid robots, including the difficulty of designing complicated reward functions and training entire sophisticated systems, still poses a notable challenge. To conquer these challenges, after many iterati…

    Submitted 26 August, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: 7 pages, 7 figures

  21. arXiv:2402.17774  [pdf]

    physics.med-ph physics.bio-ph q-bio.QM

    A paper-based multiplexed serological test to monitor immunity against SARS-CoV-2 using machine learning

    Authors: Merve Eryilmaz, Artem Goncharov, Gyeo-Re Han, Hyou-Arm Joung, Zachary S. Ballard, Rajesh Ghosh, Yijie Zhang, Dino Di Carlo, Aydogan Ozcan

    Abstract: The rapid spread of SARS-CoV-2 caused the COVID-19 pandemic and accelerated vaccine development to prevent the spread of the virus and control the disease. Given the sustained high infectivity and evolution of SARS-CoV-2, there is an ongoing interest in developing COVID-19 serology tests to monitor population-level immunity. To address this critical need, we designed a paper-based multiplexed vert…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 19 Pages, 4 Figures

    Journal ref: ACS Nano (2024)

  22. arXiv:2402.11195  [pdf]

    physics.med-ph physics.app-ph physics.bio-ph

    Deep learning-enhanced paper-based vertical flow assay for high-sensitivity troponin detection using nanoparticle amplification

    Authors: Gyeo-Re Han, Artem Goncharov, Merve Eryilmaz, Hyou-Arm Joung, Rajesh Ghosh, Geon Yim, Nicole Chang, Minsoo Kim, Kevin Ngo, Marcell Veszpremi, Kun Liao, Omai B. Garner, Dino Di Carlo, Aydogan Ozcan

    Abstract: Successful integration of point-of-care testing (POCT) into clinical settings requires improved assay sensitivity and precision to match laboratory standards. Here, we show how innovations in amplified biosensing, imaging, and data processing, coupled with deep learning, can help improve POCT. To demonstrate the performance of our approach, we present a rapid and cost-effective paper-based high-se…

    Submitted 17 February, 2024; originally announced February 2024.

    Comments: 23 Pages, 4 Figures, 1 Table

  23. arXiv:2402.10873  [pdf, ps, other]

    cs.NI eess.SP

    Probabilistic On-Demand Charging Scheduling for ISAC-Assisted WRSNs with Multiple Mobile Charging Vehicles

    Authors: Muhammad Umar Farooq Qaisar, Weijie Yuan, Paolo Bellavista, Guangjie Han, Rabiu Sale Zakariyya, Adeel Ahmed

    Abstract: The internet of things (IoT) based wireless sensor networks (WSNs) face an energy shortage challenge that could be overcome by the novel wireless power transfer (WPT) technology. The combination of WSNs and WPT is known as wireless rechargeable sensor networks (WRSNs), with the charging efficiency and charging scheduling being the primary concerns. Therefore, this paper proposes a probabilistic on…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Accepted for publication at the IEEE Global Communications Conference (GLOBECOM) 2023

  24. arXiv:2401.16941  [pdf, ps, other]

    math.RA

    Deformed Laurent series rings and completions of the Weyl division ring

    Authors: Gang Han, Yulin Chen, Zhennan Pan

    Abstract: Let $L((T^{-1}))$ be the space of (inverse) Laurent series with coefficients in some field $L$. It has a standard degree map and the induced topology. With its usual addition and a new product on this space which is continuous and preserves the standard degree map, it will be a complete topological division ring, and called a deformed Laurent series ring. Under mild restrictions, we give the neces…

    Submitted 30 January, 2024; originally announced January 2024.

  25. arXiv:2401.08121  [pdf, other]

    cs.LG cs.AI eess.SY

    CycLight: learning traffic signal cooperation with a cycle-level strategy

    Authors: Gengyue Han, Xiaohan Liu, Xianyue Peng, Hao Wang, Yu Han

    Abstract: This study introduces CycLight, a novel cycle-level deep reinforcement learning (RL) approach for network-level adaptive traffic signal control (NATSC) systems. Unlike most traditional RL-based traffic controllers that focus on step-by-step decision making, CycLight adopts a cycle-level strategy, optimizing cycle length and splits simultaneously using Parameterized Deep Q-Networks (PDQN) algorithm…

    Submitted 16 January, 2024; originally announced January 2024.

  26. arXiv:2312.12423  [pdf, other]

    cs.CV cs.AI

    Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model

    Authors: Shraman Pramanick, Guangxing Han, Rui Hou, Sayan Nag, Ser-Nam Lim, Nicolas Ballas, Qifan Wang, Rama Chellappa, Amjad Almahairi

    Abstract: The ability of large language models (LLMs) to process visual inputs has given rise to general-purpose vision systems, unifying various vision-language (VL) tasks by instruction tuning. However, due to the enormous diversity in input-output formats in the vision domain, existing general-purpose models fail to successfully integrate segmentation and multi-image inputs with coarse-level tasks into a…

    Submitted 19 June, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: CVPR 2024 Highlight

  27. arXiv:2312.12227  [pdf, other]

    cs.CV cs.AI

    HuTuMotion: Human-Tuned Navigation of Latent Motion Diffusion Models with Minimal Feedback

    Authors: Gaoge Han, Shaoli Huang, Mingming Gong, Jinglei Tang

    Abstract: We introduce HuTuMotion, an innovative approach for generating natural human motions that navigates latent motion diffusion models by leveraging few-shot human feedback. Unlike existing approaches that sample latent variables from a standard normal prior distribution, our method adapts the prior distribution to better suit the characteristics of the data, as indicated by human feedback, thus enhan…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024 Main Track

  28. arXiv:2311.15013  [pdf, ps, other]

    math.CO math.NT

    Inequalities and asymptotics for hook numbers in restricted partitions

    Authors: William Craig, Madeline Locus Dawsey, Guo-Niu Han

    Abstract: In this paper, we consider the asymptotic properties of hook numbers of partitions in restricted classes. More specifically, we compare the frequency with which partitions into odd parts and partitions into distinct parts have hook numbers equal to $h \geq 1$ by deriving an asymptotic formula for the total number of hooks equal to $h$ that appear among partitions into odd and distinct parts, respe…

    Submitted 25 November, 2023; originally announced November 2023.

  29. arXiv:2311.14264  [pdf, ps, other]

    eess.SP

    An ADMM-Based Geometric Configuration Optimization in RSSD-Based Source Localization By UAVs with Spread Angle Constraint

    Authors: Xin Cheng, Guangjie Han, Jinlin Peng, Jinfang Jiang, Yu He, Weiqiang Zhu, Feng Shu, Jiangzhou Wang

    Abstract: Deploying multiple unmanned aerial vehicles (UAVs) to locate a signal-emitting source covers a wide range of military and civilian applications like rescue and target tracking. It is well known that the UAVs-source (sensors-target) geometry, namely geometric configuration, significantly affects the final localization accuracy. This paper focuses on the geometric configuration optimization for rece…

    Submitted 17 July, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

  30. arXiv:2311.01018  [pdf, other]

    cs.CV

    Expanding Expressiveness of Diffusion Models with Limited Data via Self-Distillation based Fine-Tuning

    Authors: Jiwan Hur, Jaehyun Choi, Gyojin Han, Dong-Jae Lee, Junmo Kim

    Abstract: Training diffusion models on limited datasets poses challenges in terms of limited generation capacity and expressiveness, leading to unsatisfactory results in various downstream tasks utilizing pretrained diffusion models, such as domain translation and text-guided image manipulation. In this paper, we propose Self-Distillation for Fine-Tuning diffusion models (SDFT), a methodology to address the…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: WACV 2024

  31. arXiv:2310.10856  [pdf]

    eess.SY cs.LG cs.MA

    Joint Optimization of Traffic Signal Control and Vehicle Routing in Signalized Road Networks using Multi-Agent Deep Reinforcement Learning

    Authors: Xianyue Peng, Hang Gao, Gengyue Han, Hao Wang, Michael Zhang

    Abstract: Urban traffic congestion is a critical predicament that plagues modern road networks. To alleviate this issue and enhance traffic efficiency, traffic signal control and vehicle routing have proven to be effective measures. In this paper, we propose a joint optimization approach for traffic signal control and vehicle routing in signalized road networks. The objective is to enhance network performan…

    Submitted 16 October, 2023; originally announced October 2023.

  32. arXiv:2310.06404  [pdf, other]

    cs.CL cs.AI cs.LG

    Hexa: Self-Improving for Knowledge-Grounded Dialogue System

    Authors: Daejin Jo, Daniel Wontae Nam, Gunsoo Han, Kyoung-Woon On, Taehwan Kwon, Seungeun Rho, Sungwoong Kim

    Abstract: A common practice in knowledge-grounded dialogue generation is to explicitly utilize intermediate steps (e.g., web-search, memory retrieval) with modular approaches. However, data for such steps are often inaccessible compared to those of dialogue responses as they are unobservable in an ordinary dialogue. To fill in the absence of these data, we develop a self-improving method to improve the gene…

    Submitted 2 April, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  33. arXiv:2309.03509  [pdf, other]

    cs.CV

    BroadCAM: Outcome-agnostic Class Activation Mapping for Small-scale Weakly Supervised Applications

    Authors: Jiatai Lin, Guoqiang Han, Xuemiao Xu, Changhong Liang, Tien-Tsin Wong, C. L. Philip Chen, Zaiyi Liu, Chu Han

    Abstract: Class activation mapping (CAM), a visualization technique for interpreting deep learning models, is now commonly used for weakly supervised semantic segmentation (WSSS) and object localization (WSOL). It is the weighted aggregation of the feature maps by activating the high class-relevance ones. Current CAM methods achieve it relying on the training outcomes, such as predicted scores (forward info…

    Submitted 7 September, 2023; originally announced September 2023.

  34. arXiv:2308.00783  [pdf, other]

    cs.CV

    Hybrid-SORT: Weak Cues Matter for Online Multi-Object Tracking

    Authors: Mingzhan Yang, Guangxin Han, Bin Yan, Wenhua Zhang, Jinqing Qi, Huchuan Lu, Dong Wang

    Abstract: Multi-Object Tracking (MOT) aims to detect and associate all desired objects across frames. Most methods accomplish the task by explicitly or implicitly leveraging strong cues (i.e., spatial and appearance information), which exhibit powerful instance-level discrimination. However, when object occlusion and clustering occur, spatial and appearance information will become ambiguous simultaneously d…

    Submitted 20 January, 2024; v1 submitted 1 August, 2023; originally announced August 2023.

    Comments: Accepted to AAAI 2024

  35. arXiv:2307.08671  [pdf, other]

    cs.CR cs.AI

    Deep Cross-Modal Steganography Using Neural Representations

    Authors: Gyojin Han, Dong-Jae Lee, Jiwan Hur, Jaehyun Choi, Junmo Kim

    Abstract: Steganography is the process of embedding secret data into another message or data, in such a way that it is not easily noticeable. With the advancement of deep learning, Deep Neural Networks (DNNs) have recently been utilized in steganography. However, existing deep steganography techniques are limited in scope, as they focus on specific data types and are not effective for cross-modal steganogra…

    Submitted 7 October, 2023; v1 submitted 2 July, 2023; originally announced July 2023.

    Comments: ICIP 2023 Oral

  36. arXiv:2307.05889  [pdf, other]

    cs.CV

    Rethinking Mitosis Detection: Towards Diverse Data and Feature Representation

    Authors: Hao Wang, Jiatai Lin, Danyi Li, Jing Wang, Bingchao Zhao, Zhenwei Shi, Xipeng Pan, Huadeng Wang, Bingbing Li, Changhong Liang, Guoqiang Han, Li Liang, Chu Han, Zaiyi Liu

    Abstract: Mitosis detection is one of the fundamental tasks in computational pathology, which is extremely challenging due to the heterogeneity of mitotic cell. Most of the current studies solve the heterogeneity in the technical aspect by increasing the model complexity. However, lacking consideration of the biological knowledge and the complex model design may lead to the overfitting problem while limited…

    Submitted 11 July, 2023; originally announced July 2023.

  37. M3PT: A Multi-Modal Model for POI Tagging

    Authors: Jingsong Yang, Guanzhou Han, Deqing Yang, Jingping Liu, Yanghua Xiao, Xiang Xu, Baohua Wu, Shenghua Ni

    Abstract: POI tagging aims to annotate a point of interest (POI) with some informative tags, which facilitates many services related to POIs, including search, recommendation, and so on. Most of the existing solutions neglect the significance of POI images and seldom fuse the textual and visual features of POIs, resulting in suboptimal tagging performance. In this paper, we propose a novel Multi-Modal Model…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: Accepted by KDD 2023

    ACM Class: H.3.0

  38. arXiv:2306.07289  [pdf, other]

    cs.HC

    Multi-Interactive-Modality based Modeling for Myopia Pro-Gression of Adolescent Student

    Authors: Xiangyu Yan, Gongen Han, Can Fang, Xuan Jing

    Abstract: Myopia is a common visual disorder that affects millions of people worldwide and its prevalence has been increasing in recent years. Environmental factors, such as reading time, viewing distance, and ambient lighting, have been identified as potential factors in the development of myopia. In this study, we investigated the relationship between three major factors and myopia in 120 adolescents. By…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: 9 pages, 5 figures

  39. arXiv:2306.02393  [pdf, other]

    cs.RO cs.CV

    Accessible Robot Control in Mixed Reality

    Authors: Ganlin Zhang, Deheng Zhang, Longteng Duan, Guo Han

    Abstract: A novel method to control the Spot robot of Boston Dynamics by Hololens 2 is proposed. This method is mainly designed for people with physical disabilities, users can control the robot's movement and robot arm without using their hands. The eye gaze tracking and head motion tracking technologies of Hololens 2 are utilized for sending control commands. The movement of the robot would follow the eye…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: Course Project of Mixed Reality at ETH Zurich

  40. arXiv:2305.13973  [pdf, other]

    cs.CL

    Effortless Integration of Memory Management into Open-Domain Conversation Systems

    Authors: Eunbi Choi, Kyoung-Woon On, Gunsoo Han, Sungwoong Kim, Daniel Wontae Nam, Daejin Jo, Seung Eun Rho, Taehwan Kwon, Minjoon Seo

    Abstract: Open-domain conversation systems integrate multiple conversation skills into a single system through a modular approach. One of the limitations of the system, however, is the absence of management capability for external memory. In this paper, we propose a simple method to improve BlenderBot3 by integrating memory management ability into it. Since no training data exists for this purpose, we propo…

    Submitted 23 May, 2023; originally announced May 2023.

  41. arXiv:2304.07701  [pdf, ps, other]

    math.CO math.AC

    A Gröbner Basis Approach to Combinatorial Nullstellensatz

    Authors: Yang Xu, Haibin Kan, Guangyue Han

    Abstract: In this paper, using some conditions that arise naturally in Alon's combinatorial Nullstellensatz as well as its various extensions and generalizations, we characterize Gröbner bases consisting of monic polynomials, which helps us to establish a Nullstellensatz from a Gröbner basis perspective. As corollaries of this general Nullstellensatz, we establish four special Nullstellensatz, which, among…

    Submitted 16 April, 2023; originally announced April 2023.

  42. arXiv:2304.04625  [pdf, other]

    cs.LG cs.CR cs.CV

    Reinforcement Learning-Based Black-Box Model Inversion Attacks

    Authors: Gyojin Han, Jaehyun Choi, Haeil Lee, Junmo Kim

    Abstract: Model inversion attacks are a type of privacy attack that reconstructs private data used to train a machine learning model, solely by accessing the model. Recently, white-box model inversion attacks leveraging Generative Adversarial Networks (GANs) to distill knowledge from public datasets have been receiving great attention because of their excellent attack performance. On the other hand, current…

    Submitted 10 April, 2023; originally announced April 2023.

    Comments: CVPR 2023, Accepted

  43. arXiv:2303.15466  [pdf, other]

    cs.CV cs.AI

    Supervised Masked Knowledge Distillation for Few-Shot Transformers

    Authors: Han Lin, Guangxing Han, Jiawei Ma, Shiyuan Huang, Xudong Lin, Shih-Fu Chang

    Abstract: Vision Transformers (ViTs) emerge to achieve impressive performance on many data-abundant computer vision tasks by capturing long-range dependencies among local features. However, under few-shot learning (FSL) settings on small datasets with only a few labeled data, ViT tends to overfit and suffers from severe performance degradation due to its absence of CNN-alike inductive bias. Previous works i…

    Submitted 28 March, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

    Comments: To appear in CVPR 2023

  44. arXiv:2303.09674  [pdf, other]

    cs.CV cs.AI

    DiGeo: Discriminative Geometry-Aware Learning for Generalized Few-Shot Object Detection

    Authors: Jiawei Ma, Yulei Niu, Jincheng Xu, Shiyuan Huang, Guangxing Han, Shih-Fu Chang

    Abstract: Generalized few-shot object detection aims to achieve precise detection on both base classes with abundant annotations and novel classes with limited training data. Existing approaches enhance few-shot generalization with the sacrifice of base-class performance, or maintain high precision in base-class detection with limited improvement in novel-class adaptation. In this paper, we point out the re…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: CVPR 2023 Camera Ready (Supp Attached). Code Link: https://github.com/Phoenix-V/DiGeo

  45. arXiv:2303.07683  [pdf, other]

    q-bio.NC

    Recovering Arrhythmic EEG Transients from Their Stochastic Interference

    Authors: Javier Díaz, Hiroyasu Ando, GoEun Han, Olga Malyshevskaya, Xifang Hayashi, Juan-Carlos Letelier, Masashi Yanagisawa, Kaspar E. Vogt

    Abstract: Traditionally, the neuronal dynamics underlying electroencephalograms (EEG) have been understood as arising from \textit{rhythmic oscillators with varying degrees of synchronization}. This dominant metaphor employs frequency domain EEG analysis to identify the most prominent populations of neuronal current sources in terms of their frequency and spectral power. However, emerging perspectives on EE…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: Original research manuscript in PDF format, 46 pages long, with 13 figures and one table

  46. arXiv:2302.14139  [pdf, other]

    cs.LG cs.AI cs.SE

    Scalable End-to-End ML Platforms: from AutoML to Self-serve

    Authors: Igor L. Markov, Pavlos A. Apostolopoulos, Mia R. Garrard, Tanya Qie, Yin Huang, Tanvi Gupta, Anika Li, Cesar Cardoso, George Han, Ryan Maghsoudian, Norm Zhou

    Abstract: ML platforms help enable intelligent data-driven applications and maintain them with limited engineering effort. Upon sufficiently broad adoption, such platforms reach economies of scale that bring greater component reuse while improving efficiency of system development and maintenance. For an end-to-end ML platform with broad adoption, scaling relies on pervasive ML automation and system integrat…

    Submitted 3 March, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: 10 pages, 1 figure, 2 tables

  47. arXiv:2302.13073  [pdf, other]

    cs.IT

    Feedback Capacity of the Continuous-Time ARMA(1,1) Gaussian Channel

    Authors: Jun Su, Guangyue Han, Shlomo Shamai

    Abstract: We consider the continuous-time ARMA(1,1) Gaussian channel and derive its feedback capacity in closed form. More specifically, the channel is given by $\boldsymbol{y}(t) =\boldsymbol{x}(t) +\boldsymbol{z}(t)$, where the channel input $\{\boldsymbol{x}(t) \}$ satisfies average power constraint $P$ and the noise $\{\boldsymbol{z}(t)\}$ is a first-order {\em autoregressive moving average} (ARMA(1,1))…

    Submitted 10 April, 2024; v1 submitted 25 February, 2023; originally announced February 2023.

  48. arXiv:2302.12662  [pdf, other]

    eess.IV cs.CV

    FedDBL: Communication and Data Efficient Federated Deep-Broad Learning for Histopathological Tissue Classification

    Authors: Tianpeng Deng, Yanqi Huang, Guoqiang Han, Zhenwei Shi, Jiatai Lin, Qi Dou, Zaiyi Liu, Xiao-jing Guo, C. L. Philip Chen, Chu Han

    Abstract: Histopathological tissue classification is a fundamental task in computational pathology. Deep learning-based models have achieved superior performance but centralized training with data centralization suffers from the privacy leakage problem. Federated learning (FL) can safeguard privacy by keeping training samples locally, but existing FL-based frameworks require a large number of well-annotated…

    Submitted 17 December, 2023; v1 submitted 24 February, 2023; originally announced February 2023.

  49. arXiv:2301.10934  [pdf]

    physics.med-ph physics.app-ph physics.bio-ph

    Deep learning-enabled multiplexed point-of-care sensor using a paper-based fluorescence vertical flow assay

    Authors: Artem Goncharov, Hyou-Arm Joung, Rajesh Ghosh, Gyeo-Re Han, Zachary S. Ballard, Quinn Maloney, Alexandra Bell, Chew Tin Zar Aung, Omai B. Garner, Dino Di Carlo, Aydogan Ozcan

    Abstract: We demonstrate multiplexed computational sensing with a point-of-care serodiagnosis assay to simultaneously quantify three biomarkers of acute cardiac injury. This point-of-care sensor includes a paper-based fluorescence vertical flow assay (fxVFA) processed by a low-cost mobile reader, which quantifies the target biomarkers through trained neural networks, all within <15 min of test time using 50…

    Submitted 25 January, 2023; originally announced January 2023.

    Comments: 17 Pages, 6 Figures

    Journal ref: Small (2023)

  50. arXiv:2301.02044  [pdf, ps, other]

    eess.SP

    Simultaneously Transmitting and Reflecting (STAR) RIS Assisted Over-the-Air Computation Systems

    Authors: Xiongfei Zhai, Guojun Han, Yunlong Cai, Yuanwei Liu, Lajos Hanzo

    Abstract: The performance of over-the-air computation (AirComp) systems degrades due to the hostile channel conditions of wireless devices (WDs), which can be significantly improved by the employment of reconfigurable intelligent surfaces (RISs). However, the conventional RISs require that the WDs have to be located in the half-plane of the reflection space, which restricts their potential benefits. To addr…

    Submitted 5 January, 2023; originally announced January 2023.