Showing 1–50 of 55 results for author: Lin, T

Searching in archive eess.
  1. arXiv:2408.01702

    cs.IT eess.SP

    Beamforming for PIN Diode-Based IRS-Assisted Systems Under a Phase Shift-Dependent Power Consumption Model

    Authors: Qiucen Wu, Tian Lin, Xianghao Yu, Yu Zhu, Robert Schober

    Abstract: Intelligent reflecting surfaces (IRSs) have been regarded as a promising enabler for future wireless communication systems. In the literature, IRSs have been considered power-free or assumed to have constant power consumption. However, recent experimental results have shown that for positive-intrinsic-negative (PIN) diode-based IRSs, the power consumption dynamically changes with the phase shift c…

    Submitted 3 August, 2024; originally announced August 2024.

  2. arXiv:2407.06957

    eess.AS cs.CL cs.CY

    Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models

    Authors: Yi-Cheng Lin, Tzu-Quan Lin, Chih-Kai Yang, Ke-Han Lu, Wei-Chih Chen, Chun-Yi Kuan, Hung-yi Lee

    Abstract: Speech Integrated Large Language Models (SILLMs) combine large language models with speech perception to perform diverse tasks, from emotion recognition to speaker verification, demonstrating universal audio understanding capability. However, these models may amplify biases present in training data, potentially leading to biased access to information for marginalized groups. This work introduce…

    Submitted 9 July, 2024; originally announced July 2024.

  3. arXiv:2406.18361

    cs.CV cs.AI eess.IV

    Stable Diffusion Segmentation for Biomedical Images with Single-step Reverse Process

    Authors: Tianyu Lin, Zhiguang Chen, Zhonghao Yan, Weijiang Yu, Fudan Zheng

    Abstract: Diffusion models have demonstrated their effectiveness across various generative tasks. However, when applied to medical image segmentation, these models encounter several challenges, including significant resource and time requirements. They also necessitate a multi-step reverse process and multiple samples to produce reliable predictions. To address these challenges, we introduce the first laten…

    Submitted 9 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted at MICCAI 2024. Code and citation info see https://github.com/lin-tianyu/Stable-Diffusion-Seg

  4. arXiv:2406.18089

    cs.SD cs.MM eess.AS

    A Study on Synthesizing Expressive Violin Performances: Approaches and Comparisons

    Authors: Tzu-Yun Hung, Jui-Te Wu, Yu-Chia Kuo, Yo-Wei Hsiao, Ting-Wei Lin, Li Su

    Abstract: Expressive music synthesis (EMS) for violin performance is a challenging task due to the disagreement among music performers in the interpretation of expressive musical terms (EMTs), scarcity of labeled recordings, and limited generalization ability of the synthesis model. These challenges create trade-offs between model effectiveness, diversity of generated results, and controllability of the syn…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 15 pages, 2 figures, 3 tables

  5. arXiv:2406.16942

    eess.IV cs.AI cs.CV

    Enhancing Diagnostic Reliability of Foundation Model with Uncertainty Estimation in OCT Images

    Authors: Yuanyuan Peng, Aidi Lin, Meng Wang, Tian Lin, Ke Zou, Yinglin Cheng, Tingkun Shi, Xulong Liao, Lixia Feng, Zhen Liang, Xinjian Chen, Huazhu Fu, Haoyu Chen

    Abstract: Inability to express the confidence level and detect unseen classes has limited the clinical implementation of artificial intelligence in the real world. We developed a foundation model with uncertainty estimation (FMUE) to detect 11 retinal conditions on optical coherence tomography (OCT). In the internal test set, FMUE achieved a higher F1 score of 96.76% than two state-of-the-art algorithms, RE…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: All codes are available at https://github.com/yuanyuanpeng0129/FMUE

  6. arXiv:2406.13977

    eess.IV cs.CV

    Similarity-aware Syncretic Latent Diffusion Model for Medical Image Translation with Representation Learning

    Authors: Tingyi Lin, Pengju Lyu, Jie Zhang, Yuqing Wang, Cheng Wang, Jianjun Zhu

    Abstract: Non-contrast CT (NCCT) imaging may reduce image contrast and anatomical visibility, potentially increasing diagnostic uncertainty. In contrast, contrast-enhanced CT (CECT) facilitates the observation of regions of interest (ROI). Leading generative models, especially the conditional diffusion model, demonstrate remarkable capabilities in medical image modality transformation. Typical conditional d…

    Submitted 19 June, 2024; originally announced June 2024.

  7. arXiv:2406.09317

    eess.IV cs.CV

    Common and Rare Fundus Diseases Identification Using Vision-Language Foundation Model with Knowledge of Over 400 Diseases

    Authors: Meng Wang, Tian Lin, Aidi Lin, Kai Yu, Yuanyuan Peng, Lianyu Wang, Cheng Chen, Ke Zou, Huiyu Liang, Man Chen, Xue Yao, Meiqin Zhang, Binwei Huang, Chaoxin Zheng, Peixin Zhang, Wei Chen, Yilong Luo, Yifan Chen, Honghe Xia, Tingkun Shi, Qi Zhang, Jinming Guo, Xiaolin Chen, Jingcheng Wang, Yih Chung Tham, et al. (24 additional authors not shown)

    Abstract: Previous foundation models for retinal images were pre-trained with limited disease categories and knowledge base. Here we introduce RetiZero, a vision-language foundation model that leverages knowledge from over 400 fundus diseases. For RetiZero's pre-training, we compiled 341,896 fundus images paired with text descriptions, sourced from public datasets, ophthalmic literature, and online resources…

    Submitted 30 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  8. arXiv:2406.06375

    cs.SD cs.AI eess.AS

    MOSA: Music Motion with Semantic Annotation Dataset for Cross-Modal Music Processing

    Authors: Yu-Fen Huang, Nikki Moran, Simon Coleman, Jon Kelly, Shun-Hwa Wei, Po-Yin Chen, Yun-Hsin Huang, Tsung-Ping Chen, Yu-Chia Kuo, Yu-Chi Wei, Chih-Hsuan Li, Da-Yu Huang, Hsuan-Kai Kao, Ting-Wei Lin, Li Su

    Abstract: In cross-modal music processing, translation between visual, auditory, and semantic content opens up new possibilities as well as challenges. The construction of such a transformative scheme depends upon a benchmark corpus with a comprehensive data infrastructure. In particular, the assembly of a large-scale cross-modal dataset presents major challenges. In this paper, we present the MOSA (Music m…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024. 14 pages, 7 figures. Dataset is available on: https://github.com/yufenhuang/MOSA-Music-mOtion-and-Semantic-Annotation-dataset/tree/main and https://zenodo.org/records/11393449

  9. arXiv:2406.05464

    cs.SD cs.AI cs.LG eess.AS

    DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models

    Authors: Tzu-Quan Lin, Hung-yi Lee, Hao Tang

    Abstract: Self-supervised speech models have been shown to be useful for various tasks, but their large size limits the use in devices with low computing power and memory. In this work, we explore early exit, an approach for reducing latency by exiting the forward process of a network early. Most approaches of early exit need a separate early exit model for each task, with some even requiring fine-tuning of the…

    Submitted 29 August, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  10. arXiv:2406.04997

    eess.AS cs.LG

    On the social bias of speech self-supervised models

    Authors: Yi-Cheng Lin, Tzu-Quan Lin, Hsi-Che Lin, Andy T. Liu, Hung-yi Lee

    Abstract: Self-supervised learning (SSL) speech models have achieved remarkable performance in various tasks, yet the biased outcomes, especially affecting marginalized groups, raise significant concerns. Social bias refers to the phenomenon where algorithms potentially amplify disparate properties between social groups present in the data used for training. Bias in SSL models can perpetuate injustice by au…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  11. arXiv:2405.20693

    eess.IV cs.CV

    R$^2$-Gaussian: Rectifying Radiative Gaussian Splatting for Tomographic Reconstruction

    Authors: Ruyi Zha, Tao Jun Lin, Yuanhao Cai, Jiwen Cao, Yanhao Zhang, Hongdong Li

    Abstract: 3D Gaussian splatting (3DGS) has shown promising results in image rendering and surface reconstruction. However, its potential in volumetric reconstruction tasks, such as X-ray computed tomography, remains under-explored. This paper introduces R2-Gaussian, the first 3DGS-based framework for sparse-view tomographic reconstruction. By carefully deriving X-ray rasterization functions, we discover a p…

    Submitted 31 May, 2024; originally announced May 2024.

  12. arXiv:2405.18167

    eess.IV cs.CV

    Confidence-aware multi-modality learning for eye disease screening

    Authors: Ke Zou, Tian Lin, Zongbo Han, Meng Wang, Xuedong Yuan, Haoyu Chen, Changqing Zhang, Xiaojing Shen, Huazhu Fu

    Abstract: Multi-modal ophthalmic image classification plays a key role in diagnosing eye diseases, as it integrates information from different sources to complement their respective performances. However, recent improvements have mainly focused on accuracy, often neglecting the importance of confidence and robustness in predictions for diverse modalities. In this study, we propose a novel multi-modality evi…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 27 pages, 7 figures, 9 tables

  13. arXiv:2405.12847

    cs.IR cs.LG cs.MM cs.SD eess.AS

    A Dataset and Baselines for Measuring and Predicting the Music Piece Memorability

    Authors: Li-Yang Tseng, Tzu-Ling Lin, Hong-Han Shuai, Jen-Wei Huang, Wen-Whei Chang

    Abstract: Nowadays, humans are constantly exposed to music, whether through voluntary streaming services or incidental encounters during commercial breaks. Despite the abundance of music, certain pieces remain more memorable and often gain greater popularity. Inspired by this phenomenon, we focus on measuring and predicting music memorability. To achieve this, we collect a new music piece dataset with relia…

    Submitted 21 May, 2024; originally announced May 2024.

    Journal ref: Proceedings of the 24th International Society for Music Information Retrieval Conference, 174-181. Milan, Italy, November 5-9, 2023

  14. arXiv:2404.04482

    eess.SP

    Data-Driven Online Resource Allocation for User Experience Improvement in Mobile Edge Clouds

    Authors: Liqun Fu, Jingwen Tong, Tongtong Lin, Jun Zhang

    Abstract: As the cloud is pushed to the edge of the network, resource allocation for user experience improvement in mobile edge clouds (MEC) is increasingly important and faces multiple challenges. This paper studies quality of experience (QoE)-oriented resource allocation in MEC while considering user diversity, limited resources, and the complex relationship between allocated resources and user experience…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: This work was presented in part at IEEE ICC 2021

  15. arXiv:2403.16252

    cs.RO eess.SY

    Legged Robot State Estimation within Non-inertial Environments

    Authors: Zijian He, Sangli Teng, Tzu-Yuan Lin, Maani Ghaffari, Yan Gu

    Abstract: This paper investigates the robot state estimation problem within a non-inertial environment. The proposed state estimation approach relaxes the common assumption of static ground in the system modeling. The process and measurement models explicitly treat the movement of the non-inertial environments without requiring knowledge of its motion in the inertial frame or relying on GPS or sensing envir…

    Submitted 24 March, 2024; originally announced March 2024.

  16. arXiv:2403.14287

    cs.CV cs.AI eess.IV

    Enhancing Historical Image Retrieval with Compositional Cues

    Authors: Tingyu Lin, Robert Sablatnig

    Abstract: In analyzing vast amounts of digitally stored historical image data, existing content-based retrieval methods often overlook significant non-semantic information, limiting their effectiveness for flexible exploration across varied themes. To broaden the applicability of image retrieval methods for diverse purposes and uncover more general patterns, we innovatively introduce a crucial factor from c…

    Submitted 21 March, 2024; originally announced March 2024.

  17. arXiv:2403.09157

    eess.IV cs.CV

    VM-UNET-V2 Rethinking Vision Mamba UNet for Medical Image Segmentation

    Authors: Mingya Zhang, Yue Yu, Limei Gu, Tingsheng Lin, Xianping Tao

    Abstract: In the field of medical image segmentation, models based on both CNN and Transformer have been thoroughly investigated. However, CNNs have limited modeling capabilities for long-range dependencies, making it challenging to exploit the semantic information within images fully. On the other hand, the quadratic computational complexity poses a challenge for Transformers. Recently, State Space Models…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 12 pages, 4 figures

  18. arXiv:2401.02122

    cs.CL cs.SD eess.AS

    PEFT for Speech: Unveiling Optimal Placement, Merging Strategies, and Ensemble Techniques

    Authors: Tzu-Han Lin, How-Shing Wang, Hao-Yung Weng, Kuang-Chen Peng, Zih-Ching Chen, Hung-yi Lee

    Abstract: Parameter-Efficient Fine-Tuning (PEFT) is increasingly recognized as an effective method in speech processing. However, the optimal approach and the placement of PEFT methods remain inconclusive. Our study conducts extensive experiments to compare different PEFT methods and their layer-wise placement adapting Differentiable Architecture Search (DARTS). We also explore the use of ensemble learning…

    Submitted 7 February, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

    Comments: Accepted to ICASSP 2024 Self-supervision in Audio, Speech and Beyond (SASB) workshop

  19. arXiv:2312.09429

    eess.SP cs.LG

    Deep Learning-Enabled Swallowing Monitoring and Postoperative Recovery Biosensing System

    Authors: Chih-Ning Tsai, Pei-Wen Yang, Tzu-Yen Huang, Jung-Chih Chen, Hsin-Yi Tseng, Che-Wei Wu, Amrit Sarmah, Tzu-En Lin

    Abstract: This study introduces an innovative 3D printed dry electrode tailored for biosensing in postoperative recovery scenarios. Fabricated through a drop coating process, the electrode incorporates a novel 2D material.

    Submitted 24 November, 2023; originally announced December 2023.

    Comments: The abstract can't be uploaded fully

    MSC Class: NA ACM Class: A.0

  20. arXiv:2309.14324

    eess.AS cs.CL cs.LG cs.SD

    Towards General-Purpose Text-Instruction-Guided Voice Conversion

    Authors: Chun-Yi Kuan, Chen An Li, Tsu-Yuan Hsu, Tse-Yang Lin, Ho-Lam Chung, Kai-Wei Chang, Shuo-yiin Chang, Hung-yi Lee

    Abstract: This paper introduces a novel voice conversion (VC) model, guided by text instructions such as "articulate slowly with a deep tone" or "speak in a cheerful boyish voice". Unlike traditional methods that rely on reference utterances to determine the attributes of the converted speech, our model adds versatility and specificity to voice conversion. The proposed VC model is a neural codec language mo…

    Submitted 16 January, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: Accepted to ASRU 2023

  21. Synchro-Transient-Extracting Transform for the Analysis of Signals with Both Harmonic and Impulsive Components

    Authors: Yunlong Ma, Gang Yu, Tianran Lin, Qingtang Jiang

    Abstract: Time-frequency analysis (TFA) techniques play an important role in the field of machine fault diagnosis owing to their superiority in dealing with nonstationary signals. Synchroextracting transform (SET) and transient-extracting transform (TET) are two newly emerging techniques that can produce energy concentrated representation for nonstationary signals. However, SET and TET are only suitab…

    Submitted 7 February, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

  22. arXiv:2303.09790

    eess.IV cs.CV

    Reliable Multimodality Eye Disease Screening via Mixture of Student's t Distributions

    Authors: Ke Zou, Tian Lin, Xuedong Yuan, Haoyu Chen, Xiaojing Shen, Meng Wang, Huazhu Fu

    Abstract: Multimodality eye disease screening is crucial in ophthalmology as it integrates information from diverse sources to complement their respective performances. However, the existing methods are weak in assessing the reliability of each unimodality, and directly fusing an unreliable modality may cause screening errors. To address this issue, we introduce a novel multimodality evidential fusion pipel…

    Submitted 29 August, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

    Comments: MICCAI 2023 (Early accept): 11 pages, 4 figures

  23. arXiv:2302.08377

    cs.IT eess.SP

    Channel Estimation for BIOS-Assisted Multi-User MIMO Systems: A Heterogeneous Two-timescale Strategy

    Authors: Qiucen Wu, Tian Lin, Yu Zhu

    Abstract: Bilayer intelligent omni-surface (BIOS) has recently attracted increasing attention due to its capability of independent beamforming on both reflection and refraction sides. However, its specific bilayer structure makes the channel estimation problem more challenging than the conventional intelligent reflecting surface (IRS) or intelligent omni-surface (IOS). In this paper, we investigate the chan…

    Submitted 16 February, 2023; originally announced February 2023.

  24. arXiv:2211.09949

    cs.CL cs.LG cs.SD eess.AS

    Compressing Transformer-based self-supervised models for speech processing

    Authors: Tzu-Quan Lin, Tsung-Huan Yang, Chun-Yao Chang, Kuang-Ming Chen, Tzu-hsun Feng, Hung-yi Lee, Hao Tang

    Abstract: Despite the success of Transformers in self-supervised learning with applications to various downstream tasks, the computational cost of training and inference remains a major challenge for applying these models to a wide spectrum of devices. Several isolated attempts have been made to compress Transformers, but the settings and metrics are different across studies. Trade-off at various compressi…

    Submitted 26 January, 2024; v1 submitted 17 November, 2022; originally announced November 2022.

    Comments: Submitted to IEEE Transactions on Audio, Speech and Language Processing (TASLP)

  25. arXiv:2211.09944

    cs.CL cs.LG cs.SD eess.AS

    MelHuBERT: A simplified HuBERT on Mel spectrograms

    Authors: Tzu-Quan Lin, Hung-yi Lee, Hao Tang

    Abstract: Self-supervised models have had great success in learning speech representations that can generalize to various downstream tasks. However, most self-supervised models require a large amount of compute and multiple GPUs to train, significantly hampering the development of self-supervised learning. In an attempt to reduce the computation of training, we revisit the training of HuBERT, a highly succe…

    Submitted 29 August, 2024; v1 submitted 17 November, 2022; originally announced November 2022.

    Comments: ASRU 2023

  26. arXiv:2210.08634

    cs.CL cs.SD eess.AS

    SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning

    Authors: Tzu-hsun Feng, Annie Dong, Ching-Feng Yeh, Shu-wen Yang, Tzu-Quan Lin, Jiatong Shi, Kai-Wei Chang, Zili Huang, Haibin Wu, Xuankai Chang, Shinji Watanabe, Abdelrahman Mohamed, Shang-Wen Li, Hung-yi Lee

    Abstract: We present the SUPERB challenge at SLT 2022, which aims at learning self-supervised speech representation for better performance, generalization, and efficiency. The challenge builds upon the SUPERB benchmark and implements metrics to measure the computation requirements of self-supervised learning (SSL) representation and to evaluate its generalizability and performance across the diverse SUPERB…

    Submitted 29 October, 2022; v1 submitted 16 October, 2022; originally announced October 2022.

    Comments: Accepted by 2022 SLT Workshop

  27. arXiv:2210.07185

    cs.CL eess.AS

    On the Utility of Self-supervised Models for Prosody-related Tasks

    Authors: Guan-Ting Lin, Chi-Luen Feng, Wei-Ping Huang, Yuan Tseng, Tzu-Han Lin, Chen-An Li, Hung-yi Lee, Nigel G. Ward

    Abstract: Self-Supervised Learning (SSL) from speech data has produced models that have achieved remarkable performance in many tasks, and that are known to implicitly represent many aspects of information latently present in speech signals. However, relatively little is known about the suitability of such models for prosody-related tasks or the extent to which they encode prosodic information. We present a…

    Submitted 26 October, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: Accepted to IEEE SLT 2022

  28. arXiv:2209.15140

    cs.RO eess.SY

    Fully Proprioceptive Slip-Velocity-Aware State Estimation for Mobile Robots via Invariant Kalman Filtering and Disturbance Observer

    Authors: Xihang Yu, Sangli Teng, Theodor Chakhachiro, Wenzhe Tong, Tingjun Li, Tzu-Yuan Lin, Sarah Koehler, Manuel Ahumada, Jeffrey M. Walls, Maani Ghaffari

    Abstract: This paper develops a novel slip estimator using the invariant observer design theory and Disturbance Observer (DOB). The proposed state estimator for mobile robots is fully proprioceptive and combines data from an inertial measurement unit and body velocity within a Right Invariant Extended Kalman Filter (RI-EKF). By embedding the slip velocity into $\mathrm{SE}_3(3)$ matrix Lie group, the develo…

    Submitted 30 September, 2023; v1 submitted 29 September, 2022; originally announced September 2022.

    Comments: The work will be presented at IROS 2023. GitHub repository at https://github.com/UMich-CURLY/slip_detection_DOB. arXiv admin note: text overlap with arXiv:1805.10410 by other authors

  29. arXiv:2209.12900

    cs.SD cs.AI cs.CL eess.AS

    The Efficacy of Self-Supervised Speech Models for Audio Representations

    Authors: Tung-Yu Wu, Chen-An Li, Tzu-Han Lin, Tsu-Yuan Hsu, Hung-Yi Lee

    Abstract: Self-supervised learning (SSL) speech models, which can serve as powerful upstream models to extract meaningful speech representations, have achieved unprecedented success in speech representation learning. However, their effectiveness on non-speech datasets is relatively less explored. In this work, we propose an ensemble framework, with a combination of ensemble techniques, to fuse SSL speech mo…

    Submitted 31 January, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

    Comments: to appear in Proceedings of Machine Learning Research (PMLR): NeurIPS 2021 Competition Track

  30. arXiv:2208.04583

    eess.SP

    Two-Factor Biometric Verification with ECG: Two Cancelable Approaches

    Authors: Jui-Kun Chiu, Tzu-Yun Lin, Wei-Shen Hsu, Shun-Chi Wu

    Abstract: Biometric authentication relies on an individual's physiological or behavioral traits to verify their identity before granting access permission to a system or device without remembering anything. Although electrocardiograms (ECGs) have been considered a biometric trait, an ECG biometric recognition system that operates in verification mode is rarely considered. This study proposes two two-factor…

    Submitted 9 August, 2022; originally announced August 2022.

    Comments: Accepted by IEEE EMBC 2022, but withdrawn since the conference was physically held and going abroad was still restricted due to COVID-19 Pandemic

  31. arXiv:2208.03525

    eess.SY math.OC

    Stochastic MPC with Dual Control for Autonomous Driving with Multi-Modal Interaction-Aware Predictions

    Authors: Siddharth H. Nair, Vijay Govindarajan, Theresa Lin, Yan Wang, Eric H. Tseng, Francesco Borrelli

    Abstract: We propose a Stochastic MPC (SMPC) approach for autonomous driving which incorporates multi-modal, interaction-aware predictions of surrounding vehicles. For each mode, vehicle motion predictions are obtained by a control model described using a basis of fixed features with unknown weights. The proposed SMPC formulation finds optimal controls which serve two purposes: 1) reducing conservatism of…

    Submitted 6 August, 2022; originally announced August 2022.

    Comments: Accepted to AVEC'22

  32. arXiv:2207.00129

    cs.MA cs.CG cs.RO eess.SY math.OC

    Multi-Agent Shape Control with Optimal Transport

    Authors: Alex Tong Lin, Stanley J. Osher

    Abstract: We introduce a method called MASCOT (Multi-Agent Shape Control with Optimal Transport) to compute optimal control solutions of agents with shape/formation/density constraints. For example, we might want to apply shape constraints on the agents -- perhaps we desire the agents to hold a particular shape along the path, or we want agents to spread out in order to minimize collisions. We might also wa…

    Submitted 3 February, 2023; v1 submitted 30 June, 2022; originally announced July 2022.

    Comments: Fixed expressions for g_shape and L_shape in section 4.1, 4.2, 5.2, and 5.3

  33. arXiv:2204.08478

    eess.IV cs.CV

    Enhancing Non-mass Breast Ultrasound Cancer Classification With Knowledge Transfer

    Authors: Yangrun Hu, Yuanfan Guo, Fan Zhang, Mingda Wang, Tiancheng Lin, Rong Wu, Yi Xu

    Abstract: Much progress has been made in the deep neural network (DNN) based diagnosis of mass lesions in breast ultrasound (BUS) images. However, the non-mass lesion is less investigated because of the limited data. Based on the insight that mass data is sufficient and shares the same knowledge structure with non-mass data of identifying the malignancy of a lesion based on the ultrasound image, we propose a n…

    Submitted 18 April, 2022; originally announced April 2022.

    Comments: 4 pages. Accepted by ISBI 2022

  34. arXiv:2204.08477

    eess.IV cs.CV

    Self Supervised Lesion Recognition For Breast Ultrasound Diagnosis

    Authors: Yuanfan Guo, Canqian Yang, Tiancheng Lin, Chunxiao Li, Rui Zhang, Yi Xu

    Abstract: Previous deep learning based Computer Aided Diagnosis (CAD) systems treat multiple views of the same lesion as independent images. Since an ultrasound image only describes a partial 2D projection of a 3D lesion, such a paradigm ignores the semantic relationship between different views of a lesion, which is inconsistent with the traditional diagnosis where sonographers analyze a lesion from at least…

    Submitted 18 April, 2022; originally announced April 2022.

    Comments: 4 pages. Accepted by ISBI 2022

  35. arXiv:2203.06269

    cs.LG cs.CE eess.SY math.NA

    Parameter Inference of Time Series by Delay Embeddings and Learning Differentiable Operators

    Authors: Alex Tong Lin, Adrian S. Wong, Robert Martin, Stanley J. Osher, Daniel Eckhardt

    Abstract: We provide a method to identify system parameters of dynamical systems, called ID-ODE -- Inference by Differentiation and Observing Delay Embeddings. In this setting, we are given a dataset of trajectories from a dynamical system with system parameter labels. Our goal is to identify system parameters of new trajectories. The given trajectories may or may not encompass the full state of the system,…

    Submitted 16 November, 2022; v1 submitted 11 March, 2022; originally announced March 2022.

  36. arXiv:2203.00911

    eess.IV cs.CV

    Towards Bidirectional Arbitrary Image Rescaling: Joint Optimization and Cycle Idempotence

    Authors: Zhihong Pan, Baopu Li, Dongliang He, Mingde Yao, Wenhao Wu, Tianwei Lin, Xin Li, Errui Ding

    Abstract: Deep learning based single image super-resolution models have been widely studied and superb results are achieved in upscaling low-resolution images with fixed scale factor and downscaling degradation kernel. To improve real world applicability of such models, there is growing interest in developing models optimized for arbitrary upscaling factors. Our proposed method is the first to treat arbitrar…

    Submitted 7 March, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

    Comments: To appear at CVPR 2022

  37. arXiv:2110.01744

    eess.SY

    BeamSurfer: Minimalist Beam Management of Mobile mm-Wave Devices

    Authors: Santosh Ganji, Tzu-Hsiang Lin, Francisco A. Espinal, P. R. Kumar

    Abstract: Management of narrow directional beams is critical for mm-wave communication systems. Translational or rotational motion of the user can cause misalignment of transmit and receive beams with the base station losing track of the mobile. Reacquiring the user can take about one second in 5G New Radio systems and significantly impair performance of applications, besides being energy intensive. It is th…

    Submitted 4 October, 2021; originally announced October 2021.

  38. arXiv:2109.09792

    eess.SY cs.RO

    Stochastic MPC with Multi-modal Predictions for Traffic Intersections

    Authors: Siddharth H. Nair, Vijay Govindarajan, Theresa Lin, Chris Meissen, H. Eric Tseng, Francesco Borrelli

    Abstract: We propose a Stochastic MPC (SMPC) formulation for autonomous driving at traffic intersections which incorporates multi-modal predictions of surrounding vehicles for collision avoidance constraints. The multi-modal predictions are obtained with Gaussian Mixture Models (GMM) and constraints are formulated as chance-constraints. Our main theoretical contribution is a SMPC formulation that optimizes…

    Submitted 25 February, 2022; v1 submitted 20 September, 2021; originally announced September 2021.

    Comments: Extended version of ITSC 2022 submission

  39. arXiv:2107.11605

    cs.IT eess.SP

    Channel Estimation for IRS-Assisted Millimeter-Wave MIMO Systems: Sparsity-Inspired Approaches

    Authors: Tian Lin, Xianghao Yu, Yu Zhu, Robert Schober

    Abstract: Due to their ability to create favorable line-of-sight (LoS) propagation environments, intelligent reflecting surfaces (IRSs) are regarded as promising enablers for future millimeter-wave (mm-wave) wireless communication. In this paper, we investigate channel estimation for IRS-assisted mm-wave multiple-input multiple-output (MIMO) wireless systems. By leveraging the sparsity of mm-…

    Submitted 24 July, 2021; originally announced July 2021.

  40. arXiv:2107.08335

    eess.SY cs.NI eess.SP

    Silent Tracker: In-band Beam Management for Soft Handover for mm-Wave Networks

    Authors: Santosh Ganji, Tzu-Hsiang Lin, Jaewon Kim, P. R. Kumar

    Abstract: In mm-wave networks, cell sizes are small due to high path and penetration losses. Mobiles need to frequently switch softly from one cell to another to preserve network connections and context. Each soft handover involves the mobile performing directional neighbor cell search, tracking cell beam, completing cell access request, and finally, context switching. The mobile must independently discover…

    Submitted 17 July, 2021; originally announced July 2021.

  41. arXiv:2104.05376  [pdf, other]

    cs.CV eess.IV

    Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer

    Authors: Tianwei Lin, Zhuoqi Ma, Fu Li, Dongliang He, Xin Li, Errui Ding, Nannan Wang, Jie Li, Xinbo Gao

    Abstract: Artistic style transfer aims at migrating the style from an example image to a content image. Currently, optimization-based methods have achieved great stylization quality, but expensive time cost restricts their practical applications. Meanwhile, feed-forward methods still fail to synthesize complex style, especially when holistic global and local patterns exist. Inspired by the common painting p…

    Submitted 17 April, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

    Comments: Accepted by CVPR 2021. Codes will be released soon on https://github.com/PaddlePaddle/PaddleGAN/

  42. arXiv:2104.02658  [pdf, other]

    cs.NI eess.SP eess.SY

    UNBLOCK: Low Complexity Transient Blockage Recovery for Mobile mm-Wave Devices

    Authors: Santosh Ganji, Tzu-Hsiang Lin, Francisco A. Espinal, P. R. Kumar

    Abstract: Directional radio beams are used in the mm-Wave band to combat the high path loss. The mm-Wave band also suffers from high penetration losses from drywall, wood, glass, concrete, etc., and also the human body. Hence, as a mobile user moves, the Line of Sight (LoS) path between the mobile and the Base Station (BS) can be blocked by objects interposed in the path, causing loss of the link. A mobile…

    Submitted 23 February, 2021; originally announced April 2021.

  43. Downlink SCMA Codebook Design with Low Error Rate by Maximizing Minimum Euclidean Distance of Superimposed Codewords

    Authors: Chinwei Huang, Borching Su, Tingyi Lin, Yenming Huang

    Abstract: Sparse code multiple access (SCMA), as a codebook-based non-orthogonal multiple access (NOMA) technique, has received research attention in recent years. The codebook design problem for SCMA has also been studied to some extent since codebook choices are highly related to the system's error rate performance. In this paper, we approach the SCMA codebook design problem by formulating an optimization…

    Submitted 1 May, 2022; v1 submitted 9 January, 2021; originally announced January 2021.

    Comments: 15 pages, 12 figures. This version is accepted to IEEE Transactions on Vehicular Technology, and the copyright is transferred to IEEE

  44. Blind Monaural Source Separation on Heart and Lung Sounds Based on Periodic-Coded Deep Autoencoder

    Authors: Kun-Hsi Tsai, Wei-Chien Wang, Chui-Hsuan Cheng, Chan-Yen Tsai, Jou-Kou Wang, Tzu-Hao Lin, Shih-Hau Fang, Li-Chin Chen, Yu Tsao

    Abstract: Auscultation is the most efficient way to diagnose cardiovascular and respiratory diseases. To reach accurate diagnoses, a device must be able to recognize heart and lung sounds from various clinical situations. However, the recorded chest sounds are mixed by heart and lung sounds. Thus, effectively separating these two sounds is critical in the pre-processing stage. Recent advances in machine lea…

    Submitted 11 December, 2020; originally announced December 2020.

    Comments: 13 pages, 11 figures, Accepted by IEEE Journal of Biomedical and Health Informatics

  45. arXiv:2007.13034  [pdf, other]

    cs.CV cs.LG eess.IV

    Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve

    Authors: Weicheng Kuo, Anelia Angelova, Tsung-Yi Lin, Angela Dai

    Abstract: Object recognition has seen significant progress in the image domain, with focus primarily on 2D perception. We propose to leverage existing large-scale datasets of 3D models to understand the underlying 3D structure of objects seen in an image by constructing a CAD-based representation of the objects and their poses. We present Mask2CAD, which jointly detects objects in real-world images and for…

    Submitted 25 July, 2020; originally announced July 2020.

    Comments: ECCV 2020 (Spotlight)

  46. arXiv:2005.04720  [pdf, ps, other]

    cs.IT eess.SP

    Channel Estimation for Intelligent Reflecting Surface-Assisted Millimeter Wave MIMO Systems

    Authors: Tian Lin, Xianghao Yu, Yu Zhu, Robert Schober

    Abstract: Intelligent reflecting surfaces (IRSs) are regarded as promising enablers for future millimeter wave (mmWave) wireless communication, due to their ability to create favorable line-of-sight (LoS) propagation environments. In this paper, we investigate channel estimation in downlink IRS-assisted mmWave multiple-input multiple-output (MIMO) systems. By leveraging the sparsity of mmWave channels, we f…

    Submitted 10 May, 2020; originally announced May 2020.

  47. arXiv:2005.04703  [pdf, other]

    eess.IV cs.CV cs.LG

    Hierarchical Regression Network for Spectral Reconstruction from RGB Images

    Authors: Yuzhi Zhao, Lai-Man Po, Qiong Yan, Wei Liu, Tingyu Lin

    Abstract: Capturing visual image with a hyperspectral camera has been successfully applied to many areas due to its narrow-band imaging technology. Hyperspectral reconstruction from RGB images denotes a reverse process of hyperspectral imaging by discovering an inverse response function. Current works mainly map RGB images directly to corresponding spectrum but do not consider context information explicitly…

    Submitted 10 May, 2020; originally announced May 2020.

    Comments: 1st Place in CVPRW 2020 NTIRE Spectral Reconstruction Challenge

  48. arXiv:2004.04455  [pdf, other]

    cs.CV eess.IV

    Decoupled Gradient Harmonized Detector for Partial Annotation: Application to Signet Ring Cell Detection

    Authors: Tiancheng Lin, Yuanfan Guo, Canqian Yang, Jiancheng Yang, Yi Xu

    Abstract: Early diagnosis of signet ring cell carcinoma dramatically improves the survival rate of patients. Due to lack of public dataset and expert-level annotations, automatic detection on signet ring cell (SRC) has not been thoroughly investigated. In MICCAI DigestPath2019 challenge, apart from foreground (SRC region)-background (normal tissue area) class imbalance, SRCs are partially annotated due to c…

    Submitted 9 April, 2020; originally announced April 2020.

    Comments: accepted to Neurocomputing; 1st runner up of MICCAI DigestPath2019 challenge

  49. arXiv:2003.04685  [pdf, other]

    cs.CE cs.AI eess.IV

    TopologyGAN: Topology Optimization Using Generative Adversarial Networks Based on Physical Fields Over the Initial Domain

    Authors: Zhenguo Nie, Tong Lin, Haoliang Jiang, Levent Burak Kara

    Abstract: In topology optimization using deep learning, load and boundary conditions represented as vectors or sparse matrices often miss the opportunity to encode a rich view of the design problem, leading to less than ideal generalization results. We propose a new data-driven topology optimization model called TopologyGAN that takes advantage of various physical fields computed on the original, unoptimize…

    Submitted 11 March, 2020; v1 submitted 5 March, 2020; originally announced March 2020.

    Comments: 18 pages, 16 figures

  50. arXiv:1912.05027  [pdf, other]

    cs.CV cs.LG eess.IV

    SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization

    Authors: Xianzhi Du, Tsung-Yi Lin, Pengchong Jin, Golnaz Ghiasi, Mingxing Tan, Yin Cui, Quoc V. Le, Xiaodan Song

    Abstract: Convolutional neural networks typically encode an input image into a series of intermediate features with decreasing resolutions. While this structure is suited to classification tasks, it does not perform well for tasks requiring simultaneous recognition and localization (e.g., object detection). The encoder-decoder architectures are proposed to resolve this by applying a decoder network onto a b…

    Submitted 17 June, 2020; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: CVPR 2020