Showing 1–50 of 67 results for author: Lai, W

  1. An Efficient Convex-Hull Relaxation Based Algorithm for Multi-User Discrete Passive Beamforming

    Authors: Wenhai Lai, Zheyu Wu, Yi Feng, Kaiming Shen, Ya-Feng Liu

    Abstract: Intelligent reflecting surface (IRS) is an emerging technology to enhance spatial multiplexing in wireless networks. This letter considers the discrete passive beamforming design for IRS in order to maximize the minimum signal-to-interference-plus-noise ratio (SINR) among multiple users in an IRS-assisted downlink network. The main design difficulty lies in the discrete phase-shift constraint. Dif…

    Submitted 28 August, 2024; v1 submitted 30 July, 2024; originally announced July 2024.

    Comments: 5 pages

    Journal ref: IEEE Signal Processing Letters 2024

  2. arXiv:2407.12648  [pdf, ps, other]

    cs.IT eess.SP

    Blind Beamforming for Coverage Enhancement with Intelligent Reflecting Surface

    Authors: Fan Xu, Jiawei Yao, Wenhai Lai, Kaiming Shen, Xin Li, Xin Chen, Zhi-Quan Luo

    Abstract: Conventional policy for configuring an intelligent reflecting surface (IRS) typically requires channel state information (CSI), thus incurring substantial overhead costs and facing incompatibility with the current network protocols. This paper proposes a blind beamforming strategy in the absence of CSI, aiming to boost the minimum signal-to-noise ratio (SNR) among all the receiver positions, namel…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: 17 pages

  3. arXiv:2407.02340  [pdf, other]

    cs.CL cs.AI

    RVISA: Reasoning and Verification for Implicit Sentiment Analysis

    Authors: Wenna Lai, Haoran Xie, Guandong Xu, Qing Li

    Abstract: With an increasing social demand for fine-grained sentiment analysis (SA), implicit sentiment analysis (ISA) poses a significant challenge with the absence of salient cue words in expressions. It necessitates reliable reasoning to understand how the sentiment is aroused and thus determine implicit sentiments. In the era of Large Language Models (LLMs), Encoder-Decoder (ED) LLMs have gained popular…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 11 pages, 6 figures, and 4 tables

  4. arXiv:2407.02109  [pdf, other]

    cs.CV cs.AI

    HRSAM: Efficiently Segment Anything in High-Resolution Images

    Authors: You Huang, Wenbin Lai, Jiayi Ji, Liujuan Cao, Shengchuan Zhang, Rongrong Ji

    Abstract: The Segment Anything Model (SAM) has significantly advanced interactive segmentation but struggles with high-resolution images crucial for high-precision segmentation. This is primarily due to the quadratic space complexity of SAM-implemented attention and the length extrapolation issue in common global attention. This study proposes HRSAM that integrates Flash Attention and incorporates Plain, Sh…

    Submitted 2 July, 2024; originally announced July 2024.

  5. arXiv:2406.01771  [pdf, other]

    cs.CL

    LLMs Beyond English: Scaling the Multilingual Capability of LLMs with Cross-Lingual Feedback

    Authors: Wen Lai, Mohsen Mesgar, Alexander Fraser

    Abstract: To democratize large language models (LLMs) to most natural languages, it is imperative to make these models capable of understanding and generating texts in many languages, in particular low-resource ones. While recent multilingual LLMs demonstrate remarkable performance in such capabilities, these LLMs still support a limited number of human languages due to the lack of training data for low-res…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted to Findings of ACL 2024. The code, datasets, and models are publicly available at https://github.com/boschresearch/ACL24-MLLM

  6. A Vlogger-augmented Graph Neural Network Model for Micro-video Recommendation

    Authors: Weijiang Lai, Beihong Jin, Beibei Li, Yiyuan Zheng, Rui Zhao

    Abstract: Existing micro-video recommendation models exploit the interactions between users and micro-videos and/or multi-modal information of micro-videos to predict the next micro-video a user will watch, ignoring the information related to vloggers, i.e., the producers of micro-videos. However, in micro-video scenarios, vloggers play a significant role in user-video interactions, since vloggers generally…

    Submitted 28 May, 2024; originally announced May 2024.

    Journal ref: (2023) Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track (pp. 684-699). Cham: Springer Nature Switzerland

  7. arXiv:2405.10022  [pdf, other]

    eess.AS cs.SD

    Monaural speech enhancement on drone via Adapter based transfer learning

    Authors: Xingyu Chen, Hanwen Bi, Wei-Ting Lai, Fei Ma

    Abstract: Monaural speech enhancement on drones is challenging because the ego-noise from the rotating motors and propellers leads to extremely low signal-to-noise ratios at onboard microphones. Although recent masking-based deep neural network methods excel in monaural speech enhancement, they struggle in the challenging drone noise scenario. Furthermore, existing drone noise datasets are limited, causing…

    Submitted 16 May, 2024; originally announced May 2024.

  8. arXiv:2404.06904  [pdf, other]

    cs.RO

    Vision-Language Model-based Physical Reasoning for Robot Liquid Perception

    Authors: Wenqiang Lai, Yuan Gao, Tin Lun Lam

    Abstract: There is a growing interest in applying large language models (LLMs) in robotic tasks, due to their remarkable reasoning ability and extensive knowledge learned from vast training corpora. Grounding LLMs in the physical world remains an open challenge as they can only process textual input. Recent advancements in large vision-language models (LVLMs) have enabled a more comprehensive understanding…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 8 pages, 6 figures, submitted to IROS 2024

  9. arXiv:2403.08761  [pdf]

    eess.IV cs.CV

    Segmentation of Knee Bones for Osteoarthritis Assessment: A Comparative Analysis of Supervised, Few-Shot, and Zero-Shot Learning Approaches

    Authors: Yun Xin Teoh, Alice Othmani, Siew Li Goh, Juliana Usman, Khin Wee Lai

    Abstract: Knee osteoarthritis is a degenerative joint disease that induces chronic pain and disability. Bone morphological analysis is a promising tool to understand the mechanical aspect of this disorder. This study proposes a 2D bone morphological analysis using manually segmented bones to explore morphological features related to distinct pain conditions. Furthermore, six semantic segmentation algorithms…

    Submitted 13 March, 2024; originally announced March 2024.

  10. On Defeating Graph Analysis of Anonymous Transactions

    Authors: Christoph Egger, Russell W. F. Lai, Viktoria Ronge, Ivy K. Y. Woo, Hoover H. F. Yin

    Abstract: In a ring-signature-based anonymous cryptocurrency, signers of a transaction are hidden among a set of potential signers, called a ring, whose size is much smaller than the number of all users. The ring-membership relations specified by the sets of transactions thus induce bipartite transaction graphs, whose distribution is in turn induced by the ring sampler underlying the cryptocurrency. Since…

    Submitted 28 February, 2024; originally announced February 2024.

    Journal ref: Proceedings on Privacy Enhancing Technologies (PoPETs), Vol. 2022, Issue 3, Pages 538-557

  11. arXiv:2402.18394  [pdf, other]

    cs.RO

    Dual-IMU State Estimation for Relative Localization of Two Mobile Agents

    Authors: Wenqian Lai, Ruonan Guo, Kejian J. Wu

    Abstract: In this paper, we address the problem of relative localization of two mobile agents. Specifically, we consider the Dual-IMU system, where each agent is equipped with one IMU, and employs relative pose observations between them. Previous works, however, typically assumed known ego motion and ignored biases of the IMUs. Instead, we study the most general case of unknown biases for both IMUs. Besides…

    Submitted 6 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

  12. arXiv:2401.01461  [pdf, other]

    cs.CV

    Efficient Hybrid Zoom using Camera Fusion on Mobile Phones

    Authors: Xiaotong Wu, Wei-Sheng Lai, YiChang Shih, Charles Herrmann, Michael Krainin, Deqing Sun, Chia-Kai Liang

    Abstract: DSLR cameras can achieve multiple zoom levels via shifting lens distances or swapping lens types. However, these techniques are not possible on smartphone devices due to space constraints. Most smartphone manufacturers adopt a hybrid zoom system: commonly a Wide (W) camera at a low zoom level and a Telephoto (T) camera at a high zoom level. To simulate zoom levels between W and T, these systems cr…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: Accepted to SIGGRAPH Asia 2023 (ACM TOG). Project website: https://www.wslai.net/publications/fusion_zoom

  13. arXiv:2311.12867  [pdf, other]

    quant-ph cs.NE

    Amplitude-Ensemble Quantum-Inspired Tabu Search Algorithm for Solving 0/1 Knapsack Problems

    Authors: Kuo-Chun Tseng, Wei-Chieh Lai, I-Chia Chen, Yun-Hsiang Hsiao, Jr-Yu Chiue, Wei-Chun Huang

    Abstract: In this paper, an improved version of QTS (Quantum-inspired Tabu Search) has been proposed, which enhances the utilization of population information, called "amplitude-ensemble" QTS (AE-QTS). This makes AE-QTS more similar to the real quantum search algorithm, Grover Search Algorithm, in abstract concept, while keeping the simplicity of the algorithm. Later, we demonstrate the AE-QTS on the classi…

    Submitted 17 March, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: 7 pages, 7 figures

  14. arXiv:2311.08538  [pdf, other]

    cs.CL

    Extending Multilingual Machine Translation through Imitation Learning

    Authors: Wen Lai, Viktor Hangya, Alexander Fraser

    Abstract: Despite the growing variety of languages supported by existing multilingual neural machine translation (MNMT) models, most of the world's languages are still being left behind. We aim to extend large-scale MNMT models to a new language, allowing for translation between the newly added and all of the already supported languages in a challenging scenario: using only a parallel corpus between the new…

    Submitted 14 November, 2023; originally announced November 2023.

  15. arXiv:2310.11971  [pdf, other]

    cs.LG cs.AI

    Improving Generalization of Alignment with Human Preferences through Group Invariant Learning

    Authors: Rui Zheng, Wei Shen, Yuan Hua, Wenbin Lai, Shihan Dou, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Haoran Huang, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: The success of AI assistants based on language models (LLMs) hinges crucially on Reinforcement Learning from Human Feedback (RLHF), which enables the generation of responses more aligned with human preferences. As universal AI assistants, there's a growing expectation for them to perform consistently across various domains. However, previous work shows that Reinforcement Learning (RL) often exploi…

    Submitted 25 December, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

  16. arXiv:2309.13819  [pdf, other]

    eess.AS cs.SD

    A Two-Step Approach for Narrowband Source Localization in Reverberant Rooms

    Authors: Wei-Ting Lai, Lachlan Birnie, Thushara Abhayapala, Amy Bastine, Shaoheng Xu, Prasanga Samarasinghe

    Abstract: This paper presents a two-step approach for narrowband source localization within reverberant rooms. The first step involves dereverberation by modeling the homogeneous component of the sound field by an equivalent decomposition of planewaves using Iteratively Reweighted Least Squares (IRLS), while the second step focuses on source localization by modeling the dereverberated component as a sparse…

    Submitted 24 September, 2023; originally announced September 2023.

  17. arXiv:2308.09380  [pdf, other]

    cs.LG cs.AI cs.CV

    Deciphering knee osteoarthritis diagnostic features with explainable artificial intelligence: A systematic review

    Authors: Yun Xin Teoh, Alice Othmani, Siew Li Goh, Juliana Usman, Khin Wee Lai

    Abstract: Existing artificial intelligence (AI) models for diagnosing knee osteoarthritis (OA) have faced criticism for their lack of transparency and interpretability, despite achieving medical-expert-like performance. This opacity makes them challenging to trust in clinical practice. Recently, explainable artificial intelligence (XAI) has emerged as a specialized technique that can provide confidence in t…

    Submitted 18 August, 2023; originally announced August 2023.

  18. arXiv:2308.06533  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    Knowledge Distilled Ensemble Model for sEMG-based Silent Speech Interface

    Authors: Wenqiang Lai, Qihan Yang, Ye Mao, Endong Sun, Jiangnan Ye

    Abstract: Voice disorders affect millions of people worldwide. Surface electromyography-based Silent Speech Interfaces (sEMG-based SSIs) have been explored as a potential solution for decades. However, previous works were limited by small vocabularies and manually extracted features from raw data. To address these limitations, we propose a lightweight deep learning knowledge-distilled ensemble model for sEM…

    Submitted 6 August, 2023; originally announced August 2023.

    Comments: 6 pages, 5 figures

  19. arXiv:2307.04964  [pdf, other]

    cs.CL cs.AI cs.LG

    Secrets of RLHF in Large Language Models Part I: PPO

    Authors: Rui Zheng, Shihan Dou, Songyang Gao, Yuan Hua, Wei Shen, Binghai Wang, Yan Liu, Senjie Jin, Qin Liu, Yuhao Zhou, Limao Xiong, Lu Chen, Zhiheng Xi, Nuo Xu, Wenbin Lai, Minghao Zhu, Cheng Chang, Zhangyue Yin, Rongxiang Weng, Wensen Cheng, Haoran Huang, Tianxiang Sun, Hang Yan, Tao Gui, Qi Zhang, et al. (2 additional authors not shown)

    Abstract: Large language models (LLMs) have formulated a blueprint for the advancement of artificial general intelligence. Its primary objective is to function as a human-centric (helpful, honest, and harmless) assistant. Alignment with humans assumes paramount significance, and reinforcement learning with human feedback (RLHF) emerges as the pivotal technological paradigm underpinning this pursuit. Current…

    Submitted 18 July, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

  20. arXiv:2305.18998  [pdf, other]

    cs.IT eess.SP

    Blind Beamforming for Intelligent Reflecting Surface in Fading Channels without CSI

    Authors: Wenhai Lai, Wenyu Wang, Fan Xu, Xin Li, Shaobo Niu, Kaiming Shen

    Abstract: This paper discusses how to optimize the phase shifts of intelligent reflecting surface (IRS) to combat channel fading without any channel state information (CSI), namely blind beamforming. Differing from most previous works based on a two-stage paradigm of first estimating channels and then optimizing phase shifts, our approach is completely data-driven, only requiring a dataset of the received s…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: 14 pages, 14 figures

  21. arXiv:2305.12786  [pdf, other]

    cs.CL

    Mitigating Data Imbalance and Representation Degeneration in Multilingual Machine Translation

    Authors: Wen Lai, Alexandra Chronopoulou, Alexander Fraser

    Abstract: Despite advances in multilingual neural machine translation (MNMT), we argue that there are still two major challenges in this area: data imbalance and representation degeneration. The data imbalance problem refers to the imbalance in the amount of parallel corpora for all language pairs, especially for long-tail languages (i.e., very low-resource languages). The representation degeneration proble…

    Submitted 24 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of EMNLP 2023, add statistical significance tests. code available at https://github.com/lavine-lmu/Bi-ACL

  22. arXiv:2304.06862  [pdf, other]

    cs.DS cs.CC

    The Longest Subsequence-Repeated Subsequence Problem

    Authors: Manuel Lafond, Wenfeng Lai, Adiesha Liyanage, Binhai Zhu

    Abstract: Motivated by computing duplication patterns in sequences, a new fundamental problem called the longest subsequence-repeated subsequence (LSRS) is proposed. Given a sequence $S$ of length $n$, a letter-repeated subsequence is a subsequence of $S$ in the form of $x_1^{d_1}x_2^{d_2}\cdots x_k^{d_k}$ with $x_i$ a subsequence of $S$, $x_j\neq x_{j+1}$ and $d_i\geq 2$ for all $i$ in $[k]$ and $j$ in…

    Submitted 31 August, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: 15 pages, 1 figure

    MSC Class: 68W01; 68W32 ACM Class: F.2.2

  23. Coordinating Multiple Intelligent Reflecting Surfaces without Channel Information

    Authors: Fan Xu, Jiawei Yao, Wenhai Lai, Kaiming Shen, Xin Li, Xin Chen, Zhi-Quan Luo

    Abstract: Conventional beamforming methods for intelligent reflecting surfaces (IRSs) or reconfigurable intelligent surfaces (RISs) typically entail the full channel state information (CSI). However, the computational cost of channel acquisition soars exponentially with the number of IRSs. To bypass this difficulty, we propose a novel strategy called blind beamforming that coordinates multiple IRSs by means…

    Submitted 8 January, 2024; v1 submitted 19 February, 2023; originally announced February 2023.

    Comments: 16 pages

    Journal ref: IEEE Transactions on Signal Processing 2024

  24. arXiv:2210.11912  [pdf, other]

    cs.CL

    $m^4Adapter$: Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter

    Authors: Wen Lai, Alexandra Chronopoulou, Alexander Fraser

    Abstract: Multilingual neural machine translation models (MNMT) yield state-of-the-art performance when evaluated on data from a domain and language pair seen at training time. However, when an MNMT model is used to translate under domain shift or to a new language pair, performance drops dramatically. We consider a very challenging scenario: adapting the MNMT model both to a new domain and to a new language…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: Accepted to Findings of EMNLP 2022

  25. arXiv:2209.00329  [pdf, other]

    cs.RO

    Design and Development of a Tracked Inspection Robot

    Authors: Erika Sahari, Weiyao Lai, Alireza Pulles, XiaoQi Guo, Marc Bernhard

    Abstract: This paper presents the examination of the clever Differential with three levels of opportunity. The is the principal differential with that interprets differential speed and force to its three results when the results are under fluctuated loads, however deciphers equivalent movement and force to its results when exposed to approach loads. The kinematics and elements of the are determined and are…

    Submitted 1 September, 2022; originally announced September 2022.

    Comments: 6 pages, 4 figures

  26. arXiv:2208.10439  [pdf, other]

    cs.RO

    Design and Development of Miniature long distance multi-moving robots for 3D Smart Sensing for underground Pipe Inspection

    Authors: Alireza Pulles, Weiyao Lai, Erika Sahari, XiaoQi Guo, Marc Bernhard

    Abstract: Designing an in-pipe climbing robot that manipulates sharp gears to study complex line relationships. Traditional rolling/happening pipe climbing robots tend to slide when exploring pipe curves. The proposed gearbox connects to the farthest ground plane of a standard dual output gearbox. Instrumentation helps achieve a very well-defined deceleration sequence in which the robot slides and pulls as…

    Submitted 22 August, 2022; originally announced August 2022.

    Comments: 6 pages, 5 figures

  27. arXiv:2208.06690  [pdf, other]

    cs.RO

    Deployment of long distance multi-moving robots for underground pipe inspection

    Authors: Weiyao Lai, Wei Xu, Marc Bernhard

    Abstract: Blueprint of an in-pipe climbing robot that works with sharp transmissions to study complex line relationships. Standard wheeled/happening pipe climbing robots tend to slide when exploring pipe turns. Instruments help achieve a very distinct delay sequence in which the robot slides and drags as it progresses. The proposed transmission joins the farthest ground plane of the standard two-output tran…

    Submitted 13 August, 2022; originally announced August 2022.

    Comments: 6 pages, 5 figures

  28. arXiv:2207.11617  [pdf, other]

    cs.CV cs.GR

    Face Deblurring using Dual Camera Fusion on Mobile Phones

    Authors: Wei-Sheng Lai, YiChang Shih, Lun-Cheng Chu, Xiaotong Wu, Sung-Fang Tsai, Michael Krainin, Deqing Sun, Chia-Kai Liang

    Abstract: Motion blur of fast-moving subjects is a longstanding problem in photography and very common on mobile phones due to limited light collection efficiency, particularly in low-light conditions. While we have witnessed great progress in image deblurring in recent years, most methods require significant computational power and have limitations in processing high-resolution photos with severe local mot…

    Submitted 23 July, 2022; originally announced July 2022.

    Comments: Accepted to SIGGRAPH 2022 (ACM TOG). Project website: https://www.wslai.net/publications/fusion_deblur/

  29. arXiv:2207.05736  [pdf, other]

    cs.CV cs.GR

    Vision Transformer for NeRF-Based View Synthesis from a Single Input Image

    Authors: Kai-En Lin, Lin Yen-Chen, Wei-Sheng Lai, Tsung-Yi Lin, Yi-Chang Shih, Ravi Ramamoorthi

    Abstract: Although neural radiance fields (NeRF) have shown impressive advances for novel view synthesis, most methods typically require multiple input images of the same scene with accurate camera poses. In this work, we seek to substantially reduce the inputs to a single unposed image. Existing approaches condition on local image features to reconstruct a 3D object, but often render blurry predictions at…

    Submitted 13 October, 2022; v1 submitted 12 July, 2022; originally announced July 2022.

    Comments: WACV 2023 Project website: https://cseweb.ucsd.edu/~viscomp/projects/VisionNeRF/

  30. arXiv:2205.05122  [pdf, ps, other]

    cs.IT

    Multichannel Optimal Tree-Decodable Codes are Not Always Optimal Prefix Codes

    Authors: Hoover H. F. Yin, Harry W. H. Wong, Mehrdad Tahernia, Russell W. F. Lai

    Abstract: The theory of multichannel prefix codes aims to generalize the classical theory of prefix codes. Although single- and two-channel prefix codes always have decoding trees, the same cannot be said when there are more than two channels. One question is of theoretical interest: Do there exist optimal tree-decodable codes that are not optimal prefix codes? Existing literature, which focused on generali…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: Full version of the conference version in ISIT'22

  31. arXiv:2204.07125  [pdf, other]

    cs.DB

    Online Aggregation based Approximate Query Processing: A Literature Survey

    Authors: Pritom Saha Akash, Wei-Cheng Lai, Po-Wen Lin

    Abstract: In the current world, OLAP (Online Analytical Processing) is used intensively by modern organizations to perform ad hoc analysis of data, providing insight for better decision making. Thus, the performance for OLAP is crucial; however, it is costly to support OLAP for a large data-set. An approximate query process (AQP) was proposed to efficiently compute approximate values as close as to the exac…

    Submitted 14 April, 2022; originally announced April 2022.

  32. arXiv:2203.07861  [pdf, other]

    cs.CV cs.AI cs.LG

    Don't Get Me Wrong: How to Apply Deep Visual Interpretations to Time Series

    Authors: Christoffer Loeffler, Wei-Cheng Lai, Bjoern Eskofier, Dario Zanca, Lukas Schmidt, Christopher Mutschler

    Abstract: The correct interpretation and understanding of deep learning models are essential in many applications. Explanatory visual interpretation approaches for image and natural language processing allow domain experts to validate and understand almost any deep learning model. However, they fall short when generalizing to arbitrary time series, which is inherently less intuitive and more diverse. Wheth…

    Submitted 15 September, 2023; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: 36 pages, 13 figures

  33. arXiv:2201.10700  [pdf, other]

    cs.CV

    Deep Image Deblurring: A Survey

    Authors: Kaihao Zhang, Wenqi Ren, Wenhan Luo, Wei-Sheng Lai, Bjorn Stenger, Ming-Hsuan Yang, Hongdong Li

    Abstract: Image deblurring is a classic problem in low-level computer vision with the aim to recover a sharp image from a blurred input image. Advances in deep learning have led to significant progress in solving this problem, and a large number of deblurring networks have been proposed. This paper presents a comprehensive and timely survey of recently published deep-learning based image deblurring approach…

    Submitted 27 May, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

    Comments: To appear in International Journal of Computer Vision (IJCV)

  34. arXiv:2112.08288  [pdf, other]

    cs.CL

    Improving Both Domain Robustness and Domain Adaptability in Machine Translation

    Authors: Wen Lai, Jindřich Libovický, Alexander Fraser

    Abstract: We consider two problems of NMT domain adaptation using meta-learning. First, we want to reach domain robustness, i.e., we want to reach high quality on both domains seen in the training data and unseen domains. Second, we want our systems to be adaptive, i.e., making it possible to finetune systems with just hundreds of in-domain parallel sentences. We study the domain adaptability of meta-learni…

    Submitted 4 October, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: Accepted to COLING 2022

  35. arXiv:2112.05725  [pdf, ps, other]

    cs.DS cs.CC

    Beyond the Longest Letter-duplicated Subsequence Problem

    Authors: Wenfeng Lai, Adiesha Liyanage, Binhai Zhu, Peng Zou

    Abstract: Given a sequence $S$ of length $n$, a letter-duplicated subsequence is a subsequence of $S$ in the form of $x_1^{d_1}x_2^{d_2}\cdots x_k^{d_k}$ with $x_i\in\Sigma$, $x_j\neq x_{j+1}$ and $d_i\geq 2$ for all $i$ in $[k]$ and $j$ in $[k-1]$. A linear time algorithm for computing the longest letter-duplicated subsequence (LLDS) of $S$ can be easily obtained. In this paper, we focus on two variants of this…

    Submitted 4 January, 2022; v1 submitted 10 December, 2021; originally announced December 2021.

    Comments: 18 pages

    MSC Class: 68W01; 68W32

  36. Correcting Face Distortion in Wide-Angle Videos

    Authors: Wei-Sheng Lai, YiChang Shih, Chia-Kai Liang, Ming-Hsuan Yang

    Abstract: Video blogs and selfies are popular social media formats, which are often captured by wide-angle cameras to show human subjects and expanded background. Unfortunately, due to perspective projection, faces near corners and edges exhibit apparent distortions that stretch and squish the facial features, resulting in poor video quality. In this work, we present a video warping algorithm to correct the…

    Submitted 18 November, 2021; originally announced November 2021.

    Comments: Project website: https://www.wslai.net/publications/video_face_correction/

  37. Toward Real-World Super-Resolution via Adaptive Downsampling Models

    Authors: Sanghyun Son, Jaeha Kim, Wei-Sheng Lai, Ming-Hsuan Yang, Kyoung Mu Lee

    Abstract: Most image super-resolution (SR) methods are developed on synthetic low-resolution (LR) and high-resolution (HR) image pairs that are constructed by a predetermined operation, e.g., bicubic downsampling. As existing methods typically learn an inverse mapping of the specific function, they produce blurry results when applied to real-world images whose exact formulation is different and unknown. The…

    Submitted 8 September, 2021; originally announced September 2021.

    Comments: Accepted at TPAMI

  38. arXiv:2105.13016  [pdf, other]

    cs.CV

    Stylizing 3D Scene via Implicit Representation and HyperNetwork

    Authors: Pei-Ze Chiang, Meng-Shiun Tsai, Hung-Yu Tseng, Wei-sheng Lai, Wei-Chen Chiu

    Abstract: In this work, we aim to address the 3D scene stylization problem - generating stylized images of the scene at arbitrary novel view angles. A straightforward solution is to combine existing novel view synthesis and image/video style transfer approaches, which often leads to blurry results or inconsistent appearance. Inspired by the high-quality results of the neural radiance fields (NeRF) method, w…

    Submitted 16 January, 2022; v1 submitted 27 May, 2021; originally announced May 2021.

    Comments: Accepted to WACV2022; Project page: https://ztex08010518.github.io/3dstyletransfer/

  39. arXiv:2105.03606  [pdf, ps, other]

    cs.IT

    On Multi-Channel Huffman Codes for Asymmetric-Alphabet Channels

    Authors: Hoover H. F. Yin, Xishi Wang, Ka Hei Ng, Russell W. F. Lai, Lucien K. L. Ng, Jack P. K. Ma

    Abstract: Zero-error single-channel source coding has been studied extensively over the past decades. Its natural multi-channel generalization is however not well investigated. While the special case with multiple symmetric-alphabet channels was studied a decade ago, codes in such setting have no advantage over single-channel codes in data compression, making them worthless in most applications. With essent…

    Submitted 8 May, 2021; originally announced May 2021.

    Comments: full version of the ISIT 2021 paper

  40. arXiv:2102.06205  [pdf, other]

    cs.CV

    Hybrid Neural Fusion for Full-frame Video Stabilization

    Authors: Yu-Lun Liu, Wei-Sheng Lai, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang

    Abstract: Existing video stabilization methods often generate visible distortion or require aggressive cropping of frame boundaries, resulting in a smaller field of view. In this work, we present a frame synthesis algorithm to achieve full-frame video stabilization. We first estimate dense warp fields from neighboring frames and then synthesize the stabilized frame by fusing the warped contents. Our core tec…

    Submitted 23 August, 2021; v1 submitted 11 February, 2021; originally announced February 2021.

    Comments: ICCV 2021. Project page: https://alex04072000.github.io/FuSta/ Code: https://github.com/alex04072000/FuSta

  41. arXiv:2102.01279  [pdf, other]

    cs.CV

    Deep Online Fused Video Stabilization

    Authors: Zhenmei Shi, Fuhao Shi, Wei-Sheng Lai, Chia-Kai Liang, Yingyu Liang

    Abstract: We present a deep neural network (DNN) that uses both sensor data (gyroscope) and image content (optical flow) to stabilize videos through unsupervised learning. The network fuses optical flow with real/virtual camera pose histories into a joint motion representation. Next, the LSTM block infers the new virtual camera pose, and this virtual pose is used to generate a warping grid that stabilizes t…

    Submitted 3 April, 2021; v1 submitted 1 February, 2021; originally announced February 2021.

    Comments: 9 pages. Project page: https://zhmeishi.github.io/dvs/

  42. arXiv:2101.02913  [pdf, other]

    cs.NE

    When does the Physarum Solver Distinguish the Shortest Path from other Paths: the Transition Point and its Applications

    Authors: Yusheng Huang, Dong Chu, Joel Weijia Lai, Yong Deng, Kang Hao Cheong

    Abstract: Physarum solver, also called the physarum polycephalum inspired algorithm (PPA), is a newly developed bio-inspired algorithm that has an inherent ability to find the shortest path in a given graph. Recent research has proposed methods to develop this algorithm further by accelerating the original PPA (OPPA)'s path-finding process. However, when does the PPA ascertain that the shortest path has bee…

    Submitted 8 January, 2021; originally announced January 2021.

  43. arXiv:2012.05903  [pdf, other]

    cs.CV

    Portrait Neural Radiance Fields from a Single Image

    Authors: Chen Gao, Yichang Shih, Wei-Sheng Lai, Chia-Kai Liang, Jia-Bin Huang

    Abstract: We present a method for estimating Neural Radiance Fields (NeRF) from a single headshot portrait. While NeRF has demonstrated high-quality view synthesis, it requires multiple images of static scenes and is thus impractical for casual captures and moving subjects. In this work, we propose to pretrain the weights of a multilayer perceptron (MLP), which implicitly models the volumetric density and colo…

    Submitted 16 April, 2021; v1 submitted 10 December, 2020; originally announced December 2020.

    Comments: Project webpage: https://portrait-nerf.github.io/

  44. arXiv:2010.10056  [pdf, other]

    cs.CV

    Real-time Localized Photorealistic Video Style Transfer

    Authors: Xide Xia, Tianfan Xue, Wei-sheng Lai, Zheng Sun, Abby Chang, Brian Kulis, Jiawen Chen

    Abstract: We present a novel algorithm for transferring artistic styles of semantically meaningful local regions of an image onto local regions of a target video while preserving its photorealism. Local regions may be selected either fully automatically from an image, through using video segmentation algorithms, or from casual user guidance such as scribbles. Our method, based on a deep neural network archi…

    Submitted 20 October, 2020; originally announced October 2020.

    Comments: 16 pages, 15 figures

  45. arXiv:2008.04902  [pdf, other]

    cs.CV eess.IV

    Learning to See Through Obstructions with Layered Decomposition

    Authors: Yu-Lun Liu, Wei-Sheng Lai, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang

    Abstract: We present a learning-based approach for removing unwanted obstructions, such as window reflections, fence occlusions, or adherent raindrops, from a short sequence of images captured by a moving camera. Our method leverages motion differences between the background and obstructing elements to recover both layers. Specifically, we alternate between estimating dense optical flow fields of the two la…

    Submitted 25 July, 2021; v1 submitted 11 August, 2020; originally announced August 2020.

    Comments: Project page: https://alex04072000.github.io/SOLD/ Code: https://github.com/alex04072000/SOLD Extension of the CVPR 2020 paper: arXiv:2004.01180

  46. arXiv:2008.01078  [pdf, other]

    cs.LG cs.CV

    Ubicomp Digital 2020 -- Handwriting classification using a convolutional recurrent network

    Authors: Wei-Cheng Lai, Hendrik Schröter

    Abstract: The Ubicomp Digital 2020 -- Time Series Classification Challenge from STABILO is a challenge about multi-variate time series classification. The data were collected from 100 volunteer writers and contain 15 features measured with multiple sensors on a pen. In this paper, we use a neural network to classify the data into 52 classes, that is lower and upper cases of Arabic letters. The proposed architec…

    Submitted 3 August, 2020; originally announced August 2020.

    Comments: CRNN, Handwriting recognition, Ubicomp, Stabilo

  47. arXiv:2007.05927  [pdf, other]

    cs.RO cs.HC eess.SY

    A Three-limb Teleoperated Robotic System with Foot Control for Flexible Endoscopic Surgery

    Authors: Yanpei Huang, Wenjie Lai, Lin Cao, Jiajun Liu, Xiaoguo Li, Etienne Burdet, Soo Jay Phee

    Abstract: Flexible endoscopy requires high skills to manipulate both the endoscope and associated instruments. In most robotic flexible endoscopic systems, the endoscope and instruments are controlled separately by two operators, which may result in communication errors and inefficient operation. We present a novel teleoperation robotic endoscopic system that can be commanded by a surgeon alone. This 13 deg…

    Submitted 12 July, 2020; originally announced July 2020.

    Comments: 9 pages, 11 figures

  48. arXiv:2004.01180  [pdf, other]

    cs.CV

    Learning to See Through Obstructions

    Authors: Yu-Lun Liu, Wei-Sheng Lai, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang

    Abstract: We present a learning-based approach for removing unwanted obstructions, such as window reflections, fence occlusions or raindrops, from a short sequence of images captured by a moving camera. Our method leverages the motion differences between the background and the obstructing elements to recover both layers. Specifically, we alternate between estimating dense optical flow fields of the two laye…

    Submitted 2 April, 2020; originally announced April 2020.

    Comments: CVPR 2020. Project page: https://www.cmlab.csie.ntu.edu.tw/~yulunliu/ObstructionRemoval Code: https://github.com/alex04072000/ObstructionRemoval

  49. arXiv:2004.01179  [pdf, other]

    eess.IV cs.CV

    Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline

    Authors: Yu-Lun Liu, Wei-Sheng Lai, Yu-Sheng Chen, Yi-Lung Kao, Ming-Hsuan Yang, Yung-Yu Chuang, Jia-Bin Huang

    Abstract: Recovering a high dynamic range (HDR) image from a single low dynamic range (LDR) input image is challenging due to missing details in under-/over-exposed regions caused by quantization and saturation of camera sensors. In contrast to existing learning-based methods, our core idea is to incorporate the domain knowledge of the LDR image formation pipeline into our model. We model the HDR-to-LDR imag…

    Submitted 2 April, 2020; originally announced April 2020.

    Comments: CVPR 2020. Project page: https://www.cmlab.csie.ntu.edu.tw/~yulunliu/SingleHDR Code: https://github.com/alex04072000/SingleHDR

  50. arXiv:2003.00893  [pdf, other]

    cs.CV

    Gated Fusion Network for Degraded Image Super Resolution

    Authors: Xinyi Zhang, Hang Dong, Zhe Hu, Wei-Sheng Lai, Fei Wang, Ming-Hsuan Yang

    Abstract: Single image super resolution aims to enhance image quality with respect to spatial content, which is a fundamental task in computer vision. In this work, we address the task of single frame super resolution in the presence of image degradation, e.g., blur, haze, or rain streaks. Due to the limitations of frame capturing and formation processes, image degradation is inevitable, and the artifacts…

    Submitted 4 March, 2020; v1 submitted 2 March, 2020; originally announced March 2020.

    Comments: Accepted by IJCV. The code will be publicly available at https://github.com/BookerDeWitt/GFN-IJCV