
Showing 1–50 of 84 results for author: Luan, J

  1. arXiv:2408.16224  [pdf, other]

    cs.CV cs.AI

    LLaVA-SG: Leveraging Scene Graphs as Visual Semantic Expression in Vision-Language Models

    Authors: Jingyi Wang, Jianzhong Ju, Jian Luan, Zhidong Deng

    Abstract: Recent advances in large vision-language models (VLMs) typically employ vision encoders based on the Vision Transformer (ViT) architecture. The division of the images into patches by ViT results in a fragmented perception, thereby hindering the visual understanding capabilities of VLMs. In this paper, we propose an innovative enhancement to address this limitation by introducing a Scene Graph Expr…

    Submitted 29 August, 2024; v1 submitted 28 August, 2024; originally announced August 2024.

  2. arXiv:2408.13459  [pdf, other]

    cs.CV

    Rethinking Video Deblurring with Wavelet-Aware Dynamic Transformer and Diffusion Model

    Authors: Chen Rao, Guangyuan Li, Zehua Lan, Jiakai Sun, Junsheng Luan, Wei Xing, Lei Zhao, Huaizhong Lin, Jianfeng Dong, Dalong Zhang

    Abstract: Current video deblurring methods have limitations in recovering high-frequency information since the regression losses are conservative with high-frequency details. Since Diffusion Models (DMs) have strong capabilities in generating high-frequency details, we consider introducing DMs into the video deblurring task. However, we found that directly applying DMs to the video deblurring task has the f…

    Submitted 24 August, 2024; originally announced August 2024.

    Comments: accepted by ECCV2024

    ACM Class: I.4.4

  3. arXiv:2407.05690  [pdf, other]

    cs.CL cs.AI

    Pruning Large Language Models to Intra-module Low-rank Architecture with Transitional Activations

    Authors: Bowen Shen, Zheng Lin, Daren Zha, Wei Liu, Jian Luan, Bin Wang, Weiping Wang

    Abstract: Structured pruning fundamentally reduces computational and memory overheads of large language models (LLMs) and offers a feasible solution for end-side LLM deployment. Structurally pruned models remain dense and high-precision, highly compatible with further tuning and compression. However, as the coarse-grained structured pruning poses large damage to the highly interconnected model, achieving a…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Findings of ACL 2024

  4. arXiv:2407.00993  [pdf, other]

    cs.AI cs.CL

    Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents

    Authors: Shihan Deng, Weikai Xu, Hongda Sun, Wei Liu, Tao Tan, Jianfeng Liu, Ang Li, Jian Luan, Bin Wang, Rui Yan, Shuo Shang

    Abstract: With the remarkable advancements of large language models (LLMs), LLM-based agents have become a research hotspot in human-computer interaction. However, there is a scarcity of benchmarks available for LLM-based mobile agents. Benchmarking these agents generally faces three main challenges: (1) The inefficiency of UI-only operations imposes limitations to task evaluation. (2) Specific instructions…

    Submitted 1 July, 2024; originally announced July 2024.

  5. arXiv:2406.06571  [pdf, other]

    cs.CL cs.AI

    SUBLLM: A Novel Efficient Architecture with Token Sequence Subsampling for LLM

    Authors: Quandong Wang, Yuxuan Yuan, Xiaoyu Yang, Ruike Zhang, Kang Zhao, Wei Liu, Jian Luan, Daniel Povey, Bin Wang

    Abstract: While Large Language Models (LLMs) have achieved remarkable success in various fields, the efficiency of training and inference remains a major challenge. To address this issue, we propose SUBLLM, short for Subsampling-Upsampling-Bypass Large Language Model, an innovative architecture that extends the core decoder-only framework by incorporating subsampling, upsampling, and bypass modules. The sub…

    Submitted 23 August, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figures, accepted by ECAI 2024

    ACM Class: I.2.7

  6. arXiv:2406.05676  [pdf]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Chern insulator phase realized in dual-gate-tuned MnBi2Te4 thin films grown by molecular beam epitaxy

    Authors: Yunhe Bai, Yuanzhao Li, Ruixuan Liu, Jianli Luan, Yang Chen, Wenyu Song, Peng-Fei Ji, Cui Ding, Zongwei Gao, Qinghua Zhang, Fanqi Meng, Bingbing Tong, Lin Li, Tianchen Zhu, Lin Gu, Lili Wang, Jinsong Zhang, Yayu Wang, Qi-Kun Xue, Ke He, Yang Feng, Xiao Feng

    Abstract: The intrinsic magnetic order, large topological-magnetic gap and rich topological phases make MnBi2Te4 a wonderful platform to study exotic topological quantum states such as axion insulator and Chern insulator. To realize and manipulate these topological phases in a MnBi2Te4 thin film, precise manipulation of the electric field across the film is essential, which requires a dual-gate structure. I…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: 24 pages, 4 figures

  7. arXiv:2405.11940  [pdf, other]

    cond-mat.mtrl-sci physics.comp-ph

    On the equivalence of two spinodal decomposition criteria with a case study of Fe${}_{15}$Co${}_{15}$Ni${}_{35}$Cu${}_{35}$ multicomponent alloy

    Authors: Hengwei Luan, You Wu, Jingyi Kang, Liufei Huang, J. H. Luan, Jinfeng Li, Yang Shao, Ke-fu Yao, Jian Lu

    Abstract: Spinodal decomposition in multicomponent alloys has attracted increasing attention due to its beneficial effect on their mechanical and functional properties and potential applications. Both based on the Cahn-Hilliard equation, the reference element method (REM) and the projection matrix method (PMM) are the two main methods to predict the occurrence of spinodal decomposition in multicomponent allo…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 27 pages, 3 figures, 1 supplementary file

  8. arXiv:2404.11474  [pdf, other]

    cs.CV

    Towards Highly Realistic Artistic Style Transfer via Stable Diffusion with Step-aware and Layer-aware Prompt

    Authors: Zhanjie Zhang, Quanwei Zhang, Huaizhong Lin, Wei Xing, Juncheng Mo, Shuaicheng Huang, Jinheng Xie, Guangyuan Li, Junsheng Luan, Lei Zhao, Dalong Zhang, Lixia Chen

    Abstract: Artistic style transfer aims to transfer the learned artistic style onto an arbitrary content image, generating artistic stylized images. Existing generative adversarial network-based methods fail to generate highly realistic stylized images and always introduce obvious artifacts and disharmonious patterns. Recently, large-scale pre-trained diffusion models opened up a new way for generating highl…

    Submitted 12 August, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI2024

  9. arXiv:2404.09083  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Interplay between electronic dephasing and localization in finite-sized Chern insulator

    Authors: Yunhe Bai, Yuanzhao Li, Jianli Luan, Yang Chen, Zongwei Gao, Wenyu Song, Yitian Tong, Jinsong Zhang, Yayu Wang, Junjie Qi, Chui-Zhen Chen, Hua Jiang, X. C. Xie, Ke He, Yang Feng, Xiao Feng, Qi-Kun Xue

    Abstract: Anderson localization is anticipated to play a pivotal role in the manifestation of the quantum anomalous Hall effect, akin to its role in conventional quantum Hall effects. The significance of Anderson localization is particularly pronounced in elucidating the reasons behind the fragility of the observed quantum anomalous Hall state in the intrinsic magnetic topological insulator MnBi2Te4 with a…

    Submitted 13 April, 2024; originally announced April 2024.

    Comments: 20 pages, 4 figures

  10. arXiv:2403.06551  [pdf, other]

    cs.IR

    ToolRerank: Adaptive and Hierarchy-Aware Reranking for Tool Retrieval

    Authors: Yuanhang Zheng, Peng Li, Wei Liu, Yang Liu, Jian Luan, Bin Wang

    Abstract: Tool learning aims to extend the capabilities of large language models (LLMs) with external tools. A major challenge in tool learning is how to support a large number of tools, including unseen tools. To address this challenge, previous studies have proposed retrieving suitable tools for the LLM based on the user query. However, previously proposed methods do not consider the differences between s…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: This paper is accepted for LREC-COLING 2024

    Journal ref: In Proceedings of LREC-COLING 2024, pages 16263-16273

  11. arXiv:2402.16775  [pdf, other]

    cs.CL cs.AI

    A Comprehensive Evaluation of Quantization Strategies for Large Language Models

    Authors: Renren Jin, Jiangcun Du, Wuwei Huang, Wei Liu, Jian Luan, Bin Wang, Deyi Xiong

    Abstract: Increasing the number of parameters in large language models (LLMs) usually improves performance in downstream tasks but raises compute and memory costs, making deployment difficult in resource-limited settings. Quantization techniques, which reduce the bits needed for model weights or activations with minimal performance loss, have become popular due to the rise of LLMs. However, most quantizatio…

    Submitted 6 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: ACL 2024 Findings

  12. arXiv:2401.12544  [pdf]

    cond-mat.mes-hall

    Correlation between magnetic domain structures and quantum anomalous Hall effect in epitaxial MnBi2Te4 thin films

    Authors: Yang Shi, Yunhe Bai, Yuanzhao Li, Yang Feng, Qiang Li, Huanyu Zhang, Yang Chen, Yitian Tong, Jianli Luan, Ruixuan Liu, Pengfei Ji, Zongwei Gao, Hangwen Guo, Jinsong Zhang, Yayu Wang, Xiao Feng, Ke He, Xiaodong Zhou, Jian Shen

    Abstract: We use magnetic force microscopy (MFM) to study spatial uniformity of magnetization of epitaxially grown MnBi2Te4 thin films. Compared to films which exhibit no quantum anomalous Hall effect (QAH), films with QAH are observed to have more spatial uniformity of magnetization with larger domain size. The domain evolution upon magnetic field sweeping indicates that the magnetic domains or the spatial…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: 14 pages, 4 figures

  13. arXiv:2401.11450  [pdf]

    cond-mat.mes-hall

    Reentrant quantum anomalous Hall effect in molecular beam epitaxy-grown MnBi2Te4 thin films

    Authors: Yuanzhao Li, Yunhe Bai, Yang Feng, Jianli Luan, Zongwei Gao, Yang Chen, Yitian Tong, Ruixuan Liu, Su Kong Chong, Kang L. Wang, Xiaodong Zhou, Jian Shen, Jinsong Zhang, Yayu Wang, Chui-Zhen Chen, XinCheng Xie, Xiao Feng, Ke He, Qi-Kun Xue

    Abstract: In this study, we investigate intrinsic magnetic topological insulator MnBi2Te4 thin films grown by molecular beam epitaxy. We observe a reentrant quantum anomalous Hall effect when the Fermi energy enters the valence band and magnetic field equals zero, indicating the emergence of the Chern Anderson insulator state. The discovery opens a new avenue for realizing the QAH effect and underscores the…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: 15 pages, 4 figures

  14. arXiv:2401.05459  [pdf, other]

    cs.HC cs.AI cs.SE

    Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security

    Authors: Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, Rui Kong, Yile Wang, Hanfei Geng, Jian Luan, Xuefeng Jin, Zilong Ye, Guanjing Xiong, Fan Zhang, Xiang Li, Mengwei Xu, Zhijun Li, Peng Li, Yang Liu, Ya-Qin Zhang, Yunxin Liu

    Abstract: Since the advent of personal computing devices, intelligent personal assistants (IPAs) have been one of the key technologies that researchers and engineers have focused on, aiming to help users efficiently obtain information and execute tasks, and provide users with more intelligent, convenient, and rich interaction experiences. With the development of smartphones and IoT, computing and sensing de…

    Submitted 8 May, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: https://github.com/MobileLLM/Personal_LLM_Agents_Survey

  15. arXiv:2401.04283  [pdf, ps, other]

    eess.AS cs.SD

    FADI-AEC: Fast Score Based Diffusion Model Guided by Far-end Signal for Acoustic Echo Cancellation

    Authors: Yang Liu, Li Wan, Yun Li, Yiteng Huang, Ming Sun, James Luan, Yangyang Shi, Xin Lei

    Abstract: Despite the potential of diffusion models in speech enhancement, their deployment in Acoustic Echo Cancellation (AEC) has been restricted. In this paper, we propose DI-AEC, pioneering a diffusion-based stochastic regeneration approach dedicated to AEC. Further, we propose FADI-AEC, a fast score-based diffusion AEC framework to save computational demands, making it favorable for edge devices. It stan…

    Submitted 8 January, 2024; originally announced January 2024.

  16. arXiv:2312.06135  [pdf, other]

    cs.CV

    ArtBank: Artistic Style Transfer with Pre-trained Diffusion Model and Implicit Style Prompt Bank

    Authors: Zhanjie Zhang, Quanwei Zhang, Guangyuan Li, Wei Xing, Lei Zhao, Jiakai Sun, Zehua Lan, Junsheng Luan, Yiling Huang, Huaizhong Lin

    Abstract: Artistic style transfer aims to repaint the content image with the learned artistic style. Existing artistic style transfer methods can be divided into two categories: small model-based approaches and pre-trained large-scale model-based approaches. Small model-based approaches can preserve the content structure, but fail to produce highly realistic stylized images and introduce artifacts and dish…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI2024

  17. arXiv:2311.15058  [pdf]

    physics.optics

    Controlled generation of Poincaré sphere beams with inverse-designed multimode meta-waveguides

    Authors: Jing Luan, Shuang Zheng, Zhenyu Wan, Tiange Wu, Weijie Chang, Deming Liu, Minming Zhang

    Abstract: The angular momentum of light can be described by positions on various Poincaré spheres, where different structured light beams have proven useful for numerous optical applications. However, the dynamic generation and control of arbitrary structured light on different Poincaré spheres is still handled via bulky optics in free space. Here we propose and demonstrate multimode silicon photonic integr…

    Submitted 25 November, 2023; originally announced November 2023.

  18. arXiv:2311.03672  [pdf, other]

    cs.CL

    CBSiMT: Mitigating Hallucination in Simultaneous Machine Translation with Weighted Prefix-to-Prefix Training

    Authors: Mengge Liu, Wen Zhang, Xiang Li, Yanzhi Tian, Yuhang Guo, Jian Luan, Bin Wang, Shuoying Chen

    Abstract: Simultaneous machine translation (SiMT) is a challenging task that requires starting translation before the full source sentence is available. Prefix-to-prefix framework is often applied to SiMT, which learns to predict target tokens using only a partial source prefix. However, due to the word order difference between languages, misaligned prefix pairs would make SiMT models suffer from serious ha…

    Submitted 6 November, 2023; originally announced November 2023.

  19. arXiv:2310.18659  [pdf, other]

    cs.AI cs.CL

    DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to Determinacy

    Authors: Hongda Sun, Weikai Xu, Wei Liu, Jian Luan, Bin Wang, Shuo Shang, Ji-Rong Wen, Rui Yan

    Abstract: Recent advances in large language models (LLMs) have revolutionized the landscape of reasoning tasks. To enhance the capabilities of LLMs to emulate human reasoning, prior studies have focused on modeling reasoning steps using various thought structures like chains, trees, or graphs. However, LLM-based reasoning still encounters the following challenges: (1) Limited adaptability of preset structur…

    Submitted 26 May, 2024; v1 submitted 28 October, 2023; originally announced October 2023.

    Comments: Accepted at ACL 2024 Main, Code repo: https://github.com/XiaoMi/DetermLR

  20. arXiv:2307.15895  [pdf, other]

    cs.CR

    Auditing Frameworks Need Resource Isolation: A Systematic Study on the Super Producer Threat to System Auditing and Its Mitigation

    Authors: Peng Jiang, Ruizhe Huang, Ding Li, Yao Guo, Xiangqun Chen, Jianhai Luan, Yuxin Ren, Xinwei Hu

    Abstract: System auditing is a crucial technique for detecting APT attacks. However, attackers may try to compromise the system auditing frameworks to conceal their malicious activities. In this paper, we present a comprehensive and systematic study of the super producer threat in auditing frameworks, which enables attackers to either corrupt the auditing framework or paralyze the entire system. We analyze…

    Submitted 29 July, 2023; originally announced July 2023.

    Comments: 18 pages, to appear in the 32nd USENIX Security Symposium (USENIX Security '23)

  21. arXiv:2306.16636  [pdf, other]

    cs.CL cs.AI cs.LG

    CMATH: Can Your Language Model Pass Chinese Elementary School Math Test?

    Authors: Tianwen Wei, Jian Luan, Wei Liu, Shuang Dong, Bin Wang

    Abstract: We present the Chinese Elementary School Math Word Problems (CMATH) dataset, comprising 1.7k elementary school-level math word problems with detailed annotations, sourced from actual Chinese workbooks and exams. This dataset aims to provide a benchmark tool for assessing the following question: to what grade level of elementary school math do the abilities of popular large language models (LLMs) co…

    Submitted 28 June, 2023; originally announced June 2023.

  22. arXiv:2306.10543  [pdf, other]

    cs.CL

    UniMC: A Unified Framework for Long-Term Memory Conversation via Relevance Representation Learning

    Authors: Kang Zhao, Wei Liu, Jian Luan, Minglei Gao, Li Qian, Hanlin Teng, Bin Wang

    Abstract: Open-domain long-term memory conversation can establish long-term intimacy with humans, and the key is the ability to understand and memorize long-term dialogue history information. Existing works integrate multiple models for modelling through a pipeline, which ignores the coupling between different stages. In this paper, we propose a Unified framework for Long-term Memory Conversations (UniMC),…

    Submitted 18 June, 2023; originally announced June 2023.

  23. arXiv:2305.17415  [pdf, other]

    cs.CL cs.AI

    Exploring Better Text Image Translation with Multimodal Codebook

    Authors: Zhibin Lan, Jiawei Yu, Xiang Li, Wen Zhang, Jian Luan, Bin Wang, Degen Huang, Jinsong Su

    Abstract: Text image translation (TIT) aims to translate the source texts embedded in the image to target translations, which has a wide range of applications and thus has important research value. However, current studies on TIT are confronted with two main bottlenecks: 1) this task lacks a publicly available TIT dataset, 2) dominant models are constructed in a cascaded manner, which tends to suffer from t…

    Submitted 2 June, 2023; v1 submitted 27 May, 2023; originally announced May 2023.

    Comments: Accepted by ACL 2023 Main Conference

  24. arXiv:2303.00969  [pdf, other]

    cs.CL

    Rethinking the Reasonability of the Test Set for Simultaneous Machine Translation

    Authors: Mengge Liu, Wen Zhang, Xiang Li, Jian Luan, Bin Wang, Yuhang Guo, Shuoying Chen

    Abstract: Simultaneous machine translation (SimulMT) models start translation before the end of the source sentence, making the translation monotonically aligned with the source sentence. However, the general full-sentence translation test set is acquired by offline translation of the entire source sentence, which is not designed for SimulMT evaluation, making us rethink whether this will underestimate the…

    Submitted 13 March, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: Accepted by 48th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023)

  25. arXiv:2301.06745  [pdf, other]

    cs.CL cs.AI

    BERT-ERC: Fine-tuning BERT is Enough for Emotion Recognition in Conversation

    Authors: Xiangyu Qin, Zhiyu Wu, Jinshi Cui, Tingting Zhang, Yanran Li, Jian Luan, Bin Wang, Li Wang

    Abstract: Previous works on emotion recognition in conversation (ERC) follow a two-step paradigm, which can be summarized as first producing context-independent features via fine-tuning pretrained language models (PLMs) and then analyzing contextual information and dialogue structure information among the extracted features. However, we discover that this paradigm has several limitations. Accordingly, we pr…

    Submitted 17 January, 2023; originally announced January 2023.

  26. arXiv:2212.03435  [pdf, other]

    cs.SD cs.CL eess.AS

    Improve Bilingual TTS Using Dynamic Language and Phonology Embedding

    Authors: Fengyu Yang, Jian Luan, Yujun Wang

    Abstract: In most cases, bilingual TTS needs to handle three types of input scripts: first language only, second language only, and second language embedded in the first language. In the latter two situations, the pronunciation and intonation of the second language are usually quite different due to the influence of the first language. Therefore, it is a big challenge to accurately model the pronunciation a…

    Submitted 6 December, 2022; originally announced December 2022.

    Comments: Submitted to ICASSP2023

  27. arXiv:2206.03773  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Quantized anomalous Hall resistivity achieved in molecular beam epitaxy-grown MnBi2Te4 thin films

    Authors: Yunhe Bai, Yuanzhao Li, Jianli Luan, Ruixuan Liu, Wenyu Song, Yang Chen, Peng-Fei Ji, Qinghua Zhang, Fanqi Meng, Bingbing Tong, Lin Li, Yuying Jiang, Zongwei Gao, Lin Gu, Jinsong Zhang, Yayu Wang, Qi-Kun Xue, Ke He, Yang Feng, Xiao Feng

    Abstract: The intrinsic magnetic topological insulator MnBi2Te4 provides a feasible pathway to high temperature quantum anomalous Hall (QAH) effect as well as various novel topological quantum phases. Although quantized transport properties have been observed in exfoliated MnBi2Te4 thin flakes, it remains a big challenge to achieve molecular beam epitaxy (MBE)-grown MnBi2Te4 thin films even close to the qua…

    Submitted 17 April, 2023; v1 submitted 8 June, 2022; originally announced June 2022.

    Comments: 4 figures

    Journal ref: National Science Review, nwad189 (2023)

  28. arXiv:2205.15705  [pdf]

    physics.optics nlin.PS physics.app-ph

    High-quality 8-fold self-compression of ultrashort near-UV pulses in Ar-filled ultrathin-walled photonic crystal fiber

    Authors: Jie Luan, Philip St. J. Russell, David Novoa

    Abstract: We demonstrate generation of 7.6 fs near-UV pulses centered at 400 nm via 8-fold soliton-effect self-compression in an Ar-filled hollow-core kagomé-style photonic crystal fiber with ultrathin core walls. Analytical calculations of the effective compression length and soliton order permit adjustment of the experimental parameters, and numerical modelling of the nonlinear pulse dynamics in the fiber…

    Submitted 31 May, 2022; originally announced May 2022.

    Comments: 7 pages, 5 figures

  29. arXiv:2111.10002  [pdf]

    physics.optics

    High-speed and single-mode FP laser based on parity-time symmetry

    Authors: Sikang Yang, Jing Luan, Yu Han, Ruigang Zhang, Qi Tian, Pengxiang He, Deming Liu, Minming Zhang

    Abstract: The ability to manipulate cavity resonant modes is of critical importance in laser physics and applications. By exploiting the parity time (PT) symmetry, we propose and experimentally realize a single-mode FP laser with improved output power and high-speed modulation. The proposed PT symmetric laser consists of two coupled structurally identical FP resonators. The gain and l…

    Submitted 18 November, 2021; originally announced November 2021.

    Comments: 8 pages, 6 figures

  30. arXiv:2110.09780  [pdf, other]

    cs.SD eess.AS

    Improving Emotional Speech Synthesis by Using SUS-Constrained VAE and Text Encoder Aggregation

    Authors: Fengyu Yang, Jian Luan, Yujun Wang

    Abstract: Learning emotion embedding from reference audio is a straightforward approach for multi-emotion speech synthesis in encoder-decoder systems. But how to get better emotion embedding and how to inject it into TTS acoustic model more effectively are still under investigation. In this paper, we propose an innovative constraint to help VAE extract emotion embedding with better cluster cohesion. Besides…

    Submitted 28 January, 2022; v1 submitted 19 October, 2021; originally announced October 2021.

    Comments: accepted by ICASSP2022

  31. arXiv:2110.04486  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG

    PAMA-TTS: Progression-Aware Monotonic Attention for Stable Seq2Seq TTS With Accurate Phoneme Duration Control

    Authors: Yunchao He, Jian Luan, Yujun Wang

    Abstract: Sequence expansion between encoder and decoder is a critical challenge in sequence-to-sequence TTS. Attention-based methods achieve great naturalness but suffer from unstable issues like missing and repeating phonemes, not to mention accurate duration control. Duration-informed methods, on the contrary, seem to easily adjust phoneme duration but show obvious degradation in speech naturalness. This…

    Submitted 18 March, 2022; v1 submitted 9 October, 2021; originally announced October 2021.

    Comments: Accepted by ICASSP 2022. 5 pages, 4 figures, 3 tables. Audio samples are available at: https://pama-tts.github.io/

  32. arXiv:2107.04407  [pdf]

    physics.med-ph physics.flu-dyn

    Hemodynamic effects of stent-graft introducer sheath during thoracic endovascular aortic repair

    Authors: Yonghui Qiao, Le Mao, Yan Wang, Jingyang Luan, Yanlu Chen, Ting Zhu, Kun Luo, Jianren Fan

    Abstract: Thoracic endovascular aortic repair (TEVAR) has become the standard treatment of a variety of aortic pathologies. The objective of this study is to evaluate the hemodynamic effects of stent-graft introducer sheath during TEVAR. Three idealized representative diseased aortas of aortic aneurysm, coarctation of the aorta, and aortic dissection were designed. Computational fluid dynamics studies were…

    Submitted 8 July, 2021; originally announced July 2021.

    Journal ref: Biomechanics and Modeling in Mechanobiology (2022)

  33. arXiv:2107.03065  [pdf, other]

    cs.SD eess.AS

    Msdtron: a high-capability multi-speaker speech synthesis system for diverse data using characteristic information

    Authors: Qinghua Wu, Quanbo Shen, Jian Luan, YuJun Wang

    Abstract: In multi-speaker speech synthesis, data from a number of speakers usually tend to have great diversity due to the fact that the speakers may differ largely in ages, speaking styles, emotions, and so on. It is important but challenging to improve the modeling capabilities for multi-speaker speech synthesis. To address the issue, this paper proposes a high-capability speech synthesis system, called…

    Submitted 11 February, 2022; v1 submitted 7 July, 2021; originally announced July 2021.

    Comments: Accepted by ICASSP-2022

  34. arXiv:2103.05215  [pdf]

    cond-mat.supr-con cond-mat.mes-hall

    Gate Tunable Supercurrent in Josephson Junctions Based on Bi2Te3 Topological Insulator Thin Films

    Authors: Wei-Xiong Wu, Yang Feng, Yun-He Bai, Yu-Ying Jiang, Zong-Wei Gao, Yuan-Zhao Li, Jian-Li Luan, Heng-An Zhou, Wan-Jun Jiang, Xiao Feng, Jin-Song Zhang, Hao Zhang, Ke He, Xu-Cun Ma, Qi-Kun Xue, Ya-Yu Wang

    Abstract: We report transport measurements on Josephson junctions consisting of Bi2Te3 topological insulator (TI) thin films contacted by superconducting Nb electrodes. For a device with junction length L = 134 nm, the critical supercurrent Ic can be modulated by an electrical gate which tunes the carrier type and density of the TI film. Ic can reach a minimum when the TI is near the charge neutrality regim…

    Submitted 8 March, 2021; originally announced March 2021.

    Comments: 6 pages, 4 figures, The manuscript with the same title will be published by Chinese Physics Letters

    Journal ref: Chinese Physics Letters 38, 037402 (2021)

  35. arXiv:2102.11205  [pdf]

    physics.optics nlin.PS

    Efficient self-compression of ultrashort UV pulses in air-filled hollow-core photonic crystal fiber

    Authors: Jie Luan, Philip St. J. Russell, David Novoa

    Abstract: We report generation of ultrashort UV pulses by soliton self-compression in kagomé-style hollow-core photonic crystal fiber filled with ambient air. Pump pulses with energy 2.6 uJ and duration 54 fs at 400 nm were compressed temporally by a factor of 5, to a duration of ~11 fs. The experimental results are supported by numerical simulations, showing that both Raman and Kerr effects play a role in…

    Submitted 22 February, 2021; originally announced February 2021.

    Comments: 7 pages, 5 figures

  36. arXiv:2102.00802  [pdf]

    cond-mat.mtrl-sci

    Liquefaction-induced Plasticity from Entropy-boosted Amorphous Ceramics

    Authors: Haidong Bian, Quanfeng He, Junhua Luan, Yu Bu, Yong Yang, Zhengtao Xu, Jian Lu, Yang Yang Li

    Abstract: Ceramics are easy to break, and very few generic mechanisms are available for improving their mechanical properties, e.g., the 1975-discovered anti-fracture mechanism is strictly limited to zirconia and hafnia. Here we report a general mechanism for achieving high plasticity through liquefaction of ceramics. We further disclose the general material design strategies to achieve this difficult task…

    Submitted 1 February, 2021; originally announced February 2021.

    Comments: 16 pages,4 figures

  37. arXiv:2101.02382  [pdf]

    cond-mat.mtrl-sci

    Highly Distorted Lattices in Chemically Complex Alloys Produce Ultra-Elastic Materials with Extraordinary Elinvar Effects

    Authors: Q. F. He, J. G. Wang, H. A. Chen, Z. Y. Ding, Z. Q. Zhou, L. H. Xiong, J. H. Luan, J. M. Pelletier, J. C. Qiao, Q. Wang, L. L. Fan, Y. Ren, Q. S. Zeng, C. T. Liu, C. W. Pao, D. J. Srolovitz, Y. Yang

    Abstract: Conventional crystalline alloys usually possess a low atomic size difference in order to stabilize their crystalline structure. However, in this article, we report a single phase chemically complex alloy which possesses a large atomic size misfit usually unaffordable to conventional alloys. Consequently, this alloy develops a rather complex atomic-scale chemical order and a highly distorted crystall…

    Submitted 7 January, 2021; originally announced January 2021.

  38. arXiv:2009.01776  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis

    Authors: Jiawei Chen, Xu Tan, Jian Luan, Tao Qin, Tie-Yan Liu

    Abstract: High-fidelity singing voices usually require higher sampling rate (e.g., 48kHz) to convey expression and emotion. However, higher sampling rate causes the wider frequency band and longer waveform sequences and throws challenges for singing voice synthesis (SVS) in both frequency and time domains. Conventional SVS systems that adopt small sampling rate cannot well address the above challenges. In t…

    Submitted 3 September, 2020; originally announced September 2020.

  39. arXiv:2008.04658  [pdf, other]

    eess.AS cs.SD

    Transfer Learning for Improving Singing-voice Detection in Polyphonic Instrumental Music

    Authors: Yuanbo Hou, Frank K. Soong, Jian Luan, Shengchen Li

    Abstract: Detecting singing-voice in polyphonic instrumental music is critical to music information retrieval. To train a robust vocal detector, a large dataset marked with vocal or non-vocal label at frame-level is essential. However, frame-level labeling is time-consuming and labor-intensive, so few well-labeled datasets are available for singing-voice detection (S-VD). Hence, we propose a d…

    Submitted 11 August, 2020; originally announced August 2020.

    Comments: Accepted by INTERSPEECH 2020

  40. arXiv:2008.02490  [pdf]

    eess.AS cs.SD

    PPSpeech: Phrase based Parallel End-to-End TTS System

    Authors: Yahuan Cong, Ran Zhang, Jian Luan

    Abstract: Current end-to-end autoregressive TTS systems (e.g., Tacotron 2) have outperformed traditional parallel approaches on the quality of synthesized speech. However, they introduce new problems at the same time. Due to the autoregressive nature, the time cost of inference has to be proportional to the length of the text, which poses a great challenge for online serving. On the other hand, the style of synth…

    Submitted 6 August, 2020; originally announced August 2020.

  41. arXiv:2007.04590  [pdf, other]

    eess.AS cs.CL cs.SD

    DeepSinger: Singing Voice Synthesis with Data Mined From the Web

    Authors: Yi Ren, Xu Tan, Tao Qin, Jian Luan, Zhou Zhao, Tie-Yan Liu

    Abstract: In this paper, we develop DeepSinger, a multi-lingual multi-singer singing voice synthesis (SVS) system, which is built from scratch using singing training data mined from music websites. The pipeline of DeepSinger consists of several steps, including data crawling, singing and accompaniment separation, lyrics-to-singing alignment, data filtration, and singing modeling. Specifically, we design a l…

    Submitted 15 July, 2020; v1 submitted 9 July, 2020; originally announced July 2020.

    Comments: Accepted by KDD2020 research track

  42. arXiv:2006.10317  [pdf, other]

    eess.AS cs.LG cs.SD

    Adversarially Trained Multi-Singer Sequence-To-Sequence Singing Synthesizer

    Authors: Jie Wu, Jian Luan

    Abstract: This paper presents a high-quality singing synthesizer that is able to model a voice with limited available recordings. Based on the sequence-to-sequence singing model, we design a multi-singer framework to leverage all the existing singing data of different singers. To attenuate the issue of musical score imbalance among singers, we incorporate an adversarial task of singer classification to make…

    Submitted 18 June, 2020; originally announced June 2020.

    Comments: Submitted to INTERSPEECH2020

  43. arXiv:2006.06261  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    XiaoiceSing: A High-Quality and Integrated Singing Voice Synthesis System

    Authors: Peiling Lu, Jie Wu, Jian Luan, Xu Tan, Li Zhou

    Abstract: This paper presents XiaoiceSing, a high-quality singing voice synthesis system which employs an integrated network for spectrum, F0 and duration modeling. We follow the main architecture of FastSpeech while proposing some singing-specific designs: 1) Besides phoneme ID and position encoding, features from the musical score (e.g., note pitch and length) are also added. 2) To attenuate off-key issues, we a…

    Submitted 11 June, 2020; originally announced June 2020.

  44. Tunable Dirac points and zero-energy modes in periodic curved graphene superlattices

    Authors: Jianli Luan, Kaiyi Guo, Shangyang Li, Tianxing Ma, Li-Gang Wang, Hai-Qing Lin

    Abstract: We combined periodic ripples and electrostatic potentials to form curved graphene superlattices and studied the effects of space-dependent Fermi velocity induced from curvature on their electronic properties. With equal periods and symmetric potentials, the Dirac points do not move, but their locations shift under asymmetric potentials. This shift can be tuned by curvature and potentials. Tunable…

    Submitted 28 June, 2021; v1 submitted 23 July, 2019; originally announced July 2019.

    Comments: 10 pages and 7 figures. Published version

    Journal ref: Physics Letters A 409 (2021) 127510

  45. arXiv:1905.03802  [pdf, other]

    astro-ph.EP

    Titan's Dynamic Love Number Implies Stably-Stratified Ocean

    Authors: Jing Luan

    Abstract: The dynamic quadrupole Love number of Titan measured by Cassini is $k_\mathrm{2,obs}=0.616\pm 0.067$, strongly indicating a global subsurface ocean. However, the theoretical Love number due to equilibrium tides is at most $k_\mathrm{2,eq}^\mathrm{max}\approx 0.48$ in the absence of an ice shell on top of the ocean. In reality, there is an outer ice shell of thickness $ 100\,\mathrm{km}$, reducing…

    Submitted 9 May, 2019; originally announced May 2019.

    Comments: Comments are welcome. Submit to Icarus

  46. arXiv:1901.00578  [pdf, ps, other]

    cs.LG math.ST stat.CO stat.ML

    Prediction of multi-dimensional spatial variation data via Bayesian tensor completion

    Authors: Jiali Luan, Zheng Zhang

    Abstract: This paper presents a multi-dimensional computational method to predict the spatial variation data inside and across multiple dies of a wafer. This technique is based on tensor computation. A tensor is a high-dimensional generalization of a matrix or a vector. By exploiting the hidden low-rank property of a high-dimensional data array, the large amount of unknown variation testing data may be pred…

    Submitted 2 January, 2019; originally announced January 2019.

  47. arXiv:1712.04206  [pdf, other]

    cond-mat.mes-hall cond-mat.mtrl-sci

    $\textit{Zitterbewegung}$ near new Dirac points in graphene superlattice

    Authors: Jianli Luan, Shangyang Li, Tianxing Ma, Li-Gang Wang

    Abstract: New Dirac points appear when periodic potentials are applied to graphene, and there are many interesting effects near these new Dirac points. Here we investigate the $\textit{Zitterbewegung}$ effect of fermions described by a Gaussian wave packet in a graphene superlattice near new Dirac points. The $\textit{Zitterbewegung}$ near different Dirac points has similar characteristics, while fermions nea…

    Submitted 24 April, 2018; v1 submitted 12 December, 2017; originally announced December 2017.

    Comments: 8 pages, 11 figures

    Journal ref: J. Phys.: Condens. Matter 30 395502 (2018)

  48. DAVs: Red edge and Outbursts

    Authors: Jing Luan, Peter Goldreich

    Abstract: As established by ground-based surveys, white dwarfs with hydrogen atmospheres pulsate as they cool across the temperature range $12500\,\mathrm{K} \gtrsim T_{\mathrm{eff}} \gtrsim 10800\,\mathrm{K}$. Known as DAVs or ZZ Ceti stars, their oscillations are attributed to overstable g-modes excited by convective driving. The effective temperature at the blue edge of the instability strip is slightly…

    Submitted 2 July, 2018; v1 submitted 16 November, 2017; originally announced November 2017.

    Comments: accepted to ApJ

  49. How Cassini Can Constrain Tidal Dissipation in Saturn

    Authors: Jing Luan, Jim Fuller, Eliot Quataert

    Abstract: Tidal dissipation inside giant planets is important for the orbital evolution of their natural satellites. It is conventionally treated by parameterized equilibrium tidal theory, in which the tidal torque declines rapidly with distance, and orbital expansion was faster in the past. However, Lainey et al. (2017) find that some Saturnian satellites are currently migrating outward faster than predict…

    Submitted 28 October, 2017; v1 submitted 8 July, 2017; originally announced July 2017.

    Comments: accepted to MNRAS

  50. "Influence Sketching": Finding Influential Samples In Large-Scale Regressions

    Authors: Mike Wojnowicz, Ben Cruz, Xuan Zhao, Brian Wallace, Matt Wolff, Jay Luan, Caleb Crable

    Abstract: There is an especially strong need in modern large-scale data analysis to prioritize samples for manual inspection. For example, the inspection could target important mislabeled samples or key vulnerabilities exploitable by an adversarial attack. In order to solve the "needle in the haystack" problem of which samples to inspect, we develop a new scalable version of Cook's distance, a classical sta…

    Submitted 23 March, 2017; v1 submitted 17 November, 2016; originally announced November 2016.

    Comments: fixed additional typos

    Journal ref: Big Data (Big Data), 2016 IEEE International Conference on, pp. 3601 - 3612. IEEE, 2016