Showing 1–21 of 21 results for author: Dou, J

Searching in archive cs.

  1. arXiv:2407.00753  [pdf, other]

    eess.AS cs.SD

    FLY-TTS: Fast, Lightweight and High-Quality End-to-End Text-to-Speech Synthesis

    Authors: Yinlin Guo, Yening Lv, Jinqiao Dou, Yan Zhang, Yuehai Wang

    Abstract: While recent advances in Text-To-Speech synthesis have yielded remarkable improvements in generating high-quality speech, research on lightweight and fast models is limited. This paper introduces FLY-TTS, a new fast, lightweight and high-quality speech synthesis system based on VITS. Specifically, 1) We replace the decoder with ConvNeXt blocks that generate Fourier spectral coefficients followed b…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: Accepted to Interspeech 2024. 5 pages, 1 figure

  2. arXiv:2406.10098  [pdf, other]

    cs.LG cs.AI

    ECGMamba: Towards Efficient ECG Classification with BiSSM

    Authors: Yupeng Qiang, Xunde Dong, Xiuling Liu, Yang Yang, Yihai Fang, Jianhong Dou

    Abstract: Electrocardiogram (ECG) signal analysis represents a pivotal technique in the diagnosis of cardiovascular diseases. Although transformer-based models have made significant progress in ECG classification, they exhibit inefficiencies in the inference phase. The issue is primarily attributable to the quadratic computational complexity of the Transformer's self-attention mechanism, particularly when proce…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 6 pages, 2 figures. arXiv admin note: text overlap with arXiv:2404.17858 by other authors

  3. arXiv:2406.08009  [pdf, other]

    cs.CV cs.AI cs.RO

    OpenObj: Open-Vocabulary Object-Level Neural Radiance Fields with Fine-Grained Understanding

    Authors: Yinan Deng, Jiahui Wang, Jingyu Zhao, Jianyu Dou, Yi Yang, Yufeng Yue

    Abstract: In recent years, there has been a surge of interest in open-vocabulary 3D scene reconstruction facilitated by visual language models (VLMs), which showcase remarkable capabilities in open-set retrieval. However, existing methods face some limitations: they either focus on learning point-wise features, resulting in blurry semantic understanding, or solely tackle object-level reconstruction, thereby…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 8 pages, 7 figures. Project URL: https://openobj.github.io/

  4. arXiv:2403.10761  [pdf, other]

    cs.AI cs.LG cs.RO

    Scheduling Drone and Mobile Charger via Hybrid-Action Deep Reinforcement Learning

    Authors: Jizhe Dou, Haotian Zhang, Guodong Sun

    Abstract: Recently there has been growing interest in industry and academia in the use of wireless chargers to prolong the operational longevity of unmanned aerial vehicles (commonly known as drones). In this paper we consider a charger-assisted drone application: a drone is deployed to observe a set of points of interest, while a charger can move to recharge the drone's battery. We focus on the rou…

    Submitted 15 March, 2024; originally announced March 2024.

  5. arXiv:2308.00942  [pdf]

    physics.optics cs.LG eess.IV

    On the use of deep learning for phase recovery

    Authors: Kaiqiang Wang, Li Song, Chutian Wang, Zhenbo Ren, Guangyuan Zhao, Jiazhen Dou, Jianglei Di, George Barbastathis, Renjie Zhou, Jianlin Zhao, Edmund Y. Lam

    Abstract: Phase recovery (PR) refers to calculating the phase of the light field from its intensity measurements. As exemplified from quantitative phase imaging and coherent diffraction imaging to adaptive optics, PR is essential for reconstructing the refractive index distribution or topography of an object and correcting the aberration of an imaging system. In recent years, deep learning (DL), often imple…

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: 82 pages, 32 figures

    Journal ref: Light: Science & Applications 13, 4 (2024)

  6. arXiv:2305.02854  [pdf, other]

    cs.DS cs.DC

    Distributed Construction of Near-Optimal Compact Routing Schemes for Planar Graphs

    Authors: Jinfeng Dou, Thorsten Götte, Henning Hillebrandt, Christian Scheideler, Julian Werthmann

    Abstract: We consider the problem of computing compact routing tables for a (weighted) planar graph $G:= (V, E,w)$ in the PRAM, CONGEST, and the novel HYBRID communication model. We present algorithms with polylogarithmic work and communication that are almost optimal in all relevant parameters, i.e., computation time, table sizes, and stretch. All algorithms are heavily randomized, and all our bounds hold…

    Submitted 12 May, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

  7. arXiv:2304.06388  [pdf, other]

    cs.IT eess.SP

    How Practical Phase-shift Errors Affect Beamforming of Reconfigurable Intelligent Surface?

    Authors: Jun Yang, Yijian Chen, Yijun Cui, Qingqing Wu, Jianwu Dou, Yuxin Wang

    Abstract: Reconfigurable intelligent surface (RIS) is a new technique that can smartly manipulate the wireless environment and has been exploited to assist wireless communications, especially in high-frequency bands. However, it suffers from hardware impairments (HWIs) in practical designs, which inevitably degrade its performance and thus limit its full potential. To address this practical…

    Submitted 13 April, 2023; originally announced April 2023.

  8. arXiv:2304.03708  [pdf, other]

    eess.IV cs.CV

    Efficient automatic segmentation for multi-level pulmonary arteries: The PARSE challenge

    Authors: Gongning Luo, Kuanquan Wang, Jun Liu, Shuo Li, Xinjie Liang, Xiangyu Li, Shaowei Gan, Wei Wang, Suyu Dong, Wenyi Wang, Pengxin Yu, Enyou Liu, Hongrong Wei, Na Wang, Jia Guo, Huiqi Li, Zhao Zhang, Ziwei Zhao, Na Gao, Nan An, Ashkan Pakzad, Bojidar Rangelov, Jiaqi Dou, Song Tian, Zeyu Liu, et al. (5 additional authors not shown)

    Abstract: Efficient automatic segmentation of multi-level (i.e. main and branch) pulmonary arteries (PA) in CTPA images plays a significant role in clinical applications. However, most existing methods concentrate only on main PA or branch PA segmentation separately and ignore segmentation efficiency. Besides, there is no public large-scale dataset focused on PA segmentation, which makes it highly challengi…

    Submitted 9 August, 2024; v1 submitted 7 April, 2023; originally announced April 2023.

  9. arXiv:2212.05024  [pdf, other]

    cs.LG

    Decomposable Sparse Tensor on Tensor Regression

    Authors: Haiyi Mao, Jason Xiaotian Dou

    Abstract: Most regularized tensor regression research focuses on tensor predictors with scalar responses or vector predictors with tensor responses. We consider sparse low-rank tensor-on-tensor regression, where predictors $\mathcal{X}$ and responses $\mathcal{Y}$ are both high-dimensional tensors. By demonstrating that the general inner product or the contracted product on a unit rank tensor can be de…

    Submitted 14 December, 2022; v1 submitted 9 December, 2022; originally announced December 2022.

  10. arXiv:2210.02284  [pdf, other]

    cs.CL

    Unsupervised Sentence Textual Similarity with Compositional Phrase Semantics

    Authors: Zihao Wang, Jiaheng Dou, Yong Zhang

    Abstract: Measuring Sentence Textual Similarity (STS) is a classic task that can be applied to many downstream NLP applications such as text generation and retrieval. In this paper, we focus on unsupervised STS that works on various domains but only requires minimal data and computational resources. Theoretically, we propose a light-weighted Expectation-Correction (EC) formulation for STS computation. EC fo…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: COLING 2022; Github repository https://github.com/zihao-wang/rots ; Partially overlapped with arXiv:2002.00745 ; 20 pages, 5 figures, 17 tables

  11. Sampling Through the Lens of Sequential Decision Making

    Authors: Jason Xiaotian Dou, Alvin Qingkai Pan, Runxue Bao, Haiyi Harry Mao, Lei Luo, Zhi-Hong Mao

    Abstract: Sampling is ubiquitous in machine learning methodologies. Due to the growth of large datasets and model complexity, we want to learn and adapt the sampling process while training a representation. Towards achieving this grand goal, a variety of sampling techniques have been proposed. However, most of them either use a fixed sampling scheme or adjust the sampling scheme based on simple heuristics.…

    Submitted 13 December, 2022; v1 submitted 17 August, 2022; originally announced August 2022.

  12. arXiv:2207.07734  [pdf, other]

    q-bio.GN cs.AI cs.GL

    COEM: Cross-Modal Embedding for MetaCell Identification

    Authors: Haiyi Mao, Minxue Jia, Jason Xiaotian Dou, Haotian Zhang, Panayiotis V. Benos

    Abstract: Metacells are disjoint and homogeneous groups of single-cell profiles, representing discrete and highly granular cell states. Existing metacell algorithms tend to use only one modality to infer metacells, even though single-cell multi-omics datasets profile multiple molecular modalities within the same cell. Here, we present Cross-MOdal Embedding for MetaCell Id…

    Submitted 24 July, 2022; v1 submitted 15 July, 2022; originally announced July 2022.

    Comments: 5 pages, 2 figures, ICML workshop on computational biology

  13. arXiv:2204.00298  [pdf, other]

    cs.CV

    Unitail: Detecting, Reading, and Matching in Retail Scene

    Authors: Fangyi Chen, Han Zhang, Zaiwang Li, Jiachen Dou, Shentong Mo, Hao Chen, Yongxin Zhang, Uzair Ahmed, Chenchen Zhu, Marios Savvides

    Abstract: To make full use of computer vision technology in stores, it is required to consider the actual needs that fit the characteristics of the retail scene. Pursuing this goal, we introduce the United Retail Datasets (Unitail), a large-scale benchmark of basic visual tasks on products that challenges algorithms for detecting, reading, and matching. With 1.8M quadrilateral-shaped instances annotated, th…

    Submitted 20 July, 2022; v1 submitted 1 April, 2022; originally announced April 2022.

    Comments: ECCV 2022

  14. arXiv:2112.11730  [pdf, other]

    cs.HC

    GUX-Analyzer: A Deep Multi-modal Analyzer Via Motivational Flow For Game User Experience

    Authors: Zhitao Liu, Ning Xie, Guobiao Yang, Jiale Dou, Lanxiao Huang, Guang Yang, Lin Yuan

    Abstract: Quantitative analysis of Game User eXperience (GUX) is important to the game industry. Different from the typical questionnaire analysis, this paper focuses on the computational analysis of GUX. We aim to analyze the relationship between game and players using the multi-modal data including physiological data and game process data. We theoretically extend the Flow model from the classic skill-and-…

    Submitted 22 December, 2021; originally announced December 2021.

  15. arXiv:2106.10493  [pdf, other]

    cs.CV

    CenterAtt: Fast 2-stage Center Attention Network

    Authors: Jianyun Xu, Xin Tang, Jian Dou, Xu Shu, Yushi Zhu

    Abstract: In this technical report, we introduce the methods of HIKVISION_LiDAR_Det in the Waymo Open Dataset real-time 3D detection challenge. Our solution for the competition is built upon the CenterPoint 3D detection framework. Several variants of CenterPoint are explored, including a center attention head and a feature pyramid network neck. In order to achieve real-time detection, methods like batchnorm mer…

    Submitted 19 June, 2021; originally announced June 2021.

  16. arXiv:2103.12978  [pdf, other]

    cs.CV

    RPVNet: A Deep and Efficient Range-Point-Voxel Fusion Network for LiDAR Point Cloud Segmentation

    Authors: Jianyun Xu, Ruixiang Zhang, Jian Dou, Yushi Zhu, Jie Sun, Shiliang Pu

    Abstract: Point clouds can be represented in many forms (views), typically point-based sets, voxel-based cells, or range-based images (i.e., panoramic view). The point-based view is geometrically accurate, but it is disordered, which makes it difficult to find local neighbors efficiently. The voxel-based view is regular, but sparse, and computation grows cubically when voxel resolution increases. The range-b…

    Submitted 24 March, 2021; originally announced March 2021.

  17. arXiv:1903.06405  [pdf, other]

    cs.CV cs.RO

    BLVD: Building A Large-scale 5D Semantics Benchmark for Autonomous Driving

    Authors: Jianru Xue, Jianwu Fang, Tao Li, Bohua Zhang, Pu Zhang, Zhen Ye, Jian Dou

    Abstract: In the autonomous driving community, numerous benchmarks have been established to assist the tasks of 3D/2D object detection, stereo vision, and semantic/instance segmentation. However, the more meaningful dynamic evolution of the objects surrounding the ego-vehicle is rarely exploited, and a large-scale dataset platform for it is lacking. To address this, we introduce BLVD, a large-scale 5D semantics benchmark which…

    Submitted 15 March, 2019; originally announced March 2019.

    Comments: To appear in ICRA2019

  18. arXiv:1711.04618  [pdf]

    cs.CY

    Impartial redistricting: a Markov chain approach to the "Gerrymandering problem"

    Authors: Jason Dou

    Abstract: After every U.S. national census, a state legislature is required to redraw the boundaries of congressional districts in order to account for changes in population. At the moment this is done in a highly partisan way, with districting done in order to maximize the benefits to the party in power. This is a threat to U.S. democracy. There have been proposals to take the re-districting out of the ha…

    Submitted 30 October, 2017; originally announced November 2017.

    Comments: Bachelor's thesis, Beijing Univ (2014)

  19. arXiv:1710.00273  [pdf]

    cs.CL

    What Words Do We Use to Lie?: Word Choice in Deceptive Messages

    Authors: Jason Xiaotian Dou, Michelle Liu, Haaris Muneer, Adam Schlussel

    Abstract: Text messaging is the most widely used form of computer-mediated communication (CMC). Previous findings have shown that linguistic factors can reliably indicate messages as deceptive. For example, users take longer and use more words to craft deceptive messages than they do truthful messages. Existing research has also examined how factors, such as student status and gender, affect rates of decept…

    Submitted 1 August, 2022; v1 submitted 30 September, 2017; originally announced October 2017.

  20. arXiv:1602.01428  [pdf]

    cs.CL cs.IR

    "Draw My Topics": Find Desired Topics fast from large scale of Corpus

    Authors: Jason Dou, Ni Sun, Xiaojun Zou

    Abstract: We develop the "Draw My Topics" toolkit, which provides a fast way to incorporate social scientists' interests into standard topic modelling. Instead of using a raw corpus with primitive processing as input, an algorithm based on the Vector Space Model and Conditional Entropy is used to connect social scientists' intentions with the output of unsupervised topic models. Space for users' adjustment on specific…

    Submitted 3 February, 2016; originally announced February 2016.

  21. arXiv:1510.03247   

    cs.CY

    Impartial Redistricting: A Markov Chain Approach

    Authors: Lucy Chenyun Wu, Jason Xiaotian Dou, Danny Sleator, Alan Frieze, David Miller

    Abstract: Gerrymandering is a worldwide problem that poses a great threat to democracy and justice in district-based elections. Thanks to partisan redistricting commissions, district boundaries are often manipulated to benefit incumbents. Since an independent commission is hard to come by, the possibility of impartially generating districts with a computer is explored in this thesis. We have devel…

    Submitted 13 October, 2015; v1 submitted 12 October, 2015; originally announced October 2015.

    Comments: about authorship naming problem, will fix soon