
Showing 1–50 of 74 results for author: Bai, Q

Searching in archive cs.
  1. arXiv:2406.11481  [pdf, other]

    cs.LG cs.AI

    Constrained Reinforcement Learning with Average Reward Objective: Model-Based and Model-Free Algorithms

    Authors: Vaneet Aggarwal, Washim Uddin Mondal, Qinbo Bai

    Abstract: Reinforcement Learning (RL) serves as a versatile framework for sequential decision-making, finding applications across diverse domains such as robotics, autonomous driving, recommendation systems, supply chain optimization, biology, mechanics, and finance. The primary objective in these applications is to maximize the average reward. Real-world scenarios often necessitate adherence to specific co…

    Submitted 21 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.02042; text overlap with arXiv:2202.00150 by other authors

  2. arXiv:2406.10367  [pdf, other]

    cs.LG

    Disentangled Hyperbolic Representation Learning for Heterogeneous Graphs

    Authors: Qijie Bai, Changli Nie, Haiwei Zhang, Zhicheng Dou, Xiaojie Yuan

    Abstract: Heterogeneous graphs have attracted a lot of research interest recently due to their success in representing complex real-world systems. However, existing methods have two pain points in embedding them into low-dimensional spaces: the mixing of structural and semantic information, and the distributional mismatch between data and embedding spaces. These two challenges require representation methods…

    Submitted 14 June, 2024; originally announced June 2024.

  3. arXiv:2406.05551  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    Autoregressive Diffusion Transformer for Text-to-Speech Synthesis

    Authors: Zhijun Liu, Shuai Wang, Sho Inoue, Qibing Bai, Haizhou Li

    Abstract: Audio language models have recently emerged as a promising approach for various audio generation tasks, relying on audio tokenizers to encode waveforms into sequences of discrete symbols. Audio tokenization often poses a necessary compromise between code bitrate and reconstruction accuracy. When dealing with low-bitrate audio codes, language models are constrained to process only a subset of the i…

    Submitted 8 June, 2024; originally announced June 2024.

  4. arXiv:2406.04679  [pdf, other]

    eess.IV cs.CV

    XctDiff: Reconstruction of CT Images with Consistent Anatomical Structures from a Single Radiographic Projection Image

    Authors: Qingze Bai, Tiange Liu, Zhi Liu, Yubing Tong, Drew Torigian, Jayaram Udupa

    Abstract: In this paper, we present XctDiff, an algorithm framework for reconstructing CT from a single radiograph, which decomposes the reconstruction process into two easily controllable tasks: feature extraction and CT reconstruction. Specifically, we first design a progressive feature extraction strategy that is able to extract robust 3D priors from radiographs. Then, we use the extracted prior informat…

    Submitted 13 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

  5. arXiv:2404.11869  [pdf, other]

    cs.LG cs.SI

    Node-like as a Whole: Structure-aware Searching and Coarsening for Graph Classification

    Authors: Xiaorui Qi, Qijie Bai, Yanlong Wen, Haiwei Zhang, Xiaojie Yuan

    Abstract: Graph Transformers (GTs) have made remarkable achievements in graph-level tasks. However, most existing works regard graph structures as a form of guidance or bias for enhancing node representations, which focuses on node-central perspectives and lacks explicit representations of edges and structures. One natural question is, can we treat graph structures node-like as a whole to learn high-level f…

    Submitted 24 June, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: 22 pages

  6. arXiv:2404.04906  [pdf, other]

    cs.HC cs.IR

    Balancing Information Perception with Yin-Yang: Agent-Based Information Neutrality Model for Recommendation Systems

    Authors: Mengyan Wang, Yuxuan Hu, Shiqing Wu, Weihua Li, Quan Bai, Verica Rupar

    Abstract: While preference-based recommendation algorithms effectively enhance user engagement by recommending personalized content, they often result in the creation of "filter bubbles". These bubbles restrict the range of information users interact with, inadvertently reinforcing their existing viewpoints. Previous research has focused on modifying these underlying algorithms to tackle this issue. Yet,…

    Submitted 7 April, 2024; originally announced April 2024.

  7. arXiv:2402.15525  [pdf, other]

    cs.CL cs.CY

    Detecting misinformation through Framing Theory: the Frame Element-based Model

    Authors: Guan Wang, Rebecca Frederick, Jinglong Duan, William Wong, Verica Rupar, Weihua Li, Quan Bai

    Abstract: In this paper, we delve into the rapidly evolving challenge of misinformation detection, with a specific focus on the nuanced manipulation of narrative frames - an under-explored area within the AI community. The potential for Generative AI models to generate misleading narratives underscores the urgency of this problem. Drawing from communication and framing theories, we posit that the presentati…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 17 pages, 9 figures, 7 tables

  8. arXiv:2402.15289  [pdf, other]

    cs.CL cs.LG

    Let's Rectify Step by Step: Improving Aspect-based Sentiment Analysis with Diffusion Models

    Authors: Shunyu Liu, Jie Zhou, Qunxi Zhu, Qin Chen, Qingchun Bai, Jun Xiao, Liang He

    Abstract: Aspect-Based Sentiment Analysis (ABSA) stands as a crucial task in predicting the sentiment polarity associated with identified aspects within text. However, a notable challenge in ABSA lies in precisely determining the aspects' boundaries (start and end indices), especially for long ones, due to users' colloquial expressions. We propose DiffusionABSA, a novel diffusion model tailored for ABSA, wh…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted to LREC-COLING 2024, submission version

  9. arXiv:2402.14000  [pdf, other]

    cs.CV

    Real-time 3D-aware Portrait Editing from a Single Image

    Authors: Qingyan Bai, Zifan Shi, Yinghao Xu, Hao Ouyang, Qiuyu Wang, Ceyuan Yang, Xuan Wang, Gordon Wetzstein, Yujun Shen, Qifeng Chen

    Abstract: This work presents 3DPE, a practical method that can efficiently edit a face image following given prompts, like reference images or text descriptions, in a 3D-aware manner. To this end, a lightweight module is distilled from a 3D portrait generator and a text-to-image model, which provide prior knowledge of face geometry and superior editing capability, respectively. Such a design brings two comp…

    Submitted 2 April, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  10. arXiv:2402.02042  [pdf, ps, other]

    cs.LG cs.AI

    Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm

    Authors: Qinbo Bai, Washim Uddin Mondal, Vaneet Aggarwal

    Abstract: This paper explores the realm of infinite horizon average reward Constrained Markov Decision Processes (CMDP). To the best of our knowledge, this work is the first to delve into the regret and constraint violation analysis of average reward CMDPs with a general policy parametrization. To address this challenge, we propose a primal dual based policy gradient algorithm that adeptly manages the const…

    Submitted 3 March, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

    Comments: We fixed Lemma 6 in v2 which changed the final result

  11. arXiv:2310.04342  [pdf, other]

    cs.DB cs.NI

    Minerva: Decentralized Collaborative Query Processing over InterPlanetary File System

    Authors: Zhiyi Yao, Bowen Ding, Qianlan Bai, Yuedong Xu

    Abstract: Data silos create barriers in accessing and utilizing data dispersed over networks. Directly sharing data easily suffers from the long downloading time, the single point failure and the untraceable data usage. In this paper, we present Minerva, a peer-to-peer cross-cluster data query system based on InterPlanetary File System (IPFS). Minerva makes use of the distributed Hash table (DHT) lookup to…

    Submitted 8 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

  12. arXiv:2309.11730  [pdf, other]

    eess.AS cs.SD

    Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition

    Authors: Shuai Wang, Qibing Bai, Qi Liu, Jianwei Yu, Zhengyang Chen, Bing Han, Yanmin Qian, Haizhou Li

    Abstract: Current speaker recognition systems primarily rely on supervised approaches, constrained by the scale of labeled datasets. To boost the system performance, researchers leverage large pretrained models such as WavLM to transfer learned high-level features to the downstream speaker recognition task. However, this approach introduces extra parameters as the pretrained model remains in the inference s…

    Submitted 26 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: submitted to ICASSP 2024

  13. arXiv:2309.01922  [pdf, ps, other]

    cs.LG cs.AI

    Regret Analysis of Policy Gradient Algorithm for Infinite Horizon Average Reward Markov Decision Processes

    Authors: Qinbo Bai, Washim Uddin Mondal, Vaneet Aggarwal

    Abstract: In this paper, we consider an infinite horizon average reward Markov Decision Process (MDP). Distinguishing itself from existing works within this context, our approach harnesses the power of the general policy gradient-based algorithm, liberating it from the constraints of assuming a linear MDP structure. We propose a policy gradient-based algorithm and show its global convergence property. We th…

    Submitted 2 February, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

    Journal ref: AAAI 2024

  14. arXiv:2308.07926  [pdf, other]

    cs.CV

    CoDeF: Content Deformation Fields for Temporally Consistent Video Processing

    Authors: Hao Ouyang, Qiuyu Wang, Yuxi Xiao, Qingyan Bai, Juntao Zhang, Kecheng Zheng, Xiaowei Zhou, Qifeng Chen, Yujun Shen

    Abstract: We present the content deformation field CoDeF as a new type of video representation, which consists of a canonical content field aggregating the static contents in the entire video and a temporal deformation field recording the transformations from the canonical image (i.e., rendered from the canonical content field) to each individual frame along the time axis. Given a target video, these two fie…

    Submitted 15 August, 2023; originally announced August 2023.

    Comments: Project Webpage: https://qiuyu96.github.io/CoDeF/, Code: https://github.com/qiuyu96/CoDeF

  15. arXiv:2307.02797  [pdf, other]

    cs.IR cs.AI

    BHEISR: Nudging from Bias to Balance -- Promoting Belief Harmony by Eliminating Ideological Segregation in Knowledge-based Recommendations

    Authors: Mengyan Wang, Yuxuan Hu, Zihan Yuan, Chenting Jiang, Weihua Li, Shiqing Wu, Quan Bai

    Abstract: In the realm of personalized recommendation systems, the increasing concern is the amplification of belief imbalance and user biases, a phenomenon primarily attributed to the filter bubble. Addressing this critical issue, we introduce an innovative intermediate agency (BHEISR) between users and existing recommendation systems to attenuate the negative repercussions of the filter bubble effect in e…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: 26 pages

    MSC Class: 68T07 ACM Class: I.2.6; I.2.7

  16. arXiv:2306.05537  [pdf, other]

    cs.CL

    AaKOS: Aspect-adaptive Knowledge-based Opinion Summarization

    Authors: Guan Wang, Weihua Li, Edmund M-K. Lai, Quan Bai

    Abstract: The rapid growth of information on the Internet has led to an overwhelming amount of opinions and comments on various activities, products, and services. This makes it difficult and time-consuming for users to process all the available information when making decisions. Text summarization, a Natural Language Processing (NLP) task, has been widely explored to help users quickly retrieve relevant in…

    Submitted 25 May, 2023; originally announced June 2023.

    Comments: 21 pages, 4 figures, 7 tables

  17. arXiv:2305.08272  [pdf, other]

    cs.DB

    QueryBooster: Improving SQL Performance Using Middleware Services for Human-Centered Query Rewriting

    Authors: Qiushi Bai, Sadeem Alsudais, Chen Li

    Abstract: SQL query performance is critical in database applications, and query rewriting is a technique that transforms an original query into an equivalent query with a better performance. In a wide range of database-supported systems, there is a unique problem where both the application and database layer are black boxes, and the developers need to use their knowledge about the data and domain to rewrite…

    Submitted 14 May, 2023; originally announced May 2023.

  18. HGWaveNet: A Hyperbolic Graph Neural Network for Temporal Link Prediction

    Authors: Qijie Bai, Changli Nie, Haiwei Zhang, Dongming Zhao, Xiaojie Yuan

    Abstract: Temporal link prediction, aiming to predict future edges between paired nodes in a dynamic graph, is of vital importance in diverse applications. However, existing methods are mainly built upon uniform Euclidean space, which has been found to conflict with the power-law distributions of real-world graphs and unable to represent the hierarchical connections between nodes effectively. With respec…

    Submitted 3 May, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

    Comments: Accepted by Web Conference (WWW) 2023

    Journal ref: WWW '23: Proceedings of the ACM Web Conference 2023 (523-532)

  19. $\text{H}^2\text{TNE}$: Temporal Heterogeneous Information Network Embedding in Hyperbolic Spaces

    Authors: Qijie Bai, Jiawen Guo, Haiwei Zhang, Changli Nie, Lin Zhang, Xiaojie Yuan

    Abstract: Temporal heterogeneous information network (temporal HIN) embedding, aiming to represent various types of nodes of different timestamps into low dimensional spaces while preserving structural and semantic information, is of vital importance in diverse real-life tasks. Researchers have made great efforts on temporal HIN embedding in Euclidean spaces and got some considerable achievements. However,…

    Submitted 14 June, 2024; v1 submitted 14 April, 2023; originally announced April 2023.

    Journal ref: The Semantic Web-ISWC 2022: 21st International Semantic Web Conference, Virtual Event, October 23-27, 2022, Proceedings (pp. 179-195)

  20. arXiv:2304.01999  [pdf, other]

    cs.CV

    Revisiting the Evaluation of Image Synthesis with GANs

    Authors: Mengping Yang, Ceyuan Yang, Yichi Zhang, Qingyan Bai, Yujun Shen, Bo Dai

    Abstract: A good metric, which promises a reliable comparison between solutions, is essential for any well-defined task. Unlike most vision tasks that have per-sample ground-truth, image synthesis tasks target generating unseen data and hence are usually evaluated through a distributional distance between one set of real samples and another set of generated samples. This study presents an empirical investig…

    Submitted 23 October, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023 datasets and benchmarks track

  21. arXiv:2303.00815  [pdf, other]

    cs.CL cs.AI

    Soft Prompt Guided Joint Learning for Cross-Domain Sentiment Analysis

    Authors: Jingli Shi, Weihua Li, Quan Bai, Yi Yang, Jianhua Jiang

    Abstract: Aspect term extraction is a fundamental task in fine-grained sentiment analysis, which aims at detecting customer's opinion targets from reviews on product or service. The traditional supervised models can achieve promising results with annotated datasets, however, the performance dramatically decreases when they are applied to the task of cross-domain aspect term extraction. Existing cross-domain…

    Submitted 1 March, 2023; originally announced March 2023.

    Comments: 22 pages

  22. arXiv:2302.08505  [pdf, other]

    cs.CV cs.AI

    Rapid-Motion-Track: Markerless Tracking of Fast Human Motion with Deeper Learning

    Authors: Renjie Li, Chun Yu Lao, Rebecca St. George, Katherine Lawler, Saurabh Garg, Son N. Tran, Quan Bai, Jane Alty

    Abstract: Objective: The coordination of human movement directly reflects function of the central nervous system. Small deficits in movement are often the first sign of an underlying neurological problem. The objective of this research is to develop a new end-to-end, deep learning-based system, Rapid-Motion-Track (RMT) that can track the fastest human movement accurately when webcams or laptop cameras are us…

    Submitted 18 January, 2023; originally announced February 2023.

  23. arXiv:2302.01443  [pdf, other]

    cs.AI

    DOR: A Novel Dual-Observation-Based Approach for News Recommendation Systems

    Authors: Mengyan Wang, Weihua Li, Jingli Shi, Shiqing Wu, Quan Bai

    Abstract: Online social media platforms offer access to a vast amount of information, but sifting through the abundance of news can be overwhelming and tiring for readers. Personalised recommendation algorithms can help users find information that interests them. However, most existing models rely solely on observations of user behaviour, such as viewing history, ignoring the connections between the news an…

    Submitted 2 February, 2023; originally announced February 2023.

    MSC Class: 68T07

  24. arXiv:2212.03752  [pdf, other]

    cs.CV eess.IV

    GLeaD: Improving GANs with A Generator-Leading Task

    Authors: Qingyan Bai, Ceyuan Yang, Yinghao Xu, Xihui Liu, Yujiu Yang, Yujun Shen

    Abstract: Generative adversarial network (GAN) is formulated as a two-player game between a generator (G) and a discriminator (D), where D is asked to differentiate whether an image comes from real data or is produced by G. Under such a formulation, D plays as the rule maker and hence tends to dominate the competition. Towards a fairer game in GANs, we propose a new paradigm for adversarial training, which…

    Submitted 6 June, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

    Comments: CVPR2023. Project page: https://ezioby.github.io/glead/ Code: https://github.com/EzioBy/glead/

  25. arXiv:2212.00007  [pdf, other]

    cs.HC cs.AI cs.LG

    A Light-weight, Effective and Efficient Model for Label Aggregation in Crowdsourcing

    Authors: Yi Yang, Zhong-Qiu Zhao, Quan Bai, Qing Liu, Weihua Li

    Abstract: Due to the noises in crowdsourced labels, label aggregation (LA) has emerged as a standard procedure to post-process crowdsourced labels. LA methods estimate true labels from crowdsourced labels by modeling worker qualities. Most existing LA methods are iterative in nature. They need to traverse all the crowdsourced labels multiple times in order to jointly and iteratively update true labels and w…

    Submitted 19 November, 2022; originally announced December 2022.

  26. arXiv:2211.15956  [pdf, other]

    cs.LG cs.AI

    Offline Reinforcement Learning with Closed-Form Policy Improvement Operators

    Authors: Jiachen Li, Edwin Zhang, Ming Yin, Qinxun Bai, Yu-Xiang Wang, William Yang Wang

    Abstract: Behavior constrained policy optimization has been demonstrated to be a successful paradigm for tackling Offline Reinforcement Learning. By exploiting historical transitions, a policy is trained to maximize a learned value function while constrained by the behavior policy to avoid a significant distributional shift. In this paper, we propose our closed-form policy improvement operators. We make a n…

    Submitted 22 July, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: Accepted at ICML 2023

  27. arXiv:2211.00185  [pdf, other]

    cs.LG cs.AI cs.CV

    Hybrid CNN-Interpreter: Interpret local and global contexts for CNN-based Models

    Authors: Wenli Yang, Guan Huang, Renjie Li, Jiahao Yu, Yanyu Chen, Quan Bai, Beyong Kang

    Abstract: Convolutional neural network (CNN) models have seen advanced improvements in performance in various domains, but lack of interpretability is a major barrier to assurance and regulation during operation for acceptance and deployment of AI-assisted applications. There have been many works on input interpretability focusing on analyzing the input-output relations, but the internal logic of models has…

    Submitted 31 October, 2022; originally announced November 2022.

  28. arXiv:2210.09549  [pdf, other]

    cs.CV cs.LG

    Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image Generation

    Authors: Ruijun Li, Weihua Li, Yi Yang, Hanyu Wei, Jianhua Jiang, Quan Bai

    Abstract: Recently, diffusion models have been proven to perform remarkably well in text-to-image synthesis tasks in a number of studies, immediately presenting new study opportunities for image generation. Google's Imagen follows this research trend and outperforms DALLE2 as the best model for text-to-image generation. However, Imagen merely uses a T5 language model for text processing, which cannot ensure…

    Submitted 17 October, 2022; originally announced October 2022.

    MSC Class: 94A08 ACM Class: I.4.0

  29. arXiv:2208.02189  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    A Study of Modeling Rising Intonation in Cantonese Neural Speech Synthesis

    Authors: Qibing Bai, Tom Ko, Yu Zhang

    Abstract: In human speech, the attitude of a speaker cannot be fully expressed only by the textual content. It has to come along with the intonation. Declarative questions are commonly used in daily Cantonese conversations, and they are usually uttered with rising intonation. Vanilla neural text-to-speech (TTS) systems are not capable of synthesizing rising intonation for these sentences due to the loss of…

    Submitted 3 August, 2022; originally announced August 2022.

    Comments: Accepted by INTERSPEECH 2022

  30. arXiv:2207.02376  [pdf, other]

    cs.CV cs.AI

    A Comprehensive Review on Deep Supervision: Theories and Applications

    Authors: Renjie Li, Xinyi Wang, Guan Huang, Wenli Yang, Kaining Zhang, Xiaotong Gu, Son N. Tran, Saurabh Garg, Jane Alty, Quan Bai

    Abstract: Deep supervision, also known as 'intermediate supervision' or 'auxiliary supervision', is to add supervision at hidden layers of a neural network. This technique has been increasingly applied in deep neural network learning systems for various computer vision applications recently. There is a consensus that deep supervision helps improve neural network performance by alleviating the gradient vanishi…

    Submitted 5 July, 2022; originally announced July 2022.

  31. arXiv:2206.05850  [pdf, other]

    cs.LG cs.AI eess.SY

    Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Conservative Natural Policy Gradient Primal-Dual Algorithm

    Authors: Qinbo Bai, Amrit Singh Bedi, Vaneet Aggarwal

    Abstract: We consider the problem of constrained Markov decision process (CMDP) in continuous state-actions spaces where the goal is to maximize the expected cumulative reward subject to some constraints. We propose a novel Conservative Natural Policy Gradient Primal-Dual Algorithm (C-NPG-PD) to achieve zero constraint violation while achieving state of the art convergence results for the objective value fu…

    Submitted 16 May, 2024; v1 submitted 12 June, 2022; originally announced June 2022.

    Comments: The latest version fixed the error in the proof of Lemma 4 in AAAI2023

  32. arXiv:2205.15514  [pdf, other]

    cs.CL

    A Knowledge-Enhanced Adversarial Model for Cross-lingual Structured Sentiment Analysis

    Authors: Qi Zhang, Jie Zhou, Qin Chen, Qingchun Bai, Jun Xiao, Liang He

    Abstract: Structured sentiment analysis, which aims to extract the complex semantic structures such as holders, expressions, targets, and polarities, has obtained widespread attention from both industry and academia. Unfortunately, the existing structured sentiment analysis datasets refer to a few languages and are relatively small, limiting neural network models' performance. In this paper, we focus on the…

    Submitted 30 May, 2022; originally announced May 2022.

  33. Enhancing Event-Level Sentiment Analysis with Structured Arguments

    Authors: Qi Zhang, Jie Zhou, Qin Chen, Qinchun Bai, Liang He

    Abstract: Previous studies about event-level sentiment analysis (SA) usually model the event as a topic, a category or target terms, while the structured arguments (e.g., subject, object, time and location) that have potential effects on the sentiment are not well studied. In this paper, we redefine the task as structured event-level SA and propose an End-to-End Event-level Sentiment Analysis (…

    Submitted 30 May, 2022; originally announced May 2022.

  34. arXiv:2205.09048  [pdf, other]

    eess.IV cs.CV

    Global Contrast Masked Autoencoders Are Powerful Pathological Representation Learners

    Authors: Hao Quan, Xingyu Li, Weixing Chen, Qun Bai, Mingchen Zou, Ruijie Yang, Tingting Zheng, Ruiqun Qi, Xinghua Gao, Xiaoyu Cui

    Abstract: Based on digital pathology slice scanning technology, artificial intelligence algorithms represented by deep learning have achieved remarkable results in the field of computational pathology. Compared to other medical images, pathology images are more difficult to annotate, and thus, there is an extreme lack of available datasets for conducting supervised learning to train robust deep learning mod…

    Submitted 15 November, 2023; v1 submitted 18 May, 2022; originally announced May 2022.

  35. arXiv:2205.08993  [pdf, other]

    cs.CL eess.AS

    Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation

    Authors: Qianqian Dong, Fengpeng Yue, Tom Ko, Mingxuan Wang, Qibing Bai, Yu Zhang

    Abstract: Direct Speech-to-speech translation (S2ST) has drawn more and more attention recently. The task is very challenging due to data scarcity and complex speech-to-speech mapping. In this paper, we report our recent achievements in S2ST. Firstly, we build a S2ST Transformer baseline which outperforms the original Translatotron. Secondly, we utilize the external data by pseudo-labeling and obtain a new…

    Submitted 18 May, 2022; originally announced May 2022.

    Comments: Submitted to INTERSPEECH 2022

  36. arXiv:2205.02850  [pdf]

    eess.IV cs.AI cs.CV

    A Deep Reinforcement Learning Framework for Rapid Diagnosis of Whole Slide Pathological Images

    Authors: Tingting Zheng, Weixing Chen, Shuqin Li, Hao Quan, Qun Bai, Tianhang Nan, Song Zheng, Xinghua Gao, Yue Zhao, Xiaoyu Cui

    Abstract: The deep neural network is a research hotspot for histopathological image analysis, which can improve the efficiency and accuracy of diagnosis for pathologists or be used for disease screening. The whole slide pathological image can reach one gigapixel and contains abundant tissue feature information, which needs to be divided into a lot of patches in the training and inference stages. This will l…

    Submitted 5 May, 2022; originally announced May 2022.

  37. arXiv:2203.15610  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

    Authors: Rui Wang, Qibing Bai, Junyi Ao, Long Zhou, Zhixiang Xiong, Zhihua Wei, Yu Zhang, Tom Ko, Haizhou Li

    Abstract: Self-supervised speech representation learning has shown promising results in various speech processing tasks. However, the pre-trained models, e.g., HuBERT, are storage-intensive Transformers, limiting their scope of applications under low-resource settings. To this end, we propose LightHuBERT, a once-for-all Transformer compression framework, to find the desired architectures automatically by pr…

    Submitted 18 June, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: 5 pages, 2 figures, accepted to Interspeech 2022

  38. arXiv:2203.11105  [pdf, other]

    cs.CV

    High-fidelity GAN Inversion with Padding Space

    Authors: Qingyan Bai, Yinghao Xu, Jiapeng Zhu, Weihao Xia, Yujiu Yang, Yujun Shen

    Abstract: Inverting a Generative Adversarial Network (GAN) facilitates a wide range of image editing tasks using pre-trained generators. Existing methods typically employ the latent space of GANs as the inversion space yet observe the insufficient recovery of spatial details. In this work, we propose to involve the padding space of the generator to complement the latent space with spatial information. Concr…

    Submitted 27 July, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

    Comments: ECCV 2022 camera-ready; Project page: https://ezioby.github.io/padinv/; Code: https://github.com/EzioBy/padinv

  39. GAC: A Deep Reinforcement Learning Model Toward User Incentivization in Unknown Social Networks

    Authors: Shiqing Wu, Weihua Li, Quan Bai

    Abstract: In recent years, many applications have deployed incentive mechanisms to promote users' attention and engagement. Most incentive mechanisms determine specific incentive values based on users' attributes (e.g., preferences), while such information is unavailable in many real-world applications. Meanwhile, due to budget restrictions, realizing successful incentivization for all users can be challeng…

    Submitted 13 November, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

    Comments: Accepted by Knowledge-Based Systems

  40. arXiv:2203.04036  [pdf, other]

    cs.CV

    StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN

    Authors: Fei Yin, Yong Zhang, Xiaodong Cun, Mingdeng Cao, Yanbo Fan, Xuan Wang, Qingyan Bai, Baoyuan Wu, Jue Wang, Yujiu Yang

    Abstract: One-shot talking face generation aims at synthesizing a high-quality talking face video from an arbitrary portrait image, driven by a video or an audio segment. One challenging quality factor is the resolution of the output video: higher resolution conveys more details. In this work, we investigate the latent feature space of a pre-trained StyleGAN and discover some excellent spatial transformatio…

    Submitted 16 March, 2022; v1 submitted 8 March, 2022; originally announced March 2022.

    Comments: Project Page is at http://feiiyin.github.io/StyleHEAT/

  41. arXiv:2202.06232  [pdf, other]

    cs.LG

    A Geometric Understanding of Natural Gradient

    Authors: Qinxun Bai, Steven Rosenberg, Wei Xu

    Abstract: While natural gradients have been widely studied from both theoretical and empirical perspectives, we argue that some fundamental theoretical issues regarding the existence of gradients in infinite dimensional function spaces remain underexplored. We address these issues by providing a geometric perspective and mathematical framework for studying natural gradient that is more complete and rigorous…

    Submitted 23 May, 2022; v1 submitted 13 February, 2022; originally announced February 2022.

  42. arXiv:2112.10454  [pdf, other]

    cs.CR cs.GT

    Blockchain Mining with Multiple Selfish Miners

    Authors: Qianlan Bai, Yuedong Xu, Nianyi Liu, Xin Wang

    Abstract: This paper studies a fundamental problem regarding the security of blockchain PoW consensus on how the existence of multiple misbehaving miners influences the profitability of selfish mining. Each selfish miner (or attacker interchangeably) maintains a private chain and makes it public opportunistically for acquiring more rewards incommensurate to his Hash power. We first establish a general Marko…

    Submitted 31 March, 2023; v1 submitted 20 December, 2021; originally announced December 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:1811.08263

  43. arXiv:2112.10275

    cs.CV cs.AI

    Parallel Multi-Scale Networks with Deep Supervision for Hand Keypoint Detection

    Authors: Renjie Li, Son Tran, Saurabh Garg, Katherine Lawler, Jane Alty, Quan Bai

    Abstract: Keypoint detection plays an important role in a wide range of applications. However, predicting keypoints of small objects such as human hands is a challenging problem. Recent works fuse feature maps of deep Convolutional Neural Networks (CNNs), either via multi-level feature integration or multi-resolution aggregation. Despite achieving some success, the feature fusion approaches increase the com…

    Submitted 19 December, 2021; originally announced December 2021.

  44. arXiv:2112.00182

    cs.DB

    Maliva: Using Machine Learning to Rewrite Visualization Queries Under Time Constraints

    Authors: Qiushi Bai, Sadeem Alsudais, Chen Li, Shuang Zhao

    Abstract: We consider data-visualization systems where a middleware layer translates a frontend request to a SQL query to a backend database to compute visual results. We study the problem of answering a visualization request within a limited time constraint due to the responsiveness requirement. We explore the optimization options of rewriting an original query by adding hints and/or doing approximations s…

    Submitted 14 February, 2022; v1 submitted 30 November, 2021; originally announced December 2021.

  45. Hand gesture detection in tests performed by older adults

    Authors: Guan Huang, Son N. Tran, Quan Bai, Jane Alty

    Abstract: Our team are developing a new online test that analyses hand movement features associated with ageing that can be completed remotely from the research centre. To obtain hand movement features, participants will be asked to perform a variety of hand gestures using their own computer cameras. However, it is challenging to collect high quality hand movement video data, especially for older participan…

    Submitted 28 October, 2021; v1 submitted 27 October, 2021; originally announced October 2021.

    Journal ref: Neural Comput & Applic (2022)

  46. arXiv:2110.12081

    cs.LG cs.RO

    Off-policy Reinforcement Learning with Optimistic Exploration and Distribution Correction

    Authors: Jiachen Li, Shuo Cheng, Zhenyu Liao, Huayan Wang, William Yang Wang, Qinxun Bai

    Abstract: Improving the sample efficiency of reinforcement learning algorithms requires effective exploration. Following the principle of $\textit{optimism in the face of uncertainty}$ (OFU), we train a separate exploration policy to maximize the approximate upper confidence bound of the critics in an off-policy actor-critic framework. However, this introduces extra differences between the replay buffer and…

    Submitted 22 November, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

    Comments: Deep RL Workshop, NeurIPS 2022

  47. arXiv:2110.04854

    cs.CV

    Identity-guided Face Generation with Multi-modal Contour Conditions

    Authors: Qingyan Bai, Weihao Xia, Fei Yin, Yujiu Yang

    Abstract: Recent face generation methods have tried to synthesize faces based on the given contour condition, like a low-resolution image or sketch. However, the problem of identity ambiguity remains unsolved, which usually occurs when the contour is too vague to provide reliable identity information (e.g., when its resolution is extremely low). Thus feasible solutions of image restoration could be infinite…

    Submitted 2 August, 2022; v1 submitted 10 October, 2021; originally announced October 2021.

    Comments: Accepted to ICIP 2022

  48. arXiv:2109.06332

    cs.LG

    Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Primal-Dual Approach

    Authors: Qinbo Bai, Amrit Singh Bedi, Mridul Agarwal, Alec Koppel, Vaneet Aggarwal

    Abstract: Reinforcement learning is widely used in applications where one needs to perform sequential decisions while interacting with the environment. The problem becomes more challenging when the decision requirement includes satisfying some safety constraints. The problem is mathematically formulated as constrained Markov decision process (CMDP). In the literature, various algorithms are available to sol…

    Submitted 13 July, 2022; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: This paper is the arXiv version with Appendices of the published AAAI paper: "Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Primal-Dual Approach," in Proc. AAAI, Feb 2022. The paper has been further extended with concave utilities and constraints in v2

    Journal ref: AAAI 2022

  49. arXiv:2109.05439

    cs.LG cs.AI

    Concave Utility Reinforcement Learning with Zero-Constraint Violations

    Authors: Mridul Agarwal, Qinbo Bai, Vaneet Aggarwal

    Abstract: We consider the problem of tabular infinite horizon concave utility reinforcement learning (CURL) with convex constraints. For this, we propose a model-based learning algorithm that also achieves zero constraint violations. Assuming that the concave objective and the convex constraints have a solution interior to the set of feasible occupation measures, we solve a tighter optimization problem to e…

    Submitted 16 November, 2023; v1 submitted 12 September, 2021; originally announced September 2021.

    Comments: Transactions on Machine Learning Research, Dec 2022

    Journal ref: Transactions on Machine Learning Research, Dec 2022

  50. arXiv:2107.05992

    cs.SI cs.AI

    Identifying Influential Users in Unknown Social Networks for Adaptive Incentive Allocation Under Budget Restriction

    Authors: Shiqing Wu, Weihua Li, Hao Shen, Quan Bai

    Abstract: In recent years, recommendation systems have been widely applied in many domains. These systems are impotent in affecting users to choose the behavior that the system expects. Meanwhile, providing incentives has been proven to be a more proactive way to affect users' behaviors. Due to the budget limitation, the number of users who can be incentivized is restricted. In this light, we intend to util…

    Submitted 14 July, 2021; v1 submitted 13 July, 2021; originally announced July 2021.