$β$-DARTS++: Bi-level Regularization for Proxy-robust Differentiable Architecture Search

Authors: Peng Ye, Tong He, Baopu Li, Tao Chen, Lei Bai, Wanli Ouyang

Abstract: Neural Architecture Search has attracted increasing attention in recent years. Among them, differential NAS approaches such as DARTS, have gained popularity for the search efficiency. However, they still suffer from three main issues, that are, the weak stability due to the performance collapse, the poor generalization ability of the searched architectures, and the inferior robustness to different kinds of proxies. To solve the stability and generalization problems, a simple-but-effective regularization method, termed as Beta-Decay, is proposed to regularize the DARTS-based NAS searching process (i.e., $β$-DARTS). Specifically, Beta-Decay regularization can impose constraints to keep the value and variance of activated architecture parameters from being too large, thereby ensuring fair competition among architecture parameters and making the supernet less sensitive to the impact of input on the operation set. In-depth theoretical analyses on how it works and why it works are provided. Comprehensive experiments validate that Beta-Decay regularization can help to stabilize the searching process and makes the searched network more transferable across different datasets. To address the robustness problem, we first benchmark different NAS methods under a wide range of proxy data, proxy channels, proxy layers and proxy epochs, since the robustness of NAS under different kinds of proxies has not been explored before. We then conclude some interesting findings and find that $β$-DARTS always achieves the best result among all compared NAS methods under almost all proxies. We further introduce the novel flooding regularization to the weight optimization of $β$-DARTS (i.e., Bi-level regularization), and experimentally and theoretically verify its effectiveness for improving the proxy robustness of differentiable NAS.

Submitted 16 January, 2023; originally announced January 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2203.01665
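
The abstract above names flooding regularization without giving its form. As a rough illustration only, the sketch below uses the commonly cited flooding objective, |loss - b| + b, where b is a chosen flood level. The function name `flooded_loss` and the example values are hypothetical, and how the paper applies flooding to the supernet weights is not stated in this listing.

```python
def flooded_loss(loss: float, flood_level: float) -> float:
    """Generic flooding objective: |loss - b| + b.

    Above the flood level b this equals the plain loss, so training
    proceeds as usual. Below b the value rises as the loss falls, so a
    gradient step pushes the loss back up toward b instead of driving
    it to zero.
    """
    return abs(loss - flood_level) + flood_level


if __name__ == "__main__":
    # Hypothetical values: one loss above and one below a flood level of 0.1.
    for raw in (0.5, 0.05):
        print(raw, flooded_loss(raw, flood_level=0.1))
```

The flood level is a hyperparameter; the value 0.1 here is arbitrary and not taken from the paper.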

arXiv:2301.03377 [pdf, other]

Machine Learning for Large-Scale Optimization in 6G Wireless Networks

Authors: Yandong Shi, Lixiang Lian, Yuanming Shi, Zixin Wang, Yong Zhou, Liqun Fu, Lin Bai, Jun Zhang, Wei Zhang

Abstract: The sixth generation (6G) wireless systems are envisioned to enable the paradigm shift from "connected things" to "connected intelligence", featured by ultra high density, large-scale, dynamic heterogeneity, diversified functional requirements and machine learning capabilities, which leads to a growing need for highly efficient intelligent algorithms. The classic optimization-based algorithms usually require highly precise mathematical model of data links and suffer from poor performance with high computational cost in realistic 6G applications. Based on domain knowledge (e.g., optimization models and theoretical tools), machine learning (ML) stands out as a promising and viable methodology for many complex large-scale optimization problems in 6G, due to its superior performance, generalizability, computational efficiency and robustness. In this paper, we systematically review the most representative "learning to optimize" techniques in diverse domains of 6G wireless networks by identifying the inherent feature of the underlying optimization problem and investigating the specifically designed ML frameworks from the perspective of optimization. In particular, we will cover algorithm unrolling, learning to branch-and-bound, graph neural network for structured optimization, deep reinforcement learning for stochastic optimization, end-to-end learning for semantic optimization, as well as federated learning for distributed optimization, for solving challenging large-scale optimization problems arising from various important wireless applications. Through the in-depth discussion, we shed light on the excellent performance of ML-based optimization algorithms with respect to the classical methods, and provide insightful guidance to develop advanced ML techniques in 6G networks.

Submitted 3 January, 2023; originally announced January 2023.

arXiv:2212.08287 [pdf, other]

Rich Event Modeling for Script Event Prediction

Authors: Long Bai, Saiping Guan, Zixuan Li, Jiafeng Guo, Xiaolong Jin, Xueqi Cheng

Abstract: Script is a kind of structured knowledge extracted from texts, which contains a sequence of events. Based on such knowledge, script event prediction aims to predict the subsequent event. To do so, two aspects should be considered for events, namely, event description (i.e., what the events should contain) and event encoding (i.e., how they should be encoded). Most existing methods describe an event by a verb together with only a few core arguments (i.e., subject, object, and indirect object), which are not precise. In addition, existing event encoders are limited to a fixed number of arguments, which are not flexible to deal with extra information. Thus, in this paper, we propose the Rich Event Prediction (REP) framework for script event prediction. Fundamentally, it is based on the proposed rich event description, which enriches the existing ones with three kinds of important information, namely, the senses of verbs, extra semantic roles, and types of participants. REP contains an event extractor to extract such information from texts. Based on the extracted rich information, a predictor then selects the most probable subsequent event. The core component of the predictor is a transformer-based event encoder to flexibly deal with an arbitrary number of arguments. Experimental results on the widely used Gigaword Corpus show the effectiveness of the proposed framework.

Submitted 16 December, 2022; originally announced December 2022.

Comments: AAAI 2023 (main conference)

arXiv:2212.07651 [pdf, other]

Two-stage Contextual Transformer-based Convolutional Neural Network for Airway Extraction from CT Images

Authors: Yanan Wu, Shuiqing Zhao, Shouliang Qi, Jie Feng, Haowen Pang, Runsheng Chang, Long Bai, Mengqi Li, Shuyue Xia, Wei Qian, Hongliang Ren

Abstract: Accurate airway extraction from computed tomography (CT) images is a critical step for planning navigation bronchoscopy and quantitative assessment of airway-related chronic obstructive pulmonary disease (COPD). The existing methods are challenging to sufficiently segment the airway, especially the high-generation airway, with the constraint of the limited label and cannot meet the clinical use in COPD. We propose a novel two-stage 3D contextual transformer-based U-Net for airway segmentation using CT images. The method consists of two stages, performing initial and refined airway segmentation. The two-stage model shares the same subnetwork with different airway masks as input. Contextual transformer block is performed both in the encoder and decoder path of the subnetwork to finish high-quality airway segmentation effectively. In the first stage, the total airway mask and CT images are provided to the subnetwork, and the intrapulmonary airway mask and corresponding CT scans to the subnetwork in the second stage. Then the predictions of the two-stage method are merged as the final prediction. Extensive experiments were performed on in-house and multiple public datasets. Quantitative and qualitative analysis demonstrate that our proposed method extracted much more branches and lengths of the tree while accomplishing state-of-the-art airway segmentation performance. The code is available at https://github.com/zhaozsq/airway_segmentation.

Submitted 15 December, 2022; originally announced December 2022.

arXiv:2212.06435 [pdf, ps, other]

Boundary behaviors for a continuous-state nonlinear Neveu's branching process

Authors: Linyu Bai, Xu Yang

Abstract: After generalizing the criteria introduced by Chen, we establish the necessary and sufficient conditions for the extinction, explosion and coming down from infinity of a continuous-state nonlinear Neveu's branching process.

Submitted 13 December, 2022; originally announced December 2022.

arXiv:2212.05228 [pdf, other]

QESK: Quantum-based Entropic Subtree Kernels for Graph Classification

Authors: Lu Bai, Lixin Cui, Edwin R. Hancock

Abstract: In this paper, we propose a novel graph kernel, namely the Quantum-based Entropic Subtree Kernel (QESK), for Graph Classification. To this end, we commence by computing the Average Mixing Matrix (AMM) of the Continuous-time Quantum Walk (CTQW) evolved on each graph structure. Moreover, we show how this AMM matrix can be employed to compute a series of entropic subtree representations associated with the classical Weisfeiler-Lehman (WL) algorithm. For a pair of graphs, the QESK kernel is defined by computing the exponentiation of the negative Euclidean distance between their entropic subtree representations, theoretically resulting in a positive definite graph kernel. We show that the proposed QESK kernel not only encapsulates complicated intrinsic quantum-based structural characteristics of graph structures through the CTQW, but also theoretically addresses the shortcoming of ignoring the effects of unshared substructures arising in state-of-the-art R-convolution graph kernels. Moreover, unlike the classical R-convolution kernels, the proposed QESK can discriminate the distinctions of isomorphic subtrees in terms of the global graph structures, theoretically explaining the effectiveness. Experiments indicate that the proposed QESK kernel can significantly outperform state-of-the-art graph kernels and graph deep learning methods for graph classification problems.

Submitted 10 December, 2022; originally announced December 2022.
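
The abstract above defines the kernel value for a pair of graphs as the exponentiation of the negative Euclidean distance between their entropic subtree representations. A minimal sketch of that final step follows, assuming each graph has already been reduced to a fixed-length feature vector. The function name and the example vectors are hypothetical; the AMM and WL computations that would produce the representations are not shown, and any scaling factor the paper may use is omitted.

```python
import math
from typing import Sequence


def exp_neg_distance_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Return exp(-||x - y||_2) for two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-distance)


if __name__ == "__main__":
    # Placeholder vectors standing in for entropic subtree representations.
    graph_p = [0.0, 0.0]
    graph_q = [3.0, 4.0]
    print(exp_neg_distance_kernel(graph_p, graph_p))  # identical inputs give 1.0
    print(exp_neg_distance_kernel(graph_p, graph_q))  # exp(-5), about 0.0067
```

The value is 1 for identical representations and decays toward 0 as they move apart, so it can be read as a similarity score.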

arXiv:2211.02904 [pdf, other]

HAQJSK: Hierarchical-Aligned Quantum Jensen-Shannon Kernels for Graph Classification

Authors: Lu Bai, Lixin Cui, Yue Wang, Ming Li, Edwin R. Hancock

Abstract: In this work, we propose a family of novel quantum kernels, namely the Hierarchical Aligned Quantum Jensen-Shannon Kernels (HAQJSK), for un-attributed graphs. Different from most existing classical graph kernels, the proposed HAQJSK kernels can incorporate hierarchical aligned structure information between graphs and transform graphs of random sizes into fixed-sized aligned graph structures, i.e., the Hierarchical Transitive Aligned Adjacency Matrix of vertices and the Hierarchical Transitive Aligned Density Matrix of the Continuous-Time Quantum Walk (CTQW). For a pair of graphs to hand, the resulting HAQJSK kernels are defined by measuring the Quantum Jensen-Shannon Divergence (QJSD) between their transitive aligned graph structures. We show that the proposed HAQJSK kernels not only reflect richer intrinsic global graph characteristics in terms of the CTQW, but also address the drawback of neglecting structural correspondence information arising in most existing R-convolution kernels. Furthermore, unlike the previous Quantum Jensen-Shannon Kernels associated with the QJSD and the CTQW, the proposed HAQJSK kernels can simultaneously guarantee the properties of permutation invariant and positive definiteness, explaining the theoretical advantages of the HAQJSK kernels. Experiments indicate the effectiveness of the proposed kernels.

Submitted 10 December, 2022; v1 submitted 5 November, 2022; originally announced November 2022.

arXiv:2211.00309 [pdf, other]

doi 10.1103/PhysRevD.107.095008

Probe the Mixing Parameter $|V_{τN}|^2$ for Heavy Neutrinos

Authors: Lingxiao Bai, Ying-nan Mao, Kechen Wang

Abstract: Because of the difficulty in detecting final state taus, the mixing parameter $|V_{τN}|^2$ for heavy neutrino $N$ is not well studied at current experiments, compared with other mixing parameters $|V_{e N}|^2$ and $|V_{μN}|^2$. In this paper, we focus on a challenging scenario where $N$ mixes with active neutrino of tau flavour only, i.e. $ |V_{τN}|^2 \neq 0 $ and $|V_{e N}|^2 = |V_{μN}|^2 = 0$. We derive current constraints on $|V_{τN}|^2$ from the rare $Z$-boson decay and electroweak precision data (EWPD). To forecast the future limits, we also investigate the signal $p p \to τ^{\pm} τ^{\pm} j j $ via a Majorana heavy neutrino at future proton-proton colliders. To suppress the background, both taus are required to decay leptonically into muons, leading to the final state containing two same sign muons, at least two jets plus moderate missing energy. The signal and relevant background processes are simulated at the HL-LHC and SppC/FCC-hh with center-of-mass energy of 14 TeV and 100 TeV. The preselection and multivariate analyses based on machine-learning are performed to reduce background. Limits on $|V_{τN}|^2$ are shown for heavy neutrino mass in the range 10-1000 GeV based on measurements from the rare $Z$-boson decay and EWPD, and searches at the HL-LHC and SppC/FCC-hh with integrated luminosities of 3 and 20 ab$^{-1}$.

Submitted 25 October, 2023; v1 submitted 1 November, 2022; originally announced November 2022.

Comments: 9 figures, 4 tables. arXiv admin note: text overlap with arXiv:2210.17050

Journal ref: Physical Review D 107 (2023) no. 9, 095008

arXiv:2210.14524 [pdf, other]

A Bibliometric Analysis and Review on Reinforcement Learning for Transportation Applications

Authors: Can Li, Lei Bai, Lina Yao, S. Travis Waller, Wei Liu

Abstract: Transportation is the backbone of the economy and urban development. Improving the efficiency, sustainability, resilience, and intelligence of transportation systems is critical and also challenging. The constantly changing traffic conditions, the uncertain influence of external factors (e.g., weather, accidents), and the interactions among multiple travel modes and multi-type flows result in the dynamic and stochastic natures of transportation systems. The planning, operation, and control of transportation systems require flexible and adaptable strategies in order to deal with uncertainty, non-linearity, variability, and high complexity. In this context, Reinforcement Learning (RL) that enables autonomous decision-makers to interact with the complex environment, learn from the experiences, and select optimal actions has been rapidly emerging as one of the most useful approaches for smart transportation. This paper conducts a bibliometric analysis to identify the development of RL-based methods for transportation applications, typical journals/conferences, and leading topics in the field of intelligent transportation in recent ten years. Then, this paper presents a comprehensive literature review on applications of RL in transportation by categorizing different methods with respect to the specific application domains. The potential future research directions of RL applications and developments are also discussed.

Submitted 26 October, 2022; originally announced October 2022.

arXiv:2210.12920 [pdf, ps, other]

Efficient User Scheduling for Uplink Hybrid Satellite-Terrestrial Communication

Authors: Lina Zhu, Lin Bai, Lin Zhou, Jinho Choi

Abstract: Due to increasing demands of seamless connection and massive information exchange across the world, the integrated satellite-terrestrial communication systems develop rapidly. To shed lights on the design of this system, we consider an uplink communication model consisting of a single satellite, a single terrestrial station and multiple ground users. The terrestrial station uses decode-and-forward (DF) to facilitate the communication between ground users and the satellite. The channel between the satellite and the terrestrial station is assumed to be a quasi-static shadowed Rician fading channel, while the channels between the terrestrial station and ground users are assumed to experience independent quasi-static Rayleigh fading. We consider two cases of channel state information (CSI) availability. When instantaneous CSI is available, we derive the instantaneous achievable sum rate of all ground users and formulate an optimization problem to maximize the sum rate. When only channel distribution information (CDI) is available, we derive a closed-form expression for the outage probability and formulate another optimization problem to minimize the outage probability. Both optimization problems correspond to scheduling algorithms for ground users. For both cases, we propose low-complexity user scheduling algorithms and demonstrate the efficiency of our scheduling algorithms via numerical simulations.

Submitted 23 October, 2022; originally announced October 2022.

Comments: To appear in IEEE Transactions on Wireless Communications

arXiv:2210.12736 [pdf, other]

Achievable Error Exponents for Two-Phase Multiple Classification

Authors: Lin Zhou, Jun Diao, Lin Bai

Abstract: We revisit $M$-ary classification of Gutman (TIT 1989), where one is tasked to determine whether a testing sequence is generated with the same distribution as one of the $M$ training sequences or not. Our main result is a two-phase test, its theoretical analysis and its optimality guarantee. Specifically, our two-phase test is a special case of a sequential test with only two decision time points: the first phase of our test is a fixed-length test with a reject option, the second-phase of our test proceeds only if a reject option is decided in the first phase and the second phase of our test does not allow a reject option. To provide theoretical guarantee for our test, we derive achievable error exponents using the method of types and derive a converse result for the optimal sequential test using the techniques recently proposed by Hsu, Li and Wang (ITW, 2022) for binary classification. Analytically and numerically, we show that our two phase test achieves the performance of an optimal sequential test with proper choice of test parameters. In particular, similarly as the optimal sequential test, our test does not need a final reject option to achieve the optimal error exponent region while an optimal fixed-length test needs a reject option to achieve the same region. Finally, we specialize our results to binary classification when $M=2$ and to $M$-ary hypothesis testing when the ratio of the lengths of training sequences and testing sequences tends to infinity so that generating distributions can be estimated perfectly.

Submitted 26 May, 2023; v1 submitted 23 October, 2022; originally announced October 2022.

Comments: submitted to IEEE Trans. Inf. Theory

arXiv:2210.09708 [pdf, other]

HiSMatch: Historical Structure Matching based Temporal Knowledge Graph Reasoning

Authors: Zixuan Li, Zhongni Hou, Saiping Guan, Xiaolong Jin, Weihua Peng, Long Bai, Yajuan Lyu, Wei Li, Jiafeng Guo, Xueqi Cheng

Abstract: A Temporal Knowledge Graph (TKG) is a sequence of KGs with respective timestamps, which adopts quadruples in the form of (subject, relation, object, timestamp) to describe dynamic facts. TKG reasoning has facilitated many real-world applications via answering such queries as (query entity, query relation, ?, future timestamp) about future. This is actually a matching task between a query and candidate entities based on their historical structures, which reflect behavioral trends of the entities at different timestamps. In addition, recent KGs provide background knowledge of all the entities, which is also helpful for the matching. Thus, in this paper, we propose the Historical Structure Matching (HiSMatch) model. It applies two structure encoders to capture the semantic information contained in the historical structures of the query and candidate entities. Besides, it adopts another encoder to integrate the background knowledge into the model. TKG reasoning experiments on six benchmark datasets demonstrate the significant improvement of the proposed HiSMatch model, with up to 5.6% performance improvement in MRR, compared to the state-of-the-art baselines.

Submitted 18 October, 2022; originally announced October 2022.

Comments: Full paper of EMNLP 2022 Findings
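
The abstract above reports its gain in MRR (mean reciprocal rank). For readers unfamiliar with the metric, here is a minimal sketch of the standard definition: the average of 1/rank of the correct entity over all queries. The function name and the example ranks are hypothetical and are not results from the paper.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank, where each rank is the 1-based position of the
    correct answer in the model's ordered candidate list for one query."""
    if not ranks:
        raise ValueError("at least one rank is required")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")
    return sum(1.0 / r for r in ranks) / len(ranks)


if __name__ == "__main__":
    # Three made-up queries whose correct entities were ranked 1st, 2nd and 4th.
    print(mean_reciprocal_rank([1, 2, 4]))  # (1 + 0.5 + 0.25) / 3, about 0.583
```

MRR lies in (0, 1], and a higher value means correct entities are ranked closer to the top.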

arXiv:2210.06747 [pdf, other]

DCANet: Differential Convolution Attention Network for RGB-D Semantic Segmentation

Authors: Lizhi Bai, Jun Yang, Chunqi Tian, Yaoru Sun, Maoyu Mao, Yanjun Xu, Weirong Xu

Abstract: Combining RGB images and the corresponding depth maps in semantic segmentation proves the effectiveness in the past few years. Existing RGB-D modal fusion methods either lack the non-linear feature fusion ability or treat both modal images equally, regardless of the intrinsic distribution gap or information loss. Here we find that depth maps are suitable to provide intrinsic fine-grained patterns of objects due to their local depth continuity, while RGB images effectively provide a global view. Based on this, we propose a pixel differential convolution attention (DCA) module to consider geometric information and local-range correlations for depth data. Furthermore, we extend DCA to ensemble differential convolution attention (EDCA) which propagates long-range contextual dependencies and seamlessly incorporates spatial distribution for RGB data. DCA and EDCA dynamically adjust convolutional weights by pixel difference to enable self-adaptive in local and long range, respectively. A two-branch network built with DCA and EDCA, called Differential Convolutional Network (DCANet), is proposed to fuse local and global information of two-modal data. Consequently, the individual advantage of RGB and depth data are emphasized. Our DCANet is shown to set a new state-of-the-art performance for RGB-D semantic segmentation on two challenging benchmark datasets, i.e., NYUDv2 and SUN-RGBD.

Submitted 13 October, 2022; originally announced October 2022.

arXiv:2210.00770 [pdf, other]

Accelerate Reinforcement Learning with PID Controllers in the Pendulum Simulations

Authors: Liping Bai

Abstract: We propose a Proportional Integral Derivative (PID) controller-based coaching scheme to expedite reinforcement learning (RL).

Submitted 3 October, 2022; originally announced October 2022.

arXiv:2209.11404 [pdf, other]

doi 10.1007/s11263-023-01943-2

Towards Frame Rate Agnostic Multi-Object Tracking

Authors: Weitao Feng, Lei Bai, Yongqiang Yao, Fengwei Yu, Wanli Ouyang

Abstract: Multi-Object Tracking (MOT) is one of the most fundamental computer vision tasks that contributes to various video analysis applications. Despite the recent promising progress, current MOT research is still limited to a fixed sampling frame rate of the input stream. In fact, we empirically found that the accuracy of all recent state-of-the-art trackers drops dramatically when the input frame rate changes. For a more intelligent tracking solution, we shift the attention of our research work to the problem of Frame Rate Agnostic MOT (FraMOT), which takes frame rate insensitivity into consideration. In this paper, we propose a Frame Rate Agnostic MOT framework with a Periodic training Scheme (FAPS) to tackle the FraMOT problem for the first time. Specifically, we propose a Frame Rate Agnostic Association Module (FAAM) that infers and encodes the frame rate information to aid identity matching across multi-frame-rate inputs, improving the capability of the learned model in handling complex motion-appearance relations in FraMOT. Moreover, the association gap between training and inference is enlarged in FraMOT because those post-processing steps not included in training make a larger difference in lower frame rate scenarios. To address it, we propose Periodic Training Scheme (PTS) to reflect all post-processing steps in training via tracking pattern matching and fusion. Along with the proposed approaches, we make the first attempt to establish an evaluation method for this new task of FraMOT in two different modes, i.e., known frame rate and unknown frame rate, aiming to handle a more complex situation. The quantitative experiments on the challenging MOT17/20 dataset (FraMOT version) have clearly demonstrated that the proposed approaches can handle different frame rates better and thus improve the robustness against complicated scenarios.

Submitted 17 April, 2023; v1 submitted 23 September, 2022; originally announced September 2022.

Comments: 24 pages; Author version

arXiv:2209.06389 [pdf]

doi 10.1145/3511808.3557370

Jointly Contrastive Representation Learning on Road Network and Trajectory

Authors: Zhenyu Mao, Ziyue Li, Dedong Li, Lei Bai, Rui Zhao

Abstract: Road network and trajectory representation learning are essential for traffic systems since the learned representation can be directly used in various downstream tasks (e.g., traffic speed inference, and travel time estimation). However, most existing methods only contrast within the same scale, i.e., treating road network and trajectory separately, which ignores valuable inter-relations. In this paper, we aim to propose a unified framework that jointly learns the road network and trajectory representations end-to-end. We design domain-specific augmentations for road-road contrast and trajectory-trajectory contrast separately, i.e., road segment with its contextual neighbors and trajectory with its detour replaced and dropped alternatives, respectively. On top of that, we further introduce the road-trajectory cross-scale contrast to bridge the two scales by maximizing the total mutual information. Unlike the existing cross-scale contrastive learning methods on graphs that only contrast a graph and its belonging nodes, the contrast between road segment and trajectory is elaborately tailored via novel positive sampling and adaptive weighting strategies. We conduct prudent experiments based on two real-world datasets with four downstream tasks, demonstrating improved performance and effectiveness. The code is available at https://github.com/mzy94/JCLRNT.

Submitted 13 February, 2023; v1 submitted 13 September, 2022; originally announced September 2022.

Comments: Accepted in CIKM 2022. Latest updates at 13 Feb 2023

arXiv:2209.02556 [pdf, other]

The Outcome of the 2022 Landslide4Sense Competition: Advanced Landslide Detection from Multi-Source Satellite Imagery

Authors: Omid Ghorbanzadeh, Yonghao Xu, Hengwei Zhao, Junjue Wang, Yanfei Zhong, Dong Zhao, Qi Zang, Shuang Wang, Fahong Zhang, Yilei Shi, Xiao Xiang Zhu, Lin Bai, Weile Li, Weihang Peng, Pedram Ghamisi

Abstract: The scientific outcomes of the 2022 Landslide4Sense (L4S) competition organized by the Institute of Advanced Research in Artificial Intelligence (IARAI) are presented here. The objective of the competition is to automatically detect landslides based on large-scale multiple sources of satellite imagery collected globally. The 2022 L4S aims to foster interdisciplinary research on recent developments in deep learning (DL) models for the semantic segmentation task using satellite imagery. In the past few years, DL-based models have achieved performance that meets expectations on image interpretation, due to the development of convolutional neural networks (CNNs). The main objective of this article is to present the details and the best-performing algorithms featured in this competition. The winning solutions are elaborated with state-of-the-art models like the Swin Transformer, SegFormer, and U-Net. Advanced machine learning techniques and strategies such as hard example mining, self-training, and mix-up data augmentation are also considered. Moreover, we describe the L4S benchmark data set in order to facilitate further comparisons, and report the results of the accuracy assessment online. The data is accessible on the Future Development Leaderboard for future evaluation at https://www.iarai.ac.at/landslide4sense/challenge/, and researchers are invited to submit more prediction results, evaluate the accuracy of their methods, compare them with those of other users, and, ideally, improve the landslide detection results reported in this article.

Submitted 12 September, 2022; v1 submitted 6 September, 2022; originally announced September 2022.

arXiv:2208.07137 [pdf, other]

An Empirical Study of Pseudo-Labeling for Image-based 3D Object Detection

Authors: Xinzhu Ma, Yuan Meng, Yinmin Zhang, Lei Bai, Jun Hou, Shuai Yi, Wanli Ouyang

Abstract: Image-based 3D detection is an indispensable component of the perception system for autonomous driving. However, it still suffers from unsatisfying performance, one of the main reasons for which is the limited training data. Unfortunately, annotating the objects in the 3D space is extremely time/resource-consuming, which makes it hard to extend the training set arbitrarily. In this work, we focus on the semi-supervised manner and explore the feasibility of a cheaper alternative, i.e. pseudo-labeling, to leverage the unlabeled data. For this purpose, we conduct extensive experiments to investigate whether the pseudo-labels can provide effective supervision for the baseline models under varying settings. The experimental results not only demonstrate the effectiveness of the pseudo-labeling mechanism for image-based 3D detection (e.g. under monocular setting, we achieve 20.23 AP for moderate level on the KITTI-3D testing set without bells and whistles, improving the baseline model by 6.03 AP), but also show several interesting and noteworthy findings (e.g. the models trained with pseudo-labels perform better than those trained with ground-truth annotations based on the same training data). We hope this work can provide insights for the image-based 3D detection community under a semi-supervised setting. The codes, pseudo-labels, and pre-trained models will be publicly available.

Submitted 15 August, 2022; originally announced August 2022.

Comments: tech report

arXiv:2208.03926 [pdf, other]

Achievable Refined Asymptotics for Successive Refinement Using Gaussian Codebooks

Authors: Lin Bai, Zhuangfei Wu, Lin Zhou

Abstract: We study the mismatched successive refinement problem where one uses Gaussian codebooks to compress an arbitrary memoryless source with successive minimum Euclidean distance encoding under the quadratic distortion measure. Specifically, we derive achievable refined asymptotics under both the joint excess-distortion probability (JEP) and the separate excess-distortion probabilities (SEP) criteria. For both second-order and moderate deviations asymptotics, we consider two types of codebooks: the spherical codebook where each codeword is drawn independently and uniformly from the surface of a sphere and the i.i.d. Gaussian codebook where each component of each codeword is drawn independently from a Gaussian distribution. We establish the achievable second-order rate-region under JEP and we show that under SEP any memoryless source satisfying mild moment conditions is strongly successively refinable. When specialized to a Gaussian memoryless source (GMS), our results provide an alternative achievability proof with specific code design. We show that under JEP and SEP, the same moderate deviations constant is achievable. For large deviations asymptotics, we only consider the i.i.d. Gaussian codebook since the i.i.d. Gaussian codebook has better performance than the spherical codebook in this regime for the one layer mismatched rate-distortion problem (Zhou, Tan, Motani, TIT, 2019). We derive achievable exponents of both JEP and SEP and specialize our results to a GMS, which appears to be a novel result of independent interest.

Submitted 9 February, 2023; v1 submitted 8 August, 2022; originally announced August 2022.

arXiv:2207.12601 [pdf]

doi 10.1088/1674-1137/ac9371

Flux Variations of Cosmic Ray Air Showers Detected by LHAASO-KM2A During a Thunderstorm on 10 June 2021

Authors: LHAASO Collaboration, F. Aharonian, Q. An, Axikegu, L. X. Bai, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Zhe Cao, Zhen Cao, J. Chang, J. F. Chang, E. S. Chen, Liang Chen, Liang Chen, Long Chen, M. J. Chen, M. L. Chen, S. H. Chen, S. Z. Chen, T. L. Chen, X. J. Chen , et al. (248 additional authors not shown)

Abstract: The Large High Altitude Air Shower Observatory (LHAASO) has three sub-arrays, KM2A, WCDA and WFCTA. The flux variations of cosmic ray air showers were studied by analyzing the KM2A data during the thunderstorm on 10 June 2021. The number of shower events that meet the trigger conditions increases significantly in atmospheric electric fields, with a maximum fractional increase of 20%. The variations of trigger rates (increases or decreases) are found to be strongly dependent on the primary zenith angle. The flux of secondary particles increases significantly, following a trend similar to that of the shower events. To better understand the observed behavior, Monte Carlo simulations are performed with CORSIKA and G4KM2A (a code based on GEANT4). We find that the experimental data (in saturated negative fields) are in good agreement with simulations, assuming the presence of a uniform upward electric field of 700 V/cm with a thickness of 1500 m in the atmosphere above the observation level. Due to the acceleration/deceleration and deflection by the atmospheric electric field, the number of secondary particles with energy above the detector threshold is modified, resulting in changes in the shower detection rate.

Submitted 6 December, 2022; v1 submitted 25 July, 2022; originally announced July 2022.

Comments: 18 pages, 11 figures

Journal ref: Chinese Phys. C 47 015001 (2023)

arXiv:2207.08220 [pdf, other]

Fast-MoCo: Boost Momentum-based Contrastive Learning with Combinatorial Patches

Authors: Yuanzheng Ci, Chen Lin, Lei Bai, Wanli Ouyang

Abstract: Contrastive-based self-supervised learning methods achieved great success in recent years. However, self-supervision requires extremely long training epochs (e.g., 800 epochs for MoCo v3) to achieve promising results, which is unacceptable for the general academic community and hinders the development of this topic. This work revisits the momentum-based contrastive learning frameworks and identifies the inefficiency in which two augmented views generate only one positive pair. We propose Fast-MoCo, a novel framework that utilizes combinatorial patches to construct multiple positive pairs from two augmented views, which provides abundant supervision signals that bring significant acceleration with negligible extra computational cost. Fast-MoCo trained with 100 epochs achieves 73.5% linear evaluation accuracy, similar to MoCo v3 (ResNet-50 backbone) trained with 800 epochs. Extra training (200 epochs) further improves the result to 75.1%, which is on par with state-of-the-art methods. Experiments on several downstream tasks also confirm the effectiveness of Fast-MoCo.

Submitted 19 July, 2022; v1 submitted 17 July, 2022; originally announced July 2022.

Comments: Accepted for publication at the 2022 European Conference on Computer Vision (ECCV 2022)

arXiv:2207.07265 [pdf]

Acceleration of 60 MeV proton beams in the commissioning experiment of SULF-10 PW laser

Authors: A. X. Li, C. Y. Qin, H. Zhang, S. Li, L. L. Fan, Q. S. Wang, T. J. Xu, N. W. Wang, L. H. Yu, Y. Xu, Y. Q. Liu, C. Wang, X. L. Wang, Z. X. Zhang, X. Y. Liu, P. L. Bai, Z. B. Gan, X. B. Zhang, X. B. Wang, C. Fan, Y. J. Sun, Y. H. Tang, B. Yao, X. Y. Liang, Y. X. Leng , et al. (3 additional authors not shown)

Abstract: We report the experimental results of the commissioning phase in the 10 PW laser beamline of Shanghai Superintense Ultrafast Laser Facility (SULF). The peak power reaches 2.4 PW on target without the last amplifying during the experiment. The laser energy of $72 \pm 9$ J is directed to a focal spot of ~6 μm diameter (FWHM) in 30 fs pulse duration, yielding a focused peak intensity around $2.0 \times 10^{21}$ W/cm$^2$. First laser-proton acceleration experiment is performed using plain copper and plastic targets. High-energy proton beams with maximum cut-off energy up to 62.5 MeV are achieved using copper foils at the optimum target thickness of 4 μm via target normal sheath acceleration (TNSA). For plastic targets of tens of nanometers thick, the proton cut-off energy is approximately 20 MeV, showing ring-like or filamented density distributions. These experimental results reflect the capabilities of the SULF-10 PW beamline, e.g., both ultrahigh intensity and relatively good beam contrast. Further optimization for these key parameters is underway, where peak laser intensities of $10^{22}$-$10^{23}$ W/cm$^2$ are anticipated to support various experiments on extreme field physics.

Submitted 14 July, 2022; originally announced July 2022.

Comments: 20 pages, 8 figures, regular article, This article has been submitted to "High Power Laser Science and Engineering"

arXiv:2207.05888 [pdf, other]

A Near Sensor Edge Computing System for Point Cloud Semantic Segmentation

Authors: Lin Bai, Yiming Zhao, Xinming Huang

Abstract: Point cloud semantic segmentation has attracted attention due to its robustness to lighting conditions. This makes it an ideal semantic solution for autonomous driving. However, considering the large computation burden and bandwidth demands of neural networks, putting all the computing into the vehicle Electronic Control Unit (ECU) is not efficient or practical. In this paper, we propose a lightweight point cloud semantic segmentation network based on range view. Due to its simple pre-processing and standard convolution, it is efficient when running on a deep learning accelerator such as a DPU. Furthermore, a near sensor computing system is built for autonomous vehicles. In this system, an FPGA-based deep learning accelerator core (DPU) is placed next to the LiDAR sensor, to perform point cloud pre-processing and run the segmentation neural network. By leaving only the post-processing step to the ECU, this solution greatly alleviates the computation burden of the ECU and consequently shortens the decision making and vehicle reaction latency. Our semantic segmentation network achieved 10 frames per second (fps) on a Xilinx DPU with a computation efficiency of 42.5 GOP/W.

Submitted 12 July, 2022; originally announced July 2022.

Comments: accepted by ISCAS 2022

arXiv:2206.10255 [pdf, other]

GNN-PMB: A Simple but Effective Online 3D Multi-Object Tracker without Bells and Whistles

Authors: Jianan Liu, Liping Bai, Yuxuan Xia, Tao Huang, Bing Zhu, Qing-Long Han

Abstract: Multi-object tracking (MOT) is among the crucial applications in modern advanced driver assistance systems (ADAS) and autonomous driving (AD) systems. The global nearest neighbor (GNN) filter, as the earliest random vector-based Bayesian tracking framework, has been adopted in most state-of-the-art trackers in the automotive industry. The development of random finite set (RFS) theory facilitates a mathematically rigorous treatment of the MOT problem, and different variants of RFS-based Bayesian filters have then been proposed. However, their effectiveness in real ADAS and AD applications is still an open problem. In this paper, it is demonstrated that the latest RFS-based Bayesian tracking framework could be superior to the typical random vector-based Bayesian tracking framework via a systematic comparative study of both traditional random vector-based Bayesian filters with rule-based heuristic track maintenance and RFS-based Bayesian filters on the nuScenes validation dataset. An RFS-based tracker, namely the Poisson multi-Bernoulli filter using the global nearest neighbor (GNN-PMB), is proposed for LiDAR-based MOT tasks. This GNN-PMB tracker is simple to use, and it achieves competitive results on the nuScenes dataset. Specifically, the proposed GNN-PMB tracker outperforms most state-of-the-art LiDAR-only trackers and LiDAR and camera fusion-based trackers, ranking $3^{rd}$ among all LiDAR-only trackers on the nuScenes 3D tracking challenge leaderboard at the time of submission.

Submitted 8 February, 2023; v1 submitted 21 June, 2022; originally announced June 2022.

Comments: accepted by IEEE Transactions on Intelligent Vehicles

arXiv:2206.07472 [pdf, other]

Collaborative Knowledge Graph Fusion by Exploiting the Open Corpus

Authors: Yue Wang, Yao Wan, Lu Bai, Lixin Cui, Zhuo Xu, Ming Li, Philip S. Yu, Edwin R Hancock

Abstract: To alleviate the challenges of building Knowledge Graphs (KG) from scratch, a more general task is to enrich a KG using triples from an open corpus, where the obtained triples contain noisy entities and relations. It is challenging to enrich a KG with newly harvested triples while maintaining the quality of the knowledge representation. This paper proposes a system to refine a KG using information harvested from an additional corpus. To this end, we formulate our task as two coupled sub-tasks, namely join event extraction (JEE) and knowledge graph fusion (KGF). We then propose a Collaborative Knowledge Graph Fusion Framework to allow our sub-tasks to mutually assist one another in an alternating manner. More concretely, the explorer carries out the JEE supervised by both the ground-truth annotation and an existing KG provided by the supervisor. The supervisor then evaluates the triples extracted by the explorer and enriches the KG with those that are highly ranked. To implement this evaluation, we further propose a Translated Relation Alignment Scoring Mechanism to align and translate the extracted triples to the prior KG. Experiments verify that this collaboration can improve the performance of both the JEE and the KGF.

Submitted 15 June, 2022; originally announced June 2022.

Comments: Under review by IEEE Transactions on Knowledge and Data Engineering (TKDE)

arXiv:2206.04053 [pdf, other]

Unsupervised Knowledge Adaptation for Passenger Demand Forecasting

Authors: Can Li, Lei Bai, Wei Liu, Lina Yao, S Travis Waller

Abstract: Considering the multimodal nature of transport systems and potential cross-modal correlations, there is a growing trend of enhancing demand forecasting accuracy by learning from multimodal data. These multimodal forecasting models can improve accuracy but are less practical when different parts of multimodal datasets are owned by different institutions that cannot directly share data among them. While institutions may not be able to share their data with each other directly, they may share forecasting models trained on their data, where such models cannot be used to identify the exact information from their datasets. This study proposes an Unsupervised Knowledge Adaptation Demand Forecasting framework to forecast the demand of the target mode by utilizing a pre-trained model based on data of another mode, which does not require direct data sharing of the source mode. The proposed framework utilizes the potential shared patterns among multiple transport modes to improve forecasting performance while avoiding the direct sharing of data among different institutions. Specifically, a pre-trained forecasting model is first learned based on the data of a source mode, which can capture and memorize the source travel patterns. Then, the demand data of the target dataset is encoded into an individual knowledge part and a sharing knowledge part, from which travel patterns are extracted by an individual extraction network and a sharing extraction network, respectively. The unsupervised knowledge adaptation strategy is utilized to form the sharing features for further forecasting by making the pre-trained network and the sharing extraction network analogous. Our findings illustrate that unsupervised knowledge adaptation by sharing the pre-trained model to the target mode can improve the forecasting performance without the dependence on direct data sharing.

Submitted 8 June, 2022; originally announced June 2022.

arXiv:2205.13974 [pdf]

doi 10.1109/JPROC.2022.3177230

DLMP of Competitive Markets in Active Distribution Networks: Models, Solutions, Applications, and Visions

Authors: Xiaofei Wang, Fangxing Li, Linquan Bai, Xin Fang

Abstract: Traditionally, the electric distribution system operates with uniform energy prices across all system nodes. However, as the adoption of distributed energy resources (DERs) propels a shift from passive to active distribution network (ADN) operation, a distribution-level electricity market has been proposed to manage new complexities efficiently. In addition, distribution locational marginal price (DLMP) has been established in the literature as the primary pricing mechanism. The DLMP inherits the LMP concept in the transmission-level wholesale market, but incorporates characteristics of the distribution system, such as high R/X ratios and power losses, system imbalance, and voltage regulation needs. The DLMP provides a solution that can be essential for competitive market operation in future distribution systems. This paper first provides an overview of the current distribution-level market architectures and their early implementations. Next, the general clearing model, model relaxations, and DLMP formulation are comprehensively reviewed. The state-of-the-art solution methods for distribution market clearing are summarized and categorized into centralized, distributed, and decentralized methods. Then, DLMP applications for the operation and planning of DERs and distribution system operators (DSOs) are discussed in detail. Finally, visions of future research directions and possible barriers and challenges are presented.

Submitted 27 May, 2022; originally announced May 2022.

Journal ref: Proceedings of the IEEE, vol. 111, no. 7, pp. 725-743, July 2023

arXiv:2205.12095 [pdf, other]

DNNAbacus: Toward Accurate Computational Cost Prediction for Deep Neural Networks

Authors: Lu Bai, Weixing Ji, Qinyuan Li, Xilai Yao, Wei Xin, Wanyi Zhu

Abstract: Deep learning is attracting interest across a variety of domains, including natural language processing, speech recognition, and computer vision. However, model training is time-consuming and requires huge computational resources. Existing works on the performance prediction of deep neural networks, which mostly focus on the training time prediction of a few models, rely on analytical models and result in high relative errors. This paper investigates the computational resource demands of 29 classical deep neural networks and builds accurate models for predicting computational costs. We first analyze the profiling results of typical networks and demonstrate that the computational resource demands of models with different inputs and hyperparameters are not obvious and intuitive. We then propose a lightweight prediction approach, DNNAbacus, with a novel network structural matrix for network representation. DNNAbacus can accurately predict both memory and time cost for PyTorch and TensorFlow models, generalizes to different hardware architectures, and has zero-shot capability for unseen networks. Our experimental results show that the mean relative error (MRE) is 0.9% with respect to time and 2.8% with respect to memory for 29 classic models, which is much lower than the state-of-the-art works.

Submitted 24 May, 2022; originally announced May 2022.

arXiv:2205.10839 [pdf, other]

Deep Learning for Visual Speech Analysis: A Survey

Authors: Changchong Sheng, Gangyao Kuang, Liang Bai, Chenping Hou, Yulan Guo, Xin Xu, Matti Pietikäinen, Li Liu

Abstract: Visual speech, referring to the visual domain of speech, has attracted increasing attention due to its wide applications, such as public security, medical treatment, military defense, and film entertainment. As a powerful AI strategy, deep learning techniques have extensively promoted the development of visual speech learning. Over the past five years, numerous deep learning based methods have been proposed to address various problems in this area, especially automatic visual speech recognition and generation. To push forward future research on visual speech, this paper aims to present a comprehensive review of recent progress in deep learning methods on visual speech analysis. We cover different aspects of visual speech, including fundamental problems, challenges, benchmark datasets, a taxonomy of existing methods, and state-of-the-art performance. Besides, we also identify gaps in current research and discuss inspiring future research directions.

Submitted 14 March, 2024; v1 submitted 22 May, 2022; originally announced May 2022.

Comments: 20 pages, 8 figures. Accepted by IEEE TPAMI

arXiv:2205.04771 [pdf, other]

Domain Invariant Masked Autoencoders for Self-supervised Learning from Multi-domains

Authors: Haiyang Yang, Meilin Chen, Yizhou Wang, Shixiang Tang, Feng Zhu, Lei Bai, Rui Zhao, Wanli Ouyang

Abstract: Generalizing learned representations across significantly different visual domains is a fundamental yet crucial ability of the human visual system. While recent self-supervised learning methods have achieved good performances with evaluation set on the same domain as the training set, they will have an undesirable performance decrease when tested on a different domain. Therefore, the self-supervised learning from multiple domains task is proposed to learn domain-invariant features that are not only suitable for evaluation on the same domain as the training set but also can be generalized to unseen domains. In this paper, we propose a Domain-invariant Masked AutoEncoder (DiMAE) for self-supervised learning from multi-domains, which designs a new pretext task, i.e., the cross-domain reconstruction task, to learn domain-invariant features. The core idea is to augment the input image with style noise from different domains and then reconstruct the image from the embedding of the augmented image, regularizing the encoder to learn domain-invariant features. To accomplish the idea, DiMAE contains two critical designs: 1) content-preserved style mix, which adds style information from other domains to the input while preserving the content in a parameter-free manner, and 2) multiple domain-specific decoders, which recover the corresponding domain style of the input from the encoded domain-invariant features for reconstruction. Experiments on PACS and DomainNet illustrate that DiMAE achieves considerable gains compared with recent state-of-the-art methods.

Submitted 6 June, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

arXiv:2205.01509 [pdf, other]

MS Lesion Segmentation: Revisiting Weighting Mechanisms for Federated Learning

Authors: Dongnan Liu, Mariano Cabezas, Dongang Wang, Zihao Tang, Lei Bai, Geng Zhan, Yuling Luo, Kain Kyle, Linda Ly, James Yu, Chun-Chien Shieh, Aria Nguyen, Ettikan Kandasamy Karuppiah, Ryan Sullivan, Fernando Calamante, Michael Barnett, Wanli Ouyang, Weidong Cai, Chenyu Wang

Abstract: Federated learning (FL) has been widely employed for medical image analysis to facilitate multi-client collaborative learning without sharing raw data. Despite great success, FL's performance is limited for multiple sclerosis (MS) lesion segmentation tasks, due to variance in lesion characteristics imparted by different scanners and acquisition parameters. In this work, we propose the first FL MS lesion segmentation framework via two effective re-weighting mechanisms. Specifically, a learnable weight is assigned to each local node during the aggregation process, based on its segmentation performance. In addition, the segmentation loss function in each client is also re-weighted according to the lesion volume for the data during training. Comparison experiments on two FL MS segmentation scenarios using public and clinical datasets have demonstrated the effectiveness of the proposed method by outperforming other FL methods significantly. Furthermore, the segmentation performance of FL incorporating our proposed aggregation mechanism can exceed centralised training with all the raw data. The extensive evaluation also indicated the superiority of our method when estimating brain volume differences after lesion inpainting.

Submitted 3 May, 2022; originally announced May 2022.

Comments: 10 pages, 3 figures, and 7 tables

arXiv:2204.11160 [pdf, ps, other]

doi 10.1088/1674-4527/ac60d2

Discovery of extended structure around open cluster COIN-Gaia 13 based on Gaia EDR3

Authors: Leya Bai, Jing Zhong, Li Chen, Jing Li, Jinliang Hou

Abstract: COIN-Gaia 13 is a newly discovered open cluster revealed by Gaia DR2 data. It is a nearby open cluster with a distance of about 513 pc. Combined with the five-dimensional astrometric data of Gaia EDR3 with higher accuracy, we use the membership assignment algorithm (pyUPMASK) to determine the membership of COIN-Gaia 13 in a large extended spatial region. We identify 478 candidate members of the cluster. After obtaining reliable cluster members, we further study its basic properties and spatial distribution. Our results show that there is an obvious extended structure of the cluster in the X-Y plane. This elongated structure is distributed along the spiral arm, and the whole length is about 270 pc. The cluster age is 250 Myr, the total mass is about 439 M$_\odot$, and the tidal radius of the cluster is about 11 pc. Since more than half of the member stars (352 stars) are located outside twice the tidal radius, it is suspected that this cluster is undergoing the dynamic dissolution process. Furthermore, the spatial distribution and kinematic analysis indicate that the extended structure in COIN-Gaia 13 is more likely to be caused by the differential rotation of the Galaxy.

Submitted 23 April, 2022; originally announced April 2022.

Comments: 12 pages, 9 figures, accepted for publication in Research in Astronomy and Astrophysics

arXiv:2203.12335 [pdf, other]

DR.VIC: Decomposition and Reasoning for Video Individual Counting

Authors: Tao Han, Lei Bai, Junyu Gao, Qi Wang, Wanli Ouyang

Abstract: Pedestrian counting is a fundamental tool for understanding pedestrian patterns and crowd flow analysis. Existing works (e.g., image-level pedestrian counting and crossline crowd counting) either focus only on image-level counting or are constrained by the manual annotation of lines. In this work, we propose to conduct pedestrian counting from a new perspective - Video Individual Counting (VIC), which counts the total number of individual pedestrians in the given video (a person is only counted once). Instead of relying on Multiple Object Tracking (MOT) techniques, we propose to solve the problem by decomposing all pedestrians into the initial pedestrians who existed in the first frame and the new pedestrians with separate identities in each following frame. Then, an end-to-end Decomposition and Reasoning Network (DRNet) is designed to predict the initial pedestrian count with the density estimation method and to reason about the count of new pedestrians in each frame with differentiable optimal transport. Extensive experiments are conducted on two datasets with congested pedestrians and diverse scenes, demonstrating the effectiveness of our method over baselines with great superiority in counting the individual pedestrians. Code: https://github.com/taohan10200/DRNet.

Submitted 28 March, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

Comments: Accepted by CVPR 2022. [camera ready with supplement]

arXiv:2203.07782 [pdf, other]

Complex Evolutional Pattern Learning for Temporal Knowledge Graph Reasoning

Authors: Zixuan Li, Saiping Guan, Xiaolong Jin, Weihua Peng, Yajuan Lyu, Yong Zhu, Long Bai, Wei Li, Jiafeng Guo, Xueqi Cheng

Abstract: A Temporal Knowledge Graph (TKG) is a sequence of KGs corresponding to different timestamps. TKG reasoning aims to predict potential facts in the future given the historical KG sequences. One key to this task is to mine and understand evolutional patterns of facts from these sequences. The evolutional patterns are complex in two aspects, length-diversity and time-variability. Existing models for TKG reasoning focus on modeling fact sequences of a fixed length, and so cannot discover complex evolutional patterns that vary in length. Furthermore, these models are all trained offline, and so cannot adapt well to later changes in evolutional patterns. Thus, we propose a new model, called Complex Evolutional Network (CEN), which uses a length-aware Convolutional Neural Network (CNN) to handle evolutional patterns of different lengths via an easy-to-difficult curriculum learning strategy. Besides, we propose to learn the model under the online setting so that it can adapt to the changes of evolutional patterns over time. Extensive experiments demonstrate that CEN obtains substantial performance improvement under both the traditional offline and the proposed online settings.

Submitted 20 March, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

Comments: ACL 2022 main conference

arXiv:2203.06928 [pdf, ps, other]

Generalized quantum cluster algebras: the Laurent phenomenon and upper bounds

Authors: Liqian Bai, Xueqing Chen, Ming Ding, Fan Xu

Abstract: Generalized quantum cluster algebras introduced in [1] are quantum deformations of generalized cluster algebras of geometric types. In this paper, we prove that the Laurent phenomenon holds in these generalized quantum cluster algebras. We also show that upper bounds coincide with the corresponding generalized quantum upper cluster algebras under the "coprimality" condition.

Submitted 14 March, 2022; originally announced March 2022.

Comments: 22 pages. Comments are welcome

arXiv:2203.05328 [pdf, other]

Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking

Authors: Boyu Chen, Peixia Li, Lei Bai, Lei Qiao, Qiuhong Shen, Bo Li, Weihao Gan, Wei Wu, Wanli Ouyang

Abstract: Exploiting a general-purpose neural architecture to replace hand-wired designs or inductive biases has recently drawn extensive interest. However, existing tracking approaches rely on customized sub-modules and need prior knowledge for architecture selection, hindering the tracking development in a more general system. This paper presents a Simplified Tracking architecture (SimTrack) by leveraging a transformer backbone for joint feature extraction and interaction. Unlike existing Siamese trackers, we serialize the input images and concatenate them directly before the one-branch backbone. Feature interaction in the backbone helps to remove well-designed interaction modules and produce a more efficient and effective framework. To reduce the information loss from down-sampling in vision transformers, we further propose a foveal window strategy, providing more diverse input patches with acceptable computational costs. Our SimTrack improves the baseline with 2.5%/2.6% AUC gains on LaSOT/TNL2K and gets results competitive with other specialized tracking algorithms without bells and whistles.

Submitted 15 July, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

Comments: Accepted by ECCV 2022

arXiv:2203.05177 [pdf, other]

doi 10.3847/1538-4365/ac5cbb

New Open Cluster candidates Found in Galactic Disk Using Gaia DR2/EDR3 Data

Authors: Zhihong He, Chunyan Li, Jing Zhong, Guimei Liu, Leya Bai, Songmei Qin, Yueyue Jiang, Xi Zhang, Li Chen

Abstract: We report 541 new open cluster candidates in Gaia EDR3 through revisiting the cluster results from an earlier analysis of the Gaia DR2, which revealed nearly a thousand open cluster candidates in the solar neighborhood (mostly d < 3 kpc) residing at Galactic latitudes |b| < 20 degrees. A subsequent comparison with lists of known clusters shows a large increase in the cluster sample within 2 kpc from the Sun. We assign membership probabilities to the stars through the open source pyUPMASK algorithm, and also estimate the physical parameters through isochrone fitting for each candidate. Most of the new candidates show small total proper motion dispersions and clear features in the color-magnitude diagrams. Besides, the metallicity gradient of the new candidates is consistent with those found in the literature. The cluster parameters and member stars are available at CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via https://cdsarc.unistra.fr/viz-bin/cat/J/ApJS. The discovery of these new objects shows that the open cluster sample in Gaia data is still incomplete, and more discoveries are expected in future research.

Submitted 10 March, 2022; originally announced March 2022.

Comments: 13 pages, 8 figures, 3 tables, accepted for publication in ApJS; online data submitted to CDS (see link in the paper)

arXiv:2203.02866 [pdf, other]

doi 10.3847/1538-3881/ac5b6a

WASP-35 and HAT-P-30/WASP-51: re-analysis using TESS and ground-based transit photometry

Authors: Lu Bai, Shenghong Gu, Xiaobin Wang, Leilei Sun, Chi-Tai Kwok, Ho-Keung Hui

Abstract: High-precision transit observations provide excellent opportunities for characterizing the physical properties of exoplanetary systems. These physical properties supply many pieces of information for unveiling the internal structure, external atmosphere, and dynamical history of the planets. We present revised properties of the transiting systems WASP-35 and HAT-P-30/WASP-51 through analyzing newly available TESS photometry and ground-based observations obtained at the 1 m telescope of Yunnan Observatories as well as from the literature. The improved system parameters are consistent with the previous results. Furthermore, we find that HAT-P-30b/WASP-51b's transits show significant timing variation which cannot be explained by a decaying orbit due to tidal dissipation or by the Rømer effect, while both apsidal precession and an additional perturbing body could reproduce this signal in our comprehensive dynamical simulations. Because both planets are valuable targets suitable for transmission spectroscopy, we make some predictions for the atmospheric properties of WASP-35b and HAT-P-30b/WASP-51b based on the newly derived system parameters.

Submitted 5 March, 2022; originally announced March 2022.

Comments: 22 pages, 11 figures, accepted by AJ on 2022 Mar 5

arXiv:2203.02384 [pdf, other]

AutoMO-Mixer: An automated multi-objective Mixer model for balanced, safe and robust prediction in medicine

Authors: Xi Chen, Jiahuan Lv, Dehua Feng, Xuanqin Mou, Ling Bai, Shu Zhang, Zhiguo Zhou

Abstract: Accurately identifying a patient's status through medical images plays an important role in diagnosis and treatment. Artificial intelligence (AI), especially deep learning, has achieved great success in many fields. However, more reliable AI models are needed in image-guided diagnosis and therapy. To achieve this goal, developing a balanced, safe and robust model with a unified framework is desirable. In this study, a new unified model termed the automated multi-objective Mixer (AutoMO-Mixer) model was developed, which utilizes the recently developed multilayer perceptron Mixer (MLP-Mixer) as its base. To build a balanced model, sensitivity and specificity were considered simultaneously as the objective functions in the training stage. Meanwhile, a new evidential reasoning approach based on entropy was developed to achieve a safe and robust model in the testing stage. The experiment on an optical coherence tomography dataset demonstrated that AutoMO-Mixer can obtain safer, more balanced, and more robust results compared with MLP-Mixer and other available models.

Submitted 4 March, 2022; originally announced March 2022.

arXiv:2202.05530 [pdf, ps, other]

An Improved EPA based Receiver Design for Uplink LDPC Coded SCMA System

Authors: Lingyun Chai, Zilong Liu, Pei Xiao, Amine Maaref, Lin Bai

Abstract: Sparse code multiple access (SCMA) is an emerging paradigm for efficient enabling of massive connectivity in future machine-type communications (MTC). In this letter, we conceive the uplink transmissions of the low-density parity check (LDPC) coded SCMA system. Traditional receiver design of LDPC-SCMA system, which is based on message passing algorithm (MPA) for multiuser detection followed by individual LDPC decoding, may suffer from the drawback of the high complexity and large decoding latency, especially when the system has large codebook size and/or high overloading factor. To address this problem, we introduce a novel receiver design by applying the expectation propagation algorithm (EPA) to the joint detection and decoding (JDD) involving an aggregated factor graph of LDPC code and sparse codebooks. Our numerical results demonstrate the superiority of the proposed EPA based JDD receiver over the conventional Turbo receiver in terms of both significantly lower complexity and faster convergence rate without noticeable error rate performance degradation.

Submitted 11 February, 2022; originally announced February 2022.

arXiv:2202.01478 [pdf, other]

Trajectory Forecasting from Detection with Uncertainty-Aware Motion Encoding

Authors: Pu Zhang, Lei Bai, Jianru Xue, Jianwu Fang, Nanning Zheng, Wanli Ouyang

Abstract: Trajectory forecasting is critical for autonomous platforms to make safe planning and actions. Currently, most trajectory forecasting methods assume that object trajectories have been extracted and directly develop trajectory predictors based on the ground truth trajectories. However, this assumption does not hold in practical situations. Trajectories obtained from object detection and tracking are inevitably noisy, which could cause serious forecasting errors to predictors built on ground truth trajectories. In this paper, we propose a trajectory predictor directly based on detection results without relying on explicitly formed trajectories. Different from the traditional methods which encode the motion cue of an agent based on its clearly defined trajectory, we extract the motion information only based on the affinity cues among detection results, in which an affinity-aware state update mechanism is designed to take the uncertainty of association into account. In addition, considering that there could be multiple plausible matching candidates, we aggregate the states of them. This design relaxes the undesirable effect of noisy trajectory obtained from data association. Extensive ablation experiments validate the effectiveness of our method and its generalization ability on different detectors. Cross-comparison to other forecasting schemes further proves the superiority of our method. Code will be released upon acceptance.

Submitted 10 February, 2022; v1 submitted 3 February, 2022; originally announced February 2022.

Comments: 11 pages, 4 figures

arXiv:2201.03188 [pdf, other]

Writing Style Aware Document-level Event Extraction

Authors: Zhuo Xu, Yue Wang, Lu Bai, Lixin Cui

Abstract: Event extraction, the technology that aims to automatically get the structural information from documents, has attracted more and more attention in many fields. Most existing works treat this issue with a token-level multi-label classification framework that assigns tokens to different roles while ignoring the writing styles of documents. The writing style is a particular way of organizing content in documents, and it is relatively fixed in documents from a specific field (e.g., financial or medical documents). We argue that the writing style contains important clues for judging the roles of tokens and that ignoring such patterns might degrade the performance of existing works. To this end, we model the writing style in documents as a distribution of argument roles, i.e., Role-Rank Distribution, and propose an event extraction model with the Role-Rank Distribution based Supervision Mechanism to capture this pattern through the supervised training process of an event extraction task. We compare our model with state-of-the-art methods on several real-world datasets. The empirical results show that our approach outperforms other alternatives with the captured patterns. This verifies that the writing style contains valuable information that could improve the performance of the event extraction task.

Submitted 10 January, 2022; originally announced January 2022.

Comments: This paper has been submitted to Pattern Recognition Letters

arXiv:2112.15280 [pdf, other]

What is Event Knowledge Graph: A Survey

Authors: Saiping Guan, Xueqi Cheng, Long Bai, Fujun Zhang, Zixuan Li, Yutao Zeng, Xiaolong Jin, Jiafeng Guo

Abstract: Besides entity-centric knowledge, usually organized as a Knowledge Graph (KG), events are also an essential kind of knowledge in the world, which has triggered the emergence of event-centric knowledge representation forms such as the Event KG (EKG). It plays an increasingly important role in many downstream applications, such as search, question-answering, recommendation, financial quantitative investments, and text generation. This paper provides a comprehensive survey of EKG from history, ontology, instance, and application views. Specifically, to characterize EKG thoroughly, we focus on its history, definition, schema induction, acquisition, related representative graphs/systems, and applications. The development processes and trends are studied therein. We further summarize prospective directions to facilitate future research on EKG.

Submitted 13 June, 2022; v1 submitted 30 December, 2021; originally announced December 2021.

Comments: Accepted to TKDE 2022

arXiv:2112.14405 [pdf, ps, other]

Single-state or low-lying-states dominance mechanism of $2νββ$-decay nuclear matrix elements

Authors: W. L. Lv, Y. F. Niu, D. L. Fang, C. L. Bai

Abstract: The $2νββ$-decay nuclear matrix elements (NMEs) for 11 nuclei are studied with the self-consistent quasiparticle random phase approximation (QRPA) based on the Skyrme Hartree-Fock-Bogoliubov (Skyrme HFB) model. As a common feature pointed out in Phys. Rev. C 98, 064325 (2018) (https://journals.aps.org/prc/abstract/10.1103/PhysRevC.98.064325), negative contributions in the running sums of NMEs are found, and play important roles in the fulfillment of the single-state dominance or low-lying-states dominance hypothesis. By comparing the results of the QRPA model and the quasiparticle Tamm-Dancoff approximation (QTDA) model, we find that the negative contributions are due to the enhanced ground-state correlations, which are brought by the backward amplitude in the QRPA model and tuned by the strong isoscalar pairing interaction. The enhancement of ground-state correlations changes the signs of the GT$^{+}$ transition amplitudes of higher-lying states and leads to the negative contributions in the running sum.

Submitted 25 April, 2022; v1 submitted 29 December, 2021; originally announced December 2021.

Comments: 9 pages, 8 figures

arXiv:2112.06823 [pdf, other]

Multi-Asset Spot and Option Market Simulation

Authors: Magnus Wiese, Ben Wood, Alexandre Pachoud, Ralf Korn, Hans Buehler, Phillip Murray, Lianjun Bai

Abstract: We construct realistic spot and equity option market simulators for a single underlying on the basis of normalizing flows. We address the high-dimensionality of market observed call prices through an arbitrage-free autoencoder that approximates efficient low-dimensional representations of the prices while maintaining no static arbitrage in the reconstructed surface. Given a multi-asset universe, we leverage the conditional invertibility property of normalizing flows and introduce a scalable method to calibrate the joint distribution of a set of independent simulators while preserving the dynamics of each simulator. Empirical results highlight the goodness of the calibrated simulators and their fidelity.

Submitted 13 December, 2021; originally announced December 2021.

arXiv:2112.02544 [pdf]

Variation between Antiferromagnetism and Ferrimagnetism in NiPS3 by Electron Doping

Authors: Mengjuan Mi, Xingwen Zheng, Shilei Wang, Yang Zhou, Lixuan Yu, Han Xiao, Houning Song, Bing Shen, Fangsen Li, Lihui Bai, Yanxue Chen, Shanpeng Wang, Xiaohui Liu, Yilin Wang

Abstract: Electrical control of the magnetic properties of a magnetic material is promising for spintronic applications, but the investigation of carrier doping effects on antiferromagnetic (AFM) materials remains challenging due to their zero net magnetization. In this work, we found an electron-doping-dependent variation of the magnetic order of a two-dimensional (2D) AFM insulator, NiPS3, where the doping concentration is tuned by intercalating various organic cations into the van der Waals gaps of NiPS3 without introducing defects or impurity phases. The doped NiPS3 shows an AFM-ferrimagnetic (FIM) transition at a doping level of 0.2-0.5 electrons/cell and a FIM-AFM transition at a doping level of >= 0.6 electrons/cell. We propose that the observed phenomenon is due to competition between Stoner exchange dominated inter-chain ferromagnetic order and super-exchange dominated inter-chain AFM order at different doping levels. Our studies provide a viable way to exploit the correlation between electronic structures and magnetic properties of 2D magnetic materials for the realization of the magnetoelectric effect.

Submitted 3 March, 2022; v1 submitted 5 December, 2021; originally announced December 2021.

Comments: 18 pages, 7 figures

arXiv:2112.00496 [pdf, other]

Revisiting the Transferability of Supervised Pretraining: an MLP Perspective

Authors: Yizhou Wang, Shixiang Tang, Feng Zhu, Lei Bai, Rui Zhao, Donglian Qi, Wanli Ouyang

Abstract: The pretrain-finetune paradigm is a classical pipeline in visual learning. Recent progress on unsupervised pretraining methods shows superior transfer performance to their supervised counterparts. This paper revisits this phenomenon and sheds new light on understanding the transferability gap between unsupervised and supervised pretraining from a multilayer perceptron (MLP) perspective. While previous works focus on the effectiveness of MLP on unsupervised image classification where pretraining and evaluation are conducted on the same dataset, we reveal that the MLP projector is also the key factor to better transferability of unsupervised pretraining methods than supervised pretraining methods. Based on this observation, we attempt to close the transferability gap between supervised and unsupervised pretraining by adding an MLP projector before the classifier in supervised pretraining. Our analysis indicates that the MLP projector can help retain intra-class variation of visual features, decrease the feature distribution distance between pretraining and evaluation datasets, and reduce feature redundancy. Extensive experiments on public benchmarks demonstrate that the added MLP projector significantly boosts the transferability of supervised pretraining, e.g. +7.2% top-1 accuracy on the concept generalization task, +5.8% top-1 accuracy for linear evaluation on 12-domain classification tasks, and +0.8% AP on COCO object detection task, making supervised pretraining comparable or even better than unsupervised pretraining.

Submitted 28 March, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

Comments: Accepted by CVPR 2022. [camera ready with supplement]

arXiv:2112.00186 [pdf, other]

doi 10.1088/2040-8986/ac1b7c

Quantum-enhanced rubidium atomic magnetometer based on Faraday rotation via 795-nm Stokes operator squeezed light

Authors: Lele Bai, Xin Wen, Yulin Yang, Lulu Zhang, Jun He, Yanhua Wang, Junmin Wang

Abstract: With the help of the Stokes operator S2 squeezed state (also called the polarization squeezed state (PSS)) of 795-nm light, a rubidium-87 (87Rb) atomic magnetometer based on Faraday rotation has been implemented and characterized. The PSS of Stokes operator S2 of 795-nm light has been prepared by coherently combining the polarization coherent state (PCS) of a linearly p-polarized bright 795-nm light beam and a linearly s-polarized squeezed vacuum state (SVS) generated by a 397.5-nm ultraviolet laser pumped sub-threshold optical parametric oscillator (OPO) with a PPKTP bulk crystal inside the OPO cavity. PSS with a squeezing level of -3.7 has been achieved around the analysis frequency of 10 kHz. At different transitions of the D1 line, various frequency detunings, and reasonable atomic vapor cell temperatures, Faraday rotation has been measured and compared. To decrease absorption (scattering) losses and the back-action from atomic spin noise to the probe beam's polarization noise for maintaining the quantum properties of PSS of Stokes operator S2 of 795-nm light, we had to run our magnetometer with the 87Rb vapor cell's temperature below 60, at which the PSS was almost destroyed. The sensitivities of magnetic field measurement were characterized via measuring the signal-to-noise ratio of the alternating current (AC) calibrated magnetic field signal with a balanced polarimeter. Under the conditions of an atomic number density of $5.8\times10^{10}$ /cm$^3$ and the probe beam with a detuning of -400 MHz relative to the 5S$_{1/2}$ (F$_g$=2) - 5P$_{1/2}$ (F$_e$=1) transition of the 87Rb D1 line, a typical sensitivity of 19.5 pT/Hz$^{1/2}$ has been achieved employing PSS of Stokes operator S2 as the probe, compared with a sensitivity of 28.3 pT/Hz$^{1/2}$ using PCS as the probe. We preliminarily demonstrated the quantum-enhanced sensitivity of a Faraday-rotation-based 87Rb atomic magnetometer with the help of PSS of 795-nm light.

Submitted 5 December, 2021; v1 submitted 30 November, 2021; originally announced December 2021.

Comments: 8 pages, 5 figures J. Opt. 23 (2021) 085202

arXiv:2111.09572 [pdf, ps, other]

doi 10.1364/OE.448084

Enhancement of spin noise spectroscopy of rubidium atomic ensemble by using of the polarization squeezed light

Authors: Lele Bai, Lulu Zhang, Yongbiao Yang, Rui Chang, Yao Qin, Jun He, Xin Wen, Junmin Wang

Abstract: We measured the spin noise spectroscopy (SNS) of a rubidium atomic ensemble with two different atomic vapor cells (filled with buffer gases or coated with paraffin film on the inner wall), and demonstrated the enhancement of the signal-to-noise ratio (SNR) by using the polarization squeezed state (PSS) of a 795 nm light field with Stokes operator S2 squeezed. The PSS is prepared by locking the relative phase between the squeezed vacuum state of light obtained from a sub-threshold optical parametric oscillator and the orthogonally polarized local oscillator beam by means of the quantum noise lock. Under the same conditions, compared with the case of using the polarization coherent state (PCS), the PSS not only improves the SNR but also keeps the full width at half maximum (FWHM) of the SNS unchanged, and the enhancement of SNR is positively correlated with the squeezing level of the PSS. With the increase of probe laser power and atomic number density, the SNR and FWHM of SNS will increase correspondingly. With the help of PSS of Stokes operator S2, quantum enhancement of both SNR and FWHM of the SNS signal has been demonstrated by controlling the optical power of the S2 polarization squeezed light beam or the atomic number density in our experiments.

Submitted 18 November, 2021; originally announced November 2021.

arXiv:2111.06545 [pdf, ps, other]

doi 10.1126/science.abg5137

Peta-electron volt gamma-ray emission from the Crab Nebula

Authors: The LHAASO Collaboration, Zhen Cao, F. Aharonian, Q. An, Axikegu, L. X. Bai, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, H. Cai, J. T. Cai, Zhe Cao, J. Chang, J. F. Chang, B. M. Chen, E. S. Chen, J. Chen, Liang Chen, Liang Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen , et al. (250 additional authors not shown)

Abstract: The Crab pulsar and the surrounding nebula powered by the pulsar's rotational energy through the formation and termination of a relativistic electron-positron wind is a bright source of gamma-rays carrying crucial information about this complex conglomerate. We report the detection of $γ$-rays with a spectrum showing gradual steepening over three energy decades, from $5\times 10^{-4}$ to $1.1$ petaelectronvolt (PeV). The ultra-high-energy photons exhibit the presence of a PeV electron accelerator (a pevatron) with an acceleration rate exceeding 15% of the absolute theoretical limit. Assuming that unpulsed $γ$-rays are produced at the termination of the pulsar's wind, we constrain the pevatron's size, between $0.025$ and $0.1$ pc, and the magnetic field $\approx 110 μ$G. The production rate of PeV electrons, $2.5 \times 10^{36}$ erg $\rm s^{-1}$, constitutes 0.5% of the pulsar's spin-down luminosity, although we do not exclude a non-negligible contribution of PeV protons to the production of the highest energy $γ$-rays.

Submitted 11 November, 2021; originally announced November 2021.

Comments: 43 pages, 13 figures, 2 tables; Published in Science

Journal ref: Science, 2021, Vol 373, Issue 6553, pp. 425-430

Showing 151–200 of 370 results for author: Bai, L