Search | arXiv e-print repository

arXiv:2210.02026 [pdf, other]

doi 10.1103/PhysRevB.107.125112

Truncated atomic plane wave method for the subband structure calculations of Moiré systems

Authors: Wangqian Miao, Chu Li, Xu Han, Ding Pan, Xi Dai

Abstract: We propose a highly efficient and accurate numerical scheme, named the Truncated Atomic Plane Wave (TAPW) method, to determine the subband structure of Twisted Bilayer Graphene (TBG), inspired by the BM model. Our method utilizes real space information of carbon atoms in the moiré unit cell and projects the full tight binding Hamiltonian into a much smaller subspace using atomic plane waves. We present accurate electronic band structures of TBG in a wide range of twist angles together with detailed moiré potential and screened Coulomb interaction at the first magic angle using our new method. Furthermore, we generalize our formalism to solve the problem of low frequency moiré phonons in TBG.

Submitted 15 February, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

arXiv:2210.01030 [pdf, ps, other]

The Answer to Baggett's Problem is Affirmative

Authors: Xingde Dai

Abstract: Let $ψ$ be a Parseval wavelet in $L^2 (\R)$ with the space of negative dilates $V(ψ)$. The intersection of the dilates $V(ψ)$ is the zero space. In other words, we have \begin{align*} \bigcap_{n\in\Z} D^n \overline{\textrm{span}}\{D^{\textrm{-}m} T^\ell ψ\mid m\geq 0, m,\ell\in\Z\}=\{0\}. \end{align*}

Submitted 3 October, 2022; originally announced October 2022.

arXiv:2209.15019 [pdf, other]

Revealing AGNs Through TESS Variability

Authors: Helena P. Treiber, Jason T. Hinkle, Michael M. Fausnaugh, Benjamin J. Shappee, Christopher S. Kochanek, Patrick J. Vallely, Katie Auchettl, Thomas W. S. Holoien, Anna V. Payne, Xinyu Dai

Abstract: We used Transiting Exoplanet Survey Satellite (TESS) data to identify 29 candidate active galactic nuclei (AGNs) through their optical variability. The high-cadence, high-precision TESS light curves present a unique opportunity for the identification of AGNs, including those not selected through other methods. Of the candidates, we found that 18 have either previously been identified as AGNs in the literature or could have been selected based on emission-line diagnostics, mid-IR colors, or X-ray luminosity. AGNs in low-mass galaxies offer a window into supermassive black hole (SMBH) and galaxy co-evolution and 8 of the 29 candidates have estimated black hole masses $\mathrm{\lesssim 10^{6} M_{\odot}}$. The low-mass galaxies NGC 4395 and NGC 4449 are two of our five "high-confidence" candidates. By applying our methodology to the entire TESS main and extended mission datasets, we expect to identify $\sim$45 more AGN candidates, of which $\sim$26 will be new and $\sim$8 will be in low-mass galaxies.

Submitted 29 September, 2022; originally announced September 2022.

Comments: 21 pages, 17 figures, 6 tables. Will be submitted to AAS journals. Comments welcome

arXiv:2209.13865 [pdf, other]

Zero-Shot 3D Drug Design by Sketching and Generating

Authors: Siyu Long, Yi Zhou, Xinyu Dai, Hao Zhou

Abstract: Drug design is a crucial step in the drug discovery cycle. Recently, various deep learning-based methods design drugs by generating novel molecules from scratch, avoiding traversing large-scale drug libraries. However, they depend on scarce experimental data or time-consuming docking simulation, leading to overfitting issues with limited training data and slow generation speed. In this study, we propose the zero-shot drug design method DESERT (Drug dEsign by SkEtching and geneRaTing). Specifically, DESERT splits the design process into two stages: sketching and generating, and bridges them with the molecular shape. The two-stage fashion enables our method to utilize the large-scale molecular database to reduce the need for experimental data and docking simulation. Experiments show that DESERT achieves a new state-of-the-art at a fast speed.

Submitted 4 October, 2022; v1 submitted 28 September, 2022; originally announced September 2022.

Comments: NeurIPS 2022 camera-ready

arXiv:2209.09515 [pdf, other]

doi 10.1103/PhysRevB.106.245129

Heavy fermion representation for twisted bilayer graphene systems

Authors: Hao Shi, Xi Dai

Abstract: We construct a heavy fermion representation for twisted bilayer graphene (TBG) systems. Two local orbitals (per spin/valley) are analytically found, which are exactly the maximally localized zero modes of the continuum Hamiltonian near the AA-stacking center. They have similar properties to the Wannier functions in [arXiv:2111.05865v2], but also have a clear interpretation as the zeroth pseudo Landau levels (ZLL) of Dirac fermions under the uniform strain field created by twisting [arXiv:1810.03103v3]. The electronic states of TBG can be viewed as the hybridization between these ZLL orbitals and other itinerant states which can be obtained following the standard procedure of orthogonalized plane wave method. The "heavy fermion" model for TBG separates the strongly correlated components from the itinerant components and provides a solid base for the comprehensive understanding of the exotic physics in TBG.

Submitted 10 November, 2022; v1 submitted 20 September, 2022; originally announced September 2022.

Journal ref: Physical Review B 106, 245129 (2022)

arXiv:2209.07672 [pdf, other]

Nonparametric Estimation via Partial Derivatives

Authors: Xiaowu Dai

Abstract: Traditional nonparametric estimation methods often lead to a slow convergence rate in large dimensions and require unrealistically enormous sizes of datasets for reliable conclusions. We develop an approach based on partial derivatives, either observed or estimated, to effectively estimate the function at near-parametric convergence rates. The novel approach and computational algorithm could lead to methods useful to practitioners in many areas of science and engineering. Our theoretical results reveal a behavior universal to this class of nonparametric estimation problems. We explore a general setting involving tensor product spaces and build upon the smoothing spline analysis of variance (SS-ANOVA) framework. For $d$-dimensional models under full interaction, the optimal rates with gradient information on $p$ covariates are identical to those for the $(d-p)$-interaction models without gradients and, therefore, the models are immune to the "curse of interaction." For additive models, the optimal rates using gradient information are root-$n$, thus achieving the "parametric rate." We demonstrate aspects of the theoretical results through synthetic and real data applications.

Submitted 18 August, 2024; v1 submitted 15 September, 2022; originally announced September 2022.

Comments: To appear in JRSSB

arXiv:2209.07484 [pdf, other]

Hydra Attention: Efficient Attention with Many Heads

Authors: Daniel Bolya, Cheng-Yang Fu, Xiaoliang Dai, Peizhao Zhang, Judy Hoffman

Abstract: While transformers have begun to dominate many tasks in vision, applying them to large images is still computationally difficult. A large reason for this is that self-attention scales quadratically with the number of tokens, which in turn, scales quadratically with the image size. On larger images (e.g., 1080p), over 60% of the total computation in the network is spent solely on creating and applying attention matrices. We take a step toward solving this issue by introducing Hydra Attention, an extremely efficient attention operation for Vision Transformers (ViTs). Paradoxically, this efficiency comes from taking multi-head attention to its extreme: by using as many attention heads as there are features, Hydra Attention is computationally linear in both tokens and features with no hidden constants, making it significantly faster than standard self-attention in an off-the-shelf ViT-B/16 by a factor of the token count. Moreover, Hydra Attention retains high accuracy on ImageNet and, in some cases, actually improves it.

Submitted 15 September, 2022; originally announced September 2022.

Comments: Accepted CADL 2022 (ECCV Workshop)
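
The Hydra Attention abstract above states that using as many heads as there are features makes attention linear in both tokens and features. A minimal sketch of why that can hold is given here, under assumptions the abstract does not state: each head sees a single feature, the softmax is replaced by a plain product of L2-normalized queries and keys, and the function names (`hydra_attention`, `naive_one_feature_heads`) are hypothetical. With one feature per head, the token-by-token attention matrix factorizes, so the keys and values can be summed once and reused for every query.

```python
import math


def l2_normalize(rows, eps=1e-12):
    """Scale each token vector (row) to unit L2 norm."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row)) + eps
        out.append([x / norm for x in row])
    return out


def hydra_attention(q, k, v):
    """Linear-cost sketch: O(tokens * features), no token-by-token matrix.

    q, k, v are lists of token vectors with the same shape.
    Assumption: a product of L2-normalized q and k stands in for softmax.
    """
    qn, kn = l2_normalize(q), l2_normalize(k)
    n_tokens, n_features = len(v), len(v[0])
    # Global summary: one number per feature, summed over all tokens.
    kv = [sum(kn[t][d] * v[t][d] for t in range(n_tokens))
          for d in range(n_features)]
    # Each query gates the shared summary feature by feature.
    return [[qn[t][d] * kv[d] for d in range(n_features)]
            for t in range(len(q))]


def naive_one_feature_heads(q, k, v):
    """Reference: one head per feature, explicit pairwise token products."""
    qn, kn = l2_normalize(q), l2_normalize(k)
    n_tokens, n_features = len(v), len(v[0])
    out = [[0.0] * n_features for _ in range(n_tokens)]
    for d in range(n_features):
        for t in range(n_tokens):
            out[t][d] = sum(qn[t][d] * kn[s][d] * v[s][d]
                            for s in range(n_tokens))
    return out


if __name__ == "__main__":
    q = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
    k = [[0.2, 0.1], [1.0, 1.0], [-2.0, 0.5]]
    v = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
    fast = hydra_attention(q, k, v)
    slow = naive_one_feature_heads(q, k, v)
    print(all(abs(a - b) < 1e-9
              for ra, rb in zip(fast, slow) for a, b in zip(ra, rb)))
```

The two functions compute the same values; the first avoids the quadratic loop over token pairs, which is the source of the linear scaling claimed in the abstract. The paper itself should be consulted for the actual kernel and implementation.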

arXiv:2209.06952 [pdf]

doi 10.1088/1361-6501/acb5b3

Landmark Tracking in Liver US images Using Cascade Convolutional Neural Networks with Long Short-Term Memory

Authors: Yupei Zhang, Xianjin Dai, Zhen Tian, Yang Lei, Jacob F. Wynne, Pretesh Patel, Yue Chen, Tian Liu, Xiaofeng Yang

Abstract: This study proposed a deep learning-based tracking method for ultrasound (US) image-guided radiation therapy. The proposed cascade deep learning model is composed of an attention network, a mask region-based convolutional neural network (mask R-CNN), and a long short-term memory (LSTM) network. The attention network learns a mapping from a US image to a suspected area of landmark motion in order to reduce the search region. The mask R-CNN then produces multiple region-of-interest (ROI) proposals in the reduced region and identifies the proposed landmark via three network heads: bounding box regression, proposal classification, and landmark segmentation. The LSTM network models the temporal relationship among the successive image frames for bounding box regression and proposal classification. To consolidate the final proposal, a selection method is designed according to the similarities between sequential frames. The proposed method was tested on the liver US tracking datasets used in the Medical Image Computing and Computer Assisted Interventions (MICCAI) 2015 challenges, where the landmarks were annotated by three experienced observers to obtain their mean positions. Five-fold cross-validation on the 24 given US sequences with ground truths shows that the mean tracking error for all landmarks is 0.65+/-0.56 mm, and the errors of all landmarks are within 2 mm. We further tested the proposed model on 69 landmarks from the testing dataset that has a similar image pattern to the training pattern, resulting in a mean tracking error of 0.94+/-0.83 mm. Our experimental results have demonstrated the feasibility and accuracy of our proposed method in tracking liver anatomic landmarks using US images, providing a potential solution for real-time liver tracking for active motion management during radiation therapy.

Submitted 14 September, 2022; originally announced September 2022.

arXiv:2208.13962 [pdf, ps, other]

Singular Weyl's law with Ricci curvature bounded below

Authors: Xianzhe Dai, Shouhei Honda, Jiayin Pan, Guofang Wei

Abstract: We establish two surprising types of Weyl's laws for some compact $\mathrm{RCD}(K, N)$/Ricci limit spaces. The first type could have power growth of any order (bigger than one). The other one has an order corrected by logarithm similar to some fractals even though the space is 2-dimensional. Moreover the limits in both types can be written in terms of the singular sets of null capacities, instead of the regular sets. These are the first examples with such features for $\mathrm{RCD}(K,N)$ spaces. Our results depend crucially on analyzing and developing important properties of the examples constructed by the last two authors, showing them isometric to the $α$-Grushin halfplanes. Of independent interest, this also allows us to provide counterexamples to conjectures by Cheeger-Colding and by Kapovitch-Kell-Ketterer.

Submitted 6 May, 2023; v1 submitted 29 August, 2022; originally announced August 2022.

Comments: Final version. To appear in Trans. AMS Series B. 41 pages

arXiv:2208.13722 [pdf, other]

Open-Set Semi-Supervised Object Detection

Authors: Yen-Cheng Liu, Chih-Yao Ma, Xiaoliang Dai, Junjiao Tian, Peter Vajda, Zijian He, Zsolt Kira

Abstract: Recent developments for Semi-Supervised Object Detection (SSOD) have shown the promise of leveraging unlabeled data to improve an object detector. However, thus far these methods have assumed that the unlabeled data does not contain out-of-distribution (OOD) classes, which is unrealistic with larger-scale unlabeled datasets. In this paper, we consider a more practical yet challenging problem, Open-Set Semi-Supervised Object Detection (OSSOD). We first find the existing SSOD method obtains a lower performance gain in open-set conditions, and this is caused by the semantic expansion, where the distracting OOD objects are mispredicted as in-distribution pseudo-labels for the semi-supervised training. To address this problem, we consider online and offline OOD detection modules, which are integrated with SSOD methods. With the extensive studies, we found that leveraging an offline OOD detector based on a self-supervised vision transformer performs favorably against online OOD detectors due to its robustness to the interference of pseudo-labeling. In the experiment, our proposed framework effectively addresses the semantic expansion issue and shows consistent improvements on many OSSOD benchmarks, including large-scale COCO-OpenImages. We also verify the effectiveness of our framework under different OSSOD conditions, including varying numbers of in-distribution classes, different degrees of supervision, and different combinations of unlabeled sets.

Submitted 29 August, 2022; originally announced August 2022.

Comments: Project Page is at https://ycliu93.github.io/projects/ossod.html

arXiv:2208.12257 [pdf, other]

Video Mobile-Former: Video Recognition with Efficient Global Spatial-temporal Modeling

Authors: Rui Wang, Zuxuan Wu, Dongdong Chen, Yinpeng Chen, Xiyang Dai, Mengchen Liu, Luowei Zhou, Lu Yuan, Yu-Gang Jiang

Abstract: Transformer-based models have achieved top performance on major video recognition benchmarks. Benefiting from the self-attention mechanism, these models show a stronger ability to model long-range dependencies than CNN-based models. However, significant computation overheads, resulting from the quadratic complexity of self-attention on top of a tremendous number of tokens, limit the use of existing video transformers in applications with limited resources like mobile devices. In this paper, we extend Mobile-Former to Video Mobile-Former, which decouples the video architecture into lightweight 3D-CNNs for local context modeling and Transformer modules for global interaction modeling in a parallel fashion. To avoid the significant computational cost incurred by computing self-attention between the large number of local patches in videos, we propose to use very few global tokens (e.g., 6) for a whole video in Transformers to exchange information with 3D-CNNs with a cross-attention mechanism. Through efficient global spatial-temporal modeling, Video Mobile-Former significantly improves the video recognition performance of alternative lightweight baselines, and outperforms other efficient CNN-based models at the low FLOP regime from 500M to 6G total FLOPs on various video recognition tasks. It is worth noting that Video Mobile-Former is the first Transformer-based video model which constrains the computational budget within 1G FLOPs.

Submitted 25 August, 2022; originally announced August 2022.

arXiv:2208.08776 [pdf, ps, other]

Perelman's functionals on manifolds with non-isolated conical singularities

Authors: Xianzhe Dai, Changliang Wang

Abstract: In this article, we define Perelman's functionals on manifolds with non-isolated conical singularities by starting from a spectral point of view for the Perelman's $λ$-functional. (Our definition of non-isolated conical singularities includes isolated conical singularities.) We prove that the spectrum of Schrödinger operator $-4Δ+ R$ on manifolds with non-isolated conical singularities consists of discrete eigenvalues with finite multiplicities, provided that scalar curvatures of cross sections of cones have a certain lower bound. This enables us to define the $λ$-functional on these singular manifolds, and further, to prove that the infimum of $W$-functional is finite, with the help of some weighted Sobolev inequalities. Furthermore, we obtain some asymptotic behavior of eigenfunctions and the minimizer of the $W$-functional near the singularity, and a more refined optimal partial asymptotic expansion for eigenfunctions near isolated conical singularities. We also study the spectrum of $-4Δ+ R$ and Perelman's functionals on manifolds with more general singularities, i.e. the $r^α$-horn singularities which serve as prototypes of algebraic singularities.

Submitted 12 November, 2023; v1 submitted 18 August, 2022; originally announced August 2022.

Comments: Sections are re-ordered, some proofs are simplified, some typos and errors are corrected, revised version, 58 pages

arXiv:2208.08660 [pdf, other]

doi 10.1007/JHEP07(2023)198

Study of $B_c^+$ meson decays to charmonia plus multihadron final states

Authors: LHCb Collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, B. Adeva, M. Adinolfi, P. Adlarson, H. Afsharnia, C. Agapopoulou, C. A. Aidala, S. Aiola, Z. Ajaltouni, S. Akar, K. Akiba, J. Albrecht, F. Alessio, M. Alexander, A. Alfonso Albero, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, et al. (1050 additional authors not shown)

Abstract: Four decay modes of the $B_c^+$ meson into a $J/ψ$ meson and multiple charged kaons or pions are studied using proton-proton collision data, collected with the LHCb detector at centre-of-mass energies of 7, 8, and 13 TeV and corresponding to an integrated luminosity of $9$ fb$^{-1}$. The decay $B_c^+\to J/ψ K^+ K^- π^+ π^+ π^-$ is observed for the first time, and evidence for the $B_c^+\to J/ψ 4π^+ 3π^-$ decay is found. The decay $B_c^+\to J/ψ 3π^+ 2π^-$ is observed, and the previous observation of the $B_c^+\to ψ(2S) π^+ π^+ π^-$ decay is confirmed using the $ψ(2S) \to J/ψ π^+ π^-$ decay mode. Ratios of the branching fractions of these four $B_c^+$ decay channels are measured.

Submitted 13 December, 2023; v1 submitted 18 August, 2022; originally announced August 2022.

Comments: 18 pages, 6 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-025.html (LHCb public pages)

Report number: CERN-EP-2022-162, LHCb-PAPER-2022-025

Journal ref: JHEP 07 (2023) 198

arXiv:2208.05571 [pdf, ps, other]

doi 10.1103/PhysRevResearch.5.033155

Tunable Coupler for Mediating Interactions between a Two-Level System and a Waveguide from a Decoupled State to the Ultra-Strong Coupling Regime

Authors: N. Janzen, X. Dai, S. Ren, J. Shi, A. Lupascu

Abstract: Two-level systems (TLS) coupled to waveguides are a fundamental paradigm for light-matter interactions and quantum networks. We introduce and experimentally demonstrate a method to tune the interaction between a TLS, implemented as a flux qubit, and a transmission line waveguide from a decoupled state to a coupling strength that is a significant fraction of the TLS transition frequency, near the ultra-strong coupling regime. The coupling, controlled via magnetic flux, is described by a normalized coupling strength $α$ that is measured to range between $6.2\times10^{-5}$ and $2.19\times10^{-2}$, with larger attainable maximum values predicted by a circuit model of the device. This system enables future investigations in the dynamics of the spin-boson model, microwave photonics, and relativistic quantum information.

Submitted 1 September, 2023; v1 submitted 10 August, 2022; originally announced August 2022.

Comments: Main: 14 pages, 10 figures

Journal ref: Phys. Rev. Res. 2023, 5 (3), 033155

arXiv:2208.00942 [pdf, ps, other]

doi 10.1103/PhysRevB.106.235103

Interaction Driven Topological Phase Transition in Monolayer CrCl$_2$(pyrazine)$_2$

Authors: Xuecong Ji, Jiacheng Gao, Changming Yue, Zhijun Wang, Hua Wu, Xi Dai, Hongming Weng

Abstract: The quadratic band crossing points (QBCPs) at the Fermi level in two dimensions have been proposed to be unstable under electron-electron interaction. The possible interaction driven states include the quantum anomalous Hall (QAH) state and various nematic ordered states. In this work, motivated by the discovery of the ferromagnetic van der Waals layered metal-organic framework CrCl$_2$(pyrazine)$_2$, we theoretically propose that the single layer of CrCl$_2$(pyrazine)$_2$ might realize one or some of these interaction driven states based on the QBCP protected by $C_4$ symmetry. By introducing the short-range density-density type repulsion interactions into this system, we have found the phase diagram depending on different interaction range and strength. The exotic phases include the staggered chiral flux state manifesting QAH effect, the site-nematic insulator and the site-nematic Dirac semimetal state. The QAH state is robust against perturbations breaking the QBCP but it is weakened by increasing temperature. The metal-organic framework is tunable by changing the transition-metal elements, which might improve the gap size and stability of this interaction induced QAH state.

Submitted 1 August, 2022; originally announced August 2022.

arXiv:2207.12760 [pdf]

The Quadruplon in a Monolayer Semiconductor

Authors: Jiacheng Tang, Hao Sun, Qiyao Zhang, Xingcan Dai, Zhen Wang, Cun-Zheng Ning

Abstract: Understanding the structure of matter or materials and interaction or correlations among the constituent elementary particles are the central tasks of all branches of science, from physics, chemistry, to biology. In physics, this ultimate goal has spurred a constant search for high-order correlated entities or composite particles for nearly all states and forms of matter, from elementary particles, nuclei, cold atoms, to condensed matter. So far, such composite particles involving two or three constituent particles have been experimentally identified, such as the Cooper pairs, excitons, and trions in condensed matter physics, or diquarks and mesons in quantum chromodynamics. Although the four-body irreducible entities have long been predicted theoretically in a variety of materials systems alternatively as quadruplons, quadrons, or quartets, the closely related experimental observation so far seems to be restricted to the field of elementary particles (e.g. the recent tetraquark at CERN) only. In this article, we present the first experimental evidence for the existence of a four-body irreducible entity, the quadruplon, involving two electrons and two holes in a monolayer of Molybdenum Ditelluride. Using the optical pump-probe technique, we discovered a series of new spectral features that are distinct from those of trions and bi-excitons. By solving the four-body Bethe-Salpeter equation in conjunction with the cluster expansion approach, we are able to explain these spectral features in terms of the four-body irreducible cluster or the quadruplons. In contrast to a bi-exciton which consists of two weakly bound excitons, a quadruplon consists of two electrons and two holes without the presence of an exciton.

Submitted 4 May, 2024; v1 submitted 26 July, 2022; originally announced July 2022.

arXiv:2207.11852 [pdf, ps, other]

On recurrence in zero-dimensional locally compact flow with compactly generated phase group

Authors: Xiongping Dai

Abstract: We define recurrence for a compactly generated para-topological group $G$ acting continuously on a locally compact Hausdorff space $X$ with $\dim X=0$, and then, show that if $\overline{Gx}$ is compact for all $x\in X$, the conditions (i) this dynamics is pointwise recurrent, (ii) $X$ is a union of $G$-minimal sets, (iii) the $G$-orbit closure relation is closed in $X\times X$, and (iv) $X\ni x\mapsto \overline{Gx}\in 2^X$ is continuous, are pairwise equivalent. Consequently, if this dynamics is pointwise product recurrent, then it is pointwise regularly almost periodic and equicontinuous; moreover, a distal, compact, and non-connected $G$-flow has a non-trivial equicontinuous pointwise regularly almost periodic factor.

Submitted 24 July, 2022; originally announced July 2022.

Comments: 23 pages; an extension of 2203.08466; to appear in Illinois J. Math. arXiv admin note: text overlap with arXiv:2203.08466

arXiv:2207.10777 [pdf, other]

An advanced combination of semi-supervised Normalizing Flow & Yolo (YoloNF) to detect and recognize vehicle license plates

Authors: Khalid Oublal, Xinyi Dai

Abstract: Fully Automatic License Plate Recognition (ALPR) has been a frequent research topic due to several practical applications. However, many of the current solutions are still not robust enough in real situations, commonly depending on many constraints. This paper presents a robust and efficient ALPR system based on the state-of-the-art YOLO object detector and Normalizing flows. The model uses two new strategies. First, a two-stage network, using YOLO and a normalizing flow-based model for normalization, detects License Plates (LP) and recognizes the LP with numbers and Arabic characters. Second, multi-scale image transformations are implemented to provide a solution to the problem of the YOLO cropped LP detection including significant background noise. Furthermore, extensive experiments are conducted on a new dataset with realistic scenarios; we introduce a larger public annotated dataset collected from Moroccan plates. We demonstrate that our proposed model can learn on a small number of samples free of single or multiple characters. The dataset will also be made publicly available to encourage further studies and research on plate detection and recognition.

Submitted 21 July, 2022; originally announced July 2022.

Comments: arXiv admin note: text overlap with arXiv:1802.09567 by other authors; text overlap with arXiv:2012.06737 by other authors without attribution

arXiv:2207.07187 [pdf, other]

doi 10.1145/3543507.3583446

NASRec: Weight Sharing Neural Architecture Search for Recommender Systems

Authors: Tunhou Zhang, Dehua Cheng, Yuchen He, Zhengxing Chen, Xiaoliang Dai, Liang Xiong, Feng Yan, Hai Li, Yiran Chen, Wei Wen

Abstract: The rise of deep neural networks offers new opportunities in optimizing recommender systems. However, optimizing recommender systems using deep neural networks requires delicate architecture fabrication. We propose NASRec, a paradigm that trains a single supernet and efficiently produces abundant models/sub-architectures by weight sharing. To overcome the data multi-modality and architecture heterogeneity challenges in the recommendation domain, NASRec establishes a large supernet (i.e., search space) to search the full architectures. The supernet incorporates versatile choice of operators and dense connectivity to minimize human efforts for finding priors. The scale and heterogeneity in NASRec impose several challenges, such as training inefficiency, operator-imbalance, and degraded rank correlation. We tackle these challenges by proposing single-operator any-connection sampling, operator-balancing interaction modules, and post-training fine-tuning. Our crafted models, NASRecNet, show promising results on three Click-Through Rates (CTR) prediction benchmarks, indicating that NASRec outperforms both manually designed models and existing NAS methods with state-of-the-art performance. Our work is publicly available at https://github.com/facebookresearch/NasRec.

Submitted 12 February, 2023; v1 submitted 14 July, 2022; originally announced July 2022.

Comments: Proceedings of the ACM Web Conference 2023 (WWW'23)

Journal ref: Proceedings of the ACM Web Conference 2023 (WWW'23)

arXiv:2207.04811 [pdf, ps, other]

doi 10.1007/s00209-023-03229-2

Asymptotic Spectral Flow

Authors: Xianzhe Dai, Yihan Li

Abstract: In this paper we study the asymptotic behavior of the spectral flow of a one-parameter family $\{D_s\}$ of Dirac operators acting on the spinor bundle $S$ twisted by a vector bundle $E$ of rank $k$, with the parameter $s\in [0,r]$ when $r$ gets sufficiently large. Our method uses the variation of eta invariant and local index theory technique. The key is a uniform estimate of the eta invariant $\bar{η}(D_r)$ which is established via local index theory technique and heat kernel estimate.

Submitted 11 July, 2022; originally announced July 2022.

arXiv:2207.04196 [pdf, other]

doi 10.1109/LRA.2022.3195189

Robotic Depowdering for Additive Manufacturing Via Pose Tracking

Authors: Zhenwei Liu, Junyi Geng, Xikai Dai, Tomasz Swierzewski, Kenji Shimada

Abstract: With the rapid development of powder-based additive manufacturing, depowdering, a process of removing unfused powder that covers 3D-printed parts, has become a major bottleneck to further improve its productiveness. Traditional manual depowdering is extremely time-consuming and costly, and some prior automated systems either require pre-depowdering or lack adaptability to different 3D-printed parts. To solve these problems, we introduce a robotic system that automatically removes unfused powder from the surface of 3D-printed parts. The key component is a visual perception system, which consists of a pose-tracking module that tracks the 6D pose of powder-occluded parts in real-time, and a progress estimation module that estimates the depowdering completion percentage. The tracking module can be run efficiently on a laptop CPU at up to 60 FPS. Experiments show that our depowdering system can remove unfused powder from the surface of various 3D-printed parts without causing any damage. To the best of our knowledge, this is one of the first vision-based robotic depowdering systems that adapt to parts with various shapes without the need for pre-depowdering.

Submitted 4 September, 2022; v1 submitted 9 July, 2022; originally announced July 2022.

Comments: Video link: https://www.youtube.com/watch?v=AUIkyULAhqM

Journal ref: 2022 IEEE Robotics and Automation Letters

arXiv:2207.03520 [pdf, other]

Should All Proposals be Treated Equally in Object Detection?

Authors: Yunsheng Li, Yinpeng Chen, Xiyang Dai, Dongdong Chen, Mengchen Liu, Pei Yu, Jing Yin, Lu Yuan, Zicheng Liu, Nuno Vasconcelos

Abstract: The complexity-precision trade-off of an object detector is a critical problem for resource constrained vision tasks. Previous works have emphasized detectors implemented with efficient backbones. The impact on this trade-off of proposal processing by the detection head is investigated in this work. It is hypothesized that improved detection efficiency requires a paradigm shift, towards the unequal processing of proposals, assigning more computation to good proposals than poor ones. This results in better utilization of available computational budget, enabling higher accuracy for the same FLOPS. We formulate this as a learning problem where the goal is to assign operators to proposals, in the detection head, so that the total computational cost is constrained and the precision is maximized. The key finding is that such matching can be learned as a function that maps each proposal embedding into a one-hot code over operators. While this function induces a complex dynamic network routing mechanism, it can be implemented by a simple MLP and learned end-to-end with off-the-shelf object detectors. This 'dynamic proposal processing' (DPP) is shown to outperform state-of-the-art end-to-end object detectors (DETR, Sparse R-CNN) by a clear margin for a given computational complexity.

Submitted 7 July, 2022; originally announced July 2022.

Comments: Accepted by ECCV 2022

arXiv:2207.02017 [pdf, ps, other]

Dissipative Landau-Zener tunneling: crossover from weak to strong environment coupling

Authors: X. Dai, R. Trappen, H. Chen, D. Melanson, M. A. Yurtalan, D. M. Tennant, A. J. Martinez, Y. Tang, E. Mozgunov, J. Gibson, J. A. Grover, S. M. Disseler, J. I. Basham, S. Novikov, R. Das, A. J. Melville, B. M. Niedzielski, C. F. Hirjibehedin, K. Serniak, S. J. Weber, J. L. Yoder, W. D. Oliver, K. M. Zick, D. A. Lidar, A. Lupascu

Abstract: Landau-Zener (LZ) tunneling, describing transitions in a two-level system during a sweep through an anti-crossing, is a model applicable to a wide range of physical phenomena, such as atomic collisions, chemical reactions, and molecular magnets, and has been extensively studied theoretically and experimentally. Dissipation due to coupling between the system and environment is an important factor in determining the transition rates. Here we report experimental results on the dissipative LZ transition. Using a tunable superconducting flux qubit, we observe for the first time the crossover from weak to strong coupling to the environment. The weak coupling limit corresponds to small system-environment coupling and leads to environment-induced thermalization. In the strong coupling limit, environmental excitations dress the system and transitions occur between the dressed states. Our results confirm previous theoretical studies of dissipative LZ tunneling in the weak and strong coupling limits. Our results for the intermediate regime are novel and could stimulate further theoretical development of open system dynamics. This work provides insight into the role of open system effects on quantum annealing, which employs quantum tunneling to search for low-energy solutions to hard computational problems.

Submitted 5 July, 2022; originally announced July 2022.

Comments: 18 pages, 12 figures

arXiv:2206.08621 [pdf, other]

A Graph-Enhanced Click Model for Web Search

Authors: Jianghao Lin, Weiwen Liu, Xinyi Dai, Weinan Zhang, Shuai Li, Ruiming Tang, Xiuqiang He, Jianye Hao, Yong Yu

Abstract: To better exploit search logs and model users' behavior patterns, numerous click models are proposed to extract users' implicit interaction feedback. Most traditional click models are based on the probabilistic graphical model (PGM) framework, which requires manually designed dependencies and may oversimplify user behaviors. Recently, methods based on neural networks are proposed to improve the prediction accuracy of user behaviors by enhancing the expressive ability and allowing flexible dependencies. However, they still suffer from the data sparsity and cold-start problems. In this paper, we propose a novel graph-enhanced click model (GraphCM) for web search. Firstly, we regard each query or document as a vertex, and propose novel homogeneous graph construction methods for queries and documents respectively, to fully exploit both intra-session and inter-session information for the sparsity and cold-start problems. Secondly, following the examination hypothesis, we separately model the attractiveness estimator and examination predictor to output the attractiveness scores and examination probabilities, where graph neural networks and neighbor interaction techniques are applied to extract the auxiliary information encoded in the pre-constructed homogeneous graphs. Finally, we apply combination functions to integrate examination probabilities and attractiveness scores into click predictions. Extensive experiments conducted on three real-world session datasets show that GraphCM not only outperforms the state-of-art models, but also achieves superior performance in addressing the data sparsity and cold-start problems.

Submitted 22 August, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

Comments: 10 pages; Accepted by SIGIR 2021

arXiv:2206.08506 [pdf, other]

A Numerical Reasoning Question Answering System with Fine-grained Retriever and the Ensemble of Multiple Generators for FinQA

Authors: Bin Wang, Jiangzhou Ju, Yunlin Mao, Xin-Yu Dai, Shujian Huang, Jiajun Chen

Abstract: The numerical reasoning in the financial domain -- performing quantitative analysis and summarizing the information from financial reports -- can greatly increase business efficiency and reduce costs of billions of dollars. Here, we propose a numerical reasoning question answering system to answer numerical reasoning questions among financial text and table data sources, consisting of a retriever module, a generator module, and an ensemble module. Specifically, in the retriever module, in addition to retrieving the whole row data, we innovatively design a cell retriever that retrieves the gold cells to avoid bringing unrelated and similar cells in the same row to the inputs of the generator module. In the generator module, we utilize multiple generators to produce programs, which are operation steps to answer the question. Finally, in the ensemble module, we integrate multiple programs to choose the best program as the output of our system. In the final private test set in FinQA Competition, our system obtains 69.79 execution accuracy.

Submitted 16 June, 2022; originally announced June 2022.

arXiv:2206.05994 [pdf, other]

doi 10.23919/ACC53348.2022.9867252

Discretization and Stabilization of Energy-Based Controller for Period Switching Control and Flexible Scheduling

Authors: Seyed Amir Tafrishi, Xiaotian Dai, Yasuhisa Hirata, Alan Burns

Abstract: Emerging advanced control applications, with increased complexity in software but limited computing resources, suggest that real-time controllers should have adaptable designs. These control strategies also should be designed with consideration of the run-time behavior of the system. One of such research attempts is to design the controller along with the task scheduler, known as control-scheduling co-design, for more predictable timing behavior as well as surviving system overloads. Unlike traditional controller designs, which have equal-distance sampling periods, the co-design approach increases the system flexibility and resilience by explicitly considering timing properties, for example using an event-based controller or with multiple sampling times (non-uniform sampling and control). Within this context, we introduce the first work on the discretization of an energy-based controller that can switch arbitrarily between multiple periods and adjust the control parameters accordingly without destabilizing the system. A digital controller design based on this paradigm for a DC motor with an elastic load as an example is introduced and the stability condition is given based on the proposed Lyapunov function. The method is evaluated with various computer-based simulations which demonstrate its effectiveness.

Submitted 13 June, 2022; originally announced June 2022.

Comments: Accepted to 2022 American Control Conference (ACC), 6 pages, 8 figures

arXiv:2206.05836 [pdf, other]

GLIPv2: Unifying Localization and Vision-Language Understanding

Authors: Haotian Zhang, Pengchuan Zhang, Xiaowei Hu, Yen-Chun Chen, Liunian Harold Li, Xiyang Dai, Lijuan Wang, Lu Yuan, Jenq-Neng Hwang, Jianfeng Gao

Abstract: We present GLIPv2, a grounded VL understanding model, that serves both localization tasks (e.g., object detection, instance segmentation) and Vision-Language (VL) understanding tasks (e.g., VQA, image captioning). GLIPv2 elegantly unifies localization pre-training and Vision-Language Pre-training (VLP) with three pre-training tasks: phrase grounding as a VL reformulation of the detection task, region-word contrastive learning as a novel region-word level contrastive learning task, and the masked language modeling. This unification not only simplifies the previous multi-stage VLP procedure but also achieves mutual benefits between localization and understanding tasks. Experimental results show that a single GLIPv2 model (all model weights are shared) achieves near SoTA performance on various localization and understanding tasks. The model also shows (1) strong zero-shot and few-shot adaption performance on open-vocabulary object detection tasks and (2) superior grounding capability on VL understanding tasks. Code will be released at https://github.com/microsoft/GLIP.

Submitted 11 October, 2022; v1 submitted 12 June, 2022; originally announced June 2022.

Comments: NeurIPS 2022; updated with reviewers' comments addressed; Code is released at https://github.com/microsoft/GLIP

arXiv:2206.05476 [pdf, other]

doi 10.1145/3534678.3539390

Sampling-based Estimation of the Number of Distinct Values in Distributed Environment

Authors: Jiajun Li, Zhewei Wei, Bolin Ding, Xiening Dai, Lu Lu, Jingren Zhou

Abstract: In data mining, estimating the number of distinct values (NDV) is a fundamental problem with various applications. Existing methods for estimating NDV can be broadly classified into two categories: i) scanning-based methods, which scan the entire data and maintain a sketch to approximate NDV; and ii) sampling-based methods, which estimate NDV using sampling data rather than accessing the entire data warehouse. Scanning-based methods achieve a lower approximation error at the cost of higher I/O and more time. Sampling-based estimation is preferable in applications with a large data volume and a permissible error restriction due to its higher scalability. However, while the sampling-based method is more effective on a single machine, it is less practical in a distributed environment with massive data volumes. For obtaining the final NDV estimators, the entire sample must be transferred throughout the distributed system, incurring a prohibitive communication cost when the sample rate is significant. This paper proposes a novel sketch-based distributed method that achieves sub-linear communication costs for distributed sampling-based NDV estimation under mild assumptions. Our method leverages a sketch-based algorithm to estimate the sample's {\em frequency of frequency} in the {\em distributed streaming model}, which is compatible with most classical sampling-based NDV estimators. Additionally, we provide theoretical evidence for our method's ability to minimize communication costs in the worst-case scenario. Extensive experiments show that our method saves orders of magnitude in communication costs compared to existing sampling- and sketch-based methods.

Submitted 11 June, 2022; originally announced June 2022.

Comments: 11 pages

arXiv:2206.03484 [pdf, other]

Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding

Authors: Lingchen Meng, Xiyang Dai, Yinpeng Chen, Pengchuan Zhang, Dongdong Chen, Mengchen Liu, Jianfeng Wang, Zuxuan Wu, Lu Yuan, Yu-Gang Jiang

Abstract: Combining multiple datasets enables performance boost on many computer vision tasks. But similar trend has not been witnessed in object detection when combining multiple datasets due to two inconsistencies among detection datasets: taxonomy difference and domain gap. In this paper, we address these challenges by a new design (named Detection Hub) that is dataset-aware and category-aligned. It not only mitigates the dataset inconsistency but also provides coherent guidance for the detector to learn across multiple datasets. In particular, the dataset-aware design is achieved by learning a dataset embedding that is used to adapt object queries as well as convolutional kernels in detection heads. The categories across datasets are semantically aligned into a unified space by replacing one-hot category representations with word embedding and leveraging the semantic coherence of language embedding. Detection Hub fulfills the benefits of large data on object detection. Experiments demonstrate that joint training on multiple datasets achieves significant performance gains over training on each dataset alone. Detection Hub further achieves SoTA performance on UODB benchmark with wide variety of datasets.

Submitted 29 March, 2023; v1 submitted 7 June, 2022; originally announced June 2022.

Comments: CVPR camera ready

arXiv:2206.01843 [pdf, other]

Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning

Authors: Yujia Xie, Luowei Zhou, Xiyang Dai, Lu Yuan, Nguyen Bach, Ce Liu, Michael Zeng

Abstract: People say, "A picture is worth a thousand words". Then how can we get the rich information out of the image? We argue that by using visual clues to bridge large pretrained vision foundation models and language models, we can do so without any extra cross-modal training. Thanks to the strong zero-shot capability of foundation models, we start by constructing a rich semantic representation of the image (e.g., image tags, object attributes / locations, captions) as a structured textual prompt, called visual clues, using a vision foundation model. Based on visual clues, we use large language model to produce a series of comprehensive descriptions for the visual content, which is then verified by the vision model again to select the candidate that aligns best with the image. We evaluate the quality of generated descriptions by quantitative and qualitative measurement. The results demonstrate the effectiveness of such a structured semantic representation.

Submitted 14 September, 2022; v1 submitted 3 June, 2022; originally announced June 2022.

arXiv:2206.00320 [pdf, ps, other]

doi 10.1007/s11118-023-10090-9

Mittag--Leffler Euler integrator and large deviations for stochastic space-time fractional diffusion equations

Authors: Xinjie Dai, Jialin Hong, Derui Sheng

Abstract: Stochastic space-time fractional diffusion equations often appear in the modeling of the heat propagation in non-homogeneous medium. In this paper, we firstly investigate the Mittag--Leffler Euler integrator of a class of stochastic space-time fractional diffusion equations, whose super-convergence order is obtained by developing a helpful decomposition way for the time-fractional integral. Here, the developed decomposition way is the key to dealing with the singularity of the solution operator. Moreover, we study the Freidlin--Wentzell type large deviation principles of the underlying equation and its Mittag--Leffler Euler integrator based on the weak convergence approach. In particular, we prove that the large deviation rate function of the Mittag--Leffler Euler integrator $Γ$-converges to that of the underlying equation.

Submitted 13 August, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

MSC Class: 60H15; 60H35; 65M12

arXiv:2205.13412 [pdf, other]

Physical-World Optical Adversarial Attacks on 3D Face Recognition

Authors: Yanjie Li, Yiquan Li, Xuelong Dai, Songtao Guo, Bin Xiao

Abstract: 2D face recognition has been proven insecure for physical adversarial attacks. However, few studies have investigated the possibility of attacking real-world 3D face recognition systems. 3D-printed attacks recently proposed cannot generate adversarial points in the air. In this paper, we attack 3D face recognition systems through elaborate optical noises. We took structured light 3D scanners as our attack target. End-to-end attack algorithms are designed to generate adversarial illumination for 3D faces through the inherent or an additional projector to produce adversarial points at arbitrary positions. Nevertheless, face reflectance is a complex procedure because the skin is translucent. To involve this projection-and-capture procedure in optimization loops, we model it by Lambertian rendering model and use SfSNet to estimate the albedo. Moreover, to improve the resistance to distance and angle changes while maintaining the perturbation unnoticeable, a 3D transform invariant loss and two kinds of sensitivity maps are introduced. Experiments are conducted in both simulated and physical worlds. We successfully attacked point-cloud-based and depth-image-based 3D face recognition algorithms while needing fewer perturbations than previous state-of-the-art physical-world 3D adversarial attacks.

Submitted 13 November, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

Comments: Submitted to CVPR 2023

arXiv:2205.07444 [pdf, other]

A Deep Reinforcement Learning Blind AI in DareFightingICE

Authors: Thai Van Nguyen, Xincheng Dai, Ibrahim Khan, Ruck Thawonmas, Hai V. Pham

Abstract: This paper presents a deep reinforcement learning agent (AI) that uses sound as the input on the DareFightingICE platform at the DareFightingICE Competition in IEEE CoG 2022. In this work, an AI that only uses sound as the input is called blind AI. While state-of-the-art AIs rely mostly on visual or structured observations provided by their environments, learning to play games from only sound is still new and thus challenging. We propose different approaches to process audio data and use the Proximal Policy Optimization algorithm for our blind AI. We also propose to use our blind AI in evaluation of sound designs submitted to the competition and define two metrics for this task. The experimental results show the effectiveness of not only our blind AI but also the proposed two metrics.

Submitted 30 June, 2022; v1 submitted 16 May, 2022; originally announced May 2022.

Comments: 2022 IEEE Conference on Games (CoG 2022)

ACM Class: I.2; H.5.2; H.5.5

arXiv:2205.07281 [pdf, other]

doi 10.1103/PhysRevB.106.035146

Topological States in Chevrel Phase Materials from First-principle Calculations

Authors: Shuai Zhang, Shiyu Peng, Xi Dai, Hongming Weng

Abstract: Chevrel phase materials form a family of ternary molybdenum chalcogenides with a general chemical formula $A_x{\rm Mo}_6X_8$ ($A$ = metal elements, $X$ = chalcogen). The variety of $A$ atoms makes a large number of family members and leads to many tunable physical properties, such as the superconductivity, thermoelectricity and the ionic conductivity. In this work, we have further found various nontrivial band topological states in these materials by using first-principle calculations. The compounds having time-reversal symmetry, such as ${\rm BaMo}_6{\rm S}_8$, ${\rm SrMo}_6{\rm S}_8$, and ${\rm Mo}_6{\rm S}_8$, are topological insulators in both of the $R\bar{3}$ and $P\bar{1}$ phases, whereas ${\rm EuMo}_6{\rm S}_8$ in the ferromagnetic state is an axion insulator in the $R\bar{3}$ phase and a trivial one in the $P\bar{1}$ phase. This indicates that the change of $A$ ions can modify the chemical potential, lattice distortion, and magnetic orders, which offers a unique way to influence the topological states and other properties. We hope this work can stimulate further studies of Chevrel phase materials to find more intriguing phenomena, such as topological superconducting states and Majorana modes.

Submitted 30 July, 2022; v1 submitted 15 May, 2022; originally announced May 2022.

arXiv:2205.06479 [pdf, ps, other]

Frame set for Gabor systems with Haar window

Authors: Xin-Rong Dai, Meng Zhu

Abstract: We show the full structure of the frame set for the Gabor system $\mathcal{G}(g;α,β):=\{e^{-2πi mβ\cdot}g(\cdot-nα):m,n\in\Bbb Z\}$ with the window being the Haar function $g=-χ_{[-1/2,0)}+χ_{[0,1/2)}$. The strategy of this paper is to introduce the piecewise linear transformation $\mathcal{M}$ on the unit circle, and to provide a complete characterization of structures for its (symmetric) maximal invariant sets. This transformation is related to the famous three gap theorem of Steinhaus which may be of independent interest. Furthermore, a classical criterion on Gabor frames is improved, which allows us to establish a necessary and sufficient condition for the Gabor system $\mathcal{G}(g;α,β)$ to be a frame, i.e., the symmetric invariant set of the transformation $\mathcal{M}$ is empty. Compared with the previous studies, the present paper provides a self-contained environment to study Gabor frames by a new perspective, which includes that the techniques developed here are new and all the proofs could be understood thoroughly by the readers without reference to the known results in the previous literature.

Submitted 13 May, 2022; originally announced May 2022.

MSC Class: Primary 42C15; 42C40; Secondary 28D05; 37A05; 94A20

arXiv:2205.05076 [pdf, other]

Reduce Information Loss in Transformers for Pluralistic Image Inpainting

Authors: Qiankun Liu, Zhentao Tan, Dongdong Chen, Qi Chu, Xiyang Dai, Yinpeng Chen, Mengchen Liu, Lu Yuan, Nenghai Yu

Abstract: Transformers have achieved great success in pluralistic image inpainting recently. However, we find existing transformer based solutions regard each pixel as a token, thus suffer from information loss issue from two aspects: 1) They downsample the input image into much lower resolutions for efficiency consideration, incurring information loss and extra misalignment for the boundaries of masked regions. 2) They quantize $256^3$ RGB pixels to a small number (such as 512) of quantized pixels. The indices of quantized pixels are used as tokens for the inputs and prediction targets of transformer. Although an extra CNN network is used to upsample and refine the low-resolution results, it is difficult to retrieve the lost information back. To keep input information as much as possible, we propose a new transformer based framework "PUT". Specifically, to avoid input downsampling while maintaining the computation efficiency, we design a patch-based auto-encoder P-VQVAE, where the encoder converts the masked image into non-overlapped patch tokens and the decoder recovers the masked regions from inpainted tokens while keeping the unmasked regions unchanged. To eliminate the information loss caused by quantization, an Un-Quantized Transformer (UQ-Transformer) is applied, which directly takes the features from P-VQVAE encoder as input without quantization and regards the quantized tokens only as prediction targets.
Extensive experiments show that PUT greatly outperforms state-of-the-art methods on image fidelity, especially for large masked regions and complex large-scale datasets. Code is available at https://github.com/liuqk3/PUT

Submitted 15 May, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

Comments: CVPR 2022, code is available at https://github.com/liuqk3/PUT

arXiv:2204.13026 [pdf]

doi 10.1364/OE.25.010070

Highly fabrication tolerant InP based polarization beam splitter based on p-i-n structure

Authors: Nicolás Abadía, Xiangyang Dai, Qiaoyin Lu, Wei-Hua Guo, David Patel, David V. Plant, John F. Donegan

Abstract: In this work, a novel highly fabrication tolerant polarization beam splitter (PBS) is presented on an InP platform. To achieve the splitting, we combine the Pockels effect and the plasma dispersion effect in a symmetric 1x2 Mach-Zehnder interferometer (MZI). One p-i-n phase shifter of the MZI is driven in forward bias to exploit the plasma dispersion effect and modify the phase of both the TE and TM mode. The other arm of the MZI is driven in reverse bias to exploit the Pockels effect which affects only the TE mode. By adjusting the voltages of the two phase shifters, a different interference condition can be set for the TE and the TM modes thereby splitting them at the output of the MZI. By adjusting the voltages, the very tight fabrication tolerances known for fully passive PBS are eased. The experimental results show that an extinction ratio better than 15 dB and an on-chip loss of 3.5 dB over the full C-band (1530-1565nm) are achieved.

Submitted 19 July, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

Comments: 8 pages, 7 figures, and 2 tables. Published version: see https://doi.org/10.1364/OE.25.010070. Related works: see https://doi.org/10.1364/OE.22.011236, https://doi.org/10.1364/oe.26.030292, and https://doi.org/10.1063/5.0044490. Keywords: Photonic Integrated Circuits; Integrated Optics Devices; Beam Splitters; Polarization-Selective Devices

Journal ref: Opt. Express 25, 10070-10077 (2017)

arXiv:2204.11397 [pdf, other]

Tensorial tomographic differential phase-contrast microscopy

Authors: Shiqi Xu, Xiang Dai, Xi Yang, Kevin C. Zhou, Kanghyun Kim, Vinayak Pathak, Carolyn Glass, Roarke Horstmeyer

Abstract: We report Tensorial Tomographic Differential Phase-Contrast microscopy (T2DPC), a quantitative label-free tomographic imaging method for simultaneous measurement of phase and anisotropy. T2DPC extends differential phase-contrast microscopy, a quantitative phase imaging technique, to highlight the vectorial nature of light. The method solves for permittivity tensor of anisotropic samples from intensity measurements acquired with a standard microscope equipped with an LED matrix, a circular polarizer, and a polarization-sensitive camera. We demonstrate accurate volumetric reconstructions of refractive index, birefringence, and orientation for various validation samples, and show that the reconstructed polarization structures of a biological specimen are predictive of pathology.

Submitted 6 September, 2022; v1 submitted 24 April, 2022; originally announced April 2022.

arXiv:2204.11219 [pdf, other]

doi 10.1016/j.renene.2022.03.160

A three-dimensional dynamic mode decomposition analysis of wind farm flow aerodynamics

Authors: Xuan Dai, Da Xu, Mengqi Zhang, Richard J. A. M. Stevens

Abstract: High-fidelity large-eddy simulations are suitable to obtain insight into the complex flow dynamics in extended wind farms. In order to better understand these flow dynamics, we use dynamic mode decomposition (DMD) to analyze and reconstruct the flow field in large-scale numerically simulated wind farms by large-eddy simulations (LES). Different wind farm layouts are considered, and we find that a combination of horizontal and vertical staggering leads to improved wind farm performance compared to traditional horizontal staggering. We analyze the wind farm flows using the amplitude selection (AP) and sparsity-promoting (SP) DMD approaches. We find that the AP method tends to select modes with a small length scale and a high frequency, while the SP method selects large coherent structures with low frequency. The latter are somewhat reminiscent of modes obtained using proper orthogonal decomposition (POD). We find that a relatively limited number of SP-DMD modes is sufficient to accurately reconstruct the flow field in the entire wind farm, whereas the AP-DMD method requires more modes to achieve an accurate reconstruction. Thus, the SP-DMD method has a smaller performance loss compared to the AP-DMD method in terms of the reconstruction of the flow field.

Submitted 24 April, 2022; originally announced April 2022.

arXiv:2204.10496 [pdf, other]

Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks

Authors: Zhecan Wang, Noel Codella, Yen-Chun Chen, Luowei Zhou, Xiyang Dai, Bin Xiao, Jianwei Yang, Haoxuan You, Kai-Wei Chang, Shih-fu Chang, Lu Yuan

Abstract: Cross-modal encoders for vision-language (VL) tasks are often pretrained with carefully curated vision-language datasets. While these datasets reach an order of 10 million samples, the labor cost is prohibitive to scale further. Conversely, unimodal encoders are pretrained with simpler annotations that are less cost-prohibitive, achieving scales of hundreds of millions to billions. As a result, unimodal encoders have achieved state-of-the-art (SOTA) on many downstream tasks. However, challenges remain when applying to VL tasks. The pretraining data is not optimal for cross-modal architectures and requires heavy computational resources. In addition, unimodal architectures lack cross-modal interactions that have demonstrated significant benefits for VL tasks. Therefore, how to best leverage pretrained unimodal encoders for VL tasks is still an area of active research. In this work, we propose a method to leverage unimodal vision and text encoders for VL tasks that augment existing VL approaches while conserving computational complexity. Specifically, we propose Multimodal Adaptive Distillation (MAD), which adaptively distills useful knowledge from pretrained encoders to cross-modal VL encoders. Second, to better capture nuanced impacts on VL task performance, we introduce an evaluation protocol that includes Visual Commonsense Reasoning (VCR), Visual Entailment (SNLI-VE), and Visual Question Answering (VQA), across a variety of data constraints and conditions of domain shift.
Experiments demonstrate that MAD leads to consistent gains in the low-shot, domain-shifted, and fully-supervised conditions on VCR, SNLI-VE, and VQA, achieving SOTA performance on VCR compared to other single models pretrained with image-text data. Finally, MAD outperforms concurrent works utilizing pretrained vision encoder from CLIP. Code will be made available.

Submitted 28 April, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2201.05729

arXiv:2204.09636 [pdf, other]

Residual Mixture of Experts

Authors: Lemeng Wu, Mengchen Liu, Yinpeng Chen, Dongdong Chen, Xiyang Dai, Lu Yuan

Abstract: Mixture of Experts (MoE) is able to scale up vision transformers effectively. However, it requires prohibitive computation resources to train a large MoE transformer. In this paper, we propose Residual Mixture of Experts (RMoE), an efficient training pipeline for MoE vision transformers on downstream tasks, such as segmentation and detection. RMoE achieves comparable results with the upper-bound MoE training, while only introducing minor additional training cost than the lower-bound non-MoE training pipelines. The efficiency is supported by our key observation: the weights of an MoE transformer can be factored into an input-independent core and an input-dependent residual. Compared with the weight core, the weight residual can be efficiently trained with much less computation resource, e.g., finetuning on the downstream data. We show that, compared with the current MoE training pipeline, we get comparable results while saving over 30% training cost. When compared with state-of-the-art non-MoE transformers, such as Swin-T / CvT-13 / Swin-L, we get +1.1 / 0.9 / 1.0 mIoU gain on ADE20K segmentation and +1.4 / 1.6 / 0.6 AP gain on MS-COCO object detection task with less than 3% additional training cost.

Submitted 4 October, 2022; v1 submitted 20 April, 2022; originally announced April 2022.

arXiv:2204.09370 [pdf, other]

Multi-Level Interaction Reranking with User Behavior History

Authors: Yunjia Xi, Weiwen Liu, Jieming Zhu, Xilong Zhao, Xinyi Dai, Ruiming Tang, Weinan Zhang, Rui Zhang, Yong Yu

Abstract: As the final stage of the multi-stage recommender system (MRS), reranking directly affects users' experience and satisfaction, thus playing a critical role in MRS. Despite the improvement achieved in the existing work, three issues are yet to be solved. First, users' historical behaviors contain rich preference information, such as users' long and short-term interests, but are not fully exploited in reranking. Previous work typically treats items in history as equally important, neglecting the dynamic interaction between the history and candidate items. Second, existing reranking models focus on learning interactions at the item level while ignoring the fine-grained feature-level interactions. Lastly, estimating the reranking score on the ordered initial list before reranking may lead to the early scoring problem, thereby yielding suboptimal reranking performance. To address the above issues, we propose a framework named Multi-level Interaction Reranking (MIR). MIR combines low-level cross-item interaction and high-level set-to-list interaction, where we view the candidate items to be reranked as a set and the users' behavior history in chronological order as a list. We design a novel SLAttention structure for modeling the set-to-list interactions with personalized long-short term interests. Moreover, feature-level interactions are incorporated to capture the fine-grained influence among items. We design MIR in such a way that any permutation of the input items would not change the output ranking, and we theoretically prove it.
Extensive experiments on three public and proprietary datasets show that MIR significantly outperforms the state-of-the-art models using various ranking and utility metrics.

Submitted 20 April, 2022; originally announced April 2022.

arXiv:2204.09366 [pdf, other]

Analyzing the Intensity of Complaints on Social Media

Authors: Ming Fang, Shi Zong, Jing Li, Xinyu Dai, Shujian Huang, Jiajun Chen

Abstract: Complaining is a speech act that expresses a negative inconsistency between reality and human expectations. While prior studies mostly focus on identifying the existence or the type of complaints, in this work, we present the first study in computational linguistics of measuring the intensity of complaints from text. Analyzing complaints from such a perspective is particularly useful, as complaints of certain degrees may cause severe consequences for companies or organizations. We create the first Chinese dataset containing 3,103 posts about complaints from Weibo, a popular Chinese social media platform. These posts are then annotated with complaints intensity scores using the Best-Worst Scaling (BWS) method. We show that complaints intensity can be accurately estimated by computational models with the best mean square error achieving 0.11. Furthermore, we conduct a comprehensive linguistic analysis around complaints, including the connections between complaints and sentiment, and a cross-lingual comparison for complaints expressions used by Chinese and English speakers. We finally show that our complaints intensity scores can be incorporated for better estimating the popularity of posts on social media.

Submitted 20 April, 2022; originally announced April 2022.

Comments: NAACL 2022 (Findings)

arXiv:2204.06683 [pdf, other]

Revisiting Transformer-based Models for Long Document Classification

Authors: Xiang Dai, Ilias Chalkidis, Sune Darkner, Desmond Elliott

Abstract: The recent literature in text classification is biased towards short text sequences (e.g., sentences or paragraphs). In real-world applications, multi-page multi-paragraph documents are common and they cannot be efficiently encoded by vanilla Transformer-based models. We compare different Transformer-based Long Document Classification (TrLDC) approaches that aim to mitigate the computational overhead of vanilla transformers to encode much longer text, namely sparse attention and hierarchical encoding methods. We examine several aspects of sparse attention (e.g., size of local attention window, use of global attention) and hierarchical (e.g., document splitting strategy) transformers on four document classification datasets covering different domains. We observe a clear benefit from being able to process longer text, and, based on our results, we derive practical advice of applying Transformer-based models on long document classification tasks.

Submitted 25 October, 2022; v1 submitted 13 April, 2022; originally announced April 2022.

Comments: Findings of EMNLP 2022

arXiv:2204.04714 [pdf, other]

doi 10.1103/PhysRevX.13.011027

Transverse Peierls Transition

Authors: Kaifa Luo, Xi Dai

Abstract: In the present paper, we have discussed a new type of spontaneous symmetry breaking phases caused by the softening of the transverse acoustic phonon modes through the electron-phonon coupling. These new phases include the shear density wave and self-twisting wave, which are caused by the softening of linearly and circularly polarized acoustic phonon modes, respectively. We propose that two of the topological semimetal systems in the quantum limit, where the electrons only occupy the lowest Landau bands under external magnetic field, will be the perfect systems to realize these new phases. Exotic physical effects will be induced in these new phases, including the 3D quantum Hall effect, chiral standing acoustic wave, magneto-acoustic effects and chiral phonon correction to Einstein-de Haas effect.

Submitted 10 April, 2022; originally announced April 2022.

arXiv:2204.02030 [pdf, other]

$\textit{latent}$-GLAT: Glancing at Latent Variables for Parallel Text Generation

Authors: Yu Bao, Hao Zhou, Shujian Huang, Dongqi Wang, Lihua Qian, Xinyu Dai, Jiajun Chen, Lei Li

Abstract: Recently, parallel text generation has received widespread attention due to its success in generation efficiency. Although many advanced techniques are proposed to improve its generation quality, they still need the help of an autoregressive model for training to overcome the one-to-many multi-modal phenomenon in the dataset, limiting their applications. In this paper, we propose $\textit{latent}$-GLAT, which employs the discrete latent variables to capture word categorical information and invokes an advanced curriculum learning technique, alleviating the multi-modality problem. Experiment results show that our method outperforms strong baselines without the help of an autoregressive model, which further broadens the application scenarios of the parallel decoding paradigm.

Submitted 5 April, 2022; originally announced April 2022.

Comments: 12 pages, 5 figures, 6 tables. Accepted as a long paper in the main conference of ACL-2022

arXiv:2203.16335 [pdf, other]

doi 10.1016/j.ifacol.2022.07.250

Rapid Scalable Distributed Power Flow with Open-Source Implementation

Authors: Xinliang Dai, Yichen Cai, Yuning Jiang, Veit Hagenmeyer

Abstract: This paper introduces a new method for solving the distributed AC power flow (PF) problem by further exploiting the problem formulation. We propose a new variant of the ALADIN algorithm devised specifically for this type of problem. This new variant is characterized by using a reduced modelling method of the distributed AC PF problem, which is reformulated as a zero-residual least-squares problem with consensus constraints. This PF is then solved by a Gauss-Newton based inexact ALADIN algorithm presented in the paper. An open-source implementation of this algorithm, called rapidPF+, is provided. Simulation results, for which the power system's dimension varies from 53 to 10224 buses, show great potential of this combination in the aspects of both the computing.

Submitted 30 March, 2022; originally announced March 2022.

arXiv:2203.15542 [pdf, ps, other]

doi 10.1145/3488560.3498478

Modeling Users' Contextualized Page-wise Feedback for Click-Through Rate Prediction in E-commerce Search

Authors: Zhifang Fan, Dan Ou, Yulong Gu, Bairan Fu, Xiang Li, Wentian Bao, Xin-Yu Dai, Xiaoyi Zeng, Tao Zhuang, Qingwen Liu

Abstract: Modeling users' historical feedback is essential for Click-Through Rate Prediction in personalized search and recommendation. Existing methods usually only model users' positive feedback information such as click sequences, which neglects the context information of the feedback. In this paper, we propose a new perspective for context-aware users' behavior modeling by including the whole page-wisely exposed products and the corresponding feedback as contextualized page-wise feedback sequence. The intra-page context information and inter-page interest evolution can be captured to learn more specific user preference. We design a novel neural ranking model RACP (i.e., Recurrent Attention over Contextualized Page sequence), which utilizes page-context aware attention to model the intra-page context. A recurrent attention process is used to model the cross-page interest convergence evolution as denoising the interest in the previous pages. Experiments on public and real-world industrial datasets verify our model's effectiveness.

Submitted 29 March, 2022; originally announced March 2022.

arXiv:2203.14301 [pdf, other]

doi 10.1103/PhysRevB.106.L241105

Quantized and half-quantized Anomalous Hall effect induced by in-plane magnetic field

Authors: Song Sun, Hongming Weng, Xi Dai

Abstract: In this paper, we propose that quantized and nearly half-quantized intrinsic anomalous Hall effect can be induced by in-plane external magnetic field through the Zeeman coupling in non-magnetic 2D systems with sizeable spin-orbital coupling but without two-fold rotational symmetry. An analytical result is derived for 2D electron gas model with $C_{3v}$ symmetry. Based on the $\boldsymbol{k\cdot p}$ Hamiltonian derived from first-principles calculations, we find that quantized and nearly half-quantized conductance can be observed in $\mathrm{Sb_2Te_3}$ thin film in the clean limit with strong in-plane magnetic field $B>20\ \mathrm{T}$ and low temperature $T<100\ \mathrm{mK}$.

Submitted 27 March, 2022; originally announced March 2022.

Comments: 11 pages, 4 figures

arXiv:2203.12799 [pdf, other]

Energy-Efficient UAV-Mounted RIS Assisted Mobile Edge Computing

Authors: Zhiyuan Zhai, Xinhong Dai, Bin Duo, Xin Wang, Xiaojun Yuan

Abstract: Unmanned aerial vehicle (UAV) and reconfigurable intelligent surface (RIS) have been recently applied in the field of mobile edge computing (MEC) to improve the data exchange environment by proactively changing the wireless channels through maneuverable location deployment and intelligent signals reflection, respectively. Nevertheless, they may suffer from inherent limitations in practical scenarios. UAV-mounted RIS (U-RIS), as a promising integrated approach, can combine the advantages of UAV and RIS to break the limit. Inspired by this, we consider a novel U-RIS assisted MEC system, where a U-RIS is deployed to assist the communication between the ground users and an MEC server. The joint UAV trajectory, RIS passive beamforming and MEC resource allocation design is developed to maximize the energy efficiency (EE) of the system. To tackle the intractable non-convex problem, we divide it into two subproblems and solve them iteratively based on successive convex approximation (SCA) and the Dinkelbach method. Finally we obtain a high-performance suboptimal solution. Simulation results show that the proposed algorithm significantly improves the energy efficiency of the MEC system.

Submitted 23 March, 2022; originally announced March 2022.

Showing 351–400 of 1,233 results for author: Dai, X