-
Periodic Coronal Rain Driven by Self-consistent Heating Process in a Radiative Magnetohydrodynamic Simulation
Authors:
Zekun Lu,
Feng Chen,
J. H. Guo,
M. D. Ding,
Can Wang,
Haocheng Yu,
Y. W. Ni,
Chun Xia
Abstract:
The periodic coronal rain and in-phase radiative intensity pulsations have been observed in multiple wavelengths in recent years. However, due to the lack of three-dimensional coronal magnetic fields and thermodynamic data in observations, it remains challenging to quantify the coronal heating rate that drives the mass cycles. In this work, based on the MURaM code, we conduct a three-dimensional radiative magnetohydrodynamic simulation spanning from the convective zone to the corona, where the solar atmosphere is heated self-consistently through dissipation resulting from magneto-convection. For the first time, we model the periodic coronal rain in an active region. With a high spatial resolution, the simulation well resembles the observational features across different extreme ultraviolet wavelengths. These include the realistic interweaving coronal loops, periodic coronal rain and periodic intensity pulsations, with two periods of 3.0~h and 3.7~h identified within one loop system. Moreover, the simulation allows for a detailed three-dimensional depiction of coronal rain on small scales, revealing adjacent shower-like rain clumps $\sim500$~km in width and showcasing their multi-thermal internal structures. We further reveal that these periodic variations essentially reflect the cyclic energy evolution of the coronal loop under thermal non-equilibrium state. Importantly, as the driver of the mass circulation, the self-consistent coronal heating rate is considerably complex in time and space, with hour-level variations in one order of magnitude, minute-level bursts, and varying asymmetry reaching ten times between footpoints. This provides an instructive template for the ad hoc heating function, and further enhances our understanding of the coronal heating process.
Submitted 29 August, 2024;
originally announced August 2024.
-
Constrained Diffusion Models via Dual Training
Authors:
Shervin Khalafi,
Dongsheng Ding,
Alejandro Ribeiro
Abstract:
Diffusion models have attained prominence for their ability to synthesize a probability distribution for a given dataset via a diffusion process, enabling the generation of new data points with high fidelity. However, diffusion processes are prone to generating biased data based on the training dataset. To address this issue, we develop constrained diffusion models by imposing diffusion constraints based on desired distributions that are informed by requirements. Specifically, we cast the training of diffusion models under requirements as a constrained distribution optimization problem that aims to reduce the distribution difference between original and generated data while obeying constraints on the distribution of generated data. We show that our constrained diffusion models generate new data from a mixture data distribution that achieves the optimal trade-off among objective and constraints. To train constrained diffusion models, we develop a dual training algorithm and characterize the optimality of the trained constrained diffusion model. We empirically demonstrate the effectiveness of our constrained models in two constrained generation tasks: (i) we consider a dataset with one or more underrepresented classes where we train the model with constraints to ensure fairly sampling from all classes during inference; (ii) we fine-tune a pre-trained diffusion model to sample from a new dataset while avoiding overfitting.
Submitted 27 August, 2024;
originally announced August 2024.
-
VCEMO: Multi-Modal Emotion Recognition for Chinese Voiceprints
Authors:
Jinghua Tang,
Liyun Zhang,
Yu Lu,
Dian Ding,
Lanqing Yang,
YiChao Chen,
Minjie Bian,
Xiaoshan Li,
Guangtao Xue
Abstract:
Emotion recognition can enhance humanized machine responses to user commands, while voiceprint-based perception systems can be easily integrated into commonly used devices like smartphones and stereos. Despite having the largest number of speakers, there is a noticeable absence of high-quality corpus datasets for emotion recognition using Chinese voiceprints. Hence, this paper introduces the VCEMO dataset to address this deficiency. The proposed dataset is constructed from everyday conversations and comprises over 100 users and 7,747 textual samples. Furthermore, this paper proposes a multimodal-based model as a benchmark, which effectively fuses speech, text, and external knowledge using a co-attention structure. The system employs contrastive learning-based regulation for the uneven distribution of the dataset and the diversity of emotional expressions. The experiments demonstrate the significant improvement of the proposed model over SOTA on the VCEMO and IEMOCAP datasets. Code and dataset will be released for research.
Submitted 23 August, 2024;
originally announced August 2024.
-
Microwave-driven multistability in a strongly interacting Rydberg atoms
Authors:
Yu Ma,
Bang Liu,
Li-Hua Zhang,
Ya-Jun Wang,
Zheng-Yuan Zhang,
Shi-Yao Shao,
Qing Li,
Han-Chao Chen,
Jun Zhang,
Tian-Yu Han,
Qi-Feng Wang,
Jia-Dou Nan,
Yi-Ming Ying,
Dong-Yang Zhu,
Bao-Sen Shi,
Dong-Sheng Ding
Abstract:
The interactions between Rydberg atoms and microwave fields provide a valuable framework for studying complex out-of-equilibrium dynamics, exotic phases, and critical phenomena in many-body physics. This unique interplay allows us to explore various regimes of nonlinearity and phase transitions. Here, we observe a phase transition from a state in the bistable regime to one in the multistable regime in strongly interacting Rydberg atoms by varying the microwave field intensity, accompanied by the breaking of Z3 symmetry. During the phase transition, the system passes through a hidden critical point, at which the multistable states are difficult to identify. By changing the initial state of the system, we can identify a hidden multistable state and reveal a hidden trajectory of the phase transition, allowing us to track it to a hidden critical point. In addition, we observe multiple phase transitions in the spectra, suggesting higher-order symmetry breaking. The reported results shed light on manipulating multistability in dissipative Rydberg atom systems and hold promise for applications in non-equilibrium many-body physics.
Submitted 19 August, 2024;
originally announced August 2024.
-
Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs
Authors:
Sergio Rozada,
Dongsheng Ding,
Antonio G. Marques,
Alejandro Ribeiro
Abstract:
We study the problem of computing deterministic optimal policies for constrained Markov decision processes (MDPs) with continuous state and action spaces, which are widely encountered in constrained dynamical systems. Designing deterministic policy gradient methods in continuous state and action spaces is particularly challenging due to the lack of enumerable state-action pairs and the adoption of deterministic policies, hindering the application of existing policy gradient methods for constrained MDPs. To this end, we develop a deterministic policy gradient primal-dual method to find an optimal deterministic policy with non-asymptotic convergence. Specifically, we leverage regularization of the Lagrangian of the constrained MDP to propose a deterministic policy gradient primal-dual (D-PGPD) algorithm that updates the deterministic policy via a quadratic-regularized gradient ascent step and the dual variable via a quadratic-regularized gradient descent step. We prove that the primal-dual iterates of D-PGPD converge at a sub-linear rate to an optimal regularized primal-dual pair. We instantiate D-PGPD with function approximation and prove that the primal-dual iterates of D-PGPD converge at a sub-linear rate to an optimal regularized primal-dual pair, up to a function approximation error. Furthermore, we demonstrate the effectiveness of our method in two continuous control problems: robot navigation and fluid control. To the best of our knowledge, this appears to be the first work that proposes a deterministic policy search method for continuous-space constrained MDPs.
Submitted 19 August, 2024;
originally announced August 2024.
-
A more generalized two-qubit symmetric quantum joint measurement
Authors:
Ying-Qiu He,
Dong Ding,
Ting Gao,
Zan-Jia Li,
Feng-Li Yan
Abstract:
A standard two-qubit joint measurement is the well-known Bell state measurement (BSM), in which each reduced state (obtained by tracing out one qubit) is the completely mixed state. Recently, a novel quantum joint measurement named the elegant joint measurement (EJM) has been proposed, where the reduced states of the EJM basis have tetrahedral symmetry. In this work, we first suggest a five-parameter entangled state and reveal its inherent symmetry. Based on this, we define a more generalized EJM parameterized by $z$, $\varphi$ and $\theta$, and provide the quantum circuits for preparing and detecting these basis states. There are three main results: (i) the previous single-parameter EJM can be directly obtained by specifying the parameters $z$ and $\varphi$; (ii) the initial unit vectors related to the four vertices of the regular tetrahedron are not limited to the original choice, and not all unit vectors in cylindrical coordinates are suitable for forming the EJM basis; and (iii) the reduced states of the present EJM basis can always form two mirror-image tetrahedrons, robustly preserving its elegant properties. We focus on figuring out what kind of states the EJM basis belongs to and on providing a method for constructing the more generalized three-parameter EJM, which may contribute to multi-setting measurements and potential applications in quantum information processing.
Submitted 12 August, 2024;
originally announced August 2024.
-
Various Features of the X-class White-light Flares in Super Active Region NOAA 13664
Authors:
Ying Li,
Xiaofeng Liu,
Zhichen Jing,
Wei Chen,
Qiao Li,
Yang Su,
De-Chao Song,
M. D. Ding,
Li Feng,
Hui Li,
Weiqun Gan
Abstract:
Super active region NOAA 13664 produced 12 X-class flares (including the largest one, an occulted X8.7 flare, in solar cycle 25 so far) during 2024 May 8-15, and 11 of them are identified as white-light flares. Here we present various features of these X-class white-light flares observed by the White-light Solar Telescope (WST) on board the Advanced Space-based Solar Observatory and the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory. It is found that the white-light emissions at both WST 3600 Å (Balmer continuum) and HMI 6173 Å (Paschen continuum) show up in different regions of the sunspot group in these flares, including outside the sunspots and within the penumbra and umbra of the sunspots. They exhibit a point-, ribbon-, loop-, or ejecta-like shape, which can come from flare ribbons (or footpoints), flare loops, and plasma ejecta depending on the perspective view. The white-light duration and relative enhancement are measured, and both parameters have greater values for the 3600 Å emission than for the 6173 Å emission. It is also found that these white-light emissions are well cospatial with the hard X-ray (HXR) sources in the on-disk flares but have some offsets from the HXR emissions in the off-limb flares. In addition, it is interesting that the 3600 and 6173 Å emissions show different correlations with the peak HXR fluxes, with the former being more sensitive to the HXR emission. All these results greatly help us understand the white-light flares of a large magnitude from a super active region on the Sun and also provide important insights into superflares on Sun-like stars.
Submitted 11 August, 2024;
originally announced August 2024.
-
Exceptional point and hysteresis trajectories in cold Rydberg atomic gases
Authors:
Jun Zhang,
En-Ze Li,
Ya-Jun Wang,
Bang Liu,
Li-Hua Zhang,
Zheng-Yuan Zhang,
Shi-Yao Shao,
Qing Li,
Han-Chao Chen,
Yu Ma,
Tian-Yu Han,
Qi-Feng Wang,
Jia-Dou Nan,
Yi-Ming Ying,
Dong-Yang Zhu,
Bao-Sen Shi,
Dong-Sheng Ding
Abstract:
The interplay between strong long-range interactions and coherent driving contributes to the formation of complex patterns, symmetry, and novel phases of matter in many-body systems. However, long-range interactions may induce an additional dissipation channel, resulting in non-Hermitian many-body dynamics and the emergence of exceptional points in the spectrum. Here, we report the experimental observation of interaction-induced exceptional points in cold Rydberg atomic gases, revealing the breaking of charge-conjugation parity symmetry. By measuring the transmission spectrum under increasing and decreasing probe intensity, the interaction-induced hysteresis trajectories are observed, which give rise to non-Hermitian dynamics. We record the area enclosed by hysteresis loops and investigate the dynamics of hysteresis loops. The reported exceptional points and hysteresis trajectories in cold Rydberg atomic gases provide valuable insights into the underlying non-Hermitian physics in many-body systems, allowing us to study the interplay between long-range interactions and non-Hermiticity.
Submitted 6 August, 2024;
originally announced August 2024.
-
The most distant HI galaxies discovered by the 500 m dish FAST
Authors:
Hongwei Xi,
Bo Peng,
Lister Staveley-Smith,
Bi-Qing For,
Bin Liu,
Ru-Rong Chen,
Lei Yu,
Dejian Ding,
Wei-Jian Guo,
Hu Zou,
Suijian Xue,
Jing Wang,
Thomas G. Brink,
WeiKang Zheng,
Alexei V. Filippenko,
Yi Yang,
Jianyan Wei,
Y. Sophia Dai,
Zi-Jian Li,
Zizhao He,
Chengzi Jiang,
Alexei Moiseev,
Sergey Kotov
Abstract:
Neutral hydrogen (HI) is the primary component of the cool interstellar medium (ISM) and is the reservoir of fuel for star formation. Owing to the limited sensitivity of existing radio telescopes, our understanding of the evolution of the ISM in galaxies remains limited, as it is based on only a few hundred galaxies detected in HI beyond the local Universe. With the high sensitivity of the Five-hundred-meter Aperture Spherical radio Telescope (FAST), we carried out a blind HI search, the FAST Ultra-Deep Survey (FUDS), which extends to redshifts up to 0.42 and a sensitivity of 50 $\rm μJy \cdot beam^{-1}$. Here, we report the first discovery of six galaxies in HI at $z>0.38$. For these galaxies, the FAST angular resolution of $\sim\,4'$ corresponds to a mean linear size of $\sim1.3\,h_{70}^{-1}\,$Mpc. These galaxies are among the most distant HI emission detections known, with one having the most massive HI content ($10^{10.93 \pm 0.04}~h_{70}^{-2}\, \rm M_\odot$). Using recent data from the DESI survey, and new observations with the Hale, BTA, and Keck telescopes, optical counterparts are detected for all galaxies within the 3-$σ$ positional uncertainty ($0.5\,h_{70}^{-1}\,$Mpc) and $\rm 200\,km \cdot s^{-1}$ in recession velocity. Assuming that the dominant source of HI is the identified optical counterpart, we find evidence of evolution in the HI content of galaxies over the last 4.2 Gyr. Our new high-redshift HI galaxy sample provides the opportunity to better investigate the evolution of cool gas in galaxies. A larger sample size in the future will allow us to refine our knowledge of the formation and evolution of galaxies.
Submitted 1 August, 2024;
originally announced August 2024.
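The beam-to-linear-size conversion quoted in the abstract above (a $\sim4'$ beam spanning $\sim1.3\,h_{70}^{-1}$ Mpc) can be checked with a short calculation. The sketch below is only an illustration: it assumes a flat ΛCDM cosmology with $H_0 = 70$ km/s/Mpc and $\Omega_m = 0.3$ and a representative redshift of 0.4, none of which are stated in the listing, and the function names are hypothetical.

```python
import math

C_KM_S = 299_792.458   # speed of light in km/s
H0 = 70.0              # assumed Hubble constant in km/s/Mpc (the h_70 normalization)
OMEGA_M = 0.3          # assumed matter density; not given in the abstract
OMEGA_L = 1.0 - OMEGA_M  # flat universe assumed


def angular_diameter_distance_mpc(z, steps=1000):
    """Angular diameter distance in Mpc for flat LCDM, via the trapezoid rule."""
    def inv_e(zp):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zp) ** 3 + OMEGA_L)

    h = z / steps
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    integral += sum(inv_e(i * h) for i in range(1, steps))
    integral *= h
    comoving = C_KM_S / H0 * integral   # line-of-sight comoving distance
    return comoving / (1.0 + z)


def beam_size_mpc(z, beam_arcmin=4.0):
    """Physical size subtended by a beam of the given angular width at redshift z."""
    theta_rad = math.radians(beam_arcmin / 60.0)
    return theta_rad * angular_diameter_distance_mpc(z)


if __name__ == "__main__":
    print(f"beam size at z = 0.4: {beam_size_mpc(0.4):.2f} Mpc")
```

Under these assumed parameters the result lands close to the quoted 1.3 Mpc; a different choice of $\Omega_m$ or redshift within the survey range shifts it slightly.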
-
Coordinating Decisions via Quantum Telepathy
Authors:
Dawei Ding,
Liang Jiang
Abstract:
Quantum telepathy, or pseudotelepathy, is the phenomenon where two non-communicating parties can exhibit correlated behaviors that are impossible to achieve using classical mechanics. This is also known as Bell inequality violation and is made possible by quantum entanglement. In this work, we present a conceptual framework for applying quantum telepathy to real-world problems. In general, the problems involve coordinating decisions given a set of observations without being able to communicate. We argue this inability is actually quite prevalent in the modern era where the decision-making timescales of computer processors are so short that speed of light delay is actually quite appreciable in comparison. We highlight the example of high-frequency trading (HFT), where trades are made at microsecond timescales, but the speed of light delay between different exchanges can range from the order of 10 microseconds to 10 milliseconds. Due to the maturity of Bell inequality violation experiments, experimental realization of quantum telepathy schemes that can attain a quantum advantage for real-world problems $\textit{is already almost immediately possible}$. We demonstrate this by conducting a case study for a concrete HFT scenario that gives rise to a generalization of the CHSH game and evaluate different possible physical implementations for achieving a quantum advantage. It is well known that Bell inequality violation is a rigorous mathematical proof of a quantum advantage over any classical strategy and does not need any complexity-theoretic assumptions such as $\text{BQP}\neq\text{BPP}$. Moreover, fault tolerance is not necessary to realize a quantum advantage: for example, violating the CHSH inequality only requires single-qubit gates applied on two entangled qubits.
Submitted 31 July, 2024;
originally announced July 2024.
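For readers unfamiliar with the CHSH game mentioned in the abstract above, the following sketch compares the best classical winning probability with the value reached by a standard entangled strategy. It covers only the textbook CHSH game, not the generalized HFT scenario studied in the paper, and the function names and measurement angles are illustrative choices rather than anything taken from the authors.

```python
import itertools
import math


def classical_best():
    """Best CHSH winning probability over all deterministic classical strategies."""
    best = 0.0
    # A deterministic strategy fixes each party's output bit for each input bit.
    for a0, a1, b0, b1 in itertools.product((0, 1), repeat=4):
        a, b = (a0, a1), (b0, b1)
        wins = sum((a[x] ^ b[y]) == (x & y) for x in (0, 1) for y in (0, 1))
        best = max(best, wins / 4)
    return best


def basis(theta):
    """Real orthonormal single-qubit measurement basis rotated by theta."""
    return ((math.cos(theta), math.sin(theta)),
            (-math.sin(theta), math.cos(theta)))


def quantum_value(alice_angles, bob_angles):
    """Winning probability when both parties measure halves of (|00> + |11>)/sqrt(2)."""
    total = 0.0
    for x in (0, 1):
        for y in (0, 1):
            ua, ub = basis(alice_angles[x]), basis(bob_angles[y])
            for a in (0, 1):
                for b in (0, 1):
                    # Amplitude <ua_a, ub_b | Phi+> for outcomes (a, b).
                    amp = (ua[a][0] * ub[b][0] + ua[a][1] * ub[b][1]) / math.sqrt(2)
                    if (a ^ b) == (x & y):   # CHSH winning condition
                        total += amp ** 2
    return total / 4                          # inputs are uniformly random


if __name__ == "__main__":
    print("classical:", classical_best())
    print("quantum:  ", quantum_value((0.0, math.pi / 4), (math.pi / 8, -math.pi / 8)))
```

The classical optimum is 3/4, while the entangled strategy with these angles reaches $\cos^2(\pi/8) \approx 0.854$, using only single-qubit measurements on one entangled pair.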
-
CoCA: Cooperative Component Analysis
Authors:
Daisy Yi Ding,
Alden Green,
Min Woo Sun,
Robert Tibshirani
Abstract:
We propose Cooperative Component Analysis (CoCA), a new method for unsupervised multi-view analysis: it identifies the component that simultaneously captures significant within-view variance and exhibits strong cross-view correlation. The challenge of integrating multi-view data is particularly important in biology and medicine, where various types of "-omic" data, ranging from genomics to proteomics, are measured on the same set of samples. The goal is to uncover important, shared signals that represent underlying biological mechanisms. CoCA combines an approximation error loss to preserve information within data views and an "agreement penalty" to encourage alignment across data views. By balancing the trade-off between these two key components in the objective, CoCA has the property of interpolating between the commonly-used principal component analysis (PCA) and canonical correlation analysis (CCA) as special cases at the two ends of the solution path. CoCA chooses the degree of agreement in a data-adaptive manner, using a validation set or cross-validation to estimate test error. Furthermore, we propose a sparse variant of CoCA that incorporates the Lasso penalty to yield feature sparsity, facilitating the identification of key features driving the observed patterns. We demonstrate the effectiveness of CoCA on simulated data and two real multiomics studies of COVID-19 and ductal carcinoma in situ of breast. In both real data applications, CoCA successfully integrates multiomics data, extracting components that are not only consistently present across different data views but also more informative and predictive of disease progression. CoCA offers a powerful framework for discovering important shared signals in multi-view data, with the potential to uncover novel insights in an increasingly multi-view data world.
Submitted 23 July, 2024;
originally announced July 2024.
-
Meso-inflationary Peccei-Quinn symmetry breaking with non-minimal coupling
Authors:
Yermek Aldabergenov,
Ding Ding,
Wei Lin,
Yidun Wan
Abstract:
We study a realization of the inflationary scenario where the Peccei-Quinn (PQ) symmetry is spontaneously broken during inflation, facilitated by its non-minimal coupling to gravity. This results in effectively two-field inflation: the early stage is driven by an inflaton field with the PQ symmetry intact, and the later stage is driven by the PQ scalar after its effective mass becomes tachyonic, causing destabilization from the origin. The non-minimal coupling serves the dual purpose of restoring the PQ symmetry during early inflation and flattening the PQ potential post-tachyonic shift, allowing for continued slow roll. We analyze the inflationary background solutions and scalar perturbations, which are amplified at small scales via significant isocurvature perturbations generated near the symmetry-breaking epoch. These perturbations lead to second-order gravitational waves, detectable by next-generation space-based experiments.
Submitted 22 June, 2024;
originally announced June 2024.
-
All-optically tunable enantio-selectivity and chirality transfer
Authors:
En-Ze Li,
Ming-Xin Dong,
Dong-Sheng Ding,
Bao-Sen Shi,
Guang-Can Guo,
Franco Nori
Abstract:
Detecting and controlling the chirality of materials plays an essential role in exploring nature, providing new avenues for material creation, discrimination, and manipulation. In such tasks, chiral reagents are essential in defining or enhancing the chiral dichroism response. However, ignoring their influence on the symmetry of the medium hampers the ability to control and induce asymmetric synthesis. Here, we propose a simple but versatile chirality transfer method for synthesizing and manipulating the chirality of the medium. The proposed method induces the dispersion of light in a neutral atomic system, allowing the chirality transfer to be controlled deterministically and tunably using a helical field. First, we theoretically analyze the mechanism for this optically induced chirality transfer. Afterwards, we experimentally study the enantio-sensitive feature of the medium exposed to the auxiliary chiral field. This result can be suppressed or enhanced in a deterministic enantio-selection, opening up an efficient way to manipulate asymmetric synthesis.
Submitted 13 June, 2024;
originally announced June 2024.
-
OCCAM: Towards Cost-Efficient and Accuracy-Aware Image Classification Inference
Authors:
Dujian Ding,
Bicheng Xu,
Laks V. S. Lakshmanan
Abstract:
Image classification is a fundamental building block for a majority of computer vision applications. With the growing popularity and capacity of machine learning models, people can easily access trained image classifiers as a service online or offline. However, model use comes with a cost and classifiers of higher capacity usually incur higher inference costs. To harness the respective strengths of different classifiers, we propose a principled approach, OCCAM, to compute the best classifier assignment strategy over image classification queries (termed as the optimal model portfolio) so that the aggregated accuracy is maximized, under user-specified cost budgets. Our approach uses an unbiased and low-variance accuracy estimator and effectively computes the optimal solution by solving an integer linear programming problem. On a variety of real-world datasets, OCCAM achieves 40% cost reduction with little to no accuracy drop.
Submitted 6 June, 2024;
originally announced June 2024.
-
One-Shot Safety Alignment for Large Language Models via Optimal Dualization
Authors:
Xinmeng Huang,
Shuo Li,
Edgar Dobriban,
Osbert Bastani,
Hamed Hassani,
Dongsheng Ding
Abstract:
The growing safety concerns surrounding Large Language Models (LLMs) raise an urgent need to align them with diverse human preferences to simultaneously enhance their helpfulness and safety. A promising approach is to enforce safety constraints through Reinforcement Learning from Human Feedback (RLHF). For such constrained RLHF, common Lagrangian-based primal-dual policy optimization methods are computationally expensive and often unstable. This paper presents a dualization perspective that reduces constrained alignment to an equivalent unconstrained alignment problem. We do so by pre-optimizing a smooth and convex dual function that has a closed form. This shortcut eliminates the need for cumbersome primal-dual policy iterations, thus greatly reducing the computational burden and improving training stability. Our strategy leads to two practical algorithms in model-based and preference-based scenarios (MoCAN and PeCAN, respectively). A broad range of experiments demonstrate the effectiveness of our methods.
Submitted 29 May, 2024;
originally announced May 2024.
-
Survival of the Fittest Representation: A Case Study with Modular Addition
Authors:
Xiaoman Delores Ding,
Zifan Carl Guo,
Eric J. Michaud,
Ziming Liu,
Max Tegmark
Abstract:
When a neural network can learn multiple distinct algorithms to solve a task, how does it "choose" between them during training? To approach this question, we take inspiration from ecology: when multiple species coexist, they eventually reach an equilibrium where some survive while others die out. Analogously, we suggest that a neural network at initialization contains many solutions (representations and algorithms), which compete with each other under pressure from resource constraints, with the "fittest" ultimately prevailing. To investigate this Survival of the Fittest hypothesis, we conduct a case study on neural networks performing modular addition, and find that these networks' multiple circular representations at different Fourier frequencies undergo such competitive dynamics, with only a few circles surviving at the end. We find that the frequencies with high initial signals and gradients, the "fittest," are more likely to survive. By increasing the embedding dimension, we also observe more surviving frequencies. Inspired by the Lotka-Volterra equations describing the dynamics between species, we find that the dynamics of the circles can be nicely characterized by a set of linear differential equations. Our results with modular addition show that it is possible to decompose complicated representations into simpler components, along with their basic interactions, to offer insight on the training dynamics of representations.
Submitted 27 May, 2024;
originally announced May 2024.
-
The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition
Authors:
Lingdong Kong,
Shaoyuan Xie,
Hanjiang Hu,
Yaru Niu,
Wei Tsang Ooi,
Benoit R. Cottereau,
Lai Xing Ng,
Yuexin Ma,
Wenwei Zhang,
Liang Pan,
Kai Chen,
Ziwei Liu,
Weichao Qiu,
Wei Zhang,
Xu Cao,
Hao Lu,
Ying-Cong Chen,
Caixin Kang,
Xinning Zhou,
Chengyang Ying,
Wentao Shang,
Xingxing Wei,
Yinpeng Dong,
Bo Yang,
Shengyin Jiang
, et al. (66 additional authors not shown)
Abstract:
In the realm of autonomous driving, robust perception under out-of-distribution conditions is paramount for the safe deployment of vehicles. Challenges such as adverse weather, sensor malfunctions, and environmental unpredictability can severely impact the performance of autonomous systems. The 2024 RoboDrive Challenge was crafted to propel the development of driving perception technologies that can withstand and adapt to these real-world variabilities. Focusing on four pivotal tasks -- BEV detection, map segmentation, semantic occupancy prediction, and multi-view depth estimation -- the competition laid down a gauntlet to innovate and enhance system resilience against typical and atypical disturbances. This year's challenge consisted of five distinct tracks and attracted 140 registered teams from 93 institutes across 11 countries, resulting in nearly one thousand submissions evaluated through our servers. The competition culminated in 15 top-performing solutions, which introduced a range of innovative approaches including advanced data augmentation, multi-sensor fusion, self-supervised learning for error correction, and new algorithmic strategies to enhance sensor robustness. These contributions significantly advanced the state of the art, particularly in handling sensor inconsistencies and environmental variability. Participants, through collaborative efforts, pushed the boundaries of current technologies, showcasing their potential in real-world scenarios. Extensive evaluations and analyses provided insights into the effectiveness of these solutions, highlighting key trends and successful strategies for improving the resilience of driving perception systems. This challenge has set a new benchmark in the field, providing a rich repository of techniques expected to guide future research in this field.
Submitted 29 May, 2024; v1 submitted 14 May, 2024;
originally announced May 2024.
-
Power-Domain Interference Graph Estimation for Full-Duplex Millimeter-Wave Backhauling
Authors:
Haorui Li,
Daqian Ding,
Yibo Pi,
Xudong Wang
Abstract:
Traditional wisdom for network resource management allocates separate frequency-time resources for measurement and data transmission tasks. As a result, the two types of tasks have to compete for resources, and a heavy measurement task inevitably reduces available resources for data transmission. This prevents interference graph estimation (IGE), a heavy yet important measurement task, from being widely used in practice. To resolve this issue, we propose to use power as a new dimension for interference measurement in full-duplex millimeter-wave backhaul networks, such that data transmission and measurement can be done simultaneously using the same frequency-time resources. Our core insight is to consider the mmWave network as a linear system, where the received power of a node is a linear combination of the channel gains. By controlling the powers of transmitters, we can find unique solutions for the channel gains of interference links and use them to estimate the interference. To accomplish resource allocation and IGE simultaneously, we jointly optimize resource allocation and IGE with power control. Extensive simulations show that significant links in the interference graph can be accurately estimated with minimal extra power consumption, independent of the time and carrier frequency offsets between nodes.
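The linear-system idea in this abstract can be illustrated with a toy case. The sketch below is not the authors' estimator: it assumes a noise-free receiver with exactly two interfering transmitters, and the names `solve_gains_2x2`, `tx_powers`, and `rx_powers` are hypothetical. Each measurement round uses a different pair of transmit powers, and the received power is modeled as the sum of each transmit power times its channel gain.

```python
def solve_gains_2x2(tx_powers, rx_powers):
    """Recover two channel gains from two rounds of power measurements.

    tx_powers: [(p1, p2), (p1', p2')] transmit powers used in each round.
    rx_powers: [r, r'] received interference power observed in each round.
    Model (assumed, noise-free): r = p1 * g1 + p2 * g2 for each round.
    """
    (a11, a12), (a21, a22) = tx_powers
    r1, r2 = rx_powers
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        # Identical or proportional power settings give no unique solution.
        raise ValueError("power settings are linearly dependent")
    # Cramer's rule for the 2x2 system.
    g1 = (r1 * a22 - a12 * r2) / det
    g2 = (a11 * r2 - a21 * r1) / det
    return g1, g2


# Hypothetical ground-truth gains, used only to synthesize measurements.
true_gains = (0.02, 0.005)
tx_powers = [(1.0, 0.5), (0.5, 1.0)]
rx_powers = [sum(p * g for p, g in zip(row, true_gains)) for row in tx_powers]

estimated = solve_gains_2x2(tx_powers, rx_powers)
print(estimated)
```

The determinant check reflects why power control matters in this picture: the gains are uniquely recoverable only when the chosen power settings are linearly independent across rounds.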
Submitted 9 May, 2024; v1 submitted 8 May, 2024;
originally announced May 2024.
-
Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing
Authors:
Dujian Ding,
Ankur Mallick,
Chi Wang,
Robert Sim,
Subhabrata Mukherjee,
Victor Ruhle,
Laks V. S. Lakshmanan,
Ahmed Hassan Awadallah
Abstract:
Large language models (LLMs) excel in most NLP tasks but also require expensive cloud servers for deployment due to their size, while smaller models that can be deployed on lower cost (e.g., edge) devices, tend to lag behind in terms of response quality. Therefore in this work we propose a hybrid inference approach which combines their respective strengths to save cost and maintain quality. Our approach uses a router that assigns queries to the small or large model based on the predicted query difficulty and the desired quality level. The desired quality level can be tuned dynamically at test time to seamlessly trade quality for cost as per the scenario requirements. In experiments our approach allows us to make up to 40% fewer calls to the large model, with no drop in response quality.
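The routing rule described here can be sketched as a threshold on a per-query score. This is a minimal illustration, not the paper's router: `toy_score` is a hypothetical stand-in (it simply treats shorter queries as easier) for a learned predictor of whether the small model's answer would be good enough, and `threshold` plays the role of the quality level tuned at test time.

```python
from typing import Callable, Iterable


def route(query: str, score_fn: Callable[[str], float], threshold: float) -> str:
    """Send the query to the small model when its score clears the threshold."""
    return "small" if score_fn(query) >= threshold else "large"


def toy_score(query: str) -> float:
    # Hypothetical stand-in for a learned router: shorter queries score higher.
    return 1.0 / (1.0 + len(query.split()) / 10.0)


def large_call_fraction(
    queries: Iterable[str], score_fn: Callable[[str], float], threshold: float
) -> float:
    """Fraction of queries that end up on the large (expensive) model."""
    routed = [route(q, score_fn, threshold) for q in queries]
    return routed.count("large") / len(routed)


queries = [
    "hi there",
    "define entropy",
    " ".join(["word"] * 30),  # a long, presumably harder query
]

# A lower threshold trades quality for cost; a higher one does the opposite.
for threshold in (0.3, 0.5, 0.9):
    print(threshold, large_call_fraction(queries, toy_score, threshold))
```

Raising the threshold sends more traffic to the large model, which is the quality-for-cost dial the abstract refers to.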
Submitted 22 April, 2024;
originally announced April 2024.
-
Pointsoup: High-Performance and Extremely Low-Decoding-Latency Learned Geometry Codec for Large-Scale Point Cloud Scenes
Authors:
Kang You,
Kai Liu,
Li Yu,
Pan Gao,
Dandan Ding
Abstract:
Despite considerable progress in point cloud geometry compression, effectively compressing large-scale scenes with sparse surfaces remains a challenge. Another key challenge lies in reducing decoding latency, a crucial requirement in real-world applications. In this paper, we propose Pointsoup, an efficient learning-based geometry codec that attains high performance and extremely low decoding latency simultaneously. Inspired by the conventional Trisoup codec, a point model-based strategy is devised to characterize local surfaces. Specifically, skin features are embedded from local windows via an attention-based encoder, and dilated windows are introduced as cross-scale priors to infer the distribution of quantized features in parallel. During decoding, features undergo fast refinement, followed by a folding-based point generator that reconstructs point coordinates with fairly fast speed. Experiments show that Pointsoup achieves state-of-the-art performance on multiple benchmarks with significantly lower decoding complexity, i.e., up to 90$\sim$160$\times$ faster than the G-PCCv23 Trisoup decoder on a comparatively low-end platform (e.g., one RTX 2080Ti). Furthermore, it offers variable-rate control with a single neural model (2.9MB), which is attractive for industrial practitioners.
Submitted 21 April, 2024;
originally announced April 2024.
-
Microwave seeding time crystal in Floquet driven Rydberg atoms
Authors:
Bang Liu,
Li-Hua Zhang,
Yu Ma,
Tian-Yu Han,
Qi-Feng Wang,
Jun Zhang,
Zheng-Yuan Zhang,
Shi-Yao Shao,
Qing Li,
Han-Chao Chen,
Ya-Jun Wang,
Jia-Dou Nan,
Yi-Ming Yin,
Dong-Sheng Ding,
Bao-Sen Shi
Abstract:
Crystal seeding enables a deeper understanding of phase behavior, leading to the development of methods for controlling and manipulating phase transitions in various applications such as materials synthesis, crystallization processes, and phase transformation engineering. How to seed a crystalline order in the time domain is an open question of great significance, and answering it may provide an avenue to understand and control time-dependent quantum many-body physics. Here, we utilize a microwave pulse as a seed to induce the formation of a discrete time crystal in Floquet driven Rydberg atoms. In the experiment, the periodic driving on Rydberg states acts as a seeded crystalline order in subspace, which triggers the time-translation symmetry breaking across the entire ensemble. The behavior of the emergent time crystal is elaborately linked to alterations in the seed, such as the relative phase shift and the frequency difference, which result in phase dependent seeding and corresponding shift in periodicity of the time crystal, leading to embryonic synchronization. This result opens up new possibilities for studying and harnessing time-dependent quantum many-body phenomena, offering insights into the behavior of complex many-body systems under seeding.
Submitted 18 April, 2024;
originally announced April 2024.
-
Ultra-Wide Dual-band Rydberg Atomic Receiver Based on Space Division Multiplexing RF-Chip Modules
Authors:
Li-Hua Zhang,
Bang Liu,
Zong-Kai Liu,
Zheng-Yuan Zhang,
Shi-Yao Shao,
Qi-Feng Wang,
Yu Ma,
Tian-Yu Han,
Guang-Can Guo,
Dong-Sheng Ding,
Bao-Sen Shi
Abstract:
Detecting microwave signals over a wide frequency range has numerous advantages as it enables simultaneous transmission of a large amount of information and access to more spectrum resources. This capability is crucial for applications such as microwave communication, remote sensing, and radar. However, conventional microwave receiving systems are limited by amplifiers and band-pass filters that can only operate efficiently in a specific frequency range. Typically, these systems can only process signals within a three-fold frequency range, which limits the data transfer bandwidth of the microwave communication systems. Developing novel atom-integrated microwave sensors, for example, radio frequency (RF)-chip coupled Rydberg atomic receiver, provides opportunities for a large working bandwidth of microwave sensing at the atomic level. Here, an ultra-wide dual-band RF sensing scheme is demonstrated by space-division multiplexing two RF-chip-integrated atomic receiver modules. The system can simultaneously receive dual-band microwave signals that span a frequency range exceeding 6 octaves (300 MHz and 24 GHz). This work paves the way for multi-band microwave reception applications within an ultra-wide range by RF-chip-integrated Rydberg atomic sensor.
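As a quick check of the frequency figures quoted in this abstract, the number of octaves between two frequencies is the base-2 logarithm of their ratio. The helper name `octaves` below is only illustrative.

```python
import math


def octaves(f_low_hz: float, f_high_hz: float) -> float:
    """Number of octaves (frequency doublings) between two frequencies."""
    if f_low_hz <= 0 or f_high_hz < f_low_hz:
        raise ValueError("expected 0 < f_low_hz <= f_high_hz")
    return math.log2(f_high_hz / f_low_hz)


# Span between the two bands quoted in the abstract: 300 MHz and 24 GHz.
dual_band_span = octaves(300e6, 24e9)

# A three-fold frequency range, the typical limit quoted for conventional systems.
conventional_span = octaves(1.0, 3.0)

print(f"{dual_band_span:.2f} octaves vs {conventional_span:.2f} octaves")
```

The ratio 24 GHz / 300 MHz is 80, which lies between 2^6 = 64 and 2^7 = 128, consistent with the stated span of more than 6 octaves.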
Submitted 16 April, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
Early warning signals of the tipping point in strongly interacting Rydberg atoms
Authors:
Jun Zhang,
Zong-Kai Liu,
Li-Hua Zhang,
Bang Liu,
Zheng-Yuan Zhang,
Shi-Yao Shao,
Qing Li,
Han-Chao Chen,
Yu Ma,
Tian-Yu Han,
Qi-Feng Wang,
C. Stuart Adams,
Dong-Sheng Ding,
Bao-Sen Shi
Abstract:
The identification of tipping points is essential for prediction of collapses or other sudden changes in complex systems. Applications include studies of ecology, thermodynamics, climatology, and epidemiology. However, detecting early signs of proximity to a tipping point is made challenging by complexity and non-linearity. Strongly interacting Rydberg atom gases offer a model system that provides both complexity and non-linearity, including phase transition and critical slowing down. Here, via an external probe we observe prior warning of the proximity of a phase transition of Rydberg thermal gases. This warning signal is manifested as a cessation of the variance growth with increasing probe intensity. We also observed the dynamics of the critical slowing down behavior versus different time scales, driving intensities, and atomic densities, thus providing insights into the study of a Rydberg atom system's critical behavior. Our experiment suggests that the full critical slowing down dynamics of strongly-interacting Rydberg atoms can be probed systematically, thus providing a benchmark with which to identify critical phenomena in quantum many-body systems.
Submitted 14 April, 2024;
originally announced April 2024.
-
Floquet engineering Rydberg sub-THz frequency comb spectroscopy
Authors:
Li-Hua Zhang,
Zong-Kai Liu,
Bang Liu,
Qi-Feng Wang,
Yu Ma,
Tian-Yu Han,
Zheng-Yuan Zhang,
Han-Chao Chen,
Shi-Yao Shao,
Qing Li,
Jun Zhang,
Dong-Sheng Ding,
Bao-Sen Shi
Abstract:
Engineering terahertz (THz) frequency comb spectroscopy at the atomic level advances precise measurement in spectroscopy and sensing. Current progress on THz frequency combs relies on difference-frequency generation, optical parametric oscillation, and other methods. Generating a THz frequency comb poses challenges in source stability and in achieving a narrow bandwidth, which are difficult for traditional THz devices to achieve. Furthermore, accurately measuring the generated THz frequency comb necessitates a high-performance THz detector. Rydberg atoms are well-suited for electric field sensing due to their ultra-wide radio frequency transition energy levels, making them especially sensitive to external electric fields in the DC to THz bandwidth. However, there have been no reports about generating THz frequency comb spectroscopy at the atomic level until now. This work presents a THz frequency comb spectroscopy with Rydberg atoms, in which a Floquet comb-like transition is engineered through a time-periodic drive field. Our approach simplifies the setup required for THz frequency comb spectroscopy while extending the working bandwidth for Rydberg atomic sensors. The THz frequency comb spectroscopy at the atomic level reported in this article shows great potential for various applications in astronomy, remote sensing, spectral detection of biological samples, and other related fields.
Submitted 10 April, 2024;
originally announced April 2024.
-
Cavity-enhanced Rydberg atom microwave receiver
Authors:
Bang Liu,
Li-Hua Zhang,
Zong-Kai Liu,
Qi-Feng Wang,
Yu Ma,
Tian-Yu Han,
Zheng-Yuan Zhang,
Shi-Yao Shao,
Jun Zhang,
Qing Li,
Han-Chao Chen,
Dong-Sheng Ding,
Bao-Sen Shi
Abstract:
Developing microwave electric field sensing based on Rydberg atoms has received significant attention due to its unique advantages. However, achieving effective coupling between Rydberg atoms and the microwave electric field in the sensing process is a challenging problem that greatly impacts the sensitivity. To address this, we propose the use of a microwave resonant cavity to enhance the effective coupling between the Rydberg atoms and the microwave electric field. In our experiment, we use a three-photon excitation scheme to prepare Rydberg atoms and measure electric fields with and without a microwave cavity that encloses the vapor cell. Through experimental testing, we achieve an 18 dB enhancement of power sensitivity. The experiment shows an effective enhancement in electric field pulse signal detection. This result provides a promising direction for enhancing the sensitivity of Rydberg atomic electric field sensors and paves the way for their application in precision electric field measurements.
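For readers less used to decibels, the 18 dB figure can be converted to linear ratios with the standard definitions (10·log10 for power, 20·log10 for field amplitude). The function names below are illustrative, and reading the enhancement as a field-amplitude ratio is an assumption added here, not a claim from the abstract.

```python
def db_to_power_ratio(db: float) -> float:
    """Linear power ratio for a gain in decibels (dB = 10 * log10(P2 / P1))."""
    return 10.0 ** (db / 10.0)


def db_to_amplitude_ratio(db: float) -> float:
    """Linear amplitude ratio for the same gain (dB = 20 * log10(E2 / E1))."""
    return 10.0 ** (db / 20.0)


enhancement_db = 18.0
power_ratio = db_to_power_ratio(enhancement_db)
amplitude_ratio = db_to_amplitude_ratio(enhancement_db)

print(f"power x{power_ratio:.1f}, field amplitude x{amplitude_ratio:.1f}")
```

Under these definitions, 18 dB corresponds to roughly a 63-fold change in power, or about an 8-fold change in field amplitude.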
Submitted 10 April, 2024;
originally announced April 2024.
-
A model for heating the super-hot corona in solar active regions
Authors:
Zekun Lu,
Feng Chen,
M. D. Ding,
Can Wang,
Yu Dai,
Xin Cheng
Abstract:
What physical mechanisms heat the outer solar or stellar atmosphere to million-Kelvin temperatures is a fundamental but long-standing open question. In particular, the solar corona in active region cores contains an even hotter component reaching ten million Kelvin, manifesting as persistent coronal loops in extreme ultraviolet and soft X-ray images, which imposes a more stringent energy budget. Here, we present a self-consistent coronal heating model using a state-of-the-art three-dimensional radiative magnetohydrodynamics simulation. We find that the continuous magnetic flux emergence in active regions keeps driving magnetic reconnections that release energy impulsively but, on time average, persistently. As a result, numerous sub-structures are heated to ten million Kelvin and then evolve independently, which collectively form long-lived and stable coronal loops as in observations. This provides a heating model explaining the origin of the super-hot coronal plasma and the persistence of hot coronal loops in emerging active regions.
Submitted 8 April, 2024;
originally announced April 2024.
-
$(n,m,p)$-type quantum network configuration and its nonlocality
Authors:
Zan-Jia Li,
Ying-Qiu He,
Dong Ding,
Ming-Xing Yu,
Ting Gao,
Feng-Li Yan
Abstract:
A quantum network that shares entangled sources among distant nodes enables us to distribute entanglement along the network by suitable measurements. Network nonlocality means that the network does not admit a model involving local variables emitted from independent sources. In this work, we construct an $(n,m,p)$-type quantum network configuration and then derive the corresponding $n$-local correlation inequalities based on the assumption of independent sources. As a universal acyclic network configuration, it can cover most of the existing network models, such as the typical chain-network and star-network, and admit both centerless and asymmetric configurations. Then we demonstrate the non-$n$-locality of the present network by calculating the violation of the $n$-local inequality with bipartite entangled sources and Pauli measurements.
Submitted 2 April, 2024;
originally announced April 2024.
-
Non-Hermitian unidirectional routing of photonic qubits
Authors:
En-Ze Li,
Yi-Yang Liu,
Ming-Xin Dong,
Dong-Sheng Ding,
Bao-Sen Shi
Abstract:
Efficient and tunable qubit unidirectional routers and spin-wave diodes play an important role in both classical and quantum information processing domains. Here, we reveal that multi-level neutral cold atoms can mediate both dissipative and coherent couplings. Interestingly, we investigate and practically implement this paradigm in experiments, successfully synthesizing a system with dual functionality as both a photonic qubit unidirectional router and a spin-wave diode. By manipulating the helicity of the field, we can effectively balance the coherence coupling and dissipative channel, thereby ensuring the unidirectional transfer of photonic qubits. The qubit fidelity exceeds 97.49%, and the isolation ratio achieves $16.8\pm0.11$ dB while the insertion loss is lower than 0.36 dB. Furthermore, we show that the spin-wave diode can effectively achieve unidirectional information transfer by appropriately setting the coherent coupling parameters. Our work not only provides new ideas for the design of extensive components in quantum networks, but also opens up new possibilities for non-Hermitian quantum physics, complex quantum networks, and unidirectional quantum information transfer.
Submitted 1 April, 2024;
originally announced April 2024.
-
EDDA: A Encoder-Decoder Data Augmentation Framework for Zero-Shot Stance Detection
Authors:
Daijun Ding,
Li Dong,
Zhichao Huang,
Guangning Xu,
Xu Huang,
Bo Liu,
Liwen Jing,
Bowen Zhang
Abstract:
Stance detection aims to determine the attitude expressed in text towards a given target. Zero-shot stance detection (ZSSD) has emerged to classify stances towards unseen targets during inference. Recent data augmentation techniques for ZSSD increase transferable knowledge between targets through text or target augmentation. However, these methods exhibit limitations. Target augmentation lacks logical connections between generated targets and source text, while text augmentation relies solely on training data, resulting in insufficient generalization. To address these issues, we propose an encoder-decoder data augmentation (EDDA) framework. The encoder leverages large language models and chain-of-thought prompting to summarize texts into target-specific if-then rationales, establishing logical relationships. The decoder generates new samples based on these expressions using a semantic correlation word replacement strategy to increase syntactic diversity. We also analyze the generated expressions to develop a rationale-enhanced network that fully utilizes the augmented data. Experiments on benchmark datasets demonstrate our approach substantially improves over state-of-the-art ZSSD techniques. The proposed EDDA framework increases semantic relevance and syntactic variety in augmented texts while enabling interpretable rationale-based learning.
Submitted 23 March, 2024;
originally announced March 2024.
-
Entity Alignment with Unlabeled Dangling Cases
Authors:
Hang Yin,
Dong Ding,
Liyao Xiang,
Yuheng He,
Yihan Wu,
Xinbing Wang,
Chenghu Zhou
Abstract:
We investigate the entity alignment problem with unlabeled dangling cases, meaning that there are entities in the source or target graph having no counterparts in the other, and those entities remain unlabeled. The problem arises when the source and target graphs are of different scales, and it is much cheaper to label the matchable pairs than the dangling entities. To solve the issue, we propose a novel GNN-based dangling detection and entity alignment framework. While the two tasks share the same GNN and are trained together, the detected dangling entities are removed in the alignment. Our framework is featured by a designed entity and relation attention mechanism for selective neighborhood aggregation in representation learning, as well as a positive-unlabeled learning loss for an unbiased estimation of dangling entities. Experimental results have shown that each component of our design contributes to the overall alignment performance which is comparable or superior to baselines, even if the baselines additionally have 30\% of the dangling entities labeled as training data.
Submitted 16 March, 2024;
originally announced March 2024.
-
Sun-as-a-star Study of an X-class Solar Flare with Spectroscopic Observations of CHASE
Authors:
Y. L. Ma,
Q. H. Lao,
X. Cheng,
B. T. Wang,
Z. H. Zhao,
S. H. Rao,
C. Li,
M. D. Ding
Abstract:
Sun-as-a-star spectroscopic characteristics of solar flares can be used as a benchmark for the detection and analyses of stellar flares. Here, we study the Sun-as-a-star properties of an X1.0 solar flare using high-resolution spectroscopic data obtained by the Chinese $\mathrm{H}\alpha$ Solar Explorer (CHASE). A noise reduction algorithm based on discrete Fourier transformation is first employed to enhance the signal-to-noise ratio of the space-integral $\mathrm{H}\alpha$ spectrum with a focus on its typical characteristics. For the flare of interest, we find that the average $\mathrm{H}\alpha$ profile displays a strong emission at the line center and an obvious line broadening. It also presents a clear red asymmetry, corresponding to a redshift velocity of around $50 \ \mathrm{km \ s^{-1}}$ that slightly decreases with time, consistent with previous results. Furthermore, we study how the size of the space-integral region affects the characteristics of the flare Sun-as-a-star $\mathrm{H}\alpha$ profile. It is found that although the redshift velocity calculated from the $\mathrm{H}\alpha$ profile remains unchanged, the detectability of the characteristics weakens as the space-integral region becomes large. An upper limit for the size of the target region where the red asymmetry is detectable is estimated. It is also found that the intensity in $\mathrm{H}\alpha$ profiles, measured by the equivalent widths of the spectra, is significantly underestimated if the $\mathrm{H}\alpha$ spectra are further averaged in the time domain.
Submitted 13 March, 2024;
originally announced March 2024.
-
Prototyping and Experimental Results for Environment-Aware Millimeter Wave Beam Alignment via Channel Knowledge Map
Authors:
Zhuoyin Dai,
Di Wu,
Zhenjun Dong,
Kun Li,
Dingyang Ding,
Sihan Wang,
Yong Zeng
Abstract:
Channel knowledge map (CKM), which aims to directly reflect the intrinsic channel properties of the local wireless environment, is a novel technique for achieving environment-aware communication. In this paper, to alleviate the large training overhead in millimeter wave (mmWave) beam alignment, an environment-aware and training-free beam alignment prototype is established based on a typical CKM, termed beam index map (BIM). To this end, a general CKM construction method is first presented, and an indoor BIM is constructed offline to learn the candidate transmit and receive beam index pairs for each grid in the experimental area. Furthermore, based on the location information of the receiver (or the dynamic obstacles) from the ultra-wide band (UWB) positioning system, the established BIM is used to achieve training-free beam alignment by directly providing the beam indexes for the transmitter and receiver. Three typical scenarios are considered in the experiment, including quasi-static environment with line-of-sight (LoS) link, quasi-static environment without LoS link and dynamic environment. Besides, the receiver orientation measured from the gyroscope is also used to help CKM predict more accurate beam indexes. The experiment results show that compared with the benchmark location-based beam alignment strategy, the CKM-based beam alignment strategy can achieve much higher received power, which is close to that achieved by exhaustive beam search, but with significantly reduced training overhead.
Submitted 12 March, 2024;
originally announced March 2024.
-
Uplift Modeling for Target User Attacks on Recommender Systems
Authors:
Wenjie Wang,
Changsheng Wang,
Fuli Feng,
Wentao Shi,
Daizong Ding,
Tat-Seng Chua
Abstract:
Recommender systems are vulnerable to injective attacks, which inject limited fake users into the platforms to manipulate the exposure of target items to all users. In this work, we identify that conventional injective attackers overlook the fact that each item has its unique potential audience, and meanwhile, the attack difficulty across different users varies. Blindly attacking all users will result in a waste of fake user budgets and inferior attack performance. To address these issues, we focus on an under-explored attack task called target user attacks, aiming at promoting target items to a particular user group. In addition, we formulate the varying attack difficulty as heterogeneous treatment effects through a causal lens and propose an Uplift-guided Budget Allocation (UBA) framework. UBA estimates the treatment effect on each target user and optimizes the allocation of fake user budgets to maximize the attack performance. Theoretical and empirical analysis demonstrates the rationality of treatment effect estimation methods of UBA. By instantiating UBA on multiple attackers, we conduct extensive experiments on three datasets under various settings with different target items, target users, fake user budgets, victim models, and defense models, validating the effectiveness and robustness of UBA.
Submitted 5 March, 2024;
originally announced March 2024.
-
Generalized Coronal Loop Scaling Laws and Their Implication for Turbulence in Solar Active Region Loops
Authors:
Y. Dai,
J. J. Xiang,
M. D. Ding
Abstract:
Recent coronal loop modeling has emphasized the importance of combining both Coulomb collisions and turbulent scattering to characterize field-aligned thermal conduction, which invokes a hybrid loop model. In this work we generalize the hybrid model by incorporating nonuniform heating and cross section that are both formulated by a power-law function of temperature. Based on the hybrid model solutions, we construct scaling laws that relate loop-top temperature ($T_a$) and heating rate ($H_a$) to other loop parameters. It is found that the loop-top properties for turbulent loops are additionally power-law functions of turbulent mean free path ($λ_T$), with the functional forms varying from situation to situation that depends on the specification of the heating and/or areal parameters. More importantly, both a sufficiently footpoint-concentrated heating and a cross-sectional expansion with height can effectively weaken (strengthen) the negative (positive) power-law dependence of $T_a$ ($H_a$) on $λ_T$. The reason lies in a notable reduction of heat flux by footpoint heating and/or cross-sectional expansion in the turbulence-dominated coronal part, where turbulent scattering introduces a much weaker dependence of the conduction coefficient on temperature. In this region, therefore, the reduction of the heat flux predominately relies on a backward flattening of the temperature gradient. Through numerical modeling that incorporates more realistic conditions, this scenario is further consolidated. Our results have important implication for solar active region (AR) loops. With the factors of nonuniform heating and cross section taken into account, AR loops can bear relatively stronger turbulence while still keeping a physically reasonable temperature for nonflaring loops.
Submitted 4 March, 2024;
originally announced March 2024.
-
Developing an Automated Detection, Tracking and Analysis Method for Solar Filaments Observed by CHASE via Machine Learning
Authors:
Z. Zheng,
Q. Hao,
Y. Qiu,
J. Hong,
C. Li,
M. D. Ding
Abstract:
Studies on the dynamics of solar filaments have significant implications for understanding their formation, evolution, and eruption, which are of great importance for space weather warning and forecasting. The H$α$ Imaging Spectrograph (HIS) onboard the recently launched Chinese H$α$ Solar Explorer (CHASE) can provide full-disk solar H$α$ spectroscopic observations, which bring us an opportunity to systematically explore and analyze the plasma dynamics of filaments. The dramatically increased volume of observation data requires automated processing and analysis, which would be infeasible to perform manually. In this paper, we utilize the U-Net model to identify filaments and implement the Channel and Spatial Reliability Tracking (CSRT) algorithm for automated filament tracking. In addition, we use the cloud model to invert the line-of-sight velocity of filaments and employ the graph theory algorithm to extract the filament spine, which can advance our understanding of the dynamics of filaments. The favorable test performance confirms the validity of our method, which will be implemented in the following statistical analyses of filament features and dynamics of CHASE/HIS observations.
Submitted 21 February, 2024;
originally announced February 2024.
-
Higher-order and fractional discrete time crystals in Floquet-driven Rydberg atoms
Authors:
Bang Liu,
Li-Hua Zhang,
Zong-Kai Liu,
Jun Zhang,
Zheng-Yuan Zhang,
Shi-Yao Shao,
Qing Li,
Han-Chao Chen,
Yu Ma,
Tian-Yu Han,
Qi-Feng Wang,
Dong-Sheng Ding,
Bao-Sen Shi
Abstract:
Higher-order and fractional discrete time crystals (DTCs) are exotic phases of matter where the discrete time translation symmetry is broken into higher-order and non-integer category. Generation of these unique DTCs has been widely studied theoretically in different systems. However, no current experimental methods can probe these higher-order and fractional DTCs in any quantum many-body systems. We demonstrate an experimental approach to observe higher-order and fractional DTCs in Floquet-driven Rydberg atomic gases. We have discovered multiple $n$-DTCs with integer values of $n$ = 2, 3, and 4, and others ranging up to 14, along with fractional $n$-DTCs with $n$ values beyond the integers. The system response can transition between adjacent integer DTCs, during which the fractional DTCs are investigated. Study of higher-order and fractional DTCs expands fundamental knowledge of non-equilibrium dynamics and is promising for discovery of more complex temporal symmetries beyond the single discrete time translation symmetry.
Submitted 27 February, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
Bifurcation of time crystals in driven and dissipative Rydberg atomic gas
Authors:
Bang Liu,
Li-Hua Zhang,
Zong-Kai Liu,
Jun Zhang,
Zheng-Yuan Zhang,
Shi-Yao Shao,
Qing Li,
Han-Chao Chen,
Yu Ma,
Tian-Yu Han,
Qi-Feng Wang,
Dong-Sheng Ding,
Bao-Sen Shi
Abstract:
A time crystal is an exotic phase of matter where time-translational symmetry is broken; this phase differs from the spatial symmetry breaking induced in crystals in space. Lots of experiments report the transition from a thermal equilibrium phase to time crystal phase. However, there is no experimental method to probe the bifurcation effect of distinct time crystals in quantum many-body systems. Here, in a driven and dissipative many-body Rydberg atom system, we observe multiple continuous dissipative time crystals and emergence of more complex temporal symmetries beyond the single time crystal phase. Bifurcation of time crystals in strongly interacting Rydberg atoms is observed; the process manifests as a transition from a time crystal state of long temporal order to one of short temporal order, or vice versa. By manipulating the driving field parameters, we observe the time crystal's bistability and a hysteresis loop. These investigations indicate new possibilities for control and manipulation of the temporal symmetries of non-equilibrium systems.
Submitted 27 February, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
The White-light Emissions in Two X-class Flares Observed by ASO-S and CHASE
Authors:
Ying Li,
Zhichen Jing,
De-Chao Song,
Qiao Li,
Jun Tian,
Xiaofeng Liu,
Ya Wang,
M. D. Ding,
Andrea Francesco Battaglia,
Li Feng,
Hui Li,
Weiqun Gan
Abstract:
The white-light continuum emissions in solar flares (i.e., white-light flares) are usually observed on the solar disk but, in a few cases, off the limb. Here we present on-disk as well as off-limb continuum emissions at 3600 Å (in the Balmer continuum) in an X2.1 flare (SOL2023-03-03T17:52) and an X1.5 flare (SOL2023-08-07T20:46), respectively, observed by the White-light Solar Telescope (WST) on the Advanced Space-based Solar Observatory (ASO-S). These continuum emissions are seen at the ribbons for the X2.1 flare and on loops during the X1.5 event, in which the latter also appears in the decay phase. These emissions also show up in the pseudo-continuum images at Fe I λ6173 from the Helioseismic and Magnetic Imager (HMI) on the Solar Dynamics Observatory (SDO). In addition, the ribbon sources in the X2.1 flare exhibit significant enhancements in the Fe I line at 6569.2 Å and the nearby continuum observed by the Chinese Hα Solar Explorer (CHASE). It is found that the on-disk continuum emissions in the X2.1 flare are related to a nonthermal electron-beam heating either directly or indirectly, while the off-limb emissions in the X1.5 flare are associated with thermal plasma cooling or due to Thomson scattering. These comprehensive continuum observations can provide good constraints on flare energy deposition models, which helps to better understand the physical mechanism of white-light flares.
Submitted 11 February, 2024;
originally announced February 2024.
-
Microwave control of collective quantum jump statistics of a dissipative Rydberg gas
Authors:
Zong-Kai Liu,
Kong-Hao Sun,
Albert Cabot,
Federico Carollo,
Jun Zhang,
Zheng-Yuan Zhang,
Li-Hua Zhang,
Bang Liu,
Tian-Yu Han,
Qing Li,
Yu Ma,
Han-Chao Chen,
Igor Lesanovsky,
Dong-Sheng Ding,
Bao-Sen Shi
Abstract:
Quantum many-body systems near phase transitions respond collectively to externally applied perturbations. We explore this phenomenon in a laser-driven dissipative Rydberg gas that is tuned to a bistable regime. Here two metastable phases coexist, which feature a low and high density of Rydberg atoms, respectively. The ensuing collective dynamics, which we monitor in situ, is characterized by stochastic collective jumps between these two macroscopically distinct many-body phases. We show that the statistics of these jumps can be controlled using a dual-tone microwave field. In particular, we find that the distribution of jump times develops peaks corresponding to subharmonics of the relative microwave detuning. Our study demonstrates the control of collective statistical properties of dissipative quantum many-body systems without the necessity of fine-tuning or of ultra cold temperatures. Such robust phenomena may find technological applications in quantum sensing and metrology.
Submitted 7 February, 2024;
originally announced February 2024.
-
Quantum teleportation based on the elegant joint measurement
Authors:
Dong Ding,
Ming-Xing Yu,
Ying-Qiu He,
Hao-Sen Ji,
Ting Gao,
Feng-Li Yan
Abstract:
As a generalization of the well-known Bell state measurement (BSM), the elegant joint measurement (EJM) is a kind of novel two-qubit joint measurement, parameterized by a subtle phase factor $θ\in [0,π/2]$. We explore quantum teleportation based on the EJM, inspired by Gisin's idea that quantum entanglement provides not only the quantum channel but also the quantum joint measurement for quantum teleportation. It is a probabilistic teleportation caused by undesired nonunitary quantum evolution. There are two interesting features in the present scenario. First, it goes beyond the conventional teleportation scenario, which can be included in the present scenario. Second, different from the BSM being single input and four outcomes, it can provide an adjustable input setting or even multiple measurement settings for the sender (or the controller). Moreover, we show in detail the feasible quantum circuits to realize the present scenario, where a few unitary operations and a nonunitary quantum gate are being utilized.
Submitted 4 February, 2024;
originally announced February 2024.
-
Another Way to the Top: Exploit Contextual Clustering in Learned Image Coding
Authors:
Yichi Zhang,
Zhihao Duan,
Ming Lu,
Dandan Ding,
Fengqing Zhu,
Zhan Ma
Abstract:
While convolution and self-attention are extensively used in learned image compression (LIC) for transform coding, this paper proposes an alternative called Contextual Clustering based LIC (CLIC) which primarily relies on clustering operations and local attention for correlation characterization and compact representation of an image. As seen, CLIC expands the receptive field into the entire image for intra-cluster feature aggregation. Afterward, features are reordered to their original spatial positions to pass through the local attention units for inter-cluster embedding. Additionally, we introduce the Guided Post-Quantization Filtering (GuidedPQF) into CLIC, effectively mitigating the propagation and accumulation of quantization errors at the initial decoding stage. Extensive experiments demonstrate the superior performance of CLIC over state-of-the-art works: when optimized using MSE, it outperforms VVC by about 10% BD-Rate in three widely-used benchmark datasets; when optimized using MS-SSIM, it saves more than 50% BD-Rate over VVC. Our CLIC offers a new way to generate compact representations for image compression, which also provides a novel direction along the line of LIC development.
Submitted 21 January, 2024;
originally announced January 2024.
-
Diffusion Model Conditioning on Gaussian Mixture Model and Negative Gaussian Mixture Gradient
Authors:
Weiguo Lu,
Xuan Wu,
Deng Ding,
Jinqiao Duan,
Jirong Zhuang,
Gangnan Yuan
Abstract:
Diffusion models (DMs) are a type of generative model that has a huge impact on image synthesis and beyond. They achieve state-of-the-art generation results in various generative tasks. A great diversity of conditioning inputs, such as text or bounding boxes, are accessible to control the generation. In this work, we propose a conditioning mechanism utilizing Gaussian mixture models (GMMs) as feature conditioning to guide the denoising process. Based on set theory, we provide a comprehensive theoretical analysis that shows that conditional latent distribution based on features and classes is significantly different, so that conditional latent distribution on features produces fewer defect generations than conditioning on classes. Two diffusion models conditioned on the Gaussian mixture model are trained separately for comparison. Experiments support our findings. A novel gradient function called the negative Gaussian mixture gradient (NGMG) is proposed and applied in diffusion model training with an additional classifier. Training stability has improved. We also theoretically prove that NGMG shares the same benefit as the Earth Mover distance (Wasserstein) as a more sensible cost function when learning distributions supported by low-dimensional manifolds.
Submitted 1 February, 2024; v1 submitted 20 January, 2024;
originally announced January 2024.
-
Transferable Learned Image Compression-Resistant Adversarial Perturbations
Authors:
Yang Sui,
Zhuohang Li,
Ding Ding,
Xiang Pan,
Xiaozhong Xu,
Shan Liu,
Zhenzhong Chen
Abstract:
Adversarial attacks can readily disrupt the image classification system, revealing the vulnerability of DNN-based recognition tasks. While existing adversarial perturbations are primarily applied to uncompressed images or compressed images by the traditional image compression method, i.e., JPEG, limited studies have investigated the robustness of models for image classification in the context of DNN-based image compression. With the rapid evolution of advanced image compression, DNN-based learned image compression has emerged as the promising approach for transmitting images in many security-critical applications, such as cloud-based face recognition and autonomous driving, due to its superior performance over traditional compression. Therefore, there is a pressing need to fully investigate the robustness of a classification system post-processed by learned image compression. To bridge this research gap, we explore the adversarial attack on a new pipeline that targets image classification models that utilize learned image compressors as pre-processing modules. Furthermore, to enhance the transferability of perturbations across various quality levels and architectures of learned image compression models, we introduce a saliency score-based sampling method to enable the fast generation of transferable perturbation. Extensive experiments with popular attack methods demonstrate the enhanced transferability of our proposed method when attacking images that have been post-processed with different learned image compression models.
Submitted 5 January, 2024;
originally announced January 2024.
-
Cross-target Stance Detection by Exploiting Target Analytical Perspectives
Authors:
Daijun Ding,
Rong Chen,
Liwen Jing,
Bowen Zhang,
Xu Huang,
Li Dong,
Xiaowen Zhao,
Ge Song
Abstract:
Cross-target stance detection (CTSD) is an important task, which infers the attitude of the destination target by utilizing annotated data derived from the source target. One important approach in CTSD is to extract domain-invariant features to bridge the knowledge gap between multiple targets. However, the analysis of informal and short text structure, and implicit expressions, complicate the extraction of domain-invariant knowledge. In this paper, we propose a Multi-Perspective Prompt-Tuning (MPPT) model for CTSD that uses the analysis perspective as a bridge to transfer knowledge. First, we develop a two-stage instruct-based chain-of-thought method (TsCoT) to elicit target analysis perspectives and provide natural language explanations (NLEs) from multiple viewpoints by formulating instructions based on a large language model (LLM). Second, we propose a multi-perspective prompt-tuning framework (MultiPLN) to fuse the NLEs into the stance predictor. Extensive experimental results demonstrate the superiority of MPPT against the state-of-the-art baseline methods.
Submitted 3 January, 2024; v1 submitted 3 January, 2024;
originally announced January 2024.
-
Resilient Constrained Reinforcement Learning
Authors:
Dongsheng Ding,
Zhengyan Huan,
Alejandro Ribeiro
Abstract:
We study a class of constrained reinforcement learning (RL) problems in which multiple constraint specifications are not identified before training. It is challenging to identify appropriate constraint specifications due to the undefined trade-off between the reward maximization objective and the constraint satisfaction, which is ubiquitous in constrained decision-making. To tackle this issue, we propose a new constrained RL approach that searches for policy and constraint specifications together. This method features the adaptation of relaxing the constraint according to a relaxation cost introduced in the learning objective. Since this feature mimics how ecological systems adapt to disruptions by altering operation, our approach is termed as resilient constrained RL. Specifically, we provide a set of sufficient conditions that balance the constraint satisfaction and the reward maximization in notion of resilient equilibrium, propose a tractable formulation of resilient constrained policy optimization that takes this equilibrium as an optimal solution, and advocate two resilient constrained policy search algorithms with non-asymptotic convergence guarantees on the optimality gap and constraint satisfaction. Furthermore, we demonstrate the merits and the effectiveness of our approach in computational experiments.
Submitted 29 December, 2023; v1 submitted 28 December, 2023;
originally announced December 2023.
-
Effective dynamics of quantum fluctuations in field theory: with applications to cosmology
Authors:
Ding Ding,
Zhao Yu,
Yidun Wan
Abstract:
We develop a novel framework for describing quantum fluctuations in field theory, with a focus on cosmological applications. Our method uniquely circumvents the use of operator/Hilbert-space formalism, instead relying on a systematic treatment of classical variables, quantum fluctuations, and an effective Hamiltonian. Our framework not only aligns with standard formalisms in flat and de Sitter spacetimes, which assumes no backreaction, demonstrated through the $\varphi^3$-model, but also adeptly handles time-dependent backreaction in more general cases. The uncertainty principle and spatial symmetry emerge as critical tools for selecting initial conditions and understanding effective potentials. We discover that modes inside the Hubble horizon \emph{do not} necessarily feel an initial Minkowski vacuum, as is commonly assumed. Our findings offer fresh insights into the early universe's quantum fluctuations and potential explanations to large-scale CMB anomalies.
Submitted 22 April, 2024; v1 submitted 26 December, 2023;
originally announced December 2023.
-
A Logically Consistent Chain-of-Thought Approach for Stance Detection
Authors:
Bowen Zhang,
Daijun Ding,
Liwen Jing,
Hu Huang
Abstract:
Zero-shot stance detection (ZSSD) aims to detect stances toward unseen targets. Incorporating background knowledge to enhance transferability between seen and unseen targets constitutes the primary approach of ZSSD. However, these methods often struggle with a knowledge-task disconnect and lack logical consistency in their predictions. To address these issues, we introduce a novel approach named Logically Consistent Chain-of-Thought (LC-CoT) for ZSSD, which improves stance detection by ensuring relevant and logically sound knowledge extraction. LC-CoT employs a three-step process. Initially, it assesses whether supplementary external knowledge is necessary. Subsequently, it uses API calls to retrieve this knowledge, which can be processed by a separate LLM. Finally, a manual exemplar guides the LLM to infer stance categories, using an if-then logical structure to maintain relevance and logical coherence. This structured approach to eliciting background knowledge enhances the model's capability, outperforming traditional supervised methods without relying on labeled data.
Submitted 26 December, 2023;
originally announced December 2023.
-
Venn: Resource Management Across Federated Learning Jobs
Authors:
Jiachen Liu,
Fan Lai,
Ding Ding,
Yiwen Zhang,
Mosharaf Chowdhury
Abstract:
In recent years, federated learning (FL) has emerged as a promising approach for machine learning (ML) and data science across distributed edge devices. With the increasing popularity of FL, resource contention between multiple FL jobs training on the same device population is increasing as well. Scheduling edge resources among multiple FL jobs is different from GPU scheduling for cloud ML because of the ephemeral nature and planetary scale of participating devices as well as the overlapping resource requirements of diverse FL jobs. Existing resource managers for FL jobs opt for random assignment of devices to FL jobs for simplicity and scalability, which leads to poor performance. In this paper, we present Venn, an FL resource manager, that efficiently schedules ephemeral, heterogeneous devices among many FL jobs, with the goal of reducing their average job completion time (JCT). Venn formulates the Intersection Resource Scheduling (IRS) problem to identify complex resource contention among multiple FL jobs. Then, Venn proposes a contention-aware scheduling heuristic to minimize the average scheduling delay. Furthermore, it proposes a resource-aware device-to-job matching heuristic that focuses on optimizing response collection time by mitigating stragglers. Our evaluation shows that, compared to the state-of-the-art FL resource managers, Venn improves the average JCT by up to 1.88X.
Submitted 13 December, 2023;
originally announced December 2023.
-
One Gate Scheme to Rule Them All: Introducing a Complex Yet Reduced Instruction Set for Quantum Computing
Authors:
Jianxin Chen,
Dawei Ding,
Weiyuan Gong,
Cupjin Huang,
Qi Ye
Abstract:
The design and architecture of a quantum instruction set are paramount to the performance of a quantum computer. This work introduces a gate scheme for qubits with $XX+YY$ coupling that directly and efficiently realizes any two-qubit gate up to single-qubit gates. First, this scheme enables high-fidelity execution of quantum operations and achieves minimum possible gate times. Second, since the scheme spans the entire $\textbf{SU}(4)$ group of two-qubit gates, we can use it to attain the optimal two-qubit gate count for algorithm implementation. These two advantages in synergy give rise to a quantum Complex yet Reduced Instruction Set Computer (CRISC). Though the gate scheme is compact, it supports a comprehensive array of quantum operations. This may seem paradoxical but is realizable due to the fundamental differences between quantum and classical computer architectures.
Using our gate scheme, we observe marked improvements across various applications, including generic $n$-qubit gate synthesis, quantum volume, and qubit routing. Furthermore, the proposed scheme also realizes a gate locally equivalent to the commonly used CNOT gate with a gate time of $\frac{π}{2g}$, where $g$ is the two-qubit coupling. The AshN scheme is also completely impervious to $ZZ$ error, the main coherent error in transversely coupled systems, as the control parameters implementing the gates can be easily adjusted to take the $ZZ$ term into account.
Submitted 13 May, 2024; v1 submitted 9 December, 2023;
originally announced December 2023.
-
Corner-to-Center Long-range Context Model for Efficient Learned Image Compression
Authors:
Yang Sui,
Ding Ding,
Xiang Pan,
Xiaozhong Xu,
Shan Liu,
Bo Yuan,
Zhenzhong Chen
Abstract:
In the framework of learned image compression, the context model plays a pivotal role in capturing the dependencies among latent representations. To reduce the decoding time resulting from the serial autoregressive context model, the parallel context model has been proposed as an alternative that necessitates only two passes during the decoding phase, thus facilitating efficient image compression in real-world scenarios. However, performance degradation occurs due to its incomplete causal context. To tackle this issue, we conduct an in-depth analysis of the performance degradation observed in existing parallel context models, focusing on two aspects: the Quantity and Quality of information utilized for context prediction and decoding. Based on such analysis, we propose the \textbf{Corner-to-Center transformer-based Context Model (C$^3$M)} designed to enhance context and latent predictions and improve rate-distortion performance. Specifically, we leverage the logarithmic-based prediction order to predict more context features from corner to center progressively. In addition, to enlarge the receptive field in the analysis and synthesis transformation, we use the Long-range Crossing Attention Module (LCAM) in the encoder/decoder to capture the long-range semantic information by assigning the different window shapes in different channels. Extensive experimental evaluations show that the proposed method is effective and outperforms the state-of-the-art parallel methods. Finally, according to the subjective analysis, we suggest that improving the detailed representation in transformer-based image compression is a promising direction to be explored.
Submitted 29 November, 2023;
originally announced November 2023.