-
Duality via Sequential Quantum Circuit in the Topological Holography Formalism
Authors:
Robijn Vanhove,
Vibhu Ravindran,
David T. Stephen,
Xiao-Gang Wen,
Xie Chen
Abstract:
Two quantum theories which look different but are secretly describing the same low-energy physics are said to be dual to each other. When realized in the Topological Holography formalism, duality corresponds to changing the gapped boundary condition on the top boundary of a topological field theory, which determines the symmetry of the system, while not affecting the bottom boundary where all the dynamics take place. In this paper, we show that duality in the Topological Holography formalism can be realized with a Sequential Quantum Circuit applied to the top boundary. As a consequence, the Hamiltonians before and after the duality mapping have exactly the same spectrum in the corresponding symmetry sectors, and the entanglement in the corresponding low-energy eigenstates differs by at most an area law term.
Submitted 10 September, 2024;
originally announced September 2024.
-
MCDGLN: Masked Connection-based Dynamic Graph Learning Network for Autism Spectrum Disorder
Authors:
Peng Wang,
Xin Wen,
Ruochen Cao,
Chengxin Gao,
Yanrong Hao,
Rui Cao
Abstract:
Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder characterized by complex physiological processes. Previous research has predominantly focused on static cerebral interactions, often neglecting the brain's dynamic nature and the challenges posed by network noise. To address these gaps, we introduce the Masked Connection-based Dynamic Graph Learning Network (MCDGLN). Our approach first segments BOLD signals using sliding temporal windows to capture dynamic brain characteristics. We then employ a specialized weighted edge aggregation (WEA) module, which uses cross convolution with a channel-wise element-wise convolutional kernel, to integrate dynamic functional connectivity and to isolate task-relevant connections. This is followed by topological feature extraction via a hierarchical graph convolutional network (HGCN), with key attributes highlighted by a self-attention module. Crucially, we refine static functional connections using a customized task-specific mask, reducing noise and pruning irrelevant links. The attention-based connection encoder (ACE) then enhances critical connections and compresses static features. The combined features are subsequently used for classification. Applied to the Autism Brain Imaging Data Exchange I (ABIDE I) dataset, our framework achieves a 73.3\% classification accuracy between ASD and Typical Control (TC) groups among 1,035 subjects. The pivotal roles of WEA and ACE in refining connectivity and enhancing classification accuracy underscore their importance in capturing ASD-specific features, offering new insights into the disorder.
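The sliding-window segmentation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `width` and `stride` parameters, and the use of Pearson correlation as the per-window connectivity measure are all assumptions made for illustration.

```python
from math import sqrt


def sliding_windows(series, width, stride):
    """Yield consecutive windows of `width` samples, advancing by `stride`."""
    for start in range(0, len(series) - width + 1, stride):
        yield series[start:start + width]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def dynamic_connectivity(signals, width, stride):
    """Return one region-by-region correlation matrix per temporal window.

    `signals` is a non-empty list of per-region time series of equal length
    (hypothetical stand-ins for BOLD signals).
    """
    n_regions = len(signals)
    per_region = [list(sliding_windows(s, width, stride)) for s in signals]
    n_windows = len(per_region[0])
    matrices = []
    for w in range(n_windows):
        matrix = [[pearson(per_region[i][w], per_region[j][w])
                   for j in range(n_regions)]
                  for i in range(n_regions)]
        matrices.append(matrix)
    return matrices
```

Each returned matrix describes connectivity within one window, so the sequence of matrices tracks how connectivity changes over time; a downstream module would then aggregate or filter these matrices.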
Submitted 9 September, 2024;
originally announced September 2024.
-
Can OOD Object Detectors Learn from Foundation Models?
Authors:
Jiahui Liu,
Xin Wen,
Shizhen Zhao,
Yingxian Chen,
Xiaojuan Qi
Abstract:
Out-of-distribution (OOD) object detection is a challenging task due to the absence of open-set OOD data. Inspired by recent advancements in text-to-image generative models, such as Stable Diffusion, we study the potential of generative models trained on large-scale open-set data to synthesize OOD samples, thereby enhancing OOD object detection. We introduce SyncOOD, a simple data curation method that capitalizes on the capabilities of large foundation models to automatically extract meaningful OOD data from text-to-image generative models. This offers the model access to open-world knowledge encapsulated within off-the-shelf foundation models. The synthetic OOD samples are then employed to augment the training of a lightweight, plug-and-play OOD detector, thus effectively optimizing the in-distribution (ID)/OOD decision boundaries. Extensive experiments across multiple benchmarks demonstrate that SyncOOD significantly outperforms existing methods, establishing new state-of-the-art performance with minimal synthetic data usage.
Submitted 8 September, 2024;
originally announced September 2024.
-
Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic
Authors:
Xin Zheng,
Jie Lou,
Boxi Cao,
Xueru Wen,
Yuqiu Ji,
Hongyu Lin,
Yaojie Lu,
Xianpei Han,
Debing Zhang,
Le Sun
Abstract:
Self-critic has become an important mechanism for enhancing the reasoning performance of LLMs. However, current approaches mainly involve basic prompts without further training, which tend to be over-simplified, leading to limited accuracy. Moreover, there is a lack of in-depth investigation of the relationship between an LLM's ability to critique and its task-solving performance. To address these issues, we propose Critic-CoT, a novel framework that pushes LLMs toward System-2-like critic capability, via a step-wise CoT reasoning format and distant-supervision data construction, without the need for human annotation. Experiments on GSM8K and MATH show that, via filtering out invalid solutions or iterative refinement, our enhanced model boosts task-solving performance, which demonstrates the effectiveness of our method. Further, we find that training on critique and refinement alone improves the generation. We hope our work could shed light on future research on improving the reasoning and critic ability of LLMs.
Submitted 29 August, 2024;
originally announced August 2024.
-
TVG: A Training-free Transition Video Generation Method with Diffusion Models
Authors:
Rui Zhang,
Yaosen Chen,
Yuegen Liu,
Wei Wang,
Xuming Wen,
Hongxia Wang
Abstract:
Transition videos play a crucial role in media production, enhancing the flow and coherence of visual narratives. Traditional methods like morphing often lack artistic appeal and require specialized skills, limiting their effectiveness. Recent advances in diffusion model-based video generation offer new possibilities for creating transitions but face challenges such as poor inter-frame relationship modeling and abrupt content changes. We propose a novel training-free Transition Video Generation (TVG) approach using video-level diffusion models that addresses these limitations without additional training. Our method leverages Gaussian Process Regression ($\mathcal{GPR}$) to model latent representations, ensuring smooth and dynamic transitions between frames. Additionally, we introduce interpolation-based conditional controls and a Frequency-aware Bidirectional Fusion (FBiF) architecture to enhance temporal control and transition reliability. Evaluations on benchmark datasets and custom image pairs demonstrate the effectiveness of our approach in generating high-quality smooth transition videos. The code is provided at https://sobeymil.github.io/tvg.com.
Submitted 23 August, 2024;
originally announced August 2024.
-
SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama
Authors:
Jing Tang,
Quanlu Jia,
Yuqiang Xie,
Zeyu Gong,
Xiang Wen,
Jiayi Zhang,
Yalong Guo,
Guibin Chen,
Jiangping Yang
Abstract:
Generating high-quality shooting scripts containing information such as scene and shot language is essential for short drama script generation. We collect 6,660 popular short drama episodes from the Internet, each with an average of 100 short episodes, and the total number of short episodes is about 80,000, with a total duration of about 2,000 hours and totaling 10 terabytes (TB). We perform keyframe extraction and annotation on each episode to obtain about 10,000,000 shooting scripts. We perform 100 script restorations on the extracted shooting scripts based on our self-developed large short drama generation model SkyReels. This leads to a dataset containing 1,000,000,000 pairs of scripts and shooting scripts for short dramas, called SkyScript-100M. We compare SkyScript-100M with the existing dataset in detail and demonstrate some deeper insights that can be achieved based on SkyScript-100M. Based on SkyScript-100M, researchers can achieve several deeper and more far-reaching script optimization goals, which may drive a paradigm shift in the entire field of text-to-video and significantly advance the field of short drama video generation. The data and code are available at https://github.com/vaew/SkyScript-100M.
Submitted 28 August, 2024; v1 submitted 17 August, 2024;
originally announced August 2024.
-
A Survey of Trojan Attacks and Defenses to Deep Neural Networks
Authors:
Lingxin Jin,
Xianyu Wen,
Wei Jiang,
Jinyu Zhan
Abstract:
Deep Neural Networks (DNNs) have found extensive applications in safety-critical artificial intelligence systems, such as autonomous driving and facial recognition systems. However, recent research has revealed their susceptibility to Neural Network Trojans (NN Trojans) maliciously injected by adversaries. This vulnerability arises due to the intricate architecture and opacity of DNNs, resulting in numerous redundant neurons embedded within the models. Adversaries exploit these vulnerabilities to conceal malicious Trojans within DNNs, thereby causing erroneous outputs and posing substantial threats to the efficacy of DNN-based applications. This article presents a comprehensive survey of Trojan attacks against DNNs and the countermeasure methods employed to mitigate them. Initially, we trace the evolution of the concept from traditional Trojans to NN Trojans, highlighting the feasibility and practicality of generating NN Trojans. Subsequently, we provide an overview of notable works encompassing various attack and defense strategies, facilitating a comparative analysis of their approaches. Through these discussions, we offer constructive insights aimed at refining these techniques. In recognition of the gravity and immediacy of this subject matter, we also assess the feasibility of deploying such attacks in real-world scenarios as opposed to controlled ideal datasets. The potential real-world implications underscore the urgency of addressing this issue effectively.
Submitted 15 August, 2024;
originally announced August 2024.
-
Dihadron azimuthal asymmetry and light-quark dipole moments at the Electron-Ion Collider
Authors:
Xin-Kai Wen,
Bin Yan,
Zhite Yu,
C. -P. Yuan
Abstract:
We propose a novel method to probe light-quark dipole moments by examining the azimuthal asymmetries between a collinear pair of hadrons in semi-inclusive deep inelastic lepton scattering off an unpolarized proton target at the Electron-Ion Collider. These asymmetries provide a means to observe transversely polarized quarks, which arise exclusively from the interference between the dipole and the Standard Model interactions, thereby depending linearly on the dipole couplings. We demonstrate that this novel approach can enhance current constraints on light-quark dipole operators by an order of magnitude, free from contamination of other new physics effects. Furthermore, it allows for a simultaneous determination of both the real and imaginary parts of the dipole couplings, offering a new avenue for investigating potential $CP$-violating effects at high energies.
Submitted 13 August, 2024;
originally announced August 2024.
-
Recovering R-symbols from modular data
Authors:
Siu-Hung Ng,
Eric C Rowell,
Xiao-Gang Wen
Abstract:
Given a premodular category $\mathcal{C}$, we show that its $R$-symbols can be recovered from its $T$-matrix, fusion coefficients and some 2nd generalized Frobenius-Schur indicators. In particular, if $\mathcal{C}$ is modular, its $R$-symbols for a certain gauge choice are completely determined by its modular data.
Submitted 5 August, 2024;
originally announced August 2024.
-
Supernova Polarization Signals From the Interaction with a Dense Circumstellar Disk
Authors:
Xudong Wen,
He Gao,
Yi Yang,
Liangduan Liu,
Shunke Ai,
Zongkai Peng
Abstract:
There is increasing evidence that massive stars may exhibit enhanced mass loss shortly before their terminal explosion. Some of this evidence also indicates that the enhanced circumstellar matter (CSM) is not spherically symmetric. A supernova (SN) interacting with aspherical CSM could induce special polarization signals from multiple radiation components that deviate from spherical symmetry. We investigate the time evolution of the continuum polarization induced by the SN ejecta interacting with a disk/torus-like CSM. Our calculation suggests that the interaction between the SN ejecta and an immediate disk-like CSM with a thin, homogeneous density structure would produce a high continuum polarization, which may reach a peak level of $\sim$15\%. The interplay between the evolving geometry of the emitting regions and the time-variant flux ratio between the polar ejecta and the equatorial CSM interaction may produce a double-peaked feature in the polarization time sequence. A similar trend in the time evolution of the polarization is also found for a radially extended CSM disk that exhibits a wind-like density structure, with an overall relatively lower level of continuum polarization ($<2.5\%$) during the interaction process. We also identify a non-uniform temperature distribution along the radial direction of the CSM disk, which yields a strong wavelength dependence of the continuum polarization. These signatures provide a unique geometric diagnostic to explore the interaction process and the associated extreme mass loss of the progenitors of interacting transients.
Submitted 30 July, 2024;
originally announced July 2024.
-
The Impact of an XAI-Augmented Approach on Binary Classification with Scarce Data
Authors:
Ximing Wen,
Rosina O. Weber,
Anik Sen,
Darryl Hannan,
Steven C. Nesbit,
Vincent Chan,
Alberto Goffi,
Michael Morris,
John C. Hunninghake,
Nicholas E. Villalobos,
Edward Kim,
Christopher J. MacLellan
Abstract:
Point-of-Care Ultrasound (POCUS) is the practice of clinicians conducting and interpreting ultrasound scans right at the patient's bedside. However, the expertise needed to interpret these images is considerable and may not always be present in emergency situations. This reality makes algorithms such as machine learning classifiers extremely valuable to augment human decisions. POCUS devices are becoming available at a reasonable cost in the size of a mobile phone. The challenge of turning POCUS devices into life-saving tools is that interpretation of ultrasound images requires specialist training and experience. Unfortunately, the difficulty of obtaining positive training images represents an important obstacle to building efficient and accurate classifiers. Hence, the problem we investigate is how to increase the accuracy of classifiers trained with scarce data. We hypothesize that training with a few data instances may not suffice for classifiers to generalize, causing them to overfit. Our approach uses an Explainable AI-Augmented approach to help the algorithm learn more from less and potentially help the classifier better generalize.
Submitted 1 July, 2024;
originally announced July 2024.
-
Lattice Energy Reservoir in Metal Halide Perovskites
Authors:
Xiaoming Wen,
Baohua Jia
Abstract:
Metal halide perovskite-based technologies have been rapidly developed during the last decade. However, to date, the answer to the fundamental question of why halide perovskites are superior to conventional semiconductors has remained elusive. Here, we propose a new theory of lattice energy reservoir (LER) in halide perovskites and elucidate that LER can comprehensively impact charge carrier dynamics and thus enhance device performance, from hot carrier cooling, carrier recombination, anomalous upconversion fluorescence, illumination induced fluorescence enhancement (photobrightening), to high efficiency solar cells and light-emitting diodes. An LER is a dynamic nanodomain in halide perovskites with suppressed thermal transport that can accumulate energy from phonon coupling and then feed it back to subgap carriers, resulting in subgap carrier upconversion. The LER directly results in slowed cooling of hot carriers and significantly prolonged carrier recombination, anomalous upconversion fluorescence, usually termed defect tolerance, as well as anomalous ultraslow phenomena including persistent polarization, memory effect, and photobrightening. The LER theory rationalizes the superior optoelectronic properties and device performance and provides a novel physical understanding for anomalous phenomena observed uniquely in halide perovskites.
Submitted 26 June, 2024;
originally announced June 2024.
-
Exactly solvable non-unitary time evolution in quantum critical systems I: Effect of complex spacetime metrics
Authors:
Xueda Wen
Abstract:
In this series of works, we study exactly solvable non-unitary time evolutions in one-dimensional quantum critical systems ranging from quantum quenches to time-dependent drivings. In this part I, we are motivated by the recent works of Kontsevich and Segal [1] and Witten [2] on allowable complex spacetime metrics in quantum field theories. In general, such complex spacetime metrics will lead to non-unitary time evolutions. In this work, we study the universal features of such non-unitary time evolutions based on exactly solvable setups. Various physical quantities including entanglement Hamiltonian and entanglement spectrum, entanglement entropy, and energy density at an arbitrary time can be exactly solved. Due to the damping effect introduced by the complex time, the excitations in the initial state are gradually damped out in time. The non-equilibrium dynamics exhibits universal features that are qualitatively different from the case of real-time evolutions. For instance, for an infinite system after a global quench, the entanglement entropy of the semi-infinite subsystem will grow logarithmically in time, in contrast to the linear growth in a real-time evolution. Moreover, we study numerically the time-dependent driven quantum critical systems with allowable complex spacetime metrics. It is found that the competition between driving and damping leads to a steady state with an interesting entanglement structure.
Submitted 29 July, 2024; v1 submitted 24 June, 2024;
originally announced June 2024.
-
Relighting Scenes with Object Insertions in Neural Radiance Fields
Authors:
Xuening Zhu,
Renjiao Yi,
Xin Wen,
Chenyang Zhu,
Kai Xu
Abstract:
The insertion of objects into a scene and relighting are commonly utilized applications in augmented reality (AR). Previous methods focused on inserting virtual objects using CAD models or real objects from single-view images, resulting in highly limited AR application scenarios. We propose a novel NeRF-based pipeline for inserting object NeRFs into scene NeRFs, enabling novel view synthesis and realistic relighting, supporting physical interactions like casting shadows onto each other, from two sets of images depicting the object and scene. The lighting environment is in a hybrid representation of Spherical Harmonics and Spherical Gaussians, representing both high- and low-frequency lighting components very well, and supporting non-Lambertian surfaces. Specifically, we leverage the benefits of volume rendering and introduce an innovative approach for efficient shadow rendering by comparing the depth maps between the camera view and the light source view and generating vivid soft shadows. The proposed method achieves realistic relighting effects in extensive experimental evaluations.
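The depth comparison behind shadow rendering can be illustrated with a classic shadow-mapping test softened by percentage-closer filtering. This is a generic sketch and not the paper's pipeline: the function name, the `bias` and `radius` parameters, and the grid-of-lists depth map are assumptions made for illustration.

```python
def shadow_factor(light_depth_map, u, v, point_depth, bias=1e-3, radius=1):
    """Fraction of nearby light-view depth samples that occlude a point.

    `light_depth_map[row][col]` holds the depth of the closest surface seen
    from the light at that pixel; `point_depth` is the depth of the shaded
    point in the same light view, projected to pixel (u, v).
    Returns 0.0 for fully lit and 1.0 for fully shadowed; averaging over a
    neighbourhood softens the shadow edge.
    """
    rows, cols = len(light_depth_map), len(light_depth_map[0])
    occluded = 0
    total = 0
    for dv in range(-radius, radius + 1):
        for du in range(-radius, radius + 1):
            r, c = v + dv, u + du
            if 0 <= r < rows and 0 <= c < cols:
                total += 1
                # The point is occluded if the light saw something closer.
                if light_depth_map[r][c] + bias < point_depth:
                    occluded += 1
    return occluded / total if total else 0.0
```

The `bias` term guards against a surface shadowing itself because of depth quantization, and a larger `radius` trades sharper shadow boundaries for smoother ones.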
Submitted 20 June, 2024;
originally announced June 2024.
-
Beam test results of the prototype of the multi wire drift chamber for the CSR external-target experiment
Authors:
Zhi Qin,
Zhoubo He,
Zhe Cao,
Tao Chen,
Zhi Deng,
Limin Duan,
Dong Guo,
Rongjiang Hu,
Jie Kong,
Canwen Liu,
Peng Ma,
Xianglun Wei,
Shihai Wen,
Xiangjie Wen,
Junwei Yan,
Herun Yang,
Zuoqiao Yang,
Yuhong Yu,
Zhigang Xiao
Abstract:
The half-size prototype of the multi wire drift chamber (MWDC) for the cooling storage ring (CSR) external-target experiment (CEE) was assembled and tested in 350 MeV/u Kr+Fe reactions on the heavy ion research facility in Lanzhou (HIRFL). The prototype consists of 6 sense layers, where the sense wires are stretched in three directions X, U and V, meeting $0^\circ$, $30^\circ$ and $-30^\circ$ with respect to the vertical axis, respectively. The sensitive area of the prototype is $76 {\rm cm} \times 76 {\rm cm}$. The amplified and shaped signals from the anode wires are digitized in a serial capacity array. Being operated with 1500 V high voltage on the anode wires, the efficiency for each layer is beyond 95\%. The tracking residual is about $301 \pm 2 \rm μm$. The performance meets the requirements of CEE.
Submitted 15 May, 2024;
originally announced June 2024.
-
Large Language Model as a Universal Clinical Multi-task Decoder
Authors:
Yujiang Wu,
Hongjian Song,
Jiawen Zhang,
Xumeng Wen,
Shun Zheng,
Jiang Bian
Abstract:
The development of effective machine learning methodologies for enhancing the efficiency and accuracy of clinical systems is crucial. Despite significant research efforts, managing a plethora of diversified clinical tasks and adapting to emerging new tasks remain significant challenges. This paper presents a novel paradigm that employs a pre-trained large language model as a universal clinical multi-task decoder. This approach leverages the flexibility and diversity of language expressions to handle task topic variations and associated arguments. The introduction of a new task simply requires the addition of a new instruction template. We validate this framework across hundreds of tasks, demonstrating its robustness in facilitating multi-task predictions, performing on par with traditional multi-task learning and single-task learning approaches. Moreover, it shows exceptional adaptability to new tasks, with impressive zero-shot performance in some instances and superior data efficiency in few-shot scenarios. This novel approach offers a unified solution to manage a wide array of new and emerging tasks in clinical applications.
Submitted 18 June, 2024;
originally announced June 2024.
-
On-Policy Fine-grained Knowledge Feedback for Hallucination Mitigation
Authors:
Xueru Wen,
Xinyu Lu,
Xinyan Guan,
Yaojie Lu,
Hongyu Lin,
Ben He,
Xianpei Han,
Le Sun
Abstract:
Hallucination occurs when large language models (LLMs) exhibit behavior that deviates from the boundaries of their knowledge during the response generation process. Previous learning-based methods focus on detecting knowledge boundaries and finetuning models with instance-level feedback, but they suffer from inaccurate signals due to off-policy data sampling and coarse-grained feedback. In this paper, we introduce Reinforcement Learning for Hallucination (RLFH), a fine-grained feedback-based online reinforcement learning method for hallucination mitigation. Unlike previous learning-based methods, RLFH enables LLMs to explore the boundaries of their internal knowledge and provide on-policy, fine-grained feedback on these explorations. To construct fine-grained feedback for learning reliable generation behavior, RLFH decomposes the outcomes of large models into atomic facts, provides statement-level evaluation signals, and traces back the signals to the tokens of the original responses. Finally, RLFH adopts the online reinforcement algorithm with these token-level rewards to adjust model behavior for hallucination mitigation. For effective on-policy optimization, RLFH also introduces an LLM-based fact assessment framework to verify the truthfulness and helpfulness of atomic facts without human intervention. Experiments on HotpotQA, SQuADv2, and Biography benchmarks demonstrate that RLFH can balance LLMs' usage of internal knowledge during the generation process to eliminate their hallucination behavior.
Submitted 17 June, 2024;
originally announced June 2024.
-
Hierarchy construction for non-abelian fractional quantum Hall states via anyon condensation
Authors:
Carolyn Zhang,
Ashvin Vishwanath,
Xiao-Gang Wen
Abstract:
For a given parent fractional quantum Hall (FQH) state at filling fraction $ν$, the hierarchy construction produces FQH states at nearby filling fractions $\{ν_n\}$ by condensing minimally charged quasiholes or quasiparticles of the parent state into their own FQH states. The hierarchy construction has been useful for relating families of FQH states and for the experimental identification of the topological order of parent states via the presence of daughter states. We reinterpret the hierarchy construction as a two-step procedure: stacking with a second FQH state and condensing a condensable algebra of bosons. This two-step procedure can be applied to both abelian and non-abelian FQH states, and it does not require calculations with a wavefunction. We show this construction reproduces the hierarchies for the Laughlin and Pfaffian states, and can be applied further to propose hierarchies for various non-abelian FQH states.
Submitted 17 June, 2024;
originally announced June 2024.
-
ROSfs: A User-Level File System for ROS
Authors:
Zijun Xu,
Xuanjun Wen,
Yanjie Song,
Shu Yin
Abstract:
We present ROSfs, a novel user-level file system for the Robot Operating System (ROS). ROSfs interprets a robot file as a group of sub-files, with each having a distinct label. ROSfs applies a time index structure to enhance the flexible data query while the data file is under modification. It provides multi-robot systems (MRS) with prompt cross-robot data acquisition and collaboration. We implemented a ROSfs prototype and integrated it into a mainstream ROS platform. We then applied and evaluated ROSfs on real-world UAVs and data servers. Evaluation results show that compared with traditional ROS storage methods, ROSfs improves the offline query performance by up to 129x and reduces inter-robot online data query latency under a wireless network by up to 7x.
Submitted 15 June, 2024;
originally announced June 2024.
-
Tellegen responses in metamaterials
Authors:
Qingdong Yang,
Xinhua Wen,
Zhongfu Li,
Oubo You,
Shuang Zhang
Abstract:
The Tellegen medium has long been a topic of debate, with its existence being contested over several decades. It was first proposed by Tellegen in 1948 and is characterized by a real-valued cross coupling between electric and magnetic responses, distinguishing it from the well-known chiral medium that has imaginary coupling coefficients. Significantly, Tellegen responses are closely linked to axion dynamics, an extensively studied subject in condensed matter physics. Here, we report the realization of Tellegen metamaterials in the microwave region through a judicious combination of subwavelength metallic resonators, gyromagnetic materials, and permanent magnet discs. We observe the key signature of the Tellegen response, i.e., a Kerr rotation for the reflected wave, while the polarization remains the same in the transmission direction. The retrieved effective Tellegen parameter is several orders of magnitude greater than that of natural materials. Our work opens the door to a variety of nonreciprocal photonic devices and may provide a platform for studying axion physics.
Submitted 11 June, 2024;
originally announced June 2024.
-
Covariate balancing with measurement error
Authors:
Xialing Wen,
Ying Yan
Abstract:
In recent years, there has been a growing body of causal inference literature focusing on covariate balancing methods. These methods eliminate observed confounding by equalizing covariate moments between the treated and control groups. The validity of covariate balancing relies on an implicit assumption that all covariates are accurately measured, which is frequently violated in observational studies. Nevertheless, the impact of measurement error on covariate balancing is unclear, and there is no existing work on adequately balancing mismeasured covariates. In this article, we show that naively ignoring measurement error instead increases the magnitude of covariate imbalance and induces bias in treatment effect estimation. We then propose a class of measurement error correction strategies for the existing covariate balancing methods. Theoretically, we show that these strategies successfully recover balance for all covariates and eliminate bias in treatment effect estimation. We assess the proposed correction methods in simulation studies and real data analysis.
Submitted 13 June, 2024;
originally announced June 2024.
-
Personalized Topic Selection Model for Topic-Grounded Dialogue
Authors:
Shixuan Fan,
Wei Wei,
Xiaofei Wen,
Xianling Mao,
Jixiong Chen,
Dangyang Chen
Abstract:
Recently, the topic-grounded dialogue (TGD) system has become increasingly popular owing to its powerful capability to actively guide users to accomplish specific tasks through topic-guided conversations. Most existing works utilize side information (e.g., topics or personas) in isolation to enhance the topic selection ability. However, because they disregard the noise within these auxiliary information sources and their mutual influence, current models tend to predict user-uninteresting and contextually irrelevant topics. To build a user-engaging and coherent dialogue agent, we propose a Personalized topic sElection model for Topic-grounded Dialogue, named PETD, which takes account of the interaction of side information to selectively aggregate such information for more accurately predicting subsequent topics. Specifically, we evaluate the correlation between global topics and personas and selectively incorporate the global topics aligned with user personas. Furthermore, we propose a contrastive learning based persona selector to filter out irrelevant personas under the constraint of lacking pertinent persona annotations. Throughout the selection and generation, diverse relevant side information is considered. Extensive experiments demonstrate that our proposed method can generate engaging and diverse responses, outperforming state-of-the-art baselines across various evaluation metrics.
Submitted 4 June, 2024;
originally announced June 2024.
-
Symmetry enforced solution of the many-body Schrödinger equation with deep neural network
Authors:
Zhe Li,
Zixiang Lu,
Ruichen Li,
Xuelan Wen,
Xiang Li,
Liwei Wang,
Ji Chen,
Weiluo Ren
Abstract:
The integration of deep neural networks with the Variational Monte Carlo (VMC) method has marked a significant advancement in solving the Schrödinger equation. In this work, we enforce spin symmetry in the neural network-based VMC calculation with a modified optimization target. Our method is designed to solve for the ground state and multiple excited states with target spin symmetry at a low computational cost. It predicts accurate energies while maintaining the correct symmetry in strongly correlated systems, even in cases where different spin states are nearly degenerate. Our approach also excels at spin-gap calculations, including the singlet-triplet gap in biradical systems, which is of high interest in photochemistry. Overall, this work establishes a robust framework for efficiently calculating various quantum states with specific spin symmetry in correlated systems, paving the way for novel discoveries in quantum science.
Submitted 3 June, 2024;
originally announced June 2024.
-
GLADformer: A Mixed Perspective for Graph-level Anomaly Detection
Authors:
Fan Xu,
Nan Wang,
Hao Wu,
Xuezhi Wen,
Dalin Zhang,
Siyang Lu,
Binyong Li,
Wei Gong,
Hai Wan,
Xibin Zhao
Abstract:
Graph-Level Anomaly Detection (GLAD) aims to distinguish anomalous graphs within a graph dataset. However, current methods are constrained by their receptive fields, struggling to learn global features within the graphs. Moreover, most contemporary methods are based on the spatial domain and lack exploration of spectral characteristics. In this paper, we propose a multi-perspective hybrid graph-level anomaly detector, namely GLADformer, consisting of two key modules. Specifically, we first design a Graph Transformer module with global spectrum enhancement, which ensures balanced and resilient parameter distributions by fusing global features and spectral distribution characteristics. Furthermore, to uncover local anomalous attributes, we customize a band-pass spectral GNN message passing module that further enhances the model's generalization capability. Through comprehensive experiments on ten real-world datasets from multiple domains, we validate the effectiveness and robustness of GLADformer. This demonstrates that GLADformer outperforms current state-of-the-art models in graph-level anomaly detection, particularly in effectively capturing global anomaly representations and spectral characteristics.
Submitted 3 July, 2024; v1 submitted 2 June, 2024;
originally announced June 2024.
-
Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights
Authors:
Xin Wen,
Bingchen Zhao,
Yilun Chen,
Jiangmiao Pang,
Xiaojuan Qi
Abstract:
Severe data imbalance naturally exists among web-scale vision-language datasets. Despite this, we find CLIP pre-trained thereupon exhibits notable robustness to the data imbalance compared to supervised learning, and demonstrates significant effectiveness in learning generalizable representations. With an aim to investigate the reasons behind this finding, we conduct controlled experiments to study various underlying factors, and reveal that CLIP's pretext task forms a dynamic classification problem wherein only a subset of classes is present in training. This isolates the bias from dominant classes and implicitly balances the learning signal. Furthermore, the robustness and discriminability of CLIP improve with more descriptive language supervision, larger data scale, and broader open-world concepts, which are inaccessible to supervised learning. Our study not only uncovers the mechanisms behind CLIP's generalizability beyond data imbalance but also provides transferable insights for the research community. The findings are validated in both supervised and self-supervised learning, enabling models trained on imbalanced data to achieve CLIP-level performance on diverse recognition tasks. Code and data are available at: https://github.com/CVMI-Lab/clip-beyond-tail.
Submitted 14 June, 2024; v1 submitted 31 May, 2024;
originally announced May 2024.
-
PillarHist: A Quantization-aware Pillar Feature Encoder based on Height-aware Histogram
Authors:
Sifan Zhou,
Zhihang Yuan,
Dawei Yang,
Xubin Wen,
Xing Hu,
Yuguang Shi,
Ziyu Zhao,
Xiaobo Lu
Abstract:
Real-time and high-performance 3D object detection plays a critical role in autonomous driving and robotics. Recent pillar-based 3D object detectors have gained significant attention due to their compact representation and low computational overhead, making them suitable for onboard deployment and quantization. However, existing pillar-based detectors still suffer from information loss along the height dimension and large numerical distribution differences during pillar feature encoding (PFE), which severely limit their performance and quantization potential. To address these issues, we first unveil the importance of different input information during PFE and identify the height dimension as a key factor in enhancing 3D detection performance. Motivated by this observation, we propose a height-aware pillar feature encoder named PillarHist. Specifically, PillarHist computes statistics of the discrete distribution of points at different heights within one pillar. This simple yet effective design greatly preserves the information along the height dimension while significantly reducing the computation overhead of the PFE. Meanwhile, PillarHist also constrains the arithmetic distribution of PFE input to a stable range, making it quantization-friendly. Notably, PillarHist operates exclusively within the PFE stage to enhance performance, enabling seamless integration into existing pillar-based methods without introducing complex operations. Extensive experiments show the effectiveness of PillarHist in terms of both efficiency and performance.
Submitted 28 May, 2024;
originally announced May 2024.
-
Low-Resource Crop Classification from Multi-Spectral Time Series Using Lossless Compressors
Authors:
Wei Cheng,
Hongrui Ye,
Xiao Wen,
Jiachen Zhang,
Jiping Xu,
Feifan Zhang
Abstract:
Deep learning has significantly improved the accuracy of crop classification using multispectral temporal data. However, these models have complex structures with numerous parameters, requiring large amounts of data and costly training. In low-resource situations with fewer labeled samples, deep learning models perform poorly due to insufficient data. Conversely, compressors are data-type agnostic, and non-parametric methods do not bring underlying assumptions. Inspired by this insight, we propose a non-training alternative to deep learning models, aiming to address these situations. Specifically, the Symbolic Representation Module is proposed to convert the reflectivity into symbolic representations. The symbolic representations are then cross-transformed in both the channel and time dimensions to generate symbolic embeddings. Next, the Multi-scale Normalised Compression Distance (MNCD) is designed to measure the correlation between any two symbolic embeddings. Finally, based on the MNCDs, high quality crop classification can be achieved using only a k-nearest-neighbor (kNN) classifier. The entire framework is ready-to-use and lightweight. Without any training, it outperformed, on average, 7 advanced deep learning models trained at scale on three benchmark datasets. It also outperforms more than half of these models in the few-shot setting with sparse crop labels. Therefore, the high performance and robustness of our non-training framework make it truly applicable to real-world crop mapping. Codes are available at: https://github.com/qinfengsama/Compressor-Based-Crop-Mapping.
Submitted 5 July, 2024; v1 submitted 28 May, 2024;
originally announced May 2024.
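The compressor-plus-kNN idea in the abstract above can be illustrated with the standard single-scale normalised compression distance (NCD). This is only a minimal sketch under stated assumptions: it uses zlib as the compressor and plain NCD, not the authors' multi-scale MNCD or their Symbolic Representation Module, and the byte strings and labels are hypothetical toy data.

```python
import zlib
from collections import Counter


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compression level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Standard normalised compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def knn_predict(query: bytes, train: list, k: int = 1) -> str:
    """Majority vote among the k training items closest to the query under NCD."""
    nearest = sorted(train, key=lambda item: ncd(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy "symbolic embeddings": one highly regular, one incompressible.
train = [
    (b"ab" * 100, "crop_A"),
    (bytes(range(256)), "crop_B"),
]
prediction = knn_predict(b"ab" * 100, train, k=1)
```

No parameters are fitted anywhere in this sketch; the only choices are the compressor and k, which is what makes this family of methods attractive when labels are scarce.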
-
DSPO: An End-to-End Framework for Direct Sorted Portfolio Construction
Authors:
Jianyuan Zhong,
Zhijian Xu,
Saizhuo Wang,
Xiangyu Wen,
Jian Guo,
Qiang Xu
Abstract:
In quantitative investment, constructing characteristic-sorted portfolios is a crucial strategy for asset allocation. Traditional methods transform raw stock data of varying frequencies into predictive characteristic factors for asset sorting, often requiring extensive manual design and suffering from misalignment between prediction and optimization goals. To address these challenges, we introduce Direct Sorted Portfolio Optimization (DSPO), an innovative end-to-end framework that efficiently processes raw stock data to construct sorted portfolios directly. DSPO's neural network architecture seamlessly transitions stock data from input to output while effectively modeling the intra-dependency of time-steps and inter-dependency among all tradable stocks. Additionally, we incorporate a novel Monotonical Logistic Regression loss, which directly maximizes the likelihood of constructing optimal sorted portfolios. To the best of our knowledge, DSPO is the first method capable of handling market cross-sections with thousands of tradable stocks fully end-to-end from raw multi-frequency data. Empirical results demonstrate DSPO's effectiveness, yielding a RankIC of 10.12% and an accumulated return of 121.94% on the New York Stock Exchange in 2023-2024, and a RankIC of 9.11% with a return of 108.74% in other markets during 2021-2022.
Submitted 24 May, 2024;
originally announced May 2024.
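The abstract above reports RankIC without defining it. A common convention, assumed here rather than taken from the paper, is the Spearman rank correlation between predicted scores and subsequently realized returns over one cross-section of stocks. The sketch below computes that quantity for a single cross-section; the function names and the sample numbers are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def rank_ic(scores, returns):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(scores), average_ranks(returns))


# Hypothetical cross-section: model scores and next-period returns for four stocks.
ic = rank_ic([0.9, 0.1, 0.4, 0.7], [0.03, -0.01, 0.02, 0.01])
```

In practice such a value is usually computed per trading period and then averaged over time; whether the paper does exactly that is not stated in the abstract.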
-
RadarOcc: Robust 3D Occupancy Prediction with 4D Imaging Radar
Authors:
Fangqiang Ding,
Xiangyu Wen,
Lawrence Zhu,
Yiming Li,
Chris Xiaoxuan Lu
Abstract:
3D occupancy-based perception pipeline has significantly advanced autonomous driving by capturing detailed scene descriptions and demonstrating strong generalizability across various object categories and shapes. Current methods predominantly rely on LiDAR or camera inputs for 3D occupancy prediction. These methods are susceptible to adverse weather conditions, limiting the all-weather deployment of self-driving cars. To improve perception robustness, we leverage the recent advances in automotive radars and introduce a novel approach that utilizes 4D imaging radar sensors for 3D occupancy prediction. Our method, RadarOcc, circumvents the limitations of sparse radar point clouds by directly processing the 4D radar tensor, thus preserving essential scene details. RadarOcc innovatively addresses the challenges associated with the voluminous and noisy 4D radar data by employing Doppler bins descriptors, sidelobe-aware spatial sparsification, and range-wise self-attention mechanisms. To minimize the interpolation errors associated with direct coordinate transformations, we also devise a spherical-based feature encoding followed by spherical-to-Cartesian feature aggregation. We benchmark various baseline methods based on distinct modalities on the public K-Radar dataset. The results demonstrate RadarOcc's state-of-the-art performance in radar-based 3D occupancy prediction and promising results even when compared with LiDAR- or camera-based methods. Additionally, we present qualitative evidence of the superior performance of 4D radar in adverse weather conditions and explore the impact of key pipeline components through ablation studies.
Submitted 13 June, 2024; v1 submitted 22 May, 2024;
originally announced May 2024.
-
Beyond Trend and Periodicity: Guiding Time Series Forecasting with Textual Cues
Authors:
Zhijian Xu,
Yuxuan Bian,
Jianyuan Zhong,
Xiangyu Wen,
Qiang Xu
Abstract:
This work introduces a novel Text-Guided Time Series Forecasting (TGTSF) task. By integrating textual cues, such as channel descriptions and dynamic news, TGTSF addresses the critical limitations of traditional methods that rely purely on historical data. To support this task, we propose TGForecaster, a robust baseline model that fuses textual cues and time series data using cross-attention mechanisms. We then present four meticulously curated benchmark datasets to validate the proposed framework, ranging from simple periodic data to complex, event-driven fluctuations. Our comprehensive evaluations demonstrate that TGForecaster consistently achieves state-of-the-art performance, highlighting the transformative potential of incorporating textual information into time series forecasting. This work not only pioneers a novel forecasting task but also establishes a new benchmark for future research, driving advancements in multimodal data integration for time series models.
Submitted 24 May, 2024; v1 submitted 22 May, 2024;
originally announced May 2024.
-
Bridging the Gap Between Domain-specific Frameworks and Multiple Hardware Devices
Authors:
Xu Wen,
Wanling Gao,
Lei Wang,
Jianfeng Zhan
Abstract:
The rapid development of domain-specific frameworks has presented us with a significant challenge: The current approach of implementing solutions on a case-by-case basis incurs a theoretical complexity of O(M*N), thereby increasing the cost of porting applications to different hardware platforms. To address these challenges, we propose a systematic methodology that effectively bridges the gap between domain-specific frameworks and multiple hardware devices, reducing porting complexity to O(M+N). The approach utilizes multi-layer abstractions. Different domain-specific abstractions are employed to represent applications from various domains. These abstractions are then transformed into a unified abstraction, which is subsequently translated into combinations of primitive operators. Finally, these operators are mapped to multiple hardware platforms. The implemented unified framework supports deep learning, classical machine learning, and data analysis across X86, ARM, RISC-V, IoT devices, and GPU. It outperforms existing solutions like scikit-learn, hummingbird, Spark, and pandas, achieving impressive speedups: 1.1x to 3.83x on X86 servers, 1.06x to 4.33x on ARM IoT devices, 1.25x to 3.72x on RISC-V IoT devices, and 1.93x on GPU. The source code is available at https://github.com/BenchCouncil/bridger.git.
Submitted 21 May, 2024;
originally announced May 2024.
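The O(M*N) versus O(M+N) claim in the abstract above is a counting argument, and a small sketch may make it concrete. The domain and device lists are taken from the abstract; the function names and the single "unified-abstraction" label are hypothetical stand-ins, not the project's actual design.

```python
def pairwise_ports(frameworks, devices):
    """Case-by-case porting: one implementation per (framework, device) pair."""
    return [(f, d) for f in frameworks for d in devices]


def unified_ports(frameworks, devices, unified="unified-abstraction"):
    """Porting through a shared layer: one front end per framework,
    plus one back end per device."""
    front_ends = [(f, unified) for f in frameworks]
    back_ends = [(unified, d) for d in devices]
    return front_ends + back_ends


frameworks = ["deep learning", "classical machine learning", "data analysis"]
devices = ["X86", "ARM", "RISC-V", "IoT", "GPU"]

case_by_case = len(pairwise_ports(frameworks, devices))  # M * N pairs
via_unified = len(unified_ports(frameworks, devices))    # M + N adapters
```

With M = 3 domains and N = 5 device classes the gap is modest, but it widens as either list grows, which is the motivation the abstract gives for the multi-layer abstraction.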
-
Red Teaming Language Models for Contradictory Dialogues
Authors:
Xiaofei Wen,
Bangzheng Li,
Tenghao Huang,
Muhao Chen
Abstract:
Most language models currently available are prone to self-contradiction during dialogues. To mitigate this issue, this study explores a novel contradictory dialogue processing task that aims to detect and modify contradictory statements in a conversation. This task is inspired by research on context faithfulness and dialogue comprehension, which have demonstrated that the detection and understanding of contradictions often necessitate detailed explanations. We develop a dataset comprising contradictory dialogues, in which one side of the conversation contradicts itself. Each dialogue is accompanied by an explanatory label that highlights the location and details of the contradiction. With this dataset, we present a Red Teaming framework for contradictory dialogue processing. The framework detects and attempts to explain the dialogue, then modifies the existing contradictory content using the explanation. Our experiments demonstrate that the framework improves the ability to detect contradictory dialogues and provides valid explanations. Additionally, it showcases distinct capabilities for modifying such dialogues. Our study highlights the importance of the logical inconsistency problem in conversational AI.
Submitted 16 May, 2024; v1 submitted 16 May, 2024;
originally announced May 2024.
-
Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning
Authors:
Xiaoyu Wen,
Chenjia Bai,
Kang Xu,
Xudong Yu,
Yang Zhang,
Xuelong Li,
Zhen Wang
Abstract:
Cross-domain offline reinforcement learning leverages source domain data with diverse transition dynamics to alleviate the data requirement for the target domain. However, simply merging the data of two domains leads to performance degradation due to the dynamics mismatch. Existing methods address this problem by measuring the dynamics gap via domain classifiers while relying on the assumptions of the transferability of paired domains. In this paper, we propose a novel representation-based approach to measure the domain gap, where the representation is learned through a contrastive objective by sampling transitions from different domains. We show that such an objective recovers the mutual-information gap of transition functions in two domains without suffering from the unbounded issue of the dynamics gap in handling significantly different domains. Based on the representations, we introduce a data filtering algorithm that selectively shares transitions from the source domain according to the contrastive score functions. Empirical results on various tasks demonstrate that our method achieves superior performance, using only 10% of the target data to achieve 89.2% of the performance on 100% target dataset with state-of-the-art methods.
Submitted 9 May, 2024;
originally announced May 2024.
-
Quantum Phases and Transitions in Spin Chains with Non-Invertible Symmetries
Authors:
Arkya Chatterjee,
Ömer M. Aksoy,
Xiao-Gang Wen
Abstract:
Generalized symmetries often appear in the form of emergent symmetries in low energy effective descriptions of quantum many-body systems. Non-invertible symmetries are a particularly exotic class of generalized symmetries, in that they are implemented by transformations that do not form a group. Such symmetries appear generically in gapless states of quantum matter, constraining the low-energy dynamics. To provide a UV-complete description of such symmetries, it is useful to construct lattice models that respect these symmetries exactly. In this paper, we discuss two families of one-dimensional lattice Hamiltonians with finite on-site Hilbert spaces: one with (invertible) $S^{\,}_3$ symmetry and the other with non-invertible $\mathsf{Rep}(S^{\,}_3)$ symmetry. Our models are largely analytically tractable and demonstrate all possible spontaneous symmetry breaking patterns of these symmetries. Moreover, we use numerical techniques to study the nature of continuous phase transitions between the different symmetry-breaking gapped phases associated with both symmetries. Both models have self-dual lines, where the models are enriched by so-called intrinsically non-invertible symmetries generated by Kramers-Wannier-like duality transformations. We provide explicit lattice operators that generate these non-invertible self-duality symmetries. We show that the enhanced symmetry at the self-dual lines is described by a 2+1D symmetry-topological-order (SymTO) of type $\mathrm{JK}^{\,}_4\boxtimes \overline{\mathrm{JK}}^{\,}_4$. The condensable algebras of the SymTO determine the allowed gapped and gapless states of the self-dual $S^{\,}_3$-symmetric and $\mathsf{Rep}(S^{\,}_3)$-symmetric models.
Submitted 14 July, 2024; v1 submitted 8 May, 2024;
originally announced May 2024.
-
Higher Berry Curvature from the Wave function II: Locally Parameterized States Beyond One Dimension
Authors:
Ophelia Evelyn Sommer,
Ashvin Vishwanath,
Xueda Wen
Abstract:
We propose a systematic wave function based approach to construct topological invariants for families of lattice systems that are short-range entangled using local parameter spaces. This construction is particularly suitable when given a family of tensor networks that can be viewed as the ground states of $d$ dimensional lattice systems, for which we construct the closed $(d+2)$-form higher Berry curvature, which is a generalization of the well known 2-form Berry curvature. Such $(d+2)$-form higher Berry curvature characterizes a flow of $(d+1)$-form higher Berry curvature in the system. Our construction is equally suitable for constructing other higher pumps, such as the (higher) Thouless pump in the presence of a global on-site $U(1)$ symmetry, which corresponds to a closed $d$-form. The cohomology classes of such higher differential forms are topological invariants and are expected to be quantized for short-range entangled states. We illustrate our construction with exactly solvable lattice models that are in nontrivial higher Berry classes in $d=2$.
Submitted 8 May, 2024;
originally announced May 2024.
-
Higher Berry Curvature from the Wave Function I: Schmidt Decomposition and Matrix Product States
Authors:
Ophelia Evelyn Sommer,
Xueda Wen,
Ashvin Vishwanath
Abstract:
Higher Berry curvature (HBC) is the proposed generalization of Berry curvature to infinitely extended systems. Heuristically, HBC captures the flow of local Berry curvature in a system. Here we provide a simple formula for computing the HBC for extended $d = 1$ systems at the level of wave functions using the Schmidt decomposition. We also find a corresponding formula for matrix product states (MPS), and show that for translationally invariant MPS this gives rise to a quantized invariant. We demonstrate our approach with an exactly solvable model and numerical calculations for generic models using iDMRG.
Submitted 8 May, 2024;
originally announced May 2024.
-
3D Extended Object Tracking by Fusing Roadside Sparse Radar Point Clouds and Pixel Keypoints
Authors:
Jiayin Deng,
Zhiqun Hu,
Yuxuan Xia,
Zhaoming Lu,
Xiangming Wen
Abstract:
Roadside perception is a key component in intelligent transportation systems. In this paper, we present a novel three-dimensional (3D) extended object tracking (EOT) method, which simultaneously estimates the object kinematics and extent state, in roadside perception using both the radar and camera data. Because of the influence of sensor viewing angle and limited angle resolution, radar measurements from objects are sparse and non-uniformly distributed, leading to inaccuracies in object extent and position estimation. To address this problem, we present a novel spherical Gaussian function weighted Gaussian mixture model. This model assumes that radar measurements originate from a series of probabilistic weighted radar reflectors on the vehicle's extent. Additionally, we utilize visual detection of vehicle keypoints to provide additional information on the positions of radar reflectors. Since keypoints may not always correspond to radar reflectors, we propose an elastic skeleton fusion mechanism, which constructs a virtual force to establish the relationship between the radar reflectors on the vehicle and its extent. Furthermore, to better describe the kinematic state of the vehicle and constrain its extent state, we develop a new 3D constant turn rate and velocity motion model, considering the complex 3D motion of the vehicle relative to the roadside sensor. Finally, we apply variational Bayesian approximation to the intractable measurement update step to enable recursive Bayesian estimation of the object's state. Simulation results using the Carla simulator and experimental results on the nuScenes dataset demonstrate the effectiveness and superiority of the proposed method in comparison to several state-of-the-art 3D EOT methods.
Submitted 27 April, 2024;
originally announced April 2024.
-
Interpretable Clustering with the Distinguishability Criterion
Authors:
Ali Turfah,
Xiaoquan Wen
Abstract:
Cluster analysis is a popular unsupervised learning tool used in many disciplines to identify heterogeneous sub-populations within a sample. However, validating cluster analysis results and determining the number of clusters in a data set remains an outstanding problem. In this work, we present a global criterion called the Distinguishability criterion to quantify the separability of identified clusters and validate inferred cluster configurations. Our computational implementation of the Distinguishability criterion corresponds to the Bayes risk of a randomized classifier under the 0-1 loss. We propose a combined loss function-based computational framework that integrates the Distinguishability criterion with many commonly used clustering procedures, such as hierarchical clustering, k-means, and finite mixture models. We present these new algorithms as well as the results from comprehensive data analysis based on simulation studies and real data applications.
Submitted 25 April, 2024; v1 submitted 24 April, 2024;
originally announced April 2024.
-
VulEval: Towards Repository-Level Evaluation of Software Vulnerability Detection
Authors:
Xin-Cheng Wen,
Xinchen Wang,
Yujia Chen,
Ruida Hu,
David Lo,
Cuiyun Gao
Abstract:
Deep Learning (DL)-based methods have proven to be effective for software vulnerability detection, with a potential for substantial productivity enhancements for detecting vulnerabilities. Current methods mainly focus on detecting single functions (i.e., intra-procedural vulnerabilities), ignoring the more complex inter-procedural vulnerability detection scenarios in practice. For example, developers routinely engage with program analysis to detect vulnerabilities that span multiple functions within repositories. In addition, the widely-used benchmark datasets generally contain only intra-procedural vulnerabilities, leaving the assessment of inter-procedural vulnerability detection capabilities unexplored.
To mitigate these issues, we propose a repository-level evaluation system, named VulEval, which aims to evaluate the detection performance of inter- and intra-procedural vulnerabilities simultaneously. Specifically, VulEval consists of three interconnected evaluation tasks: (1) Function-Level Vulnerability Detection, aiming at detecting intra-procedural vulnerability given a code snippet; (2) Vulnerability-Related Dependency Prediction, aiming at retrieving the most relevant dependencies from call graphs for providing developers with explanations about the vulnerabilities; and (3) Repository-Level Vulnerability Detection, aiming at detecting inter-procedural vulnerabilities by combining with the dependencies identified in the second task. VulEval also includes a large-scale dataset, with a total of 4,196 CVE entries, 232,239 functions, and corresponding 4,699 repository-level source code in C/C++ programming languages. Our analysis highlights the current progress and future directions for software vulnerability detection.
Submitted 23 April, 2024;
originally announced April 2024.
-
Out-of-plane orientated self-trapped excitons enabled polarized light guiding in 2D perovskites
Authors:
Junze Li,
Junchao Hu,
Ting Luo,
Dongliang Chen,
Yingying Chen,
Zeyi Liu,
Dingshan Gao,
Xinglin Wen,
Dehui Li
Abstract:
Active optical waveguides combine a light source and a waveguide in a single component, which is essential for integrated photonic chips. Although optical waveguides based on 1D luminescent materials have been extensively investigated, 2D waveguides allow photons to flow within a plane and serve as an ideal component for ultracompact photonic circuits. Nevertheless, light guiding in 2D planar structures normally relies on the precise control of molecular orientation, which is complicated and gives a low yield. Here, we report a strategy to guide polarized light in 2D microflakes by making use of the out-of-plane (OP) orientation of self-trapped excitons in as-synthesized 2D perovskite microplates. A space-confined crystallization method is developed to synthesize 2D perovskite microflakes whose room-temperature emission is dominated by broad self-trapped exciton emission, which is highly OP oriented with an OP component of over 85%. Taking advantage of the negligible absorption coefficient and improved coupling efficiency of OP-oriented self-trapped exciton emission to the planar waveguide mode of the as-synthesized perovskite microflakes, we have achieved broadband polarized light guiding with a full width at half maximum of over 120 nm. Our findings provide a promising platform for the development of ultracompact photonic circuits.
Submitted 7 April, 2024;
originally announced April 2024.
-
A Generative Deep Learning Approach for Crash Severity Modeling with Imbalanced Data
Authors:
Junlan Chen,
Ziyuan Pu,
Nan Zheng,
Xiao Wen,
Hongliang Ding,
Xiucheng Guo
Abstract:
Crash data is often greatly imbalanced, with the majority of crashes being non-fatal and only a small number being fatal. This data imbalance poses a challenge for crash severity modeling, since models struggle to fit and interpret fatal crash outcomes with very limited samples. Usually, such data imbalance issues are addressed by data resampling methods, such as under-sampling and over-sampling techniques. However, most traditional and deep learning-based data resampling methods, such as the synthetic minority oversampling technique (SMOTE) and Generative Adversarial Networks (GANs), are designed specifically for processing continuous variables. Though some resampling methods have been improved to handle both continuous and discrete variables, they may have difficulties in dealing with the collapse issue associated with sparse discrete risk factors. Moreover, there is a lack of comprehensive studies that compare the performance of various resampling methods in crash severity modeling. To address the aforementioned issues, the current study proposes a crash data generation method based on the Conditional Tabular GAN. After data balancing, a crash severity model is employed to estimate the performance of classification and interpretation. A comparative study is conducted to assess classification accuracy and distribution consistency of the proposed generation method using a 4-year imbalanced crash dataset collected in Washington State, U.S. Additionally, Monte Carlo simulation is employed to estimate the performance of parameter and probability estimation in both two- and three-class imbalance scenarios. The results indicate that using synthetic data generated by CTGAN-RU for crash severity modeling outperforms using original data or synthetic data generated by other resampling methods.
Submitted 2 April, 2024;
originally announced April 2024.
-
SCALE: Constructing Structured Natural Language Comment Trees for Software Vulnerability Detection
Authors:
Xin-Cheng Wen,
Cuiyun Gao,
Shuzheng Gao,
Yang Xiao,
Michael R. Lyu
Abstract:
Recently, there has been a growing interest in automatic software vulnerability detection. Pre-trained model-based approaches have demonstrated better performance than other Deep Learning (DL)-based approaches in detecting vulnerabilities. However, the existing pre-trained model-based approaches generally employ code sequences as input during prediction, and may ignore vulnerability-related structural information, as reflected in the following two aspects. First, they tend to fail to infer the semantics of code statements with complex logic, such as those containing multiple operators and pointers. Second, they struggle to comprehend the various code execution sequences, which is essential for precise vulnerability detection.
To mitigate these challenges, we propose a Structured Natural Language Comment tree-based vulnerAbiLity dEtection framework based on pre-trained models, named SCALE. The proposed Structured Natural Language Comment Tree (SCT) integrates the semantics of code statements with code execution sequences based on the Abstract Syntax Trees (ASTs). Specifically, SCALE comprises three main modules: (1) Comment Tree Construction, which aims at enhancing the model's ability to infer the semantics of code statements by first incorporating Large Language Models (LLMs) for comment generation and then adding the comment node to ASTs; (2) Structured Natural Language Comment Tree Construction, which aims at explicitly involving the code execution sequence by combining the code syntax templates with the comment tree; and (3) SCT-Enhanced Representation, which finally incorporates the constructed SCTs to capture vulnerability patterns well.
Submitted 27 March, 2024;
originally announced March 2024.
-
Observation of spectral lines in the exceptional GRB 221009A
Authors:
Yan-Qiu Zhang,
Shao-Lin Xiong,
Ji-Rong Mao,
Shuang-Nan Zhang,
Wang-Chen Xue,
Chao Zheng,
Jia-Cong Liu,
Zhen Zhang,
Xi-Lu Wang,
Ming-Yu Ge,
Shu-Xu Yi,
Li-Ming Song,
Zheng-Hua An,
Ce Cai,
Xin-Qiao Li,
Wen-Xi Peng,
Wen-Jun Tan,
Chen-Wei Wang,
Xiang-Yang Wen,
Yue Wang,
Shuo Xiao,
Fan Zhang,
Peng Zhang,
Shi-Jie Zheng
Abstract:
As the brightest gamma-ray burst ever observed, GRB 221009A provided a precious opportunity to explore spectral line features. In this paper, we performed a comprehensive spectroscopy analysis of GRB 221009A jointly with GECAM-C and Fermi/GBM data to search for emission and absorption lines. For the first time we investigated the line feature throughout this GRB, including the brightest part where many instruments suffered problems, and identified prominent emission lines in multiple time intervals. The central energy of the Gaussian emission line evolves from about 37 MeV to 6 MeV, with a nearly constant ratio (about 10\%) between the line width and central energy. In particular, we find that both the central energy and the energy flux of the emission line evolve with time as a power-law decay, with power-law indices of -1 and -2 respectively. We suggest that the observed emission lines most likely originate from the blue-shifted electron-positron pair annihilation 511 keV line. We find that a standard high-latitude emission scenario cannot fully interpret the observation, so we propose that the emission line comes from dense clumps with electron-positron pairs traveling together with the jet. In this scenario, we can use the emission line to directly measure, for the first time, the bulk Lorentz factor of the jet ($Γ$) and reveal its time evolution (i.e. $Γ\sim t^{-1}$) during the prompt emission. Interestingly, we find that the flux of the annihilation line in the co-moving frame remains constant. These discoveries of the spectral line features shed new and important light on the physics of GRBs and relativistic jets.
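The two decay laws quoted in the abstract can be combined into a single relation between line flux and line energy. The block below is only a sketch of that arithmetic: the symbols $E_c$, $F$, and the reference values $E_0$, $F_0$, $t_0$ are notation introduced here for illustration, not symbols taken from the paper, and a common time origin for both power laws is assumed.

```latex
\begin{align}
  E_c(t) &= E_0 \left(\frac{t}{t_0}\right)^{-1}, &
  F(t)   &= F_0 \left(\frac{t}{t_0}\right)^{-2}
  \quad\Longrightarrow\quad
  \frac{F(t)}{F_0} = \left(\frac{E_c(t)}{E_0}\right)^{2}.
\end{align}
```

Under these assumptions, the observed line flux scales as the square of the central energy.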
Submitted 28 May, 2024; v1 submitted 19 March, 2024;
originally announced March 2024.
-
InTeX: Interactive Text-to-texture Synthesis via Unified Depth-aware Inpainting
Authors:
Jiaxiang Tang,
Ruijie Lu,
Xiaokang Chen,
Xiang Wen,
Gang Zeng,
Ziwei Liu
Abstract:
Text-to-texture synthesis has become a new frontier in 3D content creation thanks to the recent advances in text-to-image models. Existing methods primarily adopt a combination of pretrained depth-aware diffusion and inpainting models, yet they exhibit shortcomings such as 3D inconsistency and limited controllability. To address these challenges, we introduce InTeX, a novel framework for interactive text-to-texture synthesis. 1) InTeX includes a user-friendly interface that facilitates interaction and control throughout the synthesis process, enabling region-specific repainting and precise texture editing. 2) Additionally, we develop a unified depth-aware inpainting model that integrates depth information with inpainting cues, effectively mitigating 3D inconsistencies and improving generation speed. Through extensive experiments, our framework has proven to be both practical and effective in text-to-texture synthesis, paving the way for high-quality 3D content creation.
Submitted 18 March, 2024;
originally announced March 2024.
-
ThermoHands: A Benchmark for 3D Hand Pose Estimation from Egocentric Thermal Images
Authors:
Fangqiang Ding,
Lawrence Zhu,
Xiangyu Wen,
Gaowen Liu,
Chris Xiaoxuan Lu
Abstract:
In this work, we present ThermoHands, a new benchmark for thermal image-based egocentric 3D hand pose estimation, aimed at overcoming challenges like varying lighting conditions and obstructions (e.g., handwear). The benchmark includes a multi-view and multi-spectral dataset collected from 28 subjects performing hand-object and hand-virtual interactions under diverse scenarios, accurately annotated with 3D hand poses through an automated process. We introduce a new baseline method, TherFormer, utilizing dual transformer modules for effective egocentric 3D hand pose estimation in thermal imagery. Our experimental results highlight TherFormer's leading performance and affirm thermal imaging's effectiveness in enabling robust 3D hand pose estimation in adverse conditions.
Submitted 13 June, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
-
Springer correspondence and mirror symmetries for parabolic Hitchin systems
Authors:
Bin Wang,
Xueqing Wen,
Yaoxiong Wen
Abstract:
We prove the Strominger--Yau--Zaslow and topological mirror symmetries for parabolic Hitchin systems of types B and C. In contrast to type A, a geometric reinterpretation of Springer duality is necessary. Furthermore, unlike Hitchin's construction in the non-parabolic case, the map between generic fibers in types B and C needs more analysis due to the change of partitions of Springer dual nilpotent orbits, which is the main difficulty in this article. To tackle this challenge, we first construct and study the geometry of the generic Hitchin fibers of moduli spaces of Higgs bundles associated to the nilpotent orbit closures. Then we study their relation with the generic Hitchin fibers of parabolic Hitchin systems. Along the way, we establish intriguing connections between Springer duality, Kazhdan--Lusztig maps, and singularities of spectral curves, and uncover a new geometric interpretation of Lusztig's canonical quotient.
Submitted 12 March, 2024;
originally announced March 2024.
-
Performance Bounds for Passive Sensing in Asynchronous ISAC Systems -- Appendices
Authors:
Jingbo Zhao,
Zhaoming Lu,
J. Andrew Zhang,
Weicai Li,
Yifeng Xiong,
Zijun Han,
Xiangming Wen,
Tao Gu
Abstract:
This document contains the appendices for our paper titled "Performance Bounds for Passive Sensing in Asynchronous ISAC Systems." The appendices include rigorous derivations of key formulas, detailed proofs of the theorems and propositions introduced in the paper, and details of the algorithm tested in the numerical simulation for validation. These appendices aim to support and elaborate on the findings and methodologies presented in the main text. All external references to equations, theorems, and so forth, are directed towards the corresponding elements within the main paper.
Submitted 29 March, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Interpretable Models for Detecting and Monitoring Elevated Intracranial Pressure
Authors:
Darryl Hannan,
Steven C. Nesbit,
Ximing Wen,
Glen Smith,
Qiao Zhang,
Alberto Goffi,
Vincent Chan,
Michael J. Morris,
John C. Hunninghake,
Nicholas E. Villalobos,
Edward Kim,
Rosina O. Weber,
Christopher J. MacLellan
Abstract:
Detecting elevated intracranial pressure (ICP) is crucial in diagnosing and managing various neurological conditions. These fluctuations in pressure are transmitted to the optic nerve sheath (ONS), resulting in changes to its diameter, which can then be detected using ultrasound imaging devices. However, interpreting sonographic images of the ONS can be challenging. In this work, we propose two systems that actively monitor the ONS diameter throughout an ultrasound video and make a final prediction as to whether ICP is elevated. To construct our systems, we leverage subject matter expert (SME) guidance, structuring our processing pipeline according to their collection procedure, while also prioritizing interpretability and computational efficiency. We conduct a number of experiments, demonstrating that our proposed systems are able to outperform various baselines. One of our SMEs then manually validates our top system's performance, lending further credibility to our approach while demonstrating its potential utility in a clinical setting.
Submitted 4 March, 2024;
originally announced March 2024.
-
Classes Are Not Equal: An Empirical Study on Image Recognition Fairness
Authors:
Jiequan Cui,
Beier Zhu,
Xin Wen,
Xiaojuan Qi,
Bei Yu,
Hanwang Zhang
Abstract:
In this paper, we present an empirical study on image recognition fairness, i.e., extreme class accuracy disparity on balanced data like ImageNet. We experimentally demonstrate that classes are not equal and the fairness issue is prevalent for image classification models across various datasets, network architectures, and model capacities. Moreover, several intriguing properties of fairness are identified. First, the unfairness lies in problematic representation rather than classifier bias. Second, with the proposed concept of Model Prediction Bias, we investigate the origins of problematic representation during optimization. Our findings reveal that models tend to exhibit greater prediction biases for classes that are more challenging to recognize. This means that more of the other classes are confused with the harder classes. False Positives (FPs) then dominate the learning during optimization, leading to poor accuracy for those classes. Further, we conclude that data augmentation and representation learning algorithms improve overall performance by promoting fairness to some degree in image classification. The code is available at https://github.com/dvlab-research/Parametric-Contrastive-Learning.
Submitted 12 March, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
-
Simulation Studies for the First Pathfinder of the CATCH Space Mission
Authors:
Yiming Huang,
Juan Zhang,
Lian Tao,
Zhengwei Li,
Donghua Zhao,
Qian-Qing Yin,
Xiangyang Wen,
Jingyu Xiao,
Chen Zhang,
Shuang-Nan Zhang,
Shaolin Xiong,
Qingcui Bu,
Jirong Cang,
Dezhi Cao,
Wen Chen,
Siran Ding,
Min Gao,
Yang Gao,
Shujin Hou,
Liping Jia,
Ge Jin,
Dalin Li,
Jinsong Li,
Panping Li,
Yajun Li
, et al. (20 additional authors not shown)
Abstract:
The Chasing All Transients Constellation Hunters (CATCH) space mission is an intelligent constellation consisting of 126 micro-satellites in three types (A, B, and C), designed for X-ray observation with the objective of studying the dynamic universe. Currently, we are actively developing the first Pathfinder (CATCH-1) for the CATCH mission, specifically for type-A satellites. CATCH-1 is equipped with Micro Pore Optics (MPO) and a 4-pixel Silicon Drift Detector (SDD) array. To assess its scientific performance, including the effective area of the optical system, on-orbit background, and telescope sensitivity, we employ the Monte Carlo software Geant4 for simulation in this study. The MPO optics exhibit an effective area of $41$ cm$^2$ at the focal spot for 1 keV X-rays, while the entire telescope system achieves an effective area of $29$ cm$^2$ at 1 keV when taking into account the SDD detector's detection efficiency. The primary contribution to the background is found to be from the Cosmic X-ray Background. Assuming a 625 km orbit with an inclination of $29^\circ$, the total background for CATCH-1 is estimated to be $8.13\times10^{-2}$ counts s$^{-1}$ in the energy range of 0.5--4 keV. Based on the background within the central detector and assuming a Crab-like source spectrum, the estimated ideal sensitivity could achieve $1.9\times10^{-12}$ erg cm$^{-2}$ s$^{-1}$ for an exposure of 10$^4$ s in the energy band of 0.5--4 keV. Furthermore, after simulating the background caused by low-energy charged particles near the geomagnetic equator, we have determined that there is no need to install a magnetic deflector.
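As a rough consistency check on the figures quoted above, the short sketch below derives two secondary numbers from them. It assumes that the system effective area is simply the optics effective area multiplied by a single detection-efficiency factor, and that the background rate is constant over the exposure; the variable names are illustrative and do not come from the paper.

```python
# Values quoted in the abstract.
optics_area_cm2 = 41.0      # MPO effective area at the focal spot, 1 keV
system_area_cm2 = 29.0      # whole-telescope effective area at 1 keV
background_rate = 8.13e-2   # total background, counts per second, 0.5-4 keV
exposure_s = 1.0e4          # exposure used for the sensitivity estimate

# Assumption: system area = optics area x one detector efficiency factor.
implied_efficiency = system_area_cm2 / optics_area_cm2

# Assumption: the background rate is constant over the exposure.
expected_background_counts = background_rate * exposure_s

print(f"Implied SDD detection efficiency at 1 keV: {implied_efficiency:.3f}")
print(f"Expected total background counts in 1e4 s: {expected_background_counts:.0f}")
```

Note that the abstract's sensitivity estimate uses the background within the central detector only, so the total count computed here is an upper bound on the counts relevant to that estimate, not a reproduction of it.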
Submitted 23 February, 2024;
originally announced February 2024.