
Twist operator correlator revisited and tau function on Hurwitz space

Abstract: Correlation function of twist operators is a natural quantity of interest in two-dimensional conformal field theory (2d CFT) and finds relevance in various physical contexts. For computing twist operator correlators associated with generic branched covers of genus zero and one, we present a generalization of the conventional stress-tensor method to encompass generic 2d CFTs without relying on any free field realization. This is achieved by employing a generalization of the argument of Calabrese-Cardy in the cyclic genus zero case. The generalized stress-tensor method reveals a compelling relation between the twist operator correlator and the tau function on Hurwitz space, the moduli space of branched covers, of Kokotov-Korotkin. This stems from the close relation between stress-tensor one-point function and Bergman projective connection of branched cover. The tau function on Hurwitz space is in turn related to the more general isomonodromic tau function, and this chain of correspondence thus relates the twist operator correlator to a canonical algebro-geometric object and endows it with an integrable system interpretation. Conversely, the tau function on Hurwitz space essentially admits a CFT interpretation as the holomorphic part of the twist operator correlator of $c=1$ free boson.

Submitted 7 July, 2023; originally announced July 2023.

Comments: 27 pages, 2 figures, 1 table. Comments welcome!

arXiv:2306.14586 [pdf, other]

doi 10.1016/j.physa.2022.128263

Consensus in Complex Networks with Noisy Agents and Peer Pressure

Authors: Christopher Griffin, Anna Squicciarini, Feiran Jia

Abstract: In this paper we study a discrete time consensus model on a connected graph with monotonically increasing peer-pressure and noise perturbed outputs masking a hidden state. We assume that each agent maintains a constant hidden state and presents a dynamic output that is perturbed by random noise drawn from a mean-zero distribution. We show consensus is ensured in the limit as time goes to infinity under certain assumptions on the increasing peer-pressure term and also show that the hidden state cannot be exactly recovered even when model dynamics and outputs are known. The exact nature of the distribution is computed for a simple two vertex graph and results found are shown to generalize (empirically) to more complex graph structures.

Submitted 26 June, 2023; originally announced June 2023.

Comments: 11 pages, 10 figures

arXiv:2306.13256 [pdf, other]

Magnetic-field-induced splitting of Rydberg Electromagnetically Induced Transparency (EIT) and Autler-Townes (AT) spectra in $^{87}$Rb vapor cell

Authors: Xinheng Li, Yue Cui, Jianhai Hao, Fei Zhou, Fengdong Jia, Jian Zhang, Feng Xie, Zhiping Zhong

Abstract: We theoretically and experimentally investigate the Rydberg electromagnetically induced transparency (EIT) and Autler-Townes (AT) splitting of $^{87}$Rb vapor under the combined influence of a magnetic field and a microwave field. In the presence of a static magnetic field, the effect of the microwave field leads to the dressing and splitting of each $m_F$ state, resulting in multiple spectral peaks in the EIT-AT spectrum. A simplified analytical formula was developed to explain the EIT-AT spectrum in a static magnetic field, and the calculations are in excellent agreement with experimental results. We further studied the enhancement of the Rydberg atom microwave electric field sensor performance by making use of the splitting interval between the two maximum absolute $m_F$ states under a static magnetic field. The traceable measurement limit of weak electric field by the EIT-AT splitting method was extended by an order of magnitude, which is promising for precise microwave electric field measurement.

Submitted 22 June, 2023; originally announced June 2023.

Comments: 12 pages, 4 figures

arXiv:2306.09590 [pdf, other]

The 1st-place Solution for CVPR 2023 OpenLane Topology in Autonomous Driving Challenge

Authors: Dongming Wu, Fan Jia, Jiahao Chang, Zhuoling Li, Jianjian Sun, Chunrui Han, Shuailin Li, Yingfei Liu, Zheng Ge, Tiancai Wang

Abstract: We present the 1st-place solution of OpenLane Topology in Autonomous Driving Challenge. Considering that topology reasoning is based on centerline detection and traffic element detection, we develop a multi-stage framework for high performance. Specifically, the centerline is detected by the powerful PETRv2 detector and the popular YOLOv8 is employed to detect the traffic elements. Further, we design a simple yet effective MLP-based head for topology prediction. Our method achieves 55% OLS on the OpenLaneV2 test set, surpassing the 2nd solution by 8 points.

Submitted 15 June, 2023; originally announced June 2023.

Comments: Accepted by CVPR2023 Workshop (https://opendrivelab.com/AD23Challenge.html#openlane_topology)

arXiv:2306.06018 [pdf]

The structural stability and polarization analysis of rhombohedral phase HfO2

Authors: Wenbin Ouyang, Fanghao Jia, Wei Ren

Abstract: A comparative theoretical study is presented for the rhombohedral R3 and R3m phases of HfO2, two possible forms found recently in experiments on its heavily Zr-doped ferroelectric thin films. Their structural stability and polarization under in-plane compressive strain are comprehensively investigated. We discovered that there is a phase transition from the R3 to the R3m phase under biaxial compressive strain. Both the direction and amplitude of their polarization can be tuned by the strain. By performing a symmetry mode analysis, we are able to understand the improper nature of the ferroelectricity. These results may help to shed light on the understanding of hafnia ferroelectric thin films.

Submitted 30 June, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

arXiv:2306.01337 [pdf, other]

MathChat: Converse to Tackle Challenging Math Problems with LLM Agents

Authors: Yiran Wu, Feiran Jia, Shaokun Zhang, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Qingyun Wu, Chi Wang

Abstract: Employing Large Language Models (LLMs) to address mathematical problems is an intriguing research endeavor, considering the abundance of math problems expressed in natural language across numerous science and engineering fields. LLMs, with their generalized ability, are used as a foundation model to build AI agents for different tasks. In this paper, we study the effectiveness of utilizing LLM agents to solve math problems through conversations. We propose MathChat, a conversational problem-solving framework designed for math problems. MathChat consists of an LLM agent and a user proxy agent which is responsible for tool execution and additional guidance. This synergy facilitates a collaborative problem-solving process, where the agents engage in a dialogue to solve the problems. We perform evaluation on difficult high school competition problems from the MATH dataset. Utilizing Python, we show that MathChat can further improve previous tool-using prompting methods by 6%.

Submitted 28 June, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

Comments: Update version

arXiv:2305.13718 [pdf, other]

Exploring Self-supervised Logic-enhanced Training for Large Language Models

Authors: Fangkai Jiao, Zhiyang Teng, Bosheng Ding, Zhengyuan Liu, Nancy F. Chen, Shafiq Joty

Abstract: Existing efforts to improve logical reasoning ability of language models have predominantly relied on supervised fine-tuning, hindering generalization to new domains and/or tasks. The development of Large Language Models (LLMs) has demonstrated the capacity of compressing abundant knowledge into a single proxy, enabling them to tackle multiple tasks effectively. Our preliminary experiments, nevertheless, show that LLMs do not show capability on logical reasoning. The performance of LLMs on logical reasoning benchmarks is far behind the existing state-of-the-art baselines. In this paper, we make the first attempt to investigate the feasibility of incorporating logical knowledge through self-supervised post-training, and activating it via in-context learning, which we term LogicLLM. Specifically, we devise an auto-regressive objective variant of MERIt and integrate it with two LLM series, i.e., FLAN-T5 and LLaMA, with parameter size ranging from 3 billion to 13 billion. The results on two challenging logical reasoning benchmarks demonstrate the effectiveness of LogicLLM. Besides, we conduct extensive ablation studies to analyze the key factors in designing logic-oriented proxy tasks.

Submitted 16 June, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

Comments: 16 pages, NAACL 2024

arXiv:2305.03025 [pdf, other]

Panda LLM: Training Data and Evaluation for Open-Sourced Chinese Instruction-Following Large Language Models

Authors: Fangkai Jiao, Bosheng Ding, Tianze Luo, Zhanfeng Mo

Abstract: This project focuses on enhancing open-source large language models through instruction-tuning and providing comprehensive evaluations of their performance. We explore how various training data factors, such as quantity, quality, and linguistic distribution, influence the performance of instruction-tuned models trained on publicly accessible high-quality instruction datasets for both English and Chinese languages. Our goal is to supplement evaluation with quantitative analyses, providing valuable insights for the continued advancement of open-source chat models. Our model, data, and code are publicly available for others to use and build upon.

Submitted 4 May, 2023; originally announced May 2023.

arXiv:2304.09316 [pdf, ps, other]

Microwave electrometry with Rydberg atoms in a vapor cell using microwave amplitude modulation

Authors: Jianhai Hao, Fengdong Jia, Yue Cui, Yuhan Wang, Fei Zhou, Xiubin Liu, Jian Zhang, Feng Xie, Zhiping Zhong

Abstract: We have theoretically and experimentally studied the dispersive signal of the Rydberg atomic electromagnetically induced transparency (EIT) - Autler-Townes (AT) splitting spectra obtained using amplitude modulation of the microwave (MW) field. In addition to the two zero-crossing points, the dispersion signal has two positive maxima with an interval defined as the shoulder interval of the dispersion signal $\Delta f_{\text{sho}}$. The relationship between the MW field strength $E_{\text{MW}}$ and $\Delta f_{\text{sho}}$ is studied at the MW frequencies of 31.6 GHz, 22.1 GHz, and 9.2 GHz, respectively. The results show that $\Delta f_{\text{sho}}$ can be used to characterize a much weaker $E_{\text{MW}}$ than the interval of two zero-crossing points $\Delta f_{\text{zeros}}$ and the traditional EIT-AT splitting interval $\Delta f_{\text{m}}$; the minimum $E_{\text{MW}}$ measured by $\Delta f_{\text{sho}}$ is about 30 times smaller than that by $\Delta f_{\text{m}}$. As an example, the minimum $E_{\text{MW}}$ at 9.2 GHz that can be characterized by $\Delta f_{\text{sho}}$ is 0.056 mV/cm, which is the minimum value characterized by frequency interval using a vapour cell without adding any auxiliary fields. The proposed method can improve the weak limit and sensitivity of $E_{\text{MW}}$ measured by spectral frequency interval, which is important in the direct measurement of weak $E_{\text{MW}}$.

Submitted 18 April, 2023; originally announced April 2023.

arXiv:2304.06795 [pdf, other]

Efficient Sequence Transduction by Jointly Predicting Tokens and Durations

Authors: Hainan Xu, Fei Jia, Somshubra Majumdar, He Huang, Shinji Watanabe, Boris Ginsburg

Abstract: This paper introduces a novel Token-and-Duration Transducer (TDT) architecture for sequence-to-sequence tasks. TDT extends conventional RNN-Transducer architectures by jointly predicting both a token and its duration, i.e. the number of input frames covered by the emitted token. This is achieved by using a joint network with two outputs which are independently normalized to generate distributions over tokens and durations. During inference, TDT models can skip input frames guided by the predicted duration output, which makes them significantly faster than conventional Transducers which process the encoder output frame by frame. TDT models achieve both better accuracy and significantly faster inference than conventional Transducers on different sequence transduction tasks. TDT models for Speech Recognition achieve better accuracy and up to 2.82X faster inference than conventional Transducers. TDT models for Speech Translation achieve an absolute gain of over 1 BLEU on the MUST-C test compared with conventional Transducers, and its inference is 2.27X faster. In Speech Intent Classification and Slot Filling tasks, TDT models improve the intent accuracy by up to over 1% (absolute) over conventional Transducers, while running up to 1.28X faster. Our implementation of the TDT model will be open-sourced with the NeMo (https://github.com/NVIDIA/NeMo) toolkit.

Submitted 29 May, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

arXiv:2304.02168 [pdf, other]

I2I: Initializing Adapters with Improvised Knowledge

Authors: Tejas Srinivasan, Furong Jia, Mohammad Rostami, Jesse Thomason

Abstract: Adapters present a promising solution to the catastrophic forgetting problem in continual learning. However, training independent Adapter modules for every new task misses an opportunity for cross-task knowledge transfer. We propose Improvise to Initialize (I2I), a continual learning algorithm that initializes Adapters for incoming tasks by distilling knowledge from previously-learned tasks' Adapters. We evaluate I2I on CLiMB, a multimodal continual learning benchmark, by conducting experiments on sequences of visual question answering tasks. Adapters trained with I2I consistently achieve better task accuracy than independently-trained Adapters, demonstrating that our algorithm facilitates knowledge transfer between task Adapters. I2I also results in better cross-task knowledge transfer than the state-of-the-art AdapterFusion without incurring the associated parametric cost.

Submitted 10 July, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

Comments: Accepted at 2nd Conference on Lifelong Learning Agents (CoLLAs), 2023

arXiv:2303.10868 [pdf, other]

Retrieving Multimodal Information for Augmented Generation: A Survey

Authors: Ruochen Zhao, Hailin Chen, Weishi Wang, Fangkai Jiao, Xuan Long Do, Chengwei Qin, Bosheng Ding, Xiaobao Guo, Minzhi Li, Xingxuan Li, Shafiq Joty

Abstract: As Large Language Models (LLMs) become popular, an important trend has emerged of using multimodality to augment the LLMs' generation ability, which enables LLMs to better interact with the world. However, there is no unified view of at which stage and how to incorporate different modalities. In this survey, we review methods that assist and augment generative models by retrieving multimodal knowledge, whose formats range from images, code, tables, and graphs to audio. Such methods offer a promising solution to important concerns such as factuality, reasoning, interpretability, and robustness. By providing an in-depth review, this survey is expected to provide scholars with a deeper understanding of the methods' applications and encourage them to adapt existing techniques to the fast-growing field of LLMs.

Submitted 30 November, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

arXiv:2303.10364 [pdf, other]

Evolution of the number and temperature of the remaining cold atoms in CW-laser photoionization of laser-cooled $^{87}$Rb atoms

Authors: Fei Wang, Feng-Dong Jia, Wei-Chen Liang, Xiao-Kang Li, Yu-Han Wang, Jing-Yu Qian, Dian-Cheng Zhang, Yong Wu, Jian-Guo Wang, Rong-Hua Lu, Xiang-Yuan Xu, Ya-Ping Ruan, Ping Xue, Zhi-Ping Zhong

Abstract: Based on the Rb$^+$-Rb hybrid trap, we investigate the effect of ion-atom elastic collisions on the number and temperature of the remaining atoms. We measured the remaining atomic number and temperature as a function of the wavelength and intensity of the ionization laser, and whether the ion trap was turned on. Fittings with a single exponential decay function plus an offset to the number and radius of the remaining atoms are found to be in good agreement. We found a difference in the exponential factor of different wavelengths of ionization laser with the ion trap on or off. We suppose that the presence of electrons affects ion-atom collisions through disorder-induced heating. Our research contributes to a better understanding of how ultracold neutral plasma evolves, particularly the subsequent kinetics of atomic processes, which also serves as a useful reference for high-energy-density plasma.

Submitted 21 March, 2023; v1 submitted 18 March, 2023; originally announced March 2023.

Comments: 14 figures, 9 pages

arXiv:2303.10360 [pdf, other]

Generation of cold polyatomic cations by cascade reactive two-body ion-atom collisions

Authors: Wei-Chen Liang, Feng-Dong Jia, Fei Wang, Xi Zhang, Yu-Han Wang, Jing-Yu Qian, Xiao-Qing Hu, Yong Wu, Jian-Guo Wang, Ping Xue, Zhi-Ping Zhong

Abstract: Polyatomic cations $^{87}$Rb$_M^+$ ($M$ = 2, 3,$\ldots$) have been produced by cascade two-body ion-atom reactive collisions in the two-step CW-laser photoionization of laser-cooled $^{87}$Rb atoms and accumulated in the ion trap. Using resonant-excitation mass spectrometry and resonant excitation-assisted time-of-flight mass spectrometry, we directly observed and distinguished the charged reaction products. We experimentally verified the cascade generation and cascade dissociation of $^{87}$Rb$_M^+$. The populations of $^{87}$Rb$_M^+$ are quantitatively investigated by solving the rate equations. The $^{87}$Rb$^+$-$^{87}$Rb reaction rate coefficient was derived as 9.10$\times10^{-11}$ cm$^3$/s accordingly. The methods developed here for assembling and detecting homonuclear polyatomic cations can be applied to any experiment in ion-atom hybrid traps. The present study lays the foundation for exploring atomically precise metal clusters and physics from few- to many-body perspective.

Submitted 10 July, 2024; v1 submitted 18 March, 2023; originally announced March 2023.

Comments: 5 figures

arXiv:2303.04395 [pdf, other]

Towards Practical Autonomous Flight Simulation for Flapping Wing Biomimetic Robots with Experimental Validation

Authors: Chen Qian, Yongchun Fang, Fan Jia, Jifu Yan, Yiming Liang, Tiefeng Li

Abstract: Tried-and-true flapping wing robot simulation is essential in developing flapping wing mechanisms and algorithms. This paper presents a novel application-oriented flapping wing platform, highly compatible with various mechanical designs and adaptable to different robotic tasks. First, the blade element theory and the quasi-steady model are put forward to compute the flapping wing aerodynamics based on wing kinematics. Translational lift, translational drag, rotational lift, and added mass force are all considered in the computation. Then we use the proposed simulation platform to investigate the passive wing rotation and the wing-tail interaction phenomena of a particular flapping-wing robot. With the help of the simulation tool and a novel statistic based on dynamic differences from the averaged system, several behaviors display their essence by investigating the flapping wing robot dynamic characteristics. After that, the attitude tracking control problem and the positional trajectory tracking problem are both overcome by robust control techniques. Further comparison simulations reveal that the proposed control algorithms compared with other existing ones show apparent superiority. What is more, with the same control algorithm and parameters tuned in simulation, we conduct real flight experiments on a self-made flapping wing robot, and obtain similar results from the proposed simulation platform. In contrast to existing simulation tools, the proposed one is compatible with most existing flapping wing robots, and can inherently drill into each subtle behavior in corresponding applications by observing aerodynamic forces and torques on each blade element.

Submitted 8 March, 2023; originally announced March 2023.

arXiv:2302.07072 [pdf, other]

Differentially Private Diffusion Auction: The Single-unit Case

Authors: Fengjuan Jia, Mengxiao Zhang, Jiamou Liu, Bakh Khoussainov

Abstract: Diffusion auction refers to an emerging paradigm of online marketplace where an auctioneer utilises a social network to attract potential buyers. Diffusion auction poses significant privacy risks. From the auction outcome, it is possible to infer hidden, and potentially sensitive, preferences of buyers. To mitigate such risks, we initiate the study of differential privacy (DP) in diffusion auction mechanisms. DP is a well-established notion of privacy that protects a system against inference attacks. Achieving DP in diffusion auctions is non-trivial as well-designed auction rules are required to incentivise the buyers to truthfully report their neighbourhood. We study the single-unit case and design two differentially private diffusion mechanisms (DPDMs): recursive DPDM and layered DPDM. We prove that these mechanisms guarantee differential privacy, incentive compatibility and individual rationality for both valuations and neighbourhood. We then empirically compare their performance on real and synthetic datasets.

Submitted 16 February, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

arXiv:2301.12576 [pdf, other]

Uncovering Adversarial Risks of Test-Time Adaptation

Authors: Tong Wu, Feiran Jia, Xiangyu Qi, Jiachen T. Wang, Vikash Sehwag, Saeed Mahloujifar, Prateek Mittal

Abstract: Recently, test-time adaptation (TTA) has been proposed as a promising solution for addressing distribution shifts. It allows a base model to adapt to an unforeseen distribution during inference by leveraging the information from the batch of (unlabeled) test data. However, we uncover a novel security vulnerability of TTA based on the insight that predictions on benign samples can be impacted by malicious samples in the same batch. To exploit this vulnerability, we propose Distribution Invading Attack (DIA), which injects a small fraction of malicious data into the test batch. DIA causes models using TTA to misclassify benign and unperturbed test data, providing an entirely new capability for adversaries that is infeasible in canonical machine learning pipelines. Through comprehensive evaluations, we demonstrate the high effectiveness of our attack on multiple benchmarks across six TTA methods. In response, we investigate two countermeasures to robustify the existing insecure TTA implementations, following the principle of "security by design". Together, we hope our findings can make the community aware of the utility-security tradeoffs in deploying TTA and provide valuable insights for developing robust TTA approaches.

Submitted 4 February, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

arXiv:2301.01283 [pdf, other]

Cross Modal Transformer: Towards Fast and Robust 3D Object Detection

Authors: Junjie Yan, Yingfei Liu, Jianjian Sun, Fan Jia, Shuailin Li, Tiancai Wang, Xiangyu Zhang

Abstract: In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes the image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed by encoding the 3D points into multi-modal features. The core design of CMT is quite simple while its performance is impressive. It achieves 74.1% NDS (state-of-the-art with single model) on nuScenes test set while maintaining fast inference speed. Moreover, CMT has a strong robustness even if the LiDAR is missing. Code is released at https://github.com/junjie18/CMT.

Submitted 18 September, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

arXiv:2211.14204 [pdf]

doi 10.1103/PhysRevApplied.19.044085

Stacking up electron-rich and electron-deficient monolayers to achieve extraordinary mid- to far-infrared excitonic absorption: Interlayer excitons in the C3B/C3N bilayer

Authors: Zhao Tang, Greis J. Cruz, Fanhao Jia, Yabei Wu, Weiyi Xia, Peihong Zhang

Abstract: Our ability to efficiently detect and generate far-infrared (i.e., terahertz) radiation is vital in areas spanning from biomedical imaging to interstellar spectroscopy. Despite decades of intense research, bridging the terahertz gap between electronics and optics remains a major challenge due to the lack of robust materials that can efficiently operate in this frequency range, and two-dimensional (2D) type-II heterostructures may be ideal candidates to fill this gap. Herein, using highly accurate many-body perturbation theory within the GW plus Bethe-Salpeter equation approach, we predict that a type-II heterostructure consisting of electron-rich C3N and electron-deficient C3B monolayers can give rise to extraordinary optical activities in the mid- to far-infrared range. C3N and C3B are two graphene-derived 2D materials that have attracted increasing research attention. Although both C3N and C3B monolayers are moderate gap 2D materials, and they only couple through the rather weak van der Waals interactions, the bilayer heterostructure surprisingly supports extremely bright, low-energy interlayer excitons with large binding energies of 0.2 ~ 0.4 eV, offering an ideal material with interlayer excitonic states for mid- to far-infrared applications at room temperature. We also investigate in detail the properties and formation mechanism of the inter- and intra-layer excitons.

Submitted 25 November, 2022; originally announced November 2022.

Comments: 15 pages, 6 figures

arXiv:2211.05103 [pdf, ps, other]

Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models

Authors: Travis M. Bartley, Fei Jia, Krishna C. Puvvada, Samuel Kriman, Boris Ginsburg

Abstract: In this paper, we extend previous self-supervised approaches for language identification by experimenting with a Conformer-based architecture in a multilingual pre-training paradigm. We find that pre-trained speech models optimally encode language discriminatory information in lower layers. Further, we demonstrate that the embeddings obtained from these layers are significantly robust to classify unseen languages and different acoustic environments without additional training. After fine-tuning a pre-trained Conformer model on the VoxLingua107 dataset, we achieve results similar to current state-of-the-art systems for language identification. Moreover, our model accomplishes this with 5x fewer parameters. We open-source the model through the NVIDIA NeMo toolkit.

Submitted 13 March, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

Comments: Submitted to ICASSP 2023

arXiv:2211.03541 [pdf, other]

Multi-blank Transducers for Speech Recognition

Authors: Hainan Xu, Fei Jia, Somshubra Majumdar, Shinji Watanabe, Boris Ginsburg

Abstract: This paper proposes a modification to RNN-Transducer (RNN-T) models for automatic speech recognition (ASR). In standard RNN-T, the emission of a blank symbol consumes exactly one input frame; in our proposed method, we introduce additional blank symbols, which consume two or more input frames when emitted. We refer to the added symbols as big blanks, and the method multi-blank RNN-T. For training multi-blank RNN-Ts, we propose a novel logit under-normalization method in order to prioritize emissions of big blanks. With experiments on multiple languages and datasets, we show that multi-blank RNN-T methods could bring relative speedups of over +90%/+139% to model inference for English Librispeech and German Multilingual Librispeech datasets, respectively. The multi-blank RNN-T method also improves ASR accuracy consistently. We will release our implementation of the method in the NeMo (https://github.com/NVIDIA/NeMo) toolkit.

Submitted 11 April, 2024; v1 submitted 4 November, 2022; originally announced November 2022.

Journal ref: ICASSP 2023

arXiv:2211.02797 [pdf, other]

doi 10.1103/PhysRevC.107.024304

Optimization of generator coordinate method with machine-learning techniques for nuclear spectra and neutrinoless double-beta decay: ridge regression for nuclei with axial deformation

Authors: X. Zhang, W. Lin, J. M. Yao, C. F. Jiao, A. M. Romero, T. R. Rodríguez, H. Hergert

Abstract: The generator coordinate method (GCM) is an important tool of choice for modeling large-amplitude collective motion in atomic nuclei. The computational complexity of the GCM increases rapidly with the number of collective coordinates. It imposes a strong restriction on the applicability of the method. In this work, we propose a subspace-reduction algorithm that employs optimal statistical ML models as surrogates for exact quantum-number projection calculations for norm and Hamiltonian kernels. The model space of the original GCM is reduced to a subspace relevant for nuclear low energy spectra and the NME of ground state to ground state $0νββ$ decay based on the orthogonality condition (OC) and the energy-transition-orthogonality procedure (ENTROP), respectively. For simplicity, the polynomial ridge regression (RR) algorithm is used to learn the norm and Hamiltonian kernels of axially deformed configurations. The efficiency and accuracy of this algorithm are illustrated for $^{76}$Ge and $^{76}$Se by comparing results obtained using the optimal RR models to direct GCM calculations. The low-lying energy spectra of $^{76}$Ge and $^{76}$Se, as well as the $0νββ$-decay NME between their ground states, are computed. The results show that the performance of the GCM+OC/ENTROP+RR is more robust than that of the GCM+RR alone, and the former can reproduce the results of the original GCM calculation accurately with a significantly reduced computational cost.

Submitted 23 January, 2023; v1 submitted 4 November, 2022; originally announced November 2022.

Comments: 15 pages with 19 figures

Journal ref: Phys. Rev. C107, 024304 (2023)

arXiv:2210.17129 [pdf]

Observation of tungsten impurity suppression with ECRH by an X-ray Crystal Spectrometer on EAST

Authors: Lin Zichao, Zhang Hongming, Wang Fudi, Bae Chenonho, Fu Jia, Shen Yongcai, Lu Dian, Jin Yifei, He Liang, Wang Minrui, Lin Guangle, Ye Kaixuan, Wang Shouxin, Zhao Hailin, Lyu Bo

Abstract: Impurities degrade tokamak plasma confinement by causing energy loss, diluting the fuel concentration, and even terminating discharges in some extreme cases. Previously, the suppression effects of on-axis Electron Cyclotron Resonance Heating (ECRH) on impurity accumulation have been investigated on EAST by extreme ultraviolet (EUV) spectroscopy. However, it is difficult to quantify the changes in the tungsten (W) impurity profile since the W line emissions in the EUV range could not be easily resolved. X-ray Crystal Spectroscopy (XCS), which is used to provide the ion temperature and the rotation velocity by measuring line emissions in the soft X-ray range, can also be used to study the behavior of W impurity emissions. To begin with, an in-situ absolute intensity calibration for the Tangential XCS (TXCS) is conducted by analyzing the measurements of the bremsstrahlung radiation intensity. After obtaining the calibration coefficient, W44+ ion density profiles are evaluated by Abel inversion using the spectral line of W XLV (3.9095 Å). Thus, a direct observation of the W44+ impurity concentration being suppressed by ECRH is accomplished. In the future, the obtained W density profiles can be combined with impurity transport codes to analyze W transport.

Submitted 31 October, 2022; originally announced October 2022.

arXiv:2210.15781 [pdf, other]

A Compact End-to-End Model with Local and Global Context for Spoken Language Identification

Authors: Fei Jia, Nithin Rao Koluguri, Jagadeesh Balam, Boris Ginsburg

Abstract: We introduce TitaNet-LID, a compact end-to-end neural network for Spoken Language Identification (LID) that is based on the ContextNet architecture. TitaNet-LID employs 1D depth-wise separable convolutions and Squeeze-and-Excitation layers to effectively capture local and global context within an utterance. Despite its small size, TitaNet-LID achieves performance similar to state-of-the-art models on the VoxLingua107 dataset while being 10 times smaller. Furthermore, it can be easily adapted to new acoustic conditions and unseen languages through simple fine-tuning, achieving a state-of-the-art accuracy of 88.2% on the FLEURS benchmark. Our model is scalable and can achieve a better trade-off between accuracy and speed. TitaNet-LID performs well even on short utterances less than 5s in length, indicating its robustness to input length.

Submitted 10 August, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

Comments: Accepted to INTERSPEECH 2023

arXiv:2208.01360 [pdf, other]

doi 10.1103/PhysRevLett.129.243402

Expansion dynamics of a shell-shaped Bose-Einstein condensate

Authors: Fan Jia, Zerong Huang, Liyuan Qiu, Rongzi Zhou, Yangqian Yan, Dajun Wang

Abstract: Bose-Einstein condensates (BECs) confined on shell-shaped surfaces have been proposed as a platform for exploring many nontrivial quantum phenomena on curved spaces. However, as the shell-shaped trapping potential generated with the conventional radio frequency dressing method is very sensitive to gravity, so far experimental studies of shell BECs can only be performed in micro-gravity environments. Here, we overcome this difficulty and create a shell BEC in the presence of Earth's gravity with immiscible dual-species BECs of sodium and rubidium atoms. After minimizing the displacement between the centers of mass of the two BECs with a magic-wavelength optical dipole trap, the interspecies repulsive interaction ensures the formation of a closed shell of sodium atoms with its center filled by rubidium atoms. Releasing the double BEC together from the trap, we observe explosion of the filled shell accompanied by energy transfer from the inner BEC to the shell BEC. With the inner BEC removed, we obtain a hollow shell BEC which shows self-interference as a manifestation of implosion. Our results pave an alternative way for investigating many of the intriguing physics offered by shell BECs.

Submitted 2 August, 2022; originally announced August 2022.

arXiv:2206.01256 [pdf, other]

PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images

Authors: Yingfei Liu, Junjie Yan, Fan Jia, Shuailin Li, Aqi Gao, Tiancai Wang, Xiangyu Zhang, Jian Sun

Abstract: In this paper, we propose PETRv2, a unified framework for 3D perception from multi-view images. Based on PETR, PETRv2 explores the effectiveness of temporal modeling, which utilizes the temporal information of previous frames to boost 3D object detection. More specifically, we extend the 3D position embedding (3D PE) in PETR for temporal modeling. The 3D PE achieves the temporal alignment on object position of different frames. A feature-guided position encoder is further introduced to improve the data adaptability of 3D PE. To support multi-task learning (e.g., BEV segmentation and 3D lane detection), PETRv2 provides a simple yet effective solution by introducing task-specific queries, which are initialized under different spaces. PETRv2 achieves state-of-the-art performance on 3D object detection, BEV segmentation and 3D lane detection. Detailed robustness analysis is also conducted on the PETR framework. We hope PETRv2 can serve as a strong baseline for 3D perception. Code is available at https://github.com/megvii-research/PETR.

Submitted 14 November, 2022; v1 submitted 2 June, 2022; originally announced June 2022.

Comments: Adding 3D lane detection results on OpenLane Dataset

arXiv:2206.00700 [pdf, other]

doi 10.1145/3583780.3615040

RoCourseNet: Distributionally Robust Training of a Prediction Aware Recourse Model

Authors: Hangzhi Guo, Feiran Jia, Jinghui Chen, Anna Squicciarini, Amulya Yadav

Abstract: Counterfactual (CF) explanations for machine learning (ML) models are preferred by end-users, as they explain the predictions of ML models by providing a recourse (or contrastive) case to individuals who are adversely impacted by predicted outcomes. Existing CF explanation methods generate recourses under the assumption that the underlying target ML model remains stationary over time. However, due to commonly occurring distributional shifts in training data, ML models constantly get updated in practice, which might render previously generated recourses invalid and diminish end-users' trust in our algorithmic framework. To address this problem, we propose RoCourseNet, a training framework that jointly optimizes predictions and recourses that are robust to future data shifts. This work contains four key contributions: (1) We formulate the robust recourse generation problem as a tri-level optimization problem which consists of two sub-problems: (i) a bi-level problem that finds the worst-case adversarial shift in the training data, and (ii) an outer minimization problem to generate robust recourses against this worst-case shift. (2) We leverage adversarial training to solve this tri-level optimization problem by: (i) proposing a novel virtual data shift (VDS) algorithm to find worst-case shifted ML models via explicitly considering the worst-case data shift in the training dataset, and (ii) a block-wise coordinate descent procedure to optimize for prediction and corresponding robust recourses. (3) We evaluate RoCourseNet's performance on three real-world datasets, and show that RoCourseNet consistently achieves more than 96% robust validity and outperforms state-of-the-art baselines by at least 10% in generating robust CF explanations. (4) Finally, we generalize the RoCourseNet framework to accommodate any parametric post-hoc methods for improving robust validity.

Submitted 18 August, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

arXiv:2204.11540 [pdf, other]

doi 10.1007/s11042-022-12800-8

Research Status of Deep Learning Methods for Rumor Detection

Authors: Li Tan, Ge Wang, Feiyang Jia, Xiaofeng Lian

Abstract: To manage rumors on social media and reduce their harm to society, many studies have used deep learning methods to detect rumors in open networks. To comprehensively sort out the research status of rumor detection from multiple perspectives, this paper analyzes the highly focused work from three perspectives: Feature Selection, Model Structure, and Research Methods. From the perspective of feature selection, we divide methods into content features, social features, and propagation structure features of the rumors. Then, this work divides deep learning models for rumor detection into CNN, RNN, GNN, and Transformer based on the model structure, which is convenient for comparison. Besides, for the first time, this work summarizes 30 works into 7 rumor detection methods: propagation trees, adversarial learning, cross-domain methods, multi-task learning, unsupervised and semi-supervised methods, knowledge-graph-based methods, and other methods, and compares the advantages of the different methods for detecting rumors. In addition, this review enumerates the available datasets and discusses potential issues and future work to help researchers advance the development of the field.

Submitted 25 April, 2022; originally announced April 2022.

Comments: Accepted by MTAP

arXiv:2204.04746 [pdf, other]

doi 10.1016/j.media.2023.102803

CholecTriplet2021: A benchmark challenge for surgical action triplet recognition

Authors: Chinedu Innocent Nwoye, Deepak Alapatt, Tong Yu, Armine Vardazaryan, Fangfang Xia, Zixuan Zhao, Tong Xia, Fucang Jia, Yuxuan Yang, Hao Wang, Derong Yu, Guoyan Zheng, Xiaotian Duan, Neil Getty, Ricardo Sanchez-Matilla, Maria Robu, Li Zhang, Huabin Chen, Jiacheng Wang, Liansheng Wang, Bokai Zhang, Beerend Gerats, Sista Raviteja, Rachana Sathish, Rong Tao , et al. (37 additional authors not shown)

Abstract: Context-aware decision support in the operating room can foster surgical safety and efficiency by leveraging real-time feedback from surgical workflow analysis. Most existing works recognize surgical activities at a coarse-grained level, such as phases, steps or events, leaving out fine-grained interaction details about the surgical activity; yet those are needed for more helpful AI assistance in the operating room. Recognizing surgical actions as triplets of <instrument, verb, target> combination delivers comprehensive details about the activities taking place in surgical videos. This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos. The challenge granted private access to the large-scale CholecT50 dataset, which is annotated with action triplet information. In this paper, we present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge. A total of 4 baseline methods from the challenge organizers and 19 new deep learning algorithms by competing teams are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%. This study also analyzes the significance of the results obtained by the presented approaches, performs a thorough methodological comparison between them, in-depth result analysis, and proposes a novel ensemble method for enhanced recognition. Our analysis shows that surgical workflow analysis is not yet solved, and also highlights interesting directions for future research on fine-grained surgical activity recognition which is of utmost importance for the development of AI in surgery.

Submitted 29 December, 2022; v1 submitted 10 April, 2022; originally announced April 2022.

Comments: CholecTriplet2021 challenge report. Paper accepted at Elsevier journal of Medical Image Analysis. 22 pages, 8 figures, 11 tables. Challenge website: https://cholectriplet2021.grand-challenge.org

Journal ref: Medical Image Analysis 86 (2023) 102803

arXiv:2203.00357 [pdf, other]

MERIt: Meta-Path Guided Contrastive Learning for Logical Reasoning

Authors: Fangkai Jiao, Yangyang Guo, Xuemeng Song, Liqiang Nie

Abstract: Logical reasoning is of vital importance to natural language understanding. Previous studies either employ graph-based models to incorporate prior knowledge about logical relations, or introduce symbolic logic into neural models through data augmentation. These methods, however, heavily depend on annotated training data, and thus suffer from over-fitting and poor generalization problems due to the dataset sparsity. To address these two problems, in this paper, we propose MERIt, a MEta-path guided contrastive learning method for logical ReasonIng of text, to perform self-supervised pre-training on abundant unlabeled text data. Two novel strategies serve as indispensable components of our method. In particular, a strategy based on meta-path is devised to discover the logical structure in natural texts, followed by a counterfactual data augmentation strategy to eliminate the information shortcut induced by pre-training. The experimental results on two challenging logical reasoning benchmarks, i.e., ReClor and LogiQA, demonstrate that our method outperforms the SOTA baselines with significant improvements.

Submitted 1 March, 2022; originally announced March 2022.

Comments: 14 pages, 6 figures, Findings of ACL 2022

arXiv:2201.13123 [pdf, other]

Lessons from the AdKDD'21 Privacy-Preserving ML Challenge

Authors: Eustache Diemert, Romain Fabre, Alexandre Gilotte, Fei Jia, Basile Leparmentier, Jérémie Mary, Zhonghua Qu, Ugo Tanielian, Hui Yang

Abstract: Designing data sharing mechanisms providing performance and strong privacy guarantees is a hot topic for the Online Advertising industry. Namely, a prominent proposal discussed under the Improving Web Advertising Business Group at W3C only allows sharing advertising signals through aggregated, differentially private reports of past displays. To study this proposal extensively, an open Privacy-Preserving Machine Learning Challenge took place at AdKDD'21, a premier workshop on Advertising Science with data provided by advertising company Criteo. In this paper, we describe the challenge tasks, the structure of the available datasets, report the challenge results, and enable its full reproducibility. A key finding is that learning models on large, aggregated data in the presence of a small set of unaggregated data points can be surprisingly efficient and cheap. We also run additional experiments to observe the sensitivity of winning methods to different parameters such as privacy budget or quantity of available privileged side information. We conclude that the industry needs either alternate designs for private data sharing or a breakthrough in learning with aggregated data only to keep ad relevance at a reasonable level.

Submitted 31 January, 2022; originally announced January 2022.

arXiv:2201.11560 [pdf, other]

doi 10.2514/6.2021-2601

Assessment of Detached Eddy Simulation and Sliding Mesh Interface in Predicting Tiltrotor Performance in Helicopter and Airplane Modes

Authors: Feilin Jia, John Moore, Qiqi Wang

Abstract: This paper presents a numerical investigation of the performance and flow field of the full-scale XV-15 tiltrotor in both helicopter mode (hovering flight and forward flight) and aeroplane propeller mode using Detached Eddy Simulation, in which the movement of the rotor is achieved using a Sliding Mesh Interface. A comparison of our CFD results against experimental data and other CFD results is performed and presented.

Submitted 27 January, 2022; originally announced January 2022.

Journal ref: AIAA AVIATION 2021 FORUM 2021 (p. 2601)

arXiv:2111.07568 [pdf, other]

Can Graph Neural Networks Learn to Solve MaxSAT Problem?

Authors: Minghao Liu, Fuqi Jia, Pei Huang, Fan Zhang, Yuchen Sun, Shaowei Cai, Feifei Ma, Jian Zhang

Abstract: With the rapid development of deep learning techniques, various recent works have tried to apply graph neural networks (GNNs) to solve NP-hard problems such as Boolean Satisfiability (SAT), which shows the potential in bridging the gap between machine learning and symbolic reasoning. However, the quality of solutions predicted by GNNs has not been well investigated in the literature. In this paper, we study the capability of GNNs in learning to solve the Maximum Satisfiability (MaxSAT) problem, from both theoretical and practical perspectives. We build two kinds of GNN models to learn the solution of MaxSAT instances from benchmarks, and show through experimental evaluation that GNNs have attractive potential to solve the MaxSAT problem. Based on the algorithmic alignment theory, we also present, for the first time, a theoretical explanation of why GNNs can learn to solve the MaxSAT problem to some extent.

Submitted 15 November, 2021; originally announced November 2021.

arXiv:2110.15764 [pdf, other]

ε-weakened Robustness of Deep Neural Networks

Authors: Pei Huang, Yuting Yang, Minghao Liu, Fuqi Jia, Feifei Ma, Jian Zhang

Abstract: This paper introduces a notion of $\varepsilon$-weakened robustness for analyzing the reliability and stability of deep neural networks (DNNs). Unlike the conventional robustness, which focuses on the "perfect" safe region in the absence of adversarial examples, $\varepsilon$-weakened robustness focuses on the region where the proportion of adversarial examples is bounded by a user-specified $\varepsilon$. Smaller $\varepsilon$ means a smaller chance of failure. Under such a robustness definition, we can give conclusive results for the regions that conventional robustness ignores. We prove that the $\varepsilon$-weakened robustness decision problem is PP-complete and give a statistical decision algorithm with a user-controllable error bound. Furthermore, we derive an algorithm to find the maximum $\varepsilon$-weakened robustness radius. The time complexity of our algorithms is polynomial in the dimension and size of the network, so they are scalable to large real-world networks. We also show its potential application in analyzing quality issues.

Submitted 29 October, 2021; originally announced October 2021.
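
Note on the entry above: the abstract describes a statistical decision procedure with a user-controllable error bound but does not spell it out. The following is only a minimal sketch of one generic way such a check could be set up, assuming uniform sampling in an L-infinity ball and a Hoeffding bound for the sample size; it is not the authors' algorithm. The names `hoeffding_sample_size`, `estimate_adversarial_fraction`, and `toy_classifier` are hypothetical.

```python
import math
import random


def hoeffding_sample_size(tolerance, delta):
    """Samples needed so the empirical fraction is within `tolerance` of the
    true fraction with probability at least 1 - delta (Hoeffding's inequality)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * tolerance ** 2))


def estimate_adversarial_fraction(classify, x, radius, n_samples, rng):
    """Estimate the proportion of points in the L-infinity ball of the given
    radius around x whose predicted label differs from the label of x."""
    reference_label = classify(x)
    adversarial = 0
    for _ in range(n_samples):
        # Assumption: perturbations are drawn uniformly and independently per coordinate.
        point = [xi + rng.uniform(-radius, radius) for xi in x]
        if classify(point) != reference_label:
            adversarial += 1
    return adversarial / n_samples


def toy_classifier(point):
    """Hypothetical stand-in for a DNN: thresholds the first coordinate."""
    return 1 if point[0] > 0.5 else 0
```

With this sketch, a region would be reported as $\varepsilon$-weakened robust when the estimated fraction plus the tolerance stays below the chosen $\varepsilon$; that decision rule is likewise an assumption made for illustration.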

arXiv:2110.10965 [pdf, other]

2020 CATARACTS Semantic Segmentation Challenge

Authors: Imanol Luengo, Maria Grammatikopoulou, Rahim Mohammadi, Chris Walsh, Chinedu Innocent Nwoye, Deepak Alapatt, Nicolas Padoy, Zhen-Liang Ni, Chen-Chen Fan, Gui-Bin Bian, Zeng-Guang Hou, Heonjin Ha, Jiacheng Wang, Haojie Wang, Dong Guo, Lu Wang, Guotai Wang, Mobarakol Islam, Bharat Giddwani, Ren Hongliang, Theodoros Pissas, Claudio Ravasio, Martin Huber, Jeremy Birch, Joan M. Nunez Do Rio , et al. (15 additional authors not shown)

Abstract: Surgical scene segmentation is essential for anatomy and instrument localization which can be further used to assess tissue-instrument interactions during a surgical procedure. In 2017, the Challenge on Automatic Tool Annotation for cataRACT Surgery (CATARACTS) released 50 cataract surgery videos accompanied by instrument usage annotations. These annotations included frame-level instrument presence information. In 2020, we released pixel-wise semantic annotations for anatomy and instruments for 4670 images sampled from 25 videos of the CATARACTS training set. The 2020 CATARACTS Semantic Segmentation Challenge, which was a sub-challenge of the 2020 MICCAI Endoscopic Vision (EndoVis) Challenge, presented three sub-tasks to assess participating solutions on anatomical structure and instrument segmentation. Their performance was assessed on a hidden test set of 531 images from 10 videos of the CATARACTS test set.

Submitted 24 February, 2022; v1 submitted 21 October, 2021; originally announced October 2021.

arXiv:2110.05280 [pdf]

Multi-institutional Validation of Two-Streamed Deep Learning Method for Automated Delineation of Esophageal Gross Tumor Volume using planning-CT and FDG-PETCT

Authors: Xianghua Ye, Dazhou Guo, Chen-kan Tseng, Jia Ge, Tsung-Min Hung, Ping-Ching Pai, Yanping Ren, Lu Zheng, Xinli Zhu, Ling Peng, Ying Chen, Xiaohua Chen, Chen-Yu Chou, Danni Chen, Jiaze Yu, Yuzhen Chen, Feiran Jiao, Yi Xin, Lingyun Huang, Guotong Xie, Jing Xiao, Le Lu, Senxiang Yan, Dakai Jin, Tsung-Ying Ho

Abstract: Background: The current clinical workflow for esophageal gross tumor volume (GTV) contouring relies on manual delineation, which has high labor costs and interuser variability. Purpose: To validate the clinical applicability of a deep learning (DL) multi-modality esophageal GTV contouring model, developed at one institution and tested at multiple ones. Methods and Materials: We collected 606 esophageal cancer patients from four institutions. 252 institution-1 patients had a treatment planning-CT (pCT) and a paired diagnostic FDG-PETCT; 354 patients from the other 3 institutions had only pCT. A two-streamed DL model for GTV segmentation was developed using pCT and PETCT scans of a 148-patient institution-1 subset. The resulting model had the flexibility of segmenting GTVs via only pCT or pCT+PETCT combined. For independent evaluation, the remaining 104 institution-1 patients served as an unseen internal test set, and the 354 patients from institutions 2-4 were used for external testing. We evaluated the degree of manual revision by human experts to assess the contour-editing effort. The performance of the deep model was compared against 4 radiation oncologists in a multiuser study with 20 random external patients. Contouring accuracy and time were recorded for the pre- and post-DL assisted delineation process. Results: Our model achieved high segmentation accuracy in internal testing (mean Dice score: 0.81 using pCT and 0.83 using pCT+PET) and generalized well to external evaluation (mean DSC: 0.80). Expert assessment showed that the predicted contours of 88% of patients need only minor or no revision. In multi-user evaluation, with the assistance of a deep model, inter-observer variation and required contouring time were reduced by 37.6% and 48.0%, respectively. Conclusions: Deep learning predicted GTV contours were in close agreement with the ground truth and could be adopted clinically with mostly minor or no changes.

Submitted 11 October, 2021; originally announced October 2021.

Comments: 36 pages, 10 figures

arXiv:2109.14956 [pdf]

Comparative Validation of Machine Learning Algorithms for Surgical Workflow and Skill Analysis with the HeiChole Benchmark

Authors: Martin Wagner, Beat-Peter Müller-Stich, Anna Kisilenko, Duc Tran, Patrick Heger, Lars Mündermann, David M Lubotsky, Benjamin Müller, Tornike Davitashvili, Manuela Capek, Annika Reinke, Tong Yu, Armine Vardazaryan, Chinedu Innocent Nwoye, Nicolas Padoy, Xinyang Liu, Eung-Joo Lee, Constantin Disch, Hans Meine, Tong Xia, Fucang Jia, Satoshi Kondo, Wolfgang Reiter, Yueming Jin, Yonghao Long, et al. (16 additional authors not shown)

Abstract: PURPOSE: Surgical workflow and skill analysis are key technologies for the next generation of cognitive surgical assistance systems. These systems could increase the safety of the operation through context-sensitive warnings and semi-autonomous robotic assistance or improve training of surgeons via data-driven feedback. In surgical workflow analysis up to 91% average precision has been reported for phase recognition on an open data single-center dataset. In this work we investigated the generalizability of phase recognition algorithms in a multi-center setting including more difficult recognition tasks such as surgical action and surgical skill. METHODS: To achieve this goal, a dataset with 33 laparoscopic cholecystectomy videos from three surgical centers with a total operation time of 22 hours was created. Labels included annotation of seven surgical phases with 250 phase transitions, 5514 occurrences of four surgical actions, 6980 occurrences of 21 surgical instruments from seven instrument categories and 495 skill classifications in five skill dimensions. The dataset was used in the 2019 Endoscopic Vision challenge, sub-challenge for surgical workflow and skill analysis. Here, 12 teams submitted their machine learning algorithms for recognition of phase, action, instrument and/or skill assessment. RESULTS: F1-scores were achieved for phase recognition between 23.9% and 67.7% (n=9 teams), for instrument presence detection between 38.5% and 63.8% (n=8 teams), but for action recognition only between 21.8% and 23.3% (n=5 teams). The average absolute error for skill assessment was 0.78 (n=1 team). CONCLUSION: Surgical workflow and skill analysis are promising technologies to support the surgical team, but are not solved yet, as shown by our comparison of algorithms. This novel benchmark can be used for comparable evaluation and validation of future work.

Submitted 30 September, 2021; originally announced September 2021.

arXiv:2108.01856 [pdf, other]

doi 10.1103/PhysRevA.105.023313

Improved characterization of Feshbach resonances and interaction potentials between $^{23}$Na and $^{87}$Rb atoms

Authors: Zhichao Guo, Fan Jia, Bing Zhu, Lintao Li, Jeremy M. Hutson, Dajun Wang

Abstract: The ultracold mixture of $^{23}$Na and $^{87}$Rb atoms has become an important system for investigating physics in Bose-Bose atomic mixtures and for forming ultracold ground-state polar molecules. In this work, we provide an improved characterization of the most commonly used Feshbach resonance near 347.64 G between $^{23}$Na and $^{87}$Rb in their absolute ground states. We form Feshbach molecules using this resonance and measure their binding energies by dissociating them via magnetic field modulation. We use the binding energies to refine the singlet and triplet potential energy curves, using coupled-channel bound-state calculations. We then use coupled-channel scattering calculations on the resulting potentials to produce a high-precision mapping between magnetic field and scattering length. We also observe 10 additional $s$-wave Feshbach resonances for $^{23}$Na and $^{87}$Rb in different combinations of Zeeman sublevels of the $F = 1$ hyperfine states. Some of the resonances show 2-body inelastic decay due to spin exchange. We compare the resonance properties with coupled-channel scattering calculations that fully take account of inelastic properties.

Submitted 19 January, 2022; v1 submitted 4 August, 2021; originally announced August 2021.

Comments: 10 pages, 6 figures

arXiv:2107.10126 [pdf]

doi 10.1038/s41524-022-00815-6

Prediction of protected band edge states and dielectric tunable quasiparticle and excitonic properties of monolayer MoSi$_2$N$_4$

Authors: Yabei Wu, Zhao Tang, Weiyi Xia, Weiwei Gao, Fanhao Jia, Yubo Zhang, Wenguang Zhu, Wenqing Zhang, Peihong Zhang

Abstract: The electronic structure of two-dimensional (2D) materials is inherently prone to environmental perturbations, which may pose significant challenges to their applications in electronic or optoelectronic devices. A 2D material couples with its environment through two mechanisms: local chemical coupling and nonlocal dielectric screening effects. The local chemical coupling is often difficult to predict or control experimentally. Nonlocal dielectric screening, on the other hand, can be tuned by choosing the substrates or layer thickness in a controllable manner. Therefore, a compelling 2D electronic material should offer band edge states that are robust against local chemical coupling effects. Here it is demonstrated that the recently synthesized MoSi$_2$N$_4$ is an ideal 2D semiconductor with robust band edge states protected from capricious environmental chemical coupling effects. Detailed many-body perturbation theory calculations are carried out to illustrate how the band edge states of MoSi$_2$N$_4$ are shielded from the direct chemical coupling effects, but its quasiparticle and excitonic properties can be modulated through the nonlocal dielectric screening effects. This unique property, together with the moderate band gap and the thermodynamic and mechanical stability of this material, paves the way for a range of applications of MoSi$_2$N$_4$ in areas including energy, 2D electronics, and optoelectronics.

Submitted 16 June, 2022; v1 submitted 21 July, 2021; originally announced July 2021.

Journal ref: npj Computational Materials 8, 129 (2022)

arXiv:2106.05735 [pdf, other]

doi 10.1038/s41467-022-30695-9

The Medical Segmentation Decathlon

Authors: Michela Antonelli, Annika Reinke, Spyridon Bakas, Keyvan Farahani, Annette Kopp-Schneider, Bennett A. Landman, Geert Litjens, Bjoern Menze, Olaf Ronneberger, Ronald M. Summers, Bram van Ginneken, Michel Bilello, Patrick Bilic, Patrick F. Christ, Richard K. G. Do, Marc J. Gollub, Stephan H. Heckers, Henkjan Huisman, William R. Jarnagin, Maureen K. McHugo, Sandy Napel, Jennifer S. Goli Pernicka, Kawal Rhode, Catalina Tobon-Gomez, Eugene Vorontsov, et al. (34 additional authors not shown)

Abstract: International challenges have become the de facto standard for comparative assessment of image analysis algorithms given a specific task. Segmentation is so far the most widely investigated medical image processing task, but the various segmentation challenges have typically been organized in isolation, such that algorithm development was driven by the need to tackle a single specific clinical problem. We hypothesized that a method capable of performing well on multiple tasks will generalize well to a previously unseen task and potentially outperform a custom-designed solution. To investigate the hypothesis, we organized the Medical Segmentation Decathlon (MSD) - a biomedical image analysis challenge, in which algorithms compete in a multitude of both tasks and modalities. The underlying data set was designed to explore the axis of difficulties typically encountered when dealing with medical images, such as small data sets, unbalanced labels, multi-site data and small objects. The MSD challenge confirmed that algorithms with consistently good performance on a set of tasks preserved their good average performance on a different set of previously unseen tasks. Moreover, by monitoring the MSD winner for two years, we found that this algorithm continued generalizing well to a wide range of other clinical problems, further confirming our hypothesis. Three main conclusions can be drawn from this study: (1) state-of-the-art image segmentation algorithms are mature, accurate, and generalize well when retrained on unseen tasks; (2) consistent algorithmic performance across multiple tasks is a strong surrogate of algorithmic generalizability; (3) the training of accurate AI segmentation models is now commoditized to non-AI experts.

Submitted 10 June, 2021; originally announced June 2021.

MSC Class: 68T07

arXiv:2106.04663 [pdf, other]

Solving Structured Hierarchical Games Using Differential Backward Induction

Authors: Zun Li, Feiran Jia, Aditya Mate, Shahin Jabbari, Mithun Chakraborty, Milind Tambe, Yevgeniy Vorobeychik

Abstract: From large-scale organizations to decentralized political systems, hierarchical strategic decision making is commonplace. We introduce a novel class of structured hierarchical games (SHGs) that formally capture such hierarchical strategic interactions. In an SHG, each player is a node in a tree, and strategic choices of players are sequenced from root to leaves, with root moving first, followed by its children, then followed by their children, and so on until the leaves. A player's utility in an SHG depends on its own decision, and on the choices of its parent and all the tree leaves. SHGs thus generalize simultaneous-move games, as well as Stackelberg games with many followers. We leverage the structure of both the sequence of player moves as well as payoff dependence to develop a gradient-based back propagation-style algorithm, which we call Differential Backward Induction (DBI), for approximating equilibria of SHGs. We provide a sufficient condition for convergence of DBI and demonstrate its efficacy in finding approximate equilibrium solutions to several SHG models of hierarchical policy-making problems.

Submitted 27 June, 2022; v1 submitted 8 June, 2021; originally announced June 2021.

Comments: The short version of this paper appears in the proceedings of UAI-22

arXiv:2105.14765 [pdf]

doi 10.1039/D1TC02238E

Two-dimensional Ferroelectric Ferromagnetic Half Semiconductor in VOF monolayer

Authors: Shaowen Xu, Fanhao Jia, Guodong Zhao, Wei Wu, Wei Ren

Abstract: Two-dimensional (2D) multiferroics have attracted great attention owing to their promising prospects for miniaturized electronic and memory devices. Here, we propose a highly stable 2D multiferroic, VOF monolayer, which is an intrinsic ferromagnetic half semiconductor with large spin polarization ~2 $μ_{B}$/V atom and a significant uniaxial magnetic anisotropy along the a-axis (410 $μ$eV/V atom). Meanwhile, it shows excellent ferroelectricity with a large spontaneous polarization of 32.7 $μ$C/cm$^{2}$ and a moderate energy barrier (~43 meV/atom) between two ferroelectric states, which can be ascribed to the Jahn-Teller distortion. Moreover, VOF monolayer harbors an ultra-large negative Poisson's ratio in the in-plane direction (~-0.34). The Curie temperature evaluated from the Monte Carlo simulations based on the Ising model is about 215 K, which can be enhanced to room temperature under -4% compressive biaxial strain. The combination of ferromagnetism and ferroelectricity in the VOF monolayer could provide a promising platform for future study of multiferroic effects and next-generation multifunctional nanoelectronic device applications.

Submitted 31 May, 2021; originally announced May 2021.

Journal ref: J. Mater. Chem. C, 2021

arXiv:2105.14757 [pdf]

doi 10.1039/D1TC04705A

Predicting Intrinsic Antiferromagnetic and Ferroelastic MnF4 monolayer with Controllable Magnetization

Authors: Shaowen Xu, Fanhao Jia, Xuli Cheng, Wei Ren

Abstract: Two-dimensional (2D) multiferroic materials with controllable magnetism have promising prospects in miniaturized quantum device applications, such as high-density data storage and spintronic devices. Here, using first-principles calculations, we propose a coexistence of antiferromagnetism and ferroelasticity in multiferroic MnF$_4$ monolayer. The MnF$_4$ monolayer is found to be an intrinsic wide-gap semiconductor with large spin polarization ~3 $μ_{B}$/Mn, in which the antiferromagnetic order originates from the cooperation and competition of the direct exchange and super exchange. MnF$_4$ monolayer is also characterized by strongly uniaxial magnetic anisotropic behavior that can be manipulated by the reversible ferroelastic strain and carrier doping. Remarkably, the carrier doping not only leads to an antiferromagnetic to ferromagnetic phase transformation, but also could switch the easy magnetization axis between the in-plane and out-of-plane directions. In addition, the Néel temperature was evaluated to be about 140 K from the Monte Carlo simulations based on the Heisenberg model. The combination of antiferromagnetic and ferroelastic properties in MnF$_4$ monolayer provides a promising platform for studying the magnetoelastic effects, and brings about new concepts for next-generation nonvolatile memory and multi-stage storage.

Submitted 31 May, 2021; originally announced May 2021.

Journal ref: J. Mater. Chem. C, 2021

arXiv:2105.04201 [pdf, other]

REPT: Bridging Language Models and Machine Reading Comprehension via Retrieval-Based Pre-training

Authors: Fangkai Jiao, Yangyang Guo, Yilin Niu, Feng Ji, Feng-Lin Li, Liqiang Nie

Abstract: Pre-trained Language Models (PLMs) have achieved great success on Machine Reading Comprehension (MRC) over the past few years. Although the general language representation learned from large-scale corpora does benefit MRC, the poor support in evidence extraction which requires reasoning across multiple sentences hinders PLMs from further advancing MRC. To bridge the gap between general PLMs and MRC, we present REPT, a REtrieval-based Pre-Training approach. In particular, we introduce two self-supervised tasks to strengthen evidence extraction during pre-training, which is further inherited by downstream MRC tasks through the consistent retrieval operation and model architecture. To evaluate our proposed method, we conduct extensive experiments on five MRC datasets that require collecting evidence from and reasoning across multiple sentences. Experimental results demonstrate the effectiveness of our pre-training approach. Moreover, further analysis shows that our approach is able to enhance the capacity of evidence extraction without explicit supervision.

Submitted 17 May, 2021; v1 submitted 10 May, 2021; originally announced May 2021.

Comments: 14 pages, 3 figures, Findings of ACL 2021

arXiv:2105.01277 [pdf]

doi 10.1103/PhysRevResearch.3.033247

Lee-Huang-Yang effects in the ultracold mixture of $^{23}$Na and $^{87}$Rb with attractive interspecies interactions

Authors: Zhichao Guo, Fan Jia, Lintao Li, Yinfeng Ma, Jeremy M. Hutson, Xiaoling Cui, Dajun Wang

Abstract: The beyond-mean-field Lee-Huang-Yang (LHY) correction is ubiquitous in dilute ultracold quantum gases. However, its effects are often elusive due to the typically much larger influence of the mean-field energy. In this work, we study an ultracold mixture of $^{23}$Na and $^{87}$Rb with tunable attractive interspecies interactions. The LHY effects manifest in the formation of self-bound quantum liquid droplets and the expansion dynamics of the gas-phase sample. A liquid-to-gas phase diagram is obtained by measuring the critical atom numbers below which the self-bound behavior disappears. In stark contrast to trapped gas-phase condensates, the gas-phase mixture formed following the liquid-to-gas phase transition shows an anomalous expansion featuring a larger release energy for increasing mean-field attractions.

Submitted 19 January, 2022; v1 submitted 3 May, 2021; originally announced May 2021.

Comments: 8 pages, 7 figures

Journal ref: Phys. Rev. Research 3, 033247 (2021)

arXiv:2104.07331 [pdf]

doi 10.1103/PhysRevB.103.165107

Inherited Weak Topological Insulator Signatures in Topological Hourglass Semimetal Nb3XTe6 (X = Si, Ge)

Authors: Q. Wan, T. Y. Yang, S. Li, M. Yang, Z. Zhu, C. L. Wu, C. Peng, S. K. Mo, W. Wu, Z. H. Chen, Y. B. Huang, L. L. Lev, V. N. Strocov, J. Hu, Z. Q. Mao, Hao Zheng, J. F. Jia, Y. G. Shi, Shengyuan A. Yang, N. Xu

Abstract: Using spin-resolved and angle-resolved photoemission spectroscopy and first-principles calculations, we have identified bulk band inversion and spin polarized surface state evolved from a weak topological insulator (TI) phase in van der Waals materials Nb3XTe6 (X = Si, Ge). The fingerprints of weak TI homologically emerge with hourglass fermions, as multi nodal chains composed by the same pair of valence and conduction bands gapped by spin orbit coupling. The novel topological state, with a pair of valence and conduction bands encoding both weak TI and hourglass semimetal nature, is essential and guaranteed by nonsymmorphic symmetry. It is distinct from TIs studied previously based on band inversions without symmetry protections.

Submitted 15 April, 2021; originally announced April 2021.

Comments: 4 figures

Journal ref: Phys. Rev. B 103, 165107 (2021)

arXiv:2103.04439 [pdf, other]

Adaptive Agent Architecture for Real-time Human-Agent Teaming

Authors: Tianwei Ni, Huao Li, Siddharth Agrawal, Suhas Raja, Fan Jia, Yikang Gui, Dana Hughes, Michael Lewis, Katia Sycara

Abstract: Teamwork is a set of interrelated reasoning, actions and behaviors of team members that facilitate common objectives. Teamwork theory and experiments have resulted in a set of states and processes for team effectiveness in both human-human and agent-agent teams. However, human-agent teaming is less well studied because it is so new and involves asymmetry in policy and intent not present in human teams. To optimize team performance in human-agent teaming, it is critical that agents infer human intent and adapt their policies for smooth coordination. Most literature in human-agent teaming builds agents referencing a learned human model. Though these agents are guaranteed to perform well with the learned model, they lay heavy assumptions on human policy such as optimality and consistency, which is unlikely in many real-world scenarios. In this paper, we propose a novel adaptive agent architecture in a human-model-free setting on a two-player cooperative game, namely Team Space Fortress (TSF). Previous human-human team research has shown complementary policies in the TSF game and diversity in human players' skill, which encourages us to relax the assumptions on human policy. Therefore, we discard learning human models from human data, and instead use an adaptation strategy on a pre-trained library of exemplar policies composed of RL algorithms or rule-based methods with minimal assumptions of human behavior. The adaptation strategy relies on a novel similarity metric to infer human policy and then selects the most complementary policy in our library to maximize the team performance. The adaptive agent architecture can be deployed in real-time and generalize to any off-the-shelf static agents. We conducted human-agent experiments to evaluate the proposed adaptive agent framework, and demonstrated the suboptimality, diversity, and adaptability of human policies in human-agent teams.

Submitted 7 March, 2021; originally announced March 2021.

Comments: The first three authors contributed equally. In AAAI 2021 Workshop on Plan, Activity, and Intent Recognition

arXiv:2102.10646 [pdf, other]

A Game-Theoretic Approach for Hierarchical Epidemic Control

Authors: Feiran Jia, Aditya Mate, Zun Li, Shahin Jabbari, Mithun Chakraborty, Milind Tambe, Michael Wellman, Yevgeniy Vorobeychik

Abstract: We design and analyze a multi-level game-theoretic model of hierarchical policy interventions for epidemic control, such as those in response to the COVID-19 pandemic. Our model captures the potentially mismatched priorities among a hierarchy of policy-makers (e.g., federal, state, and local governments) with respect to two cost components that have opposite dependence on the policy strength -- post-intervention infection rates and the socio-economic cost of policy implementation. Additionally, our model includes a crucial third factor in decisions: a cost of non-compliance with the policy-maker immediately above in the hierarchy, such as non-compliance of counties with state-level policies. We propose two novel algorithms for approximating solutions to such games. The first is based on best response dynamics (BRD), and exploits the tree structure of the game. The second combines quadratic integer programming (QIP), which enables us to collapse the two lowest levels of the game, with best response dynamics. Through extensive experiments, we show that our QIP-based approach significantly outperforms the BRD algorithm both in running time and the quality of equilibrium solutions. Finally, we apply the QIP-based algorithm to experiments based on both synthetic and real-world data under various parameter configurations and analyze the resulting (approximate) equilibria to gain insight into the impact of decentralization on overall welfare (measured as the negative sum of costs) as well as emergent properties like free-riding and fairness in cost distribution among policy-makers.

Submitted 3 August, 2022; v1 submitted 21 February, 2021; originally announced February 2021.

arXiv:2101.06948 [pdf, ps, other]

doi 10.1109/TVT.2021.3068774

Improving Physical Layer Security for Reconfigurable Intelligent Surface aided NOMA 6G Networks

Authors: Zhe Zhang, Chensi Zhang, Chengjun Jiang, Fan Jia, Jianhua Ge, Fengkui Gong

Abstract: The intrinsic integration of the nonorthogonal multiple access (NOMA) and reconfigurable intelligent surface (RIS) techniques is envisioned to be a promising approach to significantly improve both the spectrum efficiency and energy efficiency for future wireless communication networks. In this paper, the physical layer security (PLS) for RIS-aided NOMA 6G networks is investigated, in which a RIS is deployed to assist the two "dead zone" NOMA users and both internal and external eavesdropping are considered. For the scenario with only internal eavesdropping, we consider the worst case that the near-end user is untrusted and may try to intercept the information of the far-end user. A joint beamforming and power allocation sub-optimal scheme is proposed to improve the system PLS. Then we extend our work to a scenario with both internal and external eavesdropping. Two sub-scenarios are considered in this scenario: one is the sub-scenario without channel state information (CSI) of eavesdroppers, and the other is the sub-scenario where the eavesdroppers' CSI is available. For both sub-scenarios, a noise beamforming scheme is introduced to counter the external eavesdroppers. An optimal power allocation scheme is proposed to further improve the system physical security for the second sub-scenario. Simulation results show the superior performance of the proposed schemes. Moreover, it has also been shown that increasing the number of reflecting elements can bring more gain in secrecy performance than increasing the number of transmit antennas.

Submitted 18 January, 2021; originally announced January 2021.

arXiv:2101.01133 [pdf, other]

Stereo Correspondence and Reconstruction of Endoscopic Data Challenge

Authors: Max Allan, Jonathan Mcleod, Congcong Wang, Jean Claude Rosenthal, Zhenglei Hu, Niklas Gard, Peter Eisert, Ke Xue Fu, Trevor Zeffiro, Wenyao Xia, Zhanshi Zhu, Huoling Luo, Fucang Jia, Xiran Zhang, Xiaohong Li, Lalith Sharan, Tom Kurmann, Sebastian Schmid, Raphael Sznitman, Dimitris Psychogyios, Mahdi Azizian, Danail Stoyanov, Lena Maier-Hein, Stefanie Speidel

Abstract: The stereo correspondence and reconstruction of endoscopic data sub-challenge was organized during the Endovis challenge at MICCAI 2019 in Shenzhen, China. The task was to perform dense depth estimation using 7 training datasets and 2 test sets of structured light data captured using porcine cadavers. These were provided by a team at Intuitive Surgical. 10 teams participated in the challenge day. This paper contains 3 additional methods which were submitted after the challenge finished as well as a supplemental section from these teams on issues they found with the dataset.

Submitted 28 January, 2021; v1 submitted 4 January, 2021; originally announced January 2021.

Showing 51–100 of 148 results for author: Jia, F