Search | arXiv e-print repository

Many-body adiabatic passage: Instability, chaos, and quantum classical correspondence

Authors: Anant Vijay Varma, Amichay Vardi, Doron Cohen

Abstract: Adiabatic passage in systems of interacting bosons is substantially affected by interactions and inter-particle entanglement. We consider STIRAP-like schemes in Bose-Hubbard chains that exhibit low-dimensional chaos (a 3 site chain), and high-dimensional chaos (more than 3 sites). The dynamics that is generated by a transfer protocol exhibits striking classical and quantum chaos fingerprints that are manifest in the mean-field classical treatment, in the truncated-Wigner semiclassical treatment, and in the full many-body quantum simulations.

Submitted 2 September, 2024; originally announced September 2024.

Comments: 13 pages, 14 figures

arXiv:2409.00949 [pdf, other]

Stability of multiplexed NCS based on an epsilon-greedy algorithm for communication selection

Authors: Harsh Oza, Irinel-Constantin Morarescu, Vineeth S. Varma, Ravi Banavar

Abstract: In this letter, we study a Networked Control System (NCS) with multiplexed communication and Bernoulli packet drops. Multiplexed communication refers to the constraint that transmission of a control signal and an observation signal cannot occur simultaneously due to the limited bandwidth. First, we propose an epsilon-greedy algorithm for the selection of the communication sequence that also ensures Mean Square Stability (MSS). We formulate the system as a Markovian Jump Linear System (MJLS) and provide the necessary conditions for MSS in terms of Linear Matrix Inequalities (LMIs) that need to be satisfied for three corner cases. We prove that the system is MSS for any convex combination of these three corner cases. Furthermore, we propose to use the epsilon-greedy algorithm with the epsilon that satisfies MSS conditions for training a Deep Q Network (DQN). The DQN is used to obtain an optimal communication sequence that minimizes a quadratic cost. We validate our approach with a numerical example that shows the efficacy of our method in comparison to the round-robin and a random scheme.
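The epsilon-greedy selection mentioned in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the action labels, the `q_values` dictionary, and the function name `select_transmission` are hypothetical, and the sketch shows only the generic epsilon-greedy rule (explore uniformly with probability epsilon, otherwise take the action with the highest estimated value). The MSS conditions that the letter places on epsilon are not modeled here.

```python
import random

# Hypothetical action labels: at each step the multiplexed channel carries
# either the control signal or the observation signal, never both.
ACTIONS = ("control", "observation")


def select_transmission(q_values, epsilon, rng=random):
    """Generic epsilon-greedy choice over ACTIONS.

    q_values: dict mapping each action to an estimated value (assumed input,
              e.g. produced by some learned value function).
    epsilon:  exploration probability in [0, 1].
    rng:      object providing random() and choice(), defaults to `random`.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    if rng.random() < epsilon:
        # Explore: pick one of the two transmissions uniformly at random.
        return rng.choice(ACTIONS)
    # Exploit: pick the transmission with the highest estimated value.
    return max(ACTIONS, key=lambda action: q_values[action])
```

Setting `epsilon=0.0` makes the choice purely greedy, while `epsilon=1.0` makes it purely random; intermediate values trade off the two.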

Submitted 2 September, 2024; originally announced September 2024.

Comments: A preliminary version of this article has been submitted to IEEE Control Systems articles

arXiv:2408.17345 [pdf, ps, other]

Dimensional confinement and superdiffusive rotational motion of uniaxial colloids in the presence of cylindrical obstacles

Authors: Vikki Anand Varma, Sujin B Babu

Abstract: In biological systems such as cells, macromolecules, which are anisotropic particles, diffuse in a crowded medium. In the present work we have studied the diffusion of spheroidal particles between cylindrical obstacles by varying the density of the obstacles as well as of the spheroidal particles. Analytical calculation of the free energy showed that the orientational vector of a single oblate particle will be aligned perpendicular and a prolate particle will be aligned parallel to the symmetry axis of the cylindrical obstacles in equilibrium. The nematic transition of the system with and without obstacles remained the same, but in the case of obstacles the nematic vector of the spheroid system always remained parallel to the cylindrical axis. The component of the translational diffusion coefficient of the spheroidal particle perpendicular to the axis of the cylinder is calculated for the isotropic system and agrees with the analytical calculation. When the cylinders overlap such that the spheroidal particles can only diffuse along the direction parallel to the axis of the cylinder we could observe dimensional confinement. This was observed by the discontinuous fall of the diffusion coefficient, when plotted against the chemical potential both for a single particle as well as for finite volume fraction. The rotational diffusion coefficient quickly reached the bulk value as the distance between the obstacles increased in the isotropic phase. In the nematic phase the rotational motion of the spheroid should be arrested. We observed that even though the entire system remained in the nematic phase the oblate particle close to the cylinder underwent flipping motion. The consequence is that when the rotational mean squared displacement was calculated it showed a super-diffusive behavior even though the orientational self-correlation function never relaxed to zero.

Submitted 30 August, 2024; originally announced August 2024.

Comments: 13 pages, 14 figures

arXiv:2408.05300 [pdf, other]

High-Precision Ringdown Surrogate Model for Non-Precessing Binary Black Holes

Authors: Lorena Magaña Zertuche, Leo C. Stein, Keefe Mitman, Scott E. Field, Vijay Varma, Michael Boyle, Nils Deppe, Lawrence E. Kidder, Jordan Moxon, Harald P. Pfeiffer, Mark A. Scheel, Kyle C. Nelli, William Throwe, Nils L. Vu

Abstract: Highly precise and robust waveform models are required as improvements in detector sensitivity enable us to test general relativity with more precision than ever before. In this work, we introduce a spin-aligned surrogate ringdown model. This ringdown surrogate, NRSur3dq8_RD, is built with numerical waveforms produced using Cauchy-characteristic evolution. In addition, these waveforms are in the superrest frame of the remnant black hole allowing us to do a correct analysis of the ringdown spectrum. The novel prediction of our surrogate model is complex-valued quasinormal mode (QNM) amplitudes, with median relative errors of $10^{-2}-10^{-3}$ over the parameter space. Like previous remnant surrogates, we also predict the remnant black hole's mass and spin. The QNM mode amplitude errors translate into median errors on ringdown waveforms of $10^{-4}$. The high accuracy and QNM mode content provided by our surrogate will enable high-precision ringdown analyses such as tests of general relativity. Our ringdown model is publicly available through the python package surfinBH.

Submitted 9 August, 2024; originally announced August 2024.

Comments: 11+2 pages, 13 figures, 1 table. This new model is publicly available through surfinBH https://pypi.org/project/surfinBH/

arXiv:2408.05147 [pdf, other]

Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2

Authors: Tom Lieberum, Senthooran Rajamanoharan, Arthur Conmy, Lewis Smith, Nicolas Sonnerat, Vikrant Varma, János Kramár, Anca Dragan, Rohin Shah, Neel Nanda

Abstract: Sparse autoencoders (SAEs) are an unsupervised method for learning a sparse decomposition of a neural network's latent representations into seemingly interpretable features. Despite recent excitement about their potential, research applications outside of industry are limited by the high cost of training a comprehensive suite of SAEs. In this work, we introduce Gemma Scope, an open suite of JumpReLU SAEs trained on all layers and sub-layers of Gemma 2 2B and 9B and select layers of Gemma 2 27B base models. We primarily train SAEs on the Gemma 2 pre-trained models, but additionally release SAEs trained on instruction-tuned Gemma 2 9B for comparison. We evaluate the quality of each SAE on standard metrics and release these results. We hope that by releasing these SAE weights, we can help make more ambitious safety and interpretability research easier for the community. Weights and a tutorial can be found at https://huggingface.co/google/gemma-scope and an interactive demo can be found at https://www.neuronpedia.org/gemma-scope

Submitted 19 August, 2024; v1 submitted 9 August, 2024; originally announced August 2024.

Comments: 12 main text pages, and 14 pages of acknowledgements, references and appendices

arXiv:2408.02384 [pdf, other]

Strategic Federated Learning: Application to Smart Meter Data Clustering

Authors: Hassan Mohamad, Chao Zhang, Samson Lasaulce, Vineeth S Varma, Mérouane Debbah, Mounir Ghogho

Abstract: Federated learning (FL) involves several clients that share with a fusion center (FC), the model each client has trained with its own data. Conventional FL, which can be interpreted as an estimation or distortion-based approach, ignores the final use of model information (MI) by the FC and the other clients. In this paper, we introduce a novel FL framework in which the FC uses an aggregate version of the MI to make decisions that affect the client's utility functions. Clients cannot choose the decisions and can only use the MI reported to the FC to maximize their utility. Depending on the alignment between the client and FC utilities, the client may have an individual interest in adding strategic noise to the model. This general framework is stated and specialized to the case of clustering, in which noisy cluster representative information is reported. This is applied to the problem of power consumption scheduling. In this context, utility non-alignment occurs, for instance, when the client wants to consume when the price of electricity is low, whereas the FC wants the consumption to occur when the total power is the lowest. This is illustrated with aggregated real data from Ausgrid \cite{ausgrid}. Our numerical analysis clearly shows that the client can increase his utility by adding noise to the model reported to the FC. Corresponding results and source codes can be downloaded from \cite{source-code}.

Submitted 5 August, 2024; originally announced August 2024.

arXiv:2407.18319 [pdf, other]

Gravitational wave surrogate model for spinning, intermediate mass ratio binaries based on perturbation theory and numerical relativity

Authors: Katie Rink, Ritesh Bachhar, Tousif Islam, Nur E. M. Rifat, Kevin Gonzalez-Quesada, Scott E. Field, Gaurav Khanna, Scott A. Hughes, Vijay Varma

Abstract: We present BHPTNRSur2dq1e3, a reduced order surrogate model of gravitational waves emitted from binary black hole (BBH) systems in the comparable to large mass ratio regime with aligned spin ($χ_1$) on the heavier mass ($m_1$). We trained this model on waveform data generated from point particle black hole perturbation theory (ppBHPT) with mass ratios varying from $3 \leq q \leq 1000$ and spins from $-0.8 \leq χ_1 \leq 0.8$. The waveforms are $13,500 \ m_1$ long and include all spin-weighted spherical harmonic modes up to $\ell = 4$ except the $(4,1)$ and $m = 0$ modes. We find that for binaries with $χ_1 \lesssim -0.5$, retrograde quasi-normal modes are significantly excited, thereby complicating the modeling process. To overcome this issue, we introduce a domain decomposition approach to model the inspiral and merger-ringdown portion of the signal separately. The resulting model can faithfully reproduce ppBHPT waveforms with a median time-domain mismatch error of $8 \times 10^{-5}$. We then calibrate our model with numerical relativity (NR) data in the comparable mass regime $(3 \leq q \leq 10)$. By comparing with spin-aligned BBH NR simulations at $q = 15$, we find that the dominant quadrupolar (subdominant) modes agree to better than $\approx 10^{-3} \ (\approx 10^{-2})$ when using a time-domain mismatch error, where the largest source of calibration error comes from the transition-to-plunge and ringdown approximations of perturbation theory. Mismatch errors are below $\approx 10^{-2}$ for systems with mass ratios between $6 \leq q \leq 15$ and typically get smaller at larger mass ratio. Our two models - both the ppBHPT waveform model and the NR-calibrated ppBHPT model - will be publicly available through gwsurrogate and the Black Hole Perturbation Toolkit packages.

Submitted 25 July, 2024; originally announced July 2024.

Comments: 20 pages, 17 figures

arXiv:2407.15544 [pdf, other]

Shell mergers in the late stages of massive star evolution: new insight from 3D hydrodynamic simulations

Authors: Federico Rizzuti, Raphael Hirschi, Vishnu Varma, William David Arnett, Cyril Georgy, Casey Meakin, Miroslav Mocák, Alexander St. John Murphy, Thomas Rauscher

Abstract: One-dimensional (1D) stellar evolution models are widely used across various astrophysical fields, however they are still dominated by important uncertainties that deeply affect their predictive power. Among those, the merging of independent convective regions is a poorly understood phenomenon predicted by some 1D models but whose occurrence and impact in real stars remain very uncertain. Being an intrinsically multi-D phenomenon, it is challenging to predict the exact behaviour of shell mergers with 1D models. In this work, we conduct a detailed investigation of a multiple shell merging event in a 20 M$_\odot$ star using 3D hydrodynamic simulations. Making use of the active tracers for composition and the nuclear network included in the 3D model, we study the merging not only from a dynamical standpoint but also considering its nucleosynthesis and energy generation. Our simulations confirm the occurrence of the merging also in 3D, but reveal significant differences from the 1D case. Specifically, we identify entrainment and the erosion of stable regions as the main mechanisms that drive the merging, we predict much faster convective velocities compared to the mixing-length-theory velocities, and observe multiple burning phases within the same merged shell, with important effects for the chemical composition of the star, which presents a strongly asymmetric (dipolar) distribution. We expect that these differences will have important effects on the final structure of massive stars and thus their final collapse dynamics and possible supernova explosion, subsequently affecting the resulting nucleosynthesis and remnant.

Submitted 22 July, 2024; originally announced July 2024.

Comments: 18 pages, 16 figures. Accepted for publication in MNRAS

arXiv:2407.14435 [pdf, other]

Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders

Authors: Senthooran Rajamanoharan, Tom Lieberum, Nicolas Sonnerat, Arthur Conmy, Vikrant Varma, János Kramár, Neel Nanda

Abstract: Sparse autoencoders (SAEs) are a promising unsupervised approach for identifying causally relevant and interpretable linear features in a language model's (LM) activations. To be useful for downstream tasks, SAEs need to decompose LM activations faithfully; yet to be interpretable the decomposition must be sparse -- two objectives that are in tension. In this paper, we introduce JumpReLU SAEs, which achieve state-of-the-art reconstruction fidelity at a given sparsity level on Gemma 2 9B activations, compared to other recent advances such as Gated and TopK SAEs. We also show that this improvement does not come at the cost of interpretability through manual and automated interpretability studies. JumpReLU SAEs are a simple modification of vanilla (ReLU) SAEs -- where we replace the ReLU with a discontinuous JumpReLU activation function -- and are similarly efficient to train and run. By utilising straight-through-estimators (STEs) in a principled manner, we show how it is possible to train JumpReLU SAEs effectively despite the discontinuous JumpReLU function introduced in the SAE's forward pass. Similarly, we use STEs to directly train L0 to be sparse, instead of training on proxies such as L1, avoiding problems like shrinkage.
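As a rough illustration of the activation named in this abstract, the sketch below assumes the definition JumpReLU(z) = z * H(z - theta), where H is the Heaviside step function and theta is a positive per-feature threshold. The function name `jump_relu`, the plain-list interface, and the choice of a strict inequality at z = theta are assumptions made for this example; the straight-through-estimator training described by the authors is not shown.

```python
def jump_relu(pre_activations, thresholds):
    """Elementwise z * H(z - theta) on plain Python lists.

    pre_activations: list of floats z_i (encoder pre-activations).
    thresholds:      list of floats theta_i, one per feature (assumed > 0).
    Values at or below their threshold are zeroed; values above it pass
    through unchanged, which is what makes the function discontinuous.
    """
    if len(pre_activations) != len(thresholds):
        raise ValueError("inputs must have the same length")
    return [z if z > theta else 0.0
            for z, theta in zip(pre_activations, thresholds)]


def relu(pre_activations):
    """Ordinary ReLU, included only for comparison."""
    return [z if z > 0.0 else 0.0 for z in pre_activations]
```

For pre-activations `[0.5, 1.5, -1.0]` and thresholds of 1.0, `relu` keeps the small positive value 0.5 while `jump_relu` zeroes it, leaving only the 1.5 entry active.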

Submitted 1 August, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

Comments: v2: new appendix H comparing kernel functions & bug-fixes to pseudo-code in Appendix J v3: further bug-fix to pseudo-code in Appendix J

arXiv:2407.12867 [pdf, other]

Swift-BAT GUANO follow-up of gravitational-wave triggers in the third LIGO-Virgo-KAGRA observing run

Authors: Gayathri Raman, Samuele Ronchini, James Delaunay, Aaron Tohuvavohu, Jamie A. Kennea, Tyler Parsotan, Elena Ambrosi, Maria Grazia Bernardini, Sergio Campana, Giancarlo Cusumano, Antonino D'Ai, Paolo D'Avanzo, Valerio D'Elia, Massimiliano De Pasquale, Simone Dichiara, Phil Evans, Dieter Hartmann, Paul Kuin, Andrea Melandri, Paul O'Brien, Julian P. Osborne, Kim Page, David M. Palmer, Boris Sbarufatti, Gianpiero Tagliaferri, et al. (1797 additional authors not shown)

Abstract: We present results from a search for X-ray/gamma-ray counterparts of gravitational-wave (GW) candidates from the third observing run (O3) of the LIGO-Virgo-KAGRA (LVK) network using the Swift Burst Alert Telescope (Swift-BAT). The search includes 636 GW candidates received in low latency, 86 of which have been confirmed by the offline analysis and included in the third cumulative Gravitational-Wave Transient Catalogs (GWTC-3). Targeted searches were carried out on the entire GW sample using the maximum-likelihood NITRATES pipeline on the BAT data made available via the GUANO infrastructure. We do not detect any significant electromagnetic emission that is temporally and spatially coincident with any of the GW candidates. We report flux upper limits in the 15-350 keV band as a function of sky position for all the catalog candidates. For GW candidates where the Swift-BAT false alarm rate is less than 10$^{-3}$ Hz, we compute the GW-BAT joint false alarm rate. Finally, the derived Swift-BAT upper limits are used to infer constraints on the putative electromagnetic emission associated with binary black hole mergers.

Submitted 13 July, 2024; originally announced July 2024.

Comments: 50 pages, 10 figures, 4 tables

arXiv:2407.11939 [pdf]

AI-Driven Physics-Informed Bio-Silicon Intelligence System: Integrating Hybrid Systems, Biocomputing, Neural Networks, and Machine Learning, for Advanced Neurotechnology

Authors: Vincent Jorgsson, Raghav Kumar, Mustaf Ahmed, Maxx Yung, Aryaman Pattnayak, Sri Pradhyumna Sridhar, Vaishnav Varma, Arun Ram Ponnambalam, Georg Weidlich, Dimitris Pinotsis

Abstract: We present the Bio-Silicon Intelligence System (BSIS), an innovative hybrid platform that integrates biological neural networks with silicon-based computing. The BSIS, a Physics-Informed Hybrid Hierarchical Reinforcement Learning State Machine, employs carbon nanotube-coated electrodes to interface rat brains with computational systems, enabling high-fidelity neural interfacing and bidirectional communication through self-organizing systems in both biological and silicon forms. Our system leverages both analogue and digital AI theory, incorporating concepts from computational theory, chaos theory, dynamical systems theory, physics, and quantum mechanics. Additionally, the BSIS replicates the neuronal dynamics typical of intelligent brain tissue, employing nonlinear operations underlying learning and information storage. Neural signals are read through the FreeEEG32 board and BrainFlow software, then features are extracted and mapped to game actions by tracking feature changes in continuous data. Metadata is encoded into both analogue and digital brain stimulation signals at the microvolt level using our proprietary software and hardware. The system employs a dual signaling approach for training the rat brain, incorporating a reward solution and sound as well as human-inaudible distress sounds. This paper details the design, theory, functionality, and technical specifications of the BSIS, highlighting its interdisciplinary approach and advanced technological integration.

Submitted 31 July, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.05020 [pdf, other]

A game theory analysis of decentralized epidemic management with opinion dynamics

Authors: Olivier Lindamulage De Silva, Samson Lasaulce, Irinel-Constantin Morarescu, Vineeth S. Varma

Abstract: In this paper, we introduce a static game that allows one to numerically assess the loss of efficiency induced by decentralized control or management of a global epidemic. Each player represents a region which is assumed to choose its control to implement a tradeoff between socio-economic aspects and health aspects; the control comprises both epidemic control physical measures and influence actions on the region's opinion. The Generalized Nash equilibrium $(\mathrm{GNE})$ analysis of the proposed game model is conducted. The direct analysis of this game of practical interest is non-trivial, but it turns out that one can construct an auxiliary game which allows one to prove existence and uniqueness, and to compute the GNE and the optimal centralized solution (sum-cost) of the game. These results allow us to assess numerically the loss (measured in terms of Price of Anarchy ($\mathrm{PoA}$)) induced by decentralization with or without taking into account the opinion dynamics.
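For reference, the Price of Anarchy in a cost-minimizing game is commonly defined as the ratio between the social cost at the worst equilibrium and the optimal centralized social cost. The notation below is assumed for illustration and is not taken from the paper: $K$ regions, $J_k$ the cost of region $k$, $u$ the joint control profile, $\mathcal{U}$ the feasible set, and $\mathcal{U}^{\mathrm{GNE}}$ the set of generalized Nash equilibria.

```latex
\mathrm{PoA}
  \;=\;
  \frac{\displaystyle \max_{u \in \mathcal{U}^{\mathrm{GNE}}} \sum_{k=1}^{K} J_k(u)}
       {\displaystyle \min_{u \in \mathcal{U}} \sum_{k=1}^{K} J_k(u)}
  \;\geq\; 1
  \qquad \text{(assuming strictly positive costs)}
```

If the equilibrium is unique, the numerator reduces to the sum-cost evaluated at that single equilibrium, and a value close to 1 indicates that decentralization costs little.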

Submitted 6 July, 2024; originally announced July 2024.

arXiv:2407.00476 [pdf, other]

Large Language Models for Power Scheduling: A User-Centric Approach

Authors: Thomas Mongaillard, Samson Lasaulce, Othman Hicheur, Chao Zhang, Lina Bariah, Vineeth S. Varma, Hang Zou, Qiyang Zhao, Merouane Debbah

Abstract: While traditional optimization and scheduling schemes are designed to meet fixed, predefined system requirements, future systems are moving toward user-driven approaches and personalized services, aiming to achieve high quality-of-experience (QoE) and flexibility. This challenge is particularly pronounced in wireless and digitalized energy networks, where users' requirements have largely not been taken into consideration due to the lack of a common language between users and machines. The emergence of powerful large language models (LLMs) marks a radical departure from traditional system-centric methods into more advanced user-centric approaches by providing a natural communication interface between users and devices. In this paper, for the first time, we introduce a novel architecture for resource scheduling problems by constructing three LLM agents to convert an arbitrary user's voice request (VRQ) into a resource allocation vector. Specifically, we design an LLM intent recognition agent to translate the request into an optimization problem (OP), an LLM OP parameter identification agent, and an LLM OP solving agent. To evaluate system performance, we construct a database of typical VRQs in the context of electric vehicle (EV) charging. As a proof of concept, we primarily use Llama 3 8B. Through testing with different prompt engineering scenarios, the obtained results demonstrate the efficiency of the proposed architecture. The conducted performance analysis allows key insights to be extracted. For instance, having a larger set of candidate OPs to model the real-world problem might degrade the final performance because of a higher recognition/OP classification noise level. All results and codes are open source.

Submitted 19 July, 2024; v1 submitted 29 June, 2024; originally announced July 2024.

arXiv:2406.08428 [pdf, other]

Improving Noise Robustness through Abstractions and its Impact on Machine Learning

Authors: Alfredo Ibias, Karol Capala, Varun Ravi Varma, Anna Drozdz, Jose Sousa

Abstract: Noise is a fundamental problem in learning theory with huge effects on the application of Machine Learning (ML) methods, due to the tendency of real-world data to be noisy. Additionally, the introduction of malicious noise can make ML methods fail critically, as is the case with adversarial attacks. Thus, finding and developing alternatives to improve robustness to noise is a fundamental problem in ML. In this paper, we propose a method to deal with noise: mitigating its effect through the use of data abstractions. The goal is to reduce the effect of noise on the model's performance through the loss of information produced by the abstraction. However, this information loss comes with a cost: it can result in an accuracy reduction due to the missing information. First, we explored multiple methodologies to create abstractions, using the training dataset, for the specific case of numerical data and binary classification tasks. We also tested how these abstractions can affect robustness to noise with several experiments that explore the robustness of an Artificial Neural Network to noise when trained using raw data vs when trained using abstracted data. The results clearly show that using abstractions is a viable approach for developing noise robust ML methods.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.05015 [pdf, other]

Quantum Alternating Operator Ansatz for the Preparation and Detection of Long-Lived Singlet States in NMR

Authors: Pratham Hullamballi, Vishal Varma, T. S. Mahesh

Abstract: Designing efficient and robust quantum control strategies is vital for developing quantum technologies. One recent strategy is the Quantum Alternating Operator Ansatz (QAOA) sequence that alternately propagates under two noncommuting Hamiltonians, whose control parameters can be optimized to generate a gate or prepare a state. Here, we describe the design of QAOA sequences and their variants to prepare long-lived singlet states (LLS) from the thermal state in NMR. With extraordinarily long lifetimes exceeding the spin-lattice relaxation time constant $T_1$, LLS have been of great interest for various applications, from spectroscopy to medical imaging. Accordingly, designing sequences for efficiently preparing LLS in a general spin system is crucial. Using numerical analysis, we study the efficiency and robustness of the QAOA sequences over a wide range of errors in the control parameters. Using a two-qubit NMR register, we conduct an experimental study to benchmark QAOA sequences against other prominent methods of LLS preparation and observe the significantly superior performance of the QAOA sequences.

Submitted 8 August, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

Comments: 9 pages, 8 figures

arXiv:2405.21033 [pdf, other]

3D simulations of convective shell Neon-burning in a massive star

Authors: C. Georgy, F. Rizzuti, R. Hirschi, V. Varma, W. D. Arnett, C. Meakin, M. Mocak, A. StJ. Murphy, T. Rauscher

Abstract: The treatment of convection remains a major weakness in the modelling of stellar evolution with one-dimensional (1D) codes. Ever-increasing computing power now makes it possible to simulate part of a star in 3D for a fraction of its life, allowing us to study the full complexity of convective zones with hydrodynamics codes. Here, we performed state-of-the-art hydrodynamics simulations of turbulence in a neon-burning convective zone, during the late stage of the life of a massive star. We produced a set of simulations varying the resolution of the computing domain (from $128^3$ to $1024^3$ cells) and the efficiency of the nuclear reactions (by boosting the energy generation rate from nominal to a factor of 1000). We analysed our results by means of the Fourier transform of the velocity field, and mean-field decomposition of the various transport equations. Our results are in line with previous studies, showing that the behaviour of the bulk of the convective zone is already well captured at a relatively low resolution ($256^3$), while the details of the convective boundaries require higher resolutions. The different boosting factors used show how various quantities (velocity, buoyancy, abundances, abundance variances) depend on the energy generation rate. We found that for low boosting factors, convective zones are well mixed, validating the approach usually used in 1D stellar evolution codes. However, when nuclear burning and turbulent transport occur on the same timescale, a more sophisticated treatment would be needed. This is typically the case when shell mergers occur.

Submitted 31 May, 2024; originally announced May 2024.

Comments: 17 pages, 20 figures, accepted for publication in MNRAS

arXiv:2405.16129 [pdf, other]

iREL at SemEval-2024 Task 9: Improving Conventional Prompting Methods for Brain Teasers

Authors: Harshit Gupta, Manav Chaudhary, Tathagata Raha, Shivansh Subramanian, Vasudeva Varma

Abstract: This paper describes our approach for SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense. The BRAINTEASER task comprises multiple-choice Question Answering designed to evaluate the models' lateral thinking capabilities. It consists of Sentence Puzzle and Word Puzzle subtasks that require models to defy default common-sense associations and exhibit unconventional thinking. We propose a unique strategy to improve the performance of pre-trained language models, notably the Gemini 1.0 Pro Model, in both subtasks. We employ static and dynamic few-shot prompting techniques and introduce a model-generated reasoning strategy that utilizes the LLM's reasoning capabilities to improve performance. Our approach demonstrated significant improvements, showing that it performed better than the baseline models by a considerable margin but fell short of performing as well as the human annotators, thus highlighting the efficacy of the proposed strategies.

Submitted 25 May, 2024; originally announced May 2024.

arXiv:2405.11192 [pdf, other]

BrainStorm @ iREL at #SMM4H 2024: Leveraging Translation and Topical Embeddings for Annotation Detection in Tweets

Authors: Manav Chaudhary, Harshit Gupta, Vasudeva Varma

Abstract: The proliferation of LLMs in various NLP tasks has sparked debates regarding their reliability, particularly in annotation tasks where biases and hallucinations may arise. In this shared task, we address the challenge of distinguishing annotations made by LLMs from those made by human domain experts in the context of COVID-19 symptom detection from tweets in Latin American Spanish. This paper presents BrainStorm @ iREL's approach to the SMM4H 2024 Shared Task. Leveraging the inherent topical information in tweets, we propose a novel approach to identify and classify annotations, aiming to enhance the trustworthiness of annotated data.

Submitted 20 July, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

Comments: Accepted at SMM4H, colocated at ACL 2024

arXiv:2404.16014 [pdf, other]

Improving Dictionary Learning with Gated Sparse Autoencoders

Authors: Senthooran Rajamanoharan, Arthur Conmy, Lewis Smith, Tom Lieberum, Vikrant Varma, János Kramár, Rohin Shah, Neel Nanda

Abstract: Recent work has found that sparse autoencoders (SAEs) are an effective technique for unsupervised discovery of interpretable features in language models' (LMs) activations, by finding sparse, linear reconstructions of LM activations. We introduce the Gated Sparse Autoencoder (Gated SAE), which achieves a Pareto improvement over training with prevailing methods. In SAEs, the L1 penalty used to encourage sparsity introduces many undesirable biases, such as shrinkage -- systematic underestimation of feature activations. The key insight of Gated SAEs is to separate the functionality of (a) determining which directions to use and (b) estimating the magnitudes of those directions: this enables us to apply the L1 penalty only to the former, limiting the scope of undesirable side effects. Through training SAEs on LMs of up to 7B parameters we find that, in typical hyper-parameter ranges, Gated SAEs solve shrinkage, are similarly interpretable, and require half as many firing features to achieve comparable reconstruction fidelity.

Submitted 30 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

Comments: 15 main text pages, 22 appendix pages

arXiv:2404.06948 [pdf, other]

MetaCheckGPT -- A Multi-task Hallucination Detector Using LLM Uncertainty and Meta-models

Authors: Rahul Mehta, Andrew Hoblitzell, Jack O'Keefe, Hyeju Jang, Vasudeva Varma

Abstract: Hallucinations in large language models (LLMs) have recently become a significant problem. A recent effort in this direction is a shared task at SemEval 2024 Task 6, SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes. This paper describes our winning solution, ranked 1st and 2nd in the two sub-tasks of the model-agnostic and model-aware tracks respectively. We propose a meta-regressor framework of LLMs for model evaluation and integration that achieves the highest scores on the leaderboard. We also experiment with various transformer-based models and black box methods like ChatGPT, Vectara, and others. In addition, we perform an error analysis comparing GPT4 against our best model which shows the limitations of the former.

Submitted 11 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

Comments: Entry for SemEval-2024 Shared Task 6: SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes

MSC Class: 68T07; 68T50 ACM Class: I.2.7

arXiv:2404.04248 [pdf, other]

doi 10.3847/2041-8213/ad5beb

Observation of Gravitational Waves from the Coalescence of a $2.5\text{-}4.5~M_\odot$ Compact Object and a Neutron Star

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, S. Akçay, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, et al. (1771 additional authors not shown)

Abstract: We report the observation of a coalescing compact binary with component masses $2.5\text{-}4.5~M_\odot$ and $1.2\text{-}2.0~M_\odot$ (all measurements quoted at the 90% credible level). The gravitational-wave signal GW230529_181500 was observed during the fourth observing run of the LIGO-Virgo-KAGRA detector network on 2023 May 29 by the LIGO Livingston Observatory. The primary component of the source has a mass less than $5~M_\odot$ at 99% credibility. We cannot definitively determine from gravitational-wave data alone whether either component of the source is a neutron star or a black hole. However, given existing estimates of the maximum neutron star mass, we find the most probable interpretation of the source to be the coalescence of a neutron star with a black hole that has a mass between the most massive neutron stars and the least massive black holes observed in the Galaxy. We provisionally estimate a merger rate density of $55^{+127}_{-47}~\text{Gpc}^{-3}\,\text{yr}^{-1}$ for compact binary coalescences with properties similar to the source of GW230529_181500; assuming that the source is a neutron star-black hole merger, GW230529_181500-like sources constitute about 60% of the total merger rate inferred for neutron star-black hole coalescences. The discovery of this system implies an increase in the expected rate of neutron star-black hole mergers with electromagnetic counterparts and provides further evidence for compact objects existing within the purported lower mass gap.

Submitted 26 July, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

Comments: 45 pages (10 pages author list, 13 pages main text, 1 page acknowledgements, 13 pages appendices, 8 pages bibliography), 17 figures, 16 tables. Update to match version published in The Astrophysical Journal Letters. Data products available from https://zenodo.org/records/10845779

Report number: LIGO-P2300352

Journal ref: ApJL 970, L34 (2024)

arXiv:2403.10278 [pdf, other]

Optimizing post-Newtonian parameters and fixing the BMS frame for numerical-relativity waveform hybridizations

Authors: Dongze Sun, Michael Boyle, Keefe Mitman, Mark A. Scheel, Leo C. Stein, Saul A. Teukolsky, Vijay Varma

Abstract: Numerical relativity (NR) simulations of binary black holes provide precise waveforms, but are typically too computationally expensive to produce waveforms with enough orbits to cover the whole frequency band of gravitational-wave observatories. Accordingly, it is important to be able to hybridize NR waveforms with analytic, post-Newtonian (PN) waveforms, which are accurate during the early inspiral phase. We show that to build such hybrids, it is crucial to both fix the Bondi-Metzner-Sachs (BMS) frame of the NR waveforms to match that of PN theory, and optimize over the PN parameters. We test such a hybridization procedure including all spin-weighted spherical harmonic modes with $|m|\leq \ell$ for $\ell\leq 8$, using 29 NR waveforms with mass ratios $q\leq 10$ and spin magnitudes $|\chi_1|, |\chi_2|\leq 0.8$. We find that for spin-aligned systems, the PN and NR waveforms agree very well. The difference is limited by the small nonzero orbital eccentricity of the NR waveforms, or equivalently by the lack of eccentric terms in the PN waveforms. To maintain full accuracy of the simulations, the matching window for spin-aligned systems should be at least 5 orbits long and end at least 15 orbits before merger. For precessing systems, the errors are larger than for spin-aligned cases. The errors are likely limited by the absence of precession-related spin-spin PN terms. Using $10^5\,M$ long NR waveforms, we find that there is no optimal choice of the matching window within this time span, because the hybridization result for precessing cases is always better if using earlier or longer matching windows. We provide the mean orbital frequency of the smallest acceptable matching window as a function of the target error between the PN and NR waveforms and the black hole spins.

Submitted 15 March, 2024; originally announced March 2024.

Comments: 21 pages, 22 figures

arXiv:2403.09473 [pdf, other]

Analysis of a continuous opinion and discrete action dynamics model coupled with an external observation dynamics

Authors: Anthony Couthures, Thomas Mongaillard, Vineeth S. Varma, Samson Lasaulce, Irinel-Constantin Morarescu

Abstract: We consider a set of consumers in a city or town (who thus generate pollution) whose opinion is governed by a continuous opinion and discrete action (CODA) dynamics model. This dynamics is coupled with an observation signal dynamics, which defines the information the consumers have access to regarding the common pollution. We show that the external observation signal has a significant impact on the asymptotic behavior of the CODA model. When the coupling is strong, it induces either a chaotic behavior or convergence towards a limit cycle. When the coupling is weak, a more classical behavior characterized by local agreements in polarized clusters is observed. In both cases, conditions under which clusters of consumers don't change their actions are provided. Numerical examples are provided to illustrate the derived analytical results.

Submitted 18 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

Comments: ECC conference 2024

arXiv:2403.06212 [pdf, other]

doi 10.1103/PhysRevE.109.064207

Characterization of hybrid quantum eigenstates in systems with mixed classical phasespace

Authors: Anant Vijay Varma, Amichay Vardi, Doron Cohen

Abstract: Generic low-dimensional Hamiltonian systems feature a structured, mixed classical phase-space. The traditional Percival classification of quantum spectra into regular states supported by quasi-integrable regions and irregular states supported by quasi-chaotic regions turns out to be insufficient to capture the richness of the Hilbert space. Berry's conjecture and the eigenstate thermalization hypothesis are not applicable and quantum effects such as tunneling, scarring, and localization, do not obey the standard paradigms. We demonstrate these statements for a prototype Bose-Hubbard model. We highlight the hybridization of chaotic and regular regions from opposing perspectives of ergodicity and localization.

Submitted 10 June, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

Comments: 11 pages, 13 figures

Journal ref: Phys. Rev. E 109, 064207 (2024)

arXiv:2403.03004 [pdf, other]

Ultralight vector dark matter search using data from the KAGRA O3GK run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, I. Abouelfettouh, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, et al. (1778 additional authors not shown)

Abstract: Among the various candidates for dark matter (DM), ultralight vector DM can be probed by laser interferometric gravitational wave detectors through the measurement of oscillating length changes in the arm cavities. In this context, KAGRA has a unique feature due to differing compositions of its mirrors, enhancing the signal of vector DM in the length change in the auxiliary channels. Here we present the result of a search for $U(1)_{B-L}$ gauge boson DM using the KAGRA data from auxiliary length channels during the first joint observation run together with GEO600. By applying our search pipeline, which takes into account the stochastic nature of ultralight DM, upper bounds on the coupling strength between the $U(1)_{B-L}$ gauge boson and ordinary matter are obtained for a range of DM masses. While our constraints are less stringent than those derived from previous experiments, this study demonstrates the applicability of our method to the lower-mass vector DM search, which is made difficult in this measurement by the short observation time compared to the auto-correlation time scale of DM.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 20 pages, 5 figures

Report number: LIGO-P2300250

arXiv:2403.01709 [pdf, other]

Can LLMs Generate Architectural Design Decisions? -An Exploratory Empirical study

Authors: Rudra Dhar, Karthik Vaidhyanathan, Vasudeva Varma

Abstract: Architectural Knowledge Management (AKM) involves the organized handling of information related to architectural decisions and design within a project or organization. An essential artifact of AKM is the Architecture Decision Record (ADR), which documents key design decisions. ADRs are documents that capture decision context, decision made and various aspects related to a design decision, thereby promoting transparency, collaboration, and understanding. Despite their benefits, ADR adoption in software development has been slow due to challenges like time constraints and inconsistent uptake. Recent advancements in Large Language Models (LLMs) may help bridge this adoption gap by facilitating ADR generation. However, the effectiveness of LLMs for ADR generation or understanding has not been explored. To this end, in this work, we perform an exploratory study that aims to investigate the feasibility of using LLMs for the generation of ADRs given the decision context. In our exploratory study, we utilize GPT and T5-based models with 0-shot, few-shot, and fine-tuning approaches to generate the Decision of an ADR given its Context. Our results indicate that in a 0-shot setting, state-of-the-art models such as GPT-4 generate relevant and accurate Design Decisions, although they fall short of human-level performance. Additionally, we observe that more cost-effective models like GPT-3.5 can achieve similar outcomes in a few-shot setting, and smaller models such as Flan-T5 can yield comparable results after fine-tuning. To conclude, this exploratory study suggests that LLMs can generate Design Decisions, but further research is required to attain human-level generation and establish standardized widespread adoption.

Submitted 3 March, 2024; originally announced March 2024.

Comments: This paper has been accepted to IEEE ICSA 2024 (Main Track - Research Track)

arXiv:2401.10602 [pdf, other]

Fractional Conformal Map, Qubit Dynamics and the Leggett-Garg Inequality

Authors: Sourav Paul, Anant Vijay Varma, Sourin Das

Abstract: Any pure state of a qubit can be geometrically represented as a point on the extended complex plane through stereographic projection. By employing successive conformal maps on the extended complex plane, we can generate an effective discrete-time evolution of the pure states of the qubit. This work focuses on a subset of analytic maps known as fractional linear conformal maps. We show that these maps serve as a unifying framework for a diverse range of quantum-inspired conceivable dynamics, including (i) unitary dynamics, (ii) non-unitary but linear dynamics and (iii) non-unitary and non-linear dynamics where linearity (non-linearity) refers to the action of the discrete time evolution operator on the Hilbert space. We provide a characterization of these maps in terms of Leggett-Garg Inequality complemented with No-signaling in Time (NSIT) and Arrow of Time (AoT) conditions.

Submitted 19 January, 2024; originally announced January 2024.

Comments: 9 pages, 1 figure

arXiv:2312.15181 [pdf, other]

Multilingual Bias Detection and Mitigation for Indian Languages

Authors: Ankita Maity, Anubhav Sharma, Rudra Dhar, Tushar Abhishek, Manish Gupta, Vasudeva Varma

Abstract: Lack of diverse perspectives causes neutrality bias in Wikipedia content, leading to millions of readers worldwide being exposed to potentially inaccurate information. Hence, neutrality bias detection and mitigation is a critical problem. Although previous studies have proposed effective solutions for English, no work exists for Indian languages. First, we contribute two large datasets, mWikiBias and mWNC, covering 8 languages, for the bias detection and mitigation tasks respectively. Next, we investigate the effectiveness of popular multilingual Transformer-based models for the two tasks by modeling detection as a binary classification problem and mitigation as a style transfer problem. We make the code and data publicly available.

Submitted 23 December, 2023; originally announced December 2023.

arXiv:2312.10029 [pdf, other]

Challenges with unsupervised LLM knowledge discovery

Authors: Sebastian Farquhar, Vikrant Varma, Zachary Kenton, Johannes Gasteiger, Vladimir Mikulik, Rohin Shah

Abstract: We show that existing unsupervised methods on large language model (LLM) activations do not discover knowledge -- instead they seem to discover whatever feature of the activations is most prominent. The idea behind unsupervised knowledge elicitation is that knowledge satisfies a consistency structure, which can be used to discover knowledge. We first prove theoretically that arbitrary features (not just knowledge) satisfy the consistency structure of a particular leading unsupervised knowledge-elicitation method, contrast-consistent search (Burns et al. - arXiv:2212.03827). We then present a series of experiments showing settings in which unsupervised methods result in classifiers that do not predict knowledge, but instead predict a different prominent feature. We conclude that existing unsupervised methods for discovering latent knowledge are insufficient, and we contribute sanity checks to apply to evaluating future knowledge elicitation methods. Conceptually, we hypothesise that the identification issues explored here, e.g. distinguishing a model's knowledge from that of a simulated character's, will persist for future unsupervised methods.

Submitted 18 December, 2023; v1 submitted 15 December, 2023; originally announced December 2023.

Comments: 12 pages (38 including references and appendices). First three authors equal contribution, randomised order

arXiv:2312.08588 [pdf, other]

Black Hole Spectroscopy for Precessing Binary Black Hole Coalescences

Authors: Hengrui Zhu, Harrison Siegel, Keefe Mitman, Maximiliano Isi, Will M. Farr, Michael Boyle, Nils Deppe, Lawrence E. Kidder, Sizheng Ma, Jordan Moxon, Kyle C. Nelli, Harald P. Pfeiffer, Mark A. Scheel, Saul A. Teukolsky, William Throwe, Vijay Varma, Nils L. Vu

Abstract: The spectroscopic study of black hole quasinormal modes in gravitational-wave ringdown observations is hindered by our ignorance of which modes should dominate astrophysical signals for different binary configurations, limiting tests of general relativity and astrophysics. In this work, we present a description of the quasinormal modes that are excited in the ringdowns of comparable mass, quasi-circular precessing binary black hole coalescences -- a key region of parameter space that has yet to be fully explored within the framework of black hole spectroscopy. We suggest that the remnant perturbation for precessing and non-precessing systems is approximately the same up to a rotation, which implies that the relative amplitudes of the quasinormal modes in both systems are also related by a rotation. We present evidence for this by analyzing an extensive catalog of numerical relativity simulations. Additional structure in the amplitudes is connected to the system's kick velocity and other asymmetries in the orbital dynamics. We find that the ringdowns of precessing systems need not be dominated by the $(\ell,m)=(2,\pm 2)$ quasinormal modes, and that instead the $(2,\pm 1)$ or $(2,0)$ quasinormal modes can dominate. Our results are consistent with a ringdown analysis of the LIGO-Virgo gravitational wave signal GW190521, and may also help in understanding phenomenological inspiral-merger-ringdown waveform model systematics.

Submitted 18 July, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

Comments: Data Release and Analysis Scripts: https://github.com/HengruiPrinceton/precession_ringdown

arXiv:2311.07950 [pdf, ps, other]

Study of the Blazhko type RRc stars in the Stripe 82 region using SDSS and ZTF

Authors: Vaidehi Varma, Jozsef M. Benko, Chow-Choong Ngeow

Abstract: RR Lyrae stars are pulsating stars, many of which also show a long-term variation called the Blazhko effect, which is a modulation of the amplitude and phase of the light curve. In this work, we searched for the incidence rate of the Blazhko effect in the first-overtone pulsating RR Lyrae (RRc) stars of the Galactic halo. The focus was on the Stripe 82 region in the Galactic halo, which was studied by Sesar et al. using the Sloan Digital Sky Survey (SDSS) data. In their work, 104 RR Lyrae stars were classified as RRc type. We combined their SDSS light curves with Zwicky Transient Facility (ZTF) data, and used them to document the Blazhko properties of these RRc stars. Our analysis showed that among the 104 RRc stars, 8 were in fact RRd stars, and were excluded from the study. Out of the remaining 96, 34 were Blazhko type and 62 were non-Blazhko type, giving an incidence rate of 35.42% for Blazhko RRc stars. The shortest Blazhko period found was 12.808 +/- 0.001 d for SDSS 747380, while the longest was 3100 +/- 126 d for SDSS 3585856. Combining SDSS and ZTF data sets increased the probability of detecting the small variations due to the Blazhko effect, and thus provided a unique opportunity to find longer Blazhko periods. We found that 85% of RRc stars had the Blazhko period longer than 200 d.

Submitted 14 November, 2023; originally announced November 2023.

Comments: 9 pages, 2 tables, 8 figures, AJ accepted

arXiv:2310.01544 [pdf, other]

doi 10.1103/PhysRevD.109.024024

GW190521: tracing imprints of spin-precession on the most massive black hole binary

Authors: Simona J. Miller, Maximiliano Isi, Katerina Chatziioannou, Vijay Varma, Ilya Mandel

Abstract: GW190521 is a remarkable gravitational-wave signal on multiple fronts: its source is the most massive black hole binary identified to date and could have spins misaligned with its orbit, leading to spin-induced precession -- an astrophysically consequential property linked to the binary's origin. However, due to its large mass, GW190521 was only observed during its final 3-4 cycles, making precession constraints puzzling and giving rise to alternative interpretations, such as eccentricity. Motivated by these complications, we trace the observational imprints of precession on GW190521 by dissecting the data with a novel time domain technique, allowing us to explore the morphology and interplay of the few observed cycles. We find that precession inference hinges on a quiet portion of the pre-merger data that is suppressed relative to the merger-ringdown. Neither pre-merger nor post-merger data alone are the sole driver of inference, but rather their combination: in the quasi-circular scenario, precession emerges as a mechanism to accommodate the lack of a stronger pre-merger signal in light of the observed post-merger. In terms of source dynamics, the pre-merger suppression arises from a tilting of the binary with respect to the observer. Establishing such a consistent picture between the source dynamics and the observed data is crucial for characterizing the growing number of massive binary observations and bolstering the robustness of ensuing astrophysical claims.

Submitted 18 January, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

Comments: 11 pages (excluding references), 9 figures

Report number: LIGO-P2300329

Journal ref: Phys. Rev. D 109, 024024 (2024)

arXiv:2309.14473 [pdf, other]

Analysis of GWTC-3 with fully precessing numerical relativity surrogate models

Authors: Tousif Islam, Avi Vajpeyi, Feroz H. Shaik, Carl-Johan Haster, Vijay Varma, Scott E. Field, Jacob Lange, Richard O'Shaughnessy, Rory Smith

Abstract: The third Gravitational-Wave Transient Catalog (GWTC-3) contains 90 binary coalescence candidates detected by the LIGO-Virgo-KAGRA Collaboration (LVK). We provide a re-analysis of binary black hole (BBH) events using a recently developed numerical relativity (NR) waveform surrogate model, NRSur7dq4, that includes all $\ell \leq 4$ spin-weighted spherical harmonic modes as well as the complete physical effects of precession. Properties of the remnant black holes' (BH's) mass, spin vector, and kick vector are found using an associated remnant surrogate model NRSur7dq4Remnant. Both NRSur7dq4 and NRSur7dq4Remnant models have errors comparable to numerical relativity simulations and allow for high-accuracy parameter estimates. We restrict our analysis to 47 BBH events that fall within the regime of validity of NRSur7dq4 (mass ratios greater than 1/6 and total masses greater than $60 M_{\odot}$). While for most of these events our results match the LVK analyses that were obtained using the semi-analytical models such as IMRPhenomXPHM and SEOBNRv4PHM, we find that for more than 20\% of events the NRSur7dq4 model recovers noticeably different measurements of black hole properties like the masses and spins, as well as extrinsic properties like the binary inclination and distance. For instance, GW150914_095045 exhibits noticeable differences in spin precession and spin magnitude measurements. Other notable findings include one event (GW191109_010717) that constrains the effective spin $χ_{eff}$ to be negative at a 99.3\% credible level and two events (GW191109_010717 and GW200129_065458) with well-constrained kick velocities. Furthermore, compared to the models used in the LVK analyses, NRSur7dq4 recovers a larger signal-to-noise ratio and/or Bayes factors for several events.

Submitted 25 September, 2023; originally announced September 2023.

Comments: Posteriors and animations are made publicly available at https://nrsur-catalog.github.io/NRSurCat-1

arXiv:2309.02390 [pdf, other]

Explaining grokking through circuit efficiency

Authors: Vikrant Varma, Rohin Shah, Zachary Kenton, János Kramár, Ramana Kumar

Abstract: One of the most surprising puzzles in neural network generalisation is grokking: a network with perfect training accuracy but poor generalisation will, upon further training, transition to perfect generalisation. We propose that grokking occurs when the task admits a generalising solution and a memorising solution, where the generalising solution is slower to learn but more efficient, producing larger logits with the same parameter norm. We hypothesise that memorising circuits become more inefficient with larger training datasets while generalising circuits do not, suggesting there is a critical dataset size at which memorisation and generalisation are equally efficient. We make and confirm four novel predictions about grokking, providing significant evidence in favour of our explanation. Most strikingly, we demonstrate two novel and surprising behaviours: ungrokking, in which a network regresses from perfect to low test accuracy, and semi-grokking, in which a network shows delayed generalisation to partial rather than perfect test accuracy.

Submitted 5 September, 2023; originally announced September 2023.

arXiv:2308.13666 [pdf, other]

A Joint Fermi-GBM and Swift-BAT Analysis of Gravitational-Wave Candidates from the Third Gravitational-wave Observing Run

Authors: C. Fletcher, J. Wood, R. Hamburg, P. Veres, C. M. Hui, E. Bissaldi, M. S. Briggs, E. Burns, W. H. Cleveland, M. M. Giles, A. Goldstein, B. A. Hristov, D. Kocevski, S. Lesage, B. Mailyan, C. Malacaria, S. Poolakkil, A. von Kienlin, C. A. Wilson-Hodge, The Fermi Gamma-ray Burst Monitor Team, M. Crnogorčević, J. DeLaunay, A. Tohuvavohu, R. Caputo, S. B. Cenko, et al. (1674 additional authors not shown)

Abstract: We present Fermi Gamma-ray Burst Monitor (Fermi-GBM) and Swift Burst Alert Telescope (Swift-BAT) searches for gamma-ray/X-ray counterparts to gravitational wave (GW) candidate events identified during the third observing run of the Advanced LIGO and Advanced Virgo detectors. Using Fermi-GBM on-board triggers and sub-threshold gamma-ray burst (GRB) candidates found in the Fermi-GBM ground analyses, the Targeted Search and the Untargeted Search, we investigate whether there are any coincident GRBs associated with the GWs. We also search the Swift-BAT rate data around the GW times to determine whether a GRB counterpart is present. No counterparts are found. Using both the Fermi-GBM Targeted Search and the Swift-BAT search, we calculate flux upper limits and present joint upper limits on the gamma-ray luminosity of each GW. Given these limits, we constrain theoretical models for the emission of gamma-rays from binary black hole mergers.

Submitted 25 August, 2023; originally announced August 2023.

arXiv:2308.12031 [pdf, other]

doi 10.1145/3649459

CACTUS: a Comprehensive Abstraction and Classification Tool for Uncovering Structures

Authors: Luca Gherardini, Varun Ravi Varma, Karol Capala, Roger Woods, Jose Sousa

Abstract: The availability of large data sets is providing an impetus for driving current artificial intelligent developments. There are, however, challenges for developing solutions with small data sets due to practical and cost-effective deployment and the opacity of deep learning models. The Comprehensive Abstraction and Classification Tool for Uncovering Structures called CACTUS is presented for improved secure analytics by effectively employing explainable artificial intelligence. It provides additional support for categorical attributes, preserving their original meaning, optimising memory usage, and speeding up the computation through parallelisation. It shows to the user the frequency of the attributes in each class and ranks them by their discriminative power. Its performance is assessed by application to the Wisconsin diagnostic breast cancer and Thyroid0387 data sets.

Submitted 23 August, 2023; originally announced August 2023.

arXiv:2308.03822 [pdf, other]

Search for Eccentric Black Hole Coalescences during the Third Observing Run of LIGO and Virgo

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, et al. (1750 additional authors not shown)

Abstract: Despite the growing number of confident binary black hole coalescences observed through gravitational waves so far, the astrophysical origin of these binaries remains uncertain. Orbital eccentricity is one of the clearest tracers of binary formation channels. Identifying binary eccentricity, however, remains challenging due to the limited availability of gravitational waveforms that include effects of eccentricity. Here, we present observational results for a waveform-independent search sensitive to eccentric black hole coalescences, covering the third observing run (O3) of the LIGO and Virgo detectors. We identified no new high-significance candidates beyond those that were already identified with searches focusing on quasi-circular binaries. We determine the sensitivity of our search to high-mass (total mass $M>70$ $M_\odot$) binaries covering eccentricities up to 0.3 at 15 Hz orbital frequency, and use this to compare model predictions to search results. Assuming all detections are indeed quasi-circular, for our fiducial population model, we place an upper limit for the merger rate density of high-mass binaries with eccentricities $0 < e \leq 0.3$ at $0.33$ Gpc$^{-3}$ yr$^{-1}$ at 90\% confidence level.

Submitted 7 August, 2023; originally announced August 2023.

Comments: 24 pages, 5 figures

Report number: LIGO-P2300080

arXiv:2308.02296 [pdf, other]

Scalable multiparty steering based on a single pair of entangled qubits

Authors: Alex Pepper, Travis. J. Baker, Yuanlong Wang, Qiu-Cheng Song, Lynden. K. Shalm, Varun. B. Varma, Sae Woo Nam, Nora Tischler, Sergei Slussarenko, Howard. M. Wiseman, Geoff. J. Pryde

Abstract: The distribution and verification of quantum nonlocality across a network of users is essential for future quantum information science and technology applications. However, beyond simple point-to-point protocols, existing methods struggle with increasingly complex state preparation for a growing number of parties. Here, we show that, surprisingly, multiparty loophole-free quantum steering, where one party simultaneously steers arbitrarily many spatially separate parties, is achievable by constructing a quantum network from a set of qubits of which only one pair is entangled. Using these insights, we experimentally demonstrate this type of steering between three parties with the detection loophole closed. With its modest and fixed entanglement requirements, this work introduces a scalable approach to rigorously verify quantum nonlocality across multiple parties, thus providing a practical tool towards developing the future quantum internet.

Submitted 4 August, 2023; originally announced August 2023.

arXiv:2307.04833 [pdf, other]

3D Simulations of Magnetoconvection in a Rapidly Rotating Supernova Progenitor

Authors: Vishnu Varma, Bernhard Mueller

Abstract: We present a first 3D magnetohydrodynamic (MHD) simulation of oxygen, neon and carbon shell burning in a rapidly rotating 16 M_sun core-collapse supernova progenitor. We also run a purely hydrodynamic simulation for comparison. After 180s (15 and 7 convective turnovers respectively), the magnetic fields in the oxygen and neon shells achieve saturation at 10^{11}G and 5 x 10^{10}G. The strong Maxwell stresses become comparable to the radial Reynolds stresses and eventually suppress convection. The suppression of mixing by convection and shear instabilities results in the depletion of fuel at the base of the burning regions, so that the burning shells eventually move outward to cooler regions, thus reducing the energy generation rate. The strong magnetic fields efficiently transport angular momentum outwards, quickly spinning down the rapidly rotating convective oxygen and neon shells and forcing them into rigid rotation. The hydrodynamic model shows complicated redistribution of angular momentum and develops regions of retrograde rotation at the base of the convective shells. We discuss implications of our results for stellar evolution and for the subsequent core-collapse supernova. The rapid redistribution of angular momentum in the MHD model casts some doubt on the possibility of retaining significant core angular momentum for explosions driven by millisecond magnetars. However, findings from multi-D models remain tentative until stellar evolution calculations can provide more consistent rotation profiles and estimates of magnetic field strengths to initialise multi-D simulations without substantial numerical transients. We also stress the need for longer simulations, resolution studies, and an investigation of non-ideal effects.

Submitted 23 October, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

Comments: Submitted to MNRAS (14 pages, 11 Figures)

arXiv:2307.03435 [pdf, other]

doi 10.1103/PhysRevD.108.084015

Extending black-hole remnant surrogate models to extreme mass ratios

Authors: Matteo Boschini, Davide Gerosa, Vijay Varma, Cristobal Armaza, Michael Boyle, Marceline S. Bonilla, Andrea Ceja, Yitian Chen, Nils Deppe, Matthew Giesler, Lawrence E. Kidder, Prayush Kumar, Guillermo Lara, Oliver Long, Sizheng Ma, Keefe Mitman, Peter James Nee, Harald P. Pfeiffer, Antoni Ramos-Buades, Mark A. Scheel, Nils L. Vu, Jooheon Yoo

Abstract: Numerical-relativity surrogate models for both black-hole merger waveforms and remnants have emerged as important tools in gravitational-wave astronomy. While producing very accurate predictions, their applicability is limited to the region of the parameter space where numerical-relativity simulations are available and computationally feasible. Notably, this excludes extreme mass ratios. We present a machine-learning approach to extend the validity of existing and future numerical-relativity surrogate models toward the test-particle limit, targeting in particular the mass and spin of post-merger black-hole remnants. Our model is trained on both numerical-relativity simulations at comparable masses and analytical predictions at extreme mass ratios. We extend the gaussian-process-regression model NRSur7dq4Remnant, validate its performance via cross validation, and test its accuracy against additional numerical-relativity runs. Our fit, which we dub NRSur7dq4EmriRemnant, reaches an accuracy that is comparable to or higher than that of existing remnant models while providing robust predictions for arbitrary mass ratios.

Submitted 24 October, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

Comments: 10 pages, 3 figures. Published in PRD. Model publicly available at https://pypi.org/project/surfinBH

Journal ref: Phys.Rev.D 108 (2023) 8, 084015

arXiv:2307.01618 [pdf, other]

doi 10.1109/LCSYS.2023.3291421

A Stackelberg viral marketing design for two competing players

Authors: Olivier Lindamulage De Silva, Vineeth Satheeskumar Varma, Ming Cao, Irinel-Constantin Morarescu, Samson Lasaulce

Abstract: A Stackelberg duopoly model in which two firms compete to maximize their market share is considered. The firms offer a service/product to customers that are spread over several geographical regions (e.g., countries, provinces, or states). Each region has its own characteristics (spreading and recovery rates) of each service propagation. We consider that the spreading rate can be controlled by each firm and is subject to some investment that the firm does in each region. One of the main objectives of this work is to characterize the advertising budget allocation strategy for each firm across regions to maximize its market share when competing. To achieve this goal we propose a Stackelberg game model that is relatively simple while capturing the main effects of the competition for market share. By characterizing the strong/weak Stackelberg equilibria of the game, we provide the associated budget allocation strategy. In this setting, it is established under which conditions the solution of the game is the so-called "winner takes all". Numerical results expand upon our theoretical findings and we provide the equilibrium characterization for an example.

Submitted 4 July, 2023; originally announced July 2023.

Comments: This paper appears in: IEEE Control Systems Letters

Journal ref: IEEE Control Systems Letters 2023

arXiv:2306.08872 [pdf, other]

Neural models for Factual Inconsistency Classification with Explanations

Authors: Tathagata Raha, Mukund Choudhary, Abhinav Menon, Harshit Gupta, KV Aditya Srivatsa, Manish Gupta, Vasudeva Varma

Abstract: Factual consistency is one of the most important requirements when editing high quality documents. It is extremely important for automatic text generation systems like summarization, question answering, dialog modeling, and language modeling. Still, automated factual inconsistency detection is rather under-studied. Existing work has focused on (a) finding fake news keeping a knowledge base in context, or (b) detecting broad contradiction (as part of natural language inference literature). However, there has been no work on detecting and explaining types of factual inconsistencies in text, without any knowledge base in context. In this paper, we leverage existing work in linguistics to formally define five types of factual inconsistencies. Based on this categorization, we contribute a novel dataset, FICLE (Factual Inconsistency CLassification with Explanation), with ~8K samples where each sample consists of two sentences (claim and context) annotated with type and span of inconsistency. When the inconsistency relates to an entity type, it is labeled as well at two levels (coarse and fine-grained). Further, we leverage this dataset to train a pipeline of four neural models to predict inconsistency type with explanations, given a (claim, context) sentence pair. Explanations include inconsistent claim fact triple, inconsistent context span, inconsistent claim component, coarse and fine-grained inconsistent entity types. The proposed system first predicts inconsistent spans from claim and context; and then uses them to predict inconsistency types and inconsistent entity types (when inconsistency is due to entities). We experiment with multiple Transformer-based natural language classification as well as generative models, and find that DeBERTa performs the best. Our proposed methods provide a weighted F1 of ~87% for inconsistency type classification across the five classes.

Submitted 15 June, 2023; originally announced June 2023.

Comments: ECML-PKDD 2023

arXiv:2306.03148 [pdf, other]

doi 10.1103/PhysRevD.108.064027

Numerical relativity surrogate model with memory effects and post-Newtonian hybridization

Authors: Jooheon Yoo, Keefe Mitman, Vijay Varma, Michael Boyle, Scott E. Field, Nils Deppe, François Hébert, Lawrence E. Kidder, Jordan Moxon, Harald P. Pfeiffer, Mark A. Scheel, Leo C. Stein, Saul A. Teukolsky, William Throwe, Nils L. Vu

Abstract: Numerical relativity simulations provide the most precise templates for the gravitational waves produced by binary black hole mergers. However, many of these simulations use an incomplete waveform extraction technique -- extrapolation -- that fails to capture important physics, such as gravitational memory effects. Cauchy-characteristic evolution (CCE), by contrast, is a much more physically accurate extraction procedure that fully evolves Einstein's equations to future null infinity and accurately captures the expected physics. In this work, we present a new surrogate model, NRHybSur3dq8$\_$CCE, built from CCE waveforms that have been mapped to the post-Newtonian (PN) BMS frame and then hybridized with PN and effective one-body (EOB) waveforms. This model is trained on 102 waveforms with mass ratios $q\leq8$ and aligned spins $χ_{1z}, \, χ_{2z} \in \left[-0.8, 0.8\right]$. The model spans the entire LIGO-Virgo-KAGRA (LVK) frequency band (with $f_{\text{low}}=20\text{Hz}$) for total masses $M\gtrsim2.25M_{\odot}$ and includes the $\ell\leq4$ and $(\ell,m)=(5,5)$ spin-weight $-2$ spherical harmonic modes, but not the $(3,1)$, $(4,2)$ or $(4,1)$ modes. We find that NRHybSur3dq8$\_$CCE can accurately reproduce the training waveforms with mismatches $\lesssim2\times10^{-4}$ for total masses $2.25M_{\odot}\leq M\leq300M_{\odot}$ and can, for a modest degree of extrapolation, capably model outside of its training region. Most importantly, unlike previous waveform models, the new surrogate model successfully captures memory effects.

Submitted 14 September, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

Comments: 14 pages, 11 figures. Accepted for publication in PRD

Journal ref: Phys. Rev. D 108, 064027 (2023)

arXiv:2305.13912 [pdf, other]

doi 10.1093/mnras/stad1572

3D stellar evolution: hydrodynamic simulations of a complete burning phase in a massive star

Authors: F. Rizzuti, R. Hirschi, W. D. Arnett, C. Georgy, C. Meakin, A. StJ. Murphy, T. Rauscher, V. Varma

Abstract: Our knowledge of stellar evolution is driven by one-dimensional (1D) simulations. 1D models, however, are severely limited by uncertainties on the exact behaviour of many multi-dimensional phenomena occurring inside stars, affecting their structure and evolution. Recent advances in computing resources have allowed small sections of a star to be reproduced with multi-D hydrodynamic models, with an unprecedented degree of detail and realism. In this work, we present a set of 3D simulations of a convective neon-burning shell in a 20 M$_\odot$ star run for the first time continuously from its early development through to complete fuel exhaustion, using unaltered input conditions from a 321D-guided 1D stellar model. These simulations help answer some open questions in stellar physics. In particular, they show that convective regions do not grow indefinitely due to entrainment of fresh material, but fuel consumption prevails over entrainment, so when fuel is exhausted convection also starts decaying. Our results show convergence between the multi-D simulations and the new 321D-guided 1D model, concerning the amount of convective boundary mixing to include in stellar models. The size of the convective zones in a star strongly affects its structure and evolution, thus revising their modelling in 1D will have important implications for the life and fate of stars. This will thus affect theoretical predictions related to nucleosynthesis, supernova explosions and compact remnants.

Submitted 23 May, 2023; originally announced May 2023.

Comments: 12 pages, 15 figures. Accepted for publication in MNRAS

arXiv:2305.03300 [pdf, other]

LLM-RM at SemEval-2023 Task 2: Multilingual Complex NER using XLM-RoBERTa

Authors: Rahul Mehta, Vasudeva Varma

Abstract: Named Entity Recognition (NER) is a task of recognizing entities at a token level in a sentence. This paper focuses on solving NER tasks in a multilingual setting for complex named entities. Our team, LLM-RM, participated in the recently organized SemEval 2023 task, Task 2: MultiCoNER II, Multilingual Complex Named Entity Recognition. We approach the problem by leveraging cross-lingual representation provided by fine-tuning XLM-Roberta base model on datasets of all of the 12 languages provided -- Bangla, Chinese, English, Farsi, French, German, Hindi, Italian, Portuguese, Spanish, Swedish and Ukrainian.

Submitted 5 May, 2023; originally announced May 2023.

Comments: Submitted to SemEval-2023, The 17th International Workshop on Semantic Evaluation

arXiv:2305.02096 [pdf, other]

doi 10.1063/5.0159448

Counterdiabatic driving for long-lived singlet state preparation

Authors: Abhinav Suresh, Vishal Varma, Priya Batra, T S Mahesh

Abstract: The quantum adiabatic method, which maintains populations in their instantaneous eigenstates throughout the state evolution, is an established and often a preferred choice for state preparation and manipulation. Though it minimizes the driving cost significantly, its slow speed is a severe limitation in noisy intermediate-scale quantum (NISQ) era technologies. Since adiabatic paths are extensive in many physical processes, it is of broader interest to achieve adiabaticity at a much faster rate. Shortcuts to adiabaticity techniques which overcome the slow adiabatic process by driving the system faster through non-adiabatic paths, have seen increased attention recently. The extraordinarily long lifetime of the long-lived singlet states (LLS) in nuclear magnetic resonance, established over the past decade, has opened several important applications ranging from spectroscopy to biomedical imaging. Various methods, including adiabatic methods, are already being used to prepare LLS. In this article, we report the use of counterdiabatic driving (CD) to speed up LLS preparation with faster drives. Using NMR experiments, we show that CD can give stronger LLS order in shorter durations than conventional adiabatic driving.

Submitted 28 June, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

arXiv:2304.13254 [pdf, other]

doi 10.3847/1538-4357/ad0ec9

The directional isotropy of LIGO-Virgo binaries

Authors: Maximiliano Isi, Will M. Farr, Vijay Varma

Abstract: We demonstrate how to constrain the degree of absolute alignment of the total angular momenta of LIGO-Virgo binary black holes, looking for a special direction in space that would break isotropy. We also allow for inhomogeneities in the distribution of black holes over the sky. Making use of dipolar models for the spatial distribution and orientation of the sources, we analyze 57 signals with false-alarm rates < 1/yr from the third LIGO-Virgo observing run. Accounting for selection biases, we find the population of LIGO-Virgo black holes to be fully consistent with both homogeneity and isotropy. We additionally find the data to constrain some directions of alignment more than others, and produce posteriors for the directions of total angular momentum of all binaries in our set. All code and data are made publicly available in https://github.com/maxisi/gwisotropy/.

Submitted 25 April, 2023; originally announced April 2023.

Report number: LIGO-P2300088

Journal ref: ApJ 962 19 (2024)

arXiv:2304.11836 [pdf, other]

doi 10.1103/PhysRevD.107.124051

Numerical simulations of black hole-neutron star mergers in scalar-tensor gravity

Authors: Sizheng Ma, Vijay Varma, Leo C. Stein, Francois Foucart, Matthew D. Duez, Lawrence E. Kidder, Harald P. Pfeiffer, Mark A. Scheel

Abstract: We present a numerical-relativity simulation of a black hole - neutron star merger in scalar-tensor (ST) gravity with binary parameters consistent with the gravitational wave event GW200115. In this exploratory simulation, we consider the Damour-Esposito-Farese extension to Brans-Dicke theory, and maximize the effect of spontaneous scalarization by choosing a soft equation of state and ST theory parameters at the edge of known constraints. We extrapolate the gravitational waves, including tensor and scalar (breathing) modes, to future null-infinity. The numerical waveforms undergo ~ 22 wave cycles before the merger, and are in good agreement with predictions from post-Newtonian theory during the inspiral. We find the ST system evolves faster than its general-relativity (GR) counterpart due to dipole radiation, merging a full gravitational-wave cycle before the GR counterpart. This enables easy differentiation between the ST waveforms and GR in the context of parameter estimation. However, we find that dipole radiation's effect may be partially degenerate with the NS tidal deformability during the late inspiral stage, and a full Bayesian analysis is necessary to fully understand the degeneracies between ST and binary parameters in GR.

Submitted 13 June, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

Journal ref: Phys. Rev. D 107, 124051 (2023)

arXiv:2304.10459 [pdf, other]

doi 10.1103/PhysRevApplied.20.034030

Long-Lived Singlet State in an Oriented Phase and its Survival across the Phase Transition Into an Isotropic Phase

Authors: Vishal Varma, T S Mahesh

Abstract: Long-lived singlet states (LLS) of nuclear spin pairs have been extensively studied and utilized in the isotropic phase via liquid state NMR. However, there are hardly any reports of LLS in the anisotropic phase that allows contribution from the dipolar coupling in addition to the scalar coupling, thereby opening many exciting possibilities. Here we report observing LLS in a pair of nuclear spins partially oriented in the nematic phase of a liquid crystal solvent. The spins are strongly interacting via the residual dipole-dipole coupling. We observe LLS in the oriented phase living up to three times longer than the usual spin-lattice relaxation time constant ($T_1$). Upon heating, the system undergoes a phase transition from nematic into isotropic phase, wherein the LLS is up to five times longer lived than the corresponding $T_1$. Interestingly, the LLS prepared in the oriented phase can survive the transition from the nematic to the isotropic phase. As an application of LLS in the oriented phase, we utilize its longer life to measure the small translational diffusion coefficient of solute molecules in the liquid crystal solvent. Finally, we propose utilizing the phase transition to lock or unlock access to LLS.

Submitted 30 October, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

Comments: 10 pages, 10 figures

Journal ref: Phys. Rev. Applied 20, 034030 (2023)

arXiv:2304.08393 [pdf, other]

Search for gravitational-lensing signatures in the full third observing run of the LIGO-Virgo network

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, H. Abe, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, C. Alléné, A. Allocca, P. A. Altin, et al. (1670 additional authors not shown)

Abstract: Gravitational lensing by massive objects along the line of sight to the source causes distortions of gravitational-wave signals; such distortions may reveal information about fundamental physics, cosmology and astrophysics. In this work, we have extended the search for lensing signatures to all binary black hole events from the third observing run of the LIGO-Virgo network. We search for repeated signals from strong lensing by 1) performing targeted searches for subthreshold signals, 2) calculating the degree of overlap amongst the intrinsic parameters and sky location of pairs of signals, 3) comparing the similarities of the spectrograms amongst pairs of signals, and 4) performing dual-signal Bayesian analysis that takes into account selection effects and astrophysical knowledge. We also search for distortions to the gravitational waveform caused by 1) frequency-independent phase shifts in strongly lensed images, and 2) frequency-dependent modulation of the amplitude and phase due to point masses. None of these searches yields significant evidence for lensing. Finally, we use the non-detection of gravitational-wave lensing to constrain the lensing rate based on the latest merger-rate estimates and the fraction of dark matter composed of compact objects.

Submitted 17 April, 2023; originally announced April 2023.

Comments: 28 pages, 11 figures

Report number: LIGO-P2200031

Showing 1–50 of 288 results for author: Varma, V