Search | arXiv e-print repository

Evaluating the Impact of Data Availability on Machine Learning-augmented MPC for a Building Energy Management System

Authors: Jens Engel, Thomas Schmitt, Tobias Rodemann, Jürgen Adamy

Abstract: A major challenge in the development of Model Predictive Control (MPC)-based energy management systems (EMSs) for buildings is the availability of an accurate model. One approach to address this is to augment an existing gray-box model with data-driven residual estimators. The efficacy of such estimators, and hence the performance of the EMS, relies on the availability of sufficient and suitable training data. In this work, we evaluate how different data availability scenarios affect estimator and controller performance. To do this, we perform software-in-the-loop (SiL) simulation with a physics-based digital twin using real measurement data. Simulation results show that acceptable estimation and control performance can already be achieved with limited available data, and we confirm that leveraging historical data for pretraining boosts efficacy.

Submitted 18 July, 2024; originally announced July 2024.

Comments: 5 pages, 4 figures. To be published in 2024 IEEE PES Innovative Smart Grid Technologies Europe (ISGT EUROPE) proceedings

arXiv:2406.09905 [pdf, other]

Nymeria: A Massive Collection of Multimodal Egocentric Daily Motion in the Wild

Authors: Lingni Ma, Yuting Ye, Fangzhou Hong, Vladimir Guzov, Yifeng Jiang, Rowan Postyeni, Luis Pesqueira, Alexander Gamino, Vijay Baiyya, Hyo Jin Kim, Kevin Bailey, David Soriano Fosas, C. Karen Liu, Ziwei Liu, Jakob Engel, Renzo De Nardi, Richard Newcombe

Abstract: We introduce Nymeria - a large-scale, diverse, richly annotated human motion dataset collected in the wild with multiple multimodal egocentric devices. The dataset comes with a) full-body 3D motion ground truth; b) egocentric multimodal recordings from Project Aria devices with RGB, grayscale, eye-tracking cameras, IMUs, magnetometer, barometer, and microphones; and c) an additional "observer" device providing a third-person viewpoint. We compute world-aligned 6DoF transformations for all sensors, across devices and capture sessions. The dataset also provides 3D scene point clouds and calibrated gaze estimation. We derive a protocol to annotate hierarchical language descriptions of in-context human motion, from fine-grained pose narrations, to atomic actions and activity summarization. To the best of our knowledge, the Nymeria dataset is the world's largest in-the-wild collection of human motion with natural and diverse activities; the first of its kind to provide synchronized and localized multi-device multimodal egocentric data; and the world's largest dataset with motion-language descriptions. It contains 1200 recordings of 300 hours of daily activities from 264 participants across 50 locations, travelling a total of 399 km. The motion-language descriptions provide 310.5K sentences in 8.64M words from a vocabulary size of 6545. To demonstrate the potential of the dataset, we define key research tasks for egocentric body tracking, motion synthesis, and action recognition and evaluate several state-of-the-art baseline algorithms. Data and code will be open-sourced.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.09598 [pdf, other]

Introducing HOT3D: An Egocentric Dataset for 3D Hand and Object Tracking

Authors: Prithviraj Banerjee, Sindi Shkodrani, Pierre Moulon, Shreyas Hampali, Fan Zhang, Jade Fountain, Edward Miller, Selen Basol, Richard Newcombe, Robert Wang, Jakob Julian Engel, Tomas Hodan

Abstract: We introduce HOT3D, a publicly available dataset for egocentric hand and object tracking in 3D. The dataset offers over 833 minutes (more than 3.7M images) of multi-view RGB/monochrome image streams showing 19 subjects interacting with 33 diverse rigid objects, multi-modal signals such as eye gaze or scene point clouds, as well as comprehensive ground truth annotations including 3D poses of objects, hands, and cameras, and 3D models of hands and objects. In addition to simple pick-up/observe/put-down actions, HOT3D contains scenarios resembling typical actions in a kitchen, office, and living room environment. The dataset is recorded by two head-mounted devices from Meta: Project Aria, a research prototype of light-weight AR/AI glasses, and Quest 3, a production VR headset sold in millions of units. Ground-truth poses were obtained by a professional motion-capture system using small optical markers attached to hands and objects. Hand annotations are provided in the UmeTrack and MANO formats and objects are represented by 3D meshes with PBR materials obtained by an in-house scanner. We aim to accelerate research on egocentric hand-object interaction by making the HOT3D dataset publicly available and by co-organizing public challenges on the dataset at ECCV 2024. The dataset can be downloaded from the project website: https://facebookresearch.github.io/hot3d/.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2403.13064 [pdf, other]

SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model

Authors: Armen Avetisyan, Christopher Xie, Henry Howard-Jenkins, Tsun-Yi Yang, Samir Aroudj, Suvam Patra, Fuyang Zhang, Duncan Frost, Luke Holland, Campbell Orme, Jakob Engel, Edward Miller, Richard Newcombe, Vasileios Balntas

Abstract: We introduce SceneScript, a method that directly produces full scene models as a sequence of structured language commands using an autoregressive, token-based approach. Our proposed scene representation is inspired by recent successes in transformers & LLMs, and departs from more traditional methods which commonly describe scenes as meshes, voxel grids, point clouds or radiance fields. Our method infers the set of structured language commands directly from encoded visual data using a scene language encoder-decoder architecture. To train SceneScript, we generate and release a large-scale synthetic dataset called Aria Synthetic Environments consisting of 100k high-quality indoor scenes, with photorealistic and ground-truth annotated renders of egocentric scene walkthroughs. Our method gives state-of-the-art results in architectural layout estimation, and competitive results in 3D object detection. Lastly, we explore an advantage for SceneScript, which is the ability to readily adapt to new commands via simple additions to the structured language, which we illustrate for tasks such as coarse 3D object part reconstruction.

Submitted 19 March, 2024; originally announced March 2024.

Comments: see project page, https://projectaria.com/scenescript

arXiv:2402.13349 [pdf, other]

Aria Everyday Activities Dataset

Authors: Zhaoyang Lv, Nicholas Charron, Pierre Moulon, Alexander Gamino, Cheng Peng, Chris Sweeney, Edward Miller, Huixuan Tang, Jeff Meissner, Jing Dong, Kiran Somasundaram, Luis Pesqueira, Mark Schwesinger, Omkar Parkhi, Qiao Gu, Renzo De Nardi, Shangyi Cheng, Steve Saarinen, Vijay Baiyya, Yuyang Zou, Richard Newcombe, Jakob Julian Engel, Xiaqing Pan, Carl Ren

Abstract: We present the Aria Everyday Activities (AEA) Dataset, an egocentric multimodal open dataset recorded using Project Aria glasses. AEA contains 143 daily activity sequences recorded by multiple wearers in five geographically diverse indoor locations. Each recording contains multimodal sensor data recorded through the Project Aria glasses. In addition, AEA provides machine perception data including high-frequency globally aligned 3D trajectories, scene point clouds, per-frame 3D eye gaze vectors and time-aligned speech transcription. In this paper, we demonstrate a few exemplar research applications enabled by this dataset, including neural scene reconstruction and prompted segmentation. AEA is an open source dataset that can be downloaded from https://www.projectaria.com/datasets/aea/. We are also providing open-source implementations and examples of how to use the dataset in Project Aria Tools https://github.com/facebookresearch/projectaria_tools.

Submitted 21 February, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

Comments: Dataset website: https://www.projectaria.com/datasets/aea/

arXiv:2311.18259 [pdf, other]

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Authors: Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, et al. (76 additional authors not shown)

Abstract: We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric and exocentric video of skilled human activities (e.g., sports, music, dance, bike repair). 740 participants from 13 cities worldwide performed these activities in 123 different natural scene contexts, yielding long-form captures from 1 to 42 minutes each and 1,286 hours of video combined. The multimodal nature of the dataset is unprecedented: the video is accompanied by multichannel audio, eye gaze, 3D point clouds, camera poses, IMU, and multiple paired language descriptions -- including a novel "expert commentary" done by coaches and teachers and tailored to the skilled-activity domain. To push the frontier of first-person video understanding of skilled human activity, we also present a suite of benchmark tasks and their annotations, including fine-grained activity understanding, proficiency estimation, cross-view translation, and 3D hand/body pose. All resources are open sourced to fuel new research in the community. Project page: http://ego-exo4d-data.org/

Submitted 29 April, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

Comments: updated baseline results and dataset statistics to match the released v2 data; added table to appendix comparing stats of Ego-Exo4D alongside other datasets

arXiv:2309.10660 [pdf, other]

doi 10.1109/PESGRE58662.2023.10404171

Implicit Incorporation of Heuristics in MPC-Based Control of a Hydrogen Plant

Authors: Thomas Schmitt, Jens Engel, Martin Kopp, Tobias Rodemann

Abstract: The replacement of fossil fuels in combination with an increasing share of renewable energy sources leads to an increased focus on decentralized microgrids. One option is the local production of green hydrogen in combination with fuel cell vehicles (FCVs). In this paper, we develop a control strategy based on Model Predictive Control (MPC) for an energy management system (EMS) of a hydrogen plant, which is currently under installation in Offenbach, Germany. The plant includes an electrolyzer, a compressor, a low pressure storage tank, and six medium pressure storage tanks with complex heuristic physical coupling during the filling and extraction of hydrogen. Since these heuristics are too complex to be incorporated into the optimal control problem (OCP) explicitly, we propose a novel approach to do so implicitly. First, the MPC is executed without considering them. Then, the so-called allocator uses a heuristic model (of arbitrary complexity) to verify whether the MPC's plan is valid. If not, it introduces additional constraints to the MPC's OCP to implicitly respect the tanks' pressure levels. The MPC is executed again and the new plan is applied to the plant. Simulation results with real-world measurement data of the facility's energy management and realistic fueling scenarios show its advantages over rule-based control.

Submitted 19 September, 2023; originally announced September 2023.

Comments: 8 pages, 3 figures. To be published in IEEE 3rd International Conference on Power Electronics, Smart Grid, and Renewable Energy (PESGRE 2023) proceedings

arXiv:2309.08803 [pdf, other]

Robust Indoor Localization with Ranging-IMU Fusion

Authors: Fan Jiang, David Caruso, Ashutosh Dhekne, Qi Qu, Jakob Julian Engel, Jing Dong

Abstract: Indoor wireless ranging localization is a promising approach for low-power and high-accuracy localization of wearable devices. A primary challenge in this domain stems from non-line of sight propagation of radio waves. This study tackles a fundamental issue in wireless ranging: the unpredictability of real-time multipath determination, especially in challenging conditions such as when there is no direct line of sight. We achieve this by fusing range measurements with inertial measurements obtained from a low cost Inertial Measurement Unit (IMU). For this purpose, we introduce a novel asymmetric noise model crafted specifically for non-Gaussian multipath disturbances. Additionally, we present a novel Levenberg-Marquardt (LM)-family trust-region adaptation of the iSAM2 fusion algorithm, which is optimized for robust performance for our ranging-IMU fusion problem. We evaluate our solution in a densely occupied real office environment. Our proposed solution can achieve temporally consistent localization with an average absolute accuracy of $\sim$0.3m in real-world settings. Furthermore, our results indicate that we can achieve comparable accuracy even with infrequent (1Hz) range measurements.

Submitted 15 September, 2023; originally announced September 2023.

arXiv:2308.15634 [pdf, other]

doi 10.1103/PhysRevLett.132.182502

Ab initio uncertainty quantification of neutrinoless double-beta decay in $^{76}$Ge

Authors: A. Belley, J. M. Yao, B. Bally, J. Pitcher, J. Engel, H. Hergert, J. D. Holt, T. Miyagi, T. R. Rodriguez, A. M. Romero, S. R. Stroberg, X. Zhang

Abstract: The observation of neutrinoless double-beta ($0νββ$) decay would offer proof of lepton number violation, demonstrating that neutrinos are Majorana particles, while also helping us understand why there is more matter than antimatter in the Universe. If the decay is driven by the exchange of the three known light neutrinos, a discovery would, in addition, link the observed decay rate to the neutrino mass scale through a theoretical quantity known as the nuclear matrix element (NME). Accurate values of the NMEs for all nuclei considered for use in $0νββ$ experiments are therefore crucial for designing and interpreting those experiments. Here, we report the first comprehensive ab initio uncertainty quantification of the $0νββ$-decay NME, in the key nucleus $^{76}$Ge. Our method employs nuclear strong and weak interactions derived within chiral effective field theory and recently developed many-body emulators. Our result, with a conservative treatment of uncertainty, is an NME of $2.60^{+1.28}_{-1.36}$, which, together with the best-existing half-life sensitivity and phase-space factor, sets an upper limit for effective neutrino mass of $187^{+205}_{-62}$ meV. The result is important for designing next-generation germanium detectors aiming to cover the entire inverted hierarchy region of neutrino masses.

Submitted 19 January, 2024; v1 submitted 29 August, 2023; originally announced August 2023.

Comments: 7 pages, 1 table, and 2 figures

Journal ref: Phys. Rev. Lett. 132, 182502 (2024)

arXiv:2308.13561 [pdf, other]

Project Aria: A New Tool for Egocentric Multi-Modal AI Research

Authors: Jakob Engel, Kiran Somasundaram, Michael Goesele, Albert Sun, Alexander Gamino, Andrew Turner, Arjang Talattof, Arnie Yuan, Bilal Souti, Brighid Meredith, Cheng Peng, Chris Sweeney, Cole Wilson, Dan Barnes, Daniel DeTone, David Caruso, Derek Valleroy, Dinesh Ginjupalli, Duncan Frost, Edward Miller, Elias Mueggler, Evgeniy Oleinik, Fan Zhang, Guruprasad Somasundaram, Gustavo Solaira, et al. (49 additional authors not shown)

Abstract: Egocentric, multi-modal data as available on future augmented reality (AR) devices provides unique challenges and opportunities for machine perception. These future devices will need to be all-day wearable in a socially acceptable form-factor to support always available, context-aware and personalized AI applications. Our team at Meta Reality Labs Research built the Aria device, an egocentric, multi-modal data recording and streaming device with the goal to foster and accelerate research in this area. In this paper, we describe the Aria device hardware including its sensor configuration and the corresponding software tools that enable recording and processing of such data.

Submitted 1 October, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

arXiv:2308.11802 [pdf, other]

Effects of Quasiparticle-Vibration Coupling on Gamow-Teller Strength and $β$ Decay with the Skyrme Proton-Neutron Finite-Amplitude Method

Authors: Qunqun Liu, Jonathan Engel, Nobuo Hinohara, Markus Kortelainen

Abstract: We adapt the proton-neutron finite-amplitude method, which in its original form is an efficient implementation of the Skyrme quasiparticle random phase approximation, to include the coupling of quasiparticles to like-particle phonons. The approach allows us to add beyond-QRPA correlations to computations of Gamow-Teller strength and $β$-decay rates in deformed nuclei for the first time. We test the approach in several deformed isotopes for which measured strength distributions are available. The additional correlations dramatically improve agreement with the data, and will lead to improved global $β$-decay rates.

Submitted 22 August, 2023; originally announced August 2023.

Comments: 7 pages, 6 figures

arXiv:2306.09080 [pdf, other]

doi 10.1109/CCTA54093.2023.10252861

Regression-Based Model Error Compensation for Hierarchical MPC Building Energy Management System

Authors: Thomas Schmitt, Jens Engel, Tobias Rodemann

Abstract: One of the major challenges in the development of energy management systems (EMSs) for complex buildings is accurate modeling. To address this, we propose an EMS that combines a Model Predictive Control (MPC) approach with data-driven model error compensation. The hierarchical MPC approach consists of two layers: an aggregator controls the overall energy flows of the building in an aggregated perspective, while a distributor distributes heating and cooling powers to individual temperature zones. The controllers of both layers employ regression-based error estimation to predict and incorporate the model error. The proposed approach is evaluated in a software-in-the-loop simulation using a physics-based digital twin model. Simulation results show the efficacy and robustness of the proposed approach.

Submitted 1 August, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

Comments: 8 pages, 4 figures. To be published in 2023 IEEE Conference on Control Technology and Applications (CCTA) proceedings

arXiv:2304.03451 [pdf, other]

Fundamental Symmetries, Neutrons, and Neutrinos (FSNN): Whitepaper for the 2023 NSAC Long Range Plan

Authors: B. Acharya, C. Adams, A. A. Aleksandrova, K. Alfonso, P. An, S. Baeßler, A. B. Balantekin, P. S. Barbeau, F. Bellini, V. Bellini, R. S. Beminiwattha, J. C. Bernauer, T. Bhattacharya, M. Bishof, A. E. Bolotnikov, P. A. Breur, M. Brodeur, J. P. Brodsky, L. J. Broussard, T. Brunner, D. P. Burdette, J. Caylor, M. Chiu, V. Cirigliano, J. A. Clark, et al. (154 additional authors not shown)

Abstract: This whitepaper presents the research priorities decided on by attendees of the 2022 Town Meeting for Fundamental Symmetries, Neutrons and Neutrinos, which took place December 13-15, 2022 in Chapel Hill, NC, as part of the Nuclear Science Advisory Committee (NSAC) 2023 Long Range Planning process. A total of 275 scientists registered for the meeting. The whitepaper makes a number of explicit recommendations and justifies them in detail.

Submitted 6 April, 2023; originally announced April 2023.

arXiv:2302.03917 [pdf, other]

Noise2Music: Text-conditioned Music Generation with Diffusion Models

Authors: Qingqing Huang, Daniel S. Park, Tao Wang, Timo I. Denk, Andy Ly, Nanxin Chen, Zhengdong Zhang, Zhishuai Zhang, Jiahui Yu, Christian Frank, Jesse Engel, Quoc V. Le, William Chan, Zhifeng Chen, Wei Han

Abstract: We introduce Noise2Music, where a series of diffusion models is trained to generate high-quality 30-second music clips from text prompts. Two types of diffusion models, a generator model, which generates an intermediate representation conditioned on text, and a cascader model, which generates high-fidelity audio conditioned on the intermediate representation and possibly the text, are trained and utilized in succession to generate high-fidelity music. We explore two options for the intermediate representation, one using a spectrogram and the other using audio with lower fidelity. We find that the generated audio is not only able to faithfully reflect key elements of the text prompt such as genre, tempo, instruments, mood, and era, but goes beyond to ground fine-grained semantics of the prompt. Pretrained large language models play a key role in this story -- they are used to generate paired text for the audio of the training set and to extract embeddings of the text prompts ingested by the diffusion models. Generated examples: https://google-research.github.io/noise2music

Submitted 6 March, 2023; v1 submitted 8 February, 2023; originally announced February 2023.

Comments: 15 pages

arXiv:2302.02165 [pdf, other]

doi 10.1088/1361-6633/ad1e39

Opportunities for Fundamental Physics Research with Radioactive Molecules

Authors: Gordon Arrowsmith-Kron, Michail Athanasakis-Kaklamanakis, Mia Au, Jochen Ballof, Robert Berger, Anastasia Borschevsky, Alexander A. Breier, Fritz Buchinger, Dmitry Budker, Luke Caldwell, Christopher Charles, Nike Dattani, Ruben P. de Groote, David DeMille, Timo Dickel, Jacek Dobaczewski, Christoph E. Düllmann, Ephraim Eliav, Jon Engel, Mingyu Fan, Victor Flambaum, Kieran T. Flanagan, Alyssa Gaiser, Ronald Garcia Ruiz, Konstantin Gaul, et al. (37 additional authors not shown)

Abstract: Molecules containing short-lived, radioactive nuclei are uniquely positioned to enable a wide range of scientific discoveries in the areas of fundamental symmetries, astrophysics, nuclear structure, and chemistry. Recent advances in the ability to create, cool, and control complex molecules down to the quantum level, along with recent and upcoming advances in radioactive species production at several facilities around the world, create a compelling opportunity to coordinate and combine these efforts to bring precision measurement and control to molecules containing extreme nuclei. In this manuscript, we review the scientific case for studying radioactive molecules, discuss recent atomic, molecular, nuclear, astrophysical, and chemical advances which provide the foundation for their study, describe the facilities where these species are and will be produced, and provide an outlook for the future of this nascent field.

Submitted 4 February, 2023; originally announced February 2023.

Journal ref: Rep. Prog. Phys. 87 084301 (2024)

arXiv:2301.12662 [pdf, other]

SingSong: Generating musical accompaniments from singing

Authors: Chris Donahue, Antoine Caillon, Adam Roberts, Ethan Manilow, Philippe Esling, Andrea Agostinelli, Mauro Verzetti, Ian Simon, Olivier Pietquin, Neil Zeghidour, Jesse Engel

Abstract: We present SingSong, a system that generates instrumental music to accompany input vocals, potentially offering musicians and non-musicians alike an intuitive new way to create music featuring their own voice. To accomplish this, we build on recent developments in musical source separation and audio generation. Specifically, we apply a state-of-the-art source separation algorithm to a large corpus of music audio to produce aligned pairs of vocals and instrumental sources. Then, we adapt AudioLM (Borsos et al., 2022) -- a state-of-the-art approach for unconditional audio generation -- to be suitable for conditional "audio-to-audio" generation tasks, and train it on the source-separated (vocal, instrumental) pairs. In a pairwise comparison with the same vocal inputs, listeners expressed a significant preference for instrumentals generated by SingSong compared to those from a strong retrieval baseline. Sound examples at https://g.co/magenta/singsong

Submitted 29 January, 2023; originally announced January 2023.

arXiv:2301.11325 [pdf, other]

MusicLM: Generating Music From Text

Authors: Andrea Agostinelli, Timo I. Denk, Zalán Borsos, Jesse Engel, Mauro Verzetti, Antoine Caillon, Qingqing Huang, Aren Jansen, Adam Roberts, Marco Tagliasacchi, Matt Sharifi, Neil Zeghidour, Christian Frank

Abstract: We introduce MusicLM, a model generating high-fidelity music from text descriptions such as "a calming violin melody backed by a distorted guitar riff". MusicLM casts the process of conditional music generation as a hierarchical sequence-to-sequence modeling task, and it generates music at 24 kHz that remains consistent over several minutes. Our experiments show that MusicLM outperforms previous systems both in audio quality and adherence to the text description. Moreover, we demonstrate that MusicLM can be conditioned on both text and a melody in that it can transform whistled and hummed melodies according to the style described in a text caption. To support future research, we publicly release MusicCaps, a dataset composed of 5.5k music-text pairs, with rich text descriptions provided by human experts.

Submitted 26 January, 2023; originally announced January 2023.

Comments: Supplementary material at https://google-research.github.io/seanet/musiclm/examples and https://kaggle.com/datasets/googleai/musiccaps

arXiv:2212.11099 [pdf, other]

Neutrinoless Double Beta Decay

Authors: C. Adams, K. Alfonso, C. Andreoiu, E. Angelico, I. J. Arnquist, J. A. A. Asaadi, F. T. Avignone, S. N. Axani, A. S. Barabash, P. S. Barbeau, L. Baudis, F. Bellini, M. Beretta, T. Bhatta, V. Biancacci, M. Biassoni, E. Bossio, P. A. Breur, J. P. Brodsky, C. Brofferio, E. Brown, R. Brugnera, T. Brunner, N. Burlac, E. Caden, et al. (207 additional authors not shown)

Abstract: This White Paper, prepared for the Fundamental Symmetries, Neutrons, and Neutrinos Town Meeting related to the 2023 Nuclear Physics Long Range Plan, makes the case for double beta decay as a critical component of the future nuclear physics program. The major experimental collaborations and many theorists have endorsed this white paper.

Submitted 21 December, 2022; originally announced December 2022.

Comments: white paper submitted for the Fundamental Symmetries, Neutrons, and Neutrinos Town Meeting in support of the US Nuclear Physics Long Range Planning Process

arXiv:2212.08038 [pdf, ps, other]

Redefining Relationships in Music

Authors: Christian Detweiler, Beth Coleman, Fernando Diaz, Lieke Dom, Chris Donahue, Jesse Engel, Cheng-Zhi Anna Huang, Larry James, Ethan Manilow, Amanda McCroskery, Kyle Pedersen, Pamela Peter-Agbia, Negar Rostamzadeh, Robert Thomas, Marco Zamarato, Ben Zevenbergen

Abstract: AI tools increasingly shape how we discover, make and experience music. While these tools can have the potential to empower creativity, they may fundamentally redefine relationships between stakeholders, to the benefit of some and the detriment of others. In this position paper, we argue that these tools will fundamentally reshape our music culture, with profound effects (for better and for worse) on creators, consumers and the commercial enterprises that often connect them. By paying careful attention to emerging Music AI technologies and developments in other creative domains and understanding the implications, people working in this space could decrease the possible negative impacts on the practice, consumption and meaning of music. Given that many of these technologies are already available, there is some urgency in conducting analyses of these technologies now. It is important that people developing and working with these tools address these issues now to help guide their evolution to be equitable and empower creativity. We identify some potential risks and opportunities associated with existing and forthcoming AI tools for music, though more work is needed to identify concrete actions which leverage the opportunities while mitigating risks.

Submitted 16 December, 2022; v1 submitted 13 December, 2022; originally announced December 2022.

Comments: Presented at Cultures in AI/AI in Culture workshop at NeurIPS 2022

arXiv:2209.14458 [pdf, other]

The Chamber Ensemble Generator: Limitless High-Quality MIR Data via Generative Modeling

Authors: Yusong Wu, Josh Gardner, Ethan Manilow, Ian Simon, Curtis Hawthorne, Jesse Engel

Abstract: Data is the lifeblood of modern machine learning systems, including for those in Music Information Retrieval (MIR). However, MIR has long been mired by small datasets and unreliable labels. In this work, we propose to break this bottleneck using generative modeling. By pipelining a generative model of notes (Coconet trained on Bach Chorales) with a structured synthesis model of chamber ensembles (MIDI-DDSP trained on URMP), we demonstrate a system capable of producing unlimited amounts of realistic chorale music with rich annotations including mixes, stems, MIDI, note-level performance attributes (staccato, vibrato, etc.), and even fine-grained synthesis parameters (pitch, amplitude, etc.). We call this system the Chamber Ensemble Generator (CEG), and use it to generate a large dataset of chorales from four different chamber ensembles (CocoChorales). We demonstrate that data generated using our approach improves state-of-the-art models for music transcription and source separation, and we release both the system and the dataset as an open-source foundation for future work in the MIR community.

Submitted 28 September, 2022; originally announced September 2022.

arXiv:2209.10009 [pdf, other]

Elucidating the finite temperature quasiparticle random phase approximation

Authors: E. M. Ney, A. Ravlić, J. Engel, N. Paar

Abstract: In numerous astrophysical scenarios, such as core-collapse supernovae and neutron star mergers, as well as in heavy-ion collision experiments, transitions between thermally populated nuclear excited states have been shown to play an important role. Due to its simplicity and excellent extrapolation ability, the finite-temperature quasiparticle random phase approximation (FT-QRPA) presents itself as an efficient method to study the properties of hot nuclei. The statistical ensembles in the FT-QRPA make the theory much richer than its zero-temperature counterpart, but also obscure the meaning of various physical quantities. In this work, we clarify several aspects of the FT-QRPA, including notations seen in the literature, and demonstrate how to extract physical quantities from the theory. To exemplify the correct treatment of finite-temperature transitions, we place special emphasis on the charge-exchange transitions described within the proton-neutron FT-QRPA (FT-PNQRPA). With the FT-PNQRPA built on the nuclear energy-density functional theory, we obtain solutions using a relativistic matrix approach and also the non-relativistic finite amplitude method. We show that the Ikeda sum rule is fulfilled with the proper treatment of de-excitations from thermally populated excited states. Additionally, we demonstrate the impact of these transitions on stellar electron capture (EC) rates in ${}^{58,78}$Ni. While their inclusion does not influence the EC rates in ${}^{58}$Ni, the rates in ${}^{78}$Ni are dominated by de-excitations for temperatures $T > 0.5$ MeV.
In systems with a large negative $Q$-value, the inclusion of de-excitations within the FT-QRPA is necessary for a complete description of reaction rates at finite temperature.

Submitted 20 September, 2022; originally announced September 2022.

Comments: 19 pages, 4 figures, submitted for publication

arXiv:2208.06373 [pdf, other]

doi 10.3847/1538-4357/acaf56

The Influence of Beta Decay Rates on r-Process Observables

Authors: Kelsey A. Lund, J. Engel, G. C. McLaughlin, M. R. Mumpower, E. M. Ney, R. Surman

Abstract: The rapid neutron capture process (r-process) is one of the main mechanisms whereby elements heavier than iron are synthesized, and is entirely responsible for the natural production of the actinides. Kilonova emissions are modeled as being largely powered by the radioactive decay of species synthesized via the r-process. Given that the r-process occurs far from nuclear stability, unmeasured beta decay rates play an essential role in setting the time scale for the r-process. In an effort to better understand the sensitivity of kilonova modeling to different theoretical global beta-decay descriptions, we incorporate these into nucleosynthesis calculations. We compare the results of these calculations and highlight differences in kilonova nuclear energy generation and light curve predictions, as well as final abundances and their implications for nuclear cosmochronometry. We investigate scenarios where differences in beta decay rates are responsible for increased nuclear heating on time scales of days that propagates into a significantly increased average bolometric luminosity between 1-10 days post-merger. We identify key nuclei, both measured and unmeasured, whose decay rates directly impact nuclear heating generation on timescales responsible for light curve evolution. We also find that uncertainties in beta decay rates significantly impact age estimates from cosmochronometry.

Submitted 12 August, 2022; originally announced August 2022.

Report number: LA-UR 22-28160

arXiv:2207.01085 [pdf, other]

doi 10.1088/1361-6471/aca03e

Towards Precise and Accurate Calculations of Neutrinoless Double-Beta Decay: Project Scoping Workshop Report

Authors: V. Cirigliano, Z. Davoudi, J. Engel, R. J. Furnstahl, G. Hagen, U. Heinz, H. Hergert, M. Horoi, C. W. Johnson, A. Lovato, E. Mereghetti, W. Nazarewicz, A. Nicholson, T. Papenbrock, S. Pastore, M. Plumlee, D. R. Phillips, P. E. Shanahan, S. R. Stroberg, F. Viens, A. Walker-Loud, K. A. Wendt, S. M. Wild

Abstract: We present the results of a National Science Foundation (NSF) Project Scoping Workshop, the purpose of which was to assess the current status of calculations for the nuclear matrix elements governing neutrinoless double-beta decay and determine if more work on them is required. After reviewing important recent progress in the application of effective field theory, lattice quantum chromodynamics, and ab initio nuclear-structure theory to double-beta decay, we discuss the state of the art in nuclear-physics uncertainty quantification and then construct a road map for work in all these areas to fully complement the increasingly sensitive experiments in operation and under development. The road map contains specific projects in theoretical and computational physics as well as an uncertainty-quantification plan that employs Bayesian Model Mixing and an analysis of correlations between double-beta-decay rates and other observables. The goal of this program is a set of accurate and precise matrix elements, in all nuclei of interest to experimentalists, delivered together with carefully assessed uncertainties. Such calculations will allow crisp conclusions from the observation or non-observation of neutrinoless double-beta decay, no matter what new physics is at play.

Submitted 3 July, 2022; originally announced July 2022.

Comments: This Project Scoping Workshop report is focused on the US context for the theory of neutrinoless double beta decay. Its authors plan to produce a journal article that addresses similar issues, but is more inclusive as regards non-US efforts on this problem. We would be happy to receive further input that will help us refine our text before it is submitted to the journal

Report number: INT-PUB-22-018

Journal ref: J. Phys. G: Nucl. Part. Phys. 49, 120502 (2022)

arXiv:2206.05408 [pdf, other]

Multi-instrument Music Synthesis with Spectrogram Diffusion

Authors: Curtis Hawthorne, Ian Simon, Adam Roberts, Neil Zeghidour, Josh Gardner, Ethan Manilow, Jesse Engel

Abstract: An ideal music synthesizer should be both interactive and expressive, generating high-fidelity audio in realtime for arbitrary combinations of instruments and notes. Recent neural synthesizers have exhibited a tradeoff between domain-specific models that offer detailed control of only specific instruments, or raw waveform models that can train on any music but with minimal control and slow generation. In this work, we focus on a middle ground of neural synthesizers that can generate audio from MIDI sequences with arbitrary combinations of instruments in realtime. This enables training on a wide range of transcription datasets with a single model, which in turn offers note-level control of composition and instrumentation across a wide range of instruments. We use a simple two-stage process: MIDI to spectrograms with an encoder-decoder Transformer, then spectrograms to audio with a generative adversarial network (GAN) spectrogram inverter. We compare training the decoder as an autoregressive model and as a Denoising Diffusion Probabilistic Model (DDPM) and find that the DDPM approach is superior both qualitatively and as measured by audio reconstruction and Fréchet distance metrics. Given the interactivity and generality of this approach, we find this to be a promising first step towards interactive and expressive neural synthesis for arbitrary combinations of instruments and notes.

Submitted 12 December, 2022; v1 submitted 10 June, 2022; originally announced June 2022.

arXiv:2206.04615 [pdf, other]

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline.
Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.

Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

arXiv:2204.12971 [pdf, other]

doi 10.1103/PhysRevC.106.014315

Ab initio studies of double Gamow-Teller transition and its correlation with neutrinoless double beta decay

Authors: J. M. Yao, I. Ginnett, A. Belley, T. Miyagi, R. Wirth, S. Bogner, J. Engel, H. Hergert, J. D. Holt, S. R. Stroberg

Abstract: We use chiral interactions and several {\em ab initio} methods to compute the nuclear matrix elements (NMEs) for ground-state to ground-state double Gamow-Teller transitions in a range of isotopes, and explore the correlation of these NMEs with those for neutrinoless double beta decay produced by the exchange of a light Majorana neutrino. When all the NMEs of both isospin-conserving and isospin-changing transitions from the {\em ab initio} calculations are considered, the correlation is strong. For the experimentally relevant isospin-changing transitions by themselves, however, the correlation is weaker and may not be helpful for reducing the uncertainty in the NMEs for neutrinoless double-beta decay.

Submitted 7 July, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

Comments: 16 pages with 19 figures, Phys. Rev. C (in press)

Journal ref: Phys. Rev. C 106, 014315 (2022)

arXiv:2203.15182 [pdf, other]

Long-term Visual Map Sparsification with Heterogeneous GNN

Authors: Ming-Fang Chang, Yipu Zhao, Rajvi Shah, Jakob J. Engel, Michael Kaess, Simon Lucey

Abstract: We address the problem of map sparsification for long-term visual localization. For map sparsification, a commonly employed assumption is that the pre-built map and the later captured localization query are consistent. However, this assumption can be easily violated in the dynamic world. Additionally, the map size grows as new data accumulate through time, causing large data overhead in the long term. In this paper, we aim to overcome the environmental changes and reduce the map size at the same time by selecting points that are valuable to future localization. Inspired by the recent progress in Graph Neural Networks (GNNs), we propose the first work that models SfM maps as heterogeneous graphs and predicts 3D point importance scores with a GNN, which enables us to directly exploit the rich information in the SfM map graph. Two novel supervisions are proposed: 1) a data-fitting term for selecting valuable points to future localization based on training queries; 2) a K-Cover term for selecting sparse points with full map coverage. The experiments show that our method selected map points on stable and widely visible structures and outperformed baselines in localization performance.

Submitted 28 March, 2022; originally announced March 2022.

Comments: Accepted by CVPR 2022

arXiv:2203.15140 [pdf, other]

Improving Source Separation by Explicitly Modeling Dependencies Between Sources

Authors: Ethan Manilow, Curtis Hawthorne, Cheng-Zhi Anna Huang, Bryan Pardo, Jesse Engel

Abstract: We propose a new method for training a supervised source separation system that aims to learn the interdependent relationships between all combinations of sources in a mixture. Rather than independently estimating each source from a mix, we reframe the source separation problem as an Orderless Neural Autoregressive Density Estimator (NADE), and estimate each source from both the mix and a random subset of the other sources. We adapt a standard source separation architecture, Demucs, with additional inputs for each individual source, in addition to the input mixture. We randomly mask these input sources during training so that the network learns the conditional dependencies between the sources. By pairing this training method with a block Gibbs sampling procedure at inference time, we demonstrate that the network can iteratively improve its separation performance by conditioning a source estimate on its earlier source estimates. Experiments on two source separation datasets show that training a Demucs model with an Orderless NADE approach and using Gibbs sampling (up to 512 steps) at inference time strongly outperforms a Demucs baseline that uses a standard regression loss and direct (one step) estimation of sources.

Submitted 28 March, 2022; originally announced March 2022.

Comments: To appear at ICASSP 2022

arXiv:2203.12169 [pdf, other]

Neutrinoless Double-Beta Decay: A Roadmap for Matching Theory to Experiment

Authors: Vincenzo Cirigliano, Zohreh Davoudi, Wouter Dekens, Jordy de Vries, Jonathan Engel, Xu Feng, Julia Gehrlein, Michael L. Graesser, Lukáš Gráf, Heiko Hergert, Luchang Jin, Emanuele Mereghetti, Amy Nicholson, Saori Pastore, Michael J. Ramsey-Musolf, Richard Ruiz, Martin Spinrath, Ubirajara van Kolck, André Walker-Loud

Abstract: The observation of neutrino oscillations and hence non-zero neutrino masses provided a milestone in the search for physics beyond the Standard Model. But even though we now know that neutrinos are massive, the nature of neutrino masses, i.e., whether they are Dirac or Majorana, remains an open question. A smoking-gun signature of Majorana neutrinos is the observation of neutrinoless double-beta decay, a process that violates the lepton-number conservation of the Standard Model. This white paper focuses on the theoretical aspects of the neutrinoless double-beta decay program and lays out a roadmap for future developments. The roadmap is a multi-scale path starting from high-energy models of neutrinoless double-beta decay all the way to the low-energy nuclear many-body problem that needs to be solved to supplement measurements of the decay rate. The path goes through a systematic effective-field-theory description of the underlying processes at various scales and needs to be supplemented by lattice quantum chromodynamics input. The white paper also discusses the interplay between neutrinoless double-beta decay, experiments at the Large Hadron Collider and results from astrophysics and cosmology in probing simplified models of lepton-number violation at the TeV scale, and the generation of the matter-antimatter asymmetry via leptogenesis.
This white paper is prepared for the topical groups TF11 (Theory of Neutrino Physics), TF05 (Lattice Gauge Theory), RF04 (Baryon and Lepton Number Violating Processes), NF03 (Beyond the Standard Model) and NF05 (Neutrino Properties) within the Theory Frontier, Rare Processes and Precision Frontier, and Neutrino Physics Frontier of the U.S. Community Study on the Future of Particle Physics (Snowmass 2021).

Submitted 22 March, 2022; originally announced March 2022.

Comments: Contribution to Snowmass 2021

Report number: LA-UR-22-22587

arXiv:2203.08103 [pdf, other]

Electric dipole moments and the search for new physics

Authors: Ricardo Alarcon, Jim Alexander, Vassilis Anastassopoulos, Takatoshi Aoki, Rick Baartman, Stefan Baeßler, Larry Bartoszek, Douglas H. Beck, Franco Bedeschi, Robert Berger, Martin Berz, Hendrick L. Bethlem, Tanmoy Bhattacharya, Michael Blaskiewicz, Thomas Blum, Themis Bowcock, Anastasia Borschevsky, Kevin Brown, Dmitry Budker, Sergey Burdin, Brendan C. Casey, Gianluigi Casse, Giovanni Cantatore, Lan Cheng, Timothy Chupp, et al. (118 additional authors not shown)

Abstract: Static electric dipole moments of nondegenerate systems probe mass scales for physics beyond the Standard Model well beyond those reached directly at high energy colliders. Discrimination between different physics models, however, requires complementary searches in atomic-molecular-and-optical, nuclear and particle physics. In this report, we discuss the current status and prospects in the near future for a compelling suite of such experiments, along with developments needed in the encompassing theoretical framework.

Submitted 4 April, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

Comments: Contribution to Snowmass 2021; updated with community edits and endorsements

arXiv:2203.03022 [pdf, ps, other]

HEAR: Holistic Evaluation of Audio Representations

Authors: Joseph Turian, Jordie Shier, Humair Raj Khan, Bhiksha Raj, Björn W. Schuller, Christian J. Steinmetz, Colin Malloy, George Tzanetakis, Gissel Velarde, Kirk McNally, Max Henry, Nicolas Pinto, Camille Noufi, Christian Clough, Dorien Herremans, Eduardo Fonseca, Jesse Engel, Justin Salamon, Philippe Esling, Pranay Manocha, Shinji Watanabe, Zeyu Jin, Yonatan Bisk

Abstract: What audio embedding approach generalizes best to a wide range of downstream tasks across a variety of everyday domains without fine-tuning? The aim of the HEAR benchmark is to develop a general-purpose audio representation that provides a strong basis for learning in a wide variety of tasks and scenarios. HEAR evaluates audio representations using a benchmark suite across a variety of domains, including speech, environmental sound, and music. HEAR was launched as a NeurIPS 2021 shared challenge. In the spirit of shared exchange, each participant submitted an audio embedding model following a common API that is general-purpose, open-source, and freely available to use. Twenty-nine models by thirteen external teams were evaluated on nineteen diverse downstream tasks derived from sixteen datasets. Open evaluation code, submitted models and datasets are key contributions, enabling comprehensive and reproducible evaluation, as well as previously impossible longitudinal studies. It still remains an open question whether one single general-purpose audio representation can perform as holistically as the human ear.

Submitted 29 May, 2022; v1 submitted 6 March, 2022; originally announced March 2022.

Comments: to appear in Proceedings of Machine Learning Research (PMLR): NeurIPS 2021 Competition Track

arXiv:2203.01619 [pdf, other]

doi 10.1103/PhysRevC.105.064317

Solving Nuclear Structure Problems with the Adaptive Variational Quantum Algorithm

Authors: A. M. Romero, J. Engel, Ho Lun Tang, Sophia E. Economou

Abstract: We use the Lipkin-Meshkov-Glick (LMG) model and the valence-space nuclear shell model to examine the likely performance of variational quantum eigensolvers in nuclear-structure theory. The LMG model exhibits both a phase transition and spontaneous symmetry breaking at the mean-field level in one of the phases, features that characterize collective dynamics in medium-mass and heavy nuclei. We show that with appropriate modifications, the ADAPT-VQE algorithm, a particularly flexible and accurate variational approach, is not troubled by these complications. We treat up to 12 particles and show that the number of quantum operations needed to approach the ground-state energy scales linearly with the number of qubits. We find similar scaling when the algorithm is applied to the nuclear shell model with realistic interactions in the $sd$ and $pf$ shells. Although most of these simulations contain no noise, we use a noise model from real IBM hardware to show that for the LMG model with four particles, weak noise has no effect on the efficiency of the algorithm.

Submitted 28 June, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

Comments: 10 pages, 7 figures. Identical in content to published version. Now includes analysis of noise

Journal ref: Phys. Rev. C 105, 064317 (2022)

arXiv:2202.07765 [pdf, other]

General-purpose, long-context autoregressive modeling with Perceiver AR

Authors: Curtis Hawthorne, Andrew Jaegle, Cătălina Cangea, Sebastian Borgeaud, Charlie Nash, Mateusz Malinowski, Sander Dieleman, Oriol Vinyals, Matthew Botvinick, Ian Simon, Hannah Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, João Carreira, Jesse Engel

Abstract: Real-world data is high-dimensional: a book, image, or musical performance can easily contain hundreds of thousands of elements even after compression. However, the most commonly used autoregressive models, Transformers, are prohibitively expensive to scale to the number of inputs and layers needed to capture this long-range structure. We develop Perceiver AR, an autoregressive, modality-agnostic architecture which uses cross-attention to map long-range inputs to a small number of latents while also maintaining end-to-end causal masking. Perceiver AR can directly attend to over a hundred thousand tokens, enabling practical long-context density estimation without the need for hand-crafted sparsity patterns or memory mechanisms. When trained on images or music, Perceiver AR generates outputs with clear long-term coherence and structure. Our architecture also obtains state-of-the-art likelihood on long-sequence benchmarks, including 64 x 64 ImageNet images and PG-19 books.

Submitted 14 June, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

Comments: ICML 2022

arXiv:2201.12983 [pdf, other]

doi 10.1103/PhysRevC.105.044314

Global calculation of two-neutrino double-$β$ decay within the finite amplitude method in nuclear density functional theory

Authors: Nobuo Hinohara, Jonathan Engel

Abstract: Two-neutrino double-beta ($2νββ$) decay has been used to constrain the neutron-proton part of effective interactions, which in turn is used to compute the nuclear matrix elements for neutrinoless double-beta decay, the observation of which would have important consequences for fundamental physics. We carefully examine $2νββ$ matrix elements within the proton-neutron quasiparticle random-phase approximation with nuclear energy density functionals. We work with functionals that are fit globally to single-beta-decay half-lives and charge-exchange giant-resonance energies, but not to $2νββ$ half-lives themselves, to evaluate the $2νββ$ nuclear matrix elements for all important nuclei, including those whose half-lives have not yet been measured. Such a comprehensive evaluation in large model spaces without configuration truncation requires an efficient computational scheme; we employ a double contour integration within the finite amplitude method. The results generally reproduce the nuclear matrix element extracted from half-lives well, without the use of any of those half-lives in the fitting procedure. We present predictions of the matrix elements in a total of 27 nuclei with half-lives that are still unmeasured.

Submitted 19 April, 2022; v1 submitted 30 January, 2022; originally announced January 2022.

Comments: 16 pages, 5 figures

Journal ref: Phys. Rev. C 105, 044314 (2022)

arXiv:2112.14621 [pdf, other]

doi 10.1103/PhysRevC.105.034349

Two-body weak currents in heavy nuclei

Authors: E. M. Ney, J. Engel, N. Schunck

Abstract: In light and medium-mass nuclei, two-body weak currents from chiral effective field theory account for a significant portion of the phenomenological quenching of Gamow-Teller transition matrix elements. Here we examine the systematic effects of two-body axial currents on Gamow-Teller strength and $β$-decay rates in heavy nuclei within energy-density functional theory. Using a Skyrme functional and the charge-changing finite amplitude method, we add the contributions of two-body currents to the usual one-body linear response in the Gamow-Teller channel, both exactly and through a density-matrix expansion. The two-body currents, as expected, usually quench both summed Gamow-Teller strength and decay rates, but by an amount that decreases as the neutron excess grows. In addition, they can enhance individual low-lying transitions, leading to decay rates that are quite different from those that an energy-independent quenching would produce, particularly in neutron-rich nuclei. We show that both these unexpected effects are related to changes in the total nucleon density as the number of neutrons increases.

Submitted 29 December, 2021; originally announced December 2021.

Comments: 13 pages, 8 figures

arXiv:2112.09312 [pdf, other]

MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling

Authors: Yusong Wu, Ethan Manilow, Yi Deng, Rigel Swavely, Kyle Kastner, Tim Cooijmans, Aaron Courville, Cheng-Zhi Anna Huang, Jesse Engel

Abstract: Musical expression requires control of both what notes are played, and how they are performed. Conventional audio synthesizers provide detailed expressive controls, but at the cost of realism. Black-box neural audio synthesis and concatenative samplers can produce realistic audio, but have few mechanisms for control. In this work, we introduce MIDI-DDSP, a hierarchical model of musical instruments that enables both realistic neural audio synthesis and detailed user control. Starting from interpretable Differentiable Digital Signal Processing (DDSP) synthesis parameters, we infer musical notes and high-level properties of their expressive performance (such as timbre, vibrato, dynamics, and articulation). This creates a 3-level hierarchy (notes, performance, synthesis) that affords individuals the option to intervene at each level, or utilize trained priors (performance given notes, synthesis given performance) for creative assistance. Through quantitative experiments and listening tests, we demonstrate that this hierarchy can reconstruct high-fidelity audio, accurately predict performance attributes for a note sequence, independently manipulate the attributes of a given performance, and as a complete system, generate realistic audio from a novel note sequence. By utilizing an interpretable hierarchy, with multiple levels of granularity, MIDI-DDSP opens the door to assistive tools to empower individuals across a diverse range of musical experience.

Submitted 17 March, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

Comments: Accepted by International Conference on Learning Representations (ICLR) 2022

arXiv:2112.01626 [pdf, other]

doi 10.1103/PhysRevC.105.055801

Finite-temperature electron-capture rates for neutron-rich nuclei around N=50 and effects on core-collapse supernovae simulations

Authors: S. Giraud, E. M. Ney, A. Ravlić, R. G. T. Zegers, J. Engel, N. Paar, B. A. Brown, J. -M. Gabler, J. Lesniak, J. Rebenstock

Abstract: The temperature dependence of stellar electron-capture (EC) rates is investigated, with a focus on nuclei around $N=50$, just above $Z=28$, which play an important role during the collapse phase of core-collapse supernovae (CCSN). Two new microscopic calculations of stellar EC rates are obtained from relativistic and non-relativistic finite-temperature quasiparticle random-phase approximation approaches, for a conventional grid of temperatures and densities. In both approaches, EC rates due to Gamow-Teller transitions are included. In the relativistic calculation, contributions from first-forbidden transitions are also included, and add strongly to the EC rates. The new EC rates are compared with large-scale shell model calculations for the specific case of $^{86}$Kr, providing insight into the finite-temperature effects on the EC rates. At relevant thermodynamic conditions for core-collapse, the discrepancies between the different calculations of this work are within about one order of magnitude. Numerical simulations of CCSN are performed with the spherically-symmetric GR1D simulation code to quantify the impact of such differences on the dynamics of the collapse. These simulations also include EC rates based on two parametrized approximations. A comparison of the neutrino luminosities and enclosed mass at core bounce shows that differences between simulations with different sets of EC rates are relatively small ($\approx 5\%$), suggesting that the EC rates used as inputs for these simulations have become well constrained.

Submitted 2 December, 2021; originally announced December 2021.

arXiv:2111.14951 [pdf, other]

Expressive Communication: A Common Framework for Evaluating Developments in Generative Models and Steering Interfaces

Authors: Ryan Louie, Jesse Engel, Anna Huang

Abstract: There is an increasing interest from ML and HCI communities in empowering creators with better generative models and more intuitive interfaces with which to control them. In music, ML researchers have focused on training models capable of generating pieces with increasing long-range structure and musical coherence, while HCI researchers have separately focused on designing steering interfaces that support user control and ownership. In this study, we investigate through a common framework how developments in both models and user interfaces are important for empowering co-creation where the goal is to create music that communicates particular imagery or ideas (e.g., as is common for other purposeful tasks in music creation like establishing mood or creating accompanying music for another medium). Our study is distinguished in that it measures communication through both composers' self-reported experiences, and how listeners evaluate this communication through the music. In an evaluation study with 26 composers creating 100+ pieces of music and listeners providing 1000+ head-to-head comparisons, we find that more expressive models and more steerable interfaces are important and complementary ways to make a difference in composers communicating through music and supporting their creative empowerment.

Submitted 29 November, 2021; originally announced November 2021.

Comments: 15 pages, 6 figures, submitted to ACM Intelligent User Interfaces 2022 Conference

arXiv:2111.03017 [pdf, other]

MT3: Multi-Task Multitrack Music Transcription

Authors: Josh Gardner, Ian Simon, Ethan Manilow, Curtis Hawthorne, Jesse Engel

Abstract: Automatic Music Transcription (AMT), inferring musical notes from raw audio, is a challenging task at the core of music understanding. Unlike Automatic Speech Recognition (ASR), which typically focuses on the words of a single speaker, AMT often requires transcribing multiple instruments simultaneously, all while preserving fine-scale pitch and timing information. Further, many AMT datasets are "low-resource", as even expert musicians find music transcription difficult and time-consuming. Thus, prior work has focused on task-specific architectures, tailored to the individual instruments of each task. In this work, motivated by the promising results of sequence-to-sequence transfer learning for low-resource Natural Language Processing (NLP), we demonstrate that a general-purpose Transformer model can perform multi-task AMT, jointly transcribing arbitrary combinations of musical instruments across several transcription datasets. We show this unified training framework achieves high-quality transcription results across a range of datasets, dramatically improving performance for low-resource instruments (such as guitar), while preserving strong performance for abundant instruments (such as piano). Finally, by expanding the scope of AMT, we expose the need for more consistent evaluation metrics and better dataset alignment, and provide a strong baseline for this new direction of multi-task AMT.

Submitted 15 March, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

Comments: ICLR 2022 camera-ready version

arXiv:2108.04068 [pdf, other]

On the role of data, statistics and decisions in a pandemic

Authors: Beate Jahn, Sarah Friedrich, Joachim Behnke, Joachim Engel, Ursula Garczarek, Ralf Münnich, Markus Pauly, Adalbert Wilhelm, Olaf Wolkenhauer, Markus Zwick, Uwe Siebert, Tim Friede

Abstract: A pandemic poses particular challenges to decision-making because of the need to continuously adapt decisions to rapidly changing evidence and available data. For example, which countermeasures are appropriate at a particular stage of the pandemic? How can the severity of the pandemic be measured? What is the effect of vaccination in the population and which groups should be vaccinated first? The process of decision-making starts with data collection and modeling and continues to the dissemination of results and the subsequent decisions taken. The goal of this paper is to give an overview of this process and to provide recommendations for the different steps from a statistical perspective. In particular, we discuss a range of modeling techniques including mathematical, statistical and decision-analytic models along with their applications in the COVID-19 context. With this overview, we aim to foster the understanding of the goals of these modeling approaches and the specific data requirements that are essential for the interpretation of results and for successful interdisciplinary collaborations. A special focus is on the role played by data in these different models, and we incorporate into the discussion the importance of statistical literacy, and of effective dissemination and communication of findings.

Submitted 8 March, 2022; v1 submitted 6 August, 2021; originally announced August 2021.

arXiv:2107.09142 [pdf, other]

Sequence-to-Sequence Piano Transcription with Transformers

Authors: Curtis Hawthorne, Ian Simon, Rigel Swavely, Ethan Manilow, Jesse Engel

Abstract: Automatic Music Transcription has seen significant progress in recent years by training custom deep neural networks on large datasets. However, these models have required extensive domain-specific design of network architectures, input/output representations, and complex decoding schemes. In this work, we show that equivalent performance can be achieved using a generic encoder-decoder Transformer with standard decoding methods. We demonstrate that the model can learn to translate spectrogram inputs directly to MIDI-like output events for several transcription tasks. This sequence-to-sequence approach simplifies transcription by jointly modeling audio features and language-like output dependencies, thus removing the need for task-specific architectures. These results point toward possibilities for creating new Music Information Retrieval models by focusing on dataset creation and labeling rather than custom model design.

Submitted 19 July, 2021; originally announced July 2021.

arXiv:2105.03471 [pdf, other]

doi 10.1103/PhysRevC.104.054317

Application of efficient generator-coordinate subspace-selection algorithm to neutrinoless double-$β$ decay

Authors: A. M. Romero, J. M. Yao, B. Bally, T. R. Rodríguez, J. Engel

Abstract: The generator coordinate method begins with the variational construction of a set of non-orthogonal mean-field states that span a subspace of the full many-body Hilbert space. These states are then often projected onto states with good quantum numbers to restore symmetries, leading to a set with members that can be similar to one another, and it is sometimes possible to reduce this set without greatly affecting results. Here we propose a greedy algorithm that we call the energy-transition-orthogonality procedure (ENTROP) to select subsets of important states. As applied here, the approach selects on the basis of diagonal energy, orthogonality, and contribution to the matrix element that governs neutrinoless double-$β$ decay. We present both shell-model and preliminary ab initio calculations of this matrix element for the decay of $^{76}$Ge, with quadrupole deformation parameters and the isoscalar pairing strength as generator coordinates. ENTROP converges quickly, reducing significantly the number of basis states needed for an accurate calculation.

Submitted 20 June, 2021; v1 submitted 7 May, 2021; originally announced May 2021.

Comments: 7 pages, 9 figures, authors added, some details clarified

arXiv:2103.16091 [pdf, other]

Symbolic Music Generation with Diffusion Models

Authors: Gautam Mittal, Jesse Engel, Curtis Hawthorne, Ian Simon

Abstract: Score-based generative models and diffusion probabilistic models have been successful at generating high-quality samples in continuous domains such as images and audio. However, due to their Langevin-inspired sampling mechanisms, their application to discrete and sequential data has been limited. In this work, we present a technique for training diffusion models on sequential data by parameterizing the discrete domain in the continuous latent space of a pre-trained variational autoencoder. Our method is non-autoregressive and learns to generate sequences of latent embeddings through the reverse process and offers parallel generation with a constant number of iterative refinement steps. We apply this technique to modeling symbolic music and show strong unconditional generation and post-hoc conditional infilling results compared to autoregressive language models operating over the same continuous embeddings.

Submitted 25 November, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

Comments: ISMIR 2021

arXiv:2103.06089 [pdf, other]

Variable-rate discrete representation learning

Authors: Sander Dieleman, Charlie Nash, Jesse Engel, Karen Simonyan

Abstract: Semantically meaningful information content in perceptual signals is usually unevenly distributed. In speech signals for example, there are often many silences, and the speed of pronunciation can vary considerably. In this work, we propose slow autoencoders (SlowAEs) for unsupervised learning of high-level variable-rate discrete representations of sequences, and apply them to speech. We show that the resulting event-based representations automatically grow or shrink depending on the density of salient information in the input signals, while still allowing for faithful signal reconstruction. We develop run-length Transformers (RLTs) for event-based representation modelling and use them to construct language models in the speech domain, which are able to generate grammatical and semantically coherent utterances and continuations.

Submitted 10 March, 2021; originally announced March 2021.

Comments: 26 pages, 15 figures, samples can be found at https://vdrl.github.io/

arXiv:2008.09696 [pdf, other]

doi 10.1103/PhysRevLett.126.182502

Coupled-cluster calculations of neutrinoless double-beta decay in $^{48}$Ca

Authors: S. J. Novario, P. Gysbers, J. Engel, G. Hagen, G. R. Jansen, T. D. Morris, P. Navrátil, T. Papenbrock, S. Quaglioni

Abstract: We use coupled-cluster theory and nuclear interactions from chiral effective field theory to compute the nuclear matrix element for the neutrinoless double-beta decay of $^{48}$Ca. Benchmarks with the no-core shell model in several light nuclei inform us about the accuracy of our approach. For $^{48}$Ca we find a relatively small matrix element. We also compute the nuclear matrix element for the two-neutrino double-beta decay of $^{48}$Ca with a quenching factor deduced from two-body currents in a recent ab initio calculation of the Ikeda sum rule in $^{48}$Ca [Gysbers et al., Nature Physics 15, 428-431 (2019)].

Submitted 12 May, 2021; v1 submitted 21 August, 2020; originally announced August 2020.

Comments: 14 pages, 13 figures; Version accepted for publication, Supplemental material also updated

Journal ref: Phys. Rev. Lett. 126, 182502 (2021)

arXiv:2007.04957 [pdf, other]

doi 10.1103/PhysRevLett.125.212501

Gamow-Teller strength in $^{48}$Ca and $^{78}$Ni with the charge-exchange subtracted second random-phase approximation

Authors: D. Gambacurta, M. Grasso, J. Engel

Abstract: We develop a fully self-consistent subtracted second random-phase approximation for charge-exchange processes with Skyrme energy-density functionals. As a first application, we study Gamow-Teller excitations in the doubly-magic nucleus $^{48}$Ca, the lightest double-$β$ emitter that could be used in an experiment, and in $^{78}$Ni, the single-beta-decay rate of which is known. The amount of Gamow-Teller strength below 20 or 30 MeV is considerably smaller than in other energy-density-functional calculations and agrees better with experiment in $^{48}$Ca, as does the beta-decay rate in $^{78}$Ni. These important results, obtained without ad hoc quenching factors, are due to the presence of two-particle-two-hole configurations. Their density progressively increases with excitation energy, leading to a long high-energy tail in the spectrum, a fact that may have implications for the computation of nuclear matrix elements for neutrinoless double-$β$ decay in the same framework.

Submitted 29 September, 2020; v1 submitted 9 July, 2020; originally announced July 2020.

Journal ref: Phys. Rev. Lett. 125, 212501 (2020)

arXiv:2007.01867 [pdf, other]

doi 10.1109/LRA.2020.3007421

TLIO: Tight Learned Inertial Odometry

Authors: Wenxin Liu, David Caruso, Eddy Ilg, Jing Dong, Anastasios I. Mourikis, Kostas Daniilidis, Vijay Kumar, Jakob Engel

Abstract: In this work we propose a tightly-coupled Extended Kalman Filter framework for IMU-only state estimation. Strap-down IMU measurements provide relative state estimates based on the IMU kinematic motion model. However, the integration of measurements is sensitive to sensor bias and noise, causing significant drift within seconds. Recent research by Yan et al. (RoNIN) and Chen et al. (IONet) showed the capability of using trained neural networks to obtain accurate 2D displacement estimates from segments of IMU data and obtained good position estimates from concatenating them. This paper demonstrates a network that regresses 3D displacement estimates and their uncertainty, giving us the ability to tightly fuse the relative state measurement into a stochastic cloning EKF to solve for pose, velocity and sensor biases. We show that our network, trained with pedestrian data from a headset, can produce statistically consistent measurement and uncertainty to be used as the update step in the filter, and the tightly-coupled system outperforms velocity integration approaches in position estimates, and AHRS attitude filter in orientation estimates.

Submitted 10 July, 2020; v1 submitted 5 July, 2020; originally announced July 2020.

Comments: Correcting graph and bibliography. Adding journal reference information and DOI, in IEEE Robotics and Automation Letters

arXiv:2005.12883 [pdf, other]

doi 10.1103/PhysRevC.102.034326

Global Description of Beta Decay with the Axially-Deformed Skyrme Finite Amplitude Method: Extension to Odd-Mass and Odd-Odd Nuclei

Authors: E. M. Ney, J. Engel, N. Schunck

Abstract: We use the finite amplitude method (FAM), an efficient implementation of the quasiparticle random phase approximation, to compute beta-decay rates with Skyrme energy-density functionals for 3983 nuclei, essentially all the medium-mass and heavy isotopes on the neutron rich side of stability. We employ an extension of the FAM that treats odd-mass and odd-odd nuclear ground states in the equal filling approximation. Our rates are in reasonable agreement both with experimental data where available and with rates from other global calculations.

Submitted 28 May, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

Comments: 11 pages, 7 figures, supplemental material. Submitted to Phys. Rev. C

arXiv:2004.00188 [pdf, other]

Improving Perceptual Quality of Drum Transcription with the Expanded Groove MIDI Dataset

Authors: Lee Callender, Curtis Hawthorne, Jesse Engel

Abstract: We introduce the Expanded Groove MIDI dataset (E-GMD), an automatic drum transcription (ADT) dataset that contains 444 hours of audio from 43 drum kits, making it an order of magnitude larger than similar datasets, and the first with human-performed velocity annotations. We use E-GMD to optimize classifiers for use in downstream generation by predicting expressive dynamics (velocity) and show with listening tests that they produce outputs with improved perceptual quality, despite similar results on classification metrics. Via the listening tests, we argue that standard classifier metrics, such as accuracy and F-measure score, are insufficient proxies of performance in downstream tasks because they do not fully align with the perceptual quality of generated outputs.

Submitted 1 December, 2020; v1 submitted 31 March, 2020; originally announced April 2020.

Comments: Examples available at https://goo.gl/magenta/e-gmd-examples

arXiv:2001.05171 [pdf, other]

Teddy: A System for Interactive Review Analysis

Authors: Xiong Zhang, Jonathan Engel, Sara Evensen, Yuliang Li, Çağatay Demiralp, Wang-Chiew Tan

Abstract: Reviews are integral to e-commerce services and products. They contain a wealth of information about the opinions and experiences of users, which can help better understand consumer decisions and improve user experience with products and services. Today, data scientists analyze reviews by developing rules and models to extract, aggregate, and understand information embedded in the review text. However, working with thousands of reviews, which are typically noisy, incomplete text, can be daunting without proper tools. Here we first contribute results from an interview study that we conducted with fifteen data scientists who work with review text, providing insights into their practices and challenges. Results suggest data scientists need interactive systems for many review analysis tasks. In response, we introduce Teddy, an interactive system that enables data scientists to quickly obtain insights from reviews and improve their extraction and modeling pipelines.

Submitted 15 January, 2020; originally announced January 2020.

Comments: CHI'20

Showing 1–50 of 153 results for author: Engel, J