Showing 1–50 of 173 results for author: Sala, F

  1. arXiv:2408.17383  [pdf, other]

    cs.LG cs.AI

    MoRe Fine-Tuning with 10x Fewer Parameters

    Authors: Wenxuan Tan, Nicholas Roberts, Tzu-Heng Huang, Jitian Zhao, John Cooper, Samuel Guo, Chengyu Duan, Frederic Sala

    Abstract: Parameter-efficient fine-tuning (PEFT) techniques have unlocked the potential to cheaply and easily specialize large pretrained models. However, the most prominent approaches, like low-rank adapters (LoRA), depend on heuristics or rules-of-thumb for their architectural choices -- potentially limiting their performance for new models and architectures. This limitation suggests that techniques from…

    Submitted 30 August, 2024; originally announced August 2024.

  2. arXiv:2407.12450  [pdf, other]

    physics.acc-ph hep-ex

    Interim report for the International Muon Collider Collaboration (IMCC)

    Authors: C. Accettura, S. Adrian, R. Agarwal, C. Ahdida, C. Aimé, A. Aksoy, G. L. Alberghi, S. Alden, N. Amapane, D. Amorim, P. Andreetto, F. Anulli, R. Appleby, A. Apresyan, P. Asadi, M. Attia Mahmoud, B. Auchmann, J. Back, A. Badea, K. J. Bae, E. J. Bahng, L. Balconi, F. Balli, L. Bandiera, C. Barbagallo, et al. (362 additional authors not shown)

    Abstract: The International Muon Collider Collaboration (IMCC) [1] was established in 2020 following the recommendations of the European Strategy for Particle Physics (ESPP) and the implementation of the European Strategy for Particle Physics-Accelerator R&D Roadmap by the Laboratory Directors Group [2], hereinafter referred to as the European LDG roadmap. The Muon Collider Study (MuC) covers the accele…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: This document summarises the International Muon Collider Collaboration (IMCC) progress and status of the Muon Collider R&D programme

  3. arXiv:2407.11004  [pdf, other]

    cs.CL cs.AI cs.LG

    The ALCHEmist: Automated Labeling 500x CHEaper Than LLM Data Annotators

    Authors: Tzu-Heng Huang, Catherine Cao, Vaishnavi Bhargava, Frederic Sala

    Abstract: Large pretrained models can be used as annotators, helping replace or augment crowdworkers and enabling distilling generalist models into smaller specialist models. Unfortunately, this comes at a cost: employing top-of-the-line models often requires paying thousands of dollars for API calls, while the resulting datasets are static and challenging to audit. To address these challenges, we propose a…

    Submitted 25 June, 2024; originally announced July 2024.

  4. arXiv:2407.03651  [pdf, other]

    cs.CL cs.AI

    Evaluating Language Model Context Windows: A "Working Memory" Test and Inference-time Correction

    Authors: Amanda Dsouza, Christopher Glaze, Changho Shin, Frederic Sala

    Abstract: Large language models are prominently used in real-world applications, often tasked with reasoning over large volumes of documents. An exciting development in this space is models boasting extended context capabilities, with some accommodating over 2 million tokens. Such long context model capabilities remain uncertain in production systems, motivating the need to benchmark their performance on re…

    Submitted 14 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

  5. arXiv:2407.01667  [pdf, other]

    hep-ph astro-ph.CO

    ALP leptogenesis

    Authors: Martina Cataldi, Alberto Mariotti, Filippo Sala, Miguel Vanvlasselaer

    Abstract: We propose a novel realisation of leptogenesis that relies on the out-of-equilibrium decay of an axion-like particle (ALP) into right-handed Majorana neutrinos (RHNs) in the early Universe. With respect to standard thermal leptogenesis, our mechanism lowers by two orders of magnitude the RHN mass, or the tuning in the RHN mass splittings, needed to reproduce the baryon asymmetry of the Universe an…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 28 pages, 10 figures

    Report number: DESY-24-095

  6. arXiv:2406.03642  [pdf, other]

    cs.CL cs.LG

    Is Free Self-Alignment Possible?

    Authors: Dyah Adila, Changho Shin, Yijing Zhang, Frederic Sala

    Abstract: Aligning pretrained language models (LMs) is a complex and resource-intensive process, often requiring access to large amounts of ground-truth preference data and substantial compute. Are these costs necessary? That is, is it possible to align using only inherent model knowledge and without additional training? We tackle this challenge with AlignEZ, a novel approach that uses (1) self-generated pr…

    Submitted 5 June, 2024; originally announced June 2024.

  7. arXiv:2406.00894  [pdf, other]

    cs.LG cs.AI cs.CL

    Pretrained Hybrids with MAD Skills

    Authors: Nicholas Roberts, Samuel Guo, Zhiqi Gao, Satya Sai Srinath Namburi GNVV, Sonia Cromp, Chengjun Wu, Chengyu Duan, Frederic Sala

    Abstract: While Transformers underpin modern large language models (LMs), there is a growing list of alternative architectures with new capabilities, promises, and tradeoffs. This makes choosing the right LM architecture challenging. Recently-proposed $\textit{hybrid architectures}$ seek a best-of-all-worlds approach that reaps the benefits of all architectures. Hybrid design is difficult for two reasons: i…

    Submitted 2 June, 2024; originally announced June 2024.

  8. arXiv:2405.04568  [pdf, other]

    hep-ph astro-ph.CO astro-ph.HE

    Relic Neutrino Background from Cosmic-Ray Reservoirs

    Authors: Andrea Giovanni De Marchi, Alessandro Granelli, Jacopo Nava, Filippo Sala

    Abstract: We compute the flux of relic neutrino background (R$ν$B) up-scattered by ultra-high-energy (UHE) cosmic rays (CRs) in clusters that act as CR-reservoirs. The long trapping times of UHECRs make this flux larger than that of R$ν$B up-scattered by UHECRs on their way to Earth, which we also compute. We find that IceCube excludes R$ν$B weighted overdensities larger than $10^{10}$ in clusters, and that…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 5 pages + appendices, 5 figures, 1 table

  9. arXiv:2404.16188  [pdf, other]

    cs.LG cs.AI stat.ML

    Pearls from Pebbles: Improved Confidence Functions for Auto-labeling

    Authors: Harit Vishwakarma, Reid, Chen, Sui Jiet Tay, Satya Sai Srinath Namburi, Frederic Sala, Ramya Korlakai Vinayak

    Abstract: Auto-labeling is an important family of techniques that produce labeled training sets with minimum manual labeling. A prominent variant, threshold-based auto-labeling (TBAL), works by finding a threshold on a model's confidence scores above which it can accurately label unlabeled data points. However, many models are known to produce overconfident scores, leading to poor TBAL performance. While a…

    Submitted 24 April, 2024; originally announced April 2024.

  10. arXiv:2404.08461  [pdf, other]

    cs.LG cs.AI

    OTTER: Improving Zero-Shot Classification via Optimal Transport

    Authors: Changho Shin, Jitian Zhao, Sonia Cromp, Harit Vishwakarma, Frederic Sala

    Abstract: Popular zero-shot models suffer due to artifacts inherited from pretraining. A particularly detrimental artifact, caused by unbalanced web-scale pretraining data, is mismatched label distribution. Existing approaches that seek to repair the label distribution are not suitable in zero-shot settings, as they have incompatible requirements such as access to labeled downstream task data or knowledge o…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 29 pages

  11. arXiv:2403.07391  [pdf, ps, other]

    cond-mat.mtrl-sci

    Towards adiabatic-connection interpolation model with broader applicability

    Authors: Lucian A. Constantin, Szymon Śmiga, Fabio Della Sala

    Abstract: The Adiabatic Connection Integrand Interpolation (ACII) method represents a general path for calculating correlation energies in electronic systems within the Density Functional Theory. ACII functionals include both exact-exchange and the second-order correlation energy, as well as an interpolating function toward the strictly-correlated electron (SCE) regime. Several interpolating functions have…

    Submitted 12 March, 2024; originally announced March 2024.

  12. arXiv:2403.05615  [pdf, other]

    hep-ph astro-ph.CO astro-ph.HE gr-qc

    Particle shells from relativistic bubble walls

    Authors: Iason Baldes, Maximilian Dichtl, Yann Gouttenoire, Filippo Sala

    Abstract: Relativistic bubble walls from cosmological phase transitions (PT) necessarily accumulate expanding shells of particles. We systematically characterize shell properties, and identify and calculate the processes that prevent them from free streaming: phase-space saturation effects, out-of-equilibrium $2\to2$ and $3\to2$ shell-shell and shell-bath interactions, and shell interactions with bubble wal…

    Submitted 21 May, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: 73 pages, 18 figures and 8 tables including appendices and references, v2: shell-bath processes added

  13. arXiv:2401.15478  [pdf, other]

    q-bio.QM cs.LG q-bio.MN

    Product Manifold Representations for Learning on Biological Pathways

    Authors: Daniel McNeela, Frederic Sala, Anthony Gitter

    Abstract: Machine learning models that embed graphs in non-Euclidean spaces have shown substantial benefits in a variety of contexts, but their application has not been studied extensively in the biological domain, particularly with respect to biological pathway graphs. Such graphs exhibit a variety of complex network structures, presenting challenges to existing embedding approaches. Learning high-quality…

    Submitted 27 January, 2024; originally announced January 2024.

    Comments: 28 pages, 19 figures

  14. arXiv:2401.12278  [pdf, other]

    hep-ph astro-ph.CO

    Early Matter Domination at Colliders: Long Live the Glueball!

    Authors: Fady Bishara, Filippo Sala, Kai Schmidt-Hoberg

    Abstract: We prove that collider searches for long-lived particles (LLPs) can test the dynamics responsible for matter domination in the early universe. In this letter we concentrate on the specific example of glueballs from a GeV-scale confining dark sector and compute the dilution of cosmological relics induced by their decay. We then show that searches for long-lived glueballs from Higgs decays test incr…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 5 pages + refs, 1 figure

  15. arXiv:2401.12225  [pdf, other]

    cs.CV cs.LG

    Multimodal Data Curation via Object Detection and Filter Ensembles

    Authors: Tzu-Heng Huang, Changho Shin, Sui Jiet Tay, Dyah Adila, Frederic Sala

    Abstract: We propose an approach for curating multimodal data that we used for our entry in the 2023 DataComp competition filtering track. Our technique combines object detection and weak supervision-based ensembling. In the first of two steps in our approach, we employ an out-of-the-box zero-shot object detection model to extract granular information and produce a variety of filter designs. In the second s…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: Appeared in the workshop Towards the Next Generation of Computer Vision Datasets (TNGCV) at ICCV 2023

  16. arXiv:2312.09282  [pdf, other]

    hep-ph

    Baryogenesis and Leptogenesis from Supercooled Confinement

    Authors: Maximilian Dichtl, Jacopo Nava, Silvia Pascoli, Filippo Sala

    Abstract: We propose a framework of baryogenesis and leptogenesis that relies on a supercooled confining phase transition (PT) in the early universe. The baryon or lepton asymmetry is sourced by decays of hadrons of the strong dynamics after the PT, and it is enhanced compared to the non-confining case, which was the only one explored so far. This widens the energy range of the PT, where the observed baryon…

    Submitted 23 January, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: 34 pages, 9 figures, minor revision, accepted for publication in JHEP

  17. arXiv:2312.04740  [pdf, other]

    cs.LG cs.AI cs.GT

    Train 'n Trade: Foundations of Parameter Markets

    Authors: Tzu-Heng Huang, Harit Vishwakarma, Frederic Sala

    Abstract: Organizations typically train large models individually. This is costly and time-consuming, particularly for large-scale foundation models. Such vertical production is known to be suboptimal. Inspired by this economic insight, we ask whether it is possible to leverage others' expertise by trading the constituent parts in models, i.e., sets of weights, as if they were market commodities. While rece…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: accepted at NeurIPS 2023

  18. arXiv:2312.00960  [pdf]

    cs.CL cs.AI cs.LG

    The Cost of Compression: Investigating the Impact of Compression on Parametric Knowledge in Language Models

    Authors: Satya Sai Srinath Namburi, Makesh Sreedhar, Srinath Srinivasan, Frederic Sala

    Abstract: Compressing large language models (LLMs), often consisting of billions of parameters, provides faster inference, smaller memory footprints, and enables local deployment. Two standard compression techniques are pruning and quantization, with the former eliminating redundant connections in model layers and the latter representing model parameters with fewer bits. The key tradeoff is between the degr…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: Accepted to EMNLP 2023 Findings

  19. arXiv:2309.16430  [pdf, other]

    physics.chem-ph

    Adiabatic connection interaction strength interpolation method made accurate for the uniform electron gas

    Authors: Lucian A. Constantin, Subrata Jana, Szymon Śmiga, Fabio Della Sala

    Abstract: The adiabatic connection interaction strength interpolation (ISI)-like method provides a high-level expression for the correlation energy, being in principle exact in the weak-interaction limit, where it recovers the second-order Görling-Levy perturbation term, but also in the strong-interaction limit that is described by the strictly correlated electron approach. In this work, we construct the ge…

    Submitted 28 September, 2023; originally announced September 2023.

  20. arXiv:2309.04344  [pdf, other]

    cs.LG cs.AI

    Zero-Shot Robustification of Zero-Shot Models

    Authors: Dyah Adila, Changho Shin, Linrong Cai, Frederic Sala

    Abstract: Zero-shot inference is a powerful paradigm that enables the use of large pretrained models for downstream classification tasks without further training. However, these models are vulnerable to inherited biases that can impact their performance. The traditional solution is fine-tuning, but this undermines the key advantage of pretrained models, which is their ability to be used out-of-the-box. We p…

    Submitted 12 February, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

    Comments: International Conference on Learning Representations (ICLR), 2024

  21. arXiv:2307.14430  [pdf, other]

    cs.CL cs.LG

    Skill-it! A Data-Driven Skills Framework for Understanding and Training Language Models

    Authors: Mayee F. Chen, Nicholas Roberts, Kush Bhatia, Jue Wang, Ce Zhang, Frederic Sala, Christopher Ré

    Abstract: The quality of training data impacts the performance of pre-trained large language models (LMs). Given a fixed budget of tokens, we study how to best select data that leads to good downstream model performance across tasks. We develop a new framework based on a simple hypothesis: just as humans acquire interdependent skills in a deliberate order, language models also follow a natural order when le…

    Submitted 26 July, 2023; originally announced July 2023.

  22. arXiv:2307.12226  [pdf, other]

    cs.LG cs.AI stat.ML

    Geometry-Aware Adaptation for Pretrained Models

    Authors: Nicholas Roberts, Xintong Li, Dyah Adila, Sonia Cromp, Tzu-Heng Huang, Jitian Zhao, Frederic Sala

    Abstract: Machine learning models -- including prominent zero-shot models -- are often trained on datasets whose labels are only a small proportion of a larger label space. Such spaces are commonly equipped with a metric that relates the labels via distances between them. We propose a simple approach to exploit this information to adapt the trained model to reliably predict new classes -- or, in the case of…

    Submitted 27 November, 2023; v1 submitted 23 July, 2023; originally announced July 2023.

    Comments: NeurIPS 2023

  23. arXiv:2307.11031  [pdf, ps, other]

    cs.LG cs.CL

    Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification

    Authors: Neel Guha, Mayee F. Chen, Kush Bhatia, Azalia Mirhoseini, Frederic Sala, Christopher Ré

    Abstract: Recent work has shown that language models' (LMs) prompt-based learning capabilities make them well suited for automating data labeling in domains where manual annotation is expensive. The challenge is that while writing an initial prompt is cheap, improving a prompt is costly -- practitioners often require significant labeled data in order to evaluate the impact of prompt modifications. Our work…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: 38 pages, 22 figures, 8 tables

  24. arXiv:2307.02715  [pdf, other]

    physics.chem-ph

    Regularized and Opposite spin-scaled functionals from Møller-Plesset adiabatic connection -- higher accuracy at lower cost

    Authors: Kimberly J. Daas, Derk P. Kooi, Nina C. Peters, Eduardo Fabiano, Fabio Della Sala, Paola Gori-Giorgi, Stefan Vuckovic

    Abstract: Non-covalent interactions (NCIs) play a crucial role in biology, chemistry, material science, and everything in between. To improve pure quantum-chemical simulations of NCIs, we propose a methodology for constructing approximate correlation energies by combining an interpolation along the Møller-Plesset adiabatic connection (MP AC) with a regularization and spin-scaling strategy applied to MP2 correlation…

    Submitted 7 July, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

    Comments: 12 pages + 5 SI, 8 figures + 6 SI

  25. arXiv:2306.15555  [pdf, other]

    hep-ph astro-ph.CO

    Bubbletrons

    Authors: Iason Baldes, Maximilian Dichtl, Yann Gouttenoire, Filippo Sala

    Abstract: In cosmological first-order phase transitions (PT) with relativistic bubble walls, high-energy shells of particles generically form on the inner and outer sides of the walls. Shells from different bubbles can then collide with energies much larger than the PT or inflation scales, and with sizeable rates, realising a `bubbletron'. As an application, we calculate the maximal dark matter mass…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: 5 pages plus references, 5 figures

  26. arXiv:2304.00754  [pdf, ps, other]

    physics.chem-ph

    Gaussian expansion of Yukawa non-local kinetic energy functionals: application to metal clusters

    Authors: F. Sarcinella, S. Śmiga, F. Della Sala, E. Fabiano

    Abstract: The development of kinetic energy (KE) functionals is one of the current challenges in density functional theory (DFT). The Yukawa non-local KE functionals [Phys. Rev. B 103, 155127 (2021)] have been shown to describe accurately the Lindhard response of the homogeneous electron gas (HEG) directly in the real space, without any step in the reciprocal space. However, the Yukawa kernel employs an exp…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: 9 pages, 6 figures

  27. arXiv:2303.17713  [pdf, other]

    cs.LG cs.CY stat.ML

    Mitigating Source Bias for Fairer Weak Supervision

    Authors: Changho Shin, Sonia Cromp, Dyah Adila, Frederic Sala

    Abstract: Weak supervision enables efficient development of training sets by reducing the need for ground truth labels. However, the techniques that make weak supervision attractive -- such as integrating any source of signal to estimate unknown labels -- also entail the danger that the produced pseudolabels are highly biased. Surprisingly, given everyday use and the potential for increased bias, weak super…

    Submitted 29 November, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: NeurIPS 2023

  28. arXiv:2303.17154  [pdf, ps, other]

    math.AG hep-th

    Flops and Hilbert schemes of space curve singularities

    Authors: Duiliu-Emanuel Diaconescu, Mauro Porta, Francesco Sala, Arian Vosoughinia

    Abstract: Using pagoda flop transitions between smooth projective threefolds, a relation is derived between the Euler numbers of moduli spaces of stable pairs which are scheme-theoretically supported on a fixed singular space curve and Euler numbers of Flag Hilbert schemes associated to a plane curve singularity. When the space curve singularity is locally complete intersection, one obtains a relation betwe…

    Submitted 23 June, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: v2: 67 pages, some technical assumptions removed, typos fixed, main results unchanged. v1: 52 pages

    MSC Class: 14N35 (Primary); 14D23; 14E99; 14F99 (Secondary)

  29. arXiv:2303.12107  [pdf, other]

    hep-ph astro-ph.HE

    Dark Matter spikes around Sgr A* in $γ$-rays

    Authors: Shyam Balaji, Divya Sachdeva, Filippo Sala, Joseph Silk

    Abstract: We use H.E.S.S. $γ$-ray observations of Sgr A* to derive novel limits on the Dark Matter (DM) annihilation cross-section. We quantify their dependence on uncertainties i) in the DM halo profile, which we vary from peaked to cored, and ii) in the shape of the DM spike around Sgr A*, dynamically heated by the nuclear star cluster. For peaked halo profiles and depending on the heating of the spike, o…

    Submitted 13 September, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

    Comments: Published in JCAP, added discussion on energy resolution impact for DM annihilation into photons channel

  30. arXiv:2303.08533  [pdf, other]

    physics.acc-ph hep-ex hep-ph

    Towards a Muon Collider

    Authors: Carlotta Accettura, Dean Adams, Rohit Agarwal, Claudia Ahdida, Chiara Aimè, Nicola Amapane, David Amorim, Paolo Andreetto, Fabio Anulli, Robert Appleby, Artur Apresyan, Aram Apyan, Sergey Arsenyev, Pouya Asadi, Mohammed Attia Mahmoud, Aleksandr Azatov, John Back, Lorenzo Balconi, Laura Bandiera, Roger Barlow, Nazar Bartosik, Emanuela Barzi, Fabian Batsch, Matteo Bauce, J. Scott Berg, et al. (272 additional authors not shown)

    Abstract: A muon collider would enable the big jump ahead in energy reach that is needed for a fruitful exploration of fundamental interactions. The challenges of producing muon collisions at high luminosity and 10 TeV centre of mass energy are being investigated by the recently-formed International Muon Collider Collaboration. This Review summarises the status and the recent advances on muon colliders desi…

    Submitted 27 November, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 118 pages, 103 figures

  31. arXiv:2303.07527  [pdf, other]

    cs.LG cs.CV

    Domain Generalization via Nuclear Norm Regularization

    Authors: Zhenmei Shi, Yifei Ming, Ying Fan, Frederic Sala, Yingyu Liang

    Abstract: The ability to generalize to unseen domains is crucial for machine learning systems deployed in the real world, especially when we only have data from limited training domains. In this paper, we propose a simple and effective regularization method based on the nuclear norm of the learned features for domain generalization. Intuitively, the proposed regularizer mitigates the impacts of environmenta…

    Submitted 4 December, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: 23 pages

  32. arXiv:2212.10579  [pdf, other]

    hep-ph cs.LG hep-ex stat.ML

    Resonant Anomaly Detection with Multiple Reference Datasets

    Authors: Mayee F. Chen, Benjamin Nachman, Frederic Sala

    Abstract: An important class of techniques for resonant anomaly detection in high energy physics builds models that can distinguish between reference and target datasets, where only the latter has appreciable signal. Such techniques, including Classification Without Labels (CWoLa) and Simulation Assisted Likelihood-free Anomaly Detection (SALAD) rely on a single reference dataset. They cannot take advantage…

    Submitted 20 December, 2022; originally announced December 2022.

  33. arXiv:2211.13375  [pdf, other]

    cs.LG cs.AI stat.ML

    Lifting Weak Supervision To Structured Prediction

    Authors: Harit Vishwakarma, Nicholas Roberts, Frederic Sala

    Abstract: Weak supervision (WS) is a rich set of techniques that produce pseudolabels by aggregating easily obtained but potentially noisy label estimates from a variety of sources. WS is theoretically well understood for binary classification, where simple approaches enable consistent estimation of pseudolabel noise rates. Using this result, it has been shown that downstream models trained on the pseudolab…

    Submitted 23 November, 2022; originally announced November 2022.

  34. arXiv:2211.12620  [pdf, other]

    cs.LG cs.AI stat.ML

    Promises and Pitfalls of Threshold-based Auto-labeling

    Authors: Harit Vishwakarma, Heguang Lin, Frederic Sala, Ramya Korlakai Vinayak

    Abstract: Creating large-scale high-quality labeled datasets is a major bottleneck in supervised machine learning workflows. Threshold-based auto-labeling (TBAL), where validation data obtained from humans is used to find a confidence threshold above which the data is machine-labeled, reduces reliance on manual annotation. TBAL is emerging as a widely-used solution in practice. Given the long shelf-life and…

    Submitted 21 February, 2024; v1 submitted 22 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2023 (Spotlight)

    Journal ref: Thirty Seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

  35. arXiv:2210.03324  [pdf, other]

    cs.LG cs.AI stat.ML

    AutoML for Climate Change: A Call to Action

    Authors: Renbo Tu, Nicholas Roberts, Vishak Prasad, Sibasis Nayak, Paarth Jain, Frederic Sala, Ganesh Ramakrishnan, Ameet Talwalkar, Willie Neiswanger, Colin White

    Abstract: The challenge that climate change poses to humanity has spurred a rapidly developing field of artificial intelligence research focused on climate change applications. The climate change AI (CCAI) community works on a diverse, challenging set of problems which often involve physics-constrained ML or heterogeneous spatiotemporal data. It would be desirable to use automated machine learning (AutoML)…

    Submitted 7 October, 2022; originally announced October 2022.

  36. arXiv:2210.02441  [pdf, other]

    cs.CL

    Ask Me Anything: A simple strategy for prompting language models

    Authors: Simran Arora, Avanika Narayan, Mayee F. Chen, Laurel Orr, Neel Guha, Kush Bhatia, Ines Chami, Frederic Sala, Christopher Ré

    Abstract: Large language models (LLMs) transfer well to new tasks out-of-the-box simply given a natural language prompt that demonstrates how to perform the task and no additional training. Prompting is a brittle process wherein small modifications to the prompt can cause large variations in the model predictions, and therefore significant effort is dedicated towards designing a painstakingly "perfect promp…

    Submitted 19 November, 2022; v1 submitted 5 October, 2022; originally announced October 2022.

  37. arXiv:2208.14362  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels

    Authors: Nicholas Roberts, Xintong Li, Tzu-Heng Huang, Dyah Adila, Spencer Schoenberg, Cheng-Yu Liu, Lauren Pick, Haotian Ma, Aws Albarghouthi, Frederic Sala

    Abstract: Weak supervision (WS) is a powerful method to build labeled datasets for training supervised models in the face of little-to-no labeled data. It replaces hand-labeling data with aggregating multiple noisy-but-cheap label estimates expressed by labeling functions (LFs). While it has been used successfully in many domains, weak supervision's application scope is limited by the difficulty of construc…

    Submitted 24 November, 2023; v1 submitted 30 August, 2022; originally announced August 2022.

    Comments: NeurIPS 2022 Datasets and Benchmarks Track

  38. arXiv:2207.08926  [pdf, ps, other]

    math.AG math.QA math.RT

    Cohomological Hall algebras and their representations via torsion pairs

    Authors: Duiliu-Emanuel Diaconescu, Mauro Porta, Francesco Sala

    Abstract: In this paper, we provide a way of attaching to a torsion pair $(T,F)$ on the heart of a stable $\infty$-category $C$ a cohomological (K-theoretical, categorified) Hall algebra and corresponding left and right representations. More precisely, the algebra is associated to the torsion part, while the representation is associated to the torsion-free part. The left and right actions enable us to con…

    Submitted 18 July, 2022; originally announced July 2022.

    Comments: 68 pages

    MSC Class: 14A20 (Primary); 14A30; 14F08; 17B37 (Secondary)

  39. Hot and heavy dark matter from a weak scale phase transition

    Authors: Iason Baldes, Yann Gouttenoire, Filippo Sala

    Abstract: We point out that dark matter which is produced non-adiabatically in a phase transition (PT) with fast bubble walls receives a boost in velocity which leads to long free-streaming lengths. We find that this could be observed via the suppressed matter power spectrum for dark matter masses around $10^8 - 10^9$ GeV and energy scales of the PT around $10^{2} - 10^3$ GeV. The PT should take place at th…

    Submitted 5 December, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

    Comments: 9 pages plus appendices. V2: Clarifications and references added, accepted for publication in SciPost Physics

    Report number: ULB-TH/22-12

    Journal ref: SciPost Phys. 14, 033 (2023)

  40. arXiv:2203.13270  [pdf, other]

    stat.ML cs.LG

    Shoring Up the Foundations: Fusing Model Embeddings and Weak Supervision

    Authors: Mayee F. Chen, Daniel Y. Fu, Dyah Adila, Michael Zhang, Frederic Sala, Kayvon Fatahalian, Christopher Ré

    Abstract: Foundation models offer an exciting new paradigm for constructing models with out-of-the-box embeddings and a few labeled examples. However, it is not clear how to best apply foundation models without labeled data. A potential approach is to fuse foundation models with weak supervision frameworks, which use weak label sources -- pre-trained models, heuristics, crowd-workers -- to construct pseudol…

    Submitted 1 August, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

    Comments: UAI 2022 Camera Ready

  41. arXiv:2203.12023  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Generative Modeling Helps Weak Supervision (and Vice Versa)

    Authors: Benedikt Boecking, Nicholas Roberts, Willie Neiswanger, Stefano Ermon, Frederic Sala, Artur Dubrawski

    Abstract: Many promising applications of supervised machine learning face hurdles in the acquisition of labeled data in sufficient quantity and quality, creating an expensive bottleneck. To overcome such limitations, techniques that do not depend on ground truth labels have been studied, including weak supervision and generative modeling. While these techniques would seem to be usable in concert, improving…

    Submitted 11 March, 2023; v1 submitted 22 March, 2022; originally announced March 2022.

    Comments: Published as a conference paper at ICLR 2023

    ACM Class: I.2.0; I.4.m

  42. arXiv:2203.07261  [pdf, other]

    hep-ph hep-ex

    The physics case of a 3 TeV muon collider stage

    Authors: Jorge De Blas, Dario Buttazzo, Rodolfo Capdevilla, David Curtin, Roberto Franceschini, Fabio Maltoni, Patrick Meade, Federico Meloni, Shufang Su, Eleni Vryonidou, Andrea Wulzer, Chiara Aimè, Aram Apyan, Pouya Asadi, Mohammed Attia Mahmoud, Aleksandr Azatov, Nazar Bartosik, Alessandro Bertolin, Salvatore Bottaro, Laura Buonincontri, Massimo Casarsa, Luca Castelli, Maria Gabriella Catanesi, Francesco Giovanni Celiberto, Alessandro Cerri , et al. (109 additional authors not shown)

    Abstract: In the path towards a muon collider with center of mass energy of 10 TeV or more, a stage at 3 TeV emerges as an appealing option. Reviewing the physics potential of such muon collider is the main purpose of this document. In order to outline the progression of the physics performances across the stages, a few sensitivity projections for higher energy are also presented. There are many opportuniti…

    Submitted 27 May, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: 73 pages, 28 figures; Contribution to Snowmass 2021

  43. arXiv:2203.07256  [pdf, other]

    hep-ph hep-ex

    Muon Collider Physics Summary

    Authors: Chiara Aimè, Aram Apyan, Mohammed Attia Mahmoud, Nazar Bartosik, Alessandro Bertolin, Maurizio Bonesini, Salvatore Bottaro, Dario Buttazzo, Rodolfo Capdevilla, Massimo Casarsa, Luca Castelli, Maria Gabriella Catanesi, Francesco Giovanni Celiberto, Alessandro Cerri, Cari Cesarotti, Grigorios Chachamis, Siyu Chen, Yang-Ting Chien, Mauro Chiesa, Gianmaria Collazuol, Marco Costa, Nathaniel Craig, David Curtin, Sridhara Dasu, Jorge De Blas , et al. (100 additional authors not shown)

    Abstract: The perspective of designing muon colliders with high energy and luminosity, which is being investigated by the International Muon Collider Collaboration, has triggered a growing interest in their physics reach. We present a concise summary of the muon colliders' potential to explore new physics, leveraging the unique possibility of combining high available energy with very precise measurements.

    Submitted 27 May, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: 21 pages, 7 figures; Contribution to Snowmass 2021

  44. Search for secluded dark matter towards the Galactic Centre with the ANTARES neutrino telescope

    Authors: A. Albert, S. Alves, M. Andre, M. Anghinolfi, G. Anton, M. Ardid, S. Ardid, J. -J. Aubert, J. Aublin, B. Baret, S. Basa, B. Belhorma, M. Bendahman, F. Benfenati, V. Bertin, S. Biagi, M. Bissinger, J. Boumaaza, M. Bouta, M. C. Bouwhuis, H. Branzas, R. Bruijn, J. Brunner, J. Busto, B. Caiffi , et al. (124 additional authors not shown)

    Abstract: Searches for dark matter (DM) have not provided any solid evidence for the existence of weakly interacting massive particles in the GeV-TeV mass range. Coincidentally, the scale of new physics is being pushed by collider searches well beyond the TeV domain. This situation strongly motivates the exploration of DM masses much larger than a TeV. Secluded scenarios contain a natural way around the uni…

    Submitted 11 March, 2022; originally announced March 2022.

  45. arXiv:2202.11531  [pdf, ps, other]

    physics.chem-ph cond-mat.other

    Self-Consistent Implementation of Kohn-Sham Adiabatic Connection Models with Improved Treatment of the Strong-Interaction Limit

    Authors: S. Śmiga, F. Della Sala, P. Gori-Giorgi, E. Fabiano

    Abstract: Adiabatic connection models (ACMs), which interpolate between the limits of weak and strong interaction, are powerful tools to build accurate exchange-correlation functionals. If the exact weak-interaction expansion from second-order perturbation theory is included, a self-consistent implementation of these functionals is challenging and still absent in the literature. In this work we fill this ga…

    Submitted 7 September, 2022; v1 submitted 23 February, 2022; originally announced February 2022.

    Comments: 40 pages, 6 figures

  46. arXiv:2112.07686  [pdf, other]

    hep-ph astro-ph.CO gr-qc

    Friction pressure on relativistic bubble walls

    Authors: Yann Gouttenoire, Ryusuke Jinno, Filippo Sala

    Abstract: During a cosmological first-order phase transition, particles of the plasma crossing the bubble walls can radiate a gauge boson. The resulting pressure cannot be computed perturbatively for large coupling constant and/or large supercooling. We resum the real and virtual emissions at all leading-log orders, both analytically and numerically using a Monte-Carlo simulation. We find that radiated boso…

    Submitted 14 July, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: 26 pages, 8 figures, plus appendices and references. v2: additional references added, matches JHEP publication

  47. arXiv:2112.03865  [pdf, other]

    cs.LG cs.AI

    Universalizing Weak Supervision

    Authors: Changho Shin, Winfred Li, Harit Vishwakarma, Nicholas Roberts, Frederic Sala

    Abstract: Weak supervision (WS) frameworks are a popular way to bypass hand-labeling large datasets for training data-hungry models. These approaches synthesize multiple noisy but cheaply-acquired estimates of labels into a set of high-quality pseudolabels for downstream training. However, the synthesis technique is specific to a particular kind of label, such as binary labels or sequences, and each new lab…

    Submitted 29 November, 2023; v1 submitted 7 December, 2021; originally announced December 2021.

    Comments: ICLR 2022

  48. arXiv:2111.08036  [pdf, ps, other]

    math.AG

    On the Chow ring of the classifying stack of algebraic tori

    Authors: Francesco Sala

    Abstract: We investigate the structure of the Chow ring of the classifying stacks $BT$ of algebraic tori, as it has been defined by B. Totaro. Some previous work of N. Karpenko, A. Merkurjev, S. Blinstein and F. Scavia has shed some light on the structure of such rings. In particular Karpenko showed the absence of torsion classes in the case of permutation tori, while Merkurjev and Blinstein described in a…

    Submitted 11 November, 2021; originally announced November 2021.

  49. arXiv:2111.00249  [pdf, ps, other]

    math.RT math.AG math.QA

    Shuffle algebras for quivers as quantum groups

    Authors: Andrei Neguţ, Francesco Sala, Olivier Schiffmann

    Abstract: We define a quantum loop group $\mathbf{U}^+_Q$ associated to an arbitrary quiver $Q=(I,E)$ and maximal set of deformation parameters, with generators indexed by $I \times \mathbb{Z}$ and some explicit quadratic and cubic relations. We prove that $\mathbf{U}^+_Q$ is isomorphic to the (generic, small) shuffle algebra associated to the quiver $Q$ and hence, by [Neg21a], to the localized K-theoretic…

    Submitted 8 May, 2023; v1 submitted 30 October, 2021; originally announced November 2021.

    Comments: Added Section 5 (concerning special values of the parameters) and Section 7 (on the Hall algebra of an arbitrary curve)

  50. arXiv:2110.13926  [pdf, other]

    hep-ph astro-ph.CO

    Supercool Composite Dark Matter Beyond 100 TeV

    Authors: Iason Baldes, Yann Gouttenoire, Filippo Sala, Géraldine Servant

    Abstract: Dark Matter could be a composite state of a confining sector with an approximate scale symmetry. We consider the case where the associated pseudo-Goldstone boson, the dilaton, mediates its interactions with the Standard Model. When the confining phase transition in the early universe is supercooled, its dynamics allows for Dark Matter masses up to $10^6$ TeV. We derive the precise parameter space…

    Submitted 13 July, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

    Comments: 35 pages plus appendices and references, accepted for publication in JHEP

    Report number: ULB-TH/21-17; DESY 21-172

    Journal ref: JHEP 07 (2022) 084