Showing 1–4 of 4 results for author: Collins, J H

  1. arXiv:2305.03761  [pdf, other]

    astro-ph.GA cs.LG hep-ph physics.data-an

    Weakly-Supervised Anomaly Detection in the Milky Way

    Authors: Mariel Pettee, Sowmya Thanvantri, Benjamin Nachman, David Shih, Matthew R. Buckley, Jack H. Collins

    Abstract: Large-scale astrophysics datasets present an opportunity for new machine learning techniques to identify regions of interest that might otherwise be overlooked by traditional searches. To this end, we use Classification Without Labels (CWoLa), a weakly-supervised anomaly detection method, to identify cold stellar streams within the more than one billion Milky Way stars observed by the Gaia satelli…

    Submitted 5 May, 2023; originally announced May 2023.

  2. arXiv:2210.11489  [pdf, other]

    hep-ph cs.LG hep-ex physics.data-an

    Machine-Learning Compression for Particle Physics Discoveries

    Authors: Jack H. Collins, Yifeng Huang, Simon Knapen, Benjamin Nachman, Daniel Whiteson

    Abstract: In collider-based particle and nuclear physics experiments, data are produced at such extreme rates that only a subset can be recorded for later analysis. Typically, algorithms select individual collision events for preservation and store the complete experimental response. A relatively new alternative strategy is to additionally save a partial record for a larger subset of events, allowing for la…

    Submitted 18 December, 2022; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: 9 pages, 3 figures

    Report number: SLAC-PUB-17704

  3. arXiv:2209.04732  [pdf]

    cs.DB cs.AI

    Ontologizing Health Systems Data at Scale: Making Translational Discovery a Reality

    Authors: Tiffany J. Callahan, Adrianne L. Stefanski, Jordan M. Wyrwa, Chenjie Zeng, Anna Ostropolets, Juan M. Banda, William A. Baumgartner Jr., Richard D. Boyce, Elena Casiraghi, Ben D. Coleman, Janine H. Collins, Sara J. Deakyne-Davies, James A. Feinstein, Melissa A. Haendel, Asiyah Y. Lin, Blake Martin, Nicolas A. Matentzoglu, Daniella Meeker, Justin Reese, Jessica Sinclair, Sanya B. Taneja, Katy E. Trinkley, Nicole A. Vasilevsky, Andrew Williams, Xingman A. Zhang, et al. (7 additional authors not shown)

    Abstract: Background: Common data models solve many challenges of standardizing electronic health record (EHR) data, but are unable to semantically integrate all the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide computable representations of biological knowledge and enable the integration of heterogeneous data. However, mapping EHR data to OB…

    Submitted 30 January, 2023; v1 submitted 10 September, 2022; originally announced September 2022.

    Comments: Supplementary Material is included at the end of the manuscript

    ACM Class: J.3

  4. arXiv:2109.10919  [pdf, other]

    hep-ph cs.LG hep-ex

    An Exploration of Learnt Representations of W Jets

    Authors: Jack H. Collins

    Abstract: I present a Variational Autoencoder (VAE) trained on collider physics data (specifically boosted $W$ jets), with reconstruction error given by an approximation to the Earth Mover's Distance (EMD) between input and output jets. This VAE learns a concrete representation of the data manifold, with semantically meaningful and interpretable latent space directions which are hierarchically organized in t…

    Submitted 18 April, 2022; v1 submitted 22 September, 2021; originally announced September 2021.

    Comments: Published version, to appear in ICLR workshop Deep Generative Models for Highly Structured Data. Additional appendices

    Report number: SLAC-PUB-17622