Showing 1–15 of 15 results for author: Bhimji, W

  1. arXiv:2306.00258  [pdf, other]

    cs.LG math.NA

    Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior

    Authors: Shashank Subramanian, Peter Harrington, Kurt Keutzer, Wahid Bhimji, Dmitriy Morozov, Michael Mahoney, Amir Gholami

    Abstract: Pre-trained machine learning (ML) models have shown great performance for a wide range of applications, in particular in natural language processing (NLP) and computer vision (CV). Here, we study how pre-training could be used for scientific machine learning (SciML) applications, specifically in the context of transfer learning. We study the transfer behavior of these models as (i) the pre-trained…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: 16 pages, 11 figures

    Journal ref: NeurIPS 2023

  2. arXiv:2209.08868  [pdf, other]

    physics.comp-ph cs.DC hep-ex hep-lat hep-th

    Snowmass 2021 Computational Frontier CompF4 Topical Group Report: Storage and Processing Resource Access

    Authors: W. Bhimji, D. Carder, E. Dart, J. Duarte, I. Fisk, R. Gardner, C. Guok, B. Jayatilaka, T. Lehman, M. Lin, C. Maltzahn, S. McKee, M. S. Neubauer, O. Rind, O. Shadura, N. V. Tran, P. van Gemmeren, G. Watts, B. A. Weaver, F. Würthwein

    Abstract: Computing plays a significant role in all areas of high energy physics. The Snowmass 2021 CompF4 topical group's scope is facilities R&D, where we consider "facilities" as the computing hardware and software infrastructure inside the data centers plus the networking between data centers, irrespective of who owns them, and what policies are applied for using them. In other words, it includes commer…

    Submitted 29 September, 2022; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: Snowmass 2021 Computational Frontier CompF4 topical group report. v2: Expanded introduction. Updated author list. 52 pages, 6 figures

  3. arXiv:2205.04601  [pdf, other]

    cs.LG nlin.CD physics.ao-ph physics.flu-dyn physics.geo-ph

    Long-term stability and generalization of observationally-constrained stochastic data-driven models for geophysical turbulence

    Authors: Ashesh Chattopadhyay, Jaideep Pathak, Ebrahim Nabizadeh, Wahid Bhimji, Pedram Hassanzadeh

    Abstract: Recent years have seen a surge in interest in building deep learning-based fully data-driven models for weather prediction. Such deep learning models if trained on observations can mitigate certain biases in current state-of-the-art weather models, some of which stem from inaccurate representation of subgrid-scale processes. However, these data-driven models, being over-parameterized, require a lo…

    Submitted 9 May, 2022; originally announced May 2022.

  4. arXiv:2105.12880  [pdf, other]

    cs.DC cs.PF

    The Petascale DTN Project: High Performance Data Transfer for HPC Facilities

    Authors: Eli Dart, William Allcock, Wahid Bhimji, Tim Boerner, Ravinderjeet Cheema, Andrew Cherry, Brent Draney, Salman Habib, Damian Hazen, Jason Hill, Matt Kollross, Suzanne Parete-Koon, Daniel Pelfrey, Adrian Pope, Jeff Porter, David Wheeler

    Abstract: The movement of large-scale (tens of Terabytes and larger) data sets between high performance computing (HPC) facilities is an important and increasingly critical capability. A growing number of scientific collaborations rely on HPC facilities for tasks which either require large-scale data sets as input or produce large-scale data sets as output. In order to enable the transfer of these data sets…

    Submitted 8 September, 2021; v1 submitted 26 May, 2021; originally announced May 2021.

  5. arXiv:1907.03382  [pdf, other]

    cs.LG cs.PF stat.ML

    Etalumis: Bringing Probabilistic Programming to Scientific Simulators at Scale

    Authors: Atılım Güneş Baydin, Lei Shao, Wahid Bhimji, Lukas Heinrich, Lawrence Meadows, Jialin Liu, Andreas Munk, Saeid Naderiparizi, Bradley Gram-Hansen, Gilles Louppe, Mingfei Ma, Xiaohui Zhao, Philip Torr, Victor Lee, Kyle Cranmer, Prabhat, Frank Wood

    Abstract: Probabilistic programming languages (PPLs) are receiving widespread attention for performing Bayesian inference in complex generative models. However, applications to science remain limited because of the impracticability of rewriting complex scientific simulators in a PPL, the computational cost of inference, and the lack of scalable implementations. To address these, we present a novel PPL frame…

    Submitted 27 August, 2019; v1 submitted 7 July, 2019; originally announced July 2019.

    Comments: 14 pages, 8 figures

    MSC Class: 68T37; 68T05; 62P35 ACM Class: G.3; I.2.6; J.2

    Journal ref: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC19), November 17–22, 2019

  6. arXiv:1809.06166  [pdf, other]

    cs.LG astro-ph.IM stat.ML

    Graph Neural Networks for IceCube Signal Classification

    Authors: Nicholas Choma, Federico Monti, Lisa Gerhardt, Tomasz Palczewski, Zahra Ronaghi, Prabhat, Wahid Bhimji, Michael M. Bronstein, Spencer R. Klein, Joan Bruna

    Abstract: Tasks involving the analysis of geometric (graph- and manifold-structured) data have recently gained prominence in the machine learning community, giving birth to a rapidly developing field of geometric deep learning. In this work, we leverage graph neural networks to improve signal detection in the IceCube neutrino observatory. The IceCube detector array is modeled as a graph, where vertices are…

    Submitted 17 September, 2018; originally announced September 2018.

  7. arXiv:1807.07706  [pdf, other]

    cs.LG hep-ph physics.data-an stat.ML

    Efficient Probabilistic Inference in the Quest for Physics Beyond the Standard Model

    Authors: Atılım Güneş Baydin, Lukas Heinrich, Wahid Bhimji, Lei Shao, Saeid Naderiparizi, Andreas Munk, Jialin Liu, Bradley Gram-Hansen, Gilles Louppe, Lawrence Meadows, Philip Torr, Victor Lee, Prabhat, Kyle Cranmer, Frank Wood

    Abstract: We present a novel probabilistic programming framework that couples directly to existing large-scale simulators through a cross-platform probabilistic execution protocol, which allows general-purpose inference engines to record and control random number draws within simulators in a language-agnostic way. The execution of existing simulators as probabilistic programs enables highly interpretable po…

    Submitted 17 February, 2020; v1 submitted 20 July, 2018; originally announced July 2018.

    Comments: 20 pages, 9 figures

    MSC Class: 68T37; 68T05; 62P35 ACM Class: G.3; I.2.6; J.2

    Journal ref: In Advances in Neural Information Processing Systems 33 (NeurIPS), Vancouver, Canada, 2019

  8. arXiv:1807.02876  [pdf, other]

    physics.comp-ph cs.LG hep-ex stat.ML

    Machine Learning in High Energy Physics Community White Paper

    Authors: Kim Albertsson, Piero Altoe, Dustin Anderson, John Anderson, Michael Andrews, Juan Pedro Araque Espinosa, Adam Aurisano, Laurent Basara, Adrian Bevan, Wahid Bhimji, Daniele Bonacorsi, Bjorn Burkle, Paolo Calafiura, Mario Campanelli, Louis Capps, Federico Carminati, Stefano Carrazza, Yi-fan Chen, Taylor Childers, Yann Coadou, Elias Coniavitis, Kyle Cranmer, Claire David, Douglas Davis, Andrea De Simone, et al. (103 additional authors not shown)

    Abstract: Machine learning has been applied to several problems in particle physics research, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in particle and event identification and reconstruction in the 2010s. In this document we discuss promising future research and development areas for machine learning in particle physics. We d…

    Submitted 16 May, 2019; v1 submitted 8 July, 2018; originally announced July 2018.

    Comments: Editors: Sergei Gleyzer, Paul Seyfert and Steven Schramm

  9. arXiv:1712.07901  [pdf, other]

    cs.AI physics.data-an

    Improvements to Inference Compilation for Probabilistic Programming in Large-Scale Scientific Simulators

    Authors: Mario Lezcano Casado, Atilim Gunes Baydin, David Martinez Rubio, Tuan Anh Le, Frank Wood, Lukas Heinrich, Gilles Louppe, Kyle Cranmer, Karen Ng, Wahid Bhimji, Prabhat

    Abstract: We consider the problem of Bayesian inference in the family of probabilistic models implicitly defined by stochastic generative models of data. In scientific fields ranging from population biology to cosmology, low-level mechanistic components are composed to create complex generative models. These models lead to intractable likelihoods and are typically non-differentiable, which poses challenges…

    Submitted 21 December, 2017; originally announced December 2017.

    Comments: 7 pages, 2 figures

    MSC Class: 68T37; 68T05; 62P35 ACM Class: G.3; I.2.6; J.2

  10. arXiv:1711.03573  [pdf, other]

    hep-ex cs.DC cs.LG physics.data-an

    Deep Neural Networks for Physics Analysis on low-level whole-detector data at the LHC

    Authors: Wahid Bhimji, Steven Andrew Farrell, Thorsten Kurth, Michela Paganini, Prabhat, Evan Racah

    Abstract: There has been considerable recent activity applying deep convolutional neural nets (CNNs) to data from particle physics experiments. Current approaches on ATLAS/CMS have largely focussed on a subset of the calorimeter, and for identifying objects or particular particle types. We explore approaches that use the entire calorimeter, combined with track information, for directly conducting physics an…

    Submitted 29 November, 2017; v1 submitted 9 November, 2017; originally announced November 2017.

    Comments: Presented at ACAT 2017 Conference, Submitted to J. Phys. Conf. Ser

  11. arXiv:1708.05256  [pdf, other]

    cs.PF cs.CV cs.LG

    Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data

    Authors: Thorsten Kurth, Jian Zhang, Nadathur Satish, Ioannis Mitliagkas, Evan Racah, Mostofa Ali Patwary, Tareq Malas, Narayanan Sundaram, Wahid Bhimji, Mikhail Smorkalov, Jack Deslippe, Mikhail Shiryaev, Srinivas Sridharan, Prabhat, Pradeep Dubey

    Abstract: This paper presents the first, 15-PetaFLOP Deep Learning system for solving scientific pattern classification problems on contemporary HPC architectures. We develop supervised convolutional architectures for discriminating signals in high-energy physics data as well as semi-supervised architectures for localizing and classifying extreme weather in climate data. Our Intelcaffe-based implementation…

    Submitted 17 August, 2017; originally announced August 2017.

    Comments: 12 pages, 9 figures

  12. CosmoGAN: creating high-fidelity weak lensing convergence maps using Generative Adversarial Networks

    Authors: Mustafa Mustafa, Deborah Bard, Wahid Bhimji, Zarija Lukić, Rami Al-Rfou, Jan M. Kratochvil

    Abstract: Inferring model parameters from experimental data is a grand challenge in many sciences, including cosmology. This often relies critically on high fidelity numerical simulations, which are prohibitively computationally expensive. The application of deep learning techniques to generative modeling is renewing interest in using high dimensional density estimators as computationally inexpensive emulat…

    Submitted 22 May, 2019; v1 submitted 7 June, 2017; originally announced June 2017.

    Comments: 11 pages, 8 figures

    Journal ref: Computational Astrophysics and Cosmology: Simulations, Data Analysis and Algorithms 2019 6:1

  13. PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures

    Authors: Md. Mostofa Ali Patwary, Nadathur Rajagopalan Satish, Narayanan Sundaram, Jialin Liu, Peter Sadowski, Evan Racah, Suren Byna, Craig Tull, Wahid Bhimji, Prabhat, Pradeep Dubey

    Abstract: Computing $k$-Nearest Neighbors (KNN) is one of the core kernels used in many machine learning, data mining and scientific computing applications. Although kd-tree based $O(\log n)$ algorithms have been proposed for computing KNN, due to its inherent sequentiality, linear algorithms are being used in practice. This limits the applicability of such methods to millions of data points, with limited s…

    Submitted 27 July, 2016; originally announced July 2016.

    Comments: 11 pages in PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures, Md. Mostofa Ali Patwary et al., IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2016

  14. arXiv:1601.07621  [pdf, other]

    stat.ML cs.LG physics.data-an

    Revealing Fundamental Physics from the Daya Bay Neutrino Experiment using Deep Neural Networks

    Authors: Evan Racah, Seyoon Ko, Peter Sadowski, Wahid Bhimji, Craig Tull, Sang-Yun Oh, Pierre Baldi, Prabhat

    Abstract: Experiments in particle physics produce enormous quantities of data that must be analyzed and interpreted by teams of physicists. This analysis is often exploratory, where scientists are unable to enumerate the possible types of signal prior to performing the experiment. Thus, tools for summarizing, clustering, visualizing and classifying high-dimensional data are essential. In this work, we show…

    Submitted 6 December, 2016; v1 submitted 27 January, 2016; originally announced January 2016.

  15. Establishing Applicability of SSDs to LHC Tier-2 Hardware Configuration

    Authors: Samuel C Skipsey, Wahid Bhimji, Mike Kenyon

    Abstract: Solid State Disk technologies are increasingly replacing high-speed hard disks as the storage technology in high-random-I/O environments. There are several potentially I/O bound services within the typical LHC Tier-2 - in the back-end, with the trend towards many-core architectures continuing, worker nodes running many single-threaded jobs and storage nodes delivering many simultaneous files can b…

    Submitted 15 February, 2011; originally announced February 2011.

    Comments: 6 pages, 1 figure, 4 tables. Conference proceedings for CHEP2010