Search | arXiv e-print repository

doi 10.1051/epjconf/202429504003

Adoption of a token-based authentication model for the CMS Submission Infrastructure

Authors: Antonio Perez-Calero Yzquierdo, Marco Mascheroni, Edita Kizinevic, Farrukh Aftab Khan, Hyunwoo Kim, Maria Acosta Flechas, Nikos Tsipinakis, Saqib Haleem, Frank Wurthwein

Abstract: The CMS Submission Infrastructure (SI) is the main computing resource provisioning system for CMS workloads. A number of HTCondor pools are employed to manage this infrastructure, which aggregates geographically distributed resources from the WLCG and other providers. Historically, the model of authentication among the diverse components of this infrastructure has relied on the Grid Security Infra… ▽ More The CMS Submission Infrastructure (SI) is the main computing resource provisioning system for CMS workloads. A number of HTCondor pools are employed to manage this infrastructure, which aggregates geographically distributed resources from the WLCG and other providers. Historically, the model of authentication among the diverse components of this infrastructure has relied on the Grid Security Infrastructure (GSI), based on identities and X509 certificates. In contrast, commonly used modern authentication standards are based on capabilities and tokens. The WLCG has identified this trend and aims at a transparent replacement of GSI for all its workload management, data transfer and storage access operations, to be completed during the current LHC Run 3. As part of this effort, and within the context of CMS computing, the Submission Infrastructure group is in the process of phasing out the GSI part of its authentication layers, in favor of IDTokens and Scitokens. The use of tokens is already well integrated into the HTCondor Software Suite, which has allowed us to fully migrate the authentication between internal components of SI. Additionally, recent versions of the HTCondor-CE support tokens as well, enabling CMS resource requests to Grid sites employing this CE technology to be granted by means of token exchange. After a rollout campaign to sites, successfully completed by the third quarter of 2022, the totality of HTCondor CEs in use by CMS are already receiving Scitoken-based pilot jobs. On the ARC CE side, a parallel campaign was launched to foster the adoption of the REST interface at CMS sites (required to enable token-based job submission via HTCondor-G), which is nearing completion as well. In this contribution, the newly adopted authentication model will be described. We will then report on the migration status and final steps towards complete GSI phase out in the CMS SI. △ Less

Submitted 23 May, 2024; originally announced May 2024.

Comments: 26TH INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY & NUCLEAR PHYSICS - 2023

arXiv:2405.14639 [pdf, other]

doi 10.1051/epjconf/202429503036

Repurposing of the Run 2 CMS High Level Trigger Infrastructure as a Cloud Resource for Offline Computing

Authors: Marco Mascheroni, Antonio Perez-Calero Yzquierdo, Edita Kizinevic, Farrukh Aftab Khan, Hyunwoo Kim, Maria Acosta Flechas, Nikos Tsipinakis, Saqib Haleem, Damiele Spiga, Christoph Wissing, Frank Wurthwein

Abstract: The former CMS Run 2 High Level Trigger (HLT) farm is one of the largest contributors to CMS compute resources, providing about 25k job slots for offline computing. This CPU farm was initially employed as an opportunistic resource, exploited during inter-fill periods, in the LHC Run 2. Since then, it has become a nearly transparent extension of the CMS capacity at CERN, being located on-site at th… ▽ More The former CMS Run 2 High Level Trigger (HLT) farm is one of the largest contributors to CMS compute resources, providing about 25k job slots for offline computing. This CPU farm was initially employed as an opportunistic resource, exploited during inter-fill periods, in the LHC Run 2. Since then, it has become a nearly transparent extension of the CMS capacity at CERN, being located on-site at the LHC interaction point 5 (P5), where the CMS detector is installed. This resource has been configured to support the execution of critical CMS tasks, such as prompt detector data reconstruction. It can therefore be used in combination with the dedicated Tier 0 capacity at CERN, in order to process and absorb peaks in the stream of data coming from the CMS detector. The initial configuration for this resource, based on statically configured VMs, provided the required level of functionality. However, regular operations of this cluster revealed certain limitations compared to the resource provisioning and use model employed in the case of WLCG sites. A new configuration, based on a vacuum-like model, has been implemented for this resource in order to solve the detected shortcomings. This paper reports about this redeployment work on the permanent cloud for an enhanced support to CMS offline computing, comparing the former and new models' respective functionalities, along with the commissioning effort for the new setup. △ Less

Submitted 23 May, 2024; originally announced May 2024.

Comments: 26TH INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY & NUCLEAR PHYSICS - 2023

arXiv:2402.05244 [pdf, ps, other]

CRIU -- Checkpoint Restore in Userspace for computational simulations and scientific applications

Authors: Fabio Andrijauskas, Igor Sfiligoi, Diego Davila, Aashay Arora, Jonathan Guiang, Brian Bockelman, Greg Thain, Frank Wurthwein

Abstract: Creating new materials, discovering new drugs, and simulating systems are essential processes for research and innovation and require substantial computational power. While many applications can be split into many smaller independent tasks, some cannot and may take hours or weeks to run to completion. To better manage those longer-running jobs, it would be desirable to stop them at any arbitrary p… ▽ More Creating new materials, discovering new drugs, and simulating systems are essential processes for research and innovation and require substantial computational power. While many applications can be split into many smaller independent tasks, some cannot and may take hours or weeks to run to completion. To better manage those longer-running jobs, it would be desirable to stop them at any arbitrary point in time and later continue their computation on another compute resource; this is usually referred to as checkpointing. While some applications can manage checkpointing programmatically, it would be preferable if the batch scheduling system could do that independently. This paper evaluates the feasibility of using CRIU (Checkpoint Restore in Userspace), an open-source tool for the GNU/Linux environments, emphasizing the OSG's OSPool HTCondor setup. CRIU allows checkpointing the process state into a disk image and can deal with both open files and established network connections seamlessly. Furthermore, it can checkpoint traditional Linux processes and containerized workloads. The functionality seems adequate for many scenarios supported in the OSPool. However, some limitations prevent it from being usable in all circumstances. △ Less

Submitted 7 February, 2024; originally announced February 2024.

Comments: 26TH INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY & NUCLEAR PHYSICS - 2023

arXiv:2312.12589 [pdf, other]

400Gbps benchmark of XRootD HTTP-TPC

Authors: Aashay Arora, Jonathan Guiang, Diego Davila, Frank Würthwein, Justas Balcas, Harvey Newman

Abstract: Due to the increased demand of network traffic expected during the HL-LHC era, the T2 sites in the USA will be required to have 400Gbps of available bandwidth to their storage solution. With the above in mind we are pursuing a scale test of XRootD software when used to perform Third Party Copy transfers using the HTTP protocol. Our main objective is to understand the possible limitations in the so… ▽ More Due to the increased demand of network traffic expected during the HL-LHC era, the T2 sites in the USA will be required to have 400Gbps of available bandwidth to their storage solution. With the above in mind we are pursuing a scale test of XRootD software when used to perform Third Party Copy transfers using the HTTP protocol. Our main objective is to understand the possible limitations in the software stack to achieve the target transfer rate; to that end we have set up a testbed of multiple XRootD servers in both UCSD and Caltech which are connected through a dedicated link capable of 400 Gbps end-to-end. Building upon our experience deploying containerized XRootD servers, we use Kubernetes to easily deploy and test different configurations of our testbed. In this work, we will present our experience doing these tests and the lessons learned. △ Less

Submitted 19 December, 2023; originally announced December 2023.

Comments: 8 pages, 4 figures, submitted to CHEP'23

arXiv:2308.11733 [pdf]

Demand-driven provisioning of Kubernetes-like resources in OSG

Authors: Igor Sfiligoi, Frank Würthwein, Jeff Dost, Brian Lin, David Schultz

Abstract: The OSG-operated Open Science Pool is an HTCondor-based virtual cluster that aggregates resources from compute clusters provided by several organizations. Most of the resources are not owned by OSG, so demand-based dynamic provisioning is important for maximizing usage without incurring excessive waste. OSG has long relied on GlideinWMS for most of its resource provisioning needs but is limited to… ▽ More The OSG-operated Open Science Pool is an HTCondor-based virtual cluster that aggregates resources from compute clusters provided by several organizations. Most of the resources are not owned by OSG, so demand-based dynamic provisioning is important for maximizing usage without incurring excessive waste. OSG has long relied on GlideinWMS for most of its resource provisioning needs but is limited to resources that provide a Grid-compliant Compute Entrypoint. To work around this limitation, the OSG Software Team has developed a glidein container that resource providers could use to directly contribute to the OSPool. The problem of that approach is that it is not demand-driven, relegating it to backfill scenarios only. To address this limitation, a demand-driven direct provisioner of Kubernetes resources has been developed and successfully used on the NRP. The setup still relies on the OSG-maintained backfill container image but automates the provisioning matchmaking and successive requests. That provisioner has also been extended to support Lancium, a green computing cloud provider with a Kubernetes-like proprietary interface. The provisioner logic has been intentionally kept very simple, making this extension a low-cost project. Both NRP and Lancium resources have been provisioned exclusively using this mechanism for many months. △ Less

Submitted 22 August, 2023; originally announced August 2023.

Comments: 6 pages, 3 figures, Submitted to Proceedings of CHEP23

arXiv:2308.07999 [pdf]

IceCube experience using XRootD-based Origins with GPU workflows in PNRP

Authors: David Schultz, Igor Sfiligoi, Benedikt Riedel, Fabio Andrijauskas, Derek Weitzel, Frank Würthwein

Abstract: The IceCube Neutrino Observatory is a cubic kilometer neutrino telescope located at the geographic South Pole. Understanding detector systematic effects is a continuous process. This requires the Monte Carlo simulation to be updated periodically to quantify potential changes and improvements in science results with more detailed modeling of the systematic effects. IceCube's largest systematic effe… ▽ More The IceCube Neutrino Observatory is a cubic kilometer neutrino telescope located at the geographic South Pole. Understanding detector systematic effects is a continuous process. This requires the Monte Carlo simulation to be updated periodically to quantify potential changes and improvements in science results with more detailed modeling of the systematic effects. IceCube's largest systematic effect comes from the optical properties of the ice the detector is embedded in. Over the last few years there have been considerable improvements in the understanding of the ice, which require a significant processing campaign to update the simulation. IceCube normally stores the results in a central storage system at the University of Wisconsin-Madison, but it ran out of disk space in 2022. The Prototype National Research Platform (PNRP) project thus offered to provide both GPU compute and storage capacity to IceCube in support of this activity. The storage access was provided via XRootD-based OSDF Origins, a first for IceCube computing. We report on the overall experience using PNRP resources, with both successes and pain points. △ Less

Submitted 15 August, 2023; originally announced August 2023.

Comments: 7 pages, 3 figures, 1 table, To be published in Proceedings of CHEP23

arXiv:2308.03678 [pdf]

Evaluation of ARM CPUs for IceCube available through Google Kubernetes Engine

Authors: Igor Sfiligoi, David Schultz, Benedikt Riedel, Frank Würthwein

Abstract: The IceCube experiment has substantial simulation needs and is in continuous search for the most cost-effective ways to satisfy them. The most CPU-intensive part relies on CORSIKA, a cosmic ray air shower simulation. Historically, IceCube relied exclusively on x86-based CPUs, like Intel Xeon and AMD EPYC, but recently server-class ARM-based CPUs are also becoming available, both on-prem and in the… ▽ More The IceCube experiment has substantial simulation needs and is in continuous search for the most cost-effective ways to satisfy them. The most CPU-intensive part relies on CORSIKA, a cosmic ray air shower simulation. Historically, IceCube relied exclusively on x86-based CPUs, like Intel Xeon and AMD EPYC, but recently server-class ARM-based CPUs are also becoming available, both on-prem and in the cloud. In this paper we present our experience in running a sample CORSIKA simulation on both ARM and x86 CPUs available through Google Kubernetes Engine (GKE). We used the production binaries for the x86 instances, but had to build the binaries for ARM instances from source code, which turned out to be mostly painless. Our benchmarks show that ARM-based CPUs in GKE were not only the most cost-effective but were also the fastest in absolute terms in all the tested configurations. While the advantage is not drastic, about 20% in cost-effectiveness and less than 10% in absolute terms, it is still large enough to warrant an investment in ARM support for IceCube. △ Less

Submitted 7 August, 2023; originally announced August 2023.

Comments: 5 pages,3 tables, Submitted to proceedings of CHEP23

arXiv:2307.11069 [pdf, other]

doi 10.1109/ICNC57223.2023.10074058

Effectiveness and predictability of in-network storage cache for scientific workflows

Authors: Caitlin Sim, Kesheng Wu, Alex Sim, Inder Monga, Chin Guok, Frank Wurthwein, Diego Davila, Harvey Newman, Justas Balcas

Abstract: Large scientific collaborations often have multiple scientists accessing the same set of files while doing different analyses, which create repeated accesses to the large amounts of shared data located far away. These data accesses have long latency due to distance and occupy the limited bandwidth available over the wide-area network. To reduce the wide-area network traffic and the data access lat… ▽ More Large scientific collaborations often have multiple scientists accessing the same set of files while doing different analyses, which create repeated accesses to the large amounts of shared data located far away. These data accesses have long latency due to distance and occupy the limited bandwidth available over the wide-area network. To reduce the wide-area network traffic and the data access latency, regional data storage caches have been installed as a new networking service. To study the effectiveness of such a cache system in scientific applications, we examine the Southern California Petabyte Scale Cache for a high-energy physics experiment. By examining about 3TB of operational logs, we show that this cache removed 67.6% of file requests from the wide-area network and reduced the traffic volume on wide-area network by 12.3TB (or 35.4%) an average day. The reduction in the traffic volume (35.4%) is less than the reduction in file counts (67.6%) because the larger files are less likely to be reused. Due to this difference in data access patterns, the cache system has implemented a policy to avoid evicting smaller files when processing larger files. We also build a machine learning model to study the predictability of the cache behavior. Tests show that this model is able to accurately predict the cache accesses, cache misses, and network throughput, making the model useful for future studies on resource provisioning and planning. △ Less

Submitted 20 July, 2023; originally announced July 2023.

arXiv:2305.10551 [pdf, other]

doi 10.1145/3569951.3597574

Defining a canonical unit for accounting purposes

Authors: Fabio Andrijauskas, Igor Sfiligoi, Frank Würthwein

Abstract: Compute resource providers often put in place batch compute systems to maximize the utilization of such resources. However, compute nodes in such clusters, both physical and logical, contain several complementary resources, with notable examples being CPUs, GPUs, memory and ephemeral storage. User jobs will typically require more than one such resource, resulting in co-scheduling trade-offs of par… ▽ More Compute resource providers often put in place batch compute systems to maximize the utilization of such resources. However, compute nodes in such clusters, both physical and logical, contain several complementary resources, with notable examples being CPUs, GPUs, memory and ephemeral storage. User jobs will typically require more than one such resource, resulting in co-scheduling trade-offs of partial nodes, especially in multi-user environments. When accounting for either user billing or scheduling overhead, it is thus important to consider all such resources together. We thus define the concept of a threshold-based "canonical unit" that combines several resource types into a single discrete unit and use it to characterize scheduling overhead and make resource billing more fair for both resource providers and users. Note that the exact definition of a canonical unit is not prescribed and may change between resource providers. Nevertheless, we provide a template and two example definitions that we consider appropriate in the context of the Open Science Grid. △ Less

Submitted 17 May, 2023; originally announced May 2023.

Comments: 6 pages, 2 figures, To be published in proceedings of PEARC23

Journal ref: Practice and Experience in Advanced Research Computing (PEARC '23). Association for Computing Machinery, New York, NY, USA, 288-291. (2023)

arXiv:2305.10346 [pdf]

doi 10.1145/3569951.3597586

Testing GitHub projects on custom resources using unprivileged Kubernetes runners

Authors: Igor Sfiligoi, Daniel McDonald, Rob Knight, Frank Würthwein

Abstract: GitHub is a popular repository for hosting software projects, both due to ease of use and the seamless integration with its testing environment. Native GitHub Actions make it easy for software developers to validate new commits and have confidence that new code does not introduce major bugs. The freely available test environments are limited to only a few popular setups but can be extended with cu… ▽ More GitHub is a popular repository for hosting software projects, both due to ease of use and the seamless integration with its testing environment. Native GitHub Actions make it easy for software developers to validate new commits and have confidence that new code does not introduce major bugs. The freely available test environments are limited to only a few popular setups but can be extended with custom Action Runners. Our team had access to a Kubernetes cluster with GPU accelerators, so we explored the feasibility of automatically deploying GPU-providing runners there. All available Kubernetes-based setups, however, require cluster-admin level privileges. To address this problem, we developed a simple custom setup that operates in a completely unprivileged manner. In this paper we provide a summary description of the setup and our experience using it in the context of two Knight lab projects on the Prototype National Research Platform system. △ Less

Submitted 17 May, 2023; originally announced May 2023.

Comments: 5 pages, 1 figure, To be published in proceedings of PEARC23

Journal ref: Practice and Experience in Advanced Research Computing (PEARC '23). Association for Computing Machinery, New York, NY, USA, 332-335. (2023)

arXiv:2305.00856 [pdf, other]

doi 10.1145/3589012.3594897

Analyzing Transatlantic Network Traffic over Scientific Data Caches

Authors: Z. Deng, A. Sim, K. Wu, C. Guok, D. Hazen, I. Monga, F. Andrijauskas, F. Wuerthwein, D. Weitzel

Abstract: Large scientific collaborations often share huge volumes of data around the world. Consequently a significant amount of network bandwidth is needed for data replication and data access. Users in the same region may possibly share resources as well as data, especially when they are working on related topics with similar datasets. In this work, we study the network traffic patterns and resource util… ▽ More Large scientific collaborations often share huge volumes of data around the world. Consequently a significant amount of network bandwidth is needed for data replication and data access. Users in the same region may possibly share resources as well as data, especially when they are working on related topics with similar datasets. In this work, we study the network traffic patterns and resource utilization for scientific data caches connecting European networks to the US. We explore the efficiency of resource utilization, especially for network traffic which consists mostly of transatlantic data transfers, and the potential for having more caching node deployments. Our study shows that these data caches reduced network traffic volume by 97% during the study period. This demonstrates that such caching nodes are effective in reducing wide-area network traffic. △ Less

Submitted 17 July, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

arXiv:2209.13714 [pdf, other]

doi 10.1109/XLOOP56614.2022.00008

Managed Network Services for Exascale Data Movement Across Large Global Scientific Collaborations

Authors: Frank Würthwein, Jonathan Guiang, Aashay Arora, Diego Davila, John Graham, Dima Mishin, Thomas Hutton, Igor Sfiligoi, Harvey Newman, Justas Balcas, Tom Lehman, Xi Yang, Chin Guok

Abstract: Unique scientific instruments designed and operated by large global collaborations are expected to produce Exabyte-scale data volumes per year by 2030. These collaborations depend on globally distributed storage and compute to turn raw data into science. While all of these infrastructures have batch scheduling capabilities to share compute, Research and Education networks lack those capabilities.… ▽ More Unique scientific instruments designed and operated by large global collaborations are expected to produce Exabyte-scale data volumes per year by 2030. These collaborations depend on globally distributed storage and compute to turn raw data into science. While all of these infrastructures have batch scheduling capabilities to share compute, Research and Education networks lack those capabilities. There is thus uncontrolled competition for bandwidth between and within collaborations. As a result, data "hogs" disk space at processing facilities for much longer than it takes to process, leading to vastly over-provisioned storage infrastructures. Integrated co-scheduling of networks as part of high-level managed workflows might reduce these storage needs by more than an order of magnitude. This paper describes such a solution, demonstrates its functionality in the context of the Large Hadron Collider (LHC) at CERN, and presents the next-steps towards its use in production. △ Less

Submitted 27 September, 2022; originally announced September 2022.

Comments: Submitted to the proceedings of the XLOOP workshop held in conjunction with Supercomputing 22

arXiv:2209.08868 [pdf, other]

Snowmass 2021 Computational Frontier CompF4 Topical Group Report: Storage and Processing Resource Access

Authors: W. Bhimji, D. Carder, E. Dart, J. Duarte, I. Fisk, R. Gardner, C. Guok, B. Jayatilaka, T. Lehman, M. Lin, C. Maltzahn, S. McKee, M. S. Neubauer, O. Rind, O. Shadura, N. V. Tran, P. van Gemmeren, G. Watts, B. A. Weaver, F. Würthwein

Abstract: Computing plays a significant role in all areas of high energy physics. The Snowmass 2021 CompF4 topical group's scope is facilities R&D, where we consider "facilities" as the computing hardware and software infrastructure inside the data centers plus the networking between data centers, irrespective of who owns them, and what policies are applied for using them. In other words, it includes commer… ▽ More Computing plays a significant role in all areas of high energy physics. The Snowmass 2021 CompF4 topical group's scope is facilities R&D, where we consider "facilities" as the computing hardware and software infrastructure inside the data centers plus the networking between data centers, irrespective of who owns them, and what policies are applied for using them. In other words, it includes commercial clouds, federally funded High Performance Computing (HPC) systems for all of science, and systems funded explicitly for a given experimental or theoretical program. This topical group report summarizes the findings and recommendations for the storage, processing, networking and associated software service infrastructures for future high energy physics research, based on the discussions organized through the Snowmass 2021 community study. △ Less

Submitted 29 September, 2022; v1 submitted 19 September, 2022; originally announced September 2022.

Comments: Snowmass 2021 Computational Frontier CompF4 topical group report. v2: Expanded introduction. Updated author list. 52 pages, 6 figures

arXiv:2205.09682 [pdf]

doi 10.1145/3491418.3535130

Comparing single-node and multi-node performance of an important fusion HPC code benchmark

Authors: Emily A. Belli, Jeff Candy, Igor Sfiligoi, Frank Würthwein

Abstract: Fusion simulations have traditionally required the use of leadership scale High Performance Computing (HPC) resources in order to produce advances in physics. The impressive improvements in compute and memory capacity of many-GPU compute nodes are now allowing for some problems that once required a multi-node setup to be also solvable on a single node. When possible, the increased interconnect ban… ▽ More Fusion simulations have traditionally required the use of leadership scale High Performance Computing (HPC) resources in order to produce advances in physics. The impressive improvements in compute and memory capacity of many-GPU compute nodes are now allowing for some problems that once required a multi-node setup to be also solvable on a single node. When possible, the increased interconnect bandwidth can result in order of magnitude higher science throughput, especially for communication-heavy applications. In this paper we analyze the performance of the fusion simulation tool CGYRO, an Eulerian gyrokinetic turbulence solver designed and optimized for collisional, electromagnetic, multiscale simulation, which is widely used in the fusion research community. Due to the nature of the problem, the application has to work on a large multi-dimensional computational mesh as a whole, requiring frequent exchange of large amounts of data between the compute processes. In particular, we show that the average-scale nl03 benchmark CGYRO simulation can be run at an acceptable speed on a single Google Cloud instance with 16 A100 GPUs, outperforming 8 NERSC Perlmutter Phase1 nodes, 16 ORNL Summit nodes and 256 NERSC Cori nodes. Moving from a multi-node to a single-node GPU setup we get comparable simulation times using less than half the number of GPUs. Larger benchmark problems, however, still require a multi-node HPC setup due to GPU memory capacity needs, since at the time of writing no vendor offers nodes with a sufficient GPU memory setup. The upcoming external NVSWITCH does however promise to deliver an almost equivalent solution for up to 256 NVIDIA GPUs. △ Less

Submitted 19 May, 2022; originally announced May 2022.

Comments: 6 pages, 1 table, 1 figure, to be published in proceedings of PEARC22

Journal ref: PEARC '22: Practice and Experience in Advanced Research Computing (2022) 10 1-4

arXiv:2205.09232 [pdf]

doi 10.1145/3491418.3535125

The anachronism of whole-GPU accounting

Authors: Igor Sfiligoi, David Schultz, Frank Würthwein, Benedikt Riedel, Dmitry Y. Mishin

Abstract: NVIDIA has been making steady progress in increasing the compute performance of its GPUs, resulting in order of magnitude compute throughput improvements over the years. With several models of GPUs coexisting in many deployments, the traditional accounting method of treating all GPUs as being equal is not reflecting compute output anymore. Moreover, for applications that require significant CPU-ba… ▽ More NVIDIA has been making steady progress in increasing the compute performance of its GPUs, resulting in order of magnitude compute throughput improvements over the years. With several models of GPUs coexisting in many deployments, the traditional accounting method of treating all GPUs as being equal is not reflecting compute output anymore. Moreover, for applications that require significant CPU-based compute to complement the GPU-based compute, it is becoming harder and harder to make full use of the newer GPUs, requiring sharing of those GPUs between multiple applications in order to maximize the achievable science output. This further reduces the value of whole-GPU accounting, especially when the sharing is done at the infrastructure level. We thus argue that GPU accounting for throughput-oriented infrastructures should be expressed in GPU core hours, much like it is normally done for the CPUs. While GPU core compute throughput does change between GPU generations, the variability is similar to what we expect to see among CPU cores. To validate our position, we present an extensive set of run time measurements of two IceCube photon propagation workflows on 14 GPU models, using both on-prem and Cloud resources. The measurements also outline the influence of GPU sharing at both HTCondor and Kubernetes infrastructure level. △ Less

Submitted 18 May, 2022; originally announced May 2022.

Comments: 6 pages, 2 tables, 1 figure, to be published in proceedings of PEARC22

Journal ref: PEARC '22: Practice and Experience in Advanced Research Computing (2022) 58 1-5

arXiv:2205.05598 [pdf, other]

doi 10.1145/3526064.3534111

Studying Scientific Data Lifecycle in On-demand Distributed Storage Caches

Authors: Julian Bellavita, Alex Sim, Kesheng Wu, Inder Monga, Chin Guok, Frank Würthwein, Diego Davila

Abstract: The XRootD system is used to transfer, store, and cache large datasets from high-energy physics (HEP). In this study we focus on its capability as distributed on-demand storage cache. Through exploring a large set of daily log files between 2020 and 2021, we seek to understand the data access patterns that might inform future cache design. Our study begins with a set of summary statistics regardin… ▽ More The XRootD system is used to transfer, store, and cache large datasets from high-energy physics (HEP). In this study we focus on its capability as distributed on-demand storage cache. Through exploring a large set of daily log files between 2020 and 2021, we seek to understand the data access patterns that might inform future cache design. Our study begins with a set of summary statistics regarding file read operations, file lifetimes, and file transfers. We observe that the number of read operations on each file remains nearly constant, while the average size of a read operation grows over time. Furthermore, files tend to have a consistent length of time during which they remain open and are in use. Based on this comprehensive study of the cache access statistics, we developed a cache simulator to explore the behavior of caches of different sizes. Within a certain size range, we find that increasing the XRootD cache size improves the cache hit rate, yielding faster overall file access. In particular, we find that increase the cache size from 40TB to 56TB could increase the hit rate from 0.62 to 0.89, which is a significant increase in cache effectiveness for modest cost. △ Less

Submitted 11 May, 2022; originally announced May 2022.

arXiv:2205.05563 [pdf, other]

doi 10.1145/3526064.3534110

Access Trends of In-network Cache for Scientific Data

Authors: Ruize Han, Alex Sim, Kesheng Wu, Inder Monga, Chin Guok, Frank Würthwein, Diego Davila, Justas Balcas, Harvey Newman

Abstract: Scientific collaborations are increasingly relying on large volumes of data for their work and many of them employ tiered systems to replicate the data to their worldwide user communities. Each user in the community often selects a different subset of data for their analysis tasks; however, members of a research group often are working on related research topics that require similar data objects.… ▽ More Scientific collaborations are increasingly relying on large volumes of data for their work and many of them employ tiered systems to replicate the data to their worldwide user communities. Each user in the community often selects a different subset of data for their analysis tasks; however, members of a research group often are working on related research topics that require similar data objects. Thus, there is a significant amount of data sharing possible. In this work, we study the access traces of a federated storage cache known as the Southern California Petabyte Scale Cache. By studying the access patterns and potential for network traffic reduction by this caching system, we aim to explore the predictability of the cache uses and the potential for a more general in-network data caching. Our study shows that this distributed storage cache is able to reduce the network traffic volume by a factor of 2.35 during a part of the study period. We further show that machine learning models could predict cache utilization with an accuracy of 0.88. This demonstrates that such cache usage is predictable, which could be useful for managing complex networking resources such as in-network caching. △ Less

Submitted 11 May, 2022; originally announced May 2022.

arXiv:2205.01004 [pdf]

doi 10.1145/3491418.3535123

Auto-scaling HTCondor pools using Kubernetes compute resources

Authors: Igor Sfiligoi, Thomas DeFanti, Frank Würthwein

Abstract: HTCondor has been very successful in managing globally distributed, pleasantly parallel scientific workloads, especially as part of the Open Science Grid. HTCondor system design makes it ideal for integrating compute resources provisioned from anywhere, but it has very limited native support for autonomously provisioning resources managed by other solutions. This work presents a solution that allo… ▽ More HTCondor has been very successful in managing globally distributed, pleasantly parallel scientific workloads, especially as part of the Open Science Grid. HTCondor system design makes it ideal for integrating compute resources provisioned from anywhere, but it has very limited native support for autonomously provisioning resources managed by other solutions. This work presents a solution that allows for autonomous, demand-driven provisioning of Kubernetes-managed resources. A high-level overview of the employed architectures is presented, paired with the description of the setups used in both on-prem and Cloud deployments in support of several Open Science Grid communities. The experience suggests that the described solution should be generally suitable for contributing Kubernetes-based resources to existing HTCondor pools. △ Less

Submitted 2 May, 2022; originally announced May 2022.

Comments: 6 pages, 3 figures, to be published in proceedings of PEARC22

Journal ref: PEARC '22: Practice and Experience in Advanced Research Computing (2022) 57 1-4

arXiv:2203.08280 [pdf]

Data Transfer and Network Services management for Domain Science Workflows

Authors: Tom Lehman, Xi Yang, Chin Guok, Frank Wuerthwein, Igor Sfiligoi, John Graham, Aashay Arora, Dima Mishin, Diego Davila, Jonathan Guiang, Tom Hutton, Harvey Newman, Justas Balcas

Abstract: This paper describes a vision and work in progress to elevate network resources and data transfer management to the same level as compute and storage in the context of services access, scheduling, life cycle management, and orchestration. While domain science workflows often include active compute resource allocation and management, the data transfers and associated network resource coordination i… ▽ More This paper describes a vision and work in progress to elevate network resources and data transfer management to the same level as compute and storage in the context of services access, scheduling, life cycle management, and orchestration. While domain science workflows often include active compute resource allocation and management, the data transfers and associated network resource coordination is not handled in a similar manner. As a result data transfers can introduce a degree of uncertainty in workflow operations, and the associated lack of network information does not allow for either the workflow operations or the network use to be optimized. The net result is that domain science workflow processes are forced to view the network as an opaque infrastructure into which they inject data and hope that it emerges at the destination with an acceptable Quality of Experience. There is little ability for applications to interact with the network to exchange information, negotiate performance parameters, discover expected performance metrics, or receive status/troubleshooting information in real time. Developing mechanisms to allow an application workflow to obtain information regarding the network services, capabilities, and options, to a degree similar to what is possible for compute resources is the primary motivation for this work. The initial focus is on the Open Science Grid (OSG)/Compact Muon Solenoid (CMS) Large Hadron Collider (LHC) workflows with Rucio/FTS/XRootD based data transfers and the interoperation with the ESnet SENSE (Software-Defined Network for End-to-end Networked Science at the Exascale) system. △ Less

Submitted 20 March, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

Comments: contribution to Snowmass 2022

arXiv:2110.13187 [pdf]

doi 10.1007/978-981-19-0898-9_20

Data intensive physics analysis in Azure cloud

Authors: Igor Sfiligoi, Frank Würthwein, Diego Davila

Abstract: The Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) is one of the largest data producers in the scientific world, with standard data products centrally produced, and then used by often competing teams within the collaboration. This work is focused on how a local institution, University of California San Diego (UCSD), partnered with the Open Science Grid (OSG) to use Azure… ▽ More The Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC) is one of the largest data producers in the scientific world, with standard data products centrally produced, and then used by often competing teams within the collaboration. This work is focused on how a local institution, University of California San Diego (UCSD), partnered with the Open Science Grid (OSG) to use Azure cloud resources to augment its available computing to accelerate time to results for multiple analyses pursued by a small group of collaborators. The OSG is a federated infrastructure allowing many independent resource providers to serve many independent user communities in a transparent manner. Historically the resources would come from various research institutions, spanning small universities to large HPC centers, based on either community needs or grant allocations, so adding commercial clouds as resource providers is a natural evolution. The OSG technology allows for easy integration of cloud resources, but the data-intensive nature of CMS compute jobs required the deployment of additional data caching infrastructure to ensure high efficiency. △ Less

Submitted 25 October, 2021; originally announced October 2021.

Comments: 11 pages, 5 figures, to be published in proceedings of ICOCBI 2021

Journal ref: Lecture Notes on Data Engineering and Communications Technologies, vol 117. Springer, Singapore. 2022

arXiv:2107.03963 [pdf]

doi 10.1109/eScience51609.2021.00034

Expanding IceCube GPU computing into the Clouds

Authors: Igor Sfiligoi, Shava Smallen, Frank Würthwein, Nicole Wolter, David Schultz, Benedikt Riedel

Abstract: The IceCube collaboration relies on GPU compute for many of its needs, including ray tracing simulation and machine learning activities. GPUs are however still a relatively scarce commodity in the scientific resource provider community, so we expanded the available resource pool with GPUs provisioned from the commercial Cloud providers. The provisioned resources were fully integrated into the norm… ▽ More The IceCube collaboration relies on GPU compute for many of its needs, including ray tracing simulation and machine learning activities. GPUs are however still a relatively scarce commodity in the scientific resource provider community, so we expanded the available resource pool with GPUs provisioned from the commercial Cloud providers. The provisioned resources were fully integrated into the normal IceCube workload management system through the Open Science Grid (OSG) infrastructure and used CloudBank for budget management. The result was an approximate doubling of GPU wall hours used by IceCube over a period of 2 weeks, adding over 3.1 fp32 EFLOP hours for a price tag of about $58k. This paper describes the setup used and the operational experience. △ Less

Submitted 8 July, 2021; originally announced July 2021.

Comments: 2 pages, 2 figures, to be published in proceedings of eScience 2021

Journal ref: 2021 IEEE 17th International Conference on eScience (eScience), 2021, pp. 227-228

arXiv:2107.03947 [pdf]

doi 10.1109/eScience51609.2021.00040

HTCondor data movement at 100 Gbps

Authors: Igor Sfiligoi, Frank Würthwein, Thomas DeFanti, John Graham

Abstract: HTCondor is a major workload management system used in distributed high throughput computing (dHTC) environments, e.g., the Open Science Grid. One of the distinguishing features of HTCondor is the native support for data movement, allowing it to operate without a shared filesystem. Coupling data handling and compute scheduling is both convenient for users and allows for significant infrastructure… ▽ More HTCondor is a major workload management system used in distributed high throughput computing (dHTC) environments, e.g., the Open Science Grid. One of the distinguishing features of HTCondor is the native support for data movement, allowing it to operate without a shared filesystem. Coupling data handling and compute scheduling is both convenient for users and allows for significant infrastructure flexibility but does introduce some limitations. The default HTCondor data transfer mechanism routes both the input and output data through the submission node, making it a potential bottleneck. In this document we show that by using a node equipped with a 100 Gbps network interface (NIC) HTCondor can serve data at up to 90 Gbps, which is sufficient for most current use cases, as it would saturate the border network links of most research universities at the time of writing. △ Less

Submitted 8 July, 2021; originally announced July 2021.

Comments: 2 pages, 2 figures, to be published in proceedings of eScience 2021

Journal ref: 2021 IEEE 17th International Conference on eScience (eScience), 2021, pp. 239-240

arXiv:2105.00964 [pdf, other]

doi 10.1145/3452411.3464441

Analyzing scientific data sharing patterns for in-network data caching

Authors: Elizabeth Copps, Huiyi Zhang, Alex Sim, Kesheng Wu, Inder Monga, Chin Guok, Frank Würthwein, Diego Davila, Edgar Fajardo

Abstract: The volume of data moving through a network increases with new scientific experiments and simulations. Network bandwidth requirements also increase proportionally to deliver data within a certain time frame. We observe that a significant portion of the popular dataset is transferred multiple times to different users as well as to the same user for various reasons. In-network data caching for the s… ▽ More The volume of data moving through a network increases with new scientific experiments and simulations. Network bandwidth requirements also increase proportionally to deliver data within a certain time frame. We observe that a significant portion of the popular dataset is transferred multiple times to different users as well as to the same user for various reasons. In-network data caching for the shared data has shown to reduce the redundant data transfers and consequently save network traffic volume. In addition, overall application performance is expected to improve with in-network caching because access to the locally cached data results in lower latency. This paper shows how much data was shared over the study period, how much network traffic volume was consequently saved, and how much the temporary in-network caching increased the scientific application performance. It also analyzes data access patterns in applications and the impacts of caching nodes on the regional data repository. From the results, we observed that the network bandwidth demand was reduced by nearly a factor of 3 over the study period. △ Less

Submitted 3 May, 2021; originally announced May 2021.

arXiv:2104.06913 [pdf]

doi 10.1145/3437359.3465563

Managing Cloud networking costs for data-intensive applications by provisioning dedicated network links

Authors: Igor Sfiligoi, Michael Hare, David Schultz, Frank Würthwein, Benedikt Riedel, Tom Hutton, Steve Barnet, Vladimir Brik

Abstract: Many scientific high-throughput applications can benefit from the elastic nature of Cloud resources, especially when there is a need to reduce time to completion. Cost considerations are usually a major issue in such endeavors, with networking often a major component; for data-intensive applications, egress networking costs can exceed the compute costs. Dedicated network links provide a way to low… ▽ More Many scientific high-throughput applications can benefit from the elastic nature of Cloud resources, especially when there is a need to reduce time to completion. Cost considerations are usually a major issue in such endeavors, with networking often a major component; for data-intensive applications, egress networking costs can exceed the compute costs. Dedicated network links provide a way to lower the networking costs, but they do add complexity. In this paper we provide a description of a 100 fp32 PFLOPS Cloud burst in support of IceCube production compute, that used Internet2 Cloud Connect service to provision several logically-dedicated network links from the three major Cloud providers, namely Amazon Web Services, Microsoft Azure and Google Cloud Platform, that in aggregate enabled approximately 100 Gbps egress capability to on-prem storage. It provides technical details about the provisioning process, the benefits and limitations of such a setup and an analysis of the costs incurred. △ Less

Submitted 14 April, 2021; originally announced April 2021.

Comments: 8 pages, 7 figures, 4 tables, to be published in proceedings of PEARC21

arXiv:2103.12116 [pdf, other]

doi 10.1051/epjconf/202125102001

Systematic benchmarking of HTTPS third party copy on 100Gbps links using XRootD

Authors: Edgar Fajardo, Aashay Arora, Diego Davila, Richard Gao, Frank Würthwein, Brian Bockelman

Abstract: The High Luminosity Large Hadron Collider provides a data challenge. The amount of data recorded from the experiments and transported to hundreds of sites will see a thirty fold increase in annual data volume. A systematic approach to contrast the performance of different Third Party Copy(TPC) transfer protocols arises. Two contenders, XRootD-HTTPS and the GridFTP are evaluated in their performanc… ▽ More The High Luminosity Large Hadron Collider provides a data challenge. The amount of data recorded from the experiments and transported to hundreds of sites will see a thirty fold increase in annual data volume. A systematic approach to contrast the performance of different Third Party Copy(TPC) transfer protocols arises. Two contenders, XRootD-HTTPS and the GridFTP are evaluated in their performance for transferring files from one server to an-other over 100Gbps interfaces. The benchmarking is done by scheduling pods on the Pacific Research Platform Kubernetes cluster to ensure reproducible and repeatable results. This opens a future pathway for network testing of any TPC transfer protocol. △ Less

Submitted 22 March, 2021; originally announced March 2021.

Comments: 7 pages, 8 figures

arXiv:2101.11489 [pdf, other]

Parallelizing the Unpacking and Clustering of Detector Data for Reconstruction of Charged Particle Tracks on Multi-core CPUs and Many-core GPUs

Authors: Giuseppe Cerati, Peter Elmer, Brian Gravelle, Matti Kortelainen, Vyacheslav Krutelyov, Steven Lantz, Mario Masciovecchio, Kevin McDermott, Boyana Norris, Allison Reinsvold Hall, Micheal Reid, Daniel Riley, Matevž Tadel, Peter Wittich, Bei Wang, Frank Würthwein, Avraham Yagil

Abstract: We present results from parallelizing the unpacking and clustering steps of the raw data from the silicon strip modules for reconstruction of charged particle tracks. Throughput is further improved by concurrently processing multiple events using nested OpenMP parallelism on CPU or CUDA streams on GPU. The new implementation along with earlier work in developing a parallelized and vectorized imple… ▽ More We present results from parallelizing the unpacking and clustering steps of the raw data from the silicon strip modules for reconstruction of charged particle tracks. Throughput is further improved by concurrently processing multiple events using nested OpenMP parallelism on CPU or CUDA streams on GPU. The new implementation along with earlier work in developing a parallelized and vectorized implementation of the combinatoric Kalman filter algorithm has enabled efficient global reconstruction of the entire event on modern computer architectures. We demonstrate the performance of the new implementation on Intel Xeon and NVIDIA GPU architectures. △ Less

Submitted 27 January, 2021; originally announced January 2021.

arXiv:2011.14995 [pdf, other]

Adapting LIGO workflows to run in the Open Science Grid

Authors: Edgar Fajardo, Frank Wuerthwein, Brian Bockelman, Miron Livny, Greg Thain, James Alexander Clark, Peter Couvares, Josh Willis

Abstract: During the first observation run the LIGO collaboration needed to offload some of its most, intense CPU workflows from its dedicated computing sites to opportunistic resources. Open Science Grid enabled LIGO to run PyCbC, RIFT and Bayeswave workflows to seamlessly run in a combination of owned and opportunistic resources. One of the challenges is enabling the workflows to use several heterogeneous… ▽ More During the first observation run the LIGO collaboration needed to offload some of its most, intense CPU workflows from its dedicated computing sites to opportunistic resources. Open Science Grid enabled LIGO to run PyCbC, RIFT and Bayeswave workflows to seamlessly run in a combination of owned and opportunistic resources. One of the challenges is enabling the workflows to use several heterogeneous resources in a coordinated and effective way. △ Less

Submitted 30 November, 2020; originally announced November 2020.

arXiv:2007.01408 [pdf, other]

doi 10.1051/epjconf/202024504041

Creating a content delivery network for general science on the internet backbone using XCaches

Authors: Edgar Fajardo, Marian Zvada, Derek Weitzel, Mats Rynge, John Hicks, Mat Selmeci, Brian Lin, Pascal Paschos, Brian Bockelman, Igor Sfiligoi, Andrew Hanushevsky, Frank Würthwein

Abstract: A general problem faced by computing on the grid for opportunistic users is that delivering cycles is simpler than delivering data to those cycles. In this project we show how we integrated XRootD caches placed on the internet backbone to implement a content delivery network for general science workflows. We will show that for some workflows on different science domains like high energy physics, g… ▽ More A general problem faced by computing on the grid for opportunistic users is that delivering cycles is simpler than delivering data to those cycles. In this project we show how we integrated XRootD caches placed on the internet backbone to implement a content delivery network for general science workflows. We will show that for some workflows on different science domains like high energy physics, gravitational waves, and others the combination of data reuse from the workflows together with the use of caches increases CPU efficiency while decreasing network bandwidth use. △ Less

Submitted 28 September, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

arXiv:2005.06151 [pdf, other]

doi 10.1051/epjconf/202024505019

The Scalable Systems Laboratory: a Platform for Software Innovation for HEP

Authors: Robert Gardner, Lincoln Bryant, Mark Neubauer, Frank Wuerthwein, Judith Stephen, Andrew Chien

Abstract: The Scalable Systems Laboratory (SSL), part of the IRIS-HEP Software Institute, provides Institute participants and HEP software developers generally with a means to transition their R&D from conceptual toys to testbeds to production-scale prototypes. The SSL enables tooling, infrastructure, and services supporting the innovation of novel analysis and data architectures, development of software el… ▽ More The Scalable Systems Laboratory (SSL), part of the IRIS-HEP Software Institute, provides Institute participants and HEP software developers generally with a means to transition their R&D from conceptual toys to testbeds to production-scale prototypes. The SSL enables tooling, infrastructure, and services supporting the innovation of novel analysis and data architectures, development of software elements and tool-chains, reproducible functional and scalability testing of service components, and foundational systems R&D for accelerated services developed by the Institute. The SSL is constructed with a core team having expertise in scale testing and deployment of services across a wide range of cyberinfrastructure. The core team embeds and partners with other areas in the Institute, and with LHC and other HEP development and operations teams as appropriate, to define investigations and required service deployment patterns. We describe the approach and experiences with early application deployments, including analysis platforms and intelligent data delivery systems. △ Less

Submitted 13 May, 2020; originally announced May 2020.

arXiv:2004.09492 [pdf]

doi 10.1145/3311790.3396625

Demonstrating a Pre-Exascale, Cost-Effective Multi-Cloud Environment for Scientific Computing

Authors: I. Sfiligoi, D. Schultz, B. Riedel, F. Wuerthwein, S. Barnet, V. Brik

Abstract: Scientific computing needs are growing dramatically with time and are expanding in science domains that were previously not compute intensive. When compute workflows spike well in excess of the capacity of their local compute resource, capacity should be temporarily provisioned from somewhere else to both meet deadlines and to increase scientific output. Public Clouds have become an attractive opt… ▽ More Scientific computing needs are growing dramatically with time and are expanding in science domains that were previously not compute intensive. When compute workflows spike well in excess of the capacity of their local compute resource, capacity should be temporarily provisioned from somewhere else to both meet deadlines and to increase scientific output. Public Clouds have become an attractive option due to their ability to be provisioned with minimal advance notice. The available capacity of cost-effective instances is not well understood. This paper presents expanding the IceCube's production HTCondor pool using cost-effective GPU instances in preemptible mode gathered from the three major Cloud providers, namely Amazon Web Services, Microsoft Azure and the Google Cloud Platform. Using this setup, we sustained for a whole workday about 15k GPUs, corresponding to around 170 PFLOP32s, integrating over one EFLOP32 hour worth of science output for a price tag of about $60k. In this paper, we provide the reasoning behind Cloud instance selection, a description of the setup and an analysis of the provisioned resources, as well as a short description of the actual science output of the exercise. △ Less

Submitted 18 April, 2020; originally announced April 2020.

Comments: 5 pages, 7 figures, to be published in proceedings of PEARC'20. arXiv admin note: text overlap with arXiv:2002.06667

arXiv:2003.02319 [pdf, other]

doi 10.1051/epjconf/202024504042

Moving the California distributed CMS xcache from bare metal into containers using Kubernetes

Authors: Edgar Fajardo, Matevz Tadel, Justas Balcas, Alja Tadel, Frank Wuerthwein, Diego Davila, Jonathan Guiang, Igor Sfiligoi

Abstract: The University of California system has excellent networking between all of its campuses as well as a number of other Universities in CA, including Caltech, most of them being connected at 100 Gbps. UCSD and Caltech have thus joined their disk systems into a single logical xcache system, with worker nodes from both sites accessing data from disks at either site. This setup has been in place for a… ▽ More The University of California system has excellent networking between all of its campuses as well as a number of other Universities in CA, including Caltech, most of them being connected at 100 Gbps. UCSD and Caltech have thus joined their disk systems into a single logical xcache system, with worker nodes from both sites accessing data from disks at either site. This setup has been in place for a couple years now and has shown to work very well. Coherently managing nodes at multiple physical locations has however not been trivial, and we have been looking for ways to improve operations. With the Pacific Research Platform (PRP) now providing a Kubernetes resource pool spanning resources in the science DMZs of all the UC campuses, we have recently migrated the xcache services from being hosted bare-metal into containers. This paper presents our experience in both migrating to and operating in the new environment. △ Less

Submitted 4 March, 2020; originally announced March 2020.

arXiv:2002.06667 [pdf]

doi 10.1007/978-3-030-50743-5_2

Running a Pre-Exascale, Geographically Distributed, Multi-Cloud Scientific Simulation

Authors: Igor Sfiligoi, Frank Wuerthwein, Benedikt Riedel, David Schultz

Abstract: As we approach the Exascale era, it is important to verify that the existing frameworks and tools will still work at that scale. Moreover, public Cloud computing has been emerging as a viable solution for both prototyping and urgent computing. Using the elasticity of the Cloud, we have thus put in place a pre-exascale HTCondor setup for running a scientific simulation in the Cloud, with the chosen… ▽ More As we approach the Exascale era, it is important to verify that the existing frameworks and tools will still work at that scale. Moreover, public Cloud computing has been emerging as a viable solution for both prototyping and urgent computing. Using the elasticity of the Cloud, we have thus put in place a pre-exascale HTCondor setup for running a scientific simulation in the Cloud, with the chosen application being IceCube's photon propagation simulation. I.e. this was not a purely demonstration run, but it was also used to produce valuable and much needed scientific results for the IceCube collaboration. In order to reach the desired scale, we aggregated GPU resources across 8 GPU models from many geographic regions across Amazon Web Services, Microsoft Azure, and the Google Cloud Platform. Using this setup, we reached a peak of over 51k GPUs corresponding to almost 380 PFLOP32s, for a total integrated compute of about 100k GPU hours. In this paper we provide the description of the setup, the problems that were discovered and overcome, as well as a short description of the actual science output of the exercise. △ Less

Submitted 16 February, 2020; originally announced February 2020.

Comments: 18 pages, 5 figures, 4 tables, to be published in Proceedings of ISC High Performance 2020

Journal ref: Lecture Notes in Computer Science, vol 12151, year 2020. Springer

arXiv:2002.04568 [pdf]

doi 10.1051/epjconf/202024507059

Characterizing network paths in and out of the clouds

Authors: Igor Sfiligoi, John Graham, Frank Wuerthwein

Abstract: Commercial Cloud computing is becoming mainstream, with funding agencies moving beyond prototyping and starting to fund production campaigns, too. An important aspect of any scientific computing production campaign is data movement, both incoming and outgoing. And while the performance and cost of VMs is relatively well understood, the network performance and cost is not. This paper provides a cha… ▽ More Commercial Cloud computing is becoming mainstream, with funding agencies moving beyond prototyping and starting to fund production campaigns, too. An important aspect of any scientific computing production campaign is data movement, both incoming and outgoing. And while the performance and cost of VMs is relatively well understood, the network performance and cost is not. This paper provides a characterization of networking in various regions of Amazon Web Services, Microsoft Azure and Google Cloud Platform, both between Cloud resources and major DTNs in the Pacific Research Platform, including OSG data federation caches in the network backbone, and inside the clouds themselves. The paper contains both a qualitative analysis of the results as well as latency and throughput measurements. It also includes an analysis of the costs involved with Cloud-based networking. △ Less

Submitted 11 February, 2020; originally announced February 2020.

Comments: 7 pages, 1 figure, 5 tables, to be published in CHEP19 proceedings

Journal ref: EPJ Web of Conferences 245, 07059 (2020)

arXiv:1705.06202 [pdf, other]

Data Access for LIGO on the OSG

Authors: Derek Weitzel, Brian Bockelman, Duncan A. Brown, Peter Couvares, Frank Würthwein, Edgar Fajardo Hernandez

Abstract: During 2015 and 2016, the Laser Interferometer Gravitational-Wave Observatory (LIGO) conducted a three-month observing campaign. These observations delivered the first direct detection of gravitational waves from binary black hole mergers. To search for these signals, the LIGO Scientific Collaboration uses the PyCBC search pipeline. To deliver science results in a timely manner, LIGO collaborated… ▽ More During 2015 and 2016, the Laser Interferometer Gravitational-Wave Observatory (LIGO) conducted a three-month observing campaign. These observations delivered the first direct detection of gravitational waves from binary black hole mergers. To search for these signals, the LIGO Scientific Collaboration uses the PyCBC search pipeline. To deliver science results in a timely manner, LIGO collaborated with the Open Science Grid (OSG) to distribute the required computation across a series of dedicated, opportunistic, and allocated resources. To deliver the petabytes necessary for such a large-scale computation, our team deployed a distributed data access infrastructure based on the XRootD server suite and the CernVM File System (CVMFS). This data access strategy grew from simply accessing remote storage to a POSIX-based interface underpinned by distributed, secure caches across the OSG. △ Less

Submitted 17 May, 2017; originally announced May 2017.

Comments: 6 pages, 3 figures, submitted to PEARC17

arXiv:1508.01443 [pdf, other]

Any Data, Any Time, Anywhere: Global Data Access for Science

Authors: Kenneth Bloom, Tommaso Boccali, Brian Bockelman, Daniel Bradley, Sridhara Dasu, Jeff Dost, Federica Fanzago, Igor Sfiligoi, Alja Mrak Tadel, Matevz Tadel, Carl Vuosalo, Frank Würthwein, Avi Yagil, Marian Zvada

Abstract: Data access is key to science driven by distributed high-throughput computing (DHTC), an essential technology for many major research projects such as High Energy Physics (HEP) experiments. However, achieving efficient data access becomes quite difficult when many independent storage sites are involved because users are burdened with learning the intricacies of accessing each system and keeping ca… ▽ More Data access is key to science driven by distributed high-throughput computing (DHTC), an essential technology for many major research projects such as High Energy Physics (HEP) experiments. However, achieving efficient data access becomes quite difficult when many independent storage sites are involved because users are burdened with learning the intricacies of accessing each system and keeping careful track of data location. We present an alternate approach: the Any Data, Any Time, Anywhere infrastructure. Combining several existing software products, AAA presents a global, unified view of storage systems - a "data federation," a global filesystem for software delivery, and a workflow management system. We present how one HEP experiment, the Compact Muon Solenoid (CMS), is utilizing the AAA infrastructure and some simple performance metrics. △ Less

Submitted 6 August, 2015; originally announced August 2015.

Comments: 9 pages, 6 figures, submitted to 2nd IEEE/ACM International Symposium on Big Data Computing (BDC) 2015

arXiv:1507.07430 [pdf, other]

doi 10.1088/1742-6596/664/3/032010

Designing Computing System Architecture and Models for the HL-LHC era

Authors: Lothar Bauerdick, Brian Bockelman, Peter Elmer, Stephen Gowdy, Matevz Tadel, Frank Wuerthwein

Abstract: This paper describes a programme to study the computing model in CMS after the next long shutdown near the end of the decade. This paper describes a programme to study the computing model in CMS after the next long shutdown near the end of the decade. △ Less

Submitted 20 July, 2015; originally announced July 2015.

Comments: Submitted to proceedings of the 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), Okinawa, Japan

arXiv:cs/0307007 [pdf, ps, other]

Management of Grid Jobs and Information within SAMGrid

Authors: A. Baranovski, G. Garzoglio, A. Kreymer, L. Lueking, S. Stonjek, I. Terekhov, F. Wuerthwein, A. Roy, P. Mhashikar, V. Murthi, T. Tannenbaum, R. Walker, F. Ratnikov, T. Rockwell

Abstract: We describe some of the key aspects of the SAMGrid system, used by the D0 and CDF experiments at Fermilab. Having sustained success of the data handling part of SAMGrid, we have developed new services for job and information services. Our job management is rooted in \CondorG and uses enhancements that are general applicability for HEP grids. Our information system is based on a uniform framework… ▽ More We describe some of the key aspects of the SAMGrid system, used by the D0 and CDF experiments at Fermilab. Having sustained success of the data handling part of SAMGrid, we have developed new services for job and information services. Our job management is rooted in \CondorG and uses enhancements that are general applicability for HEP grids. Our information system is based on a uniform framework for configuration management based on XML data representation and processing. △ Less

Submitted 8 July, 2003; v1 submitted 3 July, 2003; originally announced July 2003.

Comments: 7 pages including figures, presented at CHEP 2003

ACM Class: c.1.4

Journal ref: ECONF C0303241:TUAT002,2003

Showing 1–37 of 37 results for author: Würthwein, F