Showing 1–43 of 43 results for author: Bockelman, B

  1. arXiv:2404.02100  [pdf, other]

    hep-ex

    Analysis Facilities White Paper

    Authors: D. Ciangottini, A. Forti, L. Heinrich, N. Skidmore, C. Alpigiani, M. Aly, D. Benjamin, B. Bockelman, L. Bryant, J. Catmore, M. D'Alfonso, A. Delgado Peris, C. Doglioni, G. Duckeck, P. Elmer, J. Eschle, M. Feickert, J. Frost, R. Gardner, V. Garonne, M. Giffels, J. Gooding, E. Gramstad, L. Gray, B. Hegner, et al. (41 additional authors not shown)

    Abstract: This white paper presents the current status of the R&D for Analysis Facilities (AFs) and attempts to summarize the views on the future direction of these facilities. These views have been collected through the High Energy Physics (HEP) Software Foundation's (HSF) Analysis Facilities forum, established in March 2022, the Analysis Ecosystems II workshop, that took place in May 2022, and the WLCG/HS…

    Submitted 15 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  2. arXiv:2402.05244  [pdf, ps, other]

    cs.DC

    CRIU -- Checkpoint Restore in Userspace for computational simulations and scientific applications

    Authors: Fabio Andrijauskas, Igor Sfiligoi, Diego Davila, Aashay Arora, Jonathan Guiang, Brian Bockelman, Greg Thain, Frank Wurthwein

    Abstract: Creating new materials, discovering new drugs, and simulating systems are essential processes for research and innovation and require substantial computational power. While many applications can be split into many smaller independent tasks, some cannot and may take hours or weeks to run to completion. To better manage those longer-running jobs, it would be desirable to stop them at any arbitrary p…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 26TH INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY & NUCLEAR PHYSICS - 2023

  3. arXiv:2312.11485  [pdf, other]

    cs.DC hep-ex

    Coffea-Casa: Building composable analysis facilities for the HL-LHC

    Authors: Sam Albin, Garhan Attebury, Kenneth Bloom, Brian Bockelman, Carl Lundstedt, Oksana Shadura, John Thiltges

    Abstract: The large data volumes expected from the High Luminosity LHC (HL-LHC) present challenges to existing paradigms and facilities for end-user data analysis. Modern cyberinfrastructure tools provide a diverse set of services that can be composed into a system that provides physicists with powerful tools that give them straightforward access to large computing resources, with low barriers to entry. The…

    Submitted 30 November, 2023; originally announced December 2023.

    Comments: Submitted as proceedings for CHEP 2023 conference to The European Physical Journal

  4. arXiv:2302.01317  [pdf, other]

    hep-ex physics.acc-ph physics.comp-ph physics.ins-det

    IRIS-HEP Strategic Plan for the Next Phase of Software Upgrades for HL-LHC Physics

    Authors: Brian Bockelman, Peter Elmer, Gordon Watts

    Abstract: The quest to understand the fundamental building blocks of nature and their interactions is one of the oldest and most ambitious of human scientific endeavors. CERN's Large Hadron Collider (LHC) represents a huge step forward in this quest. The discovery of the Higgs boson, the observation of exceedingly rare decays of $B$ mesons, and stringent constraints on many viable theories of physics beyond…

    Submitted 2 February, 2023; originally announced February 2023.

  5. arXiv:2203.10161  [pdf, other]

    physics.data-an cs.SE hep-ex

    Collaborative Computing Support for Analysis Facilities Exploiting Software as Infrastructure Techniques

    Authors: Maria Acosta Flechas, Garhan Attebury, Kenneth Bloom, Brian Bockelman, Lindsey Gray, Burt Holzman, Carl Lundstedt, Oksana Shadura, Nicholas Smith, John Thiltges

    Abstract: Prior to the public release of Kubernetes it was difficult to conduct joint development of elaborate analysis facilities due to the highly non-homogeneous nature of hardware and network topology across compute facilities. However, since the advent of systems like Kubernetes and OpenShift, which provide declarative interfaces for building fault-tolerant and self-healing deployments of networked sof…

    Submitted 22 March, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

    Comments: contribution to Snowmass 2021

    Report number: FERMILAB-FN-1163-SCD

  6. arXiv:2203.08010  [pdf, other]

    hep-ex

    Analysis Facilities for HL-LHC

    Authors: Doug Benjamin, Kenneth Bloom, Brian Bockelman, Lincoln Bryant, Kyle Cranmer, Rob Gardner, Chris Hollowell, Burt Holzman, Eric Lançon, Ofer Rind, Oksana Shadura, Wei Yang

    Abstract: The HL-LHC presents significant challenges for the HEP analysis community. The number of events in each analysis is expected to increase by an order of magnitude and new techniques are expected to be required; both challenges necessitate new services and approaches for analysis facilities. These services are expected to provide new capabilities, a larger scale, and different access modalities (com…

    Submitted 16 March, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

    Comments: Contribution to Snowmass 2021

  7. arXiv:2112.03074  [pdf, other]

    cs.NI hep-ex

    The Service Analysis and Network Diagnosis DataPipeline

    Authors: Derek Weitzel, Shawn McKee, Brian Paul Bockelman, John Thiltges, Marian Babik, Ilija Vukotic

    Abstract: Modern network performance monitoring toolkits, such as perfSONAR, take a remarkable number of measurements about the local network environment. To gain a complete picture of network performance, however, one needs to aggregate data across a large number of endpoints. The Service Analysis and Network Diagnosis (SAND) data pipeline collects data from diverse sources and ingests these measurements i…

    Submitted 6 December, 2021; originally announced December 2021.

    Comments: 10 pages, to be published in 2021 IEEE Workshop on Innovating the Network for Data-Intensive Science

  8. Systematic benchmarking of HTTPS third party copy on 100Gbps links using XRootD

    Authors: Edgar Fajardo, Aashay Arora, Diego Davila, Richard Gao, Frank Würthwein, Brian Bockelman

    Abstract: The High Luminosity Large Hadron Collider presents a data challenge: the amount of data recorded by the experiments and transported to hundreds of sites will see a thirtyfold increase in annual data volume. This calls for a systematic approach to contrasting the performance of different Third Party Copy (TPC) transfer protocols. Two contenders, XRootD-HTTPS and GridFTP, are evaluated in their performanc…

    Submitted 22 March, 2021; originally announced March 2021.

    Comments: 7 pages, 8 figures

  9. Coffea-casa: an analysis facility prototype

    Authors: Matous Adamec, Garhan Attebury, Kenneth Bloom, Brian Bockelman, Carl Lundstedt, Oksana Shadura, John Thiltges

    Abstract: Data analysis in HEP has often relied on batch systems and event loops; users are given a non-interactive interface to computing resources and consider data event-by-event. The "Coffea-casa" prototype analysis facility is an effort to provide users with alternate mechanisms to access computing resources and enable new programming paradigms. Instead of the command-line interface and asynchronous ba…

    Submitted 29 June, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

    Comments: Submitted as proceedings for the 25th International Conference on Computing in High-Energy and Nuclear Physics (https://indico.cern.ch/event/948465/)

  10. An intelligent Data Delivery Service for and beyond the ATLAS experiment

    Authors: Wen Guan, Tadashi Maeno, Brian Paul Bockelman, Torre Wenaus, Fahui Lin, Siarhei Padolski, Rui Zhang, Aleksandr Alekseev

    Abstract: The intelligent Data Delivery Service (iDDS) has been developed to cope with the huge increase of computing and storage resource usage in the coming LHC data taking. iDDS has been designed to intelligently orchestrate workflow and data management systems, decoupling data pre-processing, delivery, and main processing in various workflows. It is an experiment-agnostic service around a workflow-orien…

    Submitted 28 February, 2021; originally announced March 2021.

    Comments: 6 pages, 5 figures

  11. arXiv:2011.14995  [pdf, other]

    cs.DC physics.comp-ph

    Adapting LIGO workflows to run in the Open Science Grid

    Authors: Edgar Fajardo, Frank Wuerthwein, Brian Bockelman, Miron Livny, Greg Thain, James Alexander Clark, Peter Couvares, Josh Willis

    Abstract: During the first observation run the LIGO collaboration needed to offload some of its most CPU-intensive workflows from its dedicated computing sites to opportunistic resources. Open Science Grid enabled LIGO to run PyCBC, RIFT and Bayeswave workflows seamlessly on a combination of owned and opportunistic resources. One of the challenges is enabling the workflows to use several heterogeneous…

    Submitted 30 November, 2020; originally announced November 2020.

  12. WLCG Authorisation from X.509 to Tokens

    Authors: Brian Bockelman, Andrea Ceccanti, Ian Collier, Linda Cornwall, Thomas Dack, Jaroslav Guenther, Mario Lassnig, Maarten Litmaath, Paul Millar, Mischa Sallé, Hannah Short, Jeny Teheran, Romain Wartel

    Abstract: The WLCG Authorisation Working Group was formed in July 2017 with the objective to understand and meet the needs of a future-looking Authentication and Authorisation Infrastructure (AAI) for WLCG experiments. Much has changed since the early 2000s when X.509 certificates presented the most suitable choice for authorisation within the grid; progress in token based authorisation and identity federat…

    Submitted 7 July, 2020; originally announced July 2020.

    Comments: 8 pages, 3 figures, to appear in the proceedings of CHEP 2019

  13. Third-party transfers in WLCG using HTTP

    Authors: Brian Bockelman, Andrea Ceccanti, Fabrizio Furano, Paul Millar, Dmitry Litvintsev, Alessandra Forti

    Abstract: Since its earliest days, the Worldwide LHC Computational Grid (WLCG) has relied on GridFTP to transfer data between sites. The announcement that Globus is dropping support of its open source Globus Toolkit (GT), which forms the basis for several FTP clients and servers, has created an opportunity to reevaluate the use of FTP. HTTP-TPC, an extension to HTTP compatible with WebDAV, has arisen as a st…

    Submitted 7 July, 2020; originally announced July 2020.

    Comments: 7 pages, 3 figures, to appear in the proceedings of CHEP 2020

  14. arXiv:2007.01791  [pdf]

    cs.DC hep-ex physics.ins-det

    Towards an Intelligent Data Delivery Service

    Authors: Wen Guan, Tadashi Maeno, Gancho Dimitrov, Brian Paul Bockelman, Torre Wenaus, Vakhtang Tsulaia, Nicolo Magini

    Abstract: The ATLAS Event Streaming Service (ESS) at the LHC is an approach to preprocess and deliver data for Event Service (ES) that has implemented a fine-grained approach for ATLAS event processing. The ESS allows one to asynchronously deliver only the input events required by ES processing, with the aim to decrease data traffic over WAN and improve overall data processing throughput. A prototype of ESS…

    Submitted 3 July, 2020; originally announced July 2020.

    Comments: 6 pages, 3 figures

  15. Creating a content delivery network for general science on the internet backbone using XCaches

    Authors: Edgar Fajardo, Marian Zvada, Derek Weitzel, Mats Rynge, John Hicks, Mat Selmeci, Brian Lin, Pascal Paschos, Brian Bockelman, Igor Sfiligoi, Andrew Hanushevsky, Frank Würthwein

    Abstract: A general problem faced by computing on the grid for opportunistic users is that delivering cycles is simpler than delivering data to those cycles. In this project we show how we integrated XRootD caches placed on the internet backbone to implement a content delivery network for general science workflows. We will show that for some workflows on different science domains like high energy physics, g…

    Submitted 28 September, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

  16. WLCG Networks: Update on Monitoring and Analytics

    Authors: Marian Babik, Shawn McKee, Pedro Andrade, Brian Paul Bockelman, Robert Gardner, Edgar Mauricio Fajardo Hernandez, Edoardo Martelli, Ilija Vukotic, Derek Weitzel, Marian Zvada

    Abstract: WLCG relies on the network as a critical part of its infrastructure and therefore needs to guarantee effective network usage and prompt detection and resolution of any network issues including connection failures, congestion and traffic routing. The OSG Networking Area, in partnership with WLCG, is focused on being the primary source of networking information for its partners and constituents. It…

    Submitted 1 July, 2020; originally announced July 2020.

    Comments: Accepted for publication in CHEP 2019 proceedings

  17. ROOT I/O compression improvements for HEP analysis

    Authors: Oksana Shadura, Brian Paul Bockelman, Philippe Canal, Danilo Piparo, Zhe Zhang

    Abstract: We give an overview of recent changes in the ROOT I/O system that increase its performance and improve its interaction with other data analysis ecosystems. The newly introduced compression algorithms, the much faster bulk I/O data path, and a few additional techniques have the potential to significantly improve experiments' software performance. The need for efficient lossless data compr…

    Submitted 8 April, 2020; originally announced April 2020.

    Comments: Submitted as a proceeding for CHEP 2019

  18. arXiv:2004.05729  [pdf, other]

    cs.DC

    Exploring Erasure Coding Techniques for High Availability of Intermediate Data

    Authors: Zhe Zhang, Brian Bockelman, Derek Weitzel, David Swanson

    Abstract: Scientific computing workflows generate enormous distributed data that is short-lived, yet critical for job completion time. This class of data is called intermediate data. A common way to achieve high data availability is to replicate data. However, an increasing scale of intermediate data generated in modern scientific applications demands new storage techniques to improve storage efficiency. Er…

    Submitted 12 April, 2020; originally announced April 2020.

  19. arXiv:2004.05723  [pdf, other]

    cs.DC

    Trua: Efficient Task Replication for Flexible User-defined Availability in Scientific Grids

    Authors: Zhe Zhang, Brian Bockelman, Derek Weitzel, Xinkai Zhang, Hamid Vakilzadian, David Swanson

    Abstract: Failure is inevitable in scientific computing. As scientific applications and facilities have grown in scale over the last decades, finding the root cause of a failure can be very complex or at times nearly impossible. Different scientific computing customers have varying availability demands as well as a diverse willingness to pay for availability. In contrast to existing solutions that try to…

    Submitted 12 April, 2020; originally announced April 2020.

  20. Speeding HEP Analysis with ROOT Bulk I/O

    Authors: Brian Bockelman, Zhe Zhang, Oksana Shadura

    Abstract: Distinct HEP workflows have distinct I/O needs; while ROOT I/O excels at serializing complex C++ objects common to reconstruction, analysis workflows typically have simpler objects and can sustain higher event rates. To meet the needs of these workflows, we have developed a "bulk I/O" interface, allowing data for multiple events to be returned per library call. This reduces ROOT-related overheads and increases eve…

    Submitted 11 June, 2019; originally announced June 2019.

    Comments: Submitted to proceedings of ACAT 2019

  21. ROOT I/O compression algorithms and their performance impact within Run 3

    Authors: Oksana Shadura, Brian Paul Bockelman

    Abstract: The LHC's Run 3 will push the envelope on data-intensive workflows and, since at the lowest level this data is managed using the ROOT software framework, preparations for managing this data are starting already. At the beginning of LHC Run 1, all ROOT data was compressed with the ZLIB algorithm; since then, ROOT has added support for additional algorithms such as LZMA and LZ4, each with unique stren…

    Submitted 2 August, 2019; v1 submitted 11 June, 2019; originally announced June 2019.

    Comments: Submitted to proceedings of ACAT 2019

  22. Evolution of ROOT package management

    Authors: Oksana Shadura, Brian Paul Bockelman, Vassil Vassilev

    Abstract: ROOT is a large code base with a complex set of build-time dependencies; there is a significant difference in compilation time between the "core" of ROOT and the full-fledged deployment. We present results on a "delayed build" for internal ROOT packages and external packages. This gives the ability to offer a "lightweight" core of ROOT, later extended by building additional modules to extend the f…

    Submitted 11 June, 2019; originally announced June 2019.

    Comments: Submitted to proceedings of ACAT 2019

  23. SciTokens: Demonstrating Capability-Based Access to Remote Scientific Data using HTCondor

    Authors: Alex Withers, Brian Bockelman, Derek Weitzel, Duncan Brown, Jason Patton, Jeff Gaynor, Jim Basney, Todd Tannenbaum, You Alex Gao, Zach Miller

    Abstract: The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the p…

    Submitted 22 May, 2019; originally announced May 2019.

    Comments: 8 pages, 3 figures, PEARC '19: Practice and Experience in Advanced Research Computing, July 28-August 1, 2019, Chicago, IL, USA. arXiv admin note: substantial text overlap with arXiv:1807.04728

  24. StashCache: A Distributed Caching Federation for the Open Science Grid

    Authors: Derek Weitzel, Marian Zvada, Ilija Vukotic, Rob Gardner, Brian Bockelman, Mats Rynge, Edgar Fajardo Hernandez, Brian Lin, Matyas Selmeci

    Abstract: Data distribution for opportunistic users is challenging as they own neither the computing resources they are using nor any nearby storage. Users are motivated to use opportunistic computing to expand their data processing capacity, but they require storage and fast networking to distribute data to that processing. Since it requires significant management overhead, it is rare for resource providers…

    Submitted 16 May, 2019; originally announced May 2019.

    Comments: In Practice and Experience in Advanced Research Computing (PEARC 19), July 28-August 1, 2019, Chicago, IL, USA. ACM, New York, NY, USA, 7 pages

  25. arXiv:1903.04615  [pdf, other]

    astro-ph.IM astro-ph.CO gr-qc

    The US Program in Ground-Based Gravitational Wave Science: Contribution from the LIGO Laboratory

    Authors: David Reitze, Rich Abbott, Carl Adams, Rana Adhikari, Nancy Aggarwal, Shreya Anand, Alena Ananyeva, Stuart Anderson, Stephen Appert, Koji Arai, Melody Araya, Stuart Aston, Juan Barayoga, Barry Barish, David Barker, Lisa Barsotti, Jeffrey Bartlett, Joseph Betzwieser, GariLynn Billingsley, Sebastien Biscans, Sylvia Biscoveanu, Kent Blackburn, Carl Blair, Ryan Blair, Brian Bockelman, et al. (159 additional authors not shown)

    Abstract: Recent gravitational-wave observations from the LIGO and Virgo observatories have brought a sense of great excitement to scientists and citizens the world over. Since September 2015, 10 binary black hole coalescences and one binary neutron star coalescence have been observed. They have provided remarkable, revolutionary insight into the "gravitational Universe" and have greatly extended the field o…

    Submitted 11 March, 2019; originally announced March 2019.

    Comments: For the 2020 Astro decadal

    Journal ref: 2019 BAAS 51(3) 141

  26. Rucio - Scientific data management

    Authors: Martin Barisits, Thomas Beermann, Frank Berghaus, Brian Bockelman, Joaquin Bogado, David Cameron, Dimitrios Christidis, Diego Ciangottini, Gancho Dimitrov, Markus Elsing, Vincent Garonne, Alessandro di Girolamo, Luc Goossens, Wen Guan, Jaroslav Guenther, Tomas Javurek, Dietmar Kuhn, Mario Lassnig, Fernando Lopez, Nicolo Magini, Angelos Molfetas, Armin Nairz, Farid Ould-Saada, Stefan Prenner, Cedric Serfon, et al. (5 additional authors not shown)

    Abstract: Rucio is an open-source software framework that provides scientific collaborations with the functionality to organize, manage, and access their data at scale. The data can be distributed across heterogeneous data centers at widely distributed locations. Rucio was originally developed to meet the requirements of the high-energy physics experiment ATLAS, and now is continuously extended to support t…

    Submitted 6 June, 2019; v1 submitted 26 February, 2019; originally announced February 2019.

    Comments: 21 pages, 11 figures

    Report number: 2510-2044

    Journal ref: Barisits, M., Beermann, T., Berghaus, F. et al. Comput Softw Big Sci (2019) 3: 11

  27. Continuous Performance Benchmarking Framework for ROOT

    Authors: Oksana Shadura, Vassil Vassilev, Brian Paul Bockelman

    Abstract: Foundational software libraries such as ROOT are under intense pressure to avoid software regression, including performance regressions. Continuous performance benchmarking, as a part of continuous integration and other code quality testing, is an industry best-practice to understand how the performance of a software product evolves over time. We present a framework, built from industry best pract…

    Submitted 21 February, 2019; v1 submitted 7 December, 2018; originally announced December 2018.

    Comments: 8 pages, 5 figures, CHEP 2018 - 23rd International Conference on Computing in High Energy and Nuclear Physics

  28. Extending ROOT through Modules

    Authors: Oksana Shadura, Brian Paul Bockelman, Vassil Vassilev

    Abstract: The ROOT software framework is foundational for the HEP ecosystem, providing capabilities such as IO, a C++ interpreter, GUI, and math libraries. It uses object-oriented concepts and build-time components to layer between them. We believe additional layering formalisms will benefit ROOT and its users. We present the modularization strategy for ROOT which aims to formalize the description of existi…

    Submitted 11 December, 2018; v1 submitted 7 December, 2018; originally announced December 2018.

    Comments: 8 pages, 2 figures, 1 listing, CHEP 2018 - 23rd International Conference on Computing in High Energy and Nuclear Physics

  29. arXiv:1812.00761  [pdf, ps, other]

    physics.comp-ph

    HEP Software Foundation Community White Paper Working Group -- Data Organization, Management and Access (DOMA)

    Authors: Dario Berzano, Riccardo Maria Bianchi, Ian Bird, Brian Bockelman, Simone Campana, Kaushik De, Dirk Duellmann, Peter Elmer, Robert Gardner, Vincent Garonne, Claudio Grandi, Oliver Gutsche, Andrew Hanushevsky, Burt Holzman, Bodhitha Jayatilaka, Ivo Jimenez, Michel Jouvin, Oliver Keeble, Alexei Klimentov, Valentin Kuznetsov, Eric Lancon, Mario Lassnig, Miron Livny, Carlos Maltzahn, Shawn McKee, et al. (13 additional authors not shown)

    Abstract: Without significant changes to data organization, management, and access (DOMA), HEP experiments will find scientific output limited by how fast data can be accessed and digested by computational resources. In this white paper we discuss challenges in DOMA that HEP experiments, such as the HL-LHC, will face as well as potential ways to address them. A research and development timeline to assess th…

    Submitted 30 November, 2018; originally announced December 2018.

    Comments: arXiv admin note: text overlap with arXiv:1712.06592

    Report number: HSF-CWP-2017-04

  30. Discovering Job Preemptions in the Open Science Grid

    Authors: Zhe Zhang, Brian Bockelman, Derek Weitzel, David Swanson

    Abstract: The Open Science Grid (OSG) is a world-wide computing system which facilitates distributed computing for scientific research. It can distribute a computationally intensive job to geo-distributed clusters and process the job's tasks in parallel. For compute clusters on the OSG, physical resources may be shared between OSG and the cluster's local user-submitted jobs, with local jobs preempting OSG-based ones…

    Submitted 17 July, 2018; originally announced July 2018.

    Comments: 8 pages

  31. SciTokens: Capability-Based Secure Access to Remote Scientific Data

    Authors: Alex Withers, Brian Bockelman, Derek Weitzel, Duncan Brown, Jeff Gaynor, Jim Basney, Todd Tannenbaum, Zach Miller

    Abstract: The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the p…

    Submitted 12 July, 2018; originally announced July 2018.

    Comments: 8 pages, 6 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22--26, 2018, Pittsburgh, PA, USA

  32. arXiv:1804.03983  [pdf, other]

    physics.comp-ph hep-ex

    HEP Software Foundation Community White Paper Working Group - Data Analysis and Interpretation

    Authors: Lothar Bauerdick, Riccardo Maria Bianchi, Brian Bockelman, Nuno Castro, Kyle Cranmer, Peter Elmer, Robert Gardner, Maria Girone, Oliver Gutsche, Benedikt Hegner, José M. Hernández, Bodhitha Jayatilaka, David Lange, Mark S. Neubauer, Daniel S. Katz, Lukasz Kreczko, James Letts, Shawn McKee, Christoph Paus, Kevin Pedro, Jim Pivarski, Martin Ritter, Eduardo Rodrigues, Tai Sakuma, Elizabeth Sexton-Kennedy, et al. (4 additional authors not shown)

    Abstract: At the heart of experimental high energy physics (HEP) is the development of facilities and instrumentation that provide sensitivity to new phenomena. Our understanding of nature at its most fundamental level is advanced through the analysis and interpretation of data from sophisticated detectors in HEP experiments. The goal of data analysis systems is to realize the maximum possible scientific po…

    Submitted 9 April, 2018; originally announced April 2018.

    Comments: arXiv admin note: text overlap with arXiv:1712.06592

    Report number: HSF-CWP-2017-05

  33. arXiv:1804.03326  [pdf, other]

    cs.DC

    Increasing Parallelism in the ROOT I/O Subsystem

    Authors: Guilherme Amadio, Brian Bockelman, Philippe Canal, Danilo Piparo, Enric Tejedor, Zhe Zhang

    Abstract: When processing large amounts of data, the rate at which reading and writing can take place is a critical factor. High energy physics data processing relying on ROOT is no exception. The recent parallelisation of LHC experiments' software frameworks and the analysis of the ever increasing amount of collision data collected by experiments further emphasized this issue underlying the need of increas…

    Submitted 9 April, 2018; originally announced April 2018.

  34. arXiv:1712.06982  [pdf, other]

    physics.comp-ph hep-ex

    A Roadmap for HEP Software and Computing R&D for the 2020s

    Authors: Johannes Albrecht, Antonio Augusto Alves Jr, Guilherme Amadio, Giuseppe Andronico, Nguyen Anh-Ky, Laurent Aphecetche, John Apostolakis, Makoto Asai, Luca Atzori, Marian Babik, Giuseppe Bagliesi, Marilena Bandieramonte, Sunanda Banerjee, Martin Barisits, Lothar A. T. Bauerdick, Stefano Belforte, Douglas Benjamin, Catrin Bernius, Wahid Bhimji, Riccardo Maria Bianchi, Ian Bird, Catherine Biscarat, Jakob Blomer, Kenneth Bloom, Tommaso Boccali, et al. (285 additional authors not shown)

    Abstract: Particle physics has an ambitious and broad experimental programme for the coming decades. This programme requires large investments in detector hardware, either to build new facilities and experiments, or to upgrade existing ones. Similarly, it requires commensurate investment in the R&D of software to acquire, manage, process, and analyse the sheer amounts of data to be recorded. In planning for…

    Submitted 19 December, 2018; v1 submitted 18 December, 2017; originally announced December 2017.

    Report number: HSF-CWP-2017-01

    Journal ref: Comput Softw Big Sci (2019) 3, 7

  35. arXiv:1711.02659  [pdf, other]

    cs.DC

    Optimizing ROOT IO For Analysis

    Authors: Brian Bockelman, Zhe Zhang, Jim Pivarski

    Abstract: The ROOT I/O (RIO) subsystem is foundational to most HEP experiments - it provides a file format, a set of APIs/semantics, and a reference implementation in C++. It is often found at the base of an experiment's framework and is used to serialize the experiment's data; in the case of an LHC experiment, this may be hundreds of petabytes of files! Individual physicists will further use RIO to perform…

    Submitted 7 November, 2017; originally announced November 2017.

    Comments: 18th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT)

  36. arXiv:1710.00100  [pdf, other]

    cs.DC physics.comp-ph

    HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation

    Authors: Burt Holzman, Lothar A. T. Bauerdick, Brian Bockelman, Dave Dykstra, Ian Fisk, Stuart Fuess, Gabriele Garzoglio, Maria Girone, Oliver Gutsche, Dirk Hufnagel, Hyunwoo Kim, Robert Kennedy, Nicolo Magini, David Mason, Panagiotis Spentzouris, Anthony Tiradani, Steve Timm, Eric W. Vaandering

    Abstract: Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and intended to be able to be flexibly de…

    Submitted 29 September, 2017; originally announced October 2017.

    Comments: 15 pages, 9 figures

    Journal ref: Comput Softw Big Sci (2017) 1:1

  37. arXiv:1708.08319  [pdf, other]

    cs.PL cs.DB cs.IR

    Fast Access to Columnar, Hierarchically Nested Data via Code Transformation

    Authors: Jim Pivarski, Peter Elmer, Brian Bockelman, Zhe Zhang

    Abstract: Big Data query systems represent data in a columnar format for fast, selective access, and in some cases (e.g. Apache Drill), perform calculations directly on the columnar data without row materialization, avoiding runtime costs. However, many analysis procedures cannot be easily or efficiently expressed as SQL. In High Energy Physics, the majority of data processing requires nested loops with c…

    Submitted 3 November, 2017; v1 submitted 20 August, 2017; originally announced August 2017.

    Comments: 10 pages, 2 figures, submitted to IEEE Big Data

  38. arXiv:1705.06202  [pdf, other]

    cs.DC astro-ph.IM

    Data Access for LIGO on the OSG

    Authors: Derek Weitzel, Brian Bockelman, Duncan A. Brown, Peter Couvares, Frank Würthwein, Edgar Fajardo Hernandez

    Abstract: During 2015 and 2016, the Laser Interferometer Gravitational-Wave Observatory (LIGO) conducted a three-month observing campaign. These observations delivered the first direct detection of gravitational waves from binary black hole mergers. To search for these signals, the LIGO Scientific Collaboration uses the PyCBC search pipeline. To deliver science results in a timely manner, LIGO collaborated…

    Submitted 17 May, 2017; originally announced May 2017.

    Comments: 6 pages, 3 figures, submitted to PEARC17

  39. Exploring compression techniques for ROOT IO

    Authors: Zhe Zhang, Brian Bockelman

    Abstract: ROOT provides a flexible format used throughout the HEP community. The number of use cases - from an archival data format to end-stage analysis - has required a number of tradeoffs to be exposed to the user. For example, a high "compression level" in the traditional DEFLATE algorithm will result in a smaller file (saving disk space) at the cost of slower decompression (costing CPU time when read)…

    Submitted 23 April, 2017; originally announced April 2017.

    Comments: Proceedings for 22nd International Conference on Computing in High Energy and Nuclear Physics (CHEP 2016)

  40. arXiv:1603.09303  [pdf, other

    physics.comp-ph astro-ph.CO hep-ex hep-lat hep-ph

    ASCR/HEP Exascale Requirements Review Report

    Authors: Salman Habib, Robert Roser, Richard Gerber, Katie Antypas, Katherine Riley, Tim Williams, Jack Wells, Tjerk Straatsma, A. Almgren, J. Amundson, S. Bailey, D. Bard, K. Bloom, B. Bockelman, A. Borgland, J. Borrill, R. Boughezal, R. Brower, B. Cowan, H. Finkel, N. Frontiere, S. Fuess, L. Ge, N. Gnedin, S. Gottlieb , et al. (29 additional authors not shown)

    Abstract: This draft report summarizes and details the findings, results, and recommendations derived from the ASCR/HEP Exascale Requirements Review meeting held in June, 2015. The main conclusions are as follows. 1) Larger, more capable computing and data facilities are needed to support HEP science goals in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of the demand at the 2025 ti…

    Submitted 31 March, 2016; v1 submitted 30 March, 2016; originally announced March 2016.

    Comments: 77 pages, 13 Figures; draft report, subject to further revision

  41. arXiv:1508.01443  [pdf, other

    physics.comp-ph cs.DC hep-ex physics.ins-det

    Any Data, Any Time, Anywhere: Global Data Access for Science

    Authors: Kenneth Bloom, Tommaso Boccali, Brian Bockelman, Daniel Bradley, Sridhara Dasu, Jeff Dost, Federica Fanzago, Igor Sfiligoi, Alja Mrak Tadel, Matevz Tadel, Carl Vuosalo, Frank Würthwein, Avi Yagil, Marian Zvada

    Abstract: Data access is key to science driven by distributed high-throughput computing (DHTC), an essential technology for many major research projects such as High Energy Physics (HEP) experiments. However, achieving efficient data access becomes quite difficult when many independent storage sites are involved because users are burdened with learning the intricacies of accessing each system and keeping ca…

    Submitted 6 August, 2015; originally announced August 2015.

    Comments: 9 pages, 6 figures, submitted to 2nd IEEE/ACM International Symposium on Big Data Computing (BDC) 2015

  42. Designing Computing System Architecture and Models for the HL-LHC era

    Authors: Lothar Bauerdick, Brian Bockelman, Peter Elmer, Stephen Gowdy, Matevz Tadel, Frank Wuerthwein

    Abstract: This paper describes a programme to study the computing model in CMS after the next long shutdown near the end of the decade.

    Submitted 20 July, 2015; originally announced July 2015.

    Comments: Submitted to proceedings of the 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), Okinawa, Japan

  43. arXiv:1410.3441  [pdf, other

    cs.DC hep-ex physics.comp-ph

    Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    Authors: David Abdurachmanov, Brian Bockelman, Peter Elmer, Giulio Eulisse, Robert Knight, Shahzad Muzaffar

    Abstract: Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with special…

    Submitted 10 October, 2014; originally announced October 2014.

    Comments: Submitted to proceedings of 16th International workshop on Advanced Computing and Analysis Techniques in physics research (ACAT 2014), Prague