Search | arXiv e-print repository

arXiv:1905.09816 [pdf, other]

doi 10.1145/3332186.3333258

SciTokens: Demonstrating Capability-Based Access to Remote Scientific Data using HTCondor

Authors: Alex Withers, Brian Bockelman, Derek Weitzel, Duncan Brown, Jason Patton, Jeff Gaynor, Jim Basney, Todd Tannenbaum, You Alex Gao, Zach Miller

Abstract: The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the p… ▽ More The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the problems, re-run their computations, and wait longer for their results. SciTokens introduces a capabilities-based authorization infrastructure for distributed scientific computing, to help scientists manage their security credentials more reliably and securely. SciTokens uses IETF-standard OAuth JSON Web Tokens for capability-based secure access to remote scientific data. These access tokens convey the specific authorizations needed by the workflows, rather than general-purpose authentication impersonation credentials, to address the risks of scientific workflows running on distributed infrastructure including NSF resources (e.g., LIGO Data Grid, Open Science Grid, XSEDE) and public clouds (e.g., Amazon Web Services, Google Cloud, Microsoft Azure). By improving the interoperability and security of scientific workflows, SciTokens 1) enables use of distributed computing for scientific domains that require greater data protection and 2) enables use of more widely distributed computing resources by reducing the risk of credential abuse on remote systems. In this extended abstract, we present the results over the past year of our open source implementation of the SciTokens model and its deployment in the Open Science Grid, including new OAuth support added in the HTCondor 8.8 release series. △ Less

Submitted 22 May, 2019; originally announced May 2019.

Comments: 8 pages, 3 figures, PEARC '19: Practice and Experience in Advanced Research Computing, July 28-August 1, 2019, Chicago, IL, USA. arXiv admin note: substantial text overlap with arXiv:1807.04728

arXiv:1807.04728 [pdf, other]

doi 10.1145/3219104.3219135

SciTokens: Capability-Based Secure Access to Remote Scientific Data

Authors: Alex Withers, Brian Bockelman, Derek Weitzel, Duncan Brown, Jeff Gaynor, Jim Basney, Todd Tannenbaum, Zach Miller

Abstract: The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the p… ▽ More The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the problems, re-run their computations, and wait longer for their results. In this paper, we introduce SciTokens, open source software to help scientists manage their security credentials more reliably and securely. We describe the SciTokens system architecture, design, and implementation addressing use cases from the Laser Interferometer Gravitational-Wave Observatory (LIGO) Scientific Collaboration and the Large Synoptic Survey Telescope (LSST) projects. We also present our integration with widely-used software that supports distributed scientific computing, including HTCondor, CVMFS, and XrootD. SciTokens uses IETF-standard OAuth tokens for capability-based secure access to remote scientific data. The access tokens convey the specific authorizations needed by the workflows, rather than general-purpose authentication impersonation credentials, to address the risks of scientific workflows running on distributed infrastructure including NSF resources (e.g., LIGO Data Grid, Open Science Grid, XSEDE) and public clouds (e.g., Amazon Web Services, Google Cloud, Microsoft Azure). By improving the interoperability and security of scientific workflows, SciTokens 1) enables use of distributed computing for scientific domains that require greater data protection and 2) enables use of more widely distributed computing resources by reducing the risk of credential abuse on remote systems. △ Less

Submitted 12 July, 2018; originally announced July 2018.

Comments: 8 pages, 6 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22--26, 2018, Pittsburgh, PA, USA

arXiv:1011.0715 [pdf, ps, other]

doi 10.1088/1742-6596/219/4/042017

Flexible Session Management in a Distributed Environment

Authors: Zach Miller, Dan Bradley, Todd Tannenbaum, Igor Sfiligoi

Abstract: Many secure communication libraries used by distributed systems, such as SSL, TLS, and Kerberos, fail to make a clear distinction between the authentication, session, and communication layers. In this paper we introduce CEDAR, the secure communication library used by the Condor High Throughput Computing software, and present the advantages to a distributed computing system resulting from CEDAR's s… ▽ More Many secure communication libraries used by distributed systems, such as SSL, TLS, and Kerberos, fail to make a clear distinction between the authentication, session, and communication layers. In this paper we introduce CEDAR, the secure communication library used by the Condor High Throughput Computing software, and present the advantages to a distributed computing system resulting from CEDAR's separation of these layers. Regardless of the authentication method used, CEDAR establishes a secure session key, which has the flexibility to be used for multiple capabilities. We demonstrate how a layered approach to security sessions can avoid round-trips and latency inherent in network authentication. The creation of a distinct session management layer allows for optimizations to improve scalability by way of delegating sessions to other components in the system. This session delegation creates a chain of trust that reduces the overhead of establishing secure connections and enables centralized enforcement of system-wide security policies. Additionally, secure channels based upon UDP datagrams are often overlooked by existing libraries; we show how CEDAR's structure accommodates this as well. As an example of the utility of this work, we show how the use of delegated security sessions and other techniques inherent in CEDAR's architecture enables US CMS to meet their scalability requirements in deploying Condor over large-scale, wide-area grid systems. △ Less

Submitted 2 November, 2010; originally announced November 2010.

Journal ref: J.Phys.Conf.Ser.219:042017,2010

arXiv:cs/0307007 [pdf, ps, other]

Management of Grid Jobs and Information within SAMGrid

Authors: A. Baranovski, G. Garzoglio, A. Kreymer, L. Lueking, S. Stonjek, I. Terekhov, F. Wuerthwein, A. Roy, P. Mhashikar, V. Murthi, T. Tannenbaum, R. Walker, F. Ratnikov, T. Rockwell

Abstract: We describe some of the key aspects of the SAMGrid system, used by the D0 and CDF experiments at Fermilab. Having sustained success of the data handling part of SAMGrid, we have developed new services for job and information services. Our job management is rooted in \CondorG and uses enhancements that are general applicability for HEP grids. Our information system is based on a uniform framework… ▽ More We describe some of the key aspects of the SAMGrid system, used by the D0 and CDF experiments at Fermilab. Having sustained success of the data handling part of SAMGrid, we have developed new services for job and information services. Our job management is rooted in \CondorG and uses enhancements that are general applicability for HEP grids. Our information system is based on a uniform framework for configuration management based on XML data representation and processing. △ Less

Submitted 8 July, 2003; v1 submitted 3 July, 2003; originally announced July 2003.

Comments: 7 pages including figures, presented at CHEP 2003

ACM Class: c.1.4

Journal ref: ECONF C0303241:TUAT002,2003

arXiv:cs/0305066 [pdf, ps, other]

The CMS Integration Grid Testbed

Authors: Gregory E. Graham, M. Anzar Afaq, Shafqat Aziz, L. A. T. Bauerdick, Michael Ernst, Joseph Kaiser, Natalia Ratnikova, Hans Wenzel, Yujun Wu, Erik Aslakson, Julian Bunn, Saima Iqbal, Iosif Legrand, Harvey Newman, Suresh Singh, Conrad Steenberg, James Branson, Ian Fisk, James Letts, Adam Arbree, Paul Avery, Dimitri Bourilkov, Richard Cavanaugh, Jorge Rodriguez, Suchindra Kategari , et al. (5 additional authors not shown)

Abstract: The CMS Integration Grid Testbed (IGT) comprises USCMS Tier-1 and Tier-2 hardware at the following sites: the California Institute of Technology, Fermi National Accelerator Laboratory, the University of California at San Diego, and the University of Florida at Gainesville. The IGT runs jobs using the Globus Toolkit with a DAGMan and Condor-G front end. The virtual organization (VO) is managed us… ▽ More The CMS Integration Grid Testbed (IGT) comprises USCMS Tier-1 and Tier-2 hardware at the following sites: the California Institute of Technology, Fermi National Accelerator Laboratory, the University of California at San Diego, and the University of Florida at Gainesville. The IGT runs jobs using the Globus Toolkit with a DAGMan and Condor-G front end. The virtual organization (VO) is managed using VO management scripts from the European Data Grid (EDG). Gridwide monitoring is accomplished using local tools such as Ganglia interfaced into the Globus Metadata Directory Service (MDS) and the agent based Mona Lisa. Domain specific software is packaged and installed using the Distrib ution After Release (DAR) tool of CMS, while middleware under the auspices of the Virtual Data Toolkit (VDT) is distributed using Pacman. During a continuo us two month span in Fall of 2002, over 1 million official CMS GEANT based Monte Carlo events were generated and returned to CERN for analysis while being demonstrated at SC2002. In this paper, we describe the process that led to one of the world's first continuously available, functioning grids. △ Less

Submitted 10 June, 2003; v1 submitted 30 May, 2003; originally announced May 2003.

Comments: CHEP 2003 MOCT010

ACM Class: A.0; C.2.4

Journal ref: eConfC0303241:MOCT010B,2003

Showing 1–5 of 5 results for author: Tannenbaum, T