DOI: 10.1145/3332186.3332212
research-article
Public Access

StashCache: A Distributed Caching Federation for the Open Science Grid

Published: 28 July 2019

Abstract

Data distribution for opportunistic users is challenging because they own neither the computing resources they are using nor any nearby storage. Users are motivated to use opportunistic computing to expand their data processing capacity, but they require storage and fast networking to distribute data to that processing. Because managing it carries significant overhead, resource providers rarely allow opportunistic access to storage. Additionally, to use opportunistic storage at several distributed sites, users must assume responsibility for maintaining their data.
In this paper we present StashCache, a distributed caching federation that enables opportunistic users to utilize nearby opportunistic storage. StashCache comprises four components: data origins, redirectors, caches, and clients. StashCache has been deployed in the Open Science Grid for several years and has been used by many projects. Caches are deployed in geographically distributed locations across the U.S. and Europe. We present the architecture of StashCache, as well as utilization information for the infrastructure. We also present a performance analysis comparing distributed HTTP proxies with StashCache.
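
The abstract names the four components but does not describe how a request moves between them. The following is a minimal sketch of one plausible flow, not the StashCache implementation: a client picks the nearest cache, and on a miss the cache asks a redirector which origin holds the file, fetches it, and keeps a copy. All class names, the `distance` field, and the miss-handling logic are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Origin:
    """Authoritative copy of a project's files (hypothetical model)."""
    name: str
    files: dict


@dataclass
class Redirector:
    """Maps a requested path to the origin that holds it (assumed role)."""
    origins: list

    def locate(self, path):
        for origin in self.origins:
            if path in origin.files:
                return origin
        raise FileNotFoundError(path)


@dataclass
class Cache:
    """Opportunistic storage near the compute site (hypothetical model)."""
    name: str
    distance: float  # assumed stand-in for network proximity to the client
    redirector: Redirector
    store: dict = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def read(self, path):
        # Serve from local storage when the file is already cached.
        if path in self.store:
            self.hits += 1
            return self.store[path]
        # On a miss, locate the origin through the redirector and keep a copy.
        self.misses += 1
        data = self.redirector.locate(path).files[path]
        self.store[path] = data
        return data


def client_fetch(caches, path):
    """Client side: read the file through the nearest cache."""
    nearest = min(caches, key=lambda cache: cache.distance)
    return nearest.read(path)


origin = Origin("project-origin", {"/data/run1.dat": b"payload"})
redirector = Redirector([origin])
near = Cache("near-cache", 1.0, redirector)
far = Cache("far-cache", 9.0, redirector)

first = client_fetch([far, near], "/data/run1.dat")   # miss, pulled from origin
second = client_fetch([far, near], "/data/run1.dat")  # hit, served by near-cache
```

In this sketch the second read never reaches the origin, which is the benefit a nearby cache is meant to give users who own no storage at the site where their jobs run.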



Information

Published In

PEARC '19: Practice and Experience in Advanced Research Computing 2019: Rise of the Machines (learning)
July 2019
775 pages
ISBN: 9781450372275
DOI: 10.1145/3332186
General Chair: Tom Furlani
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

PEARC '19

Acceptance Rates

Overall Acceptance Rate 133 of 202 submissions, 66%
