DOI: 10.1145/3219104.3219127

Globus Platform Services for Data Publication

Published: 22 July 2018

Abstract

Data publication systems are typically tailored to the requirements and processes of a specific domain, collaboration, and/or use case. We propose here an alternative approach to engineering such systems, based on customizable compositions of simple, independent platform services, each of which provides a distinct function such as identification, metadata association, and discovery. We argue that this approach can reduce costs and increase flexibility and overall service quality. We describe a collection of such services that we are developing within Globus, which initially provide persistent identifier association, data management, and discovery capabilities; we are also working towards an automation service that can reliably and flexibly coordinate these and other services to satisfy varied user needs. We describe data publication use cases that motivate our design, present our vision for a data publication platform, and report on current implementation status.

Information

Published In

PEARC '18: Proceedings of the Practice and Experience on Advanced Research Computing: Seamless Creativity
July 2018
652 pages
ISBN: 9781450364461
DOI: 10.1145/3219104
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Globus
  2. Platform-as-a-Service
  3. data publication

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

PEARC '18

Acceptance Rates

PEARC '18 Paper Acceptance Rate 79 of 123 submissions, 64%;
Overall Acceptance Rate 133 of 202 submissions, 66%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 14
  • Downloads (last 6 weeks): 2

Download counts reflect downloads up to 19 Nov 2024.

Citations

Cited By

  • (2024) Foundry-ML - Software and Services to Simplify Access to Machine Learning Datasets in Materials Science. Journal of Open Source Software 9:93 (5467). DOI: 10.21105/joss.05467. Online publication date: Jan-2024
  • (2023) Globus automation services. Future Generation Computer Systems 142:C (393-409). DOI: 10.1016/j.future.2023.01.010. Online publication date: 1-May-2023
  • (2022) Ultrafast Focus Detection for Automated Microscopy. Computational Science – ICCS 2022 (403-416). DOI: 10.1007/978-3-031-08751-6_29. Online publication date: 21-Jun-2022
  • (2022) Braid-DB: Toward AI-Driven Science with Machine Learning Provenance. Driving Scientific and Engineering Discoveries Through the Integration of Experiment, Big Data, and Modeling and Simulation (247-261). DOI: 10.1007/978-3-030-96498-6_14. Online publication date: 10-Mar-2022
  • (2022) High-Performance Ptychographic Reconstruction with Federated Facilities. Driving Scientific and Engineering Discoveries Through the Integration of Experiment, Big Data, and Modeling and Simulation (173-189). DOI: 10.1007/978-3-030-96498-6_10. Online publication date: 10-Mar-2022
  • (2021) Ultrafast Focus Detection for Automated Microscopy. 2021 IEEE 17th International Conference on eScience (eScience) (237-238). DOI: 10.1109/eScience51609.2021.00039. Online publication date: Sep-2021
  • (2021) Bridging Data Center AI Systems with Edge Computing for Actionable Information Retrieval. 2021 3rd Annual Workshop on Extreme-scale Experiment-in-the-Loop Computing (XLOOP) (15-23). DOI: 10.1109/XLOOP54565.2021.00008. Online publication date: Nov-2021
  • (2020) Mitigating Uncertainty in Developing and Applying Scientific Applications in an Integrated Computing Environment. Programming and Computing Software 46:8 (483-502). DOI: 10.1134/S036176882008023X. Online publication date: 1-Dec-2020
  • (2020) A new Collaborative Platform for Research in Smart Farming. Procedia Computer Science 177 (450-455). DOI: 10.1016/j.procs.2020.10.061. Online publication date: 2020
  • (2019) Serverless Workflows for Indexing Large Scientific Data. Proceedings of the 5th International Workshop on Serverless Computing (43-48). DOI: 10.1145/3366623.3368140. Online publication date: 9-Dec-2019
