research-article

Parallel I/O prefetching using MPI file caching and I/O signatures

Authors:

William Gropp

SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing

Article No.: 44, Pages 1 - 12

Published: 15 November 2008

Abstract

Parallel I/O prefetching is considered effective in improving I/O performance. However, its effectiveness depends on identifying patterns among future I/O accesses quickly and fetching data in time, which is difficult to achieve in general. In this study, we propose an I/O signature-based prefetching strategy. The idea is to use a predetermined I/O signature of an application to guide prefetching. To put this idea to work, we first derived a classification of patterns and introduced a simple and effective signature notation to represent them. We then developed a toolkit to trace and generate I/O signatures automatically. Finally, we designed and implemented a thread-based, client-side collective prefetching cache layer for the MPI-IO library to support prefetching. A prefetching thread reads the I/O signatures of an application and adjusts them by observing I/O accesses at runtime. Experimental results show that the proposed prefetching method improves I/O performance significantly for applications with complex patterns.
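To make the idea of signature-guided prefetching concrete, the following is a minimal sketch and not the paper's implementation. It assumes a hypothetical signature consisting of a start offset, request size, stride, and repetition count; the abstract does not give the actual notation. The names `StridedSignature` and `SignaturePrefetcher` are invented for illustration, the "cache" is an in-memory dictionary rather than an MPI-IO cache layer, and everything runs in a single thread. The sketch shows two behaviors the abstract describes: fetching ahead according to a predetermined signature, and adjusting the signature when an observed access does not match the prediction.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class StridedSignature:
    """Hypothetical signature: `count` reads of `size` bytes, `stride` bytes apart."""
    start: int
    size: int
    stride: int
    count: int

    def offset_at(self, index: int) -> int:
        # File offset predicted for the index-th access in the pattern.
        return self.start + index * self.stride


class SignaturePrefetcher:
    """Single-threaded stand-in for a prefetching thread (illustrative only)."""

    def __init__(self, signature: StridedSignature, depth: int = 2) -> None:
        self.sig = signature
        self.depth = depth              # how many predicted accesses to fetch ahead
        self.next_index = 0             # index of the next expected access
        self.cache: Dict[int, int] = {} # predicted offset -> request size
        self.hits = 0
        self.misses = 0
        self._prefetch_ahead()

    def _prefetch_ahead(self) -> None:
        # "Fetch" the next `depth` predicted requests; a real layer would issue reads here.
        last = min(self.next_index + self.depth, self.sig.count)
        for i in range(self.next_index, last):
            self.cache[self.sig.offset_at(i)] = self.sig.size

    def on_read(self, offset: int) -> bool:
        """Record an observed read; return True if it had been prefetched."""
        hit = self.cache.pop(offset, None) is not None
        if hit:
            self.hits += 1
        else:
            self.misses += 1
            # Adjust the signature: re-anchor its start so that the observed
            # offset becomes the access at the current index, then drop stale data.
            self.sig.start = offset - self.next_index * self.sig.stride
            self.cache.clear()
        self.next_index += 1
        self._prefetch_ahead()
        return hit
```

With a signature of `start=0, size=4096, stride=16384, count=4`, an application that actually reads at offsets 1024, 17408, 33792, and 50176 misses on the first access, after which the re-anchored signature predicts the remaining three correctly. Re-anchoring only the start offset is one simple adjustment policy chosen for this sketch; the runtime adjustment used in the paper may differ.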

Published In

SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing

November 2008

739 pages

ISBN: 9781424428359

Conference Chair:
Patricia Teller

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture
IEEE-CS: Computer Society

Publisher

IEEE Press

Conference

SC '08

SC '08: International Conference for High Performance Computing, Networking, Storage and Analysis

November 15 - 21, 2008

Austin, Texas

Acceptance Rates

SC '08 Paper Acceptance Rate: 59 of 277 submissions (21%)

Overall Acceptance Rate: 1,516 of 6,373 submissions (24%)
