DOI: 10.5555/1413370.1413415
research-article

Parallel I/O prefetching using MPI file caching and I/O signatures

Published: 15 November 2008

Abstract

Parallel I/O prefetching is considered to be effective in improving I/O performance. However, its effectiveness depends on determining patterns among future I/O accesses swiftly and fetching data in time, which is difficult to achieve in general. In this study, we propose an I/O signature-based prefetching strategy. The idea is to use a predetermined I/O signature of an application to guide prefetching. To put this idea to work, we first derived a classification of patterns and introduced a simple and effective signature notation to represent patterns. We then developed a toolkit to trace and generate I/O signatures automatically. Finally, we designed and implemented a thread-based client-side collective prefetching cache layer for the MPI-IO library to support prefetching. A prefetching thread reads I/O signatures of an application and adjusts them by observing I/O accesses at runtime. Experimental results show that the proposed prefetching method improves I/O performance significantly for applications with complex patterns.
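To make the idea of signature-guided prefetching concrete, the following is a minimal Python sketch, not the paper's implementation. The abstract does not give the signature notation, so `StridedSignature` (start offset, request size, stride) is a hypothetical stand-in, as are `SignaturePrefetcher`, `read_fn`, and `depth`. The sketch also prefetches synchronously rather than in a separate thread, and its runtime adjustment is a simple re-anchoring of the pattern at the observed offset after a miss.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class StridedSignature:
    """Hypothetical signature: fixed-size reads a constant stride apart."""
    start: int   # byte offset of the first expected access
    size: int    # bytes per request
    stride: int  # distance in bytes between consecutive request offsets


class SignaturePrefetcher:
    """Toy client-side cache that stays `depth` requests ahead of the reader."""

    def __init__(self, sig: StridedSignature,
                 read_fn: Callable[[int, int], bytes], depth: int = 2) -> None:
        self.sig = sig
        self.read_fn = read_fn          # assumed backend: (offset, size) -> bytes
        self.depth = depth
        self.cache: Dict[int, bytes] = {}
        self.hits = 0
        self.misses = 0
        self._prefetch_from(sig.start)  # warm the cache from the signature

    def _prefetch_from(self, first: int) -> None:
        # Fetch the next `depth` offsets predicted by the signature.
        for k in range(self.depth):
            off = first + k * self.sig.stride
            if off not in self.cache:
                self.cache[off] = self.read_fn(off, self.sig.size)

    def read(self, offset: int) -> bytes:
        if offset in self.cache:
            self.hits += 1
            data = self.cache.pop(offset)
        else:
            # Observed access deviates from the prediction: drop stale
            # prefetches and re-anchor the pattern at the observed offset.
            self.misses += 1
            self.cache.clear()
            data = self.read_fn(offset, self.sig.size)
        self._prefetch_from(offset + self.sig.stride)
        return data


if __name__ == "__main__":
    blob = bytes(range(256)) * 4  # stand-in for a file's contents
    pf = SignaturePrefetcher(StridedSignature(start=0, size=4, stride=16),
                             lambda off, size: blob[off:off + size])
    for off in (0, 16, 32, 100, 116):  # pattern shifts at offset 100
        pf.read(off)
    print(pf.hits, pf.misses)
```

In this sketch a read that follows the signature is served from the cache, while a read that breaks the pattern costs one direct fetch before prefetching resumes from the new position.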



Published In

SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing
November 2008
739 pages
ISBN: 9781424428359

Publisher

IEEE Press


Author Tags

  1. I/O signatures
  2. MPI-IO
  3. parallel I/O
  4. prefetching

Qualifiers

  • Research-article

Conference

SC '08

Acceptance Rates

SC '08 Paper Acceptance Rate 59 of 277 submissions, 21%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%


Cited By

  • (2023) Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching. Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, pp. 295-307. DOI: 10.1145/3588195.3592992. Online publication date: 7-Aug-2023
  • (2020) Uncovering access, reuse, and sharing characteristics of I/O-intensive files on large-scale production HPC systems. Proceedings of the 18th USENIX Conference on File and Storage Technologies, pp. 91-102. DOI: 10.5555/3386691.3386701. Online publication date: 24-Feb-2020
  • (2019) I/O Scheduling Strategy for Periodic Applications. ACM Transactions on Parallel Computing 6:2, pp. 1-26. DOI: 10.1145/3338510. Online publication date: 23-Jul-2019
  • (2019) BPP. Proceedings of the 48th International Conference on Parallel Processing, pp. 1-10. DOI: 10.1145/3337821.3337904. Online publication date: 5-Aug-2019
  • (2019) Optimizing I/O Performance of HPC Applications with Autotuning. ACM Transactions on Parallel Computing 5:4, pp. 1-27. DOI: 10.1145/3309205. Online publication date: 8-Mar-2019
  • (2018) Informed Prefetching for Distributed Multi-Level Storage Systems. Journal of Signal Processing Systems 90:4, pp. 619-640. DOI: 10.1007/s11265-017-1277-z. Online publication date: 1-Apr-2018
  • (2017) SSDUP. Proceedings of the International Conference on Supercomputing, pp. 1-10. DOI: 10.1145/3079079.3079087. Online publication date: 14-Jun-2017
  • (2016) Enhance parallel input/output with cross-bundle aggregation. International Journal of High Performance Computing Applications 30:2, pp. 241-256. DOI: 10.1177/1094342015618017. Online publication date: 1-May-2016
  • (2015) Automatic request analyzer for QoS enabled storage system. Proceedings of the 11th Central & Eastern European Software Engineering Conference in Russia, pp. 1-8. DOI: 10.1145/2855667.2855670. Online publication date: 22-Oct-2015
  • (2015) Pattern-driven parallel I/O tuning. Proceedings of the 10th Parallel Data Storage Workshop, pp. 43-48. DOI: 10.1145/2834976.2834977. Online publication date: 15-Nov-2015
