Nothing Special   »   [go: up one dir, main page]

skip to main content
article

Compiler-directed file layout optimization for hierarchical storage systems

Published: 01 July 2013 Publication History

Abstract

File layout of array data is a critical factor that effects the behavior of storage caches, and has so far taken not much attention in the context of hierarchical storage systems. The main contribution of this paper is a compiler-driven file layout optimization scheme for hierarchical storage caches. This approach, fully automated within an optimizing compiler, analyzes a multi-threaded application code and determines a file layout for each disk-resident array referenced by the code, such that the performance of the target storage cache hierarchy is maximized. We tested our approach using 16 I/O intensive application programs and compared its performance against two previously proposed approaches under different cache space management schemes. Our experimental results show that the proposed approach improves the execution time of these parallel applications by 23.7% on average.

References

[1]
N. Ali, P. Carns, K. Iskra, D. Kimpe, S. Lang, R. Latham, R. Ross, L. Ward and P. Sadayappan, Scalable I/O forwarding framework for high-performance computing systems, in: Proceedings of the IEEE International Conference on Cluster Computing, 2009, pp. 1-10.
[2]
C. Bastoul, A. Cohen, S. Girbal, S. Sharma, O. Temam, A. Group and I. Rocquencourt, Putting polyhedral loop transformations to work, in: Workshop on Languages and Compilers for Parallel Computing, 2003, pp. 209-225.
[3]
D.O. Bigelow, S. Iyer, T. Kaldewey, R.C. Pineiro, A. Povzner, S.A. Brandt, R.A. Golding, T.M. Wong and C. Maltzahn, Endto-end performance management for scalable distributed storage, in: Proceedings of the 2nd International Workshop on Petascale Data Storage, 2007, pp. 30-34.
[4]
R. Bordawekar, A. Choudhary, K. Kennedy, C. Koelbel and M. Paleczny, A model and bordawekar-ooc for out-of-core data parallel programs, in: Proceedings of the fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 1995, pp. 1-10.
[5]
A.R. Butt, C. Gniady and Y.C. Hu, The performance impact of kernel prefetching on buffer cache replacement algorithms, in: Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems, 2005, pp. 157-168.
[6]
P. Cao, E.W. Felten and K. Li, Implementation and performance of application-controlled file caching, in: Proceedings of the USENIX Conference on Operating Systems Design and Implementation, 1994.
[7]
P.H. Carns, W.B. Ligon, III, R.B. Ross and R. Thakur, PVFS: A parallel file system for Linux clusters, in: Proceedings of the 4th Annual Linux Showcase and Conference, 2000, pp. 391-430.
[8]
Z. Chen, Y. Zhou and K. Li, Eviction based cache placement for storage caches, in: Proceedings of the USENIX Annual Technical Conference, 2003.
[9]
J. Choi, S.H. Noh, S.L. Min and Y. Cho, An implementation study of a detection-based adaptive block replacement scheme, in: Proceedings of the USENIX Annual Technical Conference, 1999, pp. 18-18.
[10]
J. Choi, S.H. Noh, S.L. Min and Y. Cho, Towards application/file-level characterization of block references: a case for finegrained buffer management, SIGMETRICS Perform. Eval. Rev. 28 (2000), 286-295.
[11]
X. Ding, S. Jiang, F. Chen, K. Davis and X. Zhang, DiskSeen: exploiting disk layout and access history to enhance I/O prefetch, in: Proceedings of the USENIX Annual Technical Conference, 2007, pp. 20:1-20:14.
[12]
B.S. Gill, On multi-level exclusive caching: offline optimality and why promotions are better than demotions, in: Proceedings of the 6th USENIX Conference on File and Storage Technologies, 2008.
[13]
B.S. Gill, L. Angel and D. Bathen, AMP: Adaptive multistream prefetching in a shared cache, in: Proceedings of the Fifth USENIX Symposium on File and Storage Technologies, 2007, pp. 185-198.
[14]
B.S. Gill, M. Ko, B. Debnath and W. Belluomini, STOW: a spatially and temporally optimized write caching algorithm, in: Proceedings of the USENIX Annual Technical Conference, 2009, p. 26.
[15]
B.S. Gill and D.S. Modha, SARC: sequential prefetching in adaptive replacement cache, in: Proceedings of the Annual Conference on USENIX Annual Technical Conference, 2005, pp. 33-33.
[16]
C. Gniady, A.R. Butt and Y.C. Hu, Program-counter-based pattern classification in buffer caching, in: Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation, 2004, pp. 27-27.
[17]
W. Gropp, E. Lusk and R. Thakur, Using MPI-2: Advanced Features of the Message-Passing Interface, MIT Press, Cambridge, MA, 1999.
[18]
IBM Blue Gene/P, available at: http://www.ibm.com/watson.
[19]
Jaguar, available at: http://www.nccs.gov/jaguar/.
[20]
S. Jiang, F. Chen and X. Zhang, CLOCK-Pro: an effective improvement of the clock replacement, in: Proceedings of the USENIX Annual Technical Conference, 2005, p. 35.
[21]
S. Jiang, K. Davis and X. Zhang, Coordinated multilevel buffer cache management with consistent access locality quantification, IEEE Transactions on Computers 56(1) (2007), 95-108.
[22]
S. Jiang, X. Ding, F. Chen, E. Tan and X. Zhang, Dulo: an effective buffer cache management scheme to exploit both temporal and spatial locality, in: Proceedings of the 4th Conference on USENIX Conference on File and Storage Technologies, 2005.
[23]
S. Jiang and X. Zhang, Ulc: a file block placement and replacement protocol to effectively exploit hierarchical locality in multi-level buffer caches, in: Proceedings of the 24th International Conference on Distributed Computing Systems, 2004, pp. 168-177.
[24]
S.T. Jones, A.C. Arpaci-Dusseau and R.H. Arpaci-Dusseau, Geiger: monitoring the buffer cache in a virtual machine environment, in: Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, 2006, pp. 14-24.
[25]
M. Kandemir, A. Choudhary, J. Ramanujam and R. Bordawekar, Compilation techniques for out-of-core parallel computations, Parallel Comput. 24 (1998), 597-628.
[26]
M. Kandemir, S.P. Muralidhara, M. Karakoy and S.W. Son, Computation mapping for multi-level storage cache hierarchies, in: Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 2010, pp. 179-190.
[27]
M. Kandemir, S.W. Son and M. Karakoy, Improving I/O performance of applications through compiler-directed code restructuring, in: Proceedings of the 6th USENIX Conference on File and Storage Technologies, 2008, pp. 11:1-11:16.
[28]
J.M. Kim, J. Choi, J. Kim, S.H. Noh, S.L. Min, Y. Cho and C.S. Kim, A low-overhead high-performance unified buffer management scheme that exploits sequential and looping references, in: Proceedings of the 4th Conference on Symposium on Operating System Design and Implementation, 2000.
[29]
T. Kimbrel, A. Tomkins, R.H. Patterson, B. Bershad, P. Cao, E.W. Felten, G.A. Gibson, A.R. Karlin and K. Li, A tracedriven comparison of algorithms for parallel prefetching and caching, in: Proceedings of the Second USENIX Symposium on Operating Systems Design and Implementation, 1996, pp. 19-34.
[30]
S. Leung and J. Zahorjan, Optimizing data locality by array restructuring, Technical report, Department of Computer Science and Eng., University of Washington, 1995.
[31]
C. Li and K. Shen, Managing prefetch memory for dataintensive online servers, in: Proceedings of the 4th Conference on USENIX Conference on File and Storage Technologies, 2005.
[32]
M. Li, E. Varki, S. Bhatia and A. Merchant, TaP: table-based prefetching for storage caches, in: Proceedings of the 6th USENIX Conference on File and Storage Technologies, 2008.
[33]
X. Li, A. Aboulnaga, K. Salem, A. Sachedina and S. Gao, Second-tier cache management using write hints, in: Proceedings of the 4th Conference on USENIX Conference on File and Storage Technologies, 2005.
[34]
NAS Parallel Benchmarks, available at: http://www.nas.nasa. gov/Resources/Software/npb.html.
[35]
Parallel I/O Benchmarks, available at: http://www.mcs.anl. gov/~thakur/pio-benchmarks.html.
[36]
R.H. Patterson, G.A. Gibson, E. Ginting, D. Stodolsky and J. Zelenka, Informed prefetching and caching, in: Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles, 1995, pp. 79-95.
[37]
N. Sarhan and C. Das, An integrated resource sharing policy for multimedia storage servers based on network-attached disks, in: Proceedings of the 23rd International Conference on Distributed Computing Systems, 2003, pp. 136-143.
[38]
A. Schrijver, Theory of Linear and Integer Programming, Wiley, 1998.
[39]
SPEComp, available at: http://www.spec.org/omp/.
[40]
The SUIF 2 Compiler System, available at: http://suif.stanford. edu/suif/suif2/.
[41]
M. Vilayannur, A. Sivasubramaniam, M. Kandemir, R. Thakur and R. Ross, Discretionary caching for I/O on clusters, in: Proceedings of the 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003, pp. 96-103.
[42]
M. Wachs, M. Abd-El-Malek, E. Thereska and G.R. Ganger, ARGON: performance insulation for shared storage servers, in: Proceedings of the 5th USENIX Conference on File and Storage Technologies, 2007.
[43]
M.J. Wolfe, High Performance Compilers for Parallel Computing, Addison-Wesley/Longman, 1995.
[44]
T.M. Wong and J. Wilkes, My cache or yours? Making storage more exclusive, in: Proceedings of the General Track of the USENIX Annual Technical Conference, 2002, pp. 161-175.
[45]
C.H. Xia, D. Towsley and C. Zhang, Distributed resource management and admission control of stream processing systems with max utility, in: Proceedings of the 27th International Conference on Distributed Computing Systems, 2007.
[46]
G. Yadgar, M. Factor, K. Li and A. Schuster, MC2: Multiple clients on a multilevel cache, in: Proceedings of the 2008 The 28th International Conference on Distributed Computing Systems, 2008, pp. 722-730.
[47]
G. Yadgar, M. Factor and A. Schuster, KARMA: know-it-all replacement for a multilevel cache, in: Proceedings of the 5th USENIX Conference on File and Storage Technologies, 2007.
[48]
J. Yoon, S.L. Min and Y. Cho, Buffer cache management: Predicting the future from the past, in: Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks, 2002.
[49]
Y. Zhou, Z. Chen and K. Li, Second-level buffer cache management, IEEE Trans. Parallel Distrib. Syst. 15 (2004), 505-519.
[50]
Y. Zhou, J. Philbin and K. Li, The multi-queue replacement algorithm for second level buffer caches, in: Proceedings of the USENIX Annual Technical Conference, 2001, pp. 91-104.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Scientific Programming
Scientific Programming  Volume 21, Issue 3-4
Selected Papers from Super Computing 2012
July 2013
147 pages

Publisher

IOS Press

Netherlands

Publication History

Published: 01 July 2013

Author Tags

  1. Compiler Optimization
  2. File Layout

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 0
    Total Downloads
  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 25 Nov 2024

Other Metrics

Citations

View Options

View options

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media