DOI: 10.5555/3026877.3026931
Article

Coordinated and efficient huge page management with Ingens

Published: 02 November 2016 Publication History

Abstract

Modern computing is hungry for RAM, with today's enormous capacities eagerly consumed by diverse workloads. Hardware address translation overheads have grown with memory capacity, motivating hardware manufacturers to provide TLBs with thousands of entries for large page sizes (called huge pages). Operating systems and hypervisors support huge pages with a hodge-podge of best-effort algorithms and spot fixes that made sense for architectures with limited huge page support, but the time has come for a more fundamental redesign.
Ingens is a framework for huge page support that relies on a handful of basic primitives to provide transparent huge page support in a principled, coordinated way. By managing contiguity as a first-class resource and by tracking utilization and access frequency of memory pages, Ingens is able to eliminate a number of fairness and performance pathologies that plague current systems. Experiments with our prototype demonstrate fairness improvements, performance improvements (up to 18%), tail-latency reduction (up to 41%), and reduction of memory bloat from 69% to less than 1% for important applications like Web services (e.g., the Cloudstone benchmark) and the Redis key-value store.
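
To make the idea of tracking utilization concrete, the following is a minimal sketch, not the Ingens implementation. It assumes x86-64 page sizes (4 KB base pages, 2 MB huge pages) and a hypothetical promotion threshold of 90%, which the abstract does not state. The names `Region`, `touch`, and `bloat` are illustrative only. The sketch records which base pages of a huge-page-aligned region have been touched and promotes the region only once enough of it is in use, which bounds the memory that is backed but never used.

```python
from dataclasses import dataclass, field

BASE_PAGE = 4 * 1024                        # 4 KB base page (assumed x86-64)
HUGE_PAGE = 2 * 1024 * 1024                 # 2 MB huge page (assumed x86-64)
PAGES_PER_REGION = HUGE_PAGE // BASE_PAGE   # 512 base pages per huge page


@dataclass
class Region:
    """One huge-page-aligned region, tracked at base-page granularity."""
    touched: set = field(default_factory=set)  # indices of base pages in use
    promoted: bool = False

    def utilization(self) -> float:
        """Fraction of the region's base pages that have been touched."""
        return len(self.touched) / PAGES_PER_REGION


def touch(region: Region, offset: int, threshold: float = 0.9) -> None:
    """Record an access at byte `offset`; promote once utilization is high.

    `threshold` is a hypothetical value chosen for illustration.
    """
    if not 0 <= offset < HUGE_PAGE:
        raise ValueError("offset outside region")
    region.touched.add(offset // BASE_PAGE)
    if not region.promoted and region.utilization() >= threshold:
        region.promoted = True


def bloat(region: Region) -> int:
    """Bytes backed by a huge page but never touched (0 if not promoted)."""
    if not region.promoted:
        return 0
    return (PAGES_PER_REGION - len(region.touched)) * BASE_PAGE
```

Under these assumptions, a region that is promoted eagerly after a single touch could waste up to 511 base pages, whereas the threshold policy above limits the waste to at most 10% of the region, at the cost of running on base pages until the threshold is reached.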



Published In

OSDI'16: Proceedings of the 12th USENIX conference on Operating Systems Design and Implementation
November 2016
786 pages
ISBN:9781931971331

Sponsors

  • VMware
  • NetApp
  • Google Inc.
  • Microsoft
  • Facebook

Publisher

USENIX Association

United States
