µTune: auto-tuned threading for OLDI microservices

Authors: Akshitha Sriraman, Thomas F. Wenisch

OSDI'18: Proceedings of the 13th USENIX conference on Operating Systems Design and Implementation

Pages 177 - 194

Published: 08 October 2018

Abstract

Modern On-Line Data Intensive (OLDI) applications have evolved from monolithic systems into numerous distributed microservices that interact via Remote Procedure Calls (RPCs). Microservices face sub-millisecond (sub-ms) RPC latency goals, much tighter than those of their monolithic counterparts, which must meet ≥ 100 ms latency targets. Threading and concurrency design effects at sub-ms scale, once insignificant for monolithic services, can come to dominate in the microservice regime. We investigate how threading design critically impacts microservice tail latency by developing a taxonomy of threading models: a structured understanding of the implications of how microservices manage concurrency and interact with RPC interfaces under wide-ranging loads. We develop µTune, a system that has two features: (1) a novel framework that abstracts threading model implementation from application code, and (2) an automatic load adaptation system that curtails microservice tail latency by exploiting inherent latency trade-offs revealed in our taxonomy to transition among threading models. We study µTune in the context of four OLDI applications to demonstrate up to 1.9× tail latency improvement over static threading choices and state-of-the-art adaptation techniques.
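To make the idea of load adaptation among threading models concrete, the following is a minimal sketch and not µTune's actual implementation. It assumes a hypothetical lookup table mapping load ranges to a threading model and adds a hysteresis band so the service does not flip back and forth near a boundary. The class names, the model labels "poll" and "block", and the 1000 QPS threshold are all illustrative assumptions; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical table row: prefer `model` while offered load is below `max_qps`."""
    max_qps: float
    model: str


class ThreadingModelSelector:
    """Picks a threading model from observed load (illustrative only)."""

    def __init__(self, policies, hysteresis=0.1):
        # Sort rows by their upper load bound so lookup is a simple scan.
        self.policies = sorted(policies, key=lambda p: p.max_qps)
        self.hysteresis = hysteresis
        self.current = self.policies[0].model

    def _ideal(self, qps):
        # First row whose upper bound exceeds the load wins.
        for policy in self.policies:
            if qps < policy.max_qps:
                return policy.model
        return self.policies[-1].model

    def observe(self, qps):
        # Switch only when the load sits clearly inside one band, i.e. the
        # table gives the same answer slightly below and slightly above it.
        low = self._ideal(qps * (1 - self.hysteresis))
        high = self._ideal(qps * (1 + self.hysteresis))
        if low == high and low != self.current:
            self.current = low
        return self.current


if __name__ == "__main__":
    selector = ThreadingModelSelector(
        [Policy(1000.0, "poll"), Policy(float("inf"), "block")]
    )
    for load in (100, 1050, 5000, 1050, 500):
        print(load, selector.observe(load))
```

In this sketch a load of 1050 QPS leaves the current model unchanged whichever model is active, because it falls inside the hysteresis band around the assumed 1000 QPS boundary. A real system would derive the table from measured tail latencies rather than hard-coded thresholds.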

Published In

OSDI'18, October 2018, 815 pages

ISBN: 9781931971478

Program Chairs: Andrea Arpaci-Dusseau (University of Wisconsin-Madison), Geoff Voelker (University of California, San Diego)

Sponsors: NetApp, Google Inc., NSF, Microsoft, Facebook

In-Cooperation: SIGOPS (ACM Special Interest Group on Operating Systems)

Publisher: USENIX Association, United States

Cited By

Seemakhupt K, Liu S, Senevirathne Y, Shahbaz M, Khan S, Martínez J, Duato J, John L (2021). PMNet. Proceedings of the 48th Annual International Symposium on Computer Architecture, pages 804-817. Online publication date: 14-Jun-2021.
https://dl.acm.org/doi/10.1109/ISCA52012.2021.00068

Asyabi E, Bestavros A, Sharafzadeh E, Zhu T, Fonseca R, Delimitrou C, Ooi B (2020). Peafowl. Proceedings of the 11th ACM Symposium on Cloud Computing, pages 150-164. Online publication date: 12-Oct-2020.
https://dl.acm.org/doi/10.1145/3419111.3421298

Liu J, Arpaci-Dusseau A, Arpaci-Dusseau R, Kannan S, Yadgar G, Noh S (2019). File systems as processes. Proceedings of the 11th USENIX Conference on Hot Topics in Storage and File Systems, page 14. Online publication date: 8-Jul-2019.
https://dl.acm.org/doi/10.5555/3357062.3357081

Sharafzadeh E, Kohroudi S, Asyabi E, Sharifi M (2019). Yawn. Proceedings of the 10th ACM SIGOPS Asia-Pacific Workshop on Systems, pages 91-98. Online publication date: 19-Aug-2019.
https://dl.acm.org/doi/10.1145/3343737.3343740
