µTune: Auto-Tuned Threading for OLDI Microservices

Published: 08 October 2018

Abstract

Modern On-Line Data Intensive (OLDI) applications have evolved from monolithic systems into numerous, distributed microservices that interact via Remote Procedure Calls (RPCs). Microservices face sub-millisecond (sub-ms) RPC latency goals, much tighter than those of their monolithic counterparts, which must meet latency targets of ≥ 100 ms. Threading and concurrency design effects at the sub-ms scale, once insignificant for monolithic services, can come to dominate in the microservice regime. We investigate how threading design critically impacts microservice tail latency by developing a taxonomy of threading models: a structured understanding of the implications of how microservices manage concurrency and interact with RPC interfaces under wide-ranging loads. We develop µTune, a system that has two features: (1) a novel framework that abstracts threading model implementation from application code, and (2) an automatic load adaptation system that curtails microservice tail latency by exploiting inherent latency trade-offs revealed in our taxonomy to transition among threading models. We study µTune in the context of four OLDI applications and demonstrate up to 1.9× tail latency improvement over static threading choices and state-of-the-art adaptation techniques.
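
The load-adaptation idea in feature (2) can be illustrated with a minimal sketch. This is not µTune's implementation or API: the class names, the three threading-model labels, and the load thresholds below are all hypothetical, chosen only to show what "transitioning among threading models as load changes" could look like when the choice of model is kept separate from application code.

```python
from dataclasses import dataclass
from enum import Enum


class ThreadingModel(Enum):
    """Hypothetical model labels for illustration; not taken from the abstract."""
    INLINE_POLL = "inline-poll"        # receiving thread busy-waits and handles the request
    INLINE_BLOCK = "inline-block"      # receiving thread sleeps until a request arrives
    DISPATCH_BLOCK = "dispatch-block"  # receiving thread hands requests to a worker pool


@dataclass(frozen=True)
class Rule:
    """Use `model` while the observed load is at or below `max_qps`."""
    max_qps: float
    model: ThreadingModel


class LoadAdaptiveSelector:
    """Picks a threading model from the observed load (assumed policy)."""

    def __init__(self, rules, fallback):
        # Sort so the first matching rule is the one with the lowest bound.
        self.rules = sorted(rules, key=lambda r: r.max_qps)
        self.fallback = fallback
        self.current = None
        self.transitions = 0

    def select(self, observed_qps):
        chosen = self.fallback
        for rule in self.rules:
            if observed_qps <= rule.max_qps:
                chosen = rule.model
                break
        if chosen is not self.current:
            # Count only real model changes, not the initial selection.
            if self.current is not None:
                self.transitions += 1
            self.current = chosen
        return chosen


# Placeholder thresholds; real switch points would have to be measured.
selector = LoadAdaptiveSelector(
    rules=[Rule(100.0, ThreadingModel.INLINE_POLL),
           Rule(1000.0, ThreadingModel.INLINE_BLOCK)],
    fallback=ThreadingModel.DISPATCH_BLOCK,
)
```

Calling `selector.select(qps)` periodically with a fresh load estimate returns the model to run under; a practical policy would likely add hysteresis so that load hovering near a threshold does not cause repeated switching.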

Cited By

  • (2021) PMNet. Proceedings of the 48th Annual International Symposium on Computer Architecture, pp. 804-817. DOI: 10.1109/ISCA52012.2021.00068. Online publication date: 14-Jun-2021
  • (2020) Peafowl. Proceedings of the 11th ACM Symposium on Cloud Computing, pp. 150-164. DOI: 10.1145/3419111.3421298. Online publication date: 12-Oct-2020
  • (2019) File systems as processes. Proceedings of the 11th USENIX Conference on Hot Topics in Storage and File Systems, p. 14. DOI: 10.5555/3357062.3357081. Online publication date: 8-Jul-2019
  • (2019) Yawn. Proceedings of the 10th ACM SIGOPS Asia-Pacific Workshop on Systems, pp. 91-98. DOI: 10.1145/3343737.3343740. Online publication date: 19-Aug-2019

Published In

OSDI'18: Proceedings of the 13th USENIX Conference on Operating Systems Design and Implementation
October 2018
815 pages
ISBN: 9781931971478

Sponsors

  • NetApp
  • Google Inc.
  • NSF
  • Microsoft
  • Facebook

Publisher

USENIX Association
United States
