Nothing Special   »   [go: up one dir, main page]

skip to main content
10.5555/2535461.2535468guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

TAO: Facebook's distributed data store for the social graph

Published: 26 June 2013 Publication History

Abstract

We introduce a simple data model and API tailored for serving the social graph, and TAO, an implementation of this model. TAO is a geographically distributed data store that provides efficient and timely access to the social graph for Facebook's demanding workload using a fixed set of queries. It is deployed at Facebook, replacing memcache for many data types that fit its model. The system runs on thousands of machines, is widely distributed, and provides access to many petabytes of data. TAO can process a billion reads and millions of writes each second.

References

[1]
Amazon SimpleDB. http://aws.amazon.com/simpledb/.
[2]
Facebook - Company Info. http://newsroom.fb.com.
[3]
LevelDB. https://code.google.com/p/leveldb.
[4]
Project Voldemort. http://project-voldemort.com/.
[5]
J. Baker, C. Bond, J. C. Corbett, J. Furman, A. Khorlin, J. Larson, J.-M. Leon, Y. Li, A. Lloyd, and V. Yushprakh. Megastore: Providing scalable, highly available storage for interactive services. In Proceedings of the Conference on Innovative Data system Research, CIDR, 2011.
[6]
F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. Gruber. Bigtable: A distributed storage system for structured data. In Proceedings of the 7th USENIX Symposium on Operating System Design and Implementation, OSDI. USENIX Assoc., 2006.
[7]
B. F. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H.-A. Jacobsen, N. Puz, D. Weaver, and R. Yerneni. PNUTS: Yahoo!'s hosted data serving platform. PVLDB, 1(2), 2008.
[8]
J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak, E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan, R. Rao, L. Rolig, Y. Saito, M. Szymaniak, C. Taylor, R. Wang, and D. Woodford. Spanner: Google's globally-distributed database. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI, Berkeley, CA, USA, 2012.
[9]
F. Dabek, J. Li, E. Sit, J. Robertson, M. F. Kaashoek, and R. Morris. Designing a DHT for low latency and high throughput. In Proceedings of the 1st Symposium on Networked Systems Design and Implementation, NSDI, Berkeley, CA, USA, 2004.
[10]
G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon's highly available key-value store. In Proceedings of 21st ACM Symposium on Operating Systems Principles, SOSP, New York, NY, USA, 2007.
[11]
FlockDB. http://engineering.twitter.com/2010/05/introducingflockdb.html.
[12]
L. Glendenning, I. Beschastnikh, A. Krishnamurthy, and T. Anderson. Scalable consistency in scatter. In Proceedings of the 23rd ACM Symposium on Operating Systems Principles, SOSP, New York, NY, USA, 2011.
[13]
S. D. Gribble, E. A. Brewer, J. M. Hellerstein, and D. Culler. Scalable, distributed data structures for internet service construction. In Proceedings of the 4th Symposium on Operating System Design and Implementation, OSDI, Berkeley, CA, USA, 2000.
[14]
U. Kang, C. E. Tsourakakis, and C. Faloutsos. PEGASUS: mining peta-scale graphs. Knowledge Information Systems, 27(2), 2011.
[15]
D. Karger, E. Lehman, T. Leighton, R. Panigrahy, M. Levine, and D. Lewin. Consistent Hashing and Random trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web. In Proceedings of the 29th annual ACM Symposium on Theory of Computing, STOC, 1997.
[16]
J. Li, J. Stribling, T. M. Gil, R. Morris, and M. F. Kaashoek. Comparing the performance of distributed hash tables under churn. In Proceedings of the Third International Conference on Peer-to-Peer Systems, IPTPS, Berlin, Heidelberg, 2004.
[17]
W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen. Don't settle for eventual: scalable causal consistency for wide-area storage with COPS. In T. Wobber and P. Druschel, editors, Proceedings of the 23rd ACM Symposium on Operating System Design and Implementation, SOSP. ACM, 2011.
[18]
W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen. Stronger semantics for low-latency geo-replicated storage. In Proceedings of the 10th USENIX conference on Networked Systems Design and Implementation, NSDI, 2013.
[19]
G. Malewicz, M. H. Austern, A. J. C. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In A. K. Elmagarmid and D. Agrawal, editors, Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, 2010.
[20]
Neo4j. http://neo4j.org/.
[21]
R. Nishtala, H. Fugal, S. Grimm, M. Kwiatkowski, H. Lee, H. C. Li, R. McElroy, M. Paleczny, D. Peek, P. Saab, D. Stafford, T. Tung, and V. Venkataramani. Scaling Memcache at Facebook. In Proceedings of the 10th USENIX conference on Networked Systems Design and Implementation, NSDI, 2013.
[22]
E. Nygren, R. K. Sitaraman, and J. Sun. The Akamai network: a platform for high-performance internet applications. SIGOPS Operating Systems Review, 44(3), Aug. 2010.
[23]
C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins. Pig latin: a not-so-foreign language for data processing. In J. T.- L.Wang, editor, Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, 2008.
[24]
V. Ramasubramanian and E. G. Sirer. Beehive: O(1)lookup performance for power-law query distributions in peer-to-peer overlays. In Proceedings of the 1st Symposium on Networked Systems Design and Implementation, NSDI, Berkeley, CA, USA, 2004.
[25]
S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker. A scalable content-addressable network. In Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, SIGCOMM, New York, NY, USA, 2001.
[26]
Redis. http://redis.io/.
[27]
E. G. S. Robert Escriva, Bernard Wong. Hyperdex: A distributed, searchable key-value store for cloud computing. Technical report, Department of Computer Science, Cornell University, Ithaca, New York, December 2011.
[28]
A. I. T. Rowstron and P. Druschel. Pastry: Scalable, decentralized object location, and routing for large-scale peer-to-peer systems. In Proceedings of the IFIP/ACM International Conference on Distributed Systems Platforms Heidelberg, Middleware, London, UK, UK, 2001.
[29]
M. Satyanarayanan. The evolution of coda. ACM Transactions on Computer Systems, 20(2), May 2002.
[30]
B. Shao and H. Wang. Trinity. http://research.microsoft.com/enus/projects/trinity/.
[31]
Y. Sovran, R. Power, M. K. Aguilera, and J. Li. Transactional storage for geo-replicated systems. In Proceedings of the 23rd ACM Symposium on Operating Systems Principles, SOSP, New York, NY, USA, 2011.
[32]
I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, SIGCOMM, New York, NY, USA, 2001.
[33]
D. B. Terry, M. M. Theimer, K. Petersen, A. J. Demers, M. J. Spreitzer, and C. H. Hauser. Managing update conflicts in Bayou, a weakly connected replicated storage system. In Proceedings of the 15th ACM Symposium on Operating Systems Principles, SOSP, New York, NY, USA, 1995.
[34]
The Apache Software Foundation. http://hbase.apache.org, 2010.
[35]
W. Vogels. Eventually consistent. Queue, 6(6), Oct. 2008.

Cited By

View all
  • (2023)FrozenHot Cache: Rethinking Cache Management for Modern HardwareProceedings of the Eighteenth European Conference on Computer Systems10.1145/3552326.3587446(557-573)Online publication date: 8-May-2023
  • (2021)RAMP-TAOProceedings of the VLDB Endowment10.14778/3476311.347637914:12(3014-3027)Online publication date: 28-Oct-2021
  • (2021)AUTOGRProceedings of the VLDB Endowment10.14778/3461535.346154114:9(1517-1530)Online publication date: 22-Oct-2021
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings
USENIX ATC'13: Proceedings of the 2013 USENIX conference on Annual Technical Conference
June 2013
364 pages

Sponsors

  • VMware
  • Akamai: Akamai
  • Google Inc.
  • EMC2: EMC2
  • Facebook: Facebook

Publisher

USENIX Association

United States

Publication History

Published: 26 June 2013

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 18 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2023)FrozenHot Cache: Rethinking Cache Management for Modern HardwareProceedings of the Eighteenth European Conference on Computer Systems10.1145/3552326.3587446(557-573)Online publication date: 8-May-2023
  • (2021)RAMP-TAOProceedings of the VLDB Endowment10.14778/3476311.347637914:12(3014-3027)Online publication date: 28-Oct-2021
  • (2021)AUTOGRProceedings of the VLDB Endowment10.14778/3461535.346154114:9(1517-1530)Online publication date: 22-Oct-2021
  • (2021)MonkeyDB: effectively testing correctness under weak isolation levelsProceedings of the ACM on Programming Languages10.1145/34855465:OOPSLA(1-27)Online publication date: 15-Oct-2021
  • (2021)Leveraging NVMe SSDs for Building a Fast, Cost-effective, LSM-tree-based KV StoreACM Transactions on Storage10.1145/348096317:4(1-29)Online publication date: 15-Oct-2021
  • (2021)KangarooProceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles10.1145/3477132.3483568(243-262)Online publication date: 26-Oct-2021
  • (2021)WukongACM SIGOPS Operating Systems Review10.1145/3469379.346938855:1(77-83)Online publication date: 6-Jun-2021
  • (2021)A Large-scale Analysis of Hundreds of In-memory Key-value Cache Clusters at TwitterACM Transactions on Storage10.1145/346852117:3(1-35)Online publication date: 16-Aug-2021
  • (2021) XStore: Fast RDMA-Based Ordered Key-Value Store Using Remote Learned CacheACM Transactions on Storage10.1145/346852017:3(1-32)Online publication date: 16-Aug-2021
  • (2021)Don't be a blockheadProceedings of the Workshop on Hot Topics in Operating Systems10.1145/3458336.3465300(144-151)Online publication date: 1-Jun-2021
  • Show More Cited By

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media