Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1807167.1807257acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Feeding frenzy: selectively materializing users' event feeds

Published: 06 June 2010 Publication History

Abstract

Near real-time event streams are becoming a key feature of many popular web applications. Many web sites allow users to create a personalized feed by selecting one or more event streams they wish to follow. Examples include Twitter and Facebook, which a user to follow other users' activity, and iGoogle and My Yahoo, which allow users to follow selected RSS streams. How can we efficiently construct a web page showing the latest events from a user's feed? Constructing such a feed must be fast so the page loads quickly, yet reflects recent updates to the underlying event streams. The wide fanout of popular streams (those with many followers) and high skew (fanout and update rates vary widely) make it difficult to scale such applications.
We associate feeds with consumers and event streams with producers. We demonstrate that the best performance results from selectively materializing each consumer's feed: events from high-rate producers are retrieved at query time, while events from lower-rate producers are materialized in advance. A formal analysis of the problem shows the surprising result that we can minimize global cost by making local decisions about each producer/consumer pair, based on the ratio between a given producer's update rate (how often an event is added to the stream) and a given consumer's view rate (how often the feed is viewed). Our experimental results, using Yahoo!'s web-scale database PNUTS, shows that this hybrid strategy results in the lowest system load (and hence improves scalability) under a variety of workloads.

References

[1]
Twitter API. http://apiwiki.twitter.com/.
[2]
D. Agrawal, A. E. Abbadi, A. K. Singh, and T. Yurek. Efficient view maintenance at data warehouses. In SIGMOD, 1997.
[3]
P. Agrawal et al. Asynchronous view maintenance for VLSD databases. In SIGMOD, 2009.
[4]
S. Agrawal, S. Chaudhuri, and V. Narasayya. Automated selection of materialized views and indexes for SQL databases. In VLDB, 2000.
[5]
S. Babu, K. Munagala, and J. Widom. Adaptive caching for continuous queries. In ICDE, 2005.
[6]
M. Cafarella et al. Data management projects at Google. SIGMOD Record, 34--38(1), March 2008.
[7]
S. Ceri and J. Widom. Deriving production rules for incremental view maintenance. In VLDB, 1991.
[8]
B. Chandramouli and J. Yang. End-to-end support for joins in large-scale publish/subscribe systems. In VLDB, 2008.
[9]
B. F. Cooper et al. PNUTS: Yahoo!'s hosted data serving platform. In VLDB, 2008.
[10]
I. Eure. Looking to the future with Cassandra. http://blog.digg.com/?p=966.
[11]
C. Garrod et al. Scalable query result caching for web applications. In VLDB, 2008.
[12]
G. Graefe. B-tree indexes for high update rates. SIGMOD Record, 35(1):39--44, March 2006.
[13]
H. Gupta. Selection of views to materialize in a data warehouse. In ICDT, 1997.
[14]
P. Haas and J. Hellerstein. Ripple joins for online aggregation. In SIGMOD, 1999.
[15]
A. Halevy. Answering queries using views: A survey. VLDB Journal, 10(4):270--294, 2001.
[16]
G. Luo. Partial Materialized Views. In ICDE, 2007.
[17]
G. Luo, J. F. Naughton, C. J. Ellmann, and M. Watzke. A comparison of three methods for join view maintenance in parallel RDBMS. In ICDE, 2003.
[18]
S. Madden, M. Franklin, J. Hellerstein, and W. Hong. TinyDB: An acquisitional query processing system for sensor networks. TODS, 30(1):122--173, March 2005.
[19]
S. Martello and P. Toth. Knapsack Problems: Algorithms and Computer Implementations. Wiley-Interscience, 1990.
[20]
C. Mohan and I. Narang. Algorithms for creating indexes for very large tables without quiescing updates. In SIGMOD, 1992.
[21]
C. Olston, J. Jiang, and J. Widom. Adaptive filters for continuous queries over distributed data streams. In SIGMOD, 2003.
[22]
C. Re and D. Suciu. Materialized views in probabilistic databases: for information exchange and query optimization. In VLDB, 2007.
[23]
K. Salem, K. Beyer, B. Lindsay, and R. Cochrane. How to roll a join: Asynchronous incremental view maintenance. In SIGMOD, 2000.
[24]
P. Seshadri and A. Swami. Generalized partial indexes. In ICDE, 1995.
[25]
M. Stonebraker. The case for partial indexes. SIGMOD Record, 18(4):4--11, 1989.
[26]
E. Weaver. Improving running components at Twitter. http://blog.evanweaver.com/articles/2009/03/13/-qcon-presentation/.
[27]
S. Wu, J. Li, B. Ooi, and K.-L. Tan. Just-in-time query retrieval over partially indexed data on structured P2P overlays. In SIGMOD, 2008.
[28]
J. Yang and J. Widom. Temporal view self-maintenance. In EDBT, 2000.
[29]
K. Yi et al. Efficient maintenance of materialized top-k views. In ICDE, 2003.
[30]
J. Zhou, P.-A. Larson, and H. G. Elmongui. Lazy maintenance of materialized views. In VLDB, 2007.
[31]
Y. Zhou, A. Salehi, and K. Aberer. Scalable delivery of stream query results. In VLDB, 2009.
[32]
Y. Zhuge, H. Garcia-Molina, J. Hammer, and J. Widom. View maintenance in a warehousing environment. In SIGMOD, 1995.

Cited By

View all
  • (2022)Network-wide complex event processing over geographically distributed data sourcesInformation Systems10.1016/j.is.2019.10144288:COnline publication date: 21-Apr-2022
  • (2021)QuiCKProceedings of the 2021 International Conference on Management of Data10.1145/3448016.3457567(2517-2529)Online publication date: 9-Jun-2021
  • (2020)QuickPoint: Efficiently Identifying Densest Sub-Graphs in Online Social Networks for Event Stream DisseminationIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2018.288143532:2(332-346)Online publication date: 9-Jan-2020
  • Show More Cited By

Index Terms

  1. Feeding frenzy: selectively materializing users' event feeds

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGMOD '10: Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
    June 2010
    1286 pages
    ISBN:9781450300322
    DOI:10.1145/1807167
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 06 June 2010

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. social networks
    2. view maintenance

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS '10
    Sponsor:
    SIGMOD/PODS '10: International Conference on Management of Data
    June 6 - 10, 2010
    Indiana, Indianapolis, USA

    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)11
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 18 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2022)Network-wide complex event processing over geographically distributed data sourcesInformation Systems10.1016/j.is.2019.10144288:COnline publication date: 21-Apr-2022
    • (2021)QuiCKProceedings of the 2021 International Conference on Management of Data10.1145/3448016.3457567(2517-2529)Online publication date: 9-Jun-2021
    • (2020)QuickPoint: Efficiently Identifying Densest Sub-Graphs in Online Social Networks for Event Stream DisseminationIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2018.288143532:2(332-346)Online publication date: 9-Jan-2020
    • (2019)Location-Centric View Selection in a Location-Based Feed-Following SystemProceedings of the 13th ACM International Conference on Distributed and Event-based Systems10.1145/3328905.3329512(67-78)Online publication date: 24-Jun-2019
    • (2019)Piggyback Game: Efficient Event Stream Dissemination in Online Social Network SystemsIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2018.286624230:3(692-709)Online publication date: 1-Mar-2019
    • (2019)A parallel data generator for efficiently generating "realistic" social streamsFrontiers of Computer Science: Selected Publications from Chinese Universities10.1007/s11704-018-8022-z13:5(1072-1101)Online publication date: 1-Oct-2019
    • (2018)Efficient Event Stream Dissemination in Online Social Networks Based on Community Detection2018 IEEE International Conference on Communications (ICC)10.1109/ICC.2018.8422653(1-6)Online publication date: May-2018
    • (2017)A social-aware caching algorithm for improving performance of online social network services in a multi-cloud environmentProceedings of the 3rd International Conference on Communication and Information Processing10.1145/3162957.3163024(384-388)Online publication date: 24-Nov-2017
    • (2017)Systems Applications of Social NetworksACM Computing Surveys10.1145/309274250:5(1-42)Online publication date: 26-Sep-2017
    • (2016)A Location- and Diversity-Aware News Feed System for Mobile UsersIEEE Transactions on Services Computing10.1109/TSC.2015.24363969:6(846-861)Online publication date: 1-Nov-2016
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media