Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1376916.1376929acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
research-article

A generic flow algorithm for shared filter ordering problems

Published: 09 June 2008 Publication History

Abstract

We consider a fundamental flow maximization problem that arises during the evaluation of multiple overlapping queries defined on a data stream, in a heterogenous parallel environment. Each query is a conjunction of boolean filters, and each filter could be shared across multiple queries. We are required to design an evaluation plan that evaluates filters against stream items in order to determine the set of queries satisfied by each item. The evaluation plan specifies for each item: (i) the subset of filters evaluated for this item and the order of their evaluations, and (ii) the processor on which each filter evaluation occurs. Our goal is to design an evaluation plan which maximizes the total throughput (flow) of the stream handled by the plan, without violating the processor capacities.
Filter ordering has received extensive attention in single-processor settings, with the objective of minimizing the total cost of filter evaluations: in particular, efficient (approximation) algorithms are known for various important versions of min-cost filter ordering. Min-cost filter ordering problem for a single processor is a special case of our flow maximization for parallel processors. Our main contribution in this work is a generic flow-maximization algorithm, which assumes the availability of a min-cost filter ordering algorithm for a single processor, and uses this to iteratively construct a solution to the flow-maximization problem for heterogenous parallel processors. We show that the approximation ratio of our flow-maximization strategy is essentially the same as that of the underlying min-cost filter ordering algorithm. Our result, along with existing results on min-cost filter ordering, enables the optimization of several important versions of filter ordering in parallel environments.

References

[1]
Ron Avnur and Joseph M. Hellerstein. Eddies: continuously adaptive query processing. SIGMOD Rec., 29(2):261--272, 2000.]]
[2]
Shivnath Babu, Rajeev Motwani, Kamesh Munagala, Itaru Nishizawa, and Jennifer Widom. Adaptive ordering of pipelined stream filters. In SIGMOD, pages 407--418, New York, NY, USA, 2004. ACM Press.]]
[3]
Amotz Bar-Noy, Mihir Bellare, Magn´us M. Halld´orsson, Hadas Shachnai, and Tami Tamir. On chromatic sums and distributed resource allocation. Inf. Comput, 140(2):183--202, 1998.]]
[4]
Surajit Chaudhuri, Umeshwar Dayal, and Tak W. Yan. Join queries with external text sources: execution and optimization techniques. In SIGMOD '95: Proceedings of the 1995 ACM SIGMOD international conference on Management of data, pages 410--422, New York, NY, USA, 1995. ACM.]]
[5]
H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Annals of Mathematical Statistics, 23:493--509, 1952.]]
[6]
Edith Cohen, Amos Fiat, and Haim Kaplan. Efficient sequences of trials. In SODA, pages 737--746, 2003.]]
[7]
Anne Condon, Amol Deshpande, Lisa Hellerstein, and Ning Wu. Flow algorithms for two pipelined filter ordering problems. In PODS, pages 193--202, New York, NY, USA, 2006. ACM Press.]]
[8]
Amol Deshpande, Carlos Guestrin, Wei Hong, and Samuel Madden. Exploiting correlated attributes in acquisitional query processing. In ICDE '05: Proceedings of the 21st International Conference on Data Engineering, pages 143--154, Washington, DC, USA, 2005. IEEE Computer Society.]]
[9]
Oren Etzioni, Steve Hanks, Tao Jiang, Richard M. Karp, Omid Madani, and Orli Waarts. Efficient information gathering on the internet (extended abstract). In FOCS, pages 234--243, 1996.]]
[10]
Uriel Feige and Prasad Tetali. Approximating min sum set cover. Algorithmica, 40(4):219--234, 2004.]]
[11]
N. Garg and J. Koenemann. Faster and simpler algorithms for multicommodity flow and other fractional packing problems. In Proceedings of the 39th Annual Symposium on Foundations of Computer Science, page 300. IEEE Computer Science Society, 1998.]]
[12]
Roy Goldman and Jennifer Widom. Wsq/dsq: a practical approach for combined querying of databases and the web. In SIGMOD '00: Proceedings of the 2000 ACM SIGMOD international conference on Management of data, pages 285--296, New York, NY, USA, 2000. ACM.]]
[13]
J. Hellerstein and M. Stonebraker. Predicate migration: Optimizing queries with expensive predicates. In Proc. SIGMOD, 1993.]]
[14]
W. Hoeffding. Probability inequalities for sums of bounded random variables. American Statistical Association Journal, 58:13--30, 1963.]]
[15]
T. Ibaraki and T. Kameda. On the optimal nesting order for computing n-relational joins. ACM Trans. on Database Systems, 9(3):482--502, 1984.]]
[16]
Haim Kaplan, Eyal Kushilevitz, and Yishay Mansour. Learning with attribute costs. In STOC '05: Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 356--365, New York, NY, USA, 2005. ACM.]]
[17]
Murali S. Kodialam. The throughput of sequential testing. In Proceedings of the 8th International IPCO Conference on Integer Programming and Combinatorial Optimization, pages 280--292, London, UK, 2001. Springer-Verlag.]]
[18]
Ravi Krishnamurthy, Haran Boral, and Carlo Zaniolo. Optimization of nonrecursive queries. In Proc. VLDB, pages 128--137, 1986.]]
[19]
Zhen Liu, Srinivasan Parthasarathy, Anand Ranganathan, and Hao Yang. Near-optimal algorithms for shared filter evaluation in data stream systems. In Proc. of ACM SIGMOD (to appear), 2008.]]
[20]
Kamesh Munagala, Shivnath Babu, Rajeev Motwani, and Jennifer Widom. The pipelined set cover problem. In 10th International Conference on Database Theory - ICDT, pages 83--98, 2005.]]
[21]
Kamesh Munagala, Utkarsh Srivastava, and Jennifer Widom. Optimization of continuous queries with shared expensive filters. In PODS '07: Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 215--224, New York, NY, USA, 2007. ACM.]]
[22]
Serge A. Plotkin, David B. Shmoys, and Éva Tardos. Fast approximation algorithms for fractional packing and covering problems. In Proceedings of the 32nd annual symposium on Foundations of computer science, pages 495--504, Los Alamitos, CA, USA, 1991. IEEE Computer Society Press.]]
[23]
H. Simon and J. Kadane. Optimal problem-solving search: All-or-none solutions. Artificial Intelligence, 6:235--247, 1975.]]
[24]
Vijay V. Vazirani. Approximation algorithms. Springer-Verlag New York, Inc., New York, NY, USA, 2001.]]
[25]
Neal E. Young. Sequential and parallel algorithms for mixed packing and covering. In Proceedings of IEEE Symposium on Foundations of Computer Science, 2001.]]

Cited By

View all

Index Terms

  1. A generic flow algorithm for shared filter ordering problems

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    PODS '08: Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
    June 2008
    330 pages
    ISBN:9781605581521
    DOI:10.1145/1376916
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 09 June 2008

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. flow maximization
    2. parallel
    3. query optimization
    4. shared filter ordering

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS '08
    Sponsor:

    Acceptance Rates

    PODS '08 Paper Acceptance Rate 28 of 159 submissions, 18%;
    Overall Acceptance Rate 642 of 2,707 submissions, 24%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)1
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 13 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2019)Solving Zero-Sum Games Using Best-Response Oracles with Applications to Search GamesOperations Research10.1287/opre.2019.185367:3(731-743)Online publication date: 1-May-2019
    • (2017)Max-Throughput for (Conservative) k-of-n TestingAlgorithmica10.1007/s00453-015-0089-477:2(595-618)Online publication date: 1-Feb-2017
    • (2014)Sharing across Multiple MapReduce JobsACM Transactions on Database Systems10.1145/256079639:2(1-46)Online publication date: 26-May-2014
    • (2013)Optimal Service Ordering in Decentralized Queries Over Web ServicesIntelligence Methods and Systems Advancements for Knowledge-Based Business10.4018/978-1-4666-1873-2.ch003(43-58)Online publication date: 2013
    • (2012)Optimal Service Ordering in Decentralized Queries Over Web ServicesGrid and Cloud Computing10.4018/978-1-4666-0879-5.ch613(1513-1528)Online publication date: 2012
    • (2012)Parallel pipelined filter ordering with precedence constraintsACM Transactions on Algorithms10.1145/2344422.23444318:4(1-38)Online publication date: 4-Oct-2012
    • (2012)Optimization of decentralized multi-way join queries over pipelined filtering servicesComputing10.1007/s00607-012-0209-994:12(939-972)Online publication date: 24-Aug-2012
    • (2011)Optimal Service Ordering in Decentralized Queries Over Web ServicesInternational Journal of Knowledge-Based Organizations10.4018/ijkbo.20110401011:2(1-16)Online publication date: 1-Apr-2011
    • (2011)Predictable performance and high query concurrency for data analyticsThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-011-0221-220:2(227-248)Online publication date: 1-Apr-2011
    • (2011)Max-throughput for (conservative) k-of-n testingProceedings of the 22nd international conference on Algorithms and Computation10.1007/978-3-642-25591-5_72(703-713)Online publication date: 5-Dec-2011
    • Show More Cited By

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media