DOI: 10.1145/3318464.3384690
short-paper

MC3: A System for Minimization of Classifier Construction Cost

Published: 31 May 2020

Abstract

Search mechanisms over massive sets of items are the cornerstone of many modern applications, particularly e-commerce websites. Consumers express a set of properties in their search queries and expect the system to retrieve the qualifying items. A common difficulty, however, is that the information on whether an item satisfies the search criteria is sometimes not explicitly recorded in the repository. Instead, it may be considered general knowledge or be "hidden" in a picture or description, which leads to incomplete search results. To overcome these problems, companies invest in building dedicated classifiers that determine whether an item satisfies the given search criteria. However, building classifiers typically incurs non-trivial costs because of the volumes of high-quality labeled training data required. In this demo, we introduce MC3, a real-time system that helps data analysts decide which classifiers to construct in order to minimize the cost of answering a set of search queries. MC3 is interactive and facilitates real-time analysis by providing detailed information on the impact of each classifier. We demonstrate the effectiveness of MC3 on real-world data and scenarios taken from a large e-commerce system, with SIGMOD'20 audience members acting as analysts.

Cited By

  • (2022) Classifier Construction Under Budget Constraints. Proceedings of the 2022 International Conference on Management of Data, pages 1160-1174. DOI: 10.1145/3514221.3517863. Online publication date: 10-Jun-2022.

    Published In

    SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
    June 2020
    2925 pages
    ISBN:9781450367356
    DOI:10.1145/3318464

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. classifiers
    2. e-commerce

    Qualifiers

    • Short-paper

    Conference

    SIGMOD/PODS '20

    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%
