DOI: 10.1145/1989323.1989331
research-article

CrowdDB: answering queries with crowdsourcing

Published: 12 June 2011

Abstract

Some queries cannot be answered by machines only. Processing such queries requires human input for providing information that is missing from the database, for performing computationally difficult functions, and for matching, ranking, or aggregating results based on fuzzy criteria. CrowdDB uses human input via crowdsourcing to process queries that neither database systems nor search engines can adequately answer. It uses SQL both as a language for posing complex queries and as a way to model data. While CrowdDB leverages many aspects of traditional database systems, there are also important differences. Conceptually, a major change is that the traditional closed-world assumption for query processing does not hold for human input. From an implementation perspective, human-oriented query operators are needed to solicit, integrate and cleanse crowdsourced data. Furthermore, performance and cost depend on a number of new factors including worker affinity, training, fatigue, motivation and location. We describe the design of CrowdDB, report on an initial set of experiments using Amazon Mechanical Turk, and outline important avenues for future work in the development of crowdsourced query processing systems.
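
To make the idea of soliciting and cleansing crowdsourced data concrete, here is a minimal Python sketch. It is not CrowdDB's implementation: the function names (`majority_vote`, `fill_missing`, `ask_crowd`), the replication factor, and the strict-majority rule are all assumptions chosen for illustration. The sketch fills values that are missing from a table by asking a crowd callback several times per row and keeping an answer only when more than half of the normalized responses agree.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, Optional[str]]


def majority_vote(answers: Iterable[str]) -> Optional[str]:
    """Return the answer given by a strict majority, or None if there is none.

    Answers are normalized (trimmed, lower-cased) before counting, which is a
    simple stand-in for cleansing free-text input from workers.
    """
    normalized = [a.strip().lower() for a in answers if a and a.strip()]
    if not normalized:
        return None
    value, count = Counter(normalized).most_common(1)[0]
    return value if count * 2 > len(normalized) else None


def fill_missing(
    rows: List[Row],
    column: str,
    ask_crowd: Callable[[Row, str], str],  # hypothetical crowd callback
    replicas: int = 3,                     # assumed number of answers per value
) -> List[Row]:
    """Fill None values in `column` with the majority answer from the crowd."""
    filled = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        if row.get(column) is None:
            answers = [ask_crowd(row, column) for _ in range(replicas)]
            row[column] = majority_vote(answers)
        filled.append(row)
    return filled


def demo() -> List[Row]:
    """Run the sketch with canned answers instead of real workers."""
    canned = iter(["Berkeley", " berkeley ", "Oakland"])
    rows = [
        {"name": "UC Berkeley", "city": None},
        {"name": "ETH Zurich", "city": "zurich"},
    ]
    return fill_missing(rows, "city", lambda row, col: next(canned))
```

In a real deployment, `ask_crowd` would post a task to a platform such as Amazon Mechanical Turk and wait for a worker's response; here it is stubbed so the example runs offline. A value with no strict majority stays unset, so a caller could choose to request more answers for it.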

    Published In

    SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
    June 2011
    1364 pages
    ISBN: 9781450306614
    DOI: 10.1145/1989323

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. architecture
    2. crowd
    3. crowdsourcing
    4. database
    5. hybrid system
    6. query processing

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS '11

    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%
