research-article

Find truth in the hands of the few: acquiring specific knowledge with crowdsourcing

Published: 01 August 2021

Abstract

Crowdsourcing is a useful mechanism for leveraging human intelligence to acquire knowledge. However, aggregating crowd answers with existing voting algorithms often yields common knowledge, which may not be what is expected. In this paper, we consider the problem of collecting specific knowledge via crowdsourcing. Using an external knowledge base such as WordNet, we incorporate the semantic relations between alternative answers into a probabilistic model to determine which answer is more specific. We formulate the probabilistic model from a basic assumption, taking into account both worker ability and task difficulty, and solve it with the expectation-maximization (EM) algorithm. To make the algorithm more widely applicable, we also extend our method to a semi-supervised variant. Experimental results show that our approach is robust to hyper-parameter settings and outperforms majority voting and other algorithms when more specific answers are expected, especially on sparse data.
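
To illustrate the gap between majority voting and a more specific answer, the following is a minimal Python sketch, not the probabilistic EM model described in the abstract. It assumes a hand-coded toy hypernym table (`HYPERNYM`) standing in for a knowledge base such as WordNet, and a hypothetical scoring heuristic (`specificity_aware_vote`, with an assumed `ancestor_weight` parameter) that lets a candidate inherit a discounted share of the votes cast for its more general ancestors. It ignores worker ability and task difficulty entirely.

```python
from collections import Counter

# Hypothetical toy hierarchy: child -> parent (hypernym). A real system might
# derive this from a knowledge base such as WordNet; it is hand-coded here.
HYPERNYM = {"husky": "dog", "poodle": "dog", "dog": "animal"}


def ancestors(label):
    """Return all hypernyms of `label`, nearest first."""
    chain = []
    while label in HYPERNYM:
        label = HYPERNYM[label]
        chain.append(label)
    return chain


def majority_vote(answers):
    """Plain majority voting: the most frequent answer wins."""
    return Counter(answers).most_common(1)[0][0]


def specificity_aware_vote(answers, ancestor_weight=0.5):
    """Illustrative heuristic (an assumption, not the paper's model):
    score = own votes + ancestor_weight * votes cast for any hypernym."""
    counts = Counter(answers)
    scores = {}
    for label in counts:
        inherited = sum(counts[a] for a in ancestors(label))
        scores[label] = counts[label] + ancestor_weight * inherited
    # Break ties in favour of the deeper (more specific) label.
    return max(scores, key=lambda l: (scores[l], len(ancestors(l))))


answers = ["dog", "dog", "dog", "husky", "husky", "poodle"]
print(majority_vote(answers))           # the common answer
print(specificity_aware_vote(answers))  # a more specific answer
```

On this toy input, majority voting returns the general label, while the heuristic prefers the specific label that is both directly supported and consistent with the general votes. A model along the lines of the abstract would replace the fixed weight with quantities estimated by EM.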


Cited By

  • (2023) Attribute augmentation-based label integration for crowdsourcing. Frontiers of Computer Science: Selected Publications from Chinese Universities 17:5. DOI: 10.1007/s11704-022-2225-z. Online publication date: 1-Oct-2023
  • (2023) Hint: harnessing the wisdom of crowds for handling multi-phase tasks. Neural Computing and Applications 35:31 (22911-22933). DOI: 10.1007/s00521-021-06825-7. Online publication date: 1-Nov-2023

      Published In

      Frontiers of Computer Science: Selected Publications from Chinese Universities, Volume 15, Issue 4
      Aug 2021
      200 pages
      ISSN:2095-2228
      EISSN:2095-2236

      Publisher

      Springer-Verlag

      Berlin, Heidelberg

      Publication History

      Published: 01 August 2021
      Accepted: 03 April 2020
      Received: 24 September 2019

      Author Tags

      1. crowdsourcing
      2. knowledge acquisition
      3. EM algorithm
      4. label aggregation
