research-article

Find truth in the hands of the few: acquiring specific knowledge with crowdsourcing

Authors:

Xudong Liu

Frontiers of Computer Science, Volume 15, Issue 4

https://doi.org/10.1007/s11704-020-9364-x

Published: 01 August 2021

Abstract

Crowdsourcing is a useful mechanism for leveraging human intelligence to acquire knowledge. However, aggregating crowd answers with existing voting algorithms often yields common knowledge rather than the more specific knowledge that is wanted. In this paper, we consider the problem of collecting specific knowledge via crowdsourcing. Using an external knowledge base such as WordNet, we incorporate the semantic relations between alternative answers into a probabilistic model that determines which answer is more specific. We formulate the model from a basic assumption, taking into account both each worker’s ability and each task’s difficulty, and solve it with the expectation-maximization (EM) algorithm. To increase the algorithm’s compatibility, we also extend our method to a semi-supervised variant. Experimental results show that our approach is robust to hyper-parameter choices and outperforms majority voting and other algorithms when more specific answers are expected, especially on sparse data.
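To make the contrast with majority voting concrete, here is a minimal Python sketch. It is not the paper's probabilistic model: it uses no EM, no worker ability, and no task difficulty. The hand-written `HYPERNYM` map stands in for a knowledge base such as WordNet, and the `specificity_aware_vote` heuristic and its `discount` parameter are assumptions made only for illustration. The idea shown is that a vote for a general answer (a hypernym) is still consistent with a more specific answer, so it can lend that answer partial support.

```python
from collections import Counter

# Hypothetical toy hierarchy (child -> parent); a stand-in for WordNet hypernym links.
HYPERNYM = {"beagle": "dog", "dog": "animal", "tabby": "cat", "cat": "animal"}


def ancestors(label):
    """Return all hypernyms of `label`, nearest first."""
    chain = []
    while label in HYPERNYM:
        label = HYPERNYM[label]
        chain.append(label)
    return chain


def majority_vote(answers):
    """Plain majority voting: the most frequent answer wins."""
    return Counter(answers).most_common(1)[0][0]


def specificity_aware_vote(answers, discount=0.8):
    """Illustrative heuristic (an assumption, not the paper's method).

    Each submitted answer is scored by its own votes plus the votes for its
    hypernyms, discounted by `discount` per level of distance.
    """
    counts = Counter(answers)
    scores = {}
    for candidate in counts:
        score = float(counts[candidate])
        for depth, ancestor in enumerate(ancestors(candidate), start=1):
            score += (discount ** depth) * counts.get(ancestor, 0)
        scores[candidate] = score
    best = max(scores, key=scores.get)
    return best, scores


answers = ["dog", "dog", "dog", "beagle", "beagle", "animal"]
general = majority_vote(answers)
specific, scores = specificity_aware_vote(answers)
print(general, specific)
```

On this toy input, majority voting returns the common answer `dog`, while the heuristic prefers `beagle` because the votes for `dog` and `animal` are compatible with it. A single careless vote for an unrelated specific label would gain the same support, which is one reason a model of worker ability and task difficulty, as described in the abstract, is needed in practice.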

Cited By

Zhang Y, Jiang L, Li C (2023). Attribute augmentation-based label integration for crowdsourcing. Frontiers of Computer Science: Selected Publications from Chinese Universities 17:5. Online publication date: 1-Oct-2023.
https://dl.acm.org/doi/10.1007/s11704-022-2225-z
Fang Y, Chen P, Han T (2023). Hint: harnessing the wisdom of crowds for handling multi-phase tasks. Neural Computing and Applications 35:31, 22911-22933. Online publication date: 1-Nov-2023.
https://dl.acm.org/doi/10.1007/s00521-021-06825-7

Index Terms

Find truth in the hands of the few: acquiring specific knowledge with crowdsourcing
1. Computing methodologies
  1. Artificial intelligence
    1. Knowledge representation and reasoning
2. Information systems
  1. Information systems applications

Index terms have been assigned to the content through auto-classification.

Published In

Frontiers of Computer Science: Selected Publications from Chinese Universities Volume 15, Issue 4

Aug 2021

200 pages

ISSN:2095-2228

EISSN:2095-2236

© Higher Education Press 2020.

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 01 August 2021

Accepted: 03 April 2020

Received: 24 September 2019
