Anomaly Detection in Crowdsourced Work with Interval-Valued Labels

  • Conference paper
Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2022)

Abstract

Crowdsourcing is an emerging paradigm in AI and machine learning. It involves gathering input from human crowds, usually through the Internet, to solve a given task. Because of its open nature, the selected crowd-workers usually come from a variety of socioeconomic backgrounds and bring differing levels of reliability. There is also a threat of people with adversarial intentions launching attacks to derail crowdsourced projects. In this paper, we apply interval-valued labels (IVLs) and worker reliability to detect anomalous behavior in crowd-workers. Three of the four worker reliability measures (confidence, stability, and predictability) do not rely on the correctness of a worker’s label [28]. Therefore, by comparing a worker’s IVLs on gold questions with those on regular questions, we may detect anomalous workers. In our computational experiments, this approach successfully identified adversarial attackers, supporting quality assurance in crowdsourcing.
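The comparison of gold and regular questions described above can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not define the reliability measures, so the sketch assumes interval width as a rough stand-in for a correctness-free measure, and it assumes a Welch two-sample t statistic with a hypothetical cutoff of 3.0 as the comparison. The names `widths`, `welch_t`, `is_anomalous`, and all data values are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def widths(ivls):
    """Width hi - lo of each interval-valued label (lo, hi)."""
    return [hi - lo for lo, hi in ivls]


def welch_t(a, b):
    """Welch's two-sample t statistic for samples a and b (each of size >= 2)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # Both samples are constant: no spread to normalize by.
        return 0.0 if mean(a) == mean(b) else float("inf")
    return (mean(a) - mean(b)) / se


def is_anomalous(gold_ivls, regular_ivls, threshold=3.0):
    """Flag a worker whose interval widths differ sharply between
    gold and regular questions. The threshold is a hypothetical choice."""
    t = welch_t(widths(gold_ivls), widths(regular_ivls))
    return abs(t) > threshold


# Made-up worker who answers gold questions carefully (narrow intervals)
# and regular questions carelessly (wide intervals).
attacker_gold = [(0.7, 0.8), (0.75, 0.85), (0.6, 0.7), (0.8, 0.9)]
attacker_regular = [(0.1, 0.9), (0.0, 0.7), (0.2, 1.0), (0.1, 0.8)]

# Made-up worker who behaves the same way on both kinds of question.
honest_gold = [(0.6, 0.8), (0.55, 0.8), (0.7, 0.9), (0.6, 0.85)]
honest_regular = [(0.5, 0.7), (0.6, 0.85), (0.65, 0.85), (0.4, 0.65)]

print(is_anomalous(attacker_gold, attacker_regular))
print(is_anomalous(honest_gold, honest_regular))
```

With these illustrative numbers the first worker is flagged and the second is not. Because the check never looks at whether a label is correct, it can be applied to regular questions, whose true answers are unknown, as well as to gold questions.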

This work is partially supported by the US National Science Foundation through the grant award NSF/OIA-1946391.

Notes

  1. https://www.mturk.com/.

  2. http://crowdflowersites.com/.

References

  1. Checco, A., Bates, J., Demartini, G.: Adversarial attacks on crowdsourcing quality control. J. Artif. Intell. Res. 67, 375–408 (2020)

  2. Corliss, G.F., Hu, C., Kearfott, R.B., Walster, G.W.: Rigorous Global Search - Executive Summary. Technical report (1997)

  3. Dai, J., Wang, W., Mi, J.: Uncertainty measurement for interval-valued information systems. Inf. Sci. 251, 63–78 (2013)

  4. Duan, Q., Hu, C., Wei, H.: Enhancing network intrusion detection systems with interval methods. In: SAC 2005: Proceedings of the 2005 ACM Symposium on Applied Computing, pp. 1444–1448 (2005)

  5. Gan, Q., Yang, Q., Hu, C.: Parallel all-row preconditioned interval linear solver for nonlinear equations on multiprocessors. Parallel Comput. 20(9), 1249–1268 (1994)

  6. Goldstein, M., Uchida, S.: A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data. PLoS ONE 11(4), e0152173 (2016)

  7. He, L., Hu, C.: Impacts of interval computing on stock market forecasting. J. Comput. Econ. 33(3), 263–276 (2009). https://doi.org/10.1007/s10614-008-9159-x

  8. Hu, C., Frolov, A., Kearfott, R., Yang, Q.: A general iterative sparse linear solver and its parallelization for interval Newton methods. Reliable Comput. 1, 251–263 (1995)

  9. Hu, C., Cardenas, A., Hoogendoorn, S., et al.: An interval polynomial interpolation problem and its Lagrange solution. Reliable Comput. 4, 27–38 (1998)

  10. Hu, C.: Using interval function approximation to estimate uncertainty. In: Huynh, V.N., Nakamori, Y., Ono, H., Lawry, J., Kreinovich, V., Nguyen, H.T. (eds.) Interval/Probabilistic Uncertainty and Non-Classical Logics. Advances in Soft Computing, vol. 46. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-77664-2_26

  11. Hu, C., et al.: Knowledge Processing with Interval and Soft Computing. Springer, London (2008). https://doi.org/10.1007/978-1-84800-326-2

  12. Hu, C., He, L.: An application of interval methods to stock market forecasting. J. Reliab. Comput. 13, 423–434 (2007). https://doi.org/10.1007/s11155-007-9039-4

  13. Hu, C.: Interval function and its linear least-squares approximation. In: ACM SNC 2011: Proceedings of the 2011 International Workshop on Symbolic-Numeric Computation, pp. 16–23. ACM (2012)

  14. Hu, C., Hu, Z.H.: On statistics, probability, and entropy of interval-valued datasets. In: Lesot, M.J., et al. (eds.) IPMU 2020. CCIS, vol. 1239, pp. 407–421. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-50153-2_31

  15. Hu, C., Hu, Z.H.: A computational study on the entropy of interval-valued datasets from the stock market. In: Lesot, M.J., et al. (eds.) IPMU 2020. CCIS, vol. 1239, pp. 422–435. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-50153-2_32

  16. Hu, C., Sheng, V.S., Wu, N., Wu, X.: Managing uncertainty in crowdsourcing with interval-valued labels. In: Rayz, J., Raskin, V., Dick, S., Kreinovich, V. (eds.) NAFIPS 2021. LNNS, vol. 258, pp. 166–178. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-82099-2_15

  17. Hu, P., Dellar, M., Hu, C.: Task scheduling on flow networks with temporal uncertainty. In: 2007 IEEE Symposium on Foundations of Computational Intelligence, pp. 128–135 (2007)

  18. de Korvin, A., Hu, C., Chen, P.: Generating and applying rules for interval valued fuzzy observations. In: Yang, Z.R., Yin, H., Everson, R.M. (eds.) IDEAL 2004. LNCS, vol. 3177, pp. 279–284. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-28651-6_41

  19. Kong, X., Song, X., Xia, F., Guo, H., Wang, J., Tolba, A.: LoTAD: long-term traffic anomaly detection based on crowdsourced bus trajectory data. World Wide Web 21(3), 825–847 (2017). https://doi.org/10.1007/s11280-017-0487-4

  20. Li, Y., Sun, J., Huang, W., Tian, X.: Detecting anomaly in large-scale network using mobile crowdsourcing. In: IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, pp. 2179–2187 (2019)

  21. Li, H., Liu, Q.: Cheaper and Better: Selecting Good Workers for Crowdsourcing. arXiv:1502.00725 (2015)

  22. Marupally, P., Paruchuri, V.S., Hu, C.: Bandwidth variability prediction with rolling interval least squares (RILS). In: Proceedings of the 50th ACM SE Conference, Tuscaloosa, AL, USA, 29–31 March 2012, pp. 209–213. ACM (2012)

  23. NIST: Do two processes have the same mean? https://www.itl.nist.gov/div898/handbook/prc/section3/prc31.htm

  24. NIST: F-test. www.itl.nist.gov/div898/handbook/eda/section3/eda359.htm

  25. Nordin, B., Hu, C., Chen, B., Sheng, V.S.: Interval-valued centroids in K-means algorithms. In: Proceedings of the 11th IEEE International Conference on Machine Learning and Applications (ICMLA), Boca Raton, FL, USA, pp. 478–481. IEEE (2012)

  26. Rhodes, C., Lemon, J., Hu, C.: An interval-radial algorithm for hierarchical clustering analysis. In: 14th IEEE International Conference on Machine Learning and Applications (ICMLA), Miami, FL, USA, pp. 849–856. IEEE (2015)

  27. Shannon, C.-E.: A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 (1948)

  28. Spurling, M., Hu, C., Sheng, V.S., Zhang, H.: Estimating crowd-worker’s reliability with interval-valued labels to improve the quality of crowdsourced work (2021, to appear)

  29. Qiu, L., et al.: CrowdSelect: increasing accuracy of crowdsourcing tasks through behavior prediction and user selection. In: Proceedings of the 25th ACM International Conference on Information and Knowledge Management, pp. 539–548 (2016)

  30. Wang, G., Wang, T., Zheng, H., Zhao, B.: Man vs. machine: practical adversarial detection of malicious crowdsourcing workers. In: Proceedings of the 23rd USENIX Security Symposium, San Diego, CA, USA, 20–22 August 2014, pp. 239–254. USENIX Association (2014)

  31. Yu, H., Liu, Y., et al.: Fair and explainable dynamic engagement of crowd workers. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, pp. 6575–6577 (2019)

  32. Zhang, J., Sheng, V.S., Nicholson, B., Wu, X.: CEKA: a tool for mining the wisdom of crowds. J. Mach. Learn. Res. 16, 2853–2858 (2015)

  33. Zhao, Y., et al.: Outlier detection for streaming task assignment in crowdsourcing. In: WWW (2022)

Author information

Corresponding authors

Correspondence to Makenzie Spurling or Chenyi Hu.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Spurling, M., Hu, C., Zhan, H., Sheng, V.S. (2022). Anomaly Detection in Crowdsourced Work with Interval-Valued Labels. In: Ciucci, D., et al. Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2022. Communications in Computer and Information Science, vol 1601. Springer, Cham. https://doi.org/10.1007/978-3-031-08971-8_42

  • DOI: https://doi.org/10.1007/978-3-031-08971-8_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-08970-1

  • Online ISBN: 978-3-031-08971-8

  • eBook Packages: Computer Science, Computer Science (R0)
