Abstract
Crowdsourcing is an emerging paradigm in AI and machine learning. It involves gathering input from human crowds, usually through the Internet, to solve a given task. Because it is open to anyone, the selected crowd-workers usually come from a variety of socio-economic backgrounds and differ in reliability. There is also a threat that people with adversarial intentions will launch attacks to derail crowdsourced projects. In this paper, we apply interval-valued labels (IVLs) and worker reliability measures to detect anomalous behavior in crowd-workers. Three of the four worker reliability measures (confidence, stability, and predictability) do not rely on the correctness of a worker’s label [28]. Therefore, by comparing a worker’s IVLs on gold questions with those on regular questions, we may detect anomalous workers. In our computational experiments, this approach successfully identified adversarial attackers, supporting quality assurance in crowdsourcing.
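The gold-versus-regular comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not define the reliability measures or the statistical test, so everything below is an assumption made for illustration: interval width stands in for a correctness-free measure, a Welch-style t statistic compares the two sets of widths, and the names `widths`, `welch_t`, `looks_anomalous`, and the cutoff of 2.0 are hypothetical rather than the authors' method.

```python
from math import sqrt
from statistics import mean, variance


def widths(ivls):
    """Width of each interval-valued label given as a (lower, upper) pair."""
    return [upper - lower for lower, upper in ivls]


def welch_t(a, b):
    """Welch-style t statistic for the difference between two sample means."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # Degenerate case: both samples are constant.
        return 0.0 if mean(a) == mean(b) else float("inf")
    return (mean(a) - mean(b)) / se


def looks_anomalous(gold_ivls, regular_ivls, threshold=2.0):
    """Flag a worker whose label widths differ sharply between gold and
    regular questions. The threshold is a hypothetical cutoff, not a value
    taken from the paper."""
    t = welch_t(widths(gold_ivls), widths(regular_ivls))
    return abs(t) > threshold


# Made-up labels: a worker who behaves the same on both question types.
steady_gold = [(0.6, 0.8), (0.7, 0.9), (0.55, 0.8), (0.6, 0.85)]
steady_regular = [(0.5, 0.7), (0.6, 0.85), (0.65, 0.85), (0.7, 0.95)]

# Made-up labels: narrow intervals on gold questions, wide ones elsewhere.
shifty_gold = [(0.85, 0.95), (0.8, 0.9), (0.8, 0.95), (0.9, 1.0)]
shifty_regular = [(0.2, 0.8), (0.1, 0.8), (0.3, 0.8), (0.25, 0.9)]

print(looks_anomalous(steady_gold, steady_regular))
print(looks_anomalous(shifty_gold, shifty_regular))
```

Because the comparison uses only the shape of a worker's intervals, it needs no ground truth for the regular questions, which is the property the abstract attributes to the correctness-free measures.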
This work is partially supported by the US National Science Foundation through the grant award NSF/OIA-1946391.
References
Checco, A., Bates, J., Demartini, G.: Adversarial attacks on crowdsourcing quality control. J. Artif. Intell. Res. 67, 375–408 (2020)
Corliss, G.F., Hu, C., Kearfott, R.B., Walster, G.W.: Rigorous Global Search - Executive Summary. Technical report (1997)
Dai, J., Wang, W., Mi, J.: Uncertainty measurement for interval-valued information systems. Inf. Sci. 251, 63–78 (2013)
Duan, Q., Hu, C., Wei, H.: Enhancing network intrusion detection systems with interval methods. In: SAC 2005: Proceedings of the 2005 ACM Symposium on Applied Computing, pp. 1444–1448 (2005)
Gan, Q., Yang, Q., Hu, C.: Parallel all-row preconditioned interval linear solver for nonlinear equations on multiprocessors. Parallel Comput. 20(9), 1249–1268 (1994)
Goldstein, M., Uchida, S.: A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data. PLoS ONE 11(4), e0152173 (2016)
He, L., Hu, C.: Impacts of interval computing on stock market forecasting. J. Comput. Econ. 33(3), 263–276 (2009). https://doi.org/10.1007/s10614-008-9159-x
Hu, C., Frolov, A., Kearfott, R., Yang, Q.: A general iterative sparse linear solver and its parallelization for interval Newton methods. Reliable Comput. 1, 251–263 (1995)
Hu, C., Cardenas, A., Hoogendoorn, S., et al.: An interval polynomial interpolation problem and its Lagrange solution. Reliable Comput. 4, 27–38 (1998)
Hu, C.: Using interval function approximation to estimate uncertainty. In: Huynh, V.N., Nakamori, Y., Ono, H., Lawry, J., Kreinovich, V., Nguyen, H.T. (eds.) Interval/Probabilistic Uncertainty and Non-Classical Logics. Advances in Soft Computing, vol. 46. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-77664-2_26
Hu, C., et al.: Knowledge Processing with Interval and Soft Computing. Springer, London (2008). https://doi.org/10.1007/978-1-84800-326-2
Hu, C., He, L.: An application of interval methods to stock market forecasting. J. Reliab. Comput. 13, 423–434 (2007). https://doi.org/10.1007/s11155-007-9039-4
Hu, C.: Interval function and its linear least-squares approximation. In: ACM SNC 2011: Proceedings of the 2011 International Workshop on Symbolic-Numeric Computation, pp. 16–23. ACM (2012)
Hu, C., Hu, Z.H.: On statistics, probability, and entropy of interval-valued datasets. In: Lesot, M.J., et al. (eds.) IPMU 2020. CCIS, vol. 1239, pp. 407–421. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-50153-2_31
Hu, C., Hu, Z.H.: A computational study on the entropy of interval-valued datasets from the stock market. In: Lesot, M.J., et al. (eds.) IPMU 2020. CCIS, vol. 1239, pp. 422–435. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-50153-2_32
Hu, C., Sheng, V.S., Wu, N., Wu, X.: Managing uncertainty in crowdsourcing with interval-valued labels. In: Rayz, J., Raskin, V., Dick, S., Kreinovich, V. (eds.) NAFIPS 2021. LNNS, vol. 258, pp. 166–178. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-82099-2_15
Hu, P., Dellar, M., Hu, C.: Task scheduling on flow networks with temporal uncertainty. In: 2007 IEEE Symposium on Foundations of Computational Intelligence, pp. 128–135 (2007)
de Korvin, A., Hu, C., Chen, P.: Generating and applying rules for interval valued fuzzy observations. In: Yang, Z.R., Yin, H., Everson, R.M. (eds.) IDEAL 2004. LNCS, vol. 3177, pp. 279–284. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-28651-6_41
Kong, X., Song, X., Xia, F., Guo, H., Wang, J., Tolba, A.: LoTAD: long-term traffic anomaly detection based on crowdsourced bus trajectory data. World Wide Web 21(3), 825–847 (2017). https://doi.org/10.1007/s11280-017-0487-4
Li, Y., Sun, J., Huang, W., Tian, X.: Detecting anomaly in large-scale network using mobile crowdsourcing. In: IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, pp. 2179–2187 (2019)
Li, H., Liu, Q.: Cheaper and Better: Selecting Good Workers for Crowdsourcing. arXiv:1502.00725 (2015)
Marupally, P., Paruchuri, V.S., Hu, C.: Bandwidth variability prediction with rolling interval least squares (RILS). In: Proceedings of the 50th ACM SE Conference, Tuscaloosa, AL, USA, 29–31 March 2012, pp. 209–213. ACM (2012)
NIST: Do two processes have the same mean? https://www.itl.nist.gov/div898/handbook/prc/section3/prc31.htm
NIST: F-test. https://www.itl.nist.gov/div898/handbook/eda/section3/eda359.htm
Nordin, B., Hu, C., Chen, B., Sheng, V.S.: Interval-valued centroids in K-means algorithms. In: Proceedings of the 11th IEEE International Conference on Machine Learning and Applications (ICMLA), Boca Raton, FL, USA, pp. 478–481. IEEE (2012)
Rhodes, C., Lemon, J., Hu, C.: An interval-radial algorithm for hierarchical clustering analysis. In: 14th IEEE International Conference on Machine Learning and Applications (ICMLA), Miami, FL, USA, pp. 849–856. IEEE (2015)
Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 (1948)
Spurling, M., Hu, C., Sheng, V.S., Zhang, H.: Estimating crowd-worker’s reliability with interval-valued labels to improve the quality of crowdsourced work (2021, to appear)
Qiu, L., et al.: CrowdSelect: increasing accuracy of crowdsourcing tasks through behavior prediction and user selection. In: Proceedings of the 25th ACM International Conference on Information and Knowledge Management, pp. 539–548 (2016)
Wang, G., Wang, T., Zheng, H., Zhao, B.: Man vs. machine: practical adversarial detection of malicious crowdsourcing workers. In: Proceedings of the 23rd USENIX Security Symposium, San Diego, CA, USA, 20–22 August 2014, pp. 239–254. USENIX Association (2014)
Yu, H., Liu, Y., et al.: Fair and explainable dynamic engagement of crowd workers. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, pp. 6575–6577 (2019)
Zhang, J., Sheng, V.S., Nicholson, B., Wu, X.: CEKA: a tool for mining the wisdom of crowds. J. Mach. Learn. Res. 16, 2853–2858 (2015)
Zhao, Y., et al.: Outlier detection for streaming task assignment in crowdsourcing. In: WWW (2022)
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Spurling, M., Hu, C., Zhan, H., Sheng, V.S. (2022). Anomaly Detection in Crowdsourced Work with Interval-Valued Labels. In: Ciucci, D., et al. Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2022. Communications in Computer and Information Science, vol 1601. Springer, Cham. https://doi.org/10.1007/978-3-031-08971-8_42
DOI: https://doi.org/10.1007/978-3-031-08971-8_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08970-1
Online ISBN: 978-3-031-08971-8
eBook Packages: Computer Science, Computer Science (R0)