A human-centered systematic literature review of the computational approaches for online sexual risk detection

A Razi, S Kim, A Alsoubai, G Stringhini… - Proceedings of the …, 2021 - dl.acm.org
Proceedings of the ACM on Human-Computer Interaction, 2021 - dl.acm.org
In the era of big data and artificial intelligence, online risk detection has become a popular research topic. From detecting online harassment to the sexual predation of youth, the state of the art in computational risk detection has the potential to protect particularly vulnerable populations from online victimization. Yet, this is a high-risk, high-reward endeavor that requires a systematic and human-centered approach to synthesize disparate bodies of research across different application domains, so that we can identify best practices and potential gaps and set a strategic research agenda for leveraging these approaches in a way that betters society. Therefore, we conducted a comprehensive literature review to analyze 73 peer-reviewed articles on computational approaches utilizing text or meta-data/multimedia for online sexual risk detection. We identified sexual grooming (75%), sex trafficking (12%), and sexual harassment and/or abuse (12%) as the three types of sexual risk detection present in the extant literature. Furthermore, we found that the majority (93%) of this work has focused on identifying sexual predators after the fact, rather than taking more nuanced approaches to identify potential victims and problematic patterns that could be used to prevent victimization before it occurs. Many studies rely on public datasets (82%) and third-party annotators (33%) to establish ground truth and train their algorithms. Finally, the majority of this work (78%) focused on evaluating the algorithmic performance of its models and rarely (4%) evaluated these systems with real users. Thus, we urge computational risk detection researchers to integrate more human-centered approaches to both developing and evaluating sexual risk detection algorithms to ensure the broader societal impacts of this important work.