DOI: 10.1145/3428502.3428507
research-article

Mapping crime descriptions to law articles using deep learning

Published: 29 October 2020

Abstract

In the operational systems of the Dutch Public Prosecution Service, data about criminal cases are registered. This information is used to generate crime statistics or other management information as input for policymaking. A key element of these statistics is the crime type of a case, which is normally deduced from the law articles registered for each case. However, the quality of these registered law articles has shortcomings. Additional data describing the crime could help to improve the quality of these law articles. In this paper we investigate whether additional descriptions of the crime can be mapped to the formal notation of law articles using a deep learning neural network approach called sequence-to-sequence learning. We describe the characteristics of the data and carry out a number of experiments on them. Subsequently, we compare two approaches to representing the words in a sentence: a) one-hot encoding and b) pre-trained word embeddings. The results show that mapping crime descriptions to law articles works reasonably well: a measured accuracy of 91% is reached, and if some issues in the dataset were resolved, performance could be even higher.
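
The difference between the two input representations compared in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the example descriptions, the vocabulary, and the three-dimensional "pre-trained" vectors are all invented for illustration, and the helper names (`build_vocab`, `one_hot_encode`, `embed`) are hypothetical. In a sequence-to-sequence model, either representation would be fed token by token into the encoder.

```python
# Toy illustration only: the descriptions and the 3-dimensional vectors
# below are invented for this sketch and do not come from the paper.

def build_vocab(sentences):
    """Assign each distinct lowercase token an integer index, in first-seen order."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def one_hot_encode(sentence, vocab):
    """Represent each token as a vector of length len(vocab) containing a single 1."""
    vectors = []
    for token in sentence.lower().split():
        vec = [0] * len(vocab)
        vec[vocab[token]] = 1
        vectors.append(vec)
    return vectors


def embed(sentence, embeddings, dim):
    """Look up a dense vector per token; unknown tokens map to a zero vector."""
    return [embeddings.get(token, [0.0] * dim) for token in sentence.lower().split()]


descriptions = ["theft of a bicycle", "theft with violence"]
vocab = build_vocab(descriptions)

# Hypothetical stand-in for a pre-trained embedding table.
toy_embeddings = {
    "theft": [0.9, 0.1, 0.0],
    "bicycle": [0.2, 0.8, 0.1],
    "violence": [0.7, 0.0, 0.6],
}

one_hot = one_hot_encode(descriptions[0], vocab)
dense = embed(descriptions[0], toy_embeddings, dim=3)

print(len(one_hot[0]))  # vector length grows with the vocabulary size
print(len(dense[0]))    # vector length is fixed by the embedding dimension
```

The one-hot vectors are sparse and carry no notion of similarity between words, whereas the dense vectors have a fixed size and can place related words close together, which is the usual motivation for trying pre-trained embeddings.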

    Published In

    ICEGOV '20: Proceedings of the 13th International Conference on Theory and Practice of Electronic Governance
    September 2020
    880 pages
    ISBN:9781450376747
    DOI:10.1145/3428502
    In-Cooperation

    • University of the Aegean

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Natural language processing
    2. crime type
    3. data quality
    4. deep learning
    5. sequence-to-sequence learning
    6. word embeddings

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICEGOV 2020

    Acceptance Rates

    ICEGOV '20 Paper Acceptance Rate 79 of 209 submissions, 38%;
    Overall Acceptance Rate 350 of 865 submissions, 40%
