DOI: 10.1145/3415958.3433087
Short paper

A new conceptual framework for enhancing legal information retrieval at the Brazilian Superior Court of Justice

Published: 27 November 2020

Abstract

Effective retrieval of jurisprudence (case law) is imperative for the consistency and predictability of any legal system. In this work, we propose and empirically evaluate a framework for jurisprudence retrieval at the Brazilian Superior Court of Justice, intended to ease the retrieval of other decisions that share the same legal opinion. The experimental results show that our approach, based on text similarity, performs better than the Court's legacy system, which is based on Boolean queries. Building complex Boolean queries is a highly specialized task, so we aim to offer a tool that accepts free-text queries without any operators. With the legacy system as the baseline, we compare the traditional TF-IDF retrieval model, the BM25 probabilistic model, and the Word2Vec model. Our results indicate that the Word2Vec Skip-Gram model trained on a specialized legal corpus and BM25 yield similar performance, and both surpass the legacy system. Combining BM25 with embedding models improved performance by up to 19%.
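
The abstract does not say how the BM25 and embedding scores were combined, so the following is only a minimal sketch of one common option: min-max normalize both score lists and take a weighted sum. The function names, the `alpha` weight, the toy documents, and the two-dimensional word vectors are all illustrative assumptions, not the paper's method or data. In practice the vectors would come from a Word2Vec model trained on a legal corpus.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document for a tokenized query."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(term for d in docs for term in set(d))  # document frequency
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores

def mean_vector(tokens, vectors):
    """Average the vectors of known tokens; None if no token is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    return [sum(v[i] for v in known) / len(known) for i in range(len(known[0]))]

def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is missing or zero."""
    if u is None or v is None:
        return 0.0
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def min_max(values):
    """Rescale scores to [0, 1] so both signals are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def hybrid_rank(query, docs, vectors, alpha=0.5):
    """Rank document indices by alpha * BM25 + (1 - alpha) * embedding cosine."""
    lexical = min_max(bm25_scores(query, docs))
    q_vec = mean_vector(query, vectors)
    semantic = min_max([cosine(q_vec, mean_vector(d, vectors)) for d in docs])
    combined = [alpha * l + (1 - alpha) * s for l, s in zip(lexical, semantic)]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)

# Toy, hypothetical data: three tokenized decisions and hand-made 2-d vectors.
docs = [
    ["habeas", "corpus", "prisao", "preventiva"],
    ["recurso", "especial", "contrato", "bancario"],
    ["custodia", "cautelar", "liberdade"],
]
vectors = {
    "prisao": [1.0, 0.1], "preventiva": [0.8, 0.3],
    "custodia": [0.9, 0.2], "cautelar": [0.85, 0.25],
    "contrato": [0.1, 1.0], "bancario": [0.0, 0.9],
}
query = ["prisao", "preventiva"]  # free text, no Boolean operators

lexical_only = bm25_scores(query, docs)
ranking = hybrid_rank(query, docs, vectors)
print(lexical_only, ranking)
```

In this toy setup the third document shares no term with the query, so BM25 alone cannot separate it from the unrelated second document, while the embedding term can place it higher. That is the kind of complementary signal a combined model relies on.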

      Published In

      MEDES '20: Proceedings of the 12th International Conference on Management of Digital EcoSystems
      November 2020
      170 pages
      ISBN:9781450381154
      DOI:10.1145/3415958
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. E-Discovery
      2. Information Retrieval
      3. Natural Language Processing
      4. Word Embedding

      Qualifiers

      • Short-paper
      • Research
      • Refereed limited

      Conference

      MEDES '20
      MEDES '20: 12th International Conference on Management of Digital EcoSystems
      November 2 - 4, 2020
      Virtual Event, United Arab Emirates

      Acceptance Rates

      MEDES '20 Paper Acceptance Rate 19 of 27 submissions, 70%;
      Overall Acceptance Rate 267 of 682 submissions, 39%

      Article Metrics

      • Downloads (last 12 months): 28
      • Downloads (last 6 weeks): 3
      Reflects downloads up to 16 Nov 2024

      Cited By

      • (2024) Building a relevance feedback corpus for legal information retrieval in the real-case scenario of the Brazilian Chamber of Deputies. Language Resources and Evaluation. DOI: 10.1007/s10579-024-09767-3. Online publication date: 18-Aug-2024
      • (2024) HIRS: A Hybrid Information Retrieval System for Legislative Documents. Progress in Artificial Intelligence, 320-331. DOI: 10.1007/978-3-031-73497-7_26. Online publication date: 16-Nov-2024
      • (2023) Automated Keyphrase Generation for Brazilian Legal Information Retrieval. 2023 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN54540.2023.10191598. Online publication date: 18-Jun-2023
      • (2022) Ulysses-RFSQ: A Novel Method to Improve Legal Information Retrieval Based on Relevance Feedback. Intelligent Systems, 77-91. DOI: 10.1007/978-3-031-21686-2_6. Online publication date: 28-Nov-2022
      • (2021) Development and Evaluation of an Intelligence and Learning System in Jurisprudence Text Mining in the Field of Competition Defense. Applied Sciences 11:23 (11365). DOI: 10.3390/app112311365. Online publication date: 1-Dec-2021
