Error-Tolerant E-Discovery Protocols

Authors:

Jason D. Hartline,

Aravindan Vijayaraghavan

CSLAW '24: Proceedings of the Symposium on Computer Science and Law

Pages 24 - 35

https://doi.org/10.1145/3614407.3643703

Published: 12 March 2024

Abstract

We consider the multi-party classification problem introduced by Dong, Hartline, and Vijayaraghavan (2022) in the context of electronic discovery (e-discovery). Based on a request for production from the requesting party, the responding party is required to provide documents that are responsive to the request, except for those that are legally privileged. Our goal is to find a protocol that verifies that the responding party sends almost all responsive documents while minimizing the disclosure of non-responsive documents. We provide protocols in the challenging non-realizable setting, where the instance may not be perfectly separated by a linear classifier. We demonstrate empirically that our protocol finds almost all relevant documents while incurring only a small disclosure of non-responsive documents. We complement this with a theoretical analysis of our protocol in the single-dimensional setting, and with further experiments on simulated data, which suggest that the non-responsive disclosure incurred by our protocol may be unavoidable.

Supplementary Material

dong.zip (494.41 KB): Supplemental movie, appendix, image and software files for Error-Tolerant E-Discovery Protocols

Appendix.pdf (516.24 KB): Appendix

References

[1]

Pranjal Awasthi, Maria Florina Balcan, and Philip M. Long. 2017. The Power of Localization for Efficiently Learning Linear Separators with Noise. J. ACM 63, 6, Article 50 (Jan. 2017), 27 pages. https://doi.org/10.1145/3006384

[2]

Wei-Cheng Chang, Felix X Yu, Yin-Wen Chang, Yiming Yang, and Sanjiv Kumar. 2020. Pre-training tasks for embedding-based large-scale retrieval. arXiv preprint arXiv:2002.03932 (2020).

[3]

Kenneth L Clarkson. 1994. More output-sensitive geometric algorithms. In Proceedings 35th Annual Symposium on Foundations of Computer Science. IEEE, 695--702.

[4]

Gordon V Cormack and Maura R Grossman. 2014. Evaluation of machine-learning protocols for technology-assisted review in electronic discovery. In Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval. 153--162.

[5]

Gordon V Cormack and Maura R Grossman. 2017. Technology-Assisted Review in Empirical Medicine: Waterloo Participation in CLEF eHealth 2017. CLEF (working notes) 11 (2017).

[6]

Gordon V Cormack and Mona Mojdeh. 2009. Machine Learning for Information Retrieval: TREC 2009 Web, Relevance Feedback and Legal Tracks. In TREC.

[7]

Jinshuo Dong, Jason Hartline, and Aravindan Vijayaraghavan. 2022. Classification Protocols with Minimal Disclosure. In Proceedings of the 2022 Symposium on Computer Science and Law. 67--76.

[8]

Brown v. Tellermate Holdings Ltd. 2014. Case No. 2:11-cv-1122, 2014 U.S. Dist. LEXIS 90123 (2014). https://casetext.com/case/brown-v-tellermate-holdings-ltd

[9]

Hyles v. New York City. 2016. 10 Civ. 3119 (AT)(AJP) (2016). https://casetext.com/case/hyles-v-nyc

[10]

Moore v. Groupe. 2012. 868 F. Supp. 2d 137 (2012). https://casetext.com/case/moore-v-groupe

[11]

O Goldreich, S Micali, and A Wigderson. 1987. How to play ANY mental game. In Proceedings of the nineteenth annual ACM symposium on Theory of computing. 218--229.

[12]

Shafi Goldwasser, Silvio Micali, and Charles Rackoff. 1989. The Knowledge Complexity of Interactive Proof Systems. SIAM J. Comput. 18, 1 (1989), 186--208.

[13]

Shafi Goldwasser, Guy N. Rothblum, Jonathan Shafer, and Amir Yehudayoff. 2021. Interactive Proofs for Verifying Machine Learning. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021) (Leibniz International Proceedings in Informatics (LIPIcs), Vol. 185), James R. Lee (Ed.). Schloss Dagstuhl--Leibniz-Zentrum für Informatik, Dagstuhl, Germany, 41:1--41:19. https://doi.org/10.4230/LIPIcs.ITCS.2021.41

[14]

Maura R Grossman and Gordon V Cormack. 2010. Technology-assisted review in e-discovery can be more effective and more efficient than exhaustive manual review. Rich. JL & Tech. 17 (2010), 1.

[15]

Maura R Grossman and Gordon V Cormack. 2012. Inconsistent responsiveness determination in document review: Difference of opinion or human error. Pace L. Rev. 32 (2012), 267.

[16]

Venkatesan Guruswami and Prasad Raghavendra. 2009. Hardness of Learning Halfspaces with Noise. SIAM J. Comput. 39, 2 (2009), 742--765. https://doi.org/10.1137/070685798

[17]

Bruce Hedin, Stephen Tomlinson, Jason R Baron, and Douglas W Oard. 2009. Overview of the TREC 2009 Legal Track. In TREC.

[18]

Adam Tauman Kalai, Adam R. Klivans, Yishay Mansour, and Rocco A. Servedio. 2008. Agnostically Learning Halfspaces. SIAM J. Comput. 37, 6 (2008), 1777--1805. https://doi.org/10.1137/060649057

[19]

Michael J. Kearns, Robert E. Schapire, and Linda M. Sellie. 1992. Toward Efficient Agnostic Learning. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory (Pittsburgh, Pennsylvania, USA) (COLT '92). Association for Computing Machinery, New York, NY, USA, 341--352. https://doi.org/10.1145/130385.130424

[20]

Michael J. Kearns and Umesh V. Vazirani. 1994. An Introduction to Computational Learning Theory. MIT Press, Cambridge, MA, USA.

[21]

Daniel N Kluttz and Deirdre K Mulligan. 2019. Automated decision support technologies and the legal profession. Berkeley Technology Law Journal 34, 3 (2019), 853--890.

[22]

Antoine Louis and Gerasimos Spanakis. 2021. A statutory article retrieval dataset in French. arXiv preprint arXiv:2108.11792 (2021).

[23]

Alison O'Mara-Eves, James Thomas, John McNaught, Makoto Miwa, and Sophia Ananiadou. 2015. Using text mining for study identification in systematic reviews: a systematic review of current approaches. Systematic reviews 4, 1 (2015), 1--22.

[24]

Allyson Haynes Stuart. 2021. A Right to Privacy for Modern Discovery. Geo. Mason L. Rev. 29 (2021), 675.

[25]

Jan van den Brand, Yin Tat Lee, Aaron Sidford, and Zhao Song. 2020. Solving tall dense linear programs in nearly linear time. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing. 775--788.

[26]

Jie Zou and Evangelos Kanoulas. 2020. Towards question-based high-recall information retrieval: Locating the last few relevant documents for technology-assisted reviews. ACM Transactions on Information Systems (TOIS) 38, 3 (2020), 1--35.

Index Terms

Error-Tolerant E-Discovery Protocols
1. Applied computing
  1. Law, social and behavioral sciences
    1. Law
2. Theory of computation
  1. Computational complexity and cryptography
    1. Interactive proof systems

Published In

CSLAW '24: Proceedings of the Symposium on Computer Science and Law

March 2024

161 pages

ISBN: 9798400703331

DOI: 10.1145/3614407

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Sponsors

ACM: Association for Computing Machinery

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

CSLAW '24

Sponsor:

ACM

CSLAW '24: Symposium on Computer Science and Law

March 12 - 13, 2024

Boston, MA, USA
