Research Article · Public Access
DOI: 10.1145/3490486.3538235

A System-Level Analysis of Conference Peer Review

Published: 13 July 2022

Abstract

We undertake a system-level analysis of the conference peer review process. The process involves three constituencies with different objectives: authors want their papers accepted at prestigious venues (and quickly), conferences want to present a program with many high-quality and few low-quality papers, and reviewers want to avoid being overburdened by reviews. These objectives are far from aligned; the key obstacle is that the evaluation of the merits of a submission (both by the authors and the reviewers) is inherently noisy. Over the years, conferences have experimented with numerous policies and innovations to navigate the tradeoffs. These experiments include setting various bars for acceptance, varying the number of reviews per submission, requiring prior reviews to be included with resubmissions, and others. The purpose of the present work is to investigate, both analytically and using agent-based simulations, how well various policies work, and more importantly, why they do or do not work.
We model the conference-author interactions as a Stackelberg game in which a prestigious conference commits to a threshold acceptance policy, applied to the (noisy) reviews of each submitted paper; the authors best-respond by submitting or not submitting to the conference, the alternative being a "sure accept" (such as arXiv or a lightly refereed venue). Our findings include the following. The conference should typically set a higher acceptance threshold than the quality it actually desires; we call this difference the resubmission gap and quantify it in terms of various parameters. The reviewing load is heavily driven by resubmissions of borderline papers, so a judicious choice of acceptance threshold may lead to fewer reviews while incurring an acceptable loss in quality. Depending on the paper quality distribution, stricter reviewing may lead to higher or lower acceptance rates; the former is the result of self-selection by the authors. As a rule of thumb, a relatively small number of reviews per paper, coupled with a strict acceptance policy, tends to do well in trading off these two objectives. A relatively small increase in review quality, or in the authors' self-assessment, is much more effective for conference quality control (without a large increase in review burden) than an increase in the number of reviews per paper. Finally, keeping track of past reviews of papers can help reduce the review burden without a decrease in conference quality.
For robustness, we consider different models of paper quality and learn some of the parameters from real data.
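The model lends itself to a small agent-based simulation. The sketch below (in Python) is only an illustration under assumptions that are not taken from the paper: paper quality and review noise are Gaussian, authors follow a crude cost-benefit resubmission rule, and every numerical value (noise level, resubmission cost, number of rounds) is a hypothetical placeholder. It shows how an acceptance threshold, noisy reviews, and author self-selection jointly determine the quality of accepted papers, the acceptance rate, and the review load.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical parameters for illustration only; none of these values come from the paper.
N_PAPERS = 10_000     # size of the paper pool; true quality q ~ N(0, 1)
REVIEW_NOISE = 0.5    # each review score is q + N(0, REVIEW_NOISE^2)
N_REVIEWS = 3         # reviews per submission; accept if their mean clears the threshold
RESUBMIT_COST = 0.1   # author's per-round cost of trying the conference vs. the sure accept
MAX_ROUNDS = 5        # rejected papers may be resubmitted up to this many times


def simulate(threshold):
    """Toy simulation of a threshold acceptance policy with noisy reviews.

    Authors "best-respond" with a simple rule: keep (re)submitting while the
    probability of acceptance exceeds the accumulated delay cost, otherwise take
    the sure-accept alternative. Returns (mean accepted quality, acceptance rate,
    reviews per paper)."""
    quality = rng.normal(0.0, 1.0, N_PAPERS)
    accepted = np.zeros(N_PAPERS, dtype=bool)
    active = np.ones(N_PAPERS, dtype=bool)   # papers still trying the conference
    total_reviews = 0

    for round_ in range(MAX_ROUNDS):
        # Probability that the mean of N_REVIEWS noisy scores clears the threshold.
        p_accept = 1.0 - norm.cdf((threshold - quality) * np.sqrt(N_REVIEWS) / REVIEW_NOISE)
        # Author self-selection: withdraw once the expected gain no longer covers the delay cost.
        active &= p_accept > RESUBMIT_COST * (round_ + 1)
        if not active.any():
            break

        idx = np.flatnonzero(active)
        scores = quality[idx][:, None] + rng.normal(0.0, REVIEW_NOISE, (idx.size, N_REVIEWS))
        total_reviews += idx.size * N_REVIEWS

        accept_now = scores.mean(axis=1) >= threshold
        accepted[idx[accept_now]] = True
        active[idx[accept_now]] = False      # accepted papers stop submitting

    return quality[accepted].mean(), accepted.mean(), total_reviews / N_PAPERS


for t in (0.5, 1.0, 1.5):
    q, rate, load = simulate(t)
    print(f"threshold={t:.1f}  accepted quality={q:.2f}  "
          f"acceptance rate={rate:.1%}  reviews/paper={load:.2f}")

Sweeping the threshold in a toy setup like this makes the tradeoff in the abstract concrete: setting the bar above the quality the conference actually wants (the resubmission gap) raises the average quality of accepted papers, while the review load is governed largely by how long borderline authors keep resubmitting.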

Published In

EC '22: Proceedings of the 23rd ACM Conference on Economics and Computation
July 2022, 1269 pages
ISBN: 9781450391504
DOI: 10.1145/3490486

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. peer review
  2. Stackelberg game

Acceptance Rates

Overall Acceptance Rate 664 of 2,389 submissions, 28%
