ChatGPT and Bard Performance on the POSCOMP Exam

Authors:

Mateus Santos Saldanha,

Luciano Antonio Digiampietri

SBSI '24: Proceedings of the 20th Brazilian Symposium on Information Systems

Article No.: 49, Pages 1 - 10

https://doi.org/10.1145/3658271.3658320

Published: 23 May 2024

Abstract

Context: Modern chatbots, built upon advanced language models, have achieved remarkable proficiency in answering questions across diverse fields. Problem: Understanding the capabilities and limitations of these chatbots is a significant challenge, particularly as they are integrated into different information systems, including those in education. Solution: In this study, we conducted a quantitative assessment of the ability of two prominent chatbots, ChatGPT and Bard, to solve POSCOMP questions. IS Theory: This work draws on information processing theory. Method: Our materials were 271 questions from the last five POSCOMP exams, restricted to questions that did not rely on graphic content. We presented these questions to the two chatbots in two formats: directly as they appeared in the exam, and with additional context. In the latter case, the chatbots were informed that they were answering a multiple-choice question from a computing exam. Summary of Results: On average, the chatbots outperformed human exam-takers by more than 20%. Interestingly, both chatbots performed better, on average, when no additional context was added to the prompt. They exhibited similar performance levels, with a slight advantage for ChatGPT. Contributions and Impact in the IS area: The primary contribution to the field is an exploration of the capabilities and limitations of chatbots in addressing computing-related questions. This information is valuable for those developing Information Systems with the assistance of such chatbots and for those relying on technologies built upon these capabilities.
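The two prompt formats and the accuracy comparison described in the Method can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Question` type, the `CONTEXT_PREFIX` wording, and the helper functions are hypothetical, and the abstract does not give the exact prompt text or scoring procedure the authors used.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Question:
    """Hypothetical container for one text-only multiple-choice question."""
    stem: str
    options: Dict[str, str]  # option letter -> option text
    answer: str              # letter of the correct option


# Assumed wording: the abstract only says the chatbots were told they were
# answering a multiple-choice question from a computing exam.
CONTEXT_PREFIX = (
    "The following is a multiple-choice question from a computing exam.\n\n"
)


def format_question(q: Question) -> str:
    """Render the question stem followed by its lettered options."""
    lines = [q.stem] + [f"{letter}) {text}" for letter, text in q.options.items()]
    return "\n".join(lines)


def build_prompt(q: Question, with_context: bool) -> str:
    """Return the question as-is, or preceded by the context sentence."""
    body = format_question(q)
    return CONTEXT_PREFIX + body if with_context else body


def accuracy(questions: List[Question], chosen: List[str]) -> float:
    """Fraction of questions whose chosen letter matches the answer key."""
    if not questions or len(questions) != len(chosen):
        raise ValueError("need one chosen letter per question, and at least one question")
    correct = sum(1 for q, c in zip(questions, chosen) if c == q.answer)
    return correct / len(questions)
```

Under these assumptions, each chatbot would receive both `build_prompt(q, with_context=False)` and `build_prompt(q, with_context=True)` for every question, and `accuracy` would be computed once per chatbot and prompt format so that the four results can be compared with the human average.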

Index Terms

1. Computing methodologies
  1. Artificial intelligence
2. Information systems
  1. Information retrieval
    1. Document representation

Published In

SBSI '24: Proceedings of the 20th Brazilian Symposium on Information Systems

May 2024

708 pages

ISBN:9798400709968

DOI:10.1145/3658271

Editors: Ronney Moreira de Castro (UFJF), José Maria N. David (UFJF), Johnny C. Marques (ITA), Tadeu Moreira de Classe (UNIRIO), Victor Ströele (UFJF), Williamson Alison Freitas Silva (UNIPAMPA)
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

SBSI '24

SBSI '24: XX Brazilian Symposium on Information Systems

May 20 - 23, 2024

Juiz de Fora, Brazil

Acceptance Rates

Overall Acceptance Rate 181 of 557 submissions, 32%
