DOI: 10.1145/3632620.3671108
Research article · Open access

Probeable Problems for Beginner-level Programming-with-AI Contests

Published: 12 August 2024

Abstract

To broaden participation, competitive programming contests may include beginner-level problems that do not require knowledge of advanced Computer Science concepts (e.g., algorithms and data structures). However, since most participants have easy access to AI code-generation tools, these problems often become trivial to solve. For beginner-friendly programming contests that do not prohibit the use of AI tools, we propose Probeable Problems: code-writing tasks that provide (1) a problem specification that deliberately omits certain details, and (2) a mechanism to probe for these details by asking clarifying questions and receiving immediate feedback. To evaluate our proposal, we conducted a 2-hour programming contest for undergraduate Computer Science students from multiple institutions, where each student was an active member of their institution’s ACM student chapter. The contest comprised six Probeable Problems for which popular code-generation tools (e.g., GitHub Copilot) were unable to generate accurate solutions due to the absence of these details. Students were permitted to work individually or in groups, and were free to use AI tools. We obtained consent from 26 groups (67 students) to use their submissions for research. To determine whether Probeable Problems are suitable for such contests, we analyze the extent to which the code submitted by these groups identifies the missing details.

Supplemental Material

MP4 File
How can we design beginner-friendly programming contests that allow participants to use AI code-generation tools? Our answer: omit key details and provide contestants with a way to probe further (by asking clarifying questions) to recover those details! We believe that developing the ability to ask clarifying questions is important and related to the ability to write thorough tests.
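To make the mechanism concrete, the sketch below (in Python) shows one hypothetical way a Probeable Problem could pair an underspecified task with a probe that returns the reference answer for any concrete input a contestant asks about. The task, the function names, and the omitted tie-breaking rule are illustrative assumptions, not the authors' actual contest infrastructure.

    # Hypothetical sketch of a Probeable Problem (assumed task and names; not the
    # authors' contest platform). The visible specification deliberately omits one
    # detail -- how ties between equal scores are broken -- and probe() plays the
    # role of the clarifying-question mechanism: contestants submit a concrete
    # input and immediately see the expected output.

    # Hidden reference solution, kept on the contest server.
    def _reference_best(items):
        # Omitted detail: on equal scores, the item appearing LAST wins.
        best = None
        for name, score in items:
            if best is None or score >= best[1]:
                best = (name, score)
        return best[0]

    def probe(items):
        """Answer a clarifying question: return the expected output for any
        concrete input the contestant chooses to ask about."""
        return _reference_best(items)

    if __name__ == "__main__":
        # A contestant probes with a tie to discover the omitted rule.
        print(probe([("a", 5), ("b", 5)]))  # prints "b": later item wins ties

A contestant who never probes would have to guess the unspecified behaviour (for example, breaking ties in favour of the first item), much as an AI code-generation tool would.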

Published In

ICER '24: Proceedings of the 2024 ACM Conference on International Computing Education Research - Volume 1
August 2024
539 pages
ISBN: 9798400704758
DOI: 10.1145/3632620

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 12 August 2024

Author Tags

  1. Ambiguity
  2. CS1
  3. Code specifications
  4. Code writing

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICER 2024

Acceptance Rates

Overall acceptance rate: 189 of 803 submissions (24%)

Article Metrics

  • Total Citations: 0
  • Total Downloads: 134
  • Downloads (Last 12 months): 134
  • Downloads (Last 6 weeks): 53

Reflects downloads up to 16 Dec 2024
