research-article · DOI: 10.1145/2930238.2930247

Reinforcement Learning: the Sooner the Better, or the Later the Better?

Published: 13 July 2016

Abstract

Reinforcement Learning (RL) is one of the most effective machine learning approaches for decision making in interactive environments. RL induces decision-making policies with the goal of maximizing the agent's cumulative reward. In this study, we investigated the impact of immediate and delayed reward functions on RL-induced policies and empirically evaluated the effectiveness of the induced policies within an Intelligent Tutoring System called Deep Thought. We further divided students into Fast and Slow learners based on their incoming competence, as measured by their average response time on the initial tutorial level. Our results show a significant interaction effect between the induced policies and students' incoming competence. More specifically, Fast learners were less sensitive to the learning environment: they learned equally well regardless of the pedagogical strategy employed by the tutor. Slow learners, by contrast, benefited significantly more from effective pedagogical strategies than from ineffective ones. In fact, with effective pedagogical strategies the Slow learners learned as much as their faster peers, while with ineffective strategies they learned significantly less.
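The difference between the two reward functions can be sketched with a small tabular Q-learning loop. This is a toy illustration only: the chain of decision steps, the two actions, and the reward values below are hypothetical stand-ins, not the actual Deep Thought state, action, or reward design. Under the immediate reward function each pedagogical decision is rewarded at the step where it is made; under the delayed one the same total arrives only once, at the end of the session, like an end-of-tutoring learning-gain reward:

```python
import random

def induce_policy(reward_mode, episodes=2000, alpha=0.5, gamma=1.0,
                  epsilon=0.2, steps=3, seed=1):
    """Tabular Q-learning on a toy 3-step 'tutoring' chain (hypothetical setup).

    Action 1 (the 'effective strategy') is worth +1 per step, action 0 is
    worth 0.  With reward_mode='immediate' each +1 is paid at the step it
    is earned; with 'delayed' the summed total is paid only after the last
    step.  The state is (step, good-actions-so-far) so that credit can be
    assigned under either reward function.
    """
    rng = random.Random(seed)
    q = {}

    def Q(s, a):
        return q.get((s, a), 0.0)

    for _ in range(episodes):
        step, good = 0, 0
        while step < steps:
            s = (step, good)
            # epsilon-greedy action selection
            a = rng.randrange(2) if rng.random() < epsilon \
                else max((0, 1), key=lambda x: Q(s, x))
            good += a
            step += 1
            done = step == steps
            if reward_mode == 'immediate':
                r = float(a)                      # reward at every step
            else:                                 # 'delayed'
                r = float(good) if done else 0.0  # whole return at the end
            nxt = 0.0 if done else max(Q((step, good), 0), Q((step, good), 1))
            q[(s, a)] = Q(s, a) + alpha * (r + gamma * nxt - Q(s, a))

    # greedy action at the start state under the induced policy
    return max((0, 1), key=lambda x: Q((0, 0), x))
```

In this tiny chain both reward functions eventually induce the same greedy policy; the practical difference is that the delayed variant forces the agent to propagate credit backward through the whole session before early decisions look valuable, which is exactly where immediate- and delayed-reward policy induction can diverge on real tutoring data.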



Published In

cover image ACM Conferences
UMAP '16: Proceedings of the 2016 Conference on User Modeling Adaptation and Personalization
July 2016
366 pages
ISBN: 9781450343688
DOI: 10.1145/2930238

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. delayed reward
  2. immediate reward
  3. pedagogical strategy
  4. problem solving
  5. reinforcement learning
  6. worked example

Qualifiers

  • Research-article

Funding Sources

  • NSF Grant

Conference

UMAP '16: User Modeling, Adaptation and Personalization Conference
July 13-17, 2016
Halifax, Nova Scotia, Canada

Acceptance Rates

UMAP '16 paper acceptance rate: 21 of 123 submissions (17%)
Overall acceptance rate: 162 of 633 submissions (26%)


Cited By

  • (2024) Reinforcement Learning in AI-Driven Assessments: Enhancing Continuous Learning and Accessibility. International Journal of Scientific Research in Computer Science, Engineering and Information Technology, 10(5):297-305. DOI: 10.32628/CSEIT241051014. Online: 6 Oct 2024.
  • (2024) Reinforcement learning tutor better supported lower performers in a math task. Machine Learning, 113(5):3023-3048. DOI: 10.1007/s10994-023-06423-9. Online: 9 Feb 2024.
  • (2023) HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare. Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 1504-1513. DOI: 10.5555/3545946.3598804. Online: 30 May 2023.
  • (2023) Leveraging response times in learning environments: opportunities and challenges. User Modeling and User-Adapted Interaction, 34(3):729-752. DOI: 10.1007/s11257-023-09386-7. Online: 2 Nov 2023.
  • (2023) Sim-GAIL: A generative adversarial imitation learning approach of student modelling for intelligent tutoring systems. Neural Computing and Applications, 35(34):24369-24388. DOI: 10.1007/s00521-023-08989-w. Online: 3 Oct 2023.
  • (2023) Competitive Collaboration for Complex Task Learning in Agent Systems. AI 2023: Advances in Artificial Intelligence, 325-337. DOI: 10.1007/978-981-99-8391-9_26. Online: 27 Nov 2023.
  • (2022) Evaluation and Promotion of a Multidimensional Information Intelligent Speech System in Dialect Teaching. Journal of Sensors, 2022:1-10. DOI: 10.1155/2022/1692080. Online: 9 Mar 2022.
  • (2022) Adaptive Cognitive Training with Reinforcement Learning. ACM Transactions on Interactive Intelligent Systems, 12(1):1-29. DOI: 10.1145/3476777. Online: 4 Mar 2022.
  • (2022) Get A Sense of Accomplishment in Doing Exercises: A Reinforcement Learning Perspective. 2022 IEEE 25th International Conference on Computer Supported Cooperative Work in Design (CSCWD), 299-304. DOI: 10.1109/CSCWD54268.2022.9776133. Online: 4 May 2022.
  • (2022) The Impact of Batch Deep Reinforcement Learning on Student Performance: A Simple Act of Explanation Can Go A Long Way. International Journal of Artificial Intelligence in Education, 33(4):1031-1056. DOI: 10.1007/s40593-022-00312-3. Online: 28 Nov 2022.
