research-article · DOI: 10.1145/2930238.2930247

Reinforcement Learning: the Sooner the Better, or the Later the Better?

Published: 13 July 2016

Abstract

Reinforcement Learning (RL) is one of the most effective machine learning approaches for decision making in interactive environments. RL induces decision-making policies with the goal of maximizing the agent's cumulative reward. In this study, we investigated the impact of immediate and delayed reward functions on RL-induced policies and empirically evaluated the effectiveness of the induced policies within an Intelligent Tutoring System called Deep Thought. We further divided students into Fast and Slow learners based on their incoming competence, as measured by their average response time on the initial tutorial level. Our results show a significant interaction effect between the induced policies and students' incoming competence. More specifically, Fast learners were less sensitive to the learning environment: they learned equally well regardless of the pedagogical strategy employed by the tutor. Slow learners, by contrast, benefited significantly more from effective pedagogical strategies than from ineffective ones. In fact, with effective pedagogical strategies the Slow learners learned as much as their faster peers, while with ineffective strategies they learned significantly less.
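The difference between the two reward functions can be sketched with a small tabular Q-learning loop. This is a toy illustration only: the chain of decision steps, the two actions, and the reward values below are hypothetical stand-ins, not the actual Deep Thought state, action, or reward design. Under the immediate reward function each pedagogical decision is rewarded at the step where it is made; under the delayed one the same total arrives only once, at the end of the session, like an end-of-tutoring learning-gain reward:

```python
import random

def induce_policy(reward_mode, episodes=2000, alpha=0.5, gamma=1.0,
                  epsilon=0.2, steps=3, seed=1):
    """Tabular Q-learning on a toy 3-step 'tutoring' chain (hypothetical setup).

    Action 1 (the 'effective strategy') is worth +1 per step, action 0 is
    worth 0.  With reward_mode='immediate' each +1 is paid at the step it
    is earned; with 'delayed' the summed total is paid only after the last
    step.  The state is (step, good-actions-so-far) so that credit can be
    assigned under either reward function.
    """
    rng = random.Random(seed)
    q = {}

    def Q(s, a):
        return q.get((s, a), 0.0)

    for _ in range(episodes):
        step, good = 0, 0
        while step < steps:
            s = (step, good)
            # epsilon-greedy action selection
            a = rng.randrange(2) if rng.random() < epsilon \
                else max((0, 1), key=lambda x: Q(s, x))
            good += a
            step += 1
            done = step == steps
            if reward_mode == 'immediate':
                r = float(a)                      # reward at every step
            else:                                 # 'delayed'
                r = float(good) if done else 0.0  # whole return at the end
            nxt = 0.0 if done else max(Q((step, good), 0), Q((step, good), 1))
            q[(s, a)] = Q(s, a) + alpha * (r + gamma * nxt - Q(s, a))

    # greedy action at the start state under the induced policy
    return max((0, 1), key=lambda x: Q((0, 0), x))
```

In this tiny chain both reward functions eventually induce the same greedy policy; the practical difference is that the delayed variant forces the agent to propagate credit backward through the whole session before early decisions look valuable, which is exactly where immediate- and delayed-reward policy induction can diverge on real tutoring data.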



Published In

cover image ACM Conferences
UMAP '16: Proceedings of the 2016 Conference on User Modeling Adaptation and Personalization
July 2016
366 pages
ISBN: 9781450343688
DOI: 10.1145/2930238

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. delayed reward
  2. immediate reward
  3. pedagogical strategy
  4. problem solving
  5. reinforcement learning
  6. worked example

Qualifiers

  • Research-article

Funding Sources

  • NSF Grant

Conference

UMAP '16: User Modeling, Adaptation and Personalization Conference
July 13-17, 2016
Halifax, Nova Scotia, Canada

Acceptance Rates

UMAP '16 paper acceptance rate: 21 of 123 submissions (17%)
Overall acceptance rate: 162 of 633 submissions (26%)


Cited By

  • (2024) Reinforcement Learning in AI-Driven Assessments: Enhancing Continuous Learning and Accessibility. International Journal of Scientific Research in Computer Science, Engineering and Information Technology, 10(5):297-305. DOI: 10.32628/CSEIT241051014. Online: 6 Oct 2024.
  • (2024) Reinforcement learning tutor better supported lower performers in a math task. Machine Learning, 113(5):3023-3048. DOI: 10.1007/s10994-023-06423-9. Online: 9 Feb 2024.
  • (2023) HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare. Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 1504-1513. DOI: 10.5555/3545946.3598804. Online: 30 May 2023.
  • (2023) Leveraging response times in learning environments: opportunities and challenges. User Modeling and User-Adapted Interaction, 34(3):729-752. DOI: 10.1007/s11257-023-09386-7. Online: 2 Nov 2023.
  • (2023) Sim-GAIL: A generative adversarial imitation learning approach of student modelling for intelligent tutoring systems. Neural Computing and Applications, 35(34):24369-24388. DOI: 10.1007/s00521-023-08989-w. Online: 3 Oct 2023.
  • (2023) Competitive Collaboration for Complex Task Learning in Agent Systems. AI 2023: Advances in Artificial Intelligence, 325-337. DOI: 10.1007/978-981-99-8391-9_26. Online: 27 Nov 2023.
  • (2022) Evaluation and Promotion of a Multidimensional Information Intelligent Speech System in Dialect Teaching. Journal of Sensors, 2022:1-10. DOI: 10.1155/2022/1692080. Online: 9 Mar 2022.
  • (2022) Adaptive Cognitive Training with Reinforcement Learning. ACM Transactions on Interactive Intelligent Systems, 12(1):1-29. DOI: 10.1145/3476777. Online: 4 Mar 2022.
  • (2022) Get A Sense of Accomplishment in Doing Exercises: A Reinforcement Learning Perspective. 2022 IEEE 25th International Conference on Computer Supported Cooperative Work in Design (CSCWD), 299-304. DOI: 10.1109/CSCWD54268.2022.9776133. Online: 4 May 2022.
  • (2022) The Impact of Batch Deep Reinforcement Learning on Student Performance: A Simple Act of Explanation Can Go A Long Way. International Journal of Artificial Intelligence in Education, 33(4):1031-1056. DOI: 10.1007/s40593-022-00312-3. Online: 28 Nov 2022.
