Article (DOI: 10.5555/1597538.1597696)

Reinforcement learning with human teachers: evidence of feedback and guidance with implications for learning performance

Published: 16 July 2006

Abstract

As robots become a mass consumer product, they will need to learn new skills by interacting with typical human users. Past approaches have adapted reinforcement learning (RL) to accept a human reward signal; however, we question the implicit assumption that people will only want to give the learner feedback on its past actions. We present findings from a human user study showing that people use the reward signal not only to provide feedback about past actions, but also to provide future-directed rewards to guide subsequent actions. Given this, we made specific modifications to the simulated RL robot to incorporate guidance. We then analyze and evaluate its learning performance in a second user study, and we report significant improvements on several measures. This work demonstrates the importance of understanding the human-teacher/robot-learner system as a whole in order to design algorithms that support how people want to teach while simultaneously improving the robot's learning performance.
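
The distinction between feedback on past actions and guidance for future actions can be illustrated with a small sketch. The abstract does not say which algorithm or interface the authors used, so everything below is an assumption made for illustration: tabular Q-learning as the learner, a scalar `human_reward` as the feedback channel, and a `guidance` set of suggested actions that narrows action selection. The class name, method names, parameter values, and task names are all hypothetical.

```python
import random
from collections import defaultdict


class GuidedQLearner:
    """Tabular Q-learner rewarded by a human teacher (illustrative sketch only)."""

    def __init__(self, actions, alpha=0.3, gamma=0.75, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha = alpha        # learning rate (assumed value)
        self.gamma = gamma        # discount factor (assumed value)
        self.epsilon = epsilon    # exploration probability (assumed value)
        self.q = defaultdict(float)  # (state, action) -> value estimate
        self.rng = random.Random(seed)

    def choose_action(self, state, guidance=None):
        """Guidance looks forward: it restricts which actions are considered next."""
        candidates = [a for a in self.actions if guidance is None or a in guidance]
        if not candidates:  # guidance matched no known action, so ignore it
            candidates = self.actions
        if self.rng.random() < self.epsilon:
            return self.rng.choice(candidates)
        return max(candidates, key=lambda a: self.q[(state, a)])

    def give_feedback(self, state, action, human_reward, next_state):
        """Feedback looks backward: a Q-learning update on the action just taken."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = human_reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# Hypothetical task: the teacher points at one action, then rewards the result.
learner = GuidedQLearner(["pick_up_bowl", "pick_up_spoon", "use_oven"], epsilon=0.0)
action = learner.choose_action("start", guidance={"pick_up_bowl"})
learner.give_feedback("start", action, human_reward=1.0, next_state="holding_bowl")
print(action, learner.q[("start", action)])  # prints: pick_up_bowl 0.3
```

Keeping the two channels separate means a guidance message never changes the value estimates directly; it only biases which action is tried next, while the reward signal remains the sole input to the update.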


Published In

AAAI'06: Proceedings of the 21st national conference on Artificial intelligence - Volume 1
July 2006
1005 pages
ISBN: 9781577352815

Sponsors

• AAAI: American Association for Artificial Intelligence

Publisher

AAAI Press
