Learning from Demonstrations in Human–Robot Collaborative Scenarios: A Survey
Figure 1. The five stages of the Learning from Demonstration programming paradigm.
Figure 2. The four data acquisition methods available during the LfD process: physical guidance through kinaesthetic teaching; observable guidance, using a motion capture system to record the trajectories; reactive guidance, where the hints necessary to solve the task are provided to the robot through Natural Language Processing (NLP); and multimodal guidance, where kinaesthetic teaching and NLP are used simultaneously for data acquisition.
Abstract
1. Introduction
- RQ1: What are the main programming algorithms applied in the recent scientific literature on Learning from Demonstration for collaborative robots in possible manufacturing scenarios?
- RQ2: What is the level of collaboration/interaction between human and robot in the tasks that researchers seek to solve by applying Learning from Demonstration programming algorithms?
- RQ3: How do these solutions align with the smart manufacturing/Industry 4.0 paradigm in terms of intuitiveness, safety, and ergonomics during demonstration and/or execution?
2. Materials and Methods
- Step 1: Establish the research objectives of the SLR.
- Step 2: Define the conceptual boundaries of the research.
- Step 3: Organize the data collection by defining the inclusion/exclusion criteria.
- Step 4: Report the validation procedure and efforts.
2.1. Research Objectives of the SLR
2.2. Conceptual Boundaries
2.3. Inclusion and Exclusion Criteria
2.4. Validation of the Search Results
3. Theoretical Background
3.1. Learning from Demonstration
3.1.1. Demonstrator Selection Stage
3.1.2. Data Acquisition Stage
- Acquisition by physical guidance: In this acquisition method, the demonstration is performed in the configuration space of the robot by manually moving the robot along the desired trajectories to be learned. Interfaces such as kinaesthetic teaching, teleoperation, and haptics are commonly used. Figure 2 provides an example of kinaesthetic teaching; one disadvantage of this method is the lack of precision in the recording, owing to the difficulty the user experiences in reaching some positions manually.
- Acquisition by observable guidance: In this context, the demonstrator performs the demonstration in their own configuration space, and indirect-mapping-based techniques are then used to relate the recorded trajectories to the robot's configuration space. Interfaces such as inertial sensors, vision sensors, and motion capture systems are commonly used [29]. Figure 2 also shows an example of this method using a motion capture system; one disadvantage is the impossibility of recording contact forces along the trajectories.
- Acquisition by reactive guidance: Unlike the previous methods, here the robot performs the task during the demonstration stage, exploring the effects of a predefined set of actions it is allowed to take in a particular state. The demonstrator, from time to time, suggests which actions best suit a particular state, or even directly selects the action the robot should take. The demonstration data are thus shaped by the feedback the demonstrator provides during acquisition, and the correspondence problem is avoided because the data are recorded directly in the robot's configuration space [25]. Methods such as reinforcement learning (RL) and active learning (AL) are widely used [30]. Figure 2 shows the demonstrator providing suggestions to the robot through a natural language interface.
- Acquisition by multimodal guidance: This method combines any of the previous methods to better adjust the recorded demonstration data and allows a more natural interaction between robot and demonstrator, as illustrated in Figure 2. For example, Wang et al. [31] combined observable guidance (a 3D vision system) with physical guidance (a force/torque sensory system) to teach an assembly task. The drawback of this method is the added complexity of fusing data from multiple input sources. A minimal recording sketch for the physical-guidance case follows this list.
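To make the physical-guidance pipeline concrete, the following minimal sketch logs timestamped joint positions and wrist wrenches while a person guides a gravity-compensated arm. It is an illustration only, not code from any surveyed work: `robot.get_joint_positions()` and `robot.get_wrench()` are hypothetical placeholders for whatever the manufacturer's SDK actually exposes.

```python
import time

def record_kinaesthetic_demo(robot, duration_s=10.0, rate_hz=100.0):
    """Log timestamped joint positions and wrist wrenches while a person
    physically guides a gravity-compensated arm along the desired trajectory."""
    samples = []
    t0 = time.time()
    while time.time() - t0 < duration_s:
        samples.append((time.time() - t0,
                        robot.get_joint_positions(),   # hypothetical SDK call
                        robot.get_wrench()))           # hypothetical SDK call
        time.sleep(1.0 / rate_hz)
    return samples  # raw (t, q, f) tuples for the Data Modeling stage
```

The returned samples are exactly the kind of raw trajectory data that the Data Modeling stage of Section 3.1.3 consumes.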
3.1.3. Data Modeling Stage
Low-Level Motions
- One-shot or deterministic learning approaches are generally modeled as non-linear deterministic systems whose goal is to reach a particular target by the end of the skill. A commonly used approach in the literature is Dynamic Movement Primitives (DMPs), introduced by Ijspeert [32] to represent an observed behavior as an attractor landscape; learning then consists of instantiating the parameters that modulate these motion patterns. Learning a DMP typically requires just a single demonstration of the skill. This simplicity is an advantage in the LfD problem, but it makes the system more vulnerable to noisy demonstrations, since the resulting controller closely resembles the seed demonstration (a minimal fitting sketch is given after this list).
- Multi-shot or probabilistic learning is based on statistical learning methods: the demonstration data are modeled with probability density functions, and the learning process exploits various non-linear regression techniques from machine learning. Multiple demonstrations are needed during the learning process, which makes these methods more robust to noisy demonstrations but more complex to apply, given the number of demonstrations required. Most probabilistic algorithms in LfD build on the general approach of modeling an action with Hidden Markov Models (HMMs) [33,34]. More recently, works have combined HMM-based skill learning with Gaussian Mixture Models (GMM) [35] and Gaussian Mixture Regression (GMR) [36,37] (a GMM-GMR sketch also follows this list).
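To make the one-shot case concrete, the sketch below fits and replays a one-dimensional discrete DMP in the standard formulation of [32]: a spring–damper transformation system driven by a phase-dependent forcing term encoded with Gaussian basis functions. It is a minimal reading of the method under common default gains and a textbook basis-width heuristic, not the implementation used by any of the surveyed works.

```python
import numpy as np

def fit_dmp(y, dt, n_basis=20, alpha_z=25.0, beta_z=6.25, alpha_x=1.0):
    """Fit the forcing-term weights of a 1-D discrete DMP to one demo y[t]."""
    T = len(y)
    tau = T * dt                      # movement duration
    yd = np.gradient(y, dt)           # demonstrated velocity
    ydd = np.gradient(yd, dt)         # demonstrated acceleration
    y0, g = y[0], y[-1]

    # Canonical system: phase x decays from 1 towards 0 over the movement.
    x = np.exp(-alpha_x * np.arange(T) * dt / tau)

    # Invert the transformation system to get the forcing term the demo
    # implies: tau^2 * ydd = alpha_z * (beta_z * (g - y) - tau * yd) + f
    f_target = tau**2 * ydd - alpha_z * (beta_z * (g - y) - tau * yd)

    # Gaussian basis functions spaced along the phase.
    c = np.exp(-alpha_x * np.linspace(0, 1, n_basis))
    h = n_basis**1.5 / c              # common width heuristic
    psi = np.exp(-h * (x[:, None] - c[None, :])**2)       # [T, n_basis]

    # Locally weighted regression, one weight per basis function.
    xi = x * (g - y0)
    w = (psi * (xi * f_target)[:, None]).sum(0) / ((psi * (xi**2)[:, None]).sum(0) + 1e-10)
    return dict(w=w, c=c, h=h, y0=y0, g=g, tau=tau,
                alpha_z=alpha_z, beta_z=beta_z, alpha_x=alpha_x)

def rollout_dmp(p, dt, n_steps):
    """Integrate the fitted DMP forward (Euler) to reproduce the skill."""
    y, yd, x = p["y0"], 0.0, 1.0
    out = []
    for _ in range(n_steps):
        psi = np.exp(-p["h"] * (x - p["c"])**2)
        f = (psi @ p["w"]) / (psi.sum() + 1e-10) * x * (p["g"] - p["y0"])
        ydd = (p["alpha_z"] * (p["beta_z"] * (p["g"] - y) - p["tau"] * yd) + f) / p["tau"]**2
        yd += ydd * dt
        y += yd * dt
        x += -p["alpha_x"] * x / p["tau"] * dt
        out.append(y)
    return np.array(out)

# demo = np.sin(np.linspace(0, np.pi / 2, 200))   # single demonstration
# reproduced = rollout_dmp(fit_dmp(demo, dt=0.01), dt=0.01, n_steps=200)
```

Replaying the primitive with a different goal g generalizes the single demonstration to new targets, which is the main appeal of the one-shot formulation.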
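For the multi-shot case, the following sketch illustrates the GMM-GMR pipeline cited above [35,36,37]: several demonstrations are stacked as (time, position) samples, a joint Gaussian mixture is fitted with scikit-learn's `GaussianMixture`, and the motion is reproduced by conditioning the mixture on time. It is a deliberately simplified 1-D version, not the exact models used in the cited papers.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def learn_skill(demos, dt, n_components=5):
    """Fit a joint GMM over (time, position) samples from several demos."""
    data = np.vstack([np.column_stack([np.arange(len(d)) * dt, d]) for d in demos])
    return GaussianMixture(n_components=n_components, covariance_type="full").fit(data)

def gmr(gmm, t_query):
    """GMR: condition the joint GMM on time t to predict position y(t)."""
    y_hat = np.zeros(len(t_query))
    for j, t in enumerate(t_query):
        # Responsibility of each component for this time step.
        h = np.array([w * np.exp(-0.5 * (t - m[0])**2 / c[0, 0]) / np.sqrt(c[0, 0])
                      for w, m, c in zip(gmm.weights_, gmm.means_, gmm.covariances_)])
        h /= h.sum() + 1e-12
        # Per-component conditional mean of y given t, blended by responsibility.
        cond = np.array([m[1] + c[1, 0] / c[0, 0] * (t - m[0])
                         for m, c in zip(gmm.means_, gmm.covariances_)])
        y_hat[j] = h @ cond
    return y_hat

# demos = [np.sin(np.linspace(0, np.pi, 100)) + 0.01 * np.random.randn(100)
#          for _ in range(5)]                     # five noisy demonstrations
# gmm = learn_skill(demos, dt=0.01)
# reproduction = gmr(gmm, np.arange(100) * 0.01)
```

Averaging over several noisy demonstrations is what gives the probabilistic family its robustness, at the cost of collecting more data than the one-shot approach requires.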
High-Level Tasks
- Policy learning consists of identifying and reproducing the demonstrator's policy by generalizing over a set of demonstrations. The demonstration data are usually a sequence of state–action pairs within the task constraints. The learning system can either learn a direct mapping that outputs actions given states, or learn a task plan mapping the sequence of actions that leads from an initial state to a goal state [26] (a toy direct-mapping sketch follows this list);
- Reward learning is similar to policy learning but, instead of representing the task through a set of actions, models the task in terms of its goals or objectives, often referred to as a reward function. Techniques such as Inverse Reinforcement Learning (IRL) [38] enable the learning system to derive explicit reward functions from the demonstration data.
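As a toy illustration of the direct-mapping flavor of policy learning, the sketch below clones a discrete-action policy from a handful of state–action pairs with a nearest-neighbor classifier. The 2-D states, the action labels, and the choice of classifier are all invented for illustration; the surveyed works use far richer state representations and learners.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Demonstrations flattened into (state, action) pairs; states and labels
# here are purely illustrative.
states = np.array([[0.0, 0.1], [0.4, 0.1], [0.8, 0.3], [0.8, 0.0]])
actions = np.array(["reach", "grasp", "place", "retract"])

# Direct mapping from states to actions, learned from the demonstration set.
policy = KNeighborsClassifier(n_neighbors=1).fit(states, actions)
print(policy.predict([[0.75, 0.28]]))   # -> ['place']
```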
3.1.4. Skill or Task Execution
3.1.5. Refinement Learning
4. Analysis
4.1. Human Participation in the LfD Process
Data Acquisition Methods
4.2. Main LfD Algorithms for Skill/Task Learning
4.2.1. Skill Learning
4.2.2. Task Learning
5. Discussion and Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
AL | Active Learning |
ADHSMM | Adaptive Duration Hidden Semi-Markov Model |
AR | Augmented Reality |
CNNs | Convolutional Neural Networks |
DT | Digital Twin |
DMPs | Dynamic Movement Primitives |
EMG | Electromyography |
EIProMPs | Environment-adaptive Interactive Probabilistic Movement Primitives |
GMM | Gaussian Mixture Models |
GMR | Gaussian Mixture Regression |
GP | Gaussian Process |
GPLVM | Gaussian Process Latent Variable Model |
HMMs | Hidden Markov Models |
HSMM | Hidden Semi-Markov Model |
HRC | Human-Robot Collaboration |
IL | Incremental Learning |
IMU | Inertial Measurement Unit |
IP | Interaction Primitives |
IRL | Inverse Reinforcement Learning |
IRL-PBRS | Interactive Reinforcement Learning and Potential Based Reward Shaping |
KIF | Knowledge Integration Framework |
LfD | Learning from Demonstration |
MaxEnt-IRL | Maximum Entropy Inverse Reinforcement Learning |
MPC | Model Predictive Control |
MTiProMP | Multi-Task interaction Probabilistic Movement Primitive |
MTProMP | Multi-Task Probabilistic Movement Primitive |
NLP | Natural Language Processing |
DSL | Domain Specific Language |
PbD | Programming by Demonstration |
PCA | Principal Component Analysis |
pHRIP | physical Human Robot Interaction Primitives |
PDDL | Planning Domain Definition Language |
ProMPs | Probabilistic Movement Primitives |
RBFs | Radial Basis Functions |
RF | Random Forest |
RL | Reinforcement Learning |
RMSE | Root-Mean-Square Error |
SBS | Skill Based System |
SLR | Systematic Literature Review |
TPGMM | Task Parametrized Gaussian Mixture Model |
VR | Virtual Reality |
WoS | Web of Science |
WRFs | Weighted Random Forests |
WRC 2018 | World Robot Challenge |
References
- Alcácer, V.; Cruz-Machado, V. Scanning the industry 4.0: A literature review on technologies for manufacturing systems. Eng. Sci. Technol. Int. J. 2019, 22, 899–919.
- Dotoli, M.; Fay, A.; Miśkowicz, M.; Seatzu, C. An overview of current technologies and emerging trends in factory automation. Int. J. Prod. Res. 2019, 57, 5047–5067.
- Mittal, S.; Khan, M.A.; Romero, D.; Wuest, T. Smart manufacturing: Characteristics, technologies and enabling factors. Proc. Inst. Mech. Eng. Part B J. Eng. Manuf. 2019, 233, 1342–1361.
- Bauer, A.; Wollherr, D.; Buss, M. Human–robot collaboration: A survey. Int. J. Humanoid Robot. 2008, 5, 47–66.
- Evjemo, L.D.; Gjerstad, T.; Grøtli, E.I.; Sziebig, G. Trends in smart manufacturing: Role of humans and industrial robots in smart factories. Curr. Robot. Rep. 2020, 1, 35–41.
- Villani, V.; Pini, F.; Leali, F.; Secchi, C. Survey on human–robot collaboration in industrial settings: Safety, intuitive interfaces and applications. Mechatronics 2018, 55, 248–266.
- Bi, Z.M.; Luo, M.; Miao, Z.; Zhang, B.; Zhang, W.J.; Wang, L. Safety assurance mechanisms of collaborative robotic systems in manufacturing. Robot. Comput.-Integr. Manuf. 2021, 67, 102022.
- Maurice, P.; Padois, V.; Measson, Y.; Bidaud, P. Human-oriented design of collaborative robots. Int. J. Ind. Ergon. 2016, 57, 88–102.
- Gualtieri, L.; Rauch, E.; Vidoni, R. Emerging research fields in safety and ergonomics in industrial collaborative robotics: A systematic literature review. Robot. Comput.-Integr. Manuf. 2021, 67, 101998.
- Hentout, A.; Aouache, M.; Maoudj, A.; Akli, I. Human–robot interaction in industrial collaborative robotics: A literature review of the decade 2008–2017. Adv. Robot. 2019, 33, 764–799.
- Zaatari, S.E.; Marei, M.; Li, W.; Usman, Z. Cobot programming for collaborative industrial tasks: An overview. Robot. Auton. Syst. 2019, 116, 162–180.
- Michaelis, J.E.; Siebert-Evenstone, A.; Shaffer, D.W.; Mutlu, B. Collaborative or Simply Uncaged? Understanding Human-Cobot Interactions in Automation. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–12.
- Argall, B.D.; Chernova, S.; Veloso, M.; Browning, B. A survey of robot learning from demonstration. Robot. Auton. Syst. 2009, 57, 469–483.
- Hussein, A.; Gaber, M.M.; Elyan, E.; Jayne, C. Imitation learning: A survey of learning methods. ACM Comput. Surv. 2017, 50, 1–35.
- Zhu, Z.; Hu, H. Robot Learning from Demonstration in Robotic Assembly: A Survey. Robotics 2018, 7, 17.
- Ravichandar, H.; Polydoros, A.S.; Chernova, S.; Billard, A. Recent Advances in Robot Learning from Demonstration. Annu. Rev. Control. Robot. Auton. Syst. 2020, 3, 297–330.
- Xie, Z.W.; Zhang, Q.; Jiang, Z.N.; Liu, H. Robot learning from demonstration for path planning: A review. Sci. China Technol. Sci. 2020, 63, 1325–1334.
- Kitchenham, B. Procedures for Performing Systematic Reviews; Keele University: Keele, UK, 2004.
- Kitchenham, B.; Brereton, O.P.; Budgen, D.; Turner, M.; Bailey, J.; Linkman, S. Systematic literature reviews in software engineering—A systematic literature review. Inf. Softw. Technol. 2008, 51, 7–15.
- Xiao, Y.; Watson, M. Guidance on conducting a systematic literature review. J. Plan. Educ. Res. 2019, 39, 93–112.
- Scells, H.; Zuccon, G. Generating better queries for systematic reviews. In Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, Ann Arbor, MI, USA, 8–12 July 2018; pp. 475–484.
- Ananiadou, S.; Rea, B.; Okazaki, N.; Procter, R.; Thomas, J. Supporting systematic reviews using text mining. Soc. Sci. Comput. Rev. 2009, 27, 509–523.
- Tsafnat, G.; Glasziou, P.; Choong, M.K.; Dunn, A.; Galgani, F.; Coiera, E. Systematic review automation technologies. Syst. Rev. 2014, 3, 74.
- Billard, A.G.; Calinon, S.; Dillmann, R. Learning from Humans. In Springer Handbook of Robotics; Springer: Berlin/Heidelberg, Germany, 2016; pp. 1995–2014.
- Chernova, S.; Thomaz, A.L. Robot learning from human teachers. Synth. Lect. Artif. Intell. Mach. Learn. 2014, 28, 1–121.
- Zhou, Z.; Xiong, R.; Wang, Y.; Zhang, J. Advanced Robot Programming: A Review. Curr. Robot. Rep. 2020, 1, 251–528.
- Koskinopoulou, M.; Piperakis, S.; Trahanias, P. Learning from demonstration facilitates human-robot collaborative task execution. In Proceedings of the 2016 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI), Christchurch, New Zealand, 7–10 March 2016; pp. 59–66.
- Qu, J.; Zhang, F.; Wang, Y.; Fu, Y. Human-like coordination motion learning for a redundant dual-arm robot. Robot. Comput.-Integr. Manuf. 2019, 57, 379–390.
- Wang, W.; Chen, Y.; Li, R.; Jia, Y. Learning and comfort in human-robot interaction: A review. Appl. Sci. 2019, 9, 5152.
- Lopes, M.; Melo, F.; Montesano, L. Active Learning for Reward Estimation in Inverse Reinforcement Learning; Springer: Berlin/Heidelberg, Germany, 2009; Volume 5782, pp. 31–46.
- Wang, W.; Li, R.; Chen, Y.; Diekel, Z.M.; Jia, Y. Facilitating Human-Robot Collaborative Tasks by Teaching-Learning-Collaboration from Human Demonstrations. IEEE Trans. Autom. Sci. Eng. 2019, 16, 640–653.
- Ijspeert, A.J.; Nakanishi, J.; Schaal, S. Learning rhythmic movements by demonstration using nonlinear oscillators. IEEE Int. Conf. Intell. Robot. Syst. 2002, 1, 958–963.
- Rabiner, L.R. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proc. IEEE 1989, 77, 257–286.
- Fink, G.A. Markov Models for Pattern Recognition; Springer: Berlin/Heidelberg, Germany, 2014.
- Parsons, O.E. A Gaussian Mixture Model Approach to Classifying Response Types; Springer: Cham, Switzerland, 2020; pp. 3–22.
- Ghahramani, Z.; Jordan, M. Supervised learning from incomplete data via an EM approach. In Proceedings of the Advances in Neural Information Processing Systems, Denver, CO, USA, 30 November–3 December 1993; Cowan, J., Tesauro, G., Alspector, J., Eds.; Morgan-Kaufmann: Burlington, MA, USA, 1993; Volume 6.
- Fabisch, A. gmr: Gaussian Mixture Regression. J. Open Source Softw. 2021, 6, 3054.
- Odom, P.; Natarajan, S. Active Advice Seeking for Inverse Reinforcement Learning. In Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, Singapore, 9–13 May 2016; pp. 503–511.
- Nicolescu, M.N.; Matarić, M.J. Task learning through imitation and human–robot interaction. In Imitation and Social Learning in Robots, Humans and Animals: Behavioural, Social and Communicative Dimensions; Nehaniv, C.L., Dautenhahn, K., Eds.; Cambridge University Press: Cambridge, UK, 2007; pp. 407–424.
- Luo, Y.; Yin, L.; Bai, W.; Mao, K. An Appraisal of Incremental Learning Methods. Entropy 2020, 22, 1190.
- Ewerton, M.; Maeda, G.; Kollegger, G.; Wiemeyer, J.; Peters, J. Incremental imitation learning of context-dependent motor skills. In Proceedings of the 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), Cancun, Mexico, 15–17 November 2016; pp. 351–358.
- Nozari, S.; Krayani, A.; Marcenaro, L.; Martin, D.; Regazzoni, C. Incremental Learning through Probabilistic Behavior Prediction. In Proceedings of the 2022 30th European Signal Processing Conference (EUSIPCO), Belgrade, Serbia, 29 August–2 September 2022; pp. 1502–1506.
- Mészáros, A.; Franzese, G.; Kober, J. Learning to Pick at Non-Zero-Velocity From Interactive Demonstrations. IEEE Robot. Autom. Lett. 2022, 7, 6052–6059.
- Kober, J.; Bagnell, J.A.; Peters, J. Reinforcement learning in robotics: A survey. Int. J. Robot. Res. 2013, 32, 1238–1274.
- Akkaladevi, S.C.; Plasch, M.; Maddukuri, S.; Eitzinger, C.; Pichler, A.; Rinner, B. Toward an interactive reinforcement based learning framework for human robot collaborative assembly processes. Front. Robot. AI 2018, 5, 126.
- Winter, J.D.; Beir, A.D.; Makrini, I.E.; de Perre, G.V.; Nowé, A.; Vanderborght, B. Accelerating interactive reinforcement learning by human advice for an assembly task by a cobot. Robotics 2019, 8, 104.
- Lai, Y.; Paul, G.; Cui, Y.; Matsubara, T. User intent estimation during robot learning using physical human robot interaction primitives. Auton. Robot. 2022, 46, 421–436.
- Hu, H.; Yang, X.; Lou, Y. A robot learning from demonstration framework for skillful small parts assembly. Int. J. Adv. Manuf. Technol. 2022, 119, 6775–6787.
- Zhang, S.; Huang, H.; Huang, D.; Yao, L.; Wei, J.; Fan, Q. Subtask-learning based for robot self-assembly in flexible collaborative assembly in manufacturing. Int. J. Adv. Manuf. Technol. 2022, 120, 6807–6819.
- Hu, Y.; Wang, Y.; Hu, K.; Li, W. Adaptive obstacle avoidance in path planning of collaborative robots for dynamic manufacturing. J. Intell. Manuf. 2021, 1–19.
- Wang, L.; Jia, S.; Wang, G.; Turner, A.; Ratchev, S. Enhancing learning capabilities of movement primitives under distributed probabilistic framework for flexible assembly tasks. Neural Comput. Appl. 2021, 1–12.
- Coninck, E.D.; Verbelen, T.; Molle, P.V.; Simoens, P.; Dhoedt, B. Learning robots to grasp by demonstration. Robot. Auton. Syst. 2020, 127, 103474.
- Steinmetz, F.; Nitsch, V.; Stulp, F. Intuitive Task-Level Programming by Demonstration Through Semantic Skill Recognition. IEEE Robot. Autom. Lett. 2019, 4, 3742–3749.
- Schlette, C.; Buch, A.G.; Hagelskjaer, F.; Iturrate, I.; Kraft, D.; Kramberger, A.; Lindvig, A.P.; Mathiesen, S.; Petersen, H.G.; Rasmussen, M.H.; et al. Towards robot cell matrices for agile production—SDU Robotics' assembly cell at the WRC 2018. Adv. Robot. 2020, 34, 422–438.
- Kyrarini, M.; Haseeb, M.A.; Ristić-Durrant, D.; Gräser, A. Robot learning of industrial assembly task via human demonstrations. Auton. Robot. 2019, 43, 239–257.
- Raiola, G.; Restrepo, S.S.; Chevalier, P.; Rodriguez-Ayerbe, P.; Lamy, X.; Tliba, S.; Stulp, F. Co-manipulation with a library of virtual guiding fixtures. Auton. Robot. 2018, 42, 1037–1051.
- Esfahani, A.M.G.; Ragaglia, M. Robot learning from demonstrations: Emulation learning in environments with moving obstacles. Robot. Auton. Syst. 2018, 101, 45–56.
- Rozo, L.; Silvério, J.; Calinon, S.; Caldwell, D.G. Learning Controllers for Reactive and Proactive Behaviors in Human–Robot Collaboration. Front. Robot. AI 2016, 3, 30.
- Rozo, L.; Calinon, S.; Caldwell, D.G.; Jiménez, P.; Torras, C. Learning Physical Collaborative Robot Behaviors From Human Demonstrations. IEEE Trans. Robot. 2016, 32, 513–527.
- Iturrate, I.; Kramberger, A.; Sloth, C. Quick Setup of Force-Controlled Industrial Gluing Tasks Using Learning From Demonstration. Front. Robot. AI 2021, 8, 354.
- Wang, Y.Q.; Hu, Y.D.; Zaatari, S.E.; Li, W.D.; Zhou, Y. Optimised Learning from Demonstrations for Collaborative Robots. Robot. Comput.-Integr. Manuf. 2021, 71, 102169.
- Liang, Y.S.; Pellier, D.; Fiorino, H.; Pesty, S. Evaluation of a Robot Programming Framework for Non-Experts Using Symbolic Planning Representations. In Proceedings of the 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), Lisbon, Portugal, 28 August–1 September 2017; pp. 1121–1126.
- Ramirez-Amaro, K.; Dean-Leon, E.; Bergner, F.; Cheng, G. A Semantic-Based Method for Teaching Industrial Robots New Tasks. KI-Kunstl. Intell. 2019, 33, 117–122.
- Fu, J.; Du, J.; Teng, X.; Fu, Y.; Wu, L. Adaptive multi-task human-robot interaction based on human behavioral intention. IEEE Access 2021, 9, 133762–133773.
- Liang, Y.S.; Pellier, D.; Fiorino, H.; Pesty, S. iRoPro: An interactive Robot Programming Framework. Int. J. Soc. Robot. 2022, 14, 177–191.
- Schou, C.; Andersen, R.S.; Chrysostomou, D.; Bøgh, S.; Madsen, O. Skill-based instruction of collaborative robots in industrial settings. Robot. Comput.-Integr. Manuf. 2018, 53, 72–80.
- Wang, N.; Chen, C.; Nuovo, A.D. A Framework of Hybrid Force/Motion Skills Learning for Robots. IEEE Trans. Cogn. Dev. Syst. 2021, 13, 162–170.
- Haage, M.; Piperagkas, G.; Papadopoulos, C.; Mariolis, I.; Malec, J.; Bekiroglu, Y.; Hedelind, M.; Tzovaras, D. Teaching Assembly by Demonstration Using Advanced Human Robot Interaction and a Knowledge Integration Framework. Procedia Manuf. 2017, 11, 164–173.
- Wu, M.; Taetz, B.; He, Y.; Bleser, G.; Liu, S. An adaptive learning and control framework based on dynamic movement primitives with application to human-robot handovers. Robot. Auton. Syst. 2022, 148, 103935.
- Sun, Y.; Wang, W.; Chen, Y.; Jia, Y. Learn How to Assist Humans Through Human Teaching and Robot Learning in Human-Robot Collaborative Assembly. IEEE Trans. Syst. Man, Cybern. Syst. 2022, 52, 728–738.
- Huang, B.; Ye, M.; Hu, Y.; Vandini, A.; Lee, S.L.; Yang, G.Z. A Multirobot Cooperation Framework for Sewing Personalized Stent Grafts. IEEE Trans. Ind. Inf. 2018, 14, 1776–1785.
- Castelli, F.; Michieletto, S.; Ghidoni, S.; Pagello, E. A machine learning-based visual servoing approach for fast robot control in industrial setting. Int. J. Adv. Robot. Syst. 2017, 14, 1729881417738884.
- Zhang, H.D.; Liu, S.B.; Lei, Q.J.; He, Y.; Yang, Y.; Bai, Y. Robot programming by demonstration: A novel system for robot trajectory programming based on robot operating system. Adv. Manuf. 2020, 8, 216–229.
- Zaatari, S.E.; Wang, Y.; Hu, Y.; Li, W. An improved approach of task-parameterized learning from demonstrations for cobots in dynamic manufacturing. J. Intell. Manuf. 2021, 33, 1503–1519.
- Zaatari, S.E.; Wang, Y.; Li, W.; Peng, Y. iTP-LfD: Improved task parametrised learning from demonstration for adaptive path generation of cobot. Robot. Comput.-Integr. Manuf. 2021, 69, 102109.
- Rodriguez-Linan, A.; Lopez-Juarez, I.; Maldonado-Ramirez, A.; Zalapa-Elias, A.; Torres-Trevino, L.; Navarro-Gonzalez, J.L.; Chinas-Sanchez, P. An Approach to Acquire Path-Following Skills by Industrial Robots from Human Demonstration. IEEE Access 2021, 9, 82351–82363.
- Racca, M.; Kyrki, V.; Cakmak, M. Interactive Tuning of Robot Program Parameters via Expected Divergence Maximization. HRI ACM/IEEE Int. Conf. Hum.-Robot Interact. 2020, 10, 629–638.
- Peternel, L.; Tsagarakis, N.; Caldwell, D.; Ajoudani, A. Robot adaptation to human physical fatigue in human–robot co-manipulation. Auton. Robot. 2018, 42, 1011–1021.
- Al-Yacoub, A.; Zhao, Y.C.; Eaton, W.; Goh, Y.M.; Lohse, N. Improving human robot collaboration through Force/Torque based learning for object manipulation. Robot. Comput.-Integr. Manuf. 2021, 69.
- Zeng, C.; Yang, C.; Zhong, J.; Zhang, J. Encoding Multiple Sensor Data for Robotic Learning Skills from Multimodal Demonstration. IEEE Access 2019, 7, 145604–145613.
- Soares, I.; Petry, M.; Moreira, A.P. Programming Robots by Demonstration Using Augmented Reality. Sensors 2021, 21, 5976.
- Lee, H.; Kim, D.; Aman, M.; Amin, U.A. Control framework for collaborative robot using imitation learning-based teleoperation from human digital twin to robot digital twin. Mechatronics 2022, 85, 102833.
- Koert, D.; Trick, S.; Ewerton, M.; Lutter, M.; Peters, J. Incremental Learning of an Open-Ended Collaborative Skill Library. Int. J. Humanoid Robot. 2020, 17, 2050001.
- Qian, K.; Xu, X.; Liu, H.; Bai, J.; Luo, S. Environment-adaptive learning from demonstration for proactive assistance in human–robot collaborative tasks. Robot. Auton. Syst. 2022, 151, 104046.
- Tang, T.; Lin, H.C.; Zhao, Y.; Fan, Y.; Chen, W.; Tomizuka, M. Teach industrial robots peg-hole-insertion by human demonstration. In Proceedings of the 2016 IEEE International Conference on Advanced Intelligent Mechatronics (AIM), Banff, AB, Canada, 12–15 July 2016; pp. 488–494.
- Schaal, S.; Peters, J.; Nakanishi, J.; Ijspeert, A. Learning movement primitives. Springer Tracts Adv. Robot. 2005, 15, 561–572.
- Hoffmann, H.; Pastor, P.; Park, D.H.; Schaal, S. Biologically-inspired dynamical systems for movement generation: Automatic real-time goal adaptation and obstacle avoidance. In Proceedings of the 2009 IEEE International Conference on Robotics and Automation, Kobe, Japan, 12–17 May 2009; pp. 2587–2592.
- Paraschos, A.; Daniel, C.; Peters, J.; Neumann, G. Probabilistic movement primitives. In Proceedings of the 27th Annual Conference on Neural Information Processing Systems (NIPS 2013), Lake Tahoe, NV, USA, 5–10 December 2013.
- Paraschos, A.; Daniel, C.; Peters, J.; Neumann, G. Using probabilistic movement primitives in robotics. Auton. Robot. 2017, 42, 529–551.
- Maeda, G.J.; Neumann, G.; Ewerton, M.; Lioutikov, R.; Kroemer, O.; Peters, J. Probabilistic movement primitives for coordination of multiple human–robot collaborative tasks. Auton. Robot. 2017, 41, 593–612.
- Koert, D.; Trick, S.; Ewerton, M.; Lutter, M.; Peters, J. Online Learning of an Open-Ended Skill Library for Collaborative Tasks. In Proceedings of the 2018 IEEE-RAS 18th International Conference on Humanoid Robots (Humanoids), Beijing, China, 6–9 November 2018; pp. 1–9.
- Amor, H.B.; Neumann, G.; Kamthe, S.; Kroemer, O.; Peters, J. Interaction primitives for human-robot cooperation tasks. In Proceedings of the IEEE International Conference on Robotics and Automation, Hong Kong, China, 31 May–7 June 2014; pp. 2831–2837.
- Calinon, S.; Guenter, F.; Billard, A. On learning, representing, and generalizing a task in a humanoid robot. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2007, 37, 286–298.
- Calinon, S.; D'Halluin, F.; Sauser, E.L.; Caldwell, D.G.; Billard, A.G. Learning and reproduction of gestures by imitation. IEEE Robot. Autom. Mag. 2010, 17, 44–54.
- Carfì, A.; Villalobos, J.; Coronado, E.; Bruno, B.; Mastrogiovanni, F. Can Human-Inspired Learning Behaviour Facilitate Human–Robot Interaction? Int. J. Soc. Robot. 2020, 12, 173–186.
- Carmigniani, J.; Furht, B. Augmented reality: An overview. In Handbook of Augmented Reality; Springer: New York, NY, USA, 2011; pp. 3–46.
- Sherman, W.R.; Craig, A.B. Understanding Virtual Reality; Morgan Kauffman: San Francisco, CA, USA, 2003.
- Grieves, M.; Vickers, J. Digital twin: Mitigating unpredictable, undesirable emergent behavior in complex systems. In Transdisciplinary Perspectives on Complex Systems; Springer: Berlin/Heidelberg, Germany, 2017; pp. 85–113.
- Palmarini, R.; Amo, I.F.D.; Bertolino, G.; Dini, G.; Erkoyuncu, J.A.; Roy, R.; Farnsworth, M. Designing an AR interface to improve trust in Human-Robots collaboration. In Proceedings of the 28th CIRP Design Conference, Nantes, France, 23–25 May 2018; Volume 70, pp. 350–355.
- Shu, B.; Sziebig, G.; Pieters, R. Architecture for Safe Human-Robot Collaboration: Multi-Modal Communication in Virtual Reality for Efficient Task Execution. In Proceedings of the 2019 IEEE 28th International Symposium on Industrial Electronics (ISIE), Vancouver, BC, Canada, 12–14 June 2019; pp. 2297–2302.
- Materna, Z.; Kapinus, M.; Beran, V.; Smrž, P.; Zemčík, P. Interactive Spatial Augmented Reality in Collaborative Robot Programming: User Experience Evaluation. In Proceedings of the 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), Nanjing, China, 27–31 August 2018; pp. 80–87.
- Bambušek, D.; Materna, Z.; Kapinus, M.; Beran, V.; Smrž, P. Combining Interactive Spatial Augmented Reality with Head-Mounted Display for End-User Collaborative Robot Programming. In Proceedings of the 2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), New Delhi, India, 14–19 October 2019; pp. 1–8.
- Manou, E.; Vosniakos, G.C.; Matsas, E. Off-line programming of an industrial robot in a virtual reality environment. Int. J. Interact. Des. Manuf. 2019, 13, 507–519.
- Burghardt, A.; Szybicki, D.; Gierlak, P.; Kurc, K.; Pietruś, P.; Cygan, R. Programming of Industrial Robots Using Virtual Reality and Digital Twins. Appl. Sci. 2020, 10, 486.
- Malik, A.A.; Brem, A. Digital twins for collaborative robots: A case study in human-robot interaction. Robot. Comput.-Integr. Manuf. 2021, 68, 102092.
- Pérez, L.; Rodríguez-Jiménez, S.; Rodríguez, N.; Usamentiaga, R.; García, D.F. Digital twin and virtual reality based methodology for multi-robot manufacturing cell commissioning. Appl. Sci. 2020, 10, 3633.
- Xia, K.; Sacco, C.; Kirkpatrick, M.; Saidy, C.; Nguyen, L.; Kircaliali, A.; Harik, R. A digital twin to train deep reinforcement learning agent for smart manufacturing plants: Environment, interfaces and intelligence. J. Manuf. Syst. 2020, 58, 210–230.
- Pedersen, M.R.; Nalpantidis, L.; Andersen, R.S.; Schou, C.; Bøgh, S.; Krüger, V.; Madsen, O. Robot skills for manufacturing: From concept to industrial deployment. Robot. Comput.-Integr. Manuf. 2016, 37, 282–291.
Year | Author | Title | Focus |
---|---|---|---|
2017 | Hussein et al. [14] | Imitation Learning: A survey of learning methods | Algorithms |
2018 | Zhu et al. [15] | Robot Learning from Demonstration in Robotic Assembly: A survey | Assembly operations |
2020 | Ravichandar et al. [16] | Recent Advances in Robot Learning from Demonstration | General |
2020 | Xie et al. [17] | Robot learning from demonstration for path planning: A review | Path planning |
Classification | Method | Used in |
---|---|---|
Physical guidance | Kinaesthetic Teaching | [27,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65]
Physical guidance | Teleoperation & Haptics | [66]
Physical guidance | Shadowing | [67]
Observable guidance | Vision system | [28,68,69,70,71,72,73,74,75]
Observable guidance | IMU | [76]
Reactive guidance | NLP | [46]
Reactive guidance | GUI | [77]
Multimodal guidance | Mixture of methods | [31,78,79,80,81,82,83,84,85]
Year | Author | Classification | Method | Robot |
---|---|---|---|---|
2020 | Zhang et al. [73] | Observable | Marker tracking | UR5 |
2021 | Soares et al. [81] | Multimodal | Augmented Reality | UR5/ABB IRB2600 |
2021 | Rodriguez et al. [76] | Observable | IMU | Kuka KR 16 |
2022 | Lee et al. [82] | Multimodal | Vision System & Digital Twin | ABB YuMi |
Year | Author | Modeling Method | Encoding of | Skill Learned |
---|---|---|---|---|
2017 | 🟉 Liang et al. [62] | DMP | Motion | Pick & Place |
2018 | Peternel et al. [78] | DMP | Motion and Force | Visual servoing |
2018 | Ghalamzan et al. [57] | DMP and GMM-GMR | Motion | Obstacle avoidance |
2019 | 🟉 Schlette et al. [54] | DMP | Motion and Force | Obstacle avoidance |
2021 | Iturrate et al. [60] | DMP | Motion and Force | Gluing |
2021 | Wang et al. [67] | DMP | Motion and Force | Sweeping |
2022 | Wu et al. [69] | DMP-GP | Motion and Force | Hand over |
2022 | 🟉 Liang et al. [65] | Keyframe-based | Motion | Various* |
Year | Author | Modeling Method | Encoding of | Skill Learned
---|---|---|---|---|
2016 | Koskinopoulou et al. [27] | GPLVM and PCA | Motion | Picking and Pushing
2016 | Rozo et al. [58] | ADHSMM | Motion | Hand over |
2016 | Rozo et al. [59] | TP-GMM | Motion and Force | Lifting |
2016 | Tang et al. [85] | GMM-GMR | Motion and Force | Peg-in-hole |
2017 | Castelli et al. [72] | GMM-GMR | Motion | Visual servoing |
2018 | Raiola et al. [56] | GMM-GMR | Motion and Force | Virtual guidance pick and place |
2018 | 🟉 Huang et al. [71] | GMM-GMR | Motion | Sewing |
2019 | Qu et al. [28] | PCA and GMM-GMR | Motion | Dual arm coordination |
2019 | 🟉 Kyrarini et al. [55] | GMM-GMR | Motion | Object manipulation |
2019 | Zeng et al. [80] | HSMM-GMR | Motion and Force | Pushing |
2020 | DeConinck et al. [52] | CNN | Motion | Grasping |
2020 | 🟉 Koert et al. [83] | ProMP | Motion | Hand over |
2021 | Al-Yacoub et al. [79] | WRF | Motion and Force | Grasping |
2021 | Wang et al. [61] | GMM-GMR | Motion | Pick & Place |
2021 | 🟉 Hu et al. [50] | GMM-GMR | Motion | Pick and place |
2021 | Fu et al. [64] | MTProMP/MTiProMP | Motion | Multi-tasking |
2021 | Zaatari et al. [74] | TPGMM-TPGMR | Motion | Various* |
2021 | Zaatari et al. [75] | TPGMM-TPGMR | Motion | Various* |
2021 | Wang et al. [51] | GMM-GMR | Motion | Pick |
2022 | 🟉 Qian et al. [84] | EIProMP | Motion | Hand over |
2022 | Zhang et al. [49] | GMM-GMR | Motion | Grasping |
2022 | Hu et al. [48] | TPGMM-TPGMR | Motion and Force | Grasping |
2022 | Lai et al. [47] | pHRIP | Motion and Force | Target Reaching |
Year | Author | Modeling | Task |
---|---|---|---|
2017 | Haage et al. [68] | Semantic Learning | Assembly |
2017 | 🟉 Liang et al. [62] | Policy Learning | Assembly |
2018 | 🟉 Huang et al. [71] | Semantic Learning | Assembly |
2018 | Schou et al. [66] | Semantic Learning | Various |
2019 | Winter et al. [46] | Policy Learning | Assembly |
2019 | Steinmetz et al. [53] | Semantic Learning | Various |
2019 | Wang et al. [31] | Reward Learning | Assembly |
2019 | Ramirez-Amaro et al. [63] | Semantic Learning | Sorting |
2019 | 🟉 Schlette et al. [54] | Policy Learning | Assembly |
2020 | 🟉 Koert et al. [83] | Semantic Learning | Assembly |
2020 | Racca et al. [77] | Semantic Learning | Hand-over |
2022 | Sun et al. [70] | Semantic Learning | Assembly |
2019 | 🟉 Kyrarini et al. [55] | Policy Learning | Assembly |
2022 | 🟉 Hu et al. [48] | Policy Learning | Assembly |
2022 | 🟉 Liang et al. [65] | Policy Learning | Assembly |
2022 | 🟉 Qian et al. [84] | Semantic Learning | Assembly |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Sosa-Ceron, A.D.; Gonzalez-Hernandez, H.G.; Reyes-Avendaño, J.A. Learning from Demonstrations in Human–Robot Collaborative Scenarios: A Survey. Robotics 2022, 11, 126. https://doi.org/10.3390/robotics11060126