DOI: 10.1145/3319502.3374820

See What I See: Enabling User-Centric Robotic Assistance Using First-Person Demonstrations

Published: 09 March 2020

Abstract

We explore first-person demonstration as an intuitive way of producing task demonstrations to facilitate user-centric robotic assistance. First-person demonstration directly captures the human experience of task performance via head-mounted cameras and naturally includes productive viewpoints for task actions. We implemented a perception system that parses natural first-person demonstrations into task models consisting of sequential task procedures, spatial configurations, and unique task viewpoints. We also developed a robotic system capable of interacting autonomously with users as it follows previously acquired task demonstrations. To evaluate the effectiveness of our robotic assistance, we conducted a user study contextualized in an assembly scenario; we sought to determine how assistance based on a first-person demonstration (user-centric assistance) versus that informed only by the cover image of the official assembly instruction (standard assistance) may shape users' behaviors and overall experience when working alongside a collaborative robot. Our results show that participants felt that their robot partner was more collaborative and considerate when it provided user-centric assistance than when it offered only standard assistance. Additionally, participants were more likely to exhibit unproductive behaviors, such as using their non-dominant hand, when performing the assembly task without user-centric assistance.
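
The abstract describes task models built from sequential task procedures, spatial configurations, and task viewpoints. The sketch below is a minimal Python illustration of what such a data structure might look like; the class and field names (TaskStep, TaskModel, next_step, and the example values) are assumptions made for illustration only, not the authors' implementation.

from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Pose = Tuple[float, float, float]  # x, y, z position; orientation omitted for brevity


@dataclass
class TaskStep:
    action: str                      # e.g. "insert dowel into seat"
    parts: List[str]                 # objects manipulated in this step
    spatial_config: Dict[str, Pose]  # target placement of each part, taken from the demonstration
    viewpoint: Pose                  # head-mounted-camera pose observed while the step was shown


@dataclass
class TaskModel:
    name: str
    steps: List[TaskStep] = field(default_factory=list)

    def next_step(self, completed: int) -> Optional[TaskStep]:
        """Return the next step the robot should assist with, or None when the task is done."""
        return self.steps[completed] if completed < len(self.steps) else None


# Hypothetical two-step model, of the kind a demonstration parser might emit.
stool = TaskModel(
    name="stool_assembly",
    steps=[
        TaskStep("insert dowel", ["dowel", "seat"],
                 {"dowel": (0.10, 0.02, 0.05)}, viewpoint=(0.00, -0.30, 0.40)),
        TaskStep("attach leg", ["leg", "seat"],
                 {"leg": (0.12, 0.02, 0.00)}, viewpoint=(0.10, -0.25, 0.35)),
    ],
)
print(stool.next_step(completed=0).action)  # -> insert dowel

In a pipeline like the one described, the robot would consult the step sequence to decide what to hand over next and use the stored viewpoint and spatial configuration to present parts from the demonstrator's perspective.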

Supplementary Material

MP4 File (p639-wang.mp4)




Published In

HRI '20: Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction
March 2020
690 pages
ISBN: 9781450367462
DOI: 10.1145/3319502
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. collaborative robotics
  2. first-person demonstration
  3. human-robot interaction
  4. programming by demonstration

Qualifiers

  • Research-article

Conference

HRI '20

Acceptance Rates

Overall Acceptance Rate 268 of 1,124 submissions, 24%


Cited By

  • (2023) Learning Interaction Regions and Motion Trajectories Simultaneously From Egocentric Demonstration Videos. IEEE Robotics and Automation Letters, 8(10), 6635-6642. DOI: 10.1109/LRA.2023.3301307. Online publication date: Oct-2023.
  • (2022) Detection of Physical Strain and Fatigue in Industrial Environments Using Visual and Non-Visual Low-Cost Sensors. Technologies, 10(2), 42. DOI: 10.3390/technologies10020042. Online publication date: 16-Mar-2022.
  • (2022) ARnnotate: An Augmented Reality Interface for Collecting Custom Dataset of 3D Hand-Object Interaction Pose Estimation. Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology, 1-14. DOI: 10.1145/3526113.3545663. Online publication date: 29-Oct-2022.
  • (2021) A Survey on End-User Robot Programming. ACM Computing Surveys, 54(8), 1-36. DOI: 10.1145/3466819. Online publication date: 4-Oct-2021.
  • (2021) Assisted End-User Robot Programming. Proceedings of the 2021 International Conference on Multimodal Interaction, 797-801. DOI: 10.1145/3462244.3481276. Online publication date: 18-Oct-2021.
  • (2021) Fine-Grained Activity Recognition for Assembly Videos. IEEE Robotics and Automation Letters, 6(2), 3728-3735. DOI: 10.1109/LRA.2021.3064149. Online publication date: Apr-2021.
  • (2021) Foreground-Aware Stylization and Consensus Pseudo-Labeling for Domain Adaptation of First-Person Hand Segmentation. IEEE Access, 9, 94644-94655. DOI: 10.1109/ACCESS.2021.3094052. Online publication date: 2021.
  • (2020) From Do You See What I See? to Do You Control What I See? Mediated Vision, From a Distance, for Eyewear Users. Proceedings of the 19th International Conference on Mobile and Ubiquitous Multimedia, 326-328. DOI: 10.1145/3428361.3432089. Online publication date: 22-Nov-2020.
  • (2020) User Needs and Design Opportunities in End-User Robot Programming. Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, 93-95. DOI: 10.1145/3371382.3378300. Online publication date: 23-Mar-2020.
  • (2020) Contextual Programming of Collaborative Robots. Artificial Intelligence in HCI, 321-338. DOI: 10.1007/978-3-030-50334-5_22. Online publication date: 10-Jul-2020.
