DOI: 10.5555/3306127.3331817

Fully Convolutional One-Shot Object Segmentation for Industrial Robotics

Published: 08 May 2019

Abstract

The ability to identify and localize new objects robustly and efficiently is vital for robotic grasping and manipulation in warehouses and smart factories. Deep convolutional neural networks (DCNNs) have achieved state-of-the-art performance on established image datasets for object detection and segmentation. However, applying DCNNs in dynamic industrial scenarios, e.g., warehouses and autonomous production, remains challenging: DCNNs quickly become ineffective when tasked with detecting objects they have not been trained on. Because re-training on the latest data is time-consuming, DCNNs cannot meet the requirement of the Factory of the Future (FoF) for rapid development and production cycles. To address this problem, we propose a novel one-shot object segmentation framework, using a fully convolutional Siamese network architecture, to detect previously unknown objects from a single prototype image. We employ multi-task learning to reduce training time and improve classification accuracy. Furthermore, we introduce a novel approach to automatically cluster the learnt feature-space representation in a weakly supervised manner. We test the proposed framework on the RoboCup@Work dataset, simulating requirements for the FoF. Results show that the trained network on average identifies 73% of previously unseen objects correctly from a single example image. Correctly identified objects are estimated to have an 87.53% successful pick-up rate. Finally, multi-task learning lowers the convergence time by up to 33% and increases accuracy by 2.99%.
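The core one-shot idea described above, embedding both the prototype image and the scene with a weight-sharing ("Siamese") fully convolutional extractor and locating the object by correlating the two embeddings, can be sketched in miniature. This is a NumPy stand-in, not the paper's architecture: `conv_features`, the 2x2 averaging kernel, and the threshold are illustrative placeholders for the learned network.

```python
import numpy as np

def conv_features(img, kernel):
    """Toy shared feature extractor: one valid-mode 2-D correlation plus ReLU.
    Both Siamese branches call this with the *same* kernel (shared weights)."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0.0)

def one_shot_response(scene, prototype, kernel):
    """Cross-correlate the prototype's feature map over the scene's feature
    map; the peak marks the best match for the single example object."""
    f_scene = conv_features(scene, kernel)
    f_proto = conv_features(prototype, kernel)
    ph, pw = f_proto.shape
    H, W = f_scene.shape
    resp = np.zeros((H - ph + 1, W - pw + 1))
    for i in range(resp.shape[0]):
        for j in range(resp.shape[1]):
            resp[i, j] = np.sum(f_scene[i:i + ph, j:j + pw] * f_proto)
    return resp

# 8x8 "scene" with a 3x3 bright object; the prototype is a crop of that object
scene = np.zeros((8, 8))
scene[2:5, 3:6] = 1.0
prototype = np.ones((3, 3))
kernel = np.ones((2, 2)) / 4.0       # stand-in for frozen learned weights

resp = one_shot_response(scene, prototype, kernel)
row, col = np.unravel_index(np.argmax(resp), resp.shape)
mask = resp >= 0.9 * resp.max()      # threshold into a coarse segmentation mask
```

Because the extractor is fully convolutional and shared between branches, the scene can be any size and no per-object re-training is needed; only the prototype crop changes when a new object is introduced.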



Published In

AAMAS '19: Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems
May 2019
2518 pages
ISBN:9781450363099

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC


Author Tags

  1. fully convolutional network
  2. industrial robotics
  3. one-shot segmentation

Qualifiers

  • Research-article

Funding Sources

  • This work was supported by the EPSRC project Robotics and Artificial Intelligence for Nuclear (RAIN)

Conference

AAMAS '19

Acceptance Rates

AAMAS '19 Paper Acceptance Rate 193 of 793 submissions, 24%;
Overall Acceptance Rate 1,155 of 5,036 submissions, 23%

