DOI: https://doi.org/10.1145/3675094.3678447
Research article · Open access

Composite Image Generation Using Labeled Segments for Pattern-Rich Dataset without Unannotated Target

Published: 05 October 2024

Abstract

Although camera-based object detection holds promise for a wide range of applications, training new models incurs substantial dataset-creation costs in settings where general-purpose models are ineffective, such as industrial environments. We previously developed a semi-automated annotation framework that employs optical flow and representation learning to significantly reduce human effort. Compared with manual annotation, however, it was prone to unintended annotation omissions and mistakes. In this study, we propose a composite image generation approach for creating omission-free, pattern-rich datasets. The proposed method synthesizes natural-looking images that contain no unannotated targets by placing labeled foreground segments at their original positions on targetless background frames collected with the same fixed-point cameras. Evaluation on video footage from a logistics warehouse confirmed that the improved dataset reliability led to higher model performance.
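
The compositing step is described above only at a high level; the following Python sketch illustrates one plausible reading of it. The function name and segment-dictionary schema are hypothetical, not the authors' code: each labeled foreground segment, cropped from an annotated frame, is pasted at its original pixel coordinates onto a targetless background frame from the same fixed-point camera, and its label is carried over as a bounding-box annotation.

```python
import numpy as np

def composite(background, segments):
    """Paste labeled foreground segments onto a targetless background frame.

    background: H x W x 3 uint8 frame containing no targets.
    segments:   list of dicts (hypothetical schema), each holding
                'patch' (h x w x 3 uint8 crop), 'mask' (h x w bool),
                'x' and 'y' (the crop's top-left position in the
                original frame), and 'label' (class name).
    Returns the composite image and its bounding-box annotations.
    """
    image = background.copy()
    annotations = []
    for seg in segments:
        h, w = seg['mask'].shape
        x, y = seg['x'], seg['y']
        roi = image[y:y + h, x:x + w]  # view into the composite image
        # Overwrite only foreground pixels; the surrounding background
        # stays untouched, which keeps the composite looking natural.
        roi[seg['mask']] = seg['patch'][seg['mask']]
        annotations.append({'label': seg['label'],
                            'bbox': (x, y, x + w, y + h)})
    return image, annotations
```

Because the background frames come from the same fixed-point cameras, pasting at the original coordinates preserves perspective, scale, and lighting, and recombining segments with different background frames yields the pattern-rich variation the approach aims for.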



Published In

UbiComp '24: Companion of the 2024 ACM International Joint Conference on Pervasive and Ubiquitous Computing
October 2024, 1032 pages
ISBN: 9798400710582
DOI: 10.1145/3675094
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. data augmentation
      2. image synthesis
      3. logistics warehouse
      4. optical flow
      5. representation learning


Conference

UbiComp '24
Overall Acceptance Rate: 764 of 2,912 submissions, 26%

