DOI: https://doi.org/10.1145/3675094.3678447
Research article · Open access

Composite Image Generation Using Labeled Segments for Pattern-Rich Dataset without Unannotated Target

Published: 05 October 2024

Abstract

Although camera-based object detection holds promise for a wide range of applications, training new models incurs substantial dataset-creation costs in settings where general-purpose models are ineffective, such as industrial environments. We previously developed a semi-automated annotation framework that employs optical flow and representation learning to significantly reduce human effort. Compared with manual annotation, however, it was prone to unintended annotation omissions and mistakes. In this study, we propose a composite image generation approach for creating omission-free, pattern-rich datasets. The proposed method synthesizes natural-looking images that contain no unannotated targets by placing labeled foreground segments at their original positions on targetless background frames collected with the same fixed-point cameras. Evaluation on video footage from a logistics warehouse confirmed that the improved dataset reliability led to higher model performance.
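
The compositing step is described above only at a high level; the following Python sketch illustrates one plausible reading of it. The function name and segment-dictionary schema are hypothetical, not the authors' code: each labeled foreground segment, cropped from an annotated frame, is pasted at its original pixel coordinates onto a targetless background frame from the same fixed-point camera, and its label is carried over as a bounding-box annotation.

```python
import numpy as np

def composite(background, segments):
    """Paste labeled foreground segments onto a targetless background frame.

    background: H x W x 3 uint8 frame containing no targets.
    segments:   list of dicts (hypothetical schema), each holding
                'patch' (h x w x 3 uint8 crop), 'mask' (h x w bool),
                'x' and 'y' (the crop's top-left position in the
                original frame), and 'label' (class name).
    Returns the composite image and its bounding-box annotations.
    """
    image = background.copy()
    annotations = []
    for seg in segments:
        h, w = seg['mask'].shape
        x, y = seg['x'], seg['y']
        roi = image[y:y + h, x:x + w]  # view into the composite image
        # Overwrite only foreground pixels; the surrounding background
        # stays untouched, which keeps the composite looking natural.
        roi[seg['mask']] = seg['patch'][seg['mask']]
        annotations.append({'label': seg['label'],
                            'bbox': (x, y, x + w, y + h)})
    return image, annotations
```

Because the background frames come from the same fixed-point cameras, pasting at the original coordinates preserves perspective, scale, and lighting, and recombining segments with different background frames yields the pattern-rich variation the approach aims for.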



Published In

UbiComp '24: Companion of the 2024 ACM International Joint Conference on Pervasive and Ubiquitous Computing
October 2024, 1032 pages
ISBN: 9798400710582
DOI: 10.1145/3675094
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. data augmentation
      2. image synthesis
      3. logistics warehouse
      4. optical flow
      5. representation learning


Conference

UbiComp '24
Overall Acceptance Rate: 764 of 2,912 submissions, 26%

