DOI: 10.1145/3544548.3581185
Research article

TmoTA: Simple, Highly Responsive Tool for Multiple Object Tracking Annotation

Published: 19 April 2023

Abstract

Machine learning is applied in a multitude of sectors with impressive results. This success rests on the ever-growing amount of data acquired by omnipresent sensor devices and platforms on the internet. However, most ML methods require labeled data, which remains scarce because generating it takes considerable time and resources. In this paper, we propose a portable, open-source, simple, and responsive manual Tool for 2D multiple object Tracking Annotation (TmoTA). Besides responsiveness, our tool design provides several features, such as view centering and looped playback, that speed up the annotation process. We evaluate TmoTA by comparing it with the widely used manual labeling tools CVAT and Label Studio and the two semi-automated tools Supervisely and VATIC with respect to object labeling time and accuracy. The evaluation includes a user study and pre-case studies showing that, compared to the manual labeling tools, the annotation time per object frame can be reduced by 20% to 40% over the first 20 annotated objects.
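
The data such a tool produces can be pictured as one axis-aligned bounding box per object per frame of the video. The sketch below is a minimal Python illustration of that structure under the assumption of a MOTChallenge-style CSV export; the names Box and export_mot are hypothetical and are not taken from TmoTA's code base or its actual file format.

from dataclasses import dataclass
from typing import List

# Illustrative sketch only: TmoTA's internal representation and export format
# are not described on this page. The column order below follows the common
# MOTChallenge ground-truth convention (frame, id, left, top, width, height, ...).

@dataclass
class Box:
    frame: int      # frame index within the video sequence
    obj_id: int     # track identifier, stable across frames
    left: float     # x of the top-left corner in pixels
    top: float      # y of the top-left corner in pixels
    width: float
    height: float

def export_mot(boxes: List[Box], path: str) -> None:
    """Write annotations as comma-separated lines, one line per object frame."""
    with open(path, "w") as f:
        for b in sorted(boxes, key=lambda b: (b.frame, b.obj_id)):
            f.write(f"{b.frame},{b.obj_id},{b.left:.1f},{b.top:.1f},"
                    f"{b.width:.1f},{b.height:.1f},1,-1,-1,-1\n")

# Example: two frames of a single tracked object (id 7).
boxes = [Box(1, 7, 100.0, 50.0, 40.0, 80.0),
         Box(2, 7, 104.0, 51.0, 40.0, 80.0)]
export_mot(boxes, "annotations.txt")

Under this reading, the "annotation time per object frame" reported in the evaluation presumably corresponds to the total annotation time divided by the number of such box entries.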

Supplementary Material

  • Supplemental Materials (3544548.3581185-supplemental-materials.zip)
  • Pre-recorded Video Presentation: MP4 file (3544548.3581185-talk-video.mp4)
  • Video Preview: MP4 file (3544548.3581185-video-preview.mp4)



Published In

CHI '23: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems
April 2023
14911 pages
ISBN: 9781450394215
DOI: 10.1145/3544548
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 April 2023


Author Tags

  1. data labeling
  2. manual labeling
  3. video sequence labeling

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • German Federal Ministry of Education and Research (BMBF, 01/S18026A-F), funding the Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI) Dresden/Leipzig
  • Deutsche Forschungsgemeinschaft through DFG grant 389792660 as part of TRR 248 and the Clusters of Excellence CeTI (EXC 2050/1, grant 390696704)

Conference

CHI '23

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%


Article Metrics

  • Total Citations: 0
  • Total Downloads: 376
  • Downloads (Last 12 months): 166
  • Downloads (Last 6 weeks): 12
Reflects downloads up to 14 Dec 2024
