DOI: 10.1145/3544548.3581185
Research article

TmoTA: Simple, Highly Responsive Tool for Multiple Object Tracking Annotation

Published: 19 April 2023

Abstract

Machine learning is applied in a multitude of sectors with impressive results. This success rests on the ever-growing amount of data acquired by omnipresent sensor devices and platforms on the internet. However, most ML methods require labeled data, which remains scarce because generating it takes considerable time and resources. In this paper, we propose a portable, open-source, simple, and responsive manual Tool for 2D multiple object Tracking Annotation (TmoTA). Besides responsiveness, our tool design provides several features, such as view centering and looped playback, that speed up the annotation process. We evaluate TmoTA by comparing it with the widely used manual labeling tools CVAT and Label Studio and the two semi-automated tools Supervisely and VATIC with respect to object labeling time and accuracy. The evaluation includes a user study and pre-case studies showing that, compared to the manual labeling tools, the annotation time per object frame can be reduced by 20% to 40% over the first 20 annotated objects.
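
The data such a tool produces can be pictured as one axis-aligned bounding box per object per frame of the video. The sketch below is a minimal Python illustration of that structure under the assumption of a MOTChallenge-style CSV export; the names Box and export_mot are hypothetical and are not taken from TmoTA's code base or its actual file format.

from dataclasses import dataclass
from typing import List

# Illustrative sketch only: TmoTA's internal representation and export format
# are not described on this page. The column order below follows the common
# MOTChallenge ground-truth convention (frame, id, left, top, width, height, ...).

@dataclass
class Box:
    frame: int      # frame index within the video sequence
    obj_id: int     # track identifier, stable across frames
    left: float     # x of the top-left corner in pixels
    top: float      # y of the top-left corner in pixels
    width: float
    height: float

def export_mot(boxes: List[Box], path: str) -> None:
    """Write annotations as comma-separated lines, one line per object frame."""
    with open(path, "w") as f:
        for b in sorted(boxes, key=lambda b: (b.frame, b.obj_id)):
            f.write(f"{b.frame},{b.obj_id},{b.left:.1f},{b.top:.1f},"
                    f"{b.width:.1f},{b.height:.1f},1,-1,-1,-1\n")

# Example: two frames of a single tracked object (id 7).
boxes = [Box(1, 7, 100.0, 50.0, 40.0, 80.0),
         Box(2, 7, 104.0, 51.0, 40.0, 80.0)]
export_mot(boxes, "annotations.txt")

Under this reading, the "annotation time per object frame" reported in the evaluation presumably corresponds to the total annotation time divided by the number of such box entries.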

Supplementary Material

  • Supplemental Materials (3544548.3581185-supplemental-materials.zip)
  • Pre-recorded Video Presentation: MP4 file (3544548.3581185-talk-video.mp4)
  • Video Preview: MP4 file (3544548.3581185-video-preview.mp4)



Published In

CHI '23: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems
April 2023
14911 pages
ISBN: 9781450394215
DOI: 10.1145/3544548
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 April 2023


Author Tags

  1. data labeling
  2. manual labeling
  3. video sequence labeling

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • German Federal Ministry of Education and Research (BMBF, 01/S18026A-F), funding the Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI) Dresden/Leipzig
  • Deutsche Forschungsgemeinschaft through DFG grant 389792660 as part of TRR 248 and the Clusters of Excellence CeTI (EXC 2050/1, grant 390696704)

Conference

CHI '23

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%


Article Metrics

  • Total Citations: 0
  • Total Downloads: 376
  • Downloads (Last 12 months): 166
  • Downloads (Last 6 weeks): 12
Reflects downloads up to 14 Dec 2024
