Enhancement of Speed and Accuracy Trade-Off for Sports Ball Detection in Videos—Finding Fast Moving, Small Objects in Real Time
Figure 1. Architecture of the original YOLOv3 with 416 × 416 pixels input resolution.
Figure 2. High-level overview of the adapted YOLOv3 training and inference pipeline. For training, we apply data augmentation to every frame by inserting additional balls. For each frame, we consider the previous images to calculate motion information. The value channel of the resulting motion frame is passed to the network alongside the original RGB image. We limit the bounding box prediction to the small-objects branch in order to focus detection on small objects only.
Figure 3. Cut-outs of the ball from the original input image at different scales. Downscaling the original image with a 16:9 aspect ratio to a square resolution of 960 × 960 pixels causes an elongation of the ball. Even though downscaling to 960 × 544 pixels does not preserve the original aspect ratio exactly, the characteristic circular shape of the ball is still maintained.
Figure 4. Comparison of the architectures of the standard YOLOv3 with an input resolution of 416 × 416 pixels and the micro model with motion channel and an input resolution of 960 × 544 pixels.
Figure 5. Examples of calculated motion channel inputs for a given input frame using different motion fade ratios.
Figure 6. Different augmentation strategies for a given input frame. (b) shows augmentation using multiple copies of the original ball (as used for static augmentation). (c) shows augmentation using previously captured and randomly modified samples of beach volleyball balls (as used for motion augmentation). The red and blue trajectories visualize the simulated movement of two exemplary artificial balls over multiple frames.
Figure 7. Examples of false positive detections by the reference model regular_960 × 544 that are suppressed by the micro_960 × 544_augmentation_motion_3_rt model using motion information. The examples show structures similar to the ball in the RGB image; however, they do not produce characteristic patterns in the motion channel and are consequently not recognized as balls.
Figure 8. Comparison between the reference model regular_960 × 544 and the finally selected micro model micro_960 × 544_augmentation_motion_3_rt with motion channel. Due to the suppression of false positive detections, a significantly higher precision and consequently a higher F1-score is achieved. At the same time, the processing speed is increased by more than 10%.
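The motion channel described in the captions can be sketched as follows. The authors' exact update rule is not reproduced here, so this is a minimal illustration under assumed details: the HSV value channel of consecutive frames is differenced, and older motion is kept as a decaying trail via a `fade` factor (playing the role of the motion fade ratio mentioned above). The function names are hypothetical.

```python
import numpy as np

def value_channel(rgb):
    # The HSV "value" channel is the per-pixel maximum over R, G, B.
    return rgb.max(axis=-1).astype(np.float32)

def update_motion_channel(prev_motion, prev_frame, frame, fade=0.8):
    # Assumed formulation: absolute difference of consecutive value
    # channels, combined with a faded copy of the previous motion image
    # so that earlier movement leaves a decaying trail.
    diff = np.abs(value_channel(frame) - value_channel(prev_frame))
    return np.maximum(diff, fade * prev_motion)

# Toy usage: a bright "ball" pixel moving one column per frame.
h, w = 4, 6
frames = []
for t in range(3):
    f = np.zeros((h, w, 3), dtype=np.uint8)
    f[1, t] = 255  # moving white dot
    frames.append(f)

motion = np.zeros((h, w), dtype=np.float32)
for prev, cur in zip(frames, frames[1:]):
    motion = update_motion_channel(motion, prev, cur)

# The fourth network input channel would be this motion image,
# stacked with the current RGB frame:
net_input = np.dstack([frames[-1].astype(np.float32), motion])
```

The result is a four-channel input of shape (h, w, 4), in which recent motion is bright and older motion fades out, matching the trail-like patterns shown for different fade ratios.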
Abstract
1. Introduction
2. Related Work
2.1. Object Detection in Images
2.2. YOLOv3
2.3. Object Detection in Videos
3. Methods
3.1. Adaptations of YOLOv3
3.2. Input Size
3.3. Anchor Boxes
3.4. Architecture Changes
3.5. Motion Channel
3.6. Data Augmentation
3.7. Speed Optimization
4. Experiments
4.1. Datasets
4.2. Models
4.3. Training
4.4. Evaluation Metrics
5. Results
5.1. Performance
5.2. Speed
6. Discussion
6.1. Effect of Input Resolution
6.2. Effect of Motion Channel
6.3. Effect of Data Augmentation
6.4. Effect of Refinement Training
6.5. Effect of Architectural Changes
6.6. Final Model Selection
7. Summary and Conclusions
8. Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
AP | Average Precision |
BoF | Bag of Freebies |
BoS | Bag of Specials |
CNN | Convolutional Neural Network |
FCN | Fully Convolutional Network |
FN | False Negatives |
FP | False Positives |
FPS | Frames per Second |
GT | Ground Truth |
HSV | Hue, Saturation, Value |
IoU | Intersection over Union |
mAP | Mean Average Precision |
NMS | Non-Maximum Suppression |
RGB | Red, Green, Blue |
TP | True Positives |
Model | Augmentation | Motion | Input Width | Input Height |
---|---|---|---|---|
regular_416 × 416_COCO | | | 416 | 416 |
regular_416 × 416 | | | 416 | 416 |
regular_608 × 608 | | | 608 | 608 |
regular_960 × 960 | | | 960 | 960 |
regular_960 × 544 | | | 960 | 544 |
micro_960 × 544_augmentation_static_75 | static | √ | 960 | 544 |
micro_960 × 544_augmentation_static_50 | static | √ | 960 | 544 |
micro_960 × 544_augmentation_static_20 | static | √ | 960 | 544 |
micro_960 × 544_augmentation_motion_5 | motion | √ | 960 | 544 |
micro_960 × 544_augmentation_motion_10 | motion | √ | 960 | 544 |
micro_960 × 544_augmentation_motion_20 | motion | √ | 960 | 544 |
micro_960 × 544 | | √ | 960 | 544 |
micro_960 × 544_no_motion | | | 960 | 544 |
micro_960 × 544_no_motion_augmentation_50 | static | | 960 | 544 |
micro_960 × 544_augmentation_motion_10_rt | motion | √ | 960 | 544 |
micro_960 × 544_augmentation_motion_3_rt | motion | √ | 960 | 544 |
micro_960 × 544_rt | | √ | 960 | 544 |
regular_960 × 544_augmentation_motion_10_rt | motion | | 960 | 544 |
regular_960 × 544_augmentation_motion_3_rt | motion | | 960 | 544 |
regular_960 × 544_rt | | | 960 | 544 |
Model | Input Width | Input Height | Input Pixels | Trainable Parameters | Non-Trainable Parameters | Total Parameters | FPS | Training Time per Epoch (min:s) | Training Time per Sample (ms) |
---|---|---|---|---|---|---|---|---|---|
regular_416 × 416_COCO | 416 | 416 | 173,056 | 61,523,734 | 52,608 | 61,576,342 | 16.2 | | |
regular_416 × 416 | 416 | 416 | 173,056 | 61,523,734 | 52,608 | 61,576,342 | 16.2 | 3:12 | 5.267 |
regular_608 × 608 | 608 | 608 | 369,664 | 61,523,734 | 52,608 | 61,576,342 | 12.9 | 3:14 | 5.325 |
regular_960 × 960 | 960 | 960 | 921,600 | 61,523,734 | 52,608 | 61,576,342 | 8.1 | 3:21 | 5.525 |
regular_960 × 544 | 960 | 544 | 522,240 | 61,523,734 | 52,608 | 61,576,342 | 11.7 | 3:23 | 5.584 |
micro_960 × 544_augmentation_static_75 | 960 | 544 | 522,240 | 55,595,026 | 49,536 | 55,644,562 | 12.9 | 2:59 | 4.926 |
micro_960 × 544_augmentation_static_50 | 960 | 544 | 522,240 | 55,595,026 | 49,536 | 55,644,562 | 12.9 | 2:22 | 3.920 |
micro_960 × 544_augmentation_static_20 | 960 | 544 | 522,240 | 55,595,026 | 49,536 | 55,644,562 | 12.9 | 1:37 | 2.672 |
micro_960 × 544_augmentation_motion_5 | 960 | 544 | 522,240 | 55,595,026 | 49,536 | 55,644,562 | 12.9 | 1:18 | 2.145 |
micro_960 × 544_augmentation_motion_10 | 960 | 544 | 522,240 | 55,595,026 | 49,536 | 55,644,562 | 12.9 | 1:29 | 2.457 |
micro_960 × 544_augmentation_motion_20 | 960 | 544 | 522,240 | 55,595,026 | 49,536 | 55,644,562 | 12.9 | 1:36 | 2.631 |
micro_960 × 544 | 960 | 544 | 522,240 | 55,595,026 | 49,536 | 55,644,562 | 12.9 | 1:10 | 1.937 |
micro_960 × 544_no_motion | 960 | 544 | 522,240 | 55,594,738 | 49,536 | 55,644,274 | 13.4 | 1:11 | 1.946 |
micro_960 × 544_no_motion_augmentation_50 | 960 | 544 | 522,240 | 55,594,738 | 49,536 | 55,644,274 | 13.4 | 1:29 | 2.459 |
micro_960 × 544_augmentation_motion_10_rt | 960 | 544 | 522,240 | 55,595,026 | 49,536 | 55,644,562 | 12.9 | 1:29 | 2.460 |
micro_960 × 544_augmentation_motion_3_rt | 960 | 544 | 522,240 | 55,595,026 | 49,536 | 55,644,562 | 12.9 | 1:15 | 2.059 |
micro_960 × 544_rt | 960 | 544 | 522,240 | 55,595,026 | 49,536 | 55,644,562 | 12.9 | 1:12 | 1.974 |
regular_960 × 544_augmentation_motion_10_rt | 960 | 544 | 522,240 | 61,523,734 | 52,608 | 61,576,342 | 11.7 | 3:32 | 5.830 |
regular_960 × 544_augmentation_motion_3_rt | 960 | 544 | 522,240 | 61,523,734 | 52,608 | 61,576,342 | 11.7 | 3:26 | 5.666 |
regular_960 × 544_rt | 960 | 544 | 522,240 | 61,523,734 | 52,608 | 61,576,342 | 11.7 | 3:29 | 5.747 |
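The parameter counts above offer a quick consistency check for the motion channel: micro models with motion input have exactly 288 more trainable parameters than micro_960 × 544_no_motion (55,595,026 vs. 55,594,738). That difference matches extending the first convolution of the Darknet-53 backbone (32 filters with 3 × 3 kernels, and batch normalization instead of biases) from 3 to 4 input channels:

```python
def conv_params(in_ch, out_ch, k, bias=False):
    # Weight count of a standard convolution layer; YOLOv3 convolutions
    # use batch norm rather than biases, so only kernel weights count here.
    return in_ch * out_ch * k * k + (out_ch if bias else 0)

rgb_only   = conv_params(3, 32, 3)   # first conv on a 3-channel RGB input
rgb_motion = conv_params(4, 32, 3)   # same conv with the extra motion channel
extra = rgb_motion - rgb_only
print(extra)  # 288, matching the difference in the table
```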
Model | GT | TP | FP | FN | Precision (%) | Recall (%) | F1 (%) | AP (%) | IoU (%) |
---|---|---|---|---|---|---|---|---|---|
regular_416 × 416_COCO | 2189 | 11 | 2 | 2178 | 84.6 | 0.5 | 1.0 | 0.5 | 63.6 |
regular_416 × 416 | 2189 | 1774 | 1679 | 415 | 51.4 | 81.0 | 62.9 | 73.9 | 71.4 |
regular_608 × 608 | 2189 | 2085 | 3555 | 104 | 37.0 | 95.2 | 53.3 | 82.7 | 76.6 |
regular_960 × 960 | 2189 | 2157 | 5752 | 32 | 27.3 | 98.5 | 42.7 | 90.2 | 79.2 |
regular_960 × 544 | 2189 | 2044 | 1748 | 145 | 53.9 | 93.4 | 68.3 | 84.3 | 77.9 |
micro_960 × 544_augmentation_static_75 | 2189 | 1774 | 458 | 415 | 79.5 | 81.0 | 80.3 | 76.0 | 80.0 |
micro_960 × 544_augmentation_static_50 | 2189 | 1789 | 1488 | 400 | 54.6 | 81.7 | 65.5 | 69.5 | 76.9 |
micro_960 × 544_augmentation_static_20 | 2189 | 1803 | 370 | 386 | 83.0 | 82.4 | 82.7 | 78.1 | 79.5 |
micro_960 × 544_augmentation_motion_5 | 2189 | 1734 | 364 | 455 | 82.7 | 79.2 | 80.9 | 74.5 | 79.6 |
micro_960 × 544_augmentation_motion_10 | 2189 | 1451 | 158 | 738 | 90.2 | 66.3 | 76.4 | 63.7 | 79.1 |
micro_960 × 544_augmentation_motion_20 | 2189 | 1452 | 241 | 737 | 85.8 | 66.3 | 74.8 | 62.7 | 77.6 |
micro_960 × 544 | 2189 | 1704 | 343 | 485 | 83.2 | 77.8 | 80.5 | 72.9 | 78.9 |
micro_960 × 544_no_motion | 2189 | 2001 | 1319 | 188 | 60.3 | 91.4 | 72.6 | 84.1 | 79.3 |
micro_960 × 544_no_motion_augmentation_50 | 2189 | 2111 | 2528 | 78 | 45.5 | 96.4 | 61.8 | 89.7 | 83.0 |
micro_960 × 544_augmentation_motion_10_rt | 2189 | 1966 | 270 | 223 | 87.9 | 89.8 | 88.9 | 86.2 | 83.8 |
micro_960 × 544_augmentation_motion_3_rt | 2189 | 2029 | 485 | 160 | 80.7 | 92.7 | 86.3 | 89.0 | 84.5 |
micro_960 × 544_rt | 2189 | 2021 | 482 | 168 | 80.7 | 92.3 | 86.1 | 88.3 | 83.9 |
regular_960 × 544_augmentation_motion_10_rt | 2189 | 2062 | 1706 | 127 | 54.7 | 94.2 | 69.2 | 87.0 | 83.3 |
regular_960 × 544_augmentation_motion_3_rt | 2189 | 1926 | 507 | 263 | 79.2 | 88.0 | 83.3 | 82.7 | 84.1 |
regular_960 × 544_rt | 2189 | 1978 | 725 | 211 | 73.2 | 90.4 | 80.9 | 84.4 | 83.8 |
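The precision, recall, and F1 columns follow directly from the GT/TP/FP/FN counts; as a sanity check, the row for the finally selected model micro_960 × 544_augmentation_motion_3_rt can be re-derived from its raw counts:

```python
# Raw counts from the table row micro_960 × 544_augmentation_motion_3_rt.
gt, tp, fp, fn = 2189, 2029, 485, 160

precision = tp / (tp + fp)  # fraction of detections that are real balls
recall    = tp / (tp + fn)  # fraction of ground-truth balls that were found
f1 = 2 * precision * recall / (precision + recall)

assert tp + fn == gt  # every GT ball is either detected or missed
print(round(100 * precision, 1))  # 80.7
print(round(100 * recall, 1))     # 92.7
print(round(100 * f1, 1))         # 86.3
```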
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Hiemann, A.; Kautz, T.; Zottmann, T.; Hlawitschka, M. Enhancement of Speed and Accuracy Trade-Off for Sports Ball Detection in Videos—Finding Fast Moving, Small Objects in Real Time. Sensors 2021, 21, 3214. https://doi.org/10.3390/s21093214