Object Tracking in Hyperspectral-Oriented Video with Fast Spatial-Spectral Features
"> Figure 1
<p>Tracking speed-accuracy plot of the same correlation filter tracker based on different features on a hyperspectral surveillance video (HSSV) dataset. The upper right corner indicates the best performance in terms of both standard and robust accuracy. The proposed FSSF algorithm achieves the best accuracy with faster speed.</p> "> Figure 2
<p>Spectral signatures of various materials measured at a center pixel of each object. (<b>a</b>) Target objects and various background materials. From left to right and top to bottom: box, camera, bottle, glass, car, electric car, airplane, boat, building, tree, human, and road. (<b>b</b>) Reflectance of various targets over 680-960 nm in wavelength.</p> "> Figure 3
<p>Sample images and their spectral reflectance. (<b>a</b>) Images are taken in various conditions (normal, deformation, object in shadow, object in light, and background clutter). (<b>b</b>) Spectral profiles of object in different states (normal, deformation and object in shadow, object in light). (<b>c</b>) Spectral signatures of facial skin and hair of the different subjects in the background clutter image of (<b>a</b>).</p> "> Figure 4
<p>The scatter-plot visualization representations of different objects generated for the HSI and RGB datasets using t-SNE. (<b>a</b>) Sample images of the dataset (airplane, bicycle, boat, and person). (<b>b</b>) Visualization of the HSI dataset. (<b>c</b>) Visualization of the RGB dataset. The x axis and y axis represent the two feature values of the data in two-dimensional space, respectively. There are four kinds of objects, each of which is represented a particular color.</p> "> Figure 5
<p>The scatter-plot visualization representations of the HSI and RGB datasets with the challenge of deformation using t-SNE. (<b>a</b>) Sample images of the dataset with deformation (normal and deformation). The target deforms as the face moves. (<b>b</b>) Visualization of the HSI dataset. (<b>c</b>) Visualization of the RGB dataset. The x axis and y axis represent the two feature values of the data in two-dimensional space, respectively. There are two states (normal and deformation) of the same object in two datasets, each of which is represented a particular color.</p> "> Figure 6
<p>The scatter-plot visualization representations of the HSI and RGB datasets with the challenge of illumination variation using t-SNE. (<b>a</b>) Sample images of the dataset with illumination variation (object in light and object in shadow). The electric car is subjected to light changes during driving. (<b>b</b>) Visualization of the HSI dataset. (<b>c</b>) Visualization of the RGB dataset. The x axis and y axis represent the two feature values of the data in two-dimensional space, respectively. There are two states (light and shadow) of the same object in two datasets, each of which is represented by a particular color.</p> "> Figure 7
<p>The scatter-plot visualization representations of the HSI and RGB datasets with the challenge of background clutter using t-SNE. (<b>a</b>) Sample images of the dataset with background clutter (from left to right: object1 and object2). The two objects are similar in visual appearance. (<b>b</b>) Visualization of the HSI dataset. (<b>c</b>) Visualization of the RGB dataset. The x axis and y axis represent the two feature values of the data in two-dimensional space, respectively. There are two kinds of objects in two data sets, each of which is represented by a particular color.</p> "> Figure 8
<p>The initialization (purple box) and updating (blue box) process of the proposed real-time spatial-spectral convolution (RSSC) kernel. In the frame <span class="html-italic">t</span> − 1, RSSC kernels are initialized using the search region of interest centered at position <math display="inline"><semantics> <mrow> <msub> <mi>y</mi> <mrow> <mi>t</mi> <mo>−</mo> <mn>1</mn> </mrow> </msub> </mrow> </semantics></math> and ground-truth bounding box of the object. For the new frame <span class="html-italic">t</span>, spatial-spectral features are extracted using the initialized RSSC kernel to estimate the object position <math display="inline"><semantics> <mrow> <msub> <mi>y</mi> <mi>t</mi> </msub> </mrow> </semantics></math>. Then, RSSC kernels are updated using the search region of interest and bounding box centered at <math display="inline"><semantics> <mrow> <msub> <mi>y</mi> <mi>t</mi> </msub> </mrow> </semantics></math>. For calculation convenience, here we update the numerator <math display="inline"><semantics> <mrow> <msubsup> <mi>A</mi> <mi>t</mi> <mi>k</mi> </msubsup> </mrow> </semantics></math> and denominator <math display="inline"><semantics> <mrow> <msubsup> <mi>B</mi> <mi>t</mi> <mi>k</mi> </msubsup> </mrow> </semantics></math> of the RSSC kernel separately. <math display="inline"><semantics> <mi>F</mi> </semantics></math> and <math display="inline"><semantics> <mrow> <msup> <mi>F</mi> <mrow> <mo>−</mo> <mn>1</mn> </mrow> </msup> </mrow> </semantics></math> denote the FFT and inverse FFT, respectively.</p> "> Figure 9
<p>(<b>a</b>) Visualization of correlation coefficient matrix, (<b>b</b>) Relative entropy of each band relative to the first band.</p> "> Figure 10
<p>Visualization of the spatial-spectral feature maps extracted from different sub-HSIs. Activations are shown for two frames from the deformation challenging car sequences (<b>left</b>). The spatial-spectral features (<b>right</b>) are extracted on each sub-HSI. Notice that although the appearance of object changes significantly, we can still extract discriminative features even the background has changed dramatically.</p> "> Figure 11
<p>Illustration of a set of 25 bands of HSI. The 25 bands are ordered in ascending from left to right and top and bottom, and its center wavelengths are 682.27 nm, 696.83 nm, 721.13 nm, 735.04 nm, 747.12 nm, 760.76 nm, 772.28 nm, 784.81 nm, 796.46 nm, 808.64 nm, 827.73 nm, 839.48 nm, 849.40 nm, 860.49 nm, 870.95 nm, 881.21 nm, 889.97 nm, 898.79 nm, 913.30 nm, 921.13 nm, 929.13 nm, 936.64 nm, 944.55 nm, 950.50 nm, 957.04 nm, respectively.</p> "> Figure 12
<p>Example sequences with different tracking objects of the HSSV dataset. From top to bottom: airplane, boat, pedestrian, electric car, bicycle, car.</p> "> Figure 13
<p>Comparison results for all SSCF trackers and their baseline trackers using three initialization strategies: one-pass evaluation (OPE), temporal robustness evaluation (TRE) and spatial robustness evaluation (SRE). (<b>a</b>) Precision plot and the success plot on OPE. (<b>b</b>) Precision plot and the success plot on SRE. (<b>c</b>) Precision plot and the success plot on TRE. The legend of precision plots and success plots report the precision scores at a threshold of 20 pixels and area-under-the-curve (AUC) scores, respectively.</p> "> Figure 13 Cont.
<p>Comparison results for all SSCF trackers and their baseline trackers using three initialization strategies: one-pass evaluation (OPE), temporal robustness evaluation (TRE) and spatial robustness evaluation (SRE). (<b>a</b>) Precision plot and the success plot on OPE. (<b>b</b>) Precision plot and the success plot on SRE. (<b>c</b>) Precision plot and the success plot on TRE. The legend of precision plots and success plots report the precision scores at a threshold of 20 pixels and area-under-the-curve (AUC) scores, respectively.</p> "> Figure 14
<p>Success plots over eight tracking attributes, including (<b>a</b>) background clutter (24), (<b>b</b>) deformation (18), (<b>c</b>) illumination variation (20), (<b>d</b>) low resolution (27), (<b>e</b>) occlusion (36), (<b>f</b>) out-of-plane rotation (7), (<b>g</b>) out of view (4), (<b>h</b>) scale variation (37). The values in parentheses indicate the number of sequences associated with each attribute. The legend reports the area-under-the-curve score.</p> "> Figure 15
<p>Qualitative results of our hyperspectral video compared to traditional video on some challenging sequences (electriccar, double5, airplane9, human4). The results of SSCF tracker and the baseline tracker are represented by green and red boxes, respectively.</p> "> Figure 16
<p>Precision and success plot of different features on a HSSV dataset using three initialization strategies: one-pass evaluation (OPE), temporal robustness evaluation (TRE) and spatial robustness evaluation (SRE). (<b>a</b>) Precision plot and the success plot on OPE. (<b>b</b>) Precision plot and the success plot on SRE. (<b>c</b>) Precision plot and the success plot on TRE. The legend of precision plots and success plots report the precision scores at a threshold of 20 pixels and area-under-the-curve scores, respectively. The fps of trackers in three initialization strategies is also shown in legend.</p> "> Figure 17
<p>Success plots over six tracking attributes, including (<b>a</b>) low resolution (27), (<b>b</b>) occlusion (36), (<b>c</b>) scale variation (37), (<b>d</b>) fast motion (9), (<b>e</b>) background clutter (24), (<b>f</b>) deformation (18). The values in parentheses indicate the number of sequences associated with each attribute. The legend reports the area-under-the-curve score.</p> "> Figure 18
<p>Comparison results with hyperspectral trackers. (<b>a</b>) Precision plot. (<b>b</b>) Success plot. The legend of the precision plot and success plot report the precision scores at a threshold of 20 pixels and area-under-the-curve (AUC) scores, respectively.</p> ">
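The separability comparisons in Figures 4–7 are produced with t-SNE embeddings of HSI and RGB samples. As a minimal sketch of how such a scatter plot can be generated, assuming hypothetical arrays `patches` (N labeled object samples) and `labels` (class ids), and using scikit-learn's TSNE (the paper does not state the exact t-SNE settings):

```python
# Hedged t-SNE separability sketch; `patches` and `labels` are hypothetical
# input names, not from the paper. Works for HSI (N, H, W, B) or RGB (N, H, W, 3).
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def tsne_scatter(patches: np.ndarray, labels: np.ndarray, title: str) -> None:
    # Flatten each sample into one feature vector; a simple stand-in for the
    # feature representation whose separability is being compared.
    feats = patches.reshape(len(patches), -1).astype(np.float32)
    emb = TSNE(n_components=2, perplexity=30, init="pca",
               random_state=0).fit_transform(feats)
    for c in np.unique(labels):
        m = labels == c
        plt.scatter(emb[m, 0], emb[m, 1], s=8, label=str(c))
    plt.title(title)
    plt.legend()
    plt.show()
```

Well-separated clusters for the HSI samples alongside overlapping clusters for the RGB samples would reproduce the qualitative pattern shown in Figures 4–7.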
Abstract
1. Introduction
- (1) Proposed a fast spatial-spectral feature (FSSF) extraction algorithm that enables real-time correlation filter-based object tracking in hyperspectral video.
- (2) Developed real-time spatial-spectral convolution (RSSC) kernels, updated in real time, that encode the discriminative spatial-spectral information of each sub-HSI.
- (3) Confirmed the advantage of hyperspectral video tracking, as well as the high efficiency and strong discriminative ability of FSSF, on the collected HSSV dataset under challenging conditions.
2. Related Work
2.1. Advantage Analysis of Hyperspectral Video Tracking
2.1.1. Spectral Properties of HSI
2.1.2. Separability Visualization of HSI
2.2. Hyperspectral Tracking Method
2.3. Correlation Filter Tracking
3. Fast Spatial-Spectral Feature-Based Tracking
3.1. Fast Spatial-Spectral Feature (FSSF) Extraction
3.1.1. Problem Formulation
3.1.2. Real-Time Spatial-Spectral Convolution (RSSC) Kernel
3.1.3. Computational Complexity Analysis
3.1.4. Feature Extraction
3.2. FSSF-Based Object Tracking
Algorithm 1: FSSF-Based Hyperspectral Video Tracking Method
Input: t-th frame I_t, object position y_{t−1} on the (t − 1)-th frame.
Output: target location y_t on the t-th frame.
1: RSSC kernel initialization:
2: Crop an image patch from I_{t−1} at the location y_{t−1} on the (t − 1)-th frame; initialize the convolution kernel using (9).
3: Repeat
4: Location estimation:
5: Crop an image patch from I_t centered at y_{t−1}.
6: Extract the spatial-spectral feature using (13).
7: Compute the correlation scores using (14).
8: Set y_t to the target position that maximizes the correlation score.
9: RSSC kernel update:
10: Crop a new patch and label centered at y_t.
11: Update the RSSC kernel numerator A_t^k using (11) and the denominator B_t^k using (12).
12: Until the end of the video sequence.
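Algorithm 1 can be read as a standard correlation filter loop with per-band (sub-HSI) kernels maintained in the frequency domain, as illustrated in Figure 8. The sketch below is a hedged NumPy rendering under stated assumptions: the closed forms of Equations (9)–(14) are not reproduced on this page, so a MOSSE-style numerator/denominator update stands in for the exact RSSC update; `crop_patch`, `gaussian_label`, `ETA`, and `EPS` are illustrative helpers and assumed values, and boundary handling is omitted.

```python
# Hedged sketch of the FSSF tracking loop (Algorithm 1). A MOSSE-style
# per-band filter stands in for the RSSC kernel; Equations (9)-(14) of the
# paper are not reproduced exactly. Helper names and constants are assumed.
import numpy as np

ETA, EPS = 0.02, 1e-4  # learning rate and regularizer (assumed, not from the paper)

def gaussian_label(shape, sigma=2.0):
    """Desired correlation response: a Gaussian peak at the patch center."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2 * sigma ** 2))

def crop_patch(frame, pos, size):
    """Crop an (h, w, K) sub-HSI stack centered at pos; boundary checks omitted."""
    (cy, cx), (h, w) = pos, size
    return frame[cy - h // 2:cy - h // 2 + h, cx - w // 2:cx - w // 2 + w, :]

def init_kernels(patch):
    """Per-band filter state (Eq. (9) analogue): numerator A_k, denominator B_k."""
    Yf = np.fft.fft2(gaussian_label(patch.shape[:2]))
    A, B = [], []
    for k in range(patch.shape[-1]):
        Xf = np.fft.fft2(patch[..., k])
        A.append(Yf * np.conj(Xf))
        B.append(Xf * np.conj(Xf) + EPS)
    return A, B

def track(frames, init_pos, box_size):
    """Algorithm 1 loop: estimate position, then update A_k and B_k separately."""
    A, B = init_kernels(crop_patch(frames[0], init_pos, box_size))
    pos = init_pos
    for frame in frames[1:]:
        z = crop_patch(frame, pos, box_size)
        # Response summed over sub-HSIs (Eqs. (13)-(14) analogue).
        resp = sum(
            np.real(np.fft.ifft2((A[k] / B[k]) * np.fft.fft2(z[..., k])))
            for k in range(z.shape[-1])
        )
        dy, dx = np.unravel_index(np.argmax(resp), resp.shape)
        pos = (pos[0] + dy - resp.shape[0] // 2, pos[1] + dx - resp.shape[1] // 2)
        # RSSC-style update: numerator and denominator kept separately
        # (Eqs. (11)-(12) analogue; mirrors the blue box in Figure 8).
        x = crop_patch(frame, pos, box_size)
        Yf = np.fft.fft2(gaussian_label(x.shape[:2]))
        for k in range(x.shape[-1]):
            Xf = np.fft.fft2(x[..., k])
            A[k] = (1 - ETA) * A[k] + ETA * Yf * np.conj(Xf)
            B[k] = (1 - ETA) * B[k] + ETA * (Xf * np.conj(Xf) + EPS)
        yield pos
```

Keeping A_k and B_k separate, rather than storing the ratio, is what allows the running average in steps 9–11 to be computed cheaply in the frequency domain, consistent with the update pictured in Figure 8.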
4. Experimental Results
4.1. Experiment Setup
4.1.1. Dataset
4.1.2. Evaluation Metrics
4.1.3. Comparison Scenarios
4.2. Advantage Evaluation of Hyperspectral Video Tracking
4.2.1. Quantitative Evaluation
4.2.2. Attribute-Based Evaluation
4.2.3. Qualitative Evaluation
4.2.4. Running Time Evaluation
4.3. Effectiveness Evaluation of Proposed FSSF
4.3.1. Quantitative Evaluation
4.3.2. Attribute-Based Evaluation
4.4. Comparison With Hyperspectral Trackers
4.4.1. Quantitative Evaluation
4.4.2. Attribute-Based Evaluation
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
| | SS_STRCF | DeepSTRCF | STRCF | SS_ECO | DeepECO | ECO | SS_fDSST | fDSST | SS_CN | CN |
|---|---|---|---|---|---|---|---|---|---|---|
| Mean OP | 0.775 | 0.719 | 0.680 | 0.829 | 0.748 | 0.592 | 0.704 | 0.453 | 0.463 | 0.395 |

| | SS_STRCF | DeepSTRCF | STRCF | SS_ECO | DeepECO | ECO | SS_fDSST | fDSST | SS_CN | CN |
|---|---|---|---|---|---|---|---|---|---|---|
| FPS | 23.64 (CPU) | 5.73 (GPU) | 32.11 | 46.68 (CPU) | 11.87 (GPU) | 67.58 | 45.90 | 220.30 | 126.17 | 981.94 |
| | FSSF (spatial-spectral) | DeepFeature (spatial-spectral) | HOG (spatial) | Color (spatial) |
|---|---|---|---|---|
| FPS | 23.64 | 5.73 (GPU) | 32.11 | 24.01 |
| Feature | MDP(20) | MDP(15) | MDP(10) | MDP(5) | MOP(0.5) | MOP(0.6) | MOP(0.7) | MOP(0.8) | FPS |
|---|---|---|---|---|---|---|---|---|---|
| FSSF | 0.620 | 0.558 | 0.457 | 0.265 | 0.471 | 0.395 | 0.298 | 0.178 | 46.68 |
| DeepFeature | 0.594 | 0.510 | 0.378 | 0.178 | 0.385 | 0.303 | 0.214 | 0.116 | 1.23 |
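In the table above, MDP(τ) denotes mean distance precision, the fraction of frames whose predicted center lies within τ pixels of the ground-truth center, and MOP(θ) denotes mean overlap precision, the fraction of frames whose predicted box overlaps the ground truth with IoU of at least θ, following the standard tracking-benchmark definitions. A minimal sketch with hypothetical input array names:

```python
# Minimal sketch of the threshold metrics used above (hypothetical inputs):
# centers_*: (N, 2) center coordinates; boxes as (x, y, w, h) tuples.
import numpy as np

def mean_distance_precision(centers_pred, centers_gt, tau=20.0):
    d = np.linalg.norm(centers_pred - centers_gt, axis=1)
    return float(np.mean(d <= tau))          # e.g., MDP(20)

def iou(a, b):
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def mean_overlap_precision(boxes_pred, boxes_gt, theta=0.5):
    ious = np.array([iou(p, g) for p, g in zip(boxes_pred, boxes_gt)])
    return float(np.mean(ious >= theta))     # e.g., MOP(0.5)
```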
| Attribute | FSSF | DeepFeature |
|---|---|---|
| Illumination variation | 0.520 | 0.530 |
| Scale variation | 0.419 | 0.388 |
| Occlusion | 0.442 | 0.373 |
| Deformation | 0.506 | 0.468 |
| Motion blur | 0.444 | 0.473 |
| Fast motion | 0.349 | 0.315 |
| In-plane rotation | 0.392 | 0.317 |
| Out-of-plane rotation | 0.378 | 0.369 |
| Out-of-view | 0.357 | 0.372 |
| Background clutter | 0.439 | 0.409 |
| Low resolution | 0.450 | 0.287 |
| | SS_ECO | MHT | DeepHKCF | HLT |
|---|---|---|---|---|
| Mean OP | 0.520 | 0.506 | 0.444 | 0.349 |
| Mean DP | 0.829 | 0.788 | 0.375 | 0.110 |
| FPS | 46.68 | 1.34 | 49.87 | 1.58 |

| Attribute | SS_ECO | MHT | DeepHKCF | HLT |
|---|---|---|---|---|
| Illumination variation | 0.658 | 0.578 | 0.289 | 0.147 |
| Scale variation | 0.618 | 0.607 | 0.387 | 0.146 |
| Occlusion | 0.630 | 0.577 | 0.391 | 0.152 |
| Deformation | 0.704 | 0.676 | 0.395 | 0.129 |
| Motion blur | 0.641 | 0.555 | 0.434 | 0.087 |
| Fast motion | 0.580 | 0.474 | 0.389 | 0.126 |
| In-plane rotation | 0.596 | 0.591 | 0.479 | 0.178 |
| Out-of-plane rotation | 0.623 | 0.586 | 0.437 | 0.076 |
| Out-of-view | 0.574 | 0.407 | 0.419 | 0.158 |
| Background clutter | 0.607 | 0.568 | 0.362 | 0.151 |
| Low resolution | 0.680 | 0.623 | 0.388 | 0.105 |
Share and Cite
Chen, L.; Zhao, Y.; Yao, J.; Chen, J.; Li, N.; Chan, J.C.-W.; Kong, S.G. Object Tracking in Hyperspectral-Oriented Video with Fast Spatial-Spectral Features. Remote Sens. 2021, 13, 1922. https://doi.org/10.3390/rs13101922