Occluded Gait Recognition with Mixture of Experts: An Action Detection Perspective

  • Conference paper
  • First Online:
Computer Vision – ECCV 2024 (ECCV 2024)

Abstract

Extensive occlusions in real-world scenarios pose challenges to gait recognition due to missing and noisy information, as well as body misalignment in position and scale. We argue that the rich dynamic contextual information within a gait sequence inherently possesses occlusion-solving traits: 1) adjacent frames with gait continuity allow holistic body regions to infer occluded body regions; 2) gait cycles allow information integration between holistic actions and occluded actions. We therefore introduce an action detection perspective in which a gait sequence is regarded as a composition of actions. To detect accurate actions under complex occlusion scenarios, we propose an Action Detection Based Mixture of Experts (GaitMoE), consisting of a Mixture of Temporal Experts (MTE) and a Mixture of Action Experts (MAE). MTE adaptively constructs action anchors via temporal experts, and MAE adaptively constructs action proposals from these anchors via action experts. Notably, action detection serves as a proxy task that is trained jointly and end-to-end with gait recognition using only ID labels. In addition, due to the lack of a unified occluded benchmark, we construct a pioneering Occluded Gait database (OccGait), containing rich occlusion scenarios and annotations of occlusion types. Extensive experiments on OccGait, OccCASIA-B, Gait3D and GREW demonstrate the superior performance of GaitMoE. OccGait is available at https://github.com/BNU-IVC/OccGait.
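To make the mixture-of-experts idea in the abstract concrete, the sketch below shows generic soft expert gating over per-frame gait features. This is a minimal illustration of the general technique only, not the authors' GaitMoE implementation: the function name, linear experts, and tensor shapes are all assumptions for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax for the gating distribution.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mixture_of_experts(features, expert_weights, gate_weights):
    """Soft mixture of experts over a gait sequence (illustrative only).

    features:       (T, D)    per-frame feature vectors
    expert_weights: (E, D, D) one linear expert per entry
    gate_weights:   (D, E)    projects each frame to expert logits
    Returns:        (T, D)    gate-weighted combination of expert outputs
    """
    gates = softmax(features @ gate_weights)                          # (T, E)
    expert_out = np.einsum('td,edk->tek', features, expert_weights)   # (T, E, D)
    return np.einsum('te,ted->td', gates, expert_out)                 # (T, D)
```

In this sketch every frame mixes all experts softly; a sparse variant would keep only the top-k gate entries per frame, which is the usual efficiency trade-off in MoE layers.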

P. Huang and Y. Peng—Equal contribution.



Acknowledgment

This work is jointly supported by National Natural Science Foundation of China (62276025, 62206022), Beijing Municipal Science & Technology Commission (Z231100007423015) and Shenzhen Technology Plan Program (KQTD20170331093217368).

Author information


Correspondence to Saihui Hou or Yongzhen Huang.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 203 KB)


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Huang, P. et al. (2025). Occluded Gait Recognition with Mixture of Experts: An Action Detection Perspective. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15064. Springer, Cham. https://doi.org/10.1007/978-3-031-72658-3_22


  • DOI: https://doi.org/10.1007/978-3-031-72658-3_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72657-6

  • Online ISBN: 978-3-031-72658-3

  • eBook Packages: Computer Science, Computer Science (R0)
