Abstract
Recently, significant progress has been made in human action recognition and behavior prediction using deep learning techniques, leading to improved vision-based semantic understanding. However, high-quality motion datasets are still lacking for small bio-robots, which present more challenging scenarios for long-term movement prediction and behavior control based on third-person observation. In this study, we introduce RatPose, a bio-robot motion prediction dataset constructed under predefined annotation rules that account for the influence of individuals and environments. To make motion prediction more robust to these factors, we propose a Dual-stream Motion-Scenario Decoupling (DMSD) framework that separates scenario-oriented and motion-oriented features, and we design a scenario contrast loss and a motion clustering loss for overall training. With this architecture, the feature flows of the two branches interact and compensate for each other in a decomposition-then-fusion manner. Moreover, we demonstrate significant performance improvements of the proposed DMSD framework on tasks of different difficulty levels. We also implement long-term discretized trajectory prediction tasks to verify the generalization ability of the proposed dataset.
Acknowledgements
This work is partially supported by the National Key R&D Program of China (2020YFB1313503), the National Natural Science Foundation of China (U22B2052), the Fundamental Research Funds for the Central Universities, and the Major Key Project of PCL (PCL2021A12).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, X., Gao, J., Liu, Y., Zheng, N., Liu, R. (2023). Motion-Scenario Decoupling for Rat-Aware Video Position Prediction: Strategy and Benchmark. In: Lu, H., et al. Image and Graphics. ICIG 2023. Lecture Notes in Computer Science, vol 14356. Springer, Cham. https://doi.org/10.1007/978-3-031-46308-2_12
DOI: https://doi.org/10.1007/978-3-031-46308-2_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46307-5
Online ISBN: 978-3-031-46308-2
eBook Packages: Computer Science, Computer Science (R0)