Abstract
Motor behaviour analysis is essential to biomedical research and clinical diagnostics, as it provides a non-invasive strategy for identifying motor impairment and tracking how it changes in response to interventions. State-of-the-art instrumented movement analysis is time- and cost-intensive because it requires the placement of physical or virtual markers. Beyond the effort of marking keypoints or creating the annotations needed to train or fine-tune a detector, users must know the behaviour of interest beforehand to provide meaningful keypoints. Here, we introduce unsupervised behaviour analysis and magnification (uBAM), an automatic deep learning algorithm for analysing behaviour by discovering and magnifying deviations. A central aspect is the unsupervised learning of posture and behaviour representations, which enables an objective comparison of movement. Besides discovering and quantifying deviations in behaviour, we also propose a generative model for visually magnifying subtle behaviour differences directly in a video, without a detour via keypoints or annotations. Essential for this magnification of deviations, even across different individuals, is a disentangling of appearance and behaviour. Evaluations on rodents and human patients with neurological diseases demonstrate the wide applicability of our approach. Moreover, combining optogenetic stimulation with our unsupervised behaviour analysis shows its suitability as a non-invasive diagnostic tool that correlates function to brain plasticity.
Data availability
The rat data can be downloaded at https://hci.iwr.uni-heidelberg.de/compvis_files/Rats.zip. The optogenetics data can be downloaded at https://hci.iwr.uni-heidelberg.de/compvis_files/Optogenetics.zip. The mice data can be downloaded at https://hci.iwr.uni-heidelberg.de/compvis_files/Mice.zip. The human dataset cannot be publicly released because of privacy issues (please contact the authors if needed).
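For convenience, the public archives listed above can be fetched programmatically. The following is a minimal Python sketch using only the standard library; the archive URLs are taken verbatim from the statement above, while the local directory layout ('data/') is an arbitrary choice and not part of the released code.

```python
# Minimal sketch for fetching the public datasets listed above.
# The URLs come from the data availability statement; the "data/" target is our own choice.
import urllib.request
import zipfile
from pathlib import Path

DATASETS = {
    "rats": "https://hci.iwr.uni-heidelberg.de/compvis_files/Rats.zip",
    "optogenetics": "https://hci.iwr.uni-heidelberg.de/compvis_files/Optogenetics.zip",
    "mice": "https://hci.iwr.uni-heidelberg.de/compvis_files/Mice.zip",
}

def download_and_extract(name: str, url: str, root: Path = Path("data")) -> Path:
    """Download one archive (if not already present) and unpack it into root/name."""
    root.mkdir(parents=True, exist_ok=True)
    archive = root / f"{name}.zip"
    if not archive.exists():
        urllib.request.urlretrieve(url, archive)
    target = root / name
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target

if __name__ == "__main__":
    for name, url in DATASETS.items():
        print("extracted to", download_and_extract(name, url))
```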
Code availability
The code for training and evaluating our models is publicly available on GitHub at the following address: https://github.com/utabuechler/uBAM (ref. 59).
References
Berman, G. J. Measuring behavior across scales. BMC Biol. 16, 23 (2018).
Filli, L. et al. Profiling walking dysfunction in multiple sclerosis: characterisation, classification and progression over time. Sci. Rep. 8, 4984 (2018).
Vargas-Irwin, C. E. et al. Decoding complete reach and grasp actions from local primary motor cortex populations. J. Neurosci. 30, 9659–9669 (2010).
Loper, M. M., Mahmood, N. & Black, M. J. MoSh: motion and shape capture from sparse markers. ACM Trans. Graph. 33, 220:1–220:13 (2014).
Huang, Y. et al. Deep inertial poser: learning to reconstruct human pose from sparse inertial measurements in real time. ACM Trans. Graph. 37, 185:1–185:15 (2018).
Robie, A. A., Seagraves, K. M., Egnor, S. R. & Branson, K. Machine vision methods for analyzing social interactions. J. Exp. Biol. 220, 25–34 (2017).
Dell, A. I. et al. Automated image-based tracking and its application in ecology. Trends Ecol. Evol. 29, 417–428 (2014).
Peters, S. M. et al. Novel approach to automatically classify rat social behavior using a video tracking system. J. Neurosci. Methods 268, 163–170 (2016).
Arac, A., Zhao, P., Dobkin, B. H., Carmichael, S. T. & Golshani, P. DeepBehavior: a deep learning toolbox for automated analysis of animal and human behavior imaging data. Front. Syst. Neurosci. 13, 20 (2019).
Graving, J. M. et al. DeepPoseKit, a software toolkit for fast and robust animal pose estimation using deep learning. eLife 8, e47994 (2019).
Pereira, T. D. et al. Fast animal pose estimation using deep neural networks. Nat. Methods 16, 117–125 (2019).
Mathis, A. et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nat. Neurosci. 21, 1281–1289 (2018).
Simon, T., Joo, H., Matthews, I. & Sheikh, Y. Hand keypoint detection in single images using multiview bootstrapping. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 1145–1153 (IEEE, 2017).
Nath, T. et al. Using DeepLabCut for 3D markerless pose estimation across species and behaviors. Nat. Protoc. 14, 2152–2176 (2019).
Mathis, M. W. & Mathis, A. Deep learning tools for the measurement of animal behavior in neuroscience. Curr. Opin. Neurobiol. 60, 1–11 (2020).
Mu, J., Qiu, W., Hager, G. D. & Yuille, A. L. Learning from synthetic animals. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 12386–12395 (IEEE, 2020).
Li, S. et al. Deformation-aware unpaired image translation for pose estimation on laboratory animals. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 13158–13168 (IEEE, 2020).
Sanakoyeu, A., Khalidov, V., McCarthy, M. S., Vedaldi, A. & Neverova, N. Transferring dense pose to proximal animal classes. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 5233–5242 (IEEE, 2020).
Kocabas, M., Athanasiou, N. & Black, M. J. VIBE: video inference for human body pose and shape estimation. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 5253–5263 (IEEE, 2020).
Loper, M., Mahmood, N., Romero, J., Pons-Moll, G. & Black, M. J. SMPL: a skinned multi-person linear model. ACM Trans. Graph. 34, 248:1–248:16 (2015).
Zuffi, S., Kanazawa, A., Berger-Wolf, T. & Black, M. J. Three-D Safari: learning to estimate zebra pose, shape and texture from images ‘in the wild’. In Proc. IEEE/CVF International Conference on Computer Vision 5359–5368 (IEEE, 2019).
Habermann, M., Xu, W., Zollhofer, M., Pons-Moll, G. & Theobalt, C. DeepCap: monocular human performance capture using weak supervision. In Proc IEEE/CVF Conference on Computer Vision and Pattern Recognition 5052–5063 (IEEE, 2020).
Batty, E. et al. BehaveNet: nonlinear embedding and Bayesian neural decoding of behavioral videos. In Advances in Neural Information Processing Systems 15680–15691 (NIPS, 2019).
Ryait, H. et al. Data-driven analyses of motor impairments in animal models of neurological disorders. PLoS Biol. 17, 1–30 (2019).
Kabra, M., Robie, A. A., Rivera-Alba, M., Branson, S. & Branson, K. JAABA: interactive machine learning for automatic annotation of animal behavior. Nat. Methods 10, 64–67 (2012).
Brattoli, B., Büchler, U., Wahl, A. S., Schwab, M. E. & Ommer, B. LSTM self-supervision for detailed behavior analysis. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 3747–3756 (IEEE, 2017).
Büchler, U., Brattoli, B. & Ommer, B. Improving spatiotemporal self-supervision by deep reinforcement learning. In Proc. European Conference on Computer Vision 770–776 (Springer, 2018).
Noroozi, M. & Favaro, P. Unsupervised learning of visual representations by solving jigsaw puzzles. In Proc. European Conference on Computer Vision 69–84 (Springer, 2016).
Lee, H. Y., Huang, J. B., Singh, M. K. & Yang, M. H. Unsupervised representation learning by sorting sequences. In Proc. IEEE International Conference on Computer Vision 667–676 (IEEE, 2017).
Oh, T. H. et al. Learning-based video motion magnification. In Proc. European Conference on Computer Vision 633–648 (Springer, 2018).
Liu, C., Torralba, A., Freeman, W. T., Durand, F. & Adelson, E. H. Motion magnification. ACM Trans. Graph. 24, 519–526 (2005).
Wu, H. Y. et al. Eulerian video magnification for revealing subtle changes in the world. ACM Trans. Graph. 31, 65 (2012).
Elgharib, M., Hefeeda, M., Durand, F. & Freeman, W. T. Video magnification in presence of large motions. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 4119–4127 (IEEE, 2015).
Wadhwa, N., Rubinstein, M., Durand, F. & Freeman, W. T. Phase-based video motion processing. ACM Trans. Graph. 32, 80 (2013).
Wadhwa, N., Rubinstein, M., Durand, F. & Freeman, W. T. Riesz pyramids for fast phase-based video magnification. In Proc. International Conference on Computational Photography 1–10 (IEEE, 2014).
Zhang, Y., Pintea, S. L. & Van Gemert, J. C. Video acceleration magnification. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 529–537 (IEEE, 2017).
Tulyakov, S. et al. Self-adaptive matrix completion for heart rate estimation from face videos under realistic conditions. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 2396–2404 (IEEE, 2016).
Dekel, T., Michaeli, T., Irani, M. & Freeman, W. T. Revealing and modifying non-local variations in a single image. ACM Trans. Graph. 34, 227 (2015).
Wadhwa, N., Dekel, T., Wei, D., Durand, F. & Freeman, W. T. Deviation magnification: revealing departures from ideal geometries. ACM Trans. Graph. 34, 226 (2015).
Kingma, D. P. & Welling, M. Auto-encoding variational Bayes. In Proc. 2nd International Conference on Learning Representations (ICLR, 2014).
Goodfellow, I. et al. Generative adversarial nets. In Proc. Advances in Neural Information Processing Systems Vol. 27, 2672–2680 (NIPS, 2014).
Esser, P., Sutter, E. & Ommer, B. A variational U-Net for conditional appearance and shape generation. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 8857–8866 (IEEE, 2018).
Goodman, A. D. et al. Sustained-release oral fampridine in multiple sclerosis: a randomised, double-blind, controlled trial. Lancet 373, 732–738 (2009).
Zörner, B. et al. Prolonged-release fampridine in multiple sclerosis: improved ambulation effected by changes in walking pattern. Mult. Scler. 22, 1463–1475 (2016).
Schniepp, R. et al. Walking assessment after lumbar puncture in normal-pressure hydrocephalus: a delayed improvement over 3 days. J. Neurosurg. 126, 148–157 (2017).
Tran, D. et al. A closer look at spatiotemporal convolutions for action recognition. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 6450–6459 (IEEE, 2018).
van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Lafferty, C. K. & Britt, J. P. Off-target influences of arch-mediated axon terminal inhibition on network activity and behavior. Front. Neural Circuits 14, 10 (2020).
Miao, C. et al. Hippocampal remapping after partial inactivation of the medial entorhinal cortex. Neuron 88, 590–603 (2015).
Carta, I., Chen, C. H., Schott, A. L., Dorizan, S. & Khodakhah, K. Cerebellar modulation of the reward circuitry and social behavior. Science 363, eaav0581 (2019).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Proc. Advances in Neural Information Processing Systems 1097–1105 (NIPS, 2012).
Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006).
Johnson, J., Alahi, A. & Fei-Fei, L. Perceptual losses for real-time style transfer and super-resolution. In Proc. European Conference on Computer Vision 694–711 (Springer, 2016).
Alaverdashvili, M. & Whishaw, I. Q. A behavioral method for identifying recovery and compensation: hand use in a preclinical stroke model using the single pellet reaching task. Neurosci. Biobehav. Rev. 37, 950–967 (2013).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Fisher, R. A. The use of multiple measurements in taxonomic problems. Ann. Eugenics 7, 179–188 (1936).
Wahl, A. S. et al. Optogenetically stimulating intact rat corticospinal tract post-stroke restores motor control through regionalized functional circuit formation. Nat. Commun. 8, 1187 (2017).
Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
Brattoli, B., Buechler, U. & Ommer, B. Source code of uBAM: first release (version v.1.0) (2020); https://github.com/utabuechler/uBAM. https://doi.org/10.5281/zenodo.4304070
Acknowledgements
This work was supported in part by German Research Foundation (DFG) projects 371923335 and 421703927 to B.O., as well as by the Branco Weiss Fellowship Society in Science and Swiss National Science Foundation grant no. 192678 to A.-S.W.
Author information
Contributions
B.B., U.B. and B.O. developed uBAM. B.B. and U.B. implemented and evaluated the framework and M.D. and P.R. the VAE. A.-S.W., L.F. and F.H. conducted the biomedical experiments and validated the results. B.B., U.B. and B.O. prepared the figures with input from A.-S.W. All authors contributed to writing the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Machine Intelligence thanks Ahmet Arac, Sven Dickinson and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Qualitative comparison with the state-of-the-art in motion magnification.
To compare our results with Oh et al.(30), we show five clips from different impaired subjects before and after magnification for both methods. First, we re-synthesise the healthy reference behaviour to change its appearance to that of the impaired subject, so that differences in posture can be studied directly (first row; see Methods). The second row is the query impaired sequence. The third and fourth rows show the magnified frames produced by the method of Oh et al.(30) and by our approach, respectively. The magnified results, highlighted by magenta markers, show that the method of Oh et al. corrupts the subject's appearance, while our method emphasises the differences in posture without altering the appearance. (Details in Supplementary).
Extended Data Fig. 2 Quantitative comparison with the state-of-the-art in motion magnification.
a: Mean-squared difference (white = 0) between the original query frame and its magnification, for our method and the approach proposed by Oh et al.(30). For impaired subjects, our method modifies only the leg posture, while healthy subjects are left unaltered; Oh et al.(30) mostly changes the background and alters impaired and healthy subjects indiscriminately. b: Fraction of frames with an important deviation from the healthy reference behaviour, measured for each subject and video sequence; the distribution of these scores is plotted. c: Mean and standard deviation of the deviation scores per cohort and approach. (Details in Supplementary).
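As a pointer to how the quantities in panels a and b can be reproduced on one's own data, the sketch below illustrates a per-pixel squared-difference map between a frame and its magnified counterpart, and the fraction of frames in a sequence whose mean difference exceeds a threshold. This is a minimal illustration with assumed numpy inputs and a hypothetical threshold, not the authors' evaluation code.

```python
# Illustration of the two quantities in this figure (assumed inputs, not the
# authors' evaluation code): frames are numpy arrays of shape (H, W, 3) in [0, 1].
import numpy as np

def magnification_difference(original: np.ndarray, magnified: np.ndarray) -> np.ndarray:
    """Panel a: per-pixel squared difference between a frame and its magnification,
    averaged over colour channels (white = 0 in the figure)."""
    assert original.shape == magnified.shape
    return ((original - magnified) ** 2).mean(axis=-1)

def fraction_deviating_frames(original_seq, magnified_seq, threshold=1e-3):
    """Panel b: fraction of frames in one sequence whose mean difference exceeds
    a threshold (the threshold value here is a hypothetical choice)."""
    scores = [magnification_difference(o, m).mean()
              for o, m in zip(original_seq, magnified_seq)]
    return float(np.mean([s > threshold for s in scores]))
```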
Extended Data Fig. 3 Abnormal postures before and after magnification.
We show that our magnification supports spotting abnormal postures by applying a generic classifier to our behaviour-magnified frames. This doubles the number of detected abnormal postures without introducing a substantial number of false positives. In particular, we use a one-class linear SVM on ImageNet features, trained only on one group (that is, healthy), and predict abnormalities on healthy and impaired subjects before and after magnification. The ratio of abnormalities is unaltered within the healthy cohort (~2%), while it doubles in the impaired cohort (from 5.7% to 11.7%), showing that our magnification method detects and magnifies small deviations without artificially introducing abnormalities. (Details in Supplementary).
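A minimal sketch of this abnormality test, assuming precomputed ImageNet-CNN features as numpy arrays; the one-class SVM here is scikit-learn's OneClassSVM with a linear kernel, and the nu setting is a hypothetical choice rather than the value used in the paper.

```python
# Sketch of the abnormality test (assumptions: features are precomputed
# ImageNet-CNN descriptors; nu is a hypothetical setting).
import numpy as np
from sklearn.svm import OneClassSVM

def abnormality_ratio(healthy_train_feats: np.ndarray,
                      query_feats: np.ndarray,
                      nu: float = 0.02) -> float:
    """Fit a one-class linear SVM on healthy frames only and return the fraction
    of query frames flagged as abnormal (prediction == -1)."""
    clf = OneClassSVM(kernel="linear", nu=nu).fit(healthy_train_feats)
    return float((clf.predict(query_feats) == -1).mean())

# Example: compare abnormality_ratio(healthy_feats, impaired_feats) with
# abnormality_ratio(healthy_feats, impaired_feats_magnified).
```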
Extended Data Fig. 4 Qualitative evaluation of our posture encoding on the rat grasping dataset.
Projection from our posture encoding to a 2D embedding of 1,000 randomly chosen postures using t-SNE. Similar postures are located close to each other, and the grasping action can be reconstructed by following the circle clockwise (best viewed by zooming in on the digital version of this figure). (Details in Supplementary).
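The embedding itself can be reproduced along the following lines; posture_codes stands for the (N, d) output of the posture encoder and is assumed to be precomputed, and the sample size of 1,000 matches the caption.

```python
# Sketch of the 2D embedding (posture_codes is an assumed, precomputed (N, d) array).
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def plot_posture_embedding(posture_codes: np.ndarray, n_samples: int = 1000, seed: int = 0):
    """Randomly subsample posture encodings, project them with t-SNE and scatter them."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(posture_codes), size=min(n_samples, len(posture_codes)), replace=False)
    xy = TSNE(n_components=2, random_state=seed).fit_transform(posture_codes[idx])
    plt.scatter(xy[:, 0], xy[:, 1], s=5)
    plt.title("t-SNE of posture encodings")
    plt.show()
```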
Extended Data Fig. 5 Comparison with PCA of posture encoding.
a: A single video clip projected onto the two most important factors of variation using PCA directly on the RGB input (left) and on our representation (right). Consecutive frames are connected by straight lines colourised according to the time within the video; every fourth frame, the original frame is plotted. PCA is able to sort the frames over time automatically, with each cycle overlapping the previous one. Our representation separates different postures more clearly, as reflected by the circular shape of the embedding. b: Same as a, but including more videos, with each colour representing a different subject. In this case, PCA is strongly biased towards the subject appearance: it separates subjects and does not allow behaviour to be compared across them. c: We reduce the appearance bias by normalising each video with its mean appearance. The result still shows subject separation and no similarity of posture across subjects. d: Using our posture representation and applying PCA on Eπ instead of directly on video frames shows no subject bias, and only similar postures are near each other in the 2D space. (Details in Supplementary).
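For reference, the projections compared here amount to running PCA on different inputs. The sketch below, with assumed arrays frames, video_ids and posture_codes (the Eπ encodings), illustrates those three variants and is not the authors' plotting code.

```python
# Sketch of the three PCA variants compared in this figure (inputs are assumed:
# frames is (N, H, W, 3), video_ids is (N,), posture_codes are the E_pi encodings).
import numpy as np
from sklearn.decomposition import PCA

def pca_2d(features: np.ndarray) -> np.ndarray:
    """Project flattened features onto the two leading principal components."""
    return PCA(n_components=2).fit_transform(features.reshape(len(features), -1))

def pca_on_rgb(frames: np.ndarray) -> np.ndarray:
    """Panels a/b: PCA directly on the raw RGB frames."""
    return pca_2d(frames.astype(np.float32))

def pca_on_mean_normalised(frames: np.ndarray, video_ids: np.ndarray) -> np.ndarray:
    """Panel c: subtract the per-video mean appearance before PCA."""
    normed = frames.astype(np.float32).copy()
    for v in np.unique(video_ids):
        normed[video_ids == v] -= normed[video_ids == v].mean(axis=0)
    return pca_2d(normed)

def pca_on_posture(posture_codes: np.ndarray) -> np.ndarray:
    """Panel d: PCA on the posture encoding instead of on pixels."""
    return pca_2d(posture_codes)
```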
Extended Data Fig. 6 Disentanglement comparison with simple baseline.
We transfer the posture of one subject (rows) to others with a different appearance (columns). a: A baseline model that uses the average of the video frames as the appearance; this appearance is subtracted from each frame to extract the posture. b: Disentanglement using our custom VAE to extract posture and appearance. Checking for consistency in posture along a row and for similarity in appearance along a column shows that disentanglement is a hard problem: a pixel-based representation cannot solve the task, while our model produces more detailed and realistic images. (Details in Supplementary).
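The baseline of panel a is simple enough to state in a few lines: the per-video mean frame serves as the appearance and the residual of each frame as the posture, so posture transfer just adds a residual to another subject's mean frame. The sketch below assumes frames as float arrays in [0, 1] and is, of course, the baseline rather than the VAE-based disentanglement.

```python
# Pixel-space baseline of panel a (not the VAE): the mean frame is the appearance,
# the residual is the posture. Frames are assumed to be float arrays in [0, 1].
import numpy as np

def mean_appearance(frames: np.ndarray) -> np.ndarray:
    """Appearance of one video: the average of its frames, (N, H, W, 3) -> (H, W, 3)."""
    return frames.mean(axis=0)

def transfer_posture_baseline(source_frame: np.ndarray,
                              source_frames: np.ndarray,
                              target_frames: np.ndarray) -> np.ndarray:
    """Add the source frame's residual (its 'posture') to the target subject's appearance."""
    posture_residual = source_frame - mean_appearance(source_frames)
    return np.clip(mean_appearance(target_frames) + posture_residual, 0.0, 1.0)
```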
Extended Data Fig. 7 DeepLabCut trainset size.
We train DLC models on a growing number of training samples. The models are evaluated as described in Fig. 2 of the main manuscript. Note the limited gain in performance despite the amount of annotation increasing by more than an order of magnitude. (Details in Supplementary).
Extended Data Fig. 8 Comparison with R3D.
Besides JAABA and DLC, we also compare our method with R3D, another non-parametric model that is very popular for video classification. We extract R3D features and evaluate the representation using the same protocol as for our method. Our model is better suited to behaviour analysis. More information regarding the evaluation protocol can be found in the Methods section of the main manuscript. (Details in Supplementary).
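The R3D features referred to here can be obtained, for example, from torchvision's pretrained R3D-18 by dropping its classification head; the snippet below is a sketch of that feature extraction (preprocessing and the downstream evaluation protocol are omitted), not the exact pipeline used in the paper.

```python
# Sketch of R3D feature extraction with torchvision's pretrained R3D-18
# (preprocessing and the downstream evaluation are omitted).
import torch
from torchvision.models.video import r3d_18

model = r3d_18(weights="DEFAULT")   # older torchvision versions use pretrained=True
model.fc = torch.nn.Identity()      # drop the Kinetics classifier, keep 512-d clip features
model.eval()

@torch.no_grad()
def r3d_features(clips: torch.Tensor) -> torch.Tensor:
    """clips: (B, 3, T, H, W) float tensor, already resized and normalised -> (B, 512)."""
    return model(clips)
```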
Extended Data Fig. 9 Keypoint regression.
We show qualitative results for the keypoint regression from our posture representation and for the end-to-end inferred keypoints of DLC. This experiment was computed on 14 keypoints; for clarity, we show only 6: the wrist (yellow), the start of the first finger (purple) and the tip of each finger. The ground-truth location is shown with a circle and the detection inferred by the model with a cross. Even though our representation was not trained on keypoint detection, for some frames we recover keypoints as well as, or even better than, DLC, which was trained end-to-end on the task. We study the gap in performance in more detail in the Supplementary Information (Supplementary Fig. 3).
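A possible form of such a regression, shown purely for illustration: a linear (here ridge) regressor maps the posture encoding to the 28 coordinates of the 14 keypoints, and a mean pixel error compares predictions with ground truth. The regressor type, its regularisation and the variable names are our assumptions, not the paper's exact setup.

```python
# Illustrative keypoint regression (regressor type and names are our assumptions):
# a ridge regressor maps (N, d) posture encodings to the 14 x 2 keypoint coordinates.
import numpy as np
from sklearn.linear_model import Ridge

def fit_keypoint_regressor(train_codes: np.ndarray, train_keypoints: np.ndarray) -> Ridge:
    """train_keypoints has shape (N, 14, 2); it is flattened to 28 regression targets."""
    return Ridge(alpha=1.0).fit(train_codes, train_keypoints.reshape(len(train_codes), -1))

def predict_keypoints(regressor: Ridge, codes: np.ndarray) -> np.ndarray:
    return regressor.predict(codes).reshape(len(codes), 14, 2)

def mean_pixel_error(pred: np.ndarray, gt: np.ndarray) -> float:
    """Average Euclidean distance between predicted and ground-truth keypoints."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())
```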
Extended Data Fig. 10 Typical high/low scoring grasps with optogenetics.
Given the classifier that produced Fig. 5b, we score all testing sequences from the same animal and show two typical sequences with high and low classification scores. A positive score indicates that the sequence was predicted as light-on, a negative score that it was predicted as light-off. Both sequences are correctly classified, as indicated by the ground truth (‘GT’) and classifier score (‘SVM-Score’). The sequence on the left shows a missed grasp, consistent with light-on inhibitory behaviour, while the same animal performs a successful grasp in the light-off sequence on the right. Note that the classifier cannot see the optical fibre, since this area was cropped out before the frames were passed to the classifier. (Details in Supplementary).
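For orientation, the kind of scoring described here can be sketched with a linear SVM whose signed decision value plays the role of the 'SVM-Score'; the features, labels and regularisation constant below are placeholders, not the trained classifier behind Fig. 5b.

```python
# Sketch of light-on/light-off scoring with a linear SVM; features, labels and C
# are placeholders, not the classifier behind Fig. 5b.
import numpy as np
from sklearn.svm import LinearSVC

def score_sequences(train_feats: np.ndarray, train_labels: np.ndarray,
                    test_feats: np.ndarray) -> np.ndarray:
    """train_labels: 1 for light-on, 0 for light-off. Returns signed decision values:
    positive -> predicted light-on, negative -> predicted light-off."""
    clf = LinearSVC(C=1.0).fit(train_feats, train_labels)
    return clf.decision_function(test_feats)
```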
Supplementary information
Supplementary Information
Supplementary Figs. 1–3, Tables 1–6 and Discussion.
About this article
Cite this article
Brattoli, B., Büchler, U., Dorkenwald, M. et al. Unsupervised behaviour analysis and magnification (uBAM) using deep learning. Nat Mach Intell 3, 495–506 (2021). https://doi.org/10.1038/s42256-021-00326-x
This article is cited by
- ONIX: a unified open-source platform for multimodal neural recording and perturbation during naturalistic behavior. Nature Methods (2024)
- SUBTLE: An Unsupervised Platform with Temporal Link Embedding that Maps Animal Behavior. International Journal of Computer Vision (2024)
- EXPLORE: a novel deep learning-based analysis method for exploration behaviour in object recognition tests. Scientific Reports (2023)