Abstract
The prevalence of Autism Spectrum Disorder (ASD) in the United States increased by 178% from 2000 to 2016. However, owing to a shortage of well-trained specialists and a time-consuming diagnostic process, many children cannot be diagnosed promptly. Recently, several studies have explored automatic video-based ASD detection systems using machine learning and deep learning models, such as the support vector machine (SVM) and the long short-term memory (LSTM) model. However, these models cannot extract effective features directly from raw videos. In this study, we aim to take advantage of 3D convolution-based deep learning models to aid video-based ASD detection. We explore three representative 3D convolutional neural networks (CNNs): C3D, I3D, and 3D ResNet. In addition, we propose a new 3D convolutional model, called 3D ResNeSt, based on ResNeSt. We evaluate these models on an ASD detection dataset. The experimental results show that, on average, all four 3D convolutional models obtain competitive results compared with the LSTM baseline. Our proposed 3D ResNeSt model achieves the best performance, improving the average detection accuracy from 0.72 to 0.85.
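As a rough illustration of what "3D convolution" means for raw video, the following minimal sketch slides a kernel jointly over time, height, and width, so a single filter can respond to motion as well as appearance. It is not the C3D, I3D, 3D ResNet, or 3D ResNeSt architecture evaluated in the paper; the function name `conv3d_valid`, the toy clip, and the kernel are hypothetical and chosen only for illustration.

```python
def conv3d_valid(video, kernel):
    """Naive 'valid' 3D cross-correlation over (time, height, width) nested lists."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the kernel's temporal and spatial extent at once.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += video[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical toy clip: 4 frames of 3x3 pixels; every pixel of frame t has value t.
clip = [[[float(t)] * 3 for _ in range(3)] for t in range(4)]

# Hypothetical 2x1x1 temporal-difference kernel: next frame minus current frame.
temporal_diff = [[[-1.0]], [[1.0]]]

# The output has one fewer frame than the input and measures frame-to-frame change.
response = conv3d_valid(clip, temporal_diff)
```

A 2D convolution applied frame by frame would have no access to the `dt` axis, which is why sequence models such as LSTM are usually stacked on top of per-frame features; a 3D kernel folds that temporal step into the filter itself.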
Acknowledgments
This work is jointly supported by the National Natural Science Foundation of China (Nos. 61976214 and 61972188) and the Shandong Provincial Key Research and Development Program (Major Scientific and Technological Innovation Project) (No. 2019JZZY010119).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, K., Wang, W., Guo, Y., Shan, C., Wang, L. (2021). A Comparative Study on Autism Spectrum Disorder Detection via 3D Convolutional Neural Networks. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12661. Springer, Cham. https://doi.org/10.1007/978-3-030-68763-2_42
DOI: https://doi.org/10.1007/978-3-030-68763-2_42
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68762-5
Online ISBN: 978-3-030-68763-2
eBook Packages: Computer Science, Computer Science (R0)