Spatiotemporal fusion personality prediction based on visual information

Jia Xu¹,
Weijian Tian¹,
Guoyun Lv¹ &
…
Yangyu Fan¹

Abstract

Previous studies have shown that deep learning algorithms can predict personality from two-dimensional image information, and the emergence of video opens further possibilities for personality prediction. Video carries more information than a static image, but a video contains hundreds of frames, not all of which are useful, and processing them all is computationally expensive. This paper applies video analysis algorithms to the task of personality prediction and proposes using an LSTM to fuse image feature information. Experiments confirm that prediction is best when 16 frames are fused. The paper also builds an end-to-end video analysis network based on 3D-ConvNet and addresses overfitting of the network through pre-training and data augmentation. Experiments show that using 3D-ConvNet to fuse the spatio-temporal information of videos improves the accuracy of personality prediction.
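As a rough illustration of reducing a video of hundreds of frames to a fixed 16-frame input, the sketch below picks evenly spaced frame indices. The abstract does not say how the frames are chosen, so uniform spacing, the padding rule for short videos, and the name `sample_frame_indices` are all assumptions made for this example, not the authors' method.

```python
from typing import List


def sample_frame_indices(num_frames: int, clip_len: int = 16) -> List[int]:
    """Pick clip_len frame indices spread evenly across a video.

    Uniform spacing is an assumption for illustration; the abstract only
    states that fusing 16 frames gave the best result.
    """
    if num_frames <= 0 or clip_len <= 0:
        raise ValueError("num_frames and clip_len must be positive")
    if num_frames <= clip_len:
        # Short video: repeat the last frame to pad the clip to clip_len.
        return [min(i, num_frames - 1) for i in range(clip_len)]
    # Split the video into clip_len equal segments and take each centre.
    step = num_frames / clip_len
    return [int(step * i + step / 2) for i in range(clip_len)]


if __name__ == "__main__":
    # Indices for a hypothetical 160-frame video.
    print(sample_frame_indices(160))
```

The selected frames could then be passed to a per-frame feature extractor followed by an LSTM, or stacked into a clip for a 3D convolutional network; either pipeline is beyond what this sketch covers.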

Data availability

Our research involves personality data and face images and videos of individuals. To protect personal privacy, we signed a confidentiality agreement with the subjects before the evaluation to ensure that their data are not disclosed and are used only for scientific research. The PCCS (Chinese college students) representative personality dataset generated and/or analysed during the current study is therefore not publicly available.

Funding

This work was funded by the National Natural Science Foundation of China (61402371), the Shaanxi Provincial Science and Technology Innovation Project Plan (2013SZS15-K02), and the Shaanxi Provincial Key Scientific Research Project (2020zdlgy04-09).

Author information

Authors and Affiliations

School of Electronics and Information, Northwestern Polytechnical University, Xi’an, 710072, China
Jia Xu, Weijian Tian, Guoyun Lv & Yangyu Fan

Corresponding author

Correspondence to Jia Xu.

Ethics declarations

Ethical approval

Participants were asked for oral consent to participate in the study, and all data were collected after consent was obtained. Only data from consenting participants were used in this study. In addition, each subject was assigned a number, and the self-reported personality assessment data were collected anonymously under these numbers.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xu, J., Tian, W., Lv, G. et al. Spatiotemporal fusion personality prediction based on visual information. Multimed Tools Appl 82, 44227–44244 (2023). https://doi.org/10.1007/s11042-023-15537-0

Received: 14 April 2022
Revised: 22 July 2022
Accepted: 19 April 2023
Published: 01 May 2023
Issue Date: November 2023
DOI: https://doi.org/10.1007/s11042-023-15537-0
