DOI: 10.1145/2502081.2502282
Research article

Large-scale visual sentiment ontology and detectors using adjective noun pairs

Published: 21 October 2013

Abstract

We address the challenge of sentiment analysis from visual content. In contrast to existing methods, which infer sentiment or emotion directly from low-level visual features, we propose a novel approach based on understanding the visual concepts that are strongly related to sentiments. Our key contribution is two-fold: first, we present a method built upon psychological theories and web mining to automatically construct a large-scale Visual Sentiment Ontology (VSO) consisting of more than 3,000 Adjective Noun Pairs (ANPs). Second, we propose SentiBank, a novel visual concept detector library that can be used to detect the presence of 1,200 ANPs in an image. The VSO and SentiBank are distinct from existing work and open the door to a range of applications enabled by automatic sentiment analysis. Experiments on detecting the sentiment of image tweets demonstrate a significant improvement in detection accuracy when the proposed SentiBank-based predictors are compared with text-based approaches. The effort also yields a large publicly available resource consisting of a visual sentiment ontology, a large detector library, and a training/testing benchmark for visual sentiment analysis.
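The pipeline the abstract describes (run a bank of ANP concept detectors over an image, then infer sentiment from their responses) can be sketched as follows. This is an illustrative sketch only, not the authors' code: the function, the ANP names, the detector confidences, and the per-ANP sentiment values are hypothetical stand-ins for SentiBank detector outputs and VSO sentiment annotations.

```python
# Illustrative sketch (not the SentiBank implementation): aggregating
# mid-level ANP detector responses into an image-level sentiment score.
# All names and numbers below are made up for illustration.

def predict_sentiment(anp_scores, anp_sentiment, top_k=3):
    """Aggregate ANP detector responses into a sentiment score in [-1, 1].

    anp_scores:    dict mapping ANP name -> detector confidence in [0, 1]
    anp_sentiment: dict mapping ANP name -> sentiment value in [-1, 1]
    Uses the top_k strongest detector responses, weighted by confidence.
    """
    top = sorted(anp_scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    total = sum(score for _, score in top)
    if total == 0:
        return 0.0
    return sum(score * anp_sentiment.get(anp, 0.0) for anp, score in top) / total

# Toy example: hypothetical detector outputs for a single image.
scores = {"beautiful sky": 0.9, "cute dog": 0.7, "dark clouds": 0.2}
sentiment = {"beautiful sky": 0.8, "cute dog": 0.9, "dark clouds": -0.6}

print(round(predict_sentiment(scores, sentiment), 3))  # positive overall
```

In the paper itself the 1,200 detector responses serve as a mid-level feature vector for a trained sentiment classifier; the confidence-weighted average above is just a minimal stand-in for that final prediction step.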




    Published In

    MM '13: Proceedings of the 21st ACM international conference on Multimedia
    October 2013
    1166 pages
    ISBN:9781450324045
    DOI:10.1145/2502081
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. concept detection
    2. ontology
    3. sentiment prediction
    4. social multimedia


    Conference

MM '13: ACM Multimedia Conference
    October 21 - 25, 2013
    Barcelona, Spain

    Acceptance Rates

MM '13 paper acceptance rate: 47 of 235 submissions (20%).
Overall acceptance rate: 2,145 of 8,556 submissions (25%).

    Article Metrics

• Downloads (last 12 months): 241
• Downloads (last 6 weeks): 24
    Reflects downloads up to 10 Nov 2024

Cited By
• (2024) Enhancing Multimodal Tourism Review Sentiment Analysis Through Advanced Feature Association Techniques. International Journal of Information Systems in the Service Sector, 15(1):1-21. DOI: 10.4018/IJISSS.349564. Published online 17 Jul 2024.
• (2024) Multimodal Emotion Cognition Method Based on Multi-Channel Graphic Interaction. International Journal of Cognitive Informatics and Natural Intelligence, 18(1):1-17. DOI: 10.4018/IJCINI.349969. Published online 17 Sep 2024.
• (2024) Multi-Modal Sentiment Analysis Based on Image and Text Fusion Based on Cross-Attention Mechanism. Electronics, 13(11):2069. DOI: 10.3390/electronics13112069. Published online 27 May 2024.
• (2024) A cross-model hierarchical interactive fusion network for end-to-end multimodal aspect-based sentiment analysis. Intelligent Data Analysis, 28(5):1293-1308. DOI: 10.3233/IDA-230305. Published online 19 Sep 2024.
• (2024) A multifactor model using large language models and investor sentiment from photos and news: new evidence from China. SSRN Electronic Journal. DOI: 10.2139/ssrn.4708979. 2024.
• (2024) Bridging Visual Affective Gap: Borrowing Textual Knowledge by Learning from Noisy Image-Text Pairs. Proceedings of the 32nd ACM International Conference on Multimedia, 602-611. DOI: 10.1145/3664647.3680875. Published online 28 Oct 2024.
• (2024) TGCA-PVT: Topic-Guided Context-Aware Pyramid Vision Transformer for Sticker Emotion Recognition. Proceedings of the 32nd ACM International Conference on Multimedia, 9709-9718. DOI: 10.1145/3664647.3680781. Published online 28 Oct 2024.
• (2024) CH-Mits: A Cross-Modal Dataset for User Sentiment Analysis on Chinese Social Media. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 5390-5394. DOI: 10.1145/3627673.3679125. Published online 21 Oct 2024.
• (2024) A Novel Dual-Pipeline based Attention Mechanism for Multimodal Social Sentiment Analysis. Companion Proceedings of the ACM Web Conference 2024, 1816-1822. DOI: 10.1145/3589335.3651967. Published online 13 May 2024.
• (2024) Does image sentiment of major public emergency affect the stock market performance? New insight from deep learning techniques. Accounting & Finance. DOI: 10.1111/acfi.13313. Published online 14 Aug 2024.
