DOI: 10.1145/2911996.2912016

Multilingual Visual Sentiment Concept Matching

Published: 06 June 2016

Abstract

The impact of culture on visual emotion perception has recently captured the attention of multimedia research. In this study, we provide powerful computational linguistics tools to explore, retrieve, and browse a dataset of 16K multilingual affective visual concepts and 7.3M Flickr images. First, we design an effective crowdsourcing experiment to collect human judgements of the sentiment connected to the visual concepts. We then use word embeddings to represent these concepts in a low-dimensional vector space, which lets us expand the meaning around concepts and thus gain insight into commonalities and differences among languages. We compare a variety of concept representations through a novel evaluation task based on the notion of visual semantic relatedness. Based on these representations, we design clustering schemes to group multilingual visual concepts, and evaluate them with novel metrics based on the crowdsourced sentiment annotations as well as visual semantic relatedness. The proposed clustering framework enables us to analyze the full multilingual dataset in depth and to show an application on a facial data subset, exploring cultural insights of portrait-related affective visual concepts.
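
As a rough illustration of the general idea of representing concepts with word embeddings and then grouping them, the sketch below averages toy word vectors into concept vectors and clusters them by cosine similarity. Everything in it is an assumption made for illustration: the three-dimensional vectors are invented, the averaging rule, the greedy clustering, and the names `concept_vector`, `cosine`, and `greedy_cluster` are hypothetical, and none of it is taken from the paper, whose abstract does not specify the representations or clustering schemes used. It also ignores the multilingual aspect, which would require mapping concepts from different languages into a shared space.

```python
import math

# Toy 3-dimensional word vectors; the values are made up for illustration.
TOY_EMBEDDINGS = {
    "happy": [0.9, 0.1, 0.0],
    "joyful": [0.8, 0.2, 0.1],
    "sad": [-0.7, 0.3, 0.1],
    "face": [0.1, 0.9, 0.2],
    "smile": [0.3, 0.8, 0.1],
}


def concept_vector(concept, embeddings):
    """Average the vectors of the words in a concept such as 'happy face'."""
    vectors = [embeddings[word] for word in concept.split()]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_cluster(concepts, embeddings, threshold=0.9):
    """Put each concept in the first cluster whose seed vector is similar enough."""
    clusters = []  # list of (seed_vector, member_concepts)
    for concept in concepts:
        vec = concept_vector(concept, embeddings)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(concept)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append((vec, [concept]))
    return [members for _, members in clusters]


clusters = greedy_cluster(
    ["happy face", "joyful smile", "sad face"], TOY_EMBEDDINGS
)
print(clusters)
```

With these toy vectors, "happy face" and "joyful smile" end up in one group and "sad face" in another. A real pipeline would load pretrained embeddings and would likely use a standard clustering algorithm instead of this greedy pass.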

Published In

ICMR '16: Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval
June 2016
452 pages
ISBN: 9781450343596
DOI: 10.1145/2911996
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

  • Best Multimodal paper

Author Tags

  1. concept detection
  2. cross-cultural
  3. cultures
  4. emotion
  5. language
  6. multilingual
  7. ontology
  8. sentiment
  9. social multimedia

Qualifiers

  • Research-article

Conference

ICMR '16: International Conference on Multimedia Retrieval
June 6 - 9, 2016
New York, New York, USA

Acceptance Rates

ICMR '16 paper acceptance rate: 20 of 120 submissions (17%)
Overall acceptance rate: 254 of 830 submissions (31%)

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 12
  • Downloads (last 6 weeks): 0

Reflects downloads up to 25 Nov 2024.

Cited By

  • (2021) Multilingual Sentiment Analysis: A Systematic Literature Review. Pertanika Journal of Science and Technology 29:1. DOI: 10.47836/pjst.29.1.25. Online publication date: 2021.
  • (2020) Sixteen facial expressions occur in similar contexts worldwide. Nature 589:7841 (251-257). DOI: 10.1038/s41586-020-3037-7. Online publication date: 16-Dec-2020.
  • (2019) VaTeX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research. 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (4580-4590). DOI: 10.1109/ICCV.2019.00468. Online publication date: Oct-2019.
  • (2019) Mining exoticism from visual content with fusion-based deep neural networks. International Journal of Multimedia Information Retrieval 8:1 (19-33). DOI: 10.1007/s13735-018-00165-4. Online publication date: 23-Jan-2019.
  • (2018) Mining Exoticism from Visual Content with Fusion-based Deep Neural Networks. Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval (37-45). DOI: 10.1145/3206025.3206044. Online publication date: 5-Jun-2018.
  • (2018) A Language-Independent Ontology Construction Method Using Tagged Images in Folksonomy. IEEE Access 6 (2930-2942). DOI: 10.1109/ACCESS.2017.2786218. Online publication date: 2018.
  • (2017) Multilingual visual sentiment concept clustering and analysis. International Journal of Multimedia Information Retrieval 6:1 (51-70). DOI: 10.1007/s13735-017-0120-4. Online publication date: 20-Feb-2017.
