User-guided Hierarchical Attention Network for Multi-modal Social Image Popularity Prediction

Published: 10 April 2018


Popularity prediction for the growing volume of social images has opened unprecedented opportunities for a wide range of commercial applications, such as precision advertising and recommender systems. While a few studies have explored this significant task, little research has addressed the unstructured properties of both the visual and textual modalities, or considered how to learn effective representations from multiple modalities for popularity prediction. To this end, we propose a model named User-guided Hierarchical Attention Network (UHAN), with two novel user-guided attention mechanisms that hierarchically attend to both visual and textual modalities. It is capable not only of learning an effective representation for each modality, but also of fusing them to obtain an integrated multi-modal representation under the guidance of the user embedding. As no benchmark dataset exists, we extend a publicly available social image dataset by adding image descriptions. Comprehensive experiments demonstrate the rationality of the proposed UHAN and its better performance compared with several strong alternatives.
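
The abstract does not give the attention equations, so the following is only a minimal sketch of what user-guided hierarchical attention could look like, not the UHAN formulation itself. It assumes a generic additive attention in which each item is scored against the user embedding, applied first within each modality (image regions, description words) and then across the two pooled modality vectors. The function name `user_guided_attention`, the parameters `W_f`, `W_u`, `v`, the dimensions, and the random inputs are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def user_guided_attention(features, user, W_f, W_u, v):
    """Additive attention over `features`, conditioned on `user`.

    Assumed scoring rule (not taken from the paper):
        score_i = v . tanh(W_f f_i + W_u u)
    Returns the attention-weighted sum of features and the weights.
    """
    user_proj = matvec(W_u, user)
    scores = []
    for f in features:
        feat_proj = matvec(W_f, f)
        hidden = [math.tanh(a + b) for a, b in zip(feat_proj, user_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights


def random_params(hidden, dim, rng):
    """Hypothetical, untrained attention parameters (W_f, W_u, v)."""
    def mat():
        return [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(hidden)]
    return mat(), mat(), [rng.uniform(-0.5, 0.5) for _ in range(hidden)]


rng = random.Random(0)
DIM, HIDDEN = 4, 3  # illustrative sizes only

user = [rng.uniform(-1, 1) for _ in range(DIM)]                          # user embedding
regions = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(5)]   # image regions
words = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(6)]     # description words

# Level 1: attend within each modality, guided by the user embedding.
visual, visual_w = user_guided_attention(regions, user, *random_params(HIDDEN, DIM, rng))
textual, textual_w = user_guided_attention(words, user, *random_params(HIDDEN, DIM, rng))

# Level 2: attend across the two modality vectors, again guided by the user.
fused, modality_w = user_guided_attention(
    [visual, textual], user, *random_params(HIDDEN, DIM, rng)
)
```

In this sketch, `fused` plays the role of the integrated multi-modal representation that a downstream regressor would map to a popularity score; in a real model the parameters would be learned rather than sampled at random.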




Published In

WWW '18: Proceedings of the 2018 World Wide Web Conference
April 2018
2000 pages
  • IW3C2: International World Wide Web Conference Committee



International World Wide Web Conferences Steering Committee

Republic and Canton of Geneva, Switzerland



Author Tags

  1. attention network
  2. multi-modal analysis
  3. social image popularity


Funding Sources

  • NSFC
  • Shanghai Sailing Program
  • NSFC-Zhejiang


WWW '18: The Web Conference 2018
April 23 - 27, 2018
Lyon, France

Acceptance Rates

WWW '18 Paper Acceptance Rate: 170 of 1,155 submissions (15%)
Overall Acceptance Rate: 1,899 of 8,196 submissions (23%)


