DOI: 10.1145/3123266.3123333
research-article

Modeling Image Virality with Pairwise Spatial Transformer Networks

Published: 19 October 2017

Abstract

The study of virality and information diffusion is rapidly gaining traction in the computational social sciences. Research in computer vision and social network analysis has also focused on understanding how content and information diffusion make content viral, although prior approaches have not performed as well as those for other traditional classification tasks. In this paper, we present a novel pairwise reformulation of the virality prediction problem as an attribute prediction task and develop a novel algorithm to model image virality on online media using a pairwise neural network. Our model provides significant insights into the features that are responsible for promoting virality and surpasses the existing state of the art by a 12% average improvement in prediction. We also investigate the effect of external category supervision on relative attribute prediction and observe an increase in prediction accuracy across several attribute learning datasets.
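To make the pairwise formulation concrete, the following is a minimal sketch of a generic logistic pairwise ranking objective. It is not the paper's architecture: the linear `score` function, the feature vectors, and all names here are hypothetical stand-ins for whatever shared network scores each image. The only idea it illustrates is that one shared scorer is applied to both images in a pair, and the loss depends on the difference of the two scores.

```python
import math

def score(weights, features):
    # Hypothetical shared scorer: the same weights are applied to both
    # images of a pair. A real model would be a neural network.
    return sum(w * x for w, x in zip(weights, features))

def pairwise_loss(weights, feats_a, feats_b, a_more_viral):
    # Logistic ranking loss on the score difference.
    diff = score(weights, feats_a) - score(weights, feats_b)
    # Modeled probability that image A is more viral than image B.
    p = 1.0 / (1.0 + math.exp(-diff))
    target = 1.0 if a_more_viral else 0.0
    eps = 1e-12  # guards against log(0)
    return -(target * math.log(p + eps)
             + (1.0 - target) * math.log(1.0 - p + eps))

def predict_more_viral(weights, feats_a, feats_b):
    # At test time, the image with the higher score is predicted more viral.
    return score(weights, feats_a) > score(weights, feats_b)

# Toy usage with made-up two-dimensional features.
weights = [1.0, -1.0]
image_a = [2.0, 0.0]
image_b = [0.0, 2.0]
loss_correct = pairwise_loss(weights, image_a, image_b, a_more_viral=True)
loss_wrong = pairwise_loss(weights, image_a, image_b, a_more_viral=False)
```

In this toy setup the loss is small when the label agrees with the score ordering and large when it does not, which is the property a pairwise model is trained to exploit.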


Information

Published In

MM '17: Proceedings of the 25th ACM international conference on Multimedia
October 2017
2028 pages
ISBN: 9781450349062
DOI: 10.1145/3123266
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. convolutional neural networks
  2. deep learning for the web
  3. image attributes
  4. image virality

Qualifiers

  • Research-article

Conference

MM '17: ACM Multimedia Conference
October 23 - 27, 2017
Mountain View, California, USA

Acceptance Rates

MM '17 Paper Acceptance Rate 189 of 684 submissions, 28%;
Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 9
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 23 Nov 2024

Citations

Cited By

  • (2023) ViViD: View Prediction of Online Video Through Deep Neural Network-Based Analysis of Subjective Video Attributes. IEEE Transactions on Broadcasting 69:1 (191-200). DOI: 10.1109/TBC.2022.3231100. Online publication date: Mar-2023
  • (2022) Who wants to be a Click-Millionaire? On the Influence of Thumbnails and Captions. 2022 26th International Conference on Pattern Recognition (ICPR) (629-635). DOI: 10.1109/ICPR56361.2022.9956202. Online publication date: 21-Aug-2022
  • (2020) The Neil deGrasse Tyson Problem: Methods for Exploring Base Memes in Web Archives. International Conference on Social Media and Society (255-264). DOI: 10.1145/3400806.3400836. Online publication date: 22-Jul-2020
  • (2019) Intrinsic Image Popularity Assessment. Proceedings of the 27th ACM International Conference on Multimedia (1979-1987). DOI: 10.1145/3343031.3351007. Online publication date: 15-Oct-2019
  • (2019) Multi-object tracking using deformable convolution networks with tracklets updating. International Journal of Wavelets, Multiresolution and Information Processing 17:06 (1950042). DOI: 10.1142/S0219691319500425. Online publication date: 24-Nov-2019
  • (2019) Novel framework for image attribute annotation with gene selection XGBoost algorithm and relative attribute model. Applied Soft Computing 80:C (57-79). DOI: 10.1016/j.asoc.2019.03.017. Online publication date: 1-Jul-2019
  • (2018) MemeSequencer. Proceedings of the 2018 World Wide Web Conference (1225-1235). DOI: 10.1145/3178876.3186021. Online publication date: 10-Apr-2018
  • (2018) Pairwise Confusion for Fine-Grained Visual Classification. Computer Vision – ECCV 2018 (71-88). DOI: 10.1007/978-3-030-01258-8_5. Online publication date: 6-Oct-2018
