Abstract
The ability of multimedia data to attract and hold people’s interest over longer periods of time is becoming increasingly important in information retrieval and recommendation, especially given the ever-growing market value of social media and advertising. In this chapter we introduce a benchmarking framework (dataset and evaluation tools) designed specifically for assessing the performance of media interestingness prediction techniques. We release a dataset consisting of excerpts from 78 trailers of Hollywood-like movies, annotated by human assessors according to their degree of interestingness. A real-world use scenario is targeted: interestingness is defined in the context of selecting visual content to illustrate a Video on Demand (VOD) website. We provide an in-depth analysis of the human aspects of this task, i.e., the correlation between perceptual characteristics of the content and the actual data, as well as of the machine aspects, through an overview of the systems that participated in the 2016 MediaEval Predicting Media Interestingness campaign. After discussing the state-of-the-art achievements, we present valuable insights, current capabilities, and future challenges.
Acknowledgements
We would like to thank Yu-Gang Jiang and Baohan Xu from Fudan University, China, and Hervé Bredin from LIMSI, France, for providing the features that accompany the released data, and Frédéric Lefebvre, Alexey Ozerov and Vincent Demoulin for their valuable input to the task definition. We would also like to thank our anonymous annotators for their contribution to building the ground truth for the datasets. Part of this work was funded under project SPOTTER PN-III-P2-2.1-PED-2016-1065, contract 30PED/2017.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Demarty, CH. et al. (2017). Predicting Interestingness of Visual Content. In: Benois-Pineau, J., Le Callet, P. (eds) Visual Content Indexing and Retrieval with Psycho-Visual Models. Multimedia Systems and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-57687-9_10
DOI: https://doi.org/10.1007/978-3-319-57687-9_10
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57686-2
Online ISBN: 978-3-319-57687-9
eBook Packages: Computer Science, Computer Science (R0)