Predicting Interestingness of Visual Content

Visual Content Indexing and Retrieval with Psycho-Visual Models

Abstract

The ability of multimedia data to attract and hold people's interest for longer periods of time is becoming increasingly important in information retrieval and recommendation, especially given the ever-growing market value of social media and advertising. In this chapter we introduce a benchmarking framework (dataset and evaluation tools) designed specifically for assessing the performance of media interestingness prediction techniques. We release a dataset consisting of excerpts from 78 trailers of Hollywood-like movies. These data are annotated by human assessors according to their degree of interestingness. A real-world use scenario is targeted: interestingness is defined in the context of selecting visual content to illustrate a Video on Demand (VOD) website. We provide an in-depth analysis of the human aspects of this task, i.e., the correlation between perceptual characteristics of the content and the actual data, as well as of the machine aspects, by reviewing the systems that participated in the 2016 MediaEval Predicting Media Interestingness campaign. After discussing the state-of-the-art achievements, we present valuable insights, current capabilities, and future challenges.

Notes

  1. http://www.multimediaeval.org/

  2. http://www.multimediaeval.org/mediaeval2016/mediainterestingness/

  3. http://www.technicolor.com

  4. http://www.technicolor.com/en/innovation/scientific-community/scientific-data-sharing/interestingness-dataset

  5. https://github.com/mvsjober/pair-annotate

  6. http://trec.nist.gov/trec_eval/

Acknowledgements

We would like to thank Yu-Gang Jiang and Baohan Xu from Fudan University, China, and Hervé Bredin from LIMSI, France, for providing the features that accompany the released data, and Frédéric Lefebvre, Alexey Ozerov and Vincent Demoulin for their valuable input to the task definition. We would also like to thank our anonymous annotators for their contribution to building the ground truth for the datasets. Part of this work was funded under project SPOTTER PN-III-P2-2.1-PED-2016-1065, contract 30PED/2017.

Author information

Corresponding author

Correspondence to Claire-Hélène Demarty.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Demarty, CH. et al. (2017). Predicting Interestingness of Visual Content. In: Benois-Pineau, J., Le Callet, P. (eds) Visual Content Indexing and Retrieval with Psycho-Visual Models. Multimedia Systems and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-57687-9_10

  • DOI: https://doi.org/10.1007/978-3-319-57687-9_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57686-2

  • Online ISBN: 978-3-319-57687-9

  • eBook Packages: Computer Science, Computer Science (R0)
