PhotoPrev: Unifying Context and Content Cues to Enhance Personal Photo Revisitation

Li Jin¹,
Gang-Li Liu¹,
Liang Zhao¹ &
…
Ling Feng¹


Abstract

Revisiting personal photos on smartphones is a common yet difficult task because of the large volume of photos taken in daily life. Inspired by human memory and its natural recall characteristics, we build a personal photo revisitation tool, PhotoPrev, that helps users revisit previous photos through associated memory cues. To mimic users’ episodic memory recall, we present a way to automatically generate abundant contextual metadata (e.g., weather, temperature) and organize it into context lattices for each photo in a life cycle. Meanwhile, photo content (e.g., objects, text) is extracted and managed in a weighted term list, which corresponds to semantic memory. We also give a threshold-algorithm-based photo revisitation framework for context- and content-based keyword search over a personal photo collection, together with a user feedback mechanism. We evaluate scalability on a large synthetic dataset built by crawling users’ photos from Flickr, and a 12-week user study demonstrates the feasibility and effectiveness of our photo revisitation strategies.
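
The abstract does not spell out the search procedure, so the following is only a minimal sketch of how a threshold-algorithm-style top-k search over two cue types might look. It assumes the classic threshold algorithm for top-k aggregation over sorted score lists, a plain sum as the aggregation function, and a score of 0 for photos missing from a list. The function name `threshold_top_k`, the photo IDs, and all scores are hypothetical and are not taken from the paper.

```python
import heapq
from typing import Dict, List, Tuple


def threshold_top_k(score_lists: List[Dict[str, float]], k: int) -> List[Tuple[str, float]]:
    """Return the k photos with the highest summed score across all cue lists.

    Each dict maps a photo ID to a non-negative relevance score for one cue
    (for example, one for context matches and one for content matches).
    """
    # Sorted access: view every list in descending score order.
    sorted_lists = [
        sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        for scores in score_lists
    ]
    seen: Dict[str, float] = {}
    max_depth = max((len(lst) for lst in sorted_lists), default=0)

    for depth in range(max_depth):
        threshold = 0.0
        for lst in sorted_lists:
            if depth >= len(lst):
                continue  # an exhausted list contributes 0 to the threshold
            photo, score = lst[depth]
            threshold += score
            if photo not in seen:
                # Random access: look up this photo's score in every list.
                seen[photo] = sum(s.get(photo, 0.0) for s in score_lists)

        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        # Stop early once the k-th best seen total reaches the threshold:
        # no unseen photo can then score higher.
        if len(top) >= k and top[-1][1] >= threshold:
            return top

    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])


# Hypothetical per-cue scores for a small photo collection.
context_scores = {"p1": 0.9, "p2": 0.5, "p3": 0.1}
content_scores = {"p1": 0.2, "p2": 0.8, "p4": 0.3}

result = threshold_top_k([context_scores, content_scores], k=2)
print(result)
```

The early-stop test is what makes this approach attractive for large collections: the search reads only as deep into each sorted list as needed to guarantee the top k, rather than scoring every photo.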



Author information

Authors and Affiliations

Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, China
Li Jin, Gang-Li Liu, Liang Zhao & Ling Feng

Corresponding author

Correspondence to Li Jin.

Additional information

The work was supported by the National Natural Science Foundation of China under Grant Nos. 61373022 and 61073004, and by the National Basic Research 973 Program of China under Grant No. 2011CB302203-2.

About this article

Cite this article

Jin, L., Liu, GL., Zhao, L. et al. PhotoPrev: Unifying Context and Content Cues to Enhance Personal Photo Revisitation. J. Comput. Sci. Technol. 30, 453–466 (2015). https://doi.org/10.1007/s11390-015-1536-z

Received: 01 December 2014
Revised: 16 March 2015
Published: 01 May 2015
Issue Date: May 2015
DOI: https://doi.org/10.1007/s11390-015-1536-z
