research-article

EMOD: an efficient on-device mobile visual search system

Published: 18 March 2015

Abstract

Recently, researchers have proposed solutions for building on-device mobile visual search (ODMVS) systems. Unlike traditional client-server mobile visual search systems, an ODMVS supports image search directly on the mobile device. An ODMVS must be designed with constrained hardware in mind, e.g., limited memory and a less powerful CPU. In this paper, we present EMOD, an efficient on-device mobile visual search system that is based on the Bag-of-Visual-Word (BOVW) framework but uses a small visual dictionary. We propose an Object Word Ranking (OWR) algorithm that efficiently identifies the most useful visual words of an image, so that a compact image signature can be constructed for fast retrieval and greatly improved retrieval performance. Because the visual dictionary is small, we propose the Top Inverted Index Ranking scheme to reduce the number of candidate images for similarity calculation. In addition, EMOD adopts a more efficient version of the recently proposed Ranking Consistency re-ranking algorithm to further enhance performance. Through extensive experimental evaluations, we demonstrate that our prototype EMOD system yields good retrieval accuracy and query response times for a database of more than 10K images.
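
The abstract does not spell out how candidate images are selected, so the following is only a minimal sketch of the general idea behind inverted-index candidate filtering in a BOVW system. It is not EMOD's Top Inverted Index Ranking scheme. The function names (`build_inverted_index`, `top_candidates`), the vote-counting rule, and the toy signatures are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_inverted_index(signatures):
    """Map each visual word id to the set of image ids whose signature contains it."""
    index = defaultdict(set)
    for image_id, words in signatures.items():
        for word in words:
            index[word].add(image_id)
    return index


def top_candidates(index, query_words, k):
    """Return up to k image ids that share the most visual words with the query.

    Hypothetical ranking rule: one vote per shared visual word.
    """
    votes = Counter()
    for word in set(query_words):
        for image_id in index.get(word, ()):
            votes[image_id] += 1
    # Sort by descending vote count, then by image id for a deterministic order.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [image_id for image_id, _ in ranked[:k]]


# Toy database (made up): image id -> compact signature of visual word ids.
signatures = {
    "img_a": [1, 4, 7],
    "img_b": [2, 4, 9],
    "img_c": [1, 4, 9],
    "img_d": [3, 5, 8],
}
index = build_inverted_index(signatures)
candidates = top_candidates(index, query_words=[1, 4, 9], k=2)
```

Only the images returned in `candidates` would then go through the more expensive similarity calculation, which is the kind of saving the abstract attributes to reducing the candidate set.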



Published In

MMSys '15: Proceedings of the 6th ACM Multimedia Systems Conference
March 2015
277 pages
ISBN:9781450333511
DOI:10.1145/2713168

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. augmented reality
  2. bag-of-visual-word
  3. inverted index
  4. memory impact
  5. mobile visual search
  6. on-device system

Qualifiers

  • Research-article

Conference

MMSys '15: Multimedia Systems Conference 2015
March 18-20, 2015
Portland, Oregon

Acceptance Rates

MMSys '15 Paper Acceptance Rate 12 of 41 submissions, 29%;
Overall Acceptance Rate 176 of 530 submissions, 33%

Article Metrics

  • Downloads (Last 12 months): 3
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 14 Dec 2024

Cited By

  • (2018) Accurate Image Retrieval With Unsupervised 2-Stage k-NN Re-Ranking. Computer Vision, pages 1726-1745. DOI: 10.4018/978-1-5225-5204-8.ch072. Online publication date: 2018
  • (2017) Adaptive RNN Tree for Large-Scale Human Action Recognition. 2017 IEEE International Conference on Computer Vision (ICCV), pages 1453-1461. DOI: 10.1109/ICCV.2017.161. Online publication date: Oct-2017
  • (2017) Mobile multi-view object image search. Multimedia Tools and Applications, 76:10, pages 12433-12456. DOI: 10.1007/s11042-016-3659-9. Online publication date: 1-May-2017
  • (2016) Accurate Image Retrieval with Unsupervised 2-Stage k-NN Re-Ranking. International Journal of Multimedia Data Engineering & Management, 7:1, pages 41-59. DOI: 10.4018/IJMDEM.2016010103. Online publication date: 1-Jan-2016
  • (2016) DeepCham: Collaborative Edge-Mediated Adaptive Deep Learning for Mobile Object Recognition. 2016 IEEE/ACM Symposium on Edge Computing (SEC), pages 64-76. DOI: 10.1109/SEC.2016.38. Online publication date: Oct-2016
  • (2016) GPU Acceleration of Content-Based Image Retrieval Based on SIFT Descriptors. 2016 19th International Conference on Network-Based Information Systems (NBiS), pages 342-347. DOI: 10.1109/NBiS.2016.55. Online publication date: Sep-2016
