Toward semantic image similarity from crowdsourced clustering

Yanir Kleiman¹,
George Goldberg¹,
Yael Amsterdamer² &
Daniel Cohen-Or¹

Abstract

Determining the similarity between images is a fundamental step in many applications, such as image categorization, image labeling and image retrieval. Automatic methods for similarity estimation often fall short when the task requires semantic context, raising the need for human judgment. Such judgments can be collected via crowdsourcing techniques, based on tasks posed to web users. However, to estimate image similarities in reasonable time and at reasonable cost, the tasks posed to the crowd must be generated carefully. We observe that distances within local neighborhoods provide valuable information that allows a quick and accurate construction of the global similarity metric. This key observation leads to a solution based on clustering tasks that compare relatively similar images. In each query, crowd members cluster a small set of images into bins. The results yield many relative similarities between images, which are used to construct a global image similarity metric. This metric is progressively refined, and serves to generate finer, more local queries in subsequent iterations. We demonstrate the effectiveness of our method on datasets where ground truth is available, and on a collection of images where semantic similarities cannot be quantified. In particular, we show that our method outperforms alternative baseline approaches, and we demonstrate the usefulness of both clustering queries and our progressive refinement process.
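As a rough illustration of how clustering queries can yield pairwise similarities, the following is a minimal sketch and not the authors' algorithm. It assumes each crowd answer is a list of bins of image ids, scores a pair by the fraction of shared queries in which both images landed in the same bin, and builds the next, more local query from the images currently most similar to a seed. The function names `similarity_from_queries` and `next_query`, the co-occurrence scoring, and the example image ids are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def similarity_from_queries(queries):
    """Estimate pairwise similarity from clustering-query answers.

    Each answer is a list of bins; each bin is a list of image ids.
    Similarity of a pair = (times binned together) / (times shown together).
    Pairs that never appeared in the same query are absent from the result.
    Keys are (a, b) tuples with a < b.
    """
    together = defaultdict(int)  # pair -> queries that showed both images
    same_bin = defaultdict(int)  # pair -> queries that binned them together
    for bins in queries:
        # Map every image in this answer to the index of its bin.
        label = {img: b for b, members in enumerate(bins) for img in members}
        for a, b in combinations(sorted(label), 2):
            together[(a, b)] += 1
            if label[a] == label[b]:
                same_bin[(a, b)] += 1
    return {pair: same_bin[pair] / n for pair, n in together.items()}


def next_query(seed, images, sim, size):
    """Build a more local query: the seed plus its most similar images.

    Images with no known similarity to the seed are scored 0.0 (assumption).
    Ties are broken by image id so the result is deterministic.
    """
    def score(other):
        pair = (seed, other) if seed < other else (other, seed)
        return sim.get(pair, 0.0)

    others = sorted((i for i in images if i != seed),
                    key=lambda i: (-score(i), i))
    return [seed] + others[: size - 1]


# Two hypothetical crowd answers over the same four images.
answers = [
    [["cat1", "cat2"], ["dog1"], ["car1"]],
    [["cat1", "cat2", "dog1"], ["car1"]],
]
sim = similarity_from_queries(answers)
local_query = next_query("cat1", ["car1", "cat1", "cat2", "dog1"], sim, 3)
```

Here the two cat images are binned together in both answers, while the dog joins them only once, so the next query around `cat1` keeps the cats and the dog and drops the car. A real system would also need to handle noisy workers and embed the similarities into a global metric, which this sketch omits.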

Notes

Crowdsourcing is a general name for processes that involve posing many small-scale tasks to the crowd of web users, and piecing together the crowd’s answers to achieve a larger-scale goal, such as constructing a large knowledge base.

Acknowledgments

This research was supported by a Google Focused Research Award, the Israeli Science Foundation (ISF, Grant No. 1636/13), by ICRC-The Blavatnik Interdisciplinary Cyber Research Center, and by the European Research Council under the FP7, ERC Grant MoDaS, Agreement 291071.

Author information

Authors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Yanir Kleiman, George Goldberg & Daniel Cohen-Or
Bar Ilan University, Ramat Gan, Israel
Yael Amsterdamer

Corresponding author

Correspondence to Yael Amsterdamer.

About this article

Cite this article

Kleiman, Y., Goldberg, G., Amsterdamer, Y. et al. Toward semantic image similarity from crowdsourced clustering. Vis Comput 32, 1045–1055 (2016). https://doi.org/10.1007/s00371-016-1266-4

Published: 20 May 2016
Issue Date: June 2016
DOI: https://doi.org/10.1007/s00371-016-1266-4
