research-article

Social negative bootstrapping for visual categorization

Authors:

Cees G. M. Snoek,

Marcel Worring,

Arnold W. M. Smeulders

ICMR '11: Proceedings of the 1st ACM International Conference on Multimedia Retrieval

Article No.: 12, Pages 1 - 8

https://doi.org/10.1145/1991996.1992008

Published: 18 April 2011

Abstract

To learn classifiers for many visual categories, obtaining labeled training examples efficiently is crucial. Since a classifier tends to misclassify negative examples that are visually similar to positive examples, the learning process should emphasize such informative negatives. However, they are unlikely to be hit by random sampling, the de facto standard in the literature. In this paper, we go beyond random sampling by introducing a novel social negative bootstrapping approach. Given a visual category and a few positive examples, the proposed approach adaptively and iteratively harvests informative negatives from a large collection of socially tagged images. To label negative examples without human interaction, we design an effective virtual labeling procedure based on simple tag reasoning. Virtual labeling, in combination with adaptive sampling, enables us to select the most misclassified negatives as the informative samples. Learning from the positive set and the informative negative sets results in visual classifiers with higher accuracy. Experiments on two present-day image benchmarks employing 650K virtually labeled negative examples show the viability of the proposed approach. On a popular visual categorization benchmark, our precision at 20 increases by 34% compared to baselines trained on randomly sampled negatives. We achieve more accurate visual categorization without manually labeling any negatives.
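The loop described in the abstract (virtual labeling by tag reasoning, then iteratively selecting the most misclassified negatives) can be illustrated with a minimal sketch. Everything concrete here is an assumption, since the abstract gives no details: the labeling rule (an image counts as a virtual negative when it shares no tag with the category), the ridge least-squares linear learner standing in for whatever classifier the paper uses, the score-averaging ensemble, and all function and parameter names (`virtual_negatives`, `negative_bootstrap`, `n_candidates`) are hypothetical.

```python
import numpy as np

def virtual_negatives(image_tags, category_tags):
    """Indices of images sharing no tag with the category (assumed rule)."""
    banned = set(category_tags)
    return [i for i, tags in enumerate(image_tags) if banned.isdisjoint(tags)]

def fit_linear(X, y, reg=1e-2):
    """Ridge least-squares linear classifier; a stand-in for any learner."""
    Xb = np.hstack([X, np.ones((len(X), 1))])      # append a bias column
    A = Xb.T @ Xb + reg * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def score(w, X):
    """Signed decision value; higher means 'more positive'."""
    return np.hstack([X, np.ones((len(X), 1))]) @ w

def negative_bootstrap(X_pos, X_pool, n_iter=5, n_candidates=200, seed=0):
    """X_pool holds features of virtually labeled negatives only."""
    rng = np.random.default_rng(seed)
    n_neg = len(X_pos)                             # balanced training sets
    # Round 0: no classifier exists yet, so start from random negatives.
    chosen = rng.choice(len(X_pool), size=n_neg, replace=False)
    ensemble, history = [], []
    for _ in range(n_iter):
        X = np.vstack([X_pos, X_pool[chosen]])
        y = np.concatenate([np.ones(len(X_pos)), -np.ones(n_neg)])
        ensemble.append(fit_linear(X, y))
        history.append(chosen)
        # Adaptive sampling: draw fresh candidates and keep those the
        # current ensemble scores highest, i.e. the most misclassified.
        cand = rng.choice(len(X_pool), size=min(n_candidates, len(X_pool)),
                          replace=False)
        s = np.mean([score(w, X_pool[cand]) for w in ensemble], axis=0)
        chosen = cand[np.argsort(-s)[:n_neg]]
    return ensemble, history

# Toy usage with synthetic features (not data from the paper).
tags = [{"dog", "park"}, {"car"}, {"puppy"}, {"street", "car"}]
pool_ids = virtual_negatives(tags, {"dog", "puppy"})
demo_rng = np.random.default_rng(1)
X_pos = demo_rng.normal(2.0, 1.0, size=(20, 5))
X_pool = demo_rng.normal(0.0, 1.0, size=(500, 5))
ensemble, history = negative_bootstrap(X_pos, X_pool, n_iter=3, n_candidates=100)
```

In this sketch each round trains on the same positives plus a new set of hard negatives, and the final prediction would average the ensemble scores. Drawing a fresh candidate subset per round, rather than scoring the whole pool, keeps the cost of each round bounded; whether the paper makes the same trade-off is not stated in the abstract.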

Index Terms

Social negative bootstrapping for visual categorization
1. Computing methodologies
  1. Machine learning
2. Information systems
  1. Information retrieval
    1. Document representation
    2. Search engine architectures and scalability
      1. Search engine indexing

Published In

ICMR '11: Proceedings of the 1st ACM International Conference on Multimedia Retrieval

April 2011

512 pages

ISBN: 9781450303361

DOI: 10.1145/1991996

General Chairs:
Francesco G. B. De Natale, University of Trento, Italy
Alberto Del Bimbo, University of Florence, Italy

Program Chairs:
Alan Hanjalic, University of Amsterdam, Netherlands
B. S. Manjunath, University of California, Santa Barbara
Shin'ichi Satoh, NII, Japan

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICMR'11: International Conference on Multimedia Retrieval

Sponsor: SIGMM

April 18 - 20, 2011

Trento, Italy

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%

Bibliometrics

Total Citations: 11
Total Downloads: 183
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 25 Nov 2024

Citations

Cited By

Kumar V, Kumar V (2017). Automation of image categorization with most relevant negatives. Pattern Recognition and Image Analysis, 27(3):371-379. Online publication date: 1-Jul-2017.
https://dl.acm.org/doi/10.1134/S1054661817030051
Zhou P, Cheng G, Liu Z, Bu S, Hu X (2016). Weakly supervised target detection in remote sensing images based on transferred deep features and negative bootstrapping. Multidimensional Systems and Signal Processing, 27(4):925-944. Online publication date: 1-Oct-2016.
https://dl.acm.org/doi/10.1007/s11045-015-0370-3
Zhou P, Zhang D, Cheng G, Han J (2015). Negative Bootstrapping for Weakly Supervised Target Detection in Remote Sensing Images. Proceedings of the 2015 IEEE International Conference on Multimedia Big Data, pages 318-323. Online publication date: 20-Apr-2015.
https://dl.acm.org/doi/10.1109/BigMM.2015.13
Li J, Qian X, Li Q, Zhao Y, Wang L, Tang Y (2015). Mining near duplicate image groups. Multimedia Tools and Applications, 74(2):655-669. Online publication date: 1-Jan-2015.
https://dl.acm.org/doi/10.1007/s11042-014-2008-0
Katsurai M, Ogawa T, Haseyama M (2014). A Cross-Modal Approach for Extracting Semantic Relationships Between Concepts Using Tagged Images. IEEE Transactions on Multimedia, 16(4):1059-1074. Online publication date: 1-Jun-2014.
https://dl.acm.org/doi/10.1109/TMM.2014.2306655
Liu X, Huet B (2014). On the automatic online collection of training data for visual event modeling. Multimedia Tools and Applications, 70(1):525-542. Online publication date: 1-May-2014.
https://dl.acm.org/doi/10.1007/s11042-013-1376-1
Burghouts G, Schutte K, Bouma H, Hollander R (2014). Selection of negative samples and two-stage combination of multiple features for action detection in thousands of videos. Machine Vision and Applications, 25(1):85-98. Online publication date: 1-Jan-2014.
https://dl.acm.org/doi/10.1007/s00138-013-0514-0
Li X, Snoek C, Worring M, Koelma D, Smeulders A (2013). Bootstrapping Visual Categorization With Relevant Negatives. IEEE Transactions on Multimedia, 15(4):933-945. Online publication date: 1-Jun-2013.
https://dl.acm.org/doi/10.1109/TMM.2013.2238523
Li X, Snoek C, Worring M, Smeulders A, Ip H, Rui Y (2012). Fusing concept detection and geo context for visual search. Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, pages 1-8. Online publication date: 5-Jun-2012.
https://dl.acm.org/doi/10.1145/2324796.2324801
Li X, Snoek C, Worring M, Smeulders A (2012). Harvesting Social Images for Bi-Concept Search. IEEE Transactions on Multimedia, 14(4):1091-1104. Online publication date: 1-Aug-2012.
https://dl.acm.org/doi/10.1109/TMM.2012.2191943
