Querying Discriminative and Representative Samples for Batch Mode Active Learning

Authors:

Jieping Ye

ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 9, Issue 3

Article No.: 17, Pages 1 - 23

https://doi.org/10.1145/2700408

Published: 17 February 2015

Abstract

Empirical risk minimization (ERM) provides a principled guideline for many machine learning and data mining algorithms. Under the ERM principle, one minimizes an upper bound of the true risk, which is approximated by the sum of the empirical risk and the complexity of the candidate classifier class. To guarantee satisfactory learning performance, ERM requires that the training data be sampled i.i.d. from the unknown source distribution. However, this may not hold in active learning, where one selects the most informative samples to label, and these samples may not follow the source distribution. In this article, we generalize the ERM principle to the active learning setting. We derive a novel form of upper bound for the true risk in the active learning setting; by minimizing this upper bound, we develop a practical batch mode active learning method. The proposed formulation involves a nonconvex integer programming optimization problem. We solve it efficiently by an alternating optimization method. Our method is shown to query the most informative samples while preserving the source distribution as much as possible, thus identifying the most uncertain and representative queries. We further extend our method to multiclass active learning by introducing novel pseudolabels in the multiclass case and developing an efficient algorithm. Experiments on benchmark datasets and real-world applications demonstrate the superior performance of our proposed method compared with state-of-the-art methods.
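
As a rough illustration of the trade-off the abstract describes (uncertain queries that still preserve the source distribution), the sketch below scores a candidate batch by the mean absolute decision value of its members plus a squared maximum mean discrepancy (MMD) between the batch and the whole pool. This is not the article's formulation: the abstract does not state the objective, and the greedy search here stands in for the alternating optimization the authors use. The function names, the RBF kernel, and the `trade_off` and `gamma` parameters are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel matrix between the rows of a and the rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def mmd_squared(k, idx):
    """Biased squared-MMD estimate between the subset idx and the full pool,
    computed from a precomputed pool-by-pool kernel matrix k."""
    idx = np.asarray(idx)
    return k[np.ix_(idx, idx)].mean() - 2.0 * k[idx, :].mean() + k.mean()


def select_batch(pool, scores, batch_size, trade_off=1.0, gamma=1.0):
    """Greedily pick batch_size pool indices (illustrative, not the paper's solver).

    scores holds decision values f(x) of the current classifier; |f(x)| close
    to 0 is treated as uncertain. trade_off (hypothetical parameter) weights
    the distribution-matching term against the uncertainty term.
    """
    k = rbf_kernel(pool, pool, gamma)
    uncertainty_cost = np.abs(scores)
    chosen = []
    for _ in range(batch_size):
        best, best_cost = None, np.inf
        for i in range(len(pool)):
            if i in chosen:
                continue
            cand = chosen + [i]
            # Low |f(x)| means uncertain; low MMD means representative.
            cost = uncertainty_cost[cand].mean() + trade_off * mmd_squared(k, cand)
            if cost < best_cost:
                best, best_cost = i, cost
        chosen.append(best)
    return chosen
```

With `trade_off=0.0` the sketch reduces to plain uncertainty sampling (the points with the smallest `|f(x)|`); larger values pull the batch toward points that together resemble the pool. The greedy loop evaluates every remaining candidate at each step, so it is only practical for small pools.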

Index Terms

1. Information systems
  1. Information systems applications
    1. Data mining

Published In

ACM Transactions on Knowledge Discovery from Data Volume 9, Issue 3

TKDD Special Issue (SIGKDD'13)

April 2015

313 pages

ISSN:1556-4681

EISSN:1556-472X

DOI:10.1145/2737800

Editor:
Philip S. Yu
University of Illinois at Chicago, USA

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 February 2015

Accepted: 01 September 2014

Revised: 01 April 2014

Received: 01 October 2013

Published in TKDD Volume 9, Issue 3

Qualifiers

Research-article
Research
Refereed

Funding Sources

NSF CCF-1025177
NIH LM010730
ONR N00014-11-1-0108

Bibliometrics

Total Citations: 69
Total Downloads: 1,022
Downloads (Last 12 months): 60
Downloads (Last 6 weeks): 5

Reflects downloads up to 28 Feb 2025

Cited By

Li B, Liu C, Shi M, Chen X, Ji X, Ye Q. (2025). Proposal Distribution Calibration for Few-Shot Object Detection. IEEE Transactions on Neural Networks and Learning Systems 36:1 (1911-1918). Online publication date: Jan-2025.
https://doi.org/10.1109/TNNLS.2023.3331648
He D, Taimoor N. (2025). Batch-mode active ordinal classification based on expected model output change and leadership tree. Applied Intelligence 55:4. Online publication date: 4-Jan-2025.
https://dl.acm.org/doi/10.1007/s10489-024-06152-z
Huang S, Wang T, Xiong H, Wen B, Huan J, Dou D. (2024). Temporal Output Discrepancy for Loss Estimation-Based Active Learning. IEEE Transactions on Neural Networks and Learning Systems 35:2 (2109-2123). Online publication date: Feb-2024.
https://doi.org/10.1109/TNNLS.2022.3186855
Lin X, Liu X, Chen B, Wang Y, Dong C, Hu P. (2024). ATAL: Active Learning Using Adversarial Training for Data Augmentation. IEEE Internet of Things Journal 11:3 (4787-4800). Online publication date: 1-Feb-2024.
https://doi.org/10.1109/JIOT.2023.3300300
Alajrami E, Ng T, Jevsikov J, Naidoo P, Fernandes P, Azarmehr N, Dinmohammadi F, Shun-shin M, Dadashi Serej N, Francis D, Zolgharni M. (2024). Active Learning for Left Ventricle Segmentation in Echocardiography. Computer Methods and Programs in Biomedicine (108111). Online publication date: Mar-2024.
https://doi.org/10.1016/j.cmpb.2024.108111
Wang Q, Li H, Xiong H, Wang W, Bian J, Lu Y, Wang S, Cheng Z, Dou D, Yin D. (2024). A Simple yet Effective Framework for Active Learning to Rank. Machine Intelligence Research 21:1 (169-183). Online publication date: 15-Jan-2024.
https://doi.org/10.1007/s11633-023-1422-z
Gonsior J, Falkenberg C, Magino S, Reusch A, Hartmann C, Thiele M, Lehner W. (2024). Comparing and Improving Active Learning Uncertainty Measures for Transformer Models by Discarding Outliers. Information Systems Frontiers. Online publication date: 26-Jun-2024.
https://doi.org/10.1007/s10796-024-10503-z
Rangnekar V, Upadhyay U, Akata Z, Banerjee B, Evans R, Shpitser I. (2023). USIM-DAL. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence (1707-1717). Online publication date: 31-Jul-2023.
https://dl.acm.org/doi/10.5555/3625834.3625994
Lv Q, Dong M. (2023). An adaptive active learning algorithm with informativeness and representativeness. Intelligent Data Analysis 27:1 (199-222). Online publication date: 30-Jan-2023.
https://doi.org/10.3233/IDA-216418
Zhao B, Bilen H. (2023). Dataset Condensation with Distribution Matching. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (6503-6512). Online publication date: Jan-2023.
https://doi.org/10.1109/WACV56688.2023.00645
