Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Querying Discriminative and Representative Samples for Batch Mode Active Learning

Published: 17 February 2015 Publication History

Abstract

Empirical risk minimization (ERM) provides a principled guideline for many machine learning and data mining algorithms. Under the ERM principle, one minimizes an upper bound of the true risk, which is approximated by the summation of empirical risk and the complexity of the candidate classifier class. To guarantee a satisfactory learning performance, ERM requires that the training data are i.i.d. sampled from the unknown source distribution. However, this may not be the case in active learning, where one selects the most informative samples to label, and these data may not follow the source distribution. In this article, we generalize the ERM principle to the active learning setting. We derive a novel form of upper bound for the true risk in the active learning setting; by minimizing this upper bound, we develop a practical batch mode active learning method. The proposed formulation involves a nonconvex integer programming optimization problem. We solve it efficiently by an alternating optimization method. Our method is shown to query the most informative samples while preserving the source distribution as much as possible, thus identifying the most uncertain and representative queries. We further extend our method to multiclass active learning by introducing novel pseudolabels in the multiclass case and developing an efficient algorithm. Experiments on benchmark datasets and real-world applications demonstrate the superior performance of our proposed method compared to state-of-the-art methods.

References

[1]
Naoki Abe, Bianca Zadrozny, and John Langford. 2006. Outlier detection by active learning. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). 504--509.
[2]
Peter L. Bartlett and Shahar Mendelson. 2002. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research 3, 463--482.
[3]
Alina Beygelzimer, Sanjoy Dasgupta, and John Langford. 2009. Importance weighted active learning. In Proceedings of the 26th International Conference on Machine Learning (ICML). 49--56.
[4]
James C. Bezdek and Richard J. Hathaway. 2003. Convergence of alternating optimization. Neural, Parallel, and Scientific Computations 11, 4, 351--368.
[5]
Karsten M. Borgwardt, Arthur Gretton, Malte J. Rasch, Hans-Peter Kriegel, Bernhard Schölkopf, and Alex J. Smola. 2006. Integrating structured biological data by kernel maximum mean discrepancy. Bioinformatics 22, 14, 49--57.
[6]
Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein. 2011. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning 3, 1--122.
[7]
Christopher J. C. Burges. 1998. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery 2, 2, 121--167.
[8]
Colin Campbell, Nello Cristianini, and Alex J. Smola. 2000. Query learning with large margin classifiers. In Proceedings of the 17th International Conference on Machine Learning (ICML). 111--118.
[9]
Shayok Chakraborty, Vineeth Balasubramanian, and Sethuraman Panchanathan. 2011. Dynamic batch mode active learning. In Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2649--2656.
[10]
Chih-Chung Chang and Chih-Jen Lin. 2011. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2, 3, 27:1--27:27.
[11]
Olivier Chapelle, Bernhard Schölkopf, and Alexander Zien (Eds.). 2006. Semi-Supervised Learning. MIT Press, Cambridge, MA.
[12]
Rita Chattopadhyay, Zheng Wang, Wei Fan, Ian Davidson, Sethuraman Panchanathan, and Jieping Ye. 2012. Batch mode active sampling based on marginal probability distribution matching. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). 741--749.
[13]
Yuxin Chen and Andreas Krause. 2013. Near-optimal batch mode active learning and adaptive submodular optimization. In Proceedings of the 30th International Conference on Machine Learning (ICML). 160--168.
[14]
Yunmei Chen and Xiaojing Ye. 2011. Projection onto a simplex. arXiv preprint arXiv:1101.6081.
[15]
David A. Cohn, Zoubin Ghahramani, and Michael I. Jordan. 1996. Active learning with statistical models. Journal of Artificial Intelligence Research 4, 1, 129--145.
[16]
F. d’Alché Buc, Yves Grandvalet, and Christophe Ambroise. 2002. Semi-supervised MarginBoost. In Advances in Neural Information Processing Systems 14, 553--563.
[17]
Sanjoy Dasgupta. 2011. Two faces of active learning. Theoretical Computer Science 412, 19, 1767--1781.
[18]
Richard M. Dudley. 2002. Real Analysis and Probability. Cambridge University Press.
[19]
Andrew Frank and Arthur Asuncion. 2010. UCI Machine Learning Repository. Retrieved December 28, 2014, from http://archive.ics.uci.edu/ml.
[20]
Yoav Freund, H. Sebastian Seung, Eli Shamir, and Naftali Tishby. 1997. Selective sampling using the query by committee algorithm. Machine Learning 28, 2--3, 133--168.
[21]
Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Schölkopf, and Alexander Smola. 2012. A kernel two-sample test. Journal of Machine Learning Research 13, 723--773.
[22]
Yuhong Guo. 2010. Active instance sampling via matrix partition. In Advances in Neural Information Processing Systems 23, 802--810.
[23]
Yuhong Guo and Dale Schuurmans. 2008. Discriminative batch mode active learning. In Advances in Neural Information Processing Systems 20, 593--600.
[24]
Steven C. H. Hoi, Rong Jin, Jianke Zhu, and Michael R. Lyu. 2006a. Batch mode active learning and its application to medical image classification. In Proceedings of the 23rd International Conference on Machine Learning (ICML). 417--424.
[25]
Steven C. H. Hoi, Rong Jin, and Michael R. Lyu. 2006b. Large-scale text categorization by batch mode active learning. In Proceedings of the 15th International Conference on World Wide Web (WWW). 633--642.
[26]
Steven C. H. Hoi, Rong Jin, and Michael R. Lyu. 2009a. Batch mode active learning with applications to text categorization and image retrieval. IEEE Transactions on Knowledge and Data Engineering 21, 9, 1233--1248.
[27]
Steven C. H. Hoi, Rong Jin, Jianke Zhu, and Michael R. Lyu. 2008. Semi-supervised SVM batch mode active learning for image retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1--7.
[28]
Steven C. H. Hoi, Rong Jin, Jianke Zhu, and Michael R. Lyu. 2009b. Semisupervised SVM batch mode active learning with applications to image retrieval. ACM Transactions on Information Systems 27, 3, Article No. 16.
[29]
Sheng-Jun Huang, Rong Jin, and Zhi-Hua Zhou. 2010. Active learning by querying informative and representative examples. In Advances in Neural Information Processing Systems 23, 892--900.
[30]
Ajay J. Joshi, Fatih Porikli, and Nikolaos Papanikolopoulos. 2010. Multi-class batch-mode active learning for image classification. In Proceedings of the 2010 IEEE International Conference on Robotics and Automation (ICRA). 1873--1878.
[31]
Yehuda Koren. 2008. Factorization meets the neighborhood: A multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). 426--434.
[32]
Hieu T. Nguyen and Arnold Smeulders. 2004. Active learning using pre-clustering. In Proceedings of the 21st International Conference on Machine Learning (ICML). 79--86.
[33]
Ryan Rifkin and Aldebaro Klautau. 2004. In defense of one-vs-all classification. Journal of Machine Learning Research 5, 101--141.
[34]
Nicholas Roy and Andrew McCallum. 2001. Toward optimal active learning through sampling estimation of error reduction. In Proceedings of the 18th International Conference on Machine Learning (ICML). 441--448.
[35]
Burr Settles. 2009. Active Learning Literature Survey. Computer Sciences Technical Report 1648. University of Wisconsin--Madison.
[36]
H. Sebastian Seung, Manfred Opper, and Haim Sompolinsky. 1992. Query by committee. In Proceedings of the 5th Annual Conference on Computational Learning Theory (COLT). 287--294.
[37]
Bharath K. Sriperumbudur, Arthur Gretton, Kenji Fukumizu, Bernhard Schölkopf, and Gert R. G. Lanckriet. 2010. Hilbert space embeddings and metrics on probability measures. Journal of Machine Learning Research 11, 1517--1561.
[38]
Masashi Sugiyama. 2006. Active learning in approximately linear regression based on conditional expectation of generalization error. Journal of Machine Learning Research 7, 141--166.
[39]
Simon Tong and Daphne Koller. 2002. Support vector machine active learning with applications to text classification. Journal of Machine Learning Research 2, 45--66.
[40]
Vladimir Vapnik. 1998. Statistical Learning Theory. Wiley.
[41]
Zheng Wang, Shuicheng Yan, and Changshui Zhang. 2011. Active learning with adaptive regularization. Pattern Recognition 44, 10--11, 2375--2383.
[42]
Manfred K. Warmuth, Gunnar Rätsch, Michael Mathieson, Jun Liao, and Christian Lemmen. 2001. Active learning in the drug discovery process. In Advances in Neural Information Processing Systems 14, 1449--1456.
[43]
Zhao Xu, Kai Yu, Volker Tresp, Xiaowei Xu, and Jizhi Wang. 2003. Representative sampling for text classification using support vector machines. In Proceedings of the European Conference on Information Retrieval (ECIR). 393--407.
[44]
Kai Yu, Jinbo, and Volker Tresp. 2006. Active learning via transductive experimental design. In Proceedings of the 23rd International Conference on Machine Learning (ICML). 1081--1088.
[45]
Xiaojin Zhu, John Lafferty, and Zoubin Ghahramani. 2003. Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions. In Proceedings of the ICML 2003 Workshop on the Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining.

Cited By

View all
  • (2025)Proposal Distribution Calibration for Few-Shot Object DetectionIEEE Transactions on Neural Networks and Learning Systems10.1109/TNNLS.2023.333164836:1(1911-1918)Online publication date: Jan-2025
  • (2025)Batch-mode active ordinal classification based on expected model output change and leadership treeApplied Intelligence10.1007/s10489-024-06152-z55:4Online publication date: 4-Jan-2025
  • (2024)Temporal Output Discrepancy for Loss Estimation-Based Active LearningIEEE Transactions on Neural Networks and Learning Systems10.1109/TNNLS.2022.318685535:2(2109-2123)Online publication date: Feb-2024
  • Show More Cited By

Index Terms

  1. Querying Discriminative and Representative Samples for Batch Mode Active Learning

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Transactions on Knowledge Discovery from Data
    ACM Transactions on Knowledge Discovery from Data  Volume 9, Issue 3
    TKDD Special Issue (SIGKDD'13)
    April 2015
    313 pages
    ISSN:1556-4681
    EISSN:1556-472X
    DOI:10.1145/2737800
    Issue’s Table of Contents
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 17 February 2015
    Accepted: 01 September 2014
    Revised: 01 April 2014
    Received: 01 October 2013
    Published in TKDD Volume 9, Issue 3

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. Active learning
    2. empirical risk minimization
    3. maximum mean discrepancy
    4. representative and discriminative

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • NSF CCF-1025177
    • NIH LM010730
    • ONR N00014-11-1-0108

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)60
    • Downloads (Last 6 weeks)5
    Reflects downloads up to 28 Feb 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2025)Proposal Distribution Calibration for Few-Shot Object DetectionIEEE Transactions on Neural Networks and Learning Systems10.1109/TNNLS.2023.333164836:1(1911-1918)Online publication date: Jan-2025
    • (2025)Batch-mode active ordinal classification based on expected model output change and leadership treeApplied Intelligence10.1007/s10489-024-06152-z55:4Online publication date: 4-Jan-2025
    • (2024)Temporal Output Discrepancy for Loss Estimation-Based Active LearningIEEE Transactions on Neural Networks and Learning Systems10.1109/TNNLS.2022.318685535:2(2109-2123)Online publication date: Feb-2024
    • (2024)ATAL: Active Learning Using Adversarial Training for Data AugmentationIEEE Internet of Things Journal10.1109/JIOT.2023.330030011:3(4787-4800)Online publication date: 1-Feb-2024
    • (2024)Active Learning for Left Ventricle Segmentation in EchocardiographyComputer Methods and Programs in Biomedicine10.1016/j.cmpb.2024.108111(108111)Online publication date: Mar-2024
    • (2024)A Simple yet Effective Framework for Active Learning to RankMachine Intelligence Research10.1007/s11633-023-1422-z21:1(169-183)Online publication date: 15-Jan-2024
    • (2024)Comparing and Improving Active Learning Uncertainty Measures for Transformer Models by Discarding OutliersInformation Systems Frontiers10.1007/s10796-024-10503-zOnline publication date: 26-Jun-2024
    • (2023)USIM-DALProceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence10.5555/3625834.3625994(1707-1717)Online publication date: 31-Jul-2023
    • (2023)An adaptive active learning algorithm with informativeness and representativenessIntelligent Data Analysis10.3233/IDA-21641827:1(199-222)Online publication date: 30-Jan-2023
    • (2023)Dataset Condensation with Distribution Matching2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)10.1109/WACV56688.2023.00645(6503-6512)Online publication date: Jan-2023
    • Show More Cited By

    View Options

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Figures

    Tables

    Media

    Share

    Share

    Share this Publication link

    Share on social media