DOI: 10.1109/ICDMW.2009.38
Article

Greedy is not Enough: An Efficient Batch Mode Active Learning Algorithm

Published: 06 December 2009

Abstract

Active learning algorithms actively select the training examples for which labels are acquired from domain experts, which is very effective at reducing human labeling effort in supervised learning. To reduce training time and to provide a more convenient environment for user interaction, it is necessary to select batches of new training examples instead of a single example. Batch mode active learning algorithms incorporate a diversity measure to construct a batch of diversified candidate examples. Existing approaches use greedy algorithms to make this feasible at the scale of thousands of examples. Greedy algorithms, however, are not efficient enough to scale to even larger real-world classification applications, which contain millions of examples. In this paper, we present an extremely efficient active learning algorithm. The new algorithm achieves the same results as the traditional greedy algorithm, while reducing the run time by a factor of several hundred. We prove that the objective function of the algorithm is submodular, which guarantees that it finds the same solution as the greedy algorithm. We evaluate our approach on several large-scale real-world text classification problems and show that it achieves substantial speedups while obtaining the same classification accuracy.
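
The abstract does not give the paper's objective function or algorithm, so the following is only an illustrative sketch of how submodularity can let a faster procedure return exactly the batch that plain greedy selection would. It uses a stand-in objective (the number of distinct features covered by the batch, which is monotone and submodular) and the well-known lazy evaluation trick: because marginal gains can only shrink as the batch grows, a gain computed in an earlier round is a valid upper bound and most candidates never need re-scoring. The names coverage_gain, greedy_batch and lazy_greedy_batch and the toy pool are hypothetical and are not taken from the paper.

```python
import heapq


def coverage_gain(features, covered):
    """Marginal gain of one candidate: features it adds beyond those covered."""
    return len(features - covered)


def greedy_batch(candidates, k):
    """Plain greedy: re-score every remaining candidate in every round."""
    covered, batch = set(), []
    remaining = set(range(len(candidates)))
    evaluations = 0
    for _ in range(min(k, len(candidates))):
        evaluations += len(remaining)
        # Highest gain wins; ties go to the lowest index.
        best = max(remaining,
                   key=lambda i: (coverage_gain(candidates[i], covered), -i))
        batch.append(best)
        covered |= candidates[best]
        remaining.remove(best)
    return batch, evaluations


def lazy_greedy_batch(candidates, k):
    """Lazy greedy: keep stale gains as upper bounds in a heap.

    Submodularity means a gain never increases as the batch grows, so a
    candidate whose freshly computed gain is still on top of the heap is
    the true best choice for this round.
    """
    covered, batch = set(), []
    # Entries are (-gain_bound, index, batch size when the bound was computed).
    heap = [(-len(f), i, 0) for i, f in enumerate(candidates)]
    heapq.heapify(heap)
    evaluations = len(candidates)
    while heap and len(batch) < k:
        _, i, stamp = heapq.heappop(heap)
        if stamp == len(batch):
            # Bound is fresh for this round: select the candidate.
            batch.append(i)
            covered |= candidates[i]
        else:
            # Bound is stale: recompute it and push the candidate back.
            evaluations += 1
            gain = coverage_gain(candidates[i], covered)
            heapq.heappush(heap, (-gain, i, len(batch)))
    return batch, evaluations


# Toy pool: each candidate is the set of features (e.g. words) it contains.
pool = [{"a", "b", "c"}, {"a", "b"}, {"d", "e"}, {"c", "d"}, {"f"}]
print(greedy_batch(pool, 3))
print(lazy_greedy_batch(pool, 3))
```

Both functions break ties by lowest index, so they select the same batch in the same order; the second element of each result counts gain evaluations, which is where the lazy variant saves work. How large the saving is depends on the data and the objective, and this sketch makes no claim about the speedups reported in the paper.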

    Published In

    ICDMW '09: Proceedings of the 2009 IEEE International Conference on Data Mining Workshops
    December 2009
    679 pages
    ISBN:9780769539027

    Publisher

    IEEE Computer Society

    United States

    Cited By

    • (2016) Scalability of Continuous Active Learning for Reliable High-Recall Text Classification. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pp. 1039-1048. DOI: 10.1145/2983323.2983776. Online publication date: 24-Oct-2016
    • (2014) PAKDD'12 best paper. Knowledge and Information Systems 41:3, pp. 871-892. DOI: 10.5555/2687513.2687570. Online publication date: 1-Dec-2014
    • (2012) Batch Mode Active Learning for Networked Data. ACM Transactions on Intelligent Systems and Technology 3:2, pp. 1-25. DOI: 10.1145/2089094.2089109. Online publication date: 1-Feb-2012
    • (2012) Generating balanced classifier-independent training samples from unlabeled data. Proceedings of the 16th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part I, pp. 266-281. DOI: 10.1007/978-3-642-30217-6_23. Online publication date: 29-May-2012
