DOI: 10.1109/ICDM.2005.24
Article

An Improved Categorization of Classifier's Sensitivity on Sample Selection Bias

Published: 27 November 2005

Abstract

A recent paper categorizes classifier learning algorithms according to their sensitivity to a common type of sample selection bias, in which the chance of an example being selected into the training sample depends on its feature vector x but not (directly) on its class label y. A classifier learner is categorized as "local" if it is insensitive to this type of sample selection bias; otherwise, it is considered "global". In that paper, the true model is not clearly distinguished from the model that the algorithm outputs. In its discussion of Bayesian classifiers, logistic regression and hard-margin SVMs, the true model (the model that generates the true class label for every example) is implicitly assumed to be contained in the model space of the learner, and the model's estimated class probabilities are assumed to converge asymptotically to the true class probabilities as the training set size increases. However, in the discussion of naive Bayes, decision trees and soft-margin SVMs, the model space is assumed not to contain the true model, and these three algorithms are instead argued to be "global learners". We argue that most classifier learners may or may not be affected by sample selection bias; this depends on the dataset as well as on the heuristics or inductive bias implied by the learning algorithm and their appropriateness to the particular dataset.
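The role of the model space can be illustrated with a small exact calculation. The sketch below is not taken from the paper: the three-valued feature, the probabilities, and the helper names (`biased_joint`, `conditional_y1`, `marginal_y1`) are all hypothetical choices made for illustration. It assumes selection depends on x only, and compares the large-sample limit of a learner whose model space contains the true model (a per-x probability table) with that of a learner whose model space does not (a single constant that ignores x).

```python
from fractions import Fraction as F

# Hypothetical toy distribution (assumed numbers, not from the paper).
p_x = {0: F(1, 3), 1: F(1, 3), 2: F(1, 3)}               # P(x)
p_y1_given_x = {0: F(1, 10), 1: F(1, 2), 2: F(9, 10)}    # true P(y=1 | x)
# Selection into the training sample depends on x only, never directly on y.
p_s_given_x = {0: F(9, 10), 1: F(1, 2), 2: F(1, 10)}     # P(s=1 | x)


def joint(p_x, p_y1_given_x, p_s_given_x=None):
    """Return P(x, y), or P(x, y | s=1) when selection probabilities are given."""
    unnorm = {}
    for x, px in p_x.items():
        weight = px if p_s_given_x is None else px * p_s_given_x[x]
        unnorm[(x, 1)] = weight * p_y1_given_x[x]
        unnorm[(x, 0)] = weight * (1 - p_y1_given_x[x])
    total = sum(unnorm.values())
    return {key: value / total for key, value in unnorm.items()}


def biased_joint(p_x, p_y1_given_x, p_s_given_x):
    """P(x, y | s=1): the distribution the biased training sample is drawn from."""
    return joint(p_x, p_y1_given_x, p_s_given_x)


def conditional_y1(dist):
    """Limit of a per-x table learner: P(y=1 | x) implied by the joint."""
    xs = {x for x, _ in dist}
    return {x: dist[(x, 1)] / (dist[(x, 0)] + dist[(x, 1)]) for x in xs}


def marginal_y1(dist):
    """Limit of a learner that ignores x: P(y=1) implied by the joint."""
    return sum(p for (_, y), p in dist.items() if y == 1)


unbiased = joint(p_x, p_y1_given_x)
biased = biased_joint(p_x, p_y1_given_x, p_s_given_x)

# Model space contains the true model: the per-x estimates are unchanged.
print(conditional_y1(biased) == p_y1_given_x)
# Model space too small: the constant estimate shifts under the same bias.
print(marginal_y1(unbiased), marginal_y1(biased))
```

In this toy setting the per-x table recovers the true conditional probabilities from the biased sample, while the constant predictor's estimate moves away from its unbiased value. The sketch covers only the asymptotic limit; with a finite sample, regions of x that are rarely selected would still be estimated poorly.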

References

[1] Ying So. A tutorial on logistic regression. Technical report, SAS Institute Inc., Cary, NC, 1999.
[2] Bianca Zadrozny. Learning and evaluating classifiers under sample selection bias. In Proceedings of the 21st International Conference on Machine Learning, 2004.


Published In

ICDM '05: Proceedings of the Fifth IEEE International Conference on Data Mining
November 2005
837 pages
ISBN:0769522785

Publisher

IEEE Computer Society

United States

Qualifiers

  • Article

Cited By

  • (2023) Correcting for selection bias and missing response in regression using privileged information. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence. DOI: 10.5555/3625834.3625853 (pp. 195-205). Online publication date: 31-Jul-2023
  • (2021) Why Attentions May Not Be Interpretable? Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. DOI: 10.1145/3447548.3467307 (pp. 25-34). Online publication date: 14-Aug-2021
  • (2013) Transfer defect learning. Proceedings of the 2013 International Conference on Software Engineering. DOI: 10.5555/2486788.2486839 (pp. 382-391). Online publication date: 18-May-2013
  • (2013) Inferring the demographics of search users. Proceedings of the 22nd international conference on World Wide Web. DOI: 10.1145/2488388.2488401 (pp. 131-140). Online publication date: 13-May-2013
  • (2013) Generalization of Malaria Incidence Prediction Models by Correcting Sample Selection Bias. Part II of the Proceedings of the 9th International Conference on Advanced Data Mining and Applications - Volume 8347. DOI: 10.1007/978-3-642-53917-6_17 (pp. 189-200). Online publication date: 14-Dec-2013
  • (2011) Predicting concept changes using a committee of experts. Proceedings of the 18th international conference on Neural Information Processing - Volume Part I. DOI: 10.1007/978-3-642-24955-6_69 (pp. 580-588). Online publication date: 13-Nov-2011
  • (2009) Graph-based transfer learning. Proceedings of the 18th ACM conference on Information and knowledge management. DOI: 10.1145/1645953.1646073 (pp. 937-946). Online publication date: 2-Nov-2009
  • (2008) Pedestrian flow prediction in extensive road networks using biased observational data. Proceedings of the 16th ACM SIGSPATIAL international conference on Advances in geographic information systems. DOI: 10.1145/1463434.1463512 (pp. 1-4). Online publication date: 5-Nov-2008
  • (2007) Covariate Shift Adaptation by Importance Weighted Cross Validation. The Journal of Machine Learning Research, vol. 8. DOI: 10.5555/1314498.1390324 (pp. 985-1005). Online publication date: 1-Dec-2007
  • (2006) Reverse testing. Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining. DOI: 10.1145/1150402.1150422 (pp. 147-156). Online publication date: 20-Aug-2006
