Nothing Special   »   [go: up one dir, main page]

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Online ISSN : 1745-1337
Print ISSN : 0916-8508
Regular Section
Improved Majority Filtering Algorithm for Cleaning Class Label Noise in Supervised Learning
Muhammad Ammar MALIKJae Young CHOIMoonsoo KANGBumshik LEE
Author information
JOURNAL RESTRICTED ACCESS

2019 Volume E102.A Issue 11 Pages 1556-1559

Details
Abstract

In most supervised learning problems, the labelling quality of datasets plays a paramount role in the learning of high-performance classifiers. The performance of a classifier can significantly be degraded if it is trained with mislabeled data. Therefore, identification of such examples from the dataset is of critical importance. In this study, we proposed an improved majority filtering algorithm, which utilized the ability of a support vector machine in terms of capturing potentially mislabeled examples as support vectors (SVs). The key technical contribution of our work, is that the base (or component) classifiers that construct the ensemble of classifiers are trained using non-SV examples, although at the time of testing, the examples captured as SVs were employed. An example can be tagged as mislabeled if the majority of the base classifiers incorrectly classifies the example. Experimental results confirmed that our algorithm not only showed high-level accuracy with higher F1 scores, for identifying the mislabeled examples, but was also significantly faster than the previous methods.

Content from these authors
© 2019 The Institute of Electronics, Information and Communication Engineers
Previous article Next article
feedback
Top