Data Augmentation using Counterfactuals: Proximity vs Diversity
DOI: https://doi.org/10.32473/flairs.v35i.130705
Abstract
Counterfactual explanations are gaining popularity as a way of explaining machine learning models. Counterfactual examples are generally created to help interpret a model's decision: if a model makes a certain decision for an instance, the counterfactual examples of that instance reverse that decision. They can be created by carefully changing particular feature values of the instance. Although counterfactual examples are generated to explain the decisions of machine learning models, we have previously explored their use for effective data augmentation. In this work, we explore what kind of counterfactual examples work best for data augmentation. In particular, we generate counterfactual examples from two perspectives, proximity and diversity, and observe which perspective works best for this purpose. We demonstrate the efficacy of these approaches on the widely used “Adult-Income” dataset. We consider several scenarios in which we do not have enough data and use each approach to augment the dataset. We compare the two approaches and discuss the implications of the results.
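The abstract does not define proximity or diversity formally. As a rough illustration only, the sketch below scores a set of counterfactuals under one common reading of the two terms: proximity as the negative mean distance from the original instance, and diversity as the mean pairwise distance among the counterfactuals. The function names, the L1 distance, and the toy feature vectors are all assumptions made for this example, not the authors' method; the features are assumed to be numeric and scaled to [0, 1], which sidesteps the categorical attributes of a dataset such as Adult-Income.

```python
from itertools import combinations


def l1(a, b):
    """L1 (Manhattan) distance between two equal-length feature vectors."""
    return sum(abs(p - q) for p, q in zip(a, b))


def proximity(x, cfs):
    """Negative mean L1 distance from instance x to its counterfactuals.

    Higher (closer to zero) means the counterfactuals stay nearer to x.
    Hypothetical helper; the paper may use a different definition.
    """
    return -sum(l1(x, c) for c in cfs) / len(cfs)


def diversity(cfs):
    """Mean pairwise L1 distance among the counterfactuals.

    Higher means the counterfactuals differ more from one another.
    Hypothetical helper; the paper may use a different definition.
    """
    pairs = list(combinations(cfs, 2))
    if not pairs:
        return 0.0
    return sum(l1(a, b) for a, b in pairs) / len(pairs)


# Toy instance and two made-up counterfactual sets (illustrative values only).
x = [0.2, 0.5, 0.1]
close_cfs = [[0.3, 0.5, 0.1], [0.2, 0.6, 0.1]]   # small edits to x
spread_cfs = [[0.9, 0.5, 0.1], [0.2, 0.0, 0.8]]  # larger, varied edits

print(proximity(x, close_cfs), diversity(close_cfs))
print(proximity(x, spread_cfs), diversity(spread_cfs))
```

Under these assumed definitions, the first set scores higher on proximity and the second scores higher on diversity, which is the trade-off the two perspectives are meant to capture.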
License
Copyright (c) 2022 Md Golam Moula Mehedi Hasan, Douglas Talbert
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.