DOI: 10.1145/3278721.3278742

Fair Forests: Regularized Tree Induction to Minimize Model Bias

Published: 27 December 2018

Abstract

The potential lack of fairness in the outputs of machine learning algorithms has recently gained attention both within the research community and in society more broadly. Surprisingly, there is no prior work developing tree-induction algorithms for building fair decision trees or fair random forests. These methods are widely popular because they are among the few that are simultaneously interpretable, non-linear, and easy to use. In this paper we develop, to our knowledge, the first technique for the induction of fair decision trees. We show that our "Fair Forest" retains the benefits of the tree-based approach while providing greater accuracy and fairness than the alternatives, for both "group fairness" and "individual fairness." We also introduce new fairness measures that can handle multinomial and continuous attributes as well as regression problems, rather than only binary attributes and labels. Finally, we demonstrate a new, more robust evaluation procedure for such algorithms that considers the dataset in its entirety rather than only a specific protected attribute.
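
The core idea, regularized tree induction, can be illustrated with a short sketch. The paper's exact split criterion is not reproduced here; the code below only assumes a score of the form (information gain on the target label) minus a weighted information gain on the protected attribute, so that splits which are predictive of the protected attribute are penalized. The function names, the `lam` weight, and the toy data are illustrative, not the authors' implementation.

```python
import numpy as np

def entropy(values):
    """Shannon entropy of a discrete vector."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def info_gain(values, left_mask):
    """Information gain of a binary split (left_mask vs. its complement)
    measured against `values`."""
    n = len(values)
    left, right = values[left_mask], values[~left_mask]
    if len(left) == 0 or len(right) == 0:
        return 0.0
    return entropy(values) - (len(left) / n) * entropy(left) \
                           - (len(right) / n) * entropy(right)

def fair_split_score(y, s, left_mask, lam=1.0):
    """Hypothetical regularized criterion: reward splits that are informative
    about the label y, penalize splits that are informative about the
    protected attribute s. `lam` trades accuracy against fairness."""
    return info_gain(y, left_mask) - lam * info_gain(s, left_mask)

# Toy usage: pick the threshold on one feature that maximizes the fair score.
rng = np.random.default_rng(0)
x = rng.normal(size=200)                       # a single candidate feature
s = (rng.random(200) < 0.5).astype(int)        # protected attribute (binary here)
y = ((x + 0.5 * s + rng.normal(scale=0.5, size=200)) > 0).astype(int)

best = max((fair_split_score(y, s, x <= t), t) for t in np.unique(x)[:-1])
print("best fair score %.3f at threshold %.3f" % best)
```

In a full CART- or ID3-style learner, such a score would simply replace the usual gain when ranking candidate splits; the rest of the induction procedure, and the bagging step for a forest, would be unchanged.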




Information

Published In

AIES '18: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society
December 2018
406 pages
ISBN:9781450360128
DOI:10.1145/3278721

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 December 2018


Author Tags

  1. fairness
  2. feature importance
  3. random forest

Qualifiers

  • Research-article

Conference

AIES '18: AAAI/ACM Conference on AI, Ethics, and Society
February 2-3, 2018
New Orleans, LA, USA

Acceptance Rates

AIES '18 Paper Acceptance Rate: 61 of 162 submissions, 38%
Overall Acceptance Rate: 61 of 162 submissions, 38%


Article Metrics

  • Downloads (last 12 months): 43
  • Downloads (last 6 weeks): 4
Reflects downloads up to 18 Nov 2024

Cited By

  • (2024) A Critical Survey on Fairness Benefits of Explainable AI. Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 1579-1595. DOI: 10.1145/3630106.3658990. Online publication date: 3-Jun-2024
  • (2024) Addressing bias in bagging and boosting regression models. Scientific Reports, 14:1. DOI: 10.1038/s41598-024-68907-5. Online publication date: 8-Aug-2024
  • (2024) A Case Study of Accurate and Fair Classification. Proceedings of the 13th International Conference on Computer Engineering and Networks, 409-419. DOI: 10.1007/978-981-99-9243-0_40. Online publication date: 2-Feb-2024
  • (2023) FARE. Proceedings of the 40th International Conference on Machine Learning, 15401-15420. DOI: 10.5555/3618408.3619037. Online publication date: 23-Jul-2023
  • (2023) Combining unsupervised, supervised and rule-based learning: the case of detecting patient allergies in electronic health records. BMC Medical Informatics and Decision Making, 23:1. DOI: 10.1186/s12911-023-02271-8. Online publication date: 18-Sep-2023
  • (2023) Bias Mitigation for Machine Learning Classifiers: A Comprehensive Survey. ACM Journal on Responsible Computing. DOI: 10.1145/3631326. Online publication date: 1-Nov-2023
  • (2023) The Role of Explainable AI in the Research Field of AI Ethics. ACM Transactions on Interactive Intelligent Systems, 13:4, 1-39. DOI: 10.1145/3599974. Online publication date: 1-Jun-2023
  • (2023) Tracking Machine Learning Bias Creep in Traditional and Online Lending Systems with Covariance Analysis. Proceedings of the 15th ACM Web Science Conference 2023, 184-195. DOI: 10.1145/3578503.3583605. Online publication date: 30-Apr-2023
  • (2023) Fairmod: making predictions fair in multiple protected attributes. Knowledge and Information Systems, 66:3, 1861-1884. DOI: 10.1007/s10115-023-02003-4. Online publication date: 30-Oct-2023
  • (2022) A minimax framework for quantifying risk-fairness trade-off in regression. The Annals of Statistics, 50:4. DOI: 10.1214/22-AOS2198. Online publication date: 1-Aug-2022
  • Show More Cited By
