DOI: 10.1145/3278721.3278742

Fair Forests: Regularized Tree Induction to Minimize Model Bias

Published: 27 December 2018

Abstract

The potential lack of fairness in the outputs of machine learning algorithms has recently gained attention both within the research community and in society more broadly. Surprisingly, there is no prior work developing tree-induction algorithms for building fair decision trees or fair random forests. These methods are widely popular because they are among the few that are simultaneously interpretable, non-linear, and easy to use. In this paper we develop, to our knowledge, the first technique for the induction of fair decision trees. We show that our "Fair Forest" retains the benefits of the tree-based approach while providing greater accuracy and fairness than the alternatives, for both "group fairness" and "individual fairness." We also introduce new fairness measures that can handle multinomial and continuous attributes as well as regression problems, rather than only binary attributes and labels. Finally, we demonstrate a new, more robust evaluation procedure for such algorithms that considers the dataset in its entirety rather than only a specific protected attribute.
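
The core idea, regularized tree induction, can be illustrated with a short sketch. The paper's exact split criterion is not reproduced here; the code below only assumes a score of the form (information gain on the target label) minus a weighted information gain on the protected attribute, so that splits which are predictive of the protected attribute are penalized. The function names, the `lam` weight, and the toy data are illustrative, not the authors' implementation.

```python
import numpy as np

def entropy(values):
    """Shannon entropy of a discrete vector."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def info_gain(values, left_mask):
    """Information gain of a binary split (left_mask vs. its complement)
    measured against `values`."""
    n = len(values)
    left, right = values[left_mask], values[~left_mask]
    if len(left) == 0 or len(right) == 0:
        return 0.0
    return entropy(values) - (len(left) / n) * entropy(left) \
                           - (len(right) / n) * entropy(right)

def fair_split_score(y, s, left_mask, lam=1.0):
    """Hypothetical regularized criterion: reward splits that are informative
    about the label y, penalize splits that are informative about the
    protected attribute s. `lam` trades accuracy against fairness."""
    return info_gain(y, left_mask) - lam * info_gain(s, left_mask)

# Toy usage: pick the threshold on one feature that maximizes the fair score.
rng = np.random.default_rng(0)
x = rng.normal(size=200)                       # a single candidate feature
s = (rng.random(200) < 0.5).astype(int)        # protected attribute (binary here)
y = ((x + 0.5 * s + rng.normal(scale=0.5, size=200)) > 0).astype(int)

best = max((fair_split_score(y, s, x <= t), t) for t in np.unique(x)[:-1])
print("best fair score %.3f at threshold %.3f" % best)
```

In a full CART- or ID3-style learner, such a score would simply replace the usual gain when ranking candidate splits; the rest of the induction procedure, and the bagging step for a forest, would be unchanged.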




Information

Published In

AIES '18: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society
December 2018
406 pages
ISBN:9781450360128
DOI:10.1145/3278721

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 December 2018


Author Tags

  1. fairness
  2. feature importance
  3. random forest

Qualifiers

  • Research-article

Conference

AIES '18: AAAI/ACM Conference on AI, Ethics, and Society
February 2-3, 2018
New Orleans, LA, USA

Acceptance Rates

AIES '18 Paper Acceptance Rate: 61 of 162 submissions, 38%
Overall Acceptance Rate: 61 of 162 submissions, 38%


Article Metrics

  • Downloads (last 12 months): 43
  • Downloads (last 6 weeks): 4
Reflects downloads up to 18 Nov 2024

Cited By

  • (2024) A Critical Survey on Fairness Benefits of Explainable AI. Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 1579-1595. DOI: 10.1145/3630106.3658990. Online publication date: 3-Jun-2024
  • (2024) Addressing bias in bagging and boosting regression models. Scientific Reports, 14:1. DOI: 10.1038/s41598-024-68907-5. Online publication date: 8-Aug-2024
  • (2024) A Case Study of Accurate and Fair Classification. Proceedings of the 13th International Conference on Computer Engineering and Networks, 409-419. DOI: 10.1007/978-981-99-9243-0_40. Online publication date: 2-Feb-2024
  • (2023) FARE. Proceedings of the 40th International Conference on Machine Learning, 15401-15420. DOI: 10.5555/3618408.3619037. Online publication date: 23-Jul-2023
  • (2023) Combining unsupervised, supervised and rule-based learning: the case of detecting patient allergies in electronic health records. BMC Medical Informatics and Decision Making, 23:1. DOI: 10.1186/s12911-023-02271-8. Online publication date: 18-Sep-2023
  • (2023) Bias Mitigation for Machine Learning Classifiers: A Comprehensive Survey. ACM Journal on Responsible Computing. DOI: 10.1145/3631326. Online publication date: 1-Nov-2023
  • (2023) The Role of Explainable AI in the Research Field of AI Ethics. ACM Transactions on Interactive Intelligent Systems, 13:4, 1-39. DOI: 10.1145/3599974. Online publication date: 1-Jun-2023
  • (2023) Tracking Machine Learning Bias Creep in Traditional and Online Lending Systems with Covariance Analysis. Proceedings of the 15th ACM Web Science Conference 2023, 184-195. DOI: 10.1145/3578503.3583605. Online publication date: 30-Apr-2023
  • (2023) Fairmod: making predictions fair in multiple protected attributes. Knowledge and Information Systems, 66:3, 1861-1884. DOI: 10.1007/s10115-023-02003-4. Online publication date: 30-Oct-2023
  • (2022) A minimax framework for quantifying risk-fairness trade-off in regression. The Annals of Statistics, 50:4. DOI: 10.1214/22-AOS2198. Online publication date: 1-Aug-2022
  • Show More Cited By
