The Inefficiency of Achieving Fairness with Protected Attribute Suppression
Abstract
In recent years, artificial intelligence has been increasingly used to classify individuals for purposes such as granting bank loans. Although this technology has enabled the automation of such tasks, it has also raised social and ethical concerns, because it can propagate bias against historically discriminated groups. The attributes that identify membership in these groups are known as protected attributes. This work suggests that simply suppressing these attributes is insufficient to eliminate bias and achieve fairness in classification algorithms. We analyze the correlation and independence between attributes and evaluate the impact of suppression on the classification task in terms of both utility and fairness.
Keywords:
Algorithmic Fairness, Protected Attribute Suppression, Machine Learning
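The abstract's central claim, that suppressing a protected attribute does not by itself remove bias, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic data, the hypothetical proxy feature `z` that agrees with the protected attribute `a` 90% of the time, the biased label rates, and the trivial majority-vote classifier. The sketch trains on `z` only (so `a` is suppressed) and then measures the correlation between `a` and `z` and the ratio of positive prediction rates between the two groups.

```python
import math
import random


def make_data(n=10000, seed=0):
    """Synthetic rows (a, z, y): protected attribute, proxy feature, label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = rng.random() < 0.5                      # hypothetical protected attribute
        z = a if rng.random() < 0.9 else not a      # proxy: matches a 90% of the time
        y = rng.random() < (0.7 if a else 0.3)      # label rate depends on a (biased history)
        rows.append((int(a), int(z), int(y)))
    return rows


def phi_coefficient(rows):
    """Pearson correlation between the two binary columns a and z."""
    n11 = sum(1 for a, z, _ in rows if a == 1 and z == 1)
    n10 = sum(1 for a, z, _ in rows if a == 1 and z == 0)
    n01 = sum(1 for a, z, _ in rows if a == 0 and z == 1)
    n00 = sum(1 for a, z, _ in rows if a == 0 and z == 0)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom


def fit_without_protected(rows):
    """'Suppression': the model sees only z and predicts the majority label per z value."""
    counts = {0: [0, 0], 1: [0, 0]}
    for _, z, y in rows:
        counts[z][y] += 1
    return {z: int(c[1] > c[0]) for z, c in counts.items()}


def positive_rate(rows, model, group):
    """Share of positive predictions among rows whose protected attribute equals group."""
    preds = [model[z] for a, z, _ in rows if a == group]
    return sum(preds) / len(preds)


def disparate_impact(rows, model):
    """Ratio of positive prediction rates: group a=0 over group a=1."""
    return positive_rate(rows, model, 0) / positive_rate(rows, model, 1)


rows = make_data()
model = fit_without_protected(rows)
print("phi(a, z):", round(phi_coefficient(rows), 3))
print("disparate impact:", round(disparate_impact(rows, model), 3))
```

Under these assumptions the classifier never sees `a`, yet the ratio of positive rates stays far below the commonly used four-fifths (0.8) threshold, because the proxy `z` carries most of the information in `a`. Real datasets would call for a proper learner and for correlation measures suited to mixed attribute types.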
Published
14/10/2024
How to Cite
R. ARAGÃO, Lucas; SILVA, Maria de Lourdes M.; MACHADO, Javam C. The Inefficiency of Achieving Fairness with Protected Attribute Suppression. In: SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 39., 2024, Florianópolis/SC. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 813-819. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2024.243146.