
MAPPING: debiasing graph neural networks for fair node classification with limited sensitive information leakage

Published in: World Wide Web

Abstract

Despite remarkable success in diverse web-based applications, Graph Neural Networks (GNNs) inherit and can further exacerbate historical discrimination and social stereotypes, which critically hinders their deployment in high-stakes domains such as online clinical diagnosis and financial crediting. Existing research on fair graph learning typically relies on pairwise constraints to achieve fairness, but these fail to overcome dimensional limitations and do not generalize to multiple sensitive attributes. Moreover, most studies focus on in-processing techniques to enforce and calibrate fairness; constructing a model-agnostic debiasing GNN framework at the pre-processing stage, which would prevent downstream misuse and improve training reliability, remains largely under-explored. Furthermore, previous work tends to enhance either fairness or privacy individually, and few studies probe how fairness issues trigger privacy concerns or whether such concerns can be alleviated by fairness interventions. In this paper, we propose MAPPING (Masking And Pruning and Message-Passing trainING), a novel model-agnostic debiasing framework for fair node classification. MAPPING adopts distance covariance (dCov)-based fairness constraints to simultaneously reduce feature and topology biases under multiple sensitive memberships, and combines them with adversarial debiasing to confine the risks of sensitive attribute inference. Experiments on real-world datasets with different GNN variants demonstrate the effectiveness and flexibility of MAPPING. Our results show that MAPPING achieves better trade-offs between utility and fairness and mitigates the privacy risks of sensitive information leakage. This work paves the way for a new direction in trustworthy GNNs by addressing fairness and privacy concerns simultaneously, rather than achieving fairness at the expense of privacy.



Data Availability

Data is provided within the manuscript.

Materials Availability

Yes.

Code Availability

Currently no.

Notes

  1. https://github.com/chirag126/nifty

References

  1. Fan, W., Ma, Y., Li, Q., He, Y., Zhao, E., Tang, J., Yin, D.: Graph Neural Networks for Social Recommendation, in The World Wide Web Conference. Association for Computing Machinery, New York, NY, USA, WWW ’19, p. 417–426 (2019). https://doi.org/10.1145/3308558.3313488

  2. Yang, Z., Pei, W., Chen, M., Yue, C.: WTAGRAPH: Web Tracking and Advertising Detection using Graph Neural Networks, In: 2022 IEEE Symposium on Security and Privacy (SP) , pp. 1540–1557 (2022). https://doi.org/10.1109/SP46214.2022.9833670

  3. Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., Yu, P.S.: A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. (2020). https://arxiv.org/pdf/1901.00596.pdf

  4. Rahman, T., Surma, B., Backes, M., Zhang, Y.: Fairwalk: Towards Fair Graph Embedding, in Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19. Int. Joint Conf. Art. Intell. Org. pp. 3289–3295 (2019). https://doi.org/10.24963/ijcai.2019/456

  5. Suresh, H., Guttag, J.: A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle, In: Equity and Access in Algorithms, Mechanisms, and Optimization ACM, (2021). https://doi.org/10.1145/3465416.3483305

  6. Hamberg, K.: Gender bias in medicine. Women’s Health 4(3), 237–243 (2008). https://doi.org/10.2217/17455057.4.3.237


  7. Dai, E., Wang, S.: Say No to the Discrimination: Learning Fair Graph Neural Networks with Limited Sensitive Attribute Information, In: Proceedings of the 14th ACM International Conference on Web Search and Data Mining , pp. 680–688 (2021)

  8. Agarwal, C., Lakkaraju, H., Zitnik, M.: Towards a unified framework for fair and stable graph representation learning, In: Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, Proceedings of Machine Learning Research, vol. 161 (PMLR, 2021), pp. 2114–2124. https://proceedings.mlr.press/v161/agarwal21b.html

  9. Oneto, L., Navarin, N., Donini, M.: Learning Deep Fair Graph Neural Networks, In: 28th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN 2020, Bruges, Belgium, October 2-4, 2020 , pp. 31–36 (2020). https://www.esann.org/sites/default/files/proceedings/2020/ES2020-75.pdf

  10. Jiang, Z., Han, X., Fan, C., Liu, Z., Zou, N., Mostafavi, A., Hu, X.: Fmp: Toward fair graph message passing against topology bias (2022)

  11. Spinelli, I., Scardapane, S., Hussain, A., Uncini, A.: Fairdrop: Biased edge dropout for enhancing fairness in graph representation learning. IEEE Transactions on Artificial Intelligence pp. 1–1 (2021). https://doi.org/10.1109/TAI.2021.3133818

  12. Kamiran, F., Calders, T.: Data pre-processing techniques for classification without discrimination. Knowl. Inf. Syst. 33 (2011). https://doi.org/10.1007/s10115-011-0463-8

  13. Wang, H., Ustun, B., Calmon, F.: Repairing without Retraining: Avoiding Disparate Impact with Counterfactual Distributions, In: Proceedings of the 36th International Conference on Machine Learning, Proceedings of Machine Learning Research, PMLR, 97 pp. 6618–6627 (2019) https://proceedings.mlr.press/v97/wang19l.html

  14. Dong, Y., Ma, J., Wang, S., Chen, C., Li, J.: Fairness in graph mining: A survey. IEEE Trans. Knowl. Data Eng. (2023). https://doi.org/10.1109/TKDE.2023.3265598

  15. Dong, Y., Liu, N., Jalaian, B., Li, J.: Edits: Modeling and mitigating data bias for graph neural networks. In: Proceedings of the ACM Web Conference 2022, pp. 1259–1269 (2022)


  16. Villani, C.: Topics in Optimal Transportation. American Mathematical Society, Providence, Rhode Island (2003)


  17. Belghazi, M.I., Baratin, A., Rajeshwar, S., Ozair, S., Bengio, Y., Courville, A., Hjelm, D.: Mutual Information Neural Estimation, In: Proceedings of the 35th International Conference on Machine Learning, Proceedings of Machine Learning Research, PMLR, 80, pp. 531–540 (2018). https://proceedings.mlr.press/v80/belghazi18a.html

  18. Song, J., Ermon, S.: Understanding the Limitations of Variational Mutual Information Estimators, In: International Conference on Learning Representations (2020)

  19. Staerman, G., Laforgue, P., Mozharovskyi, P., d’Alché Buc, F.: When OT meets MoM: Robust estimation of Wasserstein Distance, In: Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research, PMLR, 130, pp. 136–144 (2021). https://proceedings.mlr.press/v130/staerman21a.html

  20. Székely, G.J., Rizzo, M.L., Bakirov, N.K.: Measuring and testing dependence by correlation of distances. The Annals of Statistics 35(6), 2769–2794 (2007). https://doi.org/10.1214/009053607000000505


  21. Hou, J., Ye, X., Feng, W., Zhang, Q., Han, Y., Liu, Y., Li, Y., Wei, Y.: Distance correlation application to gene co-expression network analysis. BMC Bioinformatics 23 (2022). https://doi.org/10.1186/s12859-022-04609-x

  22. Zhang, H., Wu, B., Yuan, X., Pan, S., Tong, H., Pei, J.: Trustworthy graph neural networks: Aspects, methods, and trends. Proceedings of the IEEE 112(2), 97–139 (2024). https://doi.org/10.1109/JPROC.2024.3369017


  23. Duddu, V., Boutet, A., Shejwalkar, V.: Quantifying Privacy Leakage in Graph Embedding, In: MobiQuitous 2020 - 17th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services. ACM (2020). https://doi.org/10.1145/3448891.3448939

  24. He, X., Jia, J., Backes, M., Gong, N.Z., Zhang, Y.: Stealing Links from Graph Neural Networks, In: 30th USENIX Security Symposium (USENIX Security 21) (USENIX Association, 2021), pp. 2669–2686 (2021). https://www.usenix.org/conference/usenixsecurity21/presentation/he-xinlei

  25. He, X., Wen, R., Wu, Y., Backes, M., Shen, Y., Zhang, Y.: Node-level membership inference attacks against graph neural networks (2021)

  26. Conti, M., Li, J., Picek, S., Xu, J.: Label-Only Membership Inference Attack against Node-Level Graph Neural Networks, In: Proceedings of the 15th ACM Workshop on Artificial Intelligence and Security (Association for Computing Machinery, New York, NY, USA, 2022), AISec 22, p. 1–12 (2022) https://doi.org/10.1145/3560830.3563734

  27. Golle, P.: Revisiting the uniqueness of simple demographics in the US population, In: Proceedings of the 5th ACM Workshop on Privacy in Electronic Society (Association for Computing Machinery, New York, NY, USA, 2006), WPES 06, p. 77–80 (2006). https://doi.org/10.1145/1179601.1179615

  28. Zhang, H., Yuan, X., Pan, S.: Unraveling Privacy Risks of Individual Fairness in Graph Neural Networks, In: 2024 IEEE 40th International Conference on Data Engineering (ICDE) IEEE Comput. Soc. Los Alamitos, CA, USA, 2024), pp. 1712–1725 (2024). https://doi.org/10.1109/ICDE60146.2024.00139

  29. Dai, E., Wang, S.: Learning fair graph neural networks with limited and private sensitive attribute information. IEEE Trans. Knowl. Data. Eng. pp. 1–14 (2022). https://doi.org/10.1109/TKDE.2022.3197554

  30. Zhang, S., Yin, H., Chen, T., Huang, Z., Cui, L., Zhang, X.: Graph Embedding for Recommendation against Attribute Inference Attacks, In: Proceedings of the Web Conference 2021 (Associate. Comput. Machine., New York, NY, USA,) WWW 21, p. 3002–3014 (2021). https://doi.org/10.1145/3442381.3449813

  31. Sajadmanesh, S., Gatica-Perez, D.: Locally Private Graph Neural Networks, In: Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security (Association for Computing Machinery, New York, NY, USA, ) CCS 21, p. 2130–2145 (2021). https://doi.org/10.1145/3460120.3484565

  32. Székely, G.J., Rizzo, M.L., Bakirov, N.K.: Measuring and testing dependence by correlation of distances. The Annals of Statistics 35(6), 2769–2794 (2007). http://www.jstor.org/stable/25464608

  33. Liu, J., Li, Z., Yao, Y., Xu, F., Ma, X., Xu, M., Tong, H.: Fair Representation Learning: An Alternative to Mutual Information, In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (Association for Computing Machinery, New York, NY, USA, ) KDD 22, p. 1088–1097 (2022). https://doi.org/10.1145/3534678.3539302

  34. Dwork, C., Hardt, M., Pitassi, T., Reingold, O., Zemel, R.: Fairness through Awareness, In: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference (Association for Computing Machinery, New York, NY, USA,) ITCS 12, p. 214–226 (2012). https://doi.org/10.1145/2090236.2090255

  35. Hardt, M., Price, E., Srebro, N.: Equality of Opportunity in Supervised Learning, In: Proceedings of the 30th International Conference on Neural Information Processing Systems (Curran Associates Inc., Red Hook, NY, USA,) NIPS’ 16, p. 3323–3331 (2016)

  36. Louizos, C., Swersky, K., Li, Y., Welling, M., Zemel, R.: The variational fair autoencoder (2017)

  37. Bose, A.J., Hamilton, W.: Compositional Fairness Constraints for Graph Embeddings, In: Proceedings of the Thirty-sixth International Conference on Machine Learning, Long Beach CA (2019)

  38. Wang, Y., Zhao, Y., Dong, Y., Chen, H., Li, J., Derr, T.: Improving Fairness in Graph Neural Networks via Mitigating Sensitive Attribute Leakage, in SIGKDD (2022)

  39. Skeem, J.L., Lowenkamp, C.T.: Risk, race, and recidivism: Predictive bias and disparate impact. Criminology 54(4), 680–712 (2016). https://doi.org/10.1111/1745-9125.12123

  40. van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9(86), 2579–2605 (2008). http://jmlr.org/papers/v9/vandermaaten08a.html

  41. Zeng, Z., Islam, R., Keya, K.N., Foulds, J., Song, Y., Pan, S.: Fair representation learning for heterogeneous information networks, In: Proceedings of the International AAAI Conference on Web and Social Media, 15, pp. 877–887 (2021)

  42. Yao, S., Huang, B.: Beyond parity: Fairness objectives for collaborative filtering. Adv. Neural. Inf. Process. Syst. 30 (2017)

  43. Jayaraman, B., Evans, D.: Are Attribute Inference Attacks Just Imputation?, In: Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security (Association for Computing Machinery, New York, NY, USA,) CCS 22, p. 1569–1582 (2022). https://doi.org/10.1145/3548606.3560663

  44. Beutel, A., Chen, J., Zhao, Z., Chi, E.H.: Data decisions and theoretical implications when adversarially learning fair representations (2017)

  45. Zhao, T., Dai, E., Shu, K., Wang, S.: Towards Fair Classifiers Without Sensitive Attributes: Exploring Biases in Related Features, In: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining (Association for Computing Machinery, New York, NY, USA,) WSDM 22, p. 1433–1442 (2022). https://doi.org/10.1145/3488560.3498493

  46. Zhu, H., Wang, S.: Learning fair models without sensitive attributes: A generative approach (2022)

  47. Zafar, M.B., Valera, I., Gomez Rodriguez, M., Gummadi, K.P.: Fairness Beyond Disparate Treatment and Disparate Impact: Learning Classification without Disparate Mistreatment, In: Proceedings of the 26th International Conference on World Wide Web (International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE,) WWW 17, p. 1171–1180 (2017). https://doi.org/10.1145/3038912.3052660

  48. Cho, J., Hwang, G., Suh, C.: A Fair Classifier Using Mutual Information, In: 2020 IEEE International Symposium on Information Theory (ISIT) (IEEE Press,) p. 2521–2526 (2020). https://doi.org/10.1109/ISIT44484.2020.9174293

  49. Roh, Y., Lee, K., Whang, S.E., Suh, C.: FR-Train: A Mutual Information-Based Approach to Fair and Robust Training, In: Proceedings of the 37th International Conference on Machine Learning (JMLR.org,) ICML20, (2020)

  50. Zhang, B.H., Lemoine, B., Mitchell, M.: Mitigating Unwanted Biases with Adversarial Learning, In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society (Association for Computing Machinery, New York, NY, USA,) AIES ’18, p. 335–340 (2018). https://doi.org/10.1145/3278721.3278779

  51. Hamilton, W.L., Ying, R., Leskovec, J.: Inductive Representation Learning on Large Graphs, In: NIPS (2017)

  52. Xu, K., Hu, W., Leskovec, J., Jegelka, S.: How Powerful are Graph Neural Networks?, In: International Conference on Learning Representations (2019). https://openreview.net/forum?id=ryGs6iA5Km

  53. Weisfeiler, B., Lehman, A.: A reduction of a graph to a canonical form and an algebra arising during this reduction., In: Nauchno-Technicheskaya Informatsia, 2(9) , pp. 12–16 (1968)

  54. Akiba, T., Sano, S., Yanase, T., Ohta, T., Koyama, M.: Optuna: A Next-Generation Hyperparameter Optimization Framework, In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Association for Computing Machinery, New York, NY, USA,) KDD 19, p. 2623–2631 ( 2019). https://doi.org/10.1145/3292500.3330701

  55. Huang, C., Huo, X.: A statistically and numerically efficient independence test based on random projections and distance covariance. Frontiers in Applied Mathematics and Statistics 7 (2022). https://doi.org/10.3389/fams.2021.779841

  56. Huo, X., Székely, G.J.: Fast computing for distance covariance. Technometrics 58(4), 435–447 (2016). https://doi.org/10.1080/00401706.2015.1054435


  57. Li, Y., Purcell, M., Rakotoarivelo, T., Smith, D., Ranbaduge, T., Ng, K.S.: Private graph data release: A survey (2022)

  58. Masrour, F., Wilson, T., Yan, H., Tan, P.N., Esfahanian, A.: Bursting the filter bubble: Fairness-aware network link prediction. Proceedings of the AAAI Conference on Artificial Intelligence 34(01), 841–848 (2020). https://doi.org/10.1609/aaai.v34i01.5429

  59. Li, K., Luo, G., Ye, Y., Li, W., Ji, S., Cai, Z.: Adversarial privacy-preserving graph embedding against inference attack. IEEE Internet. Things. J. 8, 6904–6915 (2020)


  60. Liao, P., Zhao, H., Xu, K., Jaakkola, T., Gordon, G.J., Jegelka, S., Salakhutdinov, R.: Information Obfuscation of Graph Neural Networks, In: Proceedings of the 38th International Conference on Machine Learning, Proceedings of Machine Learning Research, PMLR, 139, pp. 6600–6610 (2021). http://proceedings.mlr.press/v139/liao21a.html

  61. Hu, H., Cheng, L., Vap, J.P., Borowczak, M.: Learning Privacy-Preserving Graph Convolutional Network with Partially Observed Sensitive Attributes. In: Proceedings of the ACM Web Conference 2022, pp. 3552–3561 (2022)


  62. Wang, B., Guo, J., Li, A., Chen, Y., Li, H.: Privacy-Preserving Representation Learning on Graphs: A Mutual Information Perspective, In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (Association for Computing Machinery, New York, NY, USA,) KDD 21, p. 1667–1676 (2021). https://doi.org/10.1145/3447548.3467273

  63. Chang, H., Shokri, R.: On the Privacy Risks of Algorithmic Fairness, In: 2021 IEEE European Symposium on Security and Privacy (EuroS &P) (IEEE Computer Society, Los Alamitos, CA, USA,) pp. 292–303 (2021). https://doi.org/10.1109/EuroSP51992.2021.00028

  64. Chen, C., Liang, Y., Xu, X., Xie, S., Hong, Y., Shu, K.: When Fairness Meets Privacy: Fair Classification with Semi-Private Sensitive Attributes, In: Workshop on Trustworthy and Socially Responsible Machine Learning, NeurIPS 2022 (2022)

  65. Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating Noise to Sensitivity in Private Data Analysis, in Theory of Cryptography, pp. 265–284. Springer, Berlin Heidelberg, Berlin, Heidelberg (2006)


  66. Pujol, D., McKenna, R., Kuppam, S., Hay, M., Machanavajjhala, A., Miklau, G.: Fair Decision Making Using Privacy-Protected Data, In: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (Association for Computing Machinery, New York, NY, USA,) FAT 20, p. 189–199 (2020). https://doi.org/10.1145/3351095.3372872

  67. de Oliveira, A.S., Kaplan, C., Mallat, K., Chakraborty, T.: An empirical analysis of fairness notions under differential privacy (2023)

  68. Ding, J., Zhang, X., Li, X., Wang, J., Yu, R., Pan, M.: Differentially private and fair classification via calibrated functional mechanism, In: Proceedings of the AAAI Conference on Artificial Intelligence, 34 , pp. 622–629 (2020)

  69. Fey, M., Lenssen, J.E.: Fast Graph Representation Learning with PyTorch Geometric, In: ICLR Workshop on Representation Learning on Graphs and Manifolds (2019)

  70. Kingma, D.P., Ba, J.: Adam: A Method for Stochastic Optimization, In: 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conf. Track. Proceed. (2015). http://arxiv.org/abs/1412.6980

  71. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., Chintala, S.: PyTorch: An Imperative Style, High-Performance Deep Learning Library, In: Advances in Neural Information Processing Systems 32 (Curran Associates, Inc.,) pp. 8024–8035 (2019). http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf


Acknowledgements

This research was supported in part by the University of Pittsburgh Center for Research Computing through the resources provided. The first author acknowledges the support from the SCI fellowship.

Funding

This research was supported in part by the University of Pittsburgh Center for Research Computing through the resources provided.

Author information

Authors and Affiliations

Authors

Contributions

Balaji Palanisamy, as the advisor, reviewed and edited the manuscript. Ying Song completed all remaining work.

Corresponding author

Correspondence to Ying Song.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Ethics approval and consent to participate

Yes.

Consent for publication

Yes.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Empirical analysis

1.1 A.1 Data synthesis

Fig. 7: Distributions of biased and unbiased graph data based on the minor sensitive attribute. The minor sensitive attribute is binary, where group 0 represents the minority and group 1 denotes the majority.

Note that \(S_m\) can be highly related to \(S_n\), or not related at all. For \(S_n\), we generate a \(2500 \times 3\) biased non-sensitive feature matrix from the multivariate normal distributions \(\mathcal {N} \left( \mu _0, \Sigma _0\right) \) and \(\mathcal {N} \left( \mu _1, \Sigma _1\right) \), where subgroup 0 represents the minority in reality, \(\mu _0=\left( -10,-2,-5\right) ^{T}\), \(\mu _1=\left( 10,2,5\right) ^{T}\), \(\Sigma _0=\Sigma _1\) are both identity matrices, and \(|S_0|=500\) and \(|S_1|=2000\). We interleave the samples as follows: the first 100 from \(\left( \mu _0,\Sigma _0\right) \) and 200 from \(\left( \mu _1,\Sigma _1\right) \); the next 200 from \(\left( \mu _0,\Sigma _0\right) \) and 600 from \(\left( \mu _1,\Sigma _1\right) \); the next 100 from \(\left( \mu _0,\Sigma _0\right) \) and 700 from \(\left( \mu _1,\Sigma _1\right) \); and the final 100 from \(\left( \mu _0,\Sigma _0\right) \) and 500 from \(\left( \mu _1,\Sigma _1\right) \). We then generate \(S_n\) from this combination.

Next, we create another \(2500 \times 3\) non-sensitive matrix attached to \(S_n\) from the multivariate normal distributions \(\mathcal {N} \left( \mu _2, \Sigma _2\right) \) and \(\mathcal {N} \left( \mu _3, \Sigma _3\right) \), where subgroup 2 represents the minority in reality, \(\mu _2=\left( -12,-8,-4\right) ^{T}\), \(\mu _3=\left( 12,8,4\right) ^{T}\), \(\Sigma _2=\Sigma _3\) are both identity matrices, and \(|S_2|=700\) and \(|S_3|=1800\). We combine the first 300 samples from \(\left( \mu _2,\Sigma _2\right) \) and 1200 from \(\left( \mu _3,\Sigma _3\right) \), followed by 400 from \(\left( \mu _2,\Sigma _2\right) \) and 600 from \(\left( \mu _3,\Sigma _3\right) \), and then generate \(S_m\) from this combination. Debiased features are sampled from a multivariate normal distribution with \(\mu _4=\left( 0,1,0,1,0,1\right) \) and the identity matrix as covariance.

The biased topology is formed via a stochastic block model in which the first block contains 500 nodes and the second contains 2000; the link probability is 5e-3 within blocks and 1e-7 between blocks. The debiased topology is built as a random geometric graph with radius 0.033.
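The synthesis above can be sketched in NumPy (a minimal illustration, not the paper's exact script: unit-variance normals stand in for the identity-covariance Gaussians, and the stochastic block model is sampled directly from Bernoulli trials):

```python
import numpy as np

rng = np.random.default_rng(0)

# Interleaved chunks from the text: (samples from subgroup 0, samples from subgroup 1).
pattern = [(100, 200), (200, 600), (100, 700), (100, 500)]
mu0, mu1 = np.array([-10.0, -2.0, -5.0]), np.array([10.0, 2.0, 5.0])

chunks, s_n = [], []
for n0, n1 in pattern:
    chunks.append(rng.normal(mu0, 1.0, size=(n0, 3)))  # minority subgroup 0
    chunks.append(rng.normal(mu1, 1.0, size=(n1, 3)))  # majority subgroup 1
    s_n += [0] * n0 + [1] * n1
X = np.vstack(chunks)          # (2500, 3) biased feature matrix
s_n = np.array(s_n)            # binary minor sensitive attribute

# Biased topology: two-block stochastic block model (500 vs. 2000 nodes).
sizes, p_in, p_out = (500, 2000), 5e-3, 1e-7
block = np.repeat([0, 1], sizes)
P = np.where(block[:, None] == block[None, :], p_in, p_out)
A = rng.random((2500, 2500)) < P
A = np.triu(A, 1)
A = A | A.T                    # symmetric adjacency, no self-loops
```

The second sensitive attribute \(S_m\) and the debiased features follow the same recipe with \(\mu_2\), \(\mu_3\), and \(\mu_4\).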

1.2 A.2 Implementation details

We built 1-layer GCNs with PyTorch Geometric [69] and implemented them in PyTorch [71], using the Adam optimizer [70] with learning rate 1e-3, dropout 0.2, weight decay 1e-5, and hidden layer size 16. All models are trained for 1,000 epochs. Experiments are conducted on a 64-bit machine with 4 Nvidia A100 GPUs, and we repeat each experiment 10 times with different seeds to report average results.
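For illustration, the propagation computed by a single GCN layer of this kind can be written out in NumPy. PyTorch Geometric's `GCNConv` performs this symmetrically normalized propagation internally, so the following is a sketch of the operation rather than the training code:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One GCN layer: ReLU(D^{-1/2} (A + I) D^{-1/2} X W)."""
    A_hat = A + np.eye(A.shape[0])      # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(d ** -0.5)     # symmetric degree normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = rng.normal(size=(3, 4))             # toy node features
W = rng.normal(size=(4, 16))            # hidden layer size 16, as above
H = gcn_layer(A, X, W)                  # (3, 16) node embeddings
```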

Algorithm 1: Pre-masking Strategies

Algorithm 2: MAPPING

Appendix B: Pseudo codes

1.1 B.1 Algorithm 1 - Pre-masking Strategies

1.2 B.2 Algorithm 2 - MAPPING

Appendix C: Experiments

1.1 C.1 Dataset description

In German, nodes represent bank clients and edges connect clients with similar credit accounts. To control credit risk, the bank needs to distinguish applicants with good credit from those with bad credit. In Recidivism, nodes denote defendants released on bail in U.S. state courts during 1990-2009, and edges link defendants with similar basic demographics and past criminal histories. The task is to predict whether a defendant will be bailed. In Credit, nodes are credit card applicants and edges are formed based on the similarity of applicants' spending and payment patterns. The goal is to predict whether an applicant will default.
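The exact similarity measures are defined by the original benchmark constructions; as a hypothetical illustration, similarity-based edge construction of this kind can be sketched by thresholding pairwise cosine similarity of node features (the threshold value here is made up for the example):

```python
import numpy as np

def similarity_graph(X, threshold=0.8):
    """Link node pairs whose cosine similarity exceeds a threshold."""
    X_unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = X_unit @ X_unit.T               # pairwise cosine similarity
    A = (S > threshold).astype(int)
    np.fill_diagonal(A, 0)              # drop self-loops
    return A

X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
A = similarity_graph(X)
# nodes 0 and 1 have near-identical features and get linked; node 2 stays isolated
```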

1.2 C.2 Hyperparameter setting of SOTA

In this subsection, we detail the hyperparameters of the different fair models. To obtain the best relative performance, we leverage Optuna [54] to facilitate grid search.

FairGNN: dropout from \(\{0.1,0.2,0.3,0.4,0.5\}\), weight decay 1e-5, learning rate from \(\{0.001,0.005,0.01,0.05,0.1\}\), regularization coefficients \(\alpha =4\) and \(\beta =0.01\), sensitive number 200 and label number 500, hidden layer size from \(\{16,32,64,128\}\).

NIFTY: project hidden layer size 16, drop edge and feature rates are 0.001 and 0.1, dropout \(\{0.1,0.3,0.5\}\), weight decay 1e-5, learning rate \(\{0.0001,0.001,0.01\}\), regularization coefficient \(\{0.4,0.5,0.6,0.7,0.8\}\), hidden layer size 16.

EDITS: we directly use the debiased datasets in [15], dropout \(\{0.05,0.1,0.3,0.5\}\), weight decay {1e-4,1e-5,1e-6,1e-7}, learning rate \(\{0.001,0.005,0.01,0.05\}\), hidden layer size 16.
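The Optuna-driven search enumerates grids like those above; an equivalent plain grid search looks like the following sketch, where `validation_score` is a hypothetical placeholder for training a model and returning its validation metric (the NIFTY grids from the text are used as the search space):

```python
import itertools

# NIFTY search space from the text.
space = {
    "dropout": [0.1, 0.3, 0.5],
    "lr": [0.0001, 0.001, 0.01],
    "reg_coef": [0.4, 0.5, 0.6, 0.7, 0.8],
}

def validation_score(cfg):
    # Illustrative stand-in for a train/evaluate run:
    # peaks at dropout 0.3 and learning rate 0.001.
    return -abs(cfg["dropout"] - 0.3) - abs(cfg["lr"] - 0.001)

best_cfg, best_score = None, float("-inf")
for values in itertools.product(*space.values()):
    cfg = dict(zip(space.keys(), values))
    score = validation_score(cfg)
    if score > best_score:
        best_cfg, best_score = cfg, score
# → best_cfg has dropout 0.3 and lr 0.001
```

Optuna's `GridSampler` automates exactly this enumeration while logging each trial.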

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


Cite this article

Song, Y., Palanisamy, B. MAPPING: debiasing graph neural networks for fair node classification with limited sensitive information leakage. World Wide Web 27, 74 (2024). https://doi.org/10.1007/s11280-024-01312-0

