REST: Enhancing Group Robustness in DNNs Through Reweighted Sparse Training

  • Conference paper
Machine Learning and Knowledge Discovery in Databases: Research Track (ECML PKDD 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14170)

Abstract

Deep neural networks (DNNs) have proven effective in various domains. However, although they perform strongly on the majority of data groups, they often struggle on certain minority groups at inference time. This is because over-parameterized models learn bias attributes from the large number of bias-aligned training samples; these attributes are strongly spuriously correlated with the target variable, so the models come to rely on spurious correlations and underperform on bias-conflicting samples. To tackle this issue, we propose a novel reweighted sparse training framework, dubbed REST, which aims to improve performance on biased data while also improving computation and memory efficiency. The REST framework is experimentally validated on three datasets and shown to be effective at discovering unbiased subnetworks. We find that REST reduces the reliance on spuriously correlated features, leading to better performance across a wider range of data groups with fewer training and inference resources. We highlight that REST is a promising approach for improving the performance of DNNs on biased data while simultaneously improving computation and memory efficiency. By reducing the reliance on spurious correlations, REST has the potential to enhance the robustness of DNNs and improve their generalization capabilities. Code is released at https://github.com/zhao1402072392/REST.

J. Zhao and L. Yin contributed equally to this research.
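The abstract describes the method only at a high level. As a reading aid, the following is a minimal sketch, assuming a PyTorch setup, of the two ingredients the framework's name points to: a per-sample reweighted loss and dynamic sparse training with binary weight masks that are periodically magnitude-pruned and randomly regrown. It is not the authors' implementation (that is in the linked repository); every function name and hyperparameter below is an illustrative assumption.

```python
# Minimal sketch (not REST's actual code) of reweighted sparse training:
#   1) a reweighted loss that up-weights presumed bias-conflicting samples, and
#   2) dynamic sparse training: binary weight masks that are periodically
#      pruned by magnitude and regrown at random.
import torch
import torch.nn.functional as F


def reweighted_loss(logits, targets, sample_weights):
    """Per-sample cross-entropy scaled by externally supplied sample weights."""
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    return (sample_weights * per_sample).sum() / sample_weights.sum()


def init_masks(model, sparsity=0.9):
    """Random binary masks keeping a (1 - sparsity) fraction of each weight tensor."""
    return {
        name: (torch.rand_like(p) > sparsity).float()
        for name, p in model.named_parameters()
        if p.dim() > 1  # mask weight matrices / conv kernels, not biases
    }


def apply_masks(model, masks):
    """Zero out inactive weights so the network stays sparse after each update."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])


def update_masks(model, masks, drop_fraction=0.3):
    """Drop the weakest active connections and regrow as many inactive ones at random."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name not in masks:
                continue
            mask = masks[name]
            active = mask.bool()
            inactive_idx = (~active).flatten().nonzero(as_tuple=True)[0]
            n_drop = min(int(drop_fraction * active.sum().item()), inactive_idx.numel())
            if n_drop == 0:
                continue
            # Prune: deactivate the smallest-magnitude currently active weights.
            scores = p.abs().masked_fill(~active, float("inf"))
            drop_idx = torch.topk(scores.flatten(), n_drop, largest=False).indices
            mask.view(-1)[drop_idx] = 0.0
            # Grow: reactivate an equal number of previously inactive connections,
            # chosen at random; they restart from zero since they were masked out.
            grow_idx = inactive_idx[torch.randperm(inactive_idx.numel())[:n_drop]]
            mask.view(-1)[grow_idx] = 1.0
            p.mul_(mask)  # immediately zero the weights that were just pruned


def train_epoch(model, loader, optimizer, masks, weight_fn, mask_update_every=100):
    """One epoch of reweighted training under a fixed sparsity budget."""
    model.train()
    for step, (x, y) in enumerate(loader):
        logits = model(x)
        # weight_fn is a placeholder for whatever assigns larger weights to
        # presumed bias-conflicting samples (e.g. via an auxiliary biased model).
        loss = reweighted_loss(logits, y, weight_fn(x, y, logits))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        apply_masks(model, masks)          # keep inactive weights at zero
        if (step + 1) % mask_update_every == 0:
            update_masks(model, masks)     # redistribute the sparse topology
```

The sketch only fixes the overall shape of such a training loop: initialize masks once (init_masks followed by apply_masks), train on the reweighted loss, and periodically prune and regrow the masks. The specific reweighting scheme, sparsity schedule, and growth criterion used by REST are defined in the paper and the released code.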



Acknowledgements

This work used the Dutch national e-infrastructure with the support of the SURF Cooperative using grant no. EINF-3953/L1.

Author information

Corresponding author

Correspondence to Jiaxu Zhao.


Ethics declarations

Ethical Statement

As researchers in the field of deep neural networks, we recognize the importance of developing methods that improve the generalization capabilities of these models, particularly for minority groups that may be underrepresented in training data. Our proposed reweighted sparse training framework, REST, aims to tackle the issue of bias-conflicting correlations in DNNs by reducing reliance on spurious correlations. We believe that this work has the potential to enhance the robustness of DNNs and improve their performance on out-of-distribution samples, which may have significant implications for various applications such as healthcare and criminal justice. However, we acknowledge that there may be ethical considerations associated with the development and deployment of machine learning algorithms, particularly those that may impact human lives. As such, we encourage the responsible use and evaluation of our proposed framework to ensure that it aligns with ethical standards and does not perpetuate biases or harm vulnerable populations.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zhao, J., Yin, L., Liu, S., Fang, M., Pechenizkiy, M. (2023). REST: Enhancing Group Robustness in DNNs Through Reweighted Sparse Training. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science (LNAI), vol 14170. Springer, Cham. https://doi.org/10.1007/978-3-031-43415-0_19


  • DOI: https://doi.org/10.1007/978-3-031-43415-0_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43414-3

  • Online ISBN: 978-3-031-43415-0

  • eBook Packages: Computer Science (R0)
