
Newer is Not Always Better: Rethinking Transferability Metrics, Their Peculiarities, Stability and Performance

  • Conference paper
  • First Online:
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2022)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 13713))

Abstract

Fine-tuning of large pre-trained image and language models on small customized datasets has become increasingly popular for improved prediction and efficient use of limited resources. Fine-tuning requires identifying the best models to transfer-learn from, and quantifying transferability prevents expensive re-training on all candidate model/task pairs. In this paper, we show that statistical problems with covariance estimation drive the poor performance of H-score—a common baseline for newer metrics—and propose a shrinkage-based estimator. This results in up to \(80\%\) absolute gain in H-score correlation performance, making it competitive with the state-of-the-art LogME measure. Our shrinkage-based H-score is 3–10 times faster to compute than LogME. Additionally, we examine the less common setting of target (as opposed to source) task selection. We demonstrate previously overlooked problems in such settings with different numbers of labels, class-imbalance ratios, etc., for some recent metrics (e.g., NCE, LEEP) that resulted in their being misrepresented as leading measures. We propose a correction and recommend measuring correlation performance against relative accuracy in such settings. We support our findings with \(\sim\)164,000 experiments (fine-tuning trials) on both vision models and graph neural networks.
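The shrinkage idea in the abstract can be sketched concretely. The snippet below is a minimal illustration, not the paper's exact implementation: it assumes the standard H-score definition, \(H(f) = \operatorname{tr}(\operatorname{cov}(f)^{-1}\operatorname{cov}(\mathbb{E}[f \mid y]))\), and swaps the ill-conditioned empirical feature covariance for scikit-learn's off-the-shelf Ledoit–Wolf shrinkage estimator; the function name `shrinkage_hscore` is ours.

```python
import numpy as np
from sklearn.covariance import LedoitWolf

def shrinkage_hscore(features, labels):
    """H-score with a Ledoit-Wolf shrinkage estimate of the feature covariance."""
    f = features - features.mean(axis=0)  # center the embeddings
    # Shrinkage covariance: stays well-conditioned even when the sample
    # size is small relative to the feature dimension
    cov_f = LedoitWolf(assume_centered=True).fit(f).covariance_
    # Replace each row by its class-conditional mean to estimate cov(E[f|y])
    g = np.zeros_like(f)
    for c in np.unique(labels):
        mask = labels == c
        g[mask] = f[mask].mean(axis=0)
    cov_g = g.T @ g / len(f)
    return float(np.trace(np.linalg.pinv(cov_f) @ cov_g))
```

A higher score suggests the frozen features already separate the target classes well, so ranking candidate models by this score approximates ranking by post-fine-tuning accuracy.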

S. Ibrahim—This work was completed as an Intern and Student Researcher at Google.


Notes

  1. The condition number of a positive semidefinite matrix A is the ratio of its largest to its smallest eigenvalue.

  2. https://keras.io/api/applications/.
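Footnote 1's definition is straightforward to state in code; a small sketch (the function name is ours):

```python
import numpy as np

def condition_number(a):
    # Ratio of the largest to the smallest eigenvalue of a positive
    # semidefinite matrix (footnote 1); eigvalsh returns them ascending
    eigvals = np.linalg.eigvalsh(a)
    return eigvals[-1] / eigvals[0]
```

An ill-conditioned empirical covariance (a very large condition number) is exactly what shrinkage repairs: blending in a scaled identity lifts the smallest eigenvalues.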



Author information

Corresponding author

Correspondence to Shibal Ibrahim.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 449 KB)


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Ibrahim, S., Ponomareva, N., Mazumder, R. (2023). Newer is Not Always Better: Rethinking Transferability Metrics, Their Peculiarities, Stability and Performance. In: Amini, M.R., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol 13713. Springer, Cham. https://doi.org/10.1007/978-3-031-26387-3_42


  • DOI: https://doi.org/10.1007/978-3-031-26387-3_42


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26386-6

  • Online ISBN: 978-3-031-26387-3

  • eBook Packages: Computer Science; Computer Science (R0)
