Abstract
Fine-tuning large pre-trained image and language models on small, customized datasets has become increasingly popular as a way to improve predictions and make efficient use of limited resources. Fine-tuning requires identifying the best models to transfer-learn from, and quantifying transferability avoids expensive re-training on every candidate model/task pair. In this paper, we show that statistical problems with covariance estimation drive the poor performance of H-score, a common baseline for newer metrics, and propose a shrinkage-based estimator. This yields up to an \(80\%\) absolute gain in H-score correlation performance, making it competitive with the state-of-the-art LogME measure; our shrinkage-based H-score is also 3–10 times faster to compute than LogME. Additionally, we examine the less common setting of target (as opposed to source) task selection. We demonstrate previously overlooked problems in such settings, with differing numbers of labels, class-imbalance ratios, etc., that led some recent metrics, e.g., NCE and LEEP, to be misrepresented as leading measures. We propose a correction and recommend measuring correlation performance against relative accuracy in such settings. We support our findings with \(\sim\)164,000 experiments (fine-tuning trials) on both vision models and graph neural networks.
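To make the shrinkage idea concrete, the sketch below computes an H-score-style quantity in which the sample feature covariance is shrunk toward a scaled identity before inversion. This is an illustrative simplification, not the paper's exact estimator: the paper uses a Ledoit-Wolf-type shrinkage intensity, whereas here `alpha` is a fixed, hand-picked parameter.

```python
import numpy as np

def shrinkage_hscore(features, labels, alpha=0.1):
    """Sketch of a shrinkage-based H-score (illustrative, not the
    paper's estimator).

    H-score is the trace of inv(cov(features)) times the covariance
    of the class-conditional feature means; here the feature
    covariance is shrunk toward (tr(S)/d) * I with fixed intensity
    `alpha` to improve conditioning.
    """
    n, d = features.shape
    # sample covariance of the features
    cov_f = np.cov(features, rowvar=False)
    # shrink toward a scaled identity target
    target = np.trace(cov_f) / d * np.eye(d)
    cov_f_shrunk = (1 - alpha) * cov_f + alpha * target
    # replace each row by its class-conditional mean
    g = np.empty_like(features, dtype=float)
    for c in np.unique(labels):
        mask = labels == c
        g[mask] = features[mask].mean(axis=0)
    cov_g = np.cov(g, rowvar=False)  # between-class covariance
    # trace of inv(cov_f_shrunk) @ cov_g, via a linear solve
    return float(np.trace(np.linalg.solve(cov_f_shrunk, cov_g)))
```

On well-separated classes this score is large (the class means carry most of the variance), and it drops when labels are shuffled, which is the behavior a transferability metric should exhibit.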
S. Ibrahim—This work was completed as an Intern and Student Researcher at Google.
Notes
- 1. The condition number of a positive semidefinite matrix \(A\) is the ratio of its largest and smallest eigenvalues.
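The definition in the note above can be checked numerically. This small NumPy sketch (not from the paper) computes the eigenvalue ratio for a diagonal PSD matrix and compares it against NumPy's built-in 2-norm condition number, which agrees for symmetric PSD matrices:

```python
import numpy as np

# PSD matrix with eigenvalues 4 and 1
A = np.array([[4.0, 0.0],
              [0.0, 1.0]])
eigvals = np.linalg.eigvalsh(A)      # eigenvalues, sorted ascending
cond = eigvals[-1] / eigvals[0]      # largest / smallest = 4.0
# matches np.linalg.cond(A), since for symmetric PSD matrices the
# singular values equal the eigenvalues
```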
References
Bao, Y., Li, Y., Huang, S., et al.: An information-theoretic approach to transferability in task transfer learning. In: 2019 IEEE ICIP, pp. 2309–2313 (2019)
Chen, Y., Wiesel, A., Eldar, Y.C., et al.: Shrinkage algorithms for MMSE covariance estimation. IEEE Trans. Signal Process. 58(10), 5016–5029 (2010)
Chollet, F., et al.: Keras. https://github.com/fchollet/keras (2015)
Cui, Y., Song, Y., Sun, C., et al.: Large scale fine-grained categorization and domain-specific transfer learning. CoRR abs/1806.06193 (2018)
Deshpande, A., Achille, A., Ravichandran, A., et al.: A linearized framework and a new benchmark for model selection for fine-tuning (2021)
Devlin, J., Chang, M., Lee, K., et al.: BERT: pre-training of deep bidirectional transformers for language understanding. CoRR abs/1810.04805 (2018)
Fey, M., Lenssen, J.E.: Fast graph representation learning with PyTorch Geometric. In: ICLR Workshop on Representation Learning on Graphs and Manifolds (2019)
Friedman, J., Hastie, T., Tibshirani, R.: Additive logistic regression: a statistical view of boosting (With discussion and a rejoinder by the authors). Ann. Stat. 28(2), 337–407 (2000)
Guyon, I.: Design of experiments for the NIPS 2003 variable selection benchmark (2003)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. SSS, Springer, New York (2009). https://doi.org/10.1007/978-0-387-84858-7
He, K., Zhang, X., Ren, S., et al.: Deep residual learning for image recognition. CoRR abs/1512.03385 (2015)
Huang, L.K., Wei, Y., Rong, Y., et al.: Frustratingly easy transferability estimation. CoRR abs/2106.09362 (2021)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: ICLR 2017, Toulon, France, 24–26 April 2017. OpenReview.net (2017)
Kornblith, S., Shlens, J., Le, Q.V.: Do better ImageNet models transfer better? In: 2019 IEEE/CVF CVPR, pp. 2656–2666 (2019)
Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural networks. In: Neural Information Processing Systems 25 (2012)
Ledoit, O., Wolf, M.: A well-conditioned estimator for large-dimensional covariance matrices. J. Multivar. Anal. 88(2), 365–411 (2004)
Li, H., Chaudhari, P., Yang, H., et al.: Rethinking the hyperparameters for fine-tuning. CoRR abs/2002.11770 (2020)
Li, Y., Jia, X., Sang, R., et al.: Ranking neural checkpoints. In: Proceedings of the IEEE/CVF CVPR, pp. 2663–2673 (2021)
Mahajan, D., Girshick, R.B., Ramanathan, V., et al.: Exploring the limits of weakly supervised pretraining. CoRR abs/1805.00932 (2018)
Woodbury, M.A.: Inverting modified matrices. Memorandum Rept. 42, Statistical Research Group, p. 4. Princeton University (1950)
Nguyen, C.V., Hassner, T., Seeger, M., Archambeau, C.: LEEP: a new measure to evaluate transferability of learned representations (2020)
Pedregosa, F., Varoquaux, G., Gramfort, A., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Pourahmadi, M.: High-dimensional covariance estimation: with high-dimensional data, vol. 882. John Wiley & Sons (2013)
Rabanser, S., Günnemann, S., Lipton, Z.C.: Failing loudly: an empirical study of methods for detecting dataset shift. In: NeurIPS (2019)
Rozemberczki, B., Allen, C., Sarkar, R.: Multi-scale attributed node embedding. J. Complex Netw. 9(2), cnab014 (2021)
Rozemberczki, B., Sarkar, R.: Characteristic functions on graphs: birds of a feather, from statistical descriptors to parametric models. In: Proceedings of the 29th ACM CIKM, pp. 1325–1334. CIKM 2020, ACM, New York, NY, USA (2020)
Rozemberczki, B., Sarkar, R.: Twitch gamers: a dataset for evaluating proximity preserving and structural role-based node embeddings (2021)
Schäfer, J., Strimmer, K.: A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Stat. Appl. Genet. Mol. Biol. 4, 32 (2005)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556 (2015)
Tan, Y., Li, Y., Huang, S.: OTCE: a transferability metric for cross-domain cross-task representations. CoRR abs/2103.13843 (2021)
Tran, A., Nguyen, C., Hassner, T.: Transferability and hardness of supervised classification tasks. In: 2019 IEEE/CVF ICCV, pp. 1395–1405 (2019)
Vinh, N.X., Epps, J., Bailey, J.: Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J. Mach. Learn. Res. 11(95), 2837–2854 (2010)
You, K., Liu, Y., Wang, J., Long, M.: LogME: practical assessment of pre-trained models for transfer learning. In: ICML (2021)
You, K., Liu, Y., Zhang, Z., Wang, J., Jordan, M.I., Long, M.: Ranking and tuning pre-trained models: a new paradigm of exploiting model hubs (2021)
Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ibrahim, S., Ponomareva, N., Mazumder, R. (2023). Newer is Not Always Better: Rethinking Transferability Metrics, Their Peculiarities, Stability and Performance. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science(), vol 13713. Springer, Cham. https://doi.org/10.1007/978-3-031-26387-3_42
DOI: https://doi.org/10.1007/978-3-031-26387-3_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26386-6
Online ISBN: 978-3-031-26387-3
eBook Packages: Computer Science, Computer Science (R0)