Abstract
Metric learning substantially mitigates the class-imbalance problem associated with large multi-class image databases. However, its computational complexity increases when the number of classes is very large. In this paper, a novel localized metric learning scheme is proposed for a large, multi-class, extremely imbalanced face database with an imbalance ratio as high as 265:1. Histogram of Oriented Gradients (HOG) features are extracted from each facial image and given as input for metric learning. The proposed scheme confines the metric learning process to local subspaces that have similar class populations: the training dataset is divided into smaller subsets based on class population such that the class imbalance ratio within a local group does not exceed 2:1. Each locally learnt distance metric is then used, one at a time, to transform the entire input space, and the nearest neighbor of the test sample in the training set is noted for each transformation. The test sample is assigned the class of the closest of these nearest neighbors across all transformations. Experiments are conducted on the highly imbalanced benchmark Labeled Faces in the Wild (LFW) dataset containing 1680 classes of celebrity faces. All classes are retained for experimentation, including minority classes with just two samples. The proposed localized metric learning scheme outperforms the state of the art for face classification on large multi-class extremely imbalanced face databases.
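The pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the HOG features are assumed to be precomputed as the rows of `X`, the greedy grouping is only one possible way to satisfy the 2:1 constraint, and `learn_local_metric` is a simple within-class whitening used as a stand-in because the abstract does not name the metric learning algorithm.

```python
import numpy as np

def group_classes_by_population(y, max_ratio=2.0):
    """Greedily group classes (sorted by size) so that, within a group,
    the largest class is at most max_ratio times the smallest."""
    classes, counts = np.unique(y, return_counts=True)
    order = np.argsort(counts, kind="stable")
    groups, current, smallest = [], [], None
    for idx in order:
        # Classes arrive in ascending size, so the first class of a group
        # is its smallest and the incoming class would be its largest.
        if current and counts[idx] > max_ratio * smallest:
            groups.append(current)
            current = []
        if not current:
            smallest = counts[idx]
        current.append(classes[idx])
    if current:
        groups.append(current)
    return groups

def learn_local_metric(X, y, reg=1e-3):
    """Placeholder metric learner (an assumption, NOT the paper's method):
    whitens the pooled within-class scatter of the local group.
    Returns L such that d(a, b) = ||L a - L b||."""
    d = X.shape[1]
    scatter = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c] - X[y == c].mean(axis=0)
        scatter += Xc.T @ Xc
    scatter = scatter / len(X) + reg * np.eye(d)  # regularize for invertibility
    evals, evecs = np.linalg.eigh(scatter)
    return (evecs / np.sqrt(evals)) @ evecs.T     # inverse matrix square root

def fit_local_metrics(X, y, max_ratio=2.0):
    """Learn one transformation per population-balanced group of classes."""
    transforms = []
    for group in group_classes_by_population(y, max_ratio):
        mask = np.isin(y, group)
        transforms.append(learn_local_metric(X[mask], y[mask]))
    return transforms

def classify(x, X_train, y_train, transforms):
    """Apply each local transformation to the whole training set, find the
    nearest neighbor of x under each, and return the label of the closest
    of those nearest neighbors."""
    best_dist, best_label = np.inf, None
    for L in transforms:
        dists = np.linalg.norm((X_train - x) @ L.T, axis=1)
        nn = int(np.argmin(dists))
        if dists[nn] < best_dist:
            best_dist, best_label = dists[nn], y_train[nn]
    return best_label
```

A typical call would be `transforms = fit_local_metrics(X_train, y_train)` followed by `classify(x_test, X_train, y_train, transforms)`. The sketch compares nearest-neighbor distances from different transformations directly, as the abstract describes; whether the paper applies any scale normalization before that comparison is not stated here.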
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Susan, S., Kaushik, A. (2022). Localized Metric Learning for Large Multi-class Extremely Imbalanced Face Database. In: Rage, U.K., Goyal, V., Reddy, P.K. (eds) Database Systems for Advanced Applications. DASFAA 2022 International Workshops. DASFAA 2022. Lecture Notes in Computer Science, vol 13248. Springer, Cham. https://doi.org/10.1007/978-3-031-11217-1_5
DOI: https://doi.org/10.1007/978-3-031-11217-1_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-11216-4
Online ISBN: 978-3-031-11217-1
eBook Packages: Computer Science (R0)