Abstract
In sparse representation algorithms, a test sample can be sufficiently represented by exploiting only the training samples from its own class. In practice, however, due to variations in facial expression, illumination and pose, the other classes also influence the linear representation of the test sample to different degrees. Therefore, in order to represent a test sample more accurately, we propose a new sparse representation-based classification method which strengthens the discriminative property of different classes and obtains a better representation coefficient vector. In our method, we introduce a weighted matrix that assigns higher weights to small deviations and lower weights to large deviations. Meanwhile, we improve the constraint term on the representation coefficients, which enhances the distinctiveness of different classes and thus contributes positively to classification. In addition, motivated by the ProCRC algorithm, we take into account the deviation between the linear combination over all training samples and that over each single class, which further guarantees a discriminative representation of the test sample. Experimental results on the ORL, FERET, Extended Yale B and AR databases show that the proposed method achieves better classification performance than competing methods.
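To make the weighting idea concrete, here is a minimal NumPy sketch of one plausible diagonal weight matrix acting on the representation residual. The Gaussian decay, the bandwidth `sigma`, and the function name are our illustrative assumptions, not necessarily the paper's exact weighting scheme.

```python
import numpy as np

def residual_weights(y, X, beta, sigma=1.0):
    # Diagonal weight matrix W for the weighted fidelity term
    # (y - X beta)^T W (y - X beta): residual entries with a small
    # deviation get a weight near 1, large deviations a weight near 0.
    # Gaussian form and `sigma` are illustrative assumptions.
    r = y - X @ beta                         # current representation residual
    return np.diag(np.exp(-r ** 2 / sigma))  # d x d diagonal matrix
```

In robust regression, weights of this kind are usually re-estimated iteratively from the current residual.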
References
Liu, W., Zha, Z.J., Wang, Y., Lu, K., Tao, D.: p-Laplacian regularized sparse coding for human activity recognition. IEEE Trans. Ind. Electron. 63(8), 5120–5129 (2016)
Xu, Y., Fei, L., Wen, J., Zhang, D.: Discriminative and robust competitive code for palmprint recognition. IEEE Trans. Syst. Man Cybern. Syst. PP(99), 1–10 (2016)
Chen, G., Tao, D., Wei, L., Liu, L., Jie, Y.: Label propagation via teaching-to-learn and learning-to-teach. IEEE Trans. Neural Netw. Learn. Syst. 28(6), 1452–1465 (2017)
Yong, X., Li, X., Yang, J., Lai, Z., Zhang, D.: Integrating conventional and inverse representation for face recognition. IEEE Trans. Cybern. 44(10), 1738–1746 (2014)
Yong, X., Fang, X., Li, X., Yang, J., You, J., Liu, H., Teng, S.: Data uncertainty in face recognition. IEEE Trans. Cybern. 44(10), 1950–1961 (2014)
Chen, X., Ziarko, W.: Experiments with rough set approach to face recognition. Int. J. Intell. Syst. 26(6), 499–517 (2011)
Fang, Y., Lin, W., Fang, Z., Chen, Z., Lin, C.W., Deng, C.: Visual acuity inspired saliency detection by using sparse features. Inf. Sci. Int. J. 309(C), 1–10 (2015)
Du, B., Wang, Z., Zhang, L., Zhang, L., Liu, W., Shen, J., Tao, D.: Exploring representativeness and informativeness for active learning. IEEE Trans. Cybern. PP(99), 1–13 (2015)
Liu, W., Ma, T., Xie, Q., Tao, D., Cheng, J.: LMAE: a large margin auto-encoders for classification. Signal Process. 141, 137–143 (2017)
Liu, W., Tao, D., Cheng, J., Tang, Y.: Multiview Hessian discriminative sparse coding for image annotation. Comput. Vis. Image Underst. 118(1), 50–60 (2014)
Fang, Y., Wang, J., Narwaria, M., Le Callet, P., Lin, W.: Saliency detection for stereoscopic images. IEEE Trans. Image Process. 23(6), 2625–2636 (2014)
Du, B., Xiong, W., Wu, J., Zhang, L., Zhang, L., Tao, D.: Stacked convolutional denoising auto-encoders for feature representation. IEEE Trans. Cybern. 47(4), 1017–1027 (2016)
Gong, C., Liu, T., Tao, D., Fu, K., Tu, E., Yang, J.: Deformed graph Laplacian for semisupervised learning. IEEE Trans. Neural Netw. Learn. Syst. 26(10), 2261–2274 (2015)
Liu, T., Tao, D.: On the performance of Manhattan nonnegative matrix factorization. IEEE Trans. Neural Netw. Learn. Syst. 27(9), 1851–1863 (2016)
Du, B., Wang, N., Wang, N., Zhang, L., Zhang, L., Zhang, L.: Hyperspectral signal unmixing based on constrained non-negative matrix factorization approach. Neurocomputing 204(C), 153–161 (2016)
Liu, W., Yang, X., Tao, D., Cheng, J., Tang, Y.: Multiview dimension reduction via Hessian multiset canonical correlations. Inf. Fusion 41, 119–128 (2017)
Liu, T., Gong, M., Tao, D.: Large-cone nonnegative matrix factorization. IEEE Trans. Neural Netw. Learn. Syst. 28(9), 2129–2142 (2017)
Yu, J., Hong, C., Rui, Y., Tao, D.: Multi-task autoencoder model for recovering human poses. IEEE Trans. Ind. Electron. PP(99), 1 (2017)
Gong, C., Tao, D., Maybank, S.J., Liu, W., Kang, G., Yang, J.: Multi-modal curriculum learning for semi-supervised image classification. IEEE Trans. Image Process. 25(7), 3249–3260 (2016)
Bo, D., Zhang, M., Zhang, L., Ruimin, H., Tao, D.: PLTD: patch-based low-rank tensor decomposition for hyperspectral images. IEEE Trans. Multimed. 19(1), 67–79 (2017)
Liu, W., Zhang, L., Tao, D., Cheng, J.: Support vector machine active learning by Hessian regularization. J. Vis. Commun. Image Represent. 49, 47–56 (2017)
Yang, X., Liu, W., Tao, D., Cheng, J.: Canonical correlation analysis networks for two-view image recognition. Inf. Sci. Int. J. 385(C), 338–352 (2017)
Fang, Y., Wang, Z., Lin, W.: Video saliency incorporating spatiotemporal cues and uncertainty weighting. In: IEEE International Conference on Multimedia and Expo, pp. 1–6 (2013)
Bo, D., Zhao, R., Zhang, L., Zhang, L.: A spectral-spatial based local summation anomaly detection method for hyperspectral images. Signal Process. 124(C), 115–131 (2016)
Tao, D., Li, X., Wu, X., Maybank, S.J.: Geometric mean for subspace selection. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 260–274 (2009)
Li, L., Liu, S., Peng, Y., Sun, Z.: Overview of principal component analysis algorithm. Optik Int. J. Light Electron Opt. 127(9), 3935–3944 (2016)
Gong, C., Tao, D., Fu, K., Yang, J.: Fick’s law assisted propagation for semisupervised learning. IEEE Trans. Neural Netw. Learn. Syst. 26(9), 2148–2162 (2015)
Chen, G., Liu, T., Tang, Y., Jian, Y., Jie, Y., Tao, D.: A regularization approach for instance-based superset label learning. IEEE Trans. Cybern. PP(99), 1–12 (2017)
Yu, J., Yang, X., Fei, G., Tao, D.: Deep multimodal distance metric learning using click constraints for image ranking. IEEE Trans. Cybern. PP(99), 1–11 (2016)
Fang, Y., Fang, Z., Yuan, F., Yang, Y., Yang, S., Xiong, N.N.: Optimized multioperator image retargeting based on perceptual similarity measure. IEEE Trans. Syst. Man Cybern. Syst. 47(11), 2956–2966 (2017)
Gong, C., Tao, D., Chang, X., Yang, J.: Ensemble teaching for hybrid label propagation. IEEE Trans. Cybern. PP(99), 1–15 (2017)
Yong, X., Zhong, A., Yang, J., Zhang, D.: LPP solution schemes for use with face recognition. Pattern Recognit. 43(12), 4165–4176 (2010)
Yu, J., Rui, Y., Tang, Y.Y., Tao, D.: High-order distance-based multiview stochastic learning in image classification. IEEE Trans. Cybern. 44(12), 2431 (2014)
Roweis, S.T., Saul, L.K.: Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500), 2323–2326 (2000)
Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15(6), 1373–1396 (2003)
Ahonen, T., Hadid, A., Pietikainen, M.: Face description with local binary patterns: application to face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28(12), 2037–2041 (2006)
Wright, J., Ganesh, A., Zhou, Z., Wagner, A., Ma, Y.: Demo: robust face recognition via sparse representation. In: IEEE International Conference on Automatic Face and Gesture Recognition, pp. 1–2 (2009)
Naseem, I., Togneri, R., Bennamoun, M.: Linear regression for face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 32(11), 2106–2112 (2010)
Yong, X., Zhang, D., Yang, J., Yang, J.Y.: A two-phase test sample sparse representation method for use with face recognition. IEEE Trans. Circuits Syst. Video Technol. 21(9), 1255–1262 (2011)
Zhang, L., Yang, M., Feng, X.: Sparse representation or collaborative representation: which helps face recognition? In: IEEE International Conference on Computer Vision, pp. 471–478 (2012)
Deng, W., Jiani, H., Guo, J.: Extended SRC: undersampled face recognition via intraclass variant dictionary. IEEE Trans. Pattern Anal. Mach. Intell. 34(9), 1864–1870 (2012)
Tang, X., Feng, G., Cai, J.: Weighted group sparse representation for undersampled face recognition. Neurocomputing 145(18), 402–415 (2014)
Timofte, R., Van Gool, L.: Adaptive and weighted collaborative representations for image classification. Pattern Recognit. Lett. 43(1), 127–135 (2014)
Wu, J., Timofte, R., Van Gool, L.: Learned collaborative representations for image classification. In: IEEE Winter Conference on Applications of Computer Vision, pp. 456–463 (2015)
Yong, X., Zhong, Z., Jian, Y., You, J., Zhang, D.: A new discriminative sparse representation method for robust face recognition via regularization. IEEE Trans. Neural Netw. Learn. Syst. PP(99), 1–10 (2016)
Cai, S., Zhang, L., Zuo, W., Feng, X.: A probabilistic collaborative representation based approach for pattern classification. In: Computer Vision and Pattern Recognition, pp. 2950–2959 (2016)
Amaldi, E., Kann, V.: On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems. Theor. Comput. Sci. 209(1–2), 237–260 (1998)
Liu, T., Tao, D.: Classification with noisy labels by importance reweighting. IEEE Trans. Pattern Anal. Mach. Intell. 38(3), 447 (2016)
Candès, E.J., Romberg, J.K., Tao, T.: Stable signal recovery from incomplete and inaccurate measurements. Commun. Pure Appl. Math. 59(8), 1207–1223 (2006)
Fang, Y., Lin, W., Chen, Z., Tsai, C.M., Lin, C.W.: A video saliency detection model in compressed domain. IEEE Trans. Circuits Syst. Video Technol. 24(1), 27–38 (2014)
Huang, W., Wang, X., Jin, Z., Li, J.: Penalized collaborative representation based classification for face recognition. Appl. Intell. 4(4), 12–19 (2015)
Xu, Y., Zhu, Q., Chen, Y., Pan, J.S.: An improvement to the nearest neighbor classifier and face recognition experiments. Int. J. Innov. Comput. Inf. Control 9(2), 543–554 (2013)
Yong, X., Zhu, Q., Fan, Z., Qiu, M., Chen, Y., Liu, H.: Coarse to fine K nearest neighbor classifier. Pattern Recognit. Lett. 34(9), 980–986 (2013)
Yong, X., Fang, X., You, J., Chen, Y., Liu, H.: Noise-free representation based classification and face recognition experiments. Neurocomputing 147(1), 307–314 (2015)
Yong, X., Fan, Z., Zhu, Q.: Feature space-based human face image representation and recognition. Opt. Eng. 51(1), 7205 (2012)
Yong, X., Li, X., Yang, J., Zhang, D.: Integrate the original face image and its mirror image for face recognition. Neurocomputing 131(7), 191–199 (2014)
ORL: Face database. http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html. Accessed 1 Mar 2017
FERET: Face database. http://www.itl.nist.gov/iad/humanid/feret/feret_master.html. Accessed 1 Mar 2017
YaleB: Face database. http://vision.ucsd.edu/content/yale-face-database. Accessed 1 Mar 2017
AR: Face database. http://web.mit.edu/emeyers/www/face_databases.html#ar. Accessed 1 Mar 2017
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Nos. 61672333, 61402274, 61703096, 41471280), the China Postdoctoral Science Foundation (No. 2017M611655), the Program of Key Science and Technology Innovation Team in Shaanxi Province (No. 2014KTC-18), the Key Science and Technology Program of Shaanxi Province (No. 2016GY-081), the Natural Science Foundation of Jiangsu Province (No. BK20170691), the Fundamental Research Funds for the Central Universities (Nos. GK201803059, GK201803088), and the Interdisciplinary Incubation Project of Learning Science of Shaanxi Normal University.
Appendices
Appendix 1: The derivative with respect to \(\beta \) of \(\frac{1}{2}{\left( {y - {\mathbf{X}}\beta } \right) ^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) + \gamma \sum _{i = 1}^c \sum _{j = 1,j \ne i}^c {\beta _i^\mathrm{T}{\mathbf{X}}_i^\mathrm{T}{{\mathbf{X}}_j}{\beta _j}} + \lambda \sum _{i = 1}^c {\left\| {{\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}} \right\| _2^2} \)
First, \(\frac{d}{{d\beta }}\left( {\frac{1}{2}{{\left( {y - {\mathbf{X}}\beta } \right) }^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) } \right) = - {{\mathbf{X}}^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) \).
Next, letting \(f\left( \beta \right) = \gamma \sum _{i = 1}^c {\sum _{j = 1,j \ne i}^c {\beta _i^\mathrm{T}{\mathbf{X}}_i^\mathrm{T}{{\mathbf{X}}_j}{\beta _j}} } \), we compute the partial derivatives \(\frac{{\partial f}}{{\partial {\beta _k}}}\), \(k = 1, \ldots ,c\), from which \(\frac{{df}}{{d\beta }}\) follows. Noting that \(\sum _{i = 1}^c \sum _{j = 1}^c \beta _i^\mathrm{T}{\mathbf{X}}_i^\mathrm{T}{{\mathbf{X}}_j}{\beta _j} = {\left( {{\mathbf{X}}\beta } \right) ^\mathrm{T}}{\mathbf{X}}\beta = {\beta ^\mathrm{T}}{{\mathbf{X}}^\mathrm{T}}{\mathbf{X}}\beta \), \(f\left( \beta \right) \) can be rewritten as
\( f\left( \beta \right) = \gamma \left( {{\beta ^\mathrm{T}}{{\mathbf{X}}^\mathrm{T}}{\mathbf{X}}\beta - \sum _{i = 1}^c {\beta _i^\mathrm{T}{\mathbf{X}}_i^\mathrm{T}{{\mathbf{X}}_i}{\beta _i}} } \right) .\)
The partial derivative \(\frac{{\partial f}}{{\partial {\beta _k}}}\) is then
\( \frac{{\partial f}}{{\partial {\beta _k}}} = 2\gamma {\mathbf{X}}_k^\mathrm{T}{\mathbf{X}}\beta - 2\gamma {\mathbf{X}}_k^\mathrm{T}{{\mathbf{X}}_k}{\beta _k}, \quad k = 1, \ldots ,c.\)
Thus, the derivative of \(f\left( \beta \right) \) with respect to \(\beta \) is \(\frac{{df}}{{d\beta }} = \left[ \begin{array}{c} \frac{{\partial f}}{{\partial {\beta _1}}} \\ \vdots \\ \frac{{\partial f}}{{\partial {\beta _c}}} \end{array} \right] = \left[ \begin{array}{c} 2\gamma {\mathbf{X}}_1^\mathrm{T}{\mathbf{X}}\beta - 2\gamma {\mathbf{X}}_1^\mathrm{T}{{\mathbf{X}}_1}{\beta _1} \\ \vdots \\ 2\gamma {\mathbf{X}}_c^\mathrm{T}{\mathbf{X}}\beta - 2\gamma {\mathbf{X}}_c^\mathrm{T}{{\mathbf{X}}_c}{\beta _c} \end{array} \right] = 2\gamma {{\mathbf{X}}^\mathrm{T}}{\mathbf{X}}\beta - 2\gamma {\mathbf{M}}\beta ,\)
where \({\mathbf{M}} = \left( {\begin{matrix} {{\mathbf{X}}_1^\mathrm{T}{{\mathbf{X}}_1}} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & {{\mathbf{X}}_c^\mathrm{T}{{\mathbf{X}}_c}} \end{matrix}} \right) \) is the block-diagonal matrix with blocks \({\mathbf{X}}_k^\mathrm{T}{{\mathbf{X}}_k}\).
As for \(\frac{\partial }{{\partial \beta }}\left( {\lambda \sum _{i = 1}^c {\left\| {{\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}} \right\| _2^2} } \right) \), it is convenient to first rewrite \({\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}\) in an equivalent form. Since \({\mathbf{X}}\beta = \left[ {{{\mathbf{X}}_1}, \ldots ,{{\mathbf{X}}_c}} \right] {\left[ {\beta _1^\mathrm{T}, \ldots ,\beta _c^\mathrm{T}} \right] ^\mathrm{T}} = {{\mathbf{X}}_1}{\beta _1} + \cdots + {{\mathbf{X}}_c}{\beta _c}\), we have \({\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i} = {{\mathbf{X}}_1}{\beta _1} + \cdots + {{\mathbf{X}}_{i - 1}}{\beta _{i - 1}} + {{\mathbf{X}}_{i + 1}}{\beta _{i + 1}} + \cdots + {{\mathbf{X}}_c}{\beta _c}\). Letting \({{\mathbf{S}}_i} = \left[ {0, \ldots ,{{\mathbf{X}}_i}, \ldots ,0} \right] \) and \({{\mathbf{Z}}_i} = {\mathbf{X}} - {{\mathbf{S}}_i} = \left[ {{{\mathbf{X}}_1}, \ldots ,{{\mathbf{X}}_{i - 1}},0,{{\mathbf{X}}_{i + 1}}, \ldots ,{{\mathbf{X}}_c}} \right] \), we obtain \({\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i} = {{\mathbf{Z}}_i}\beta \). Therefore, the derivative with respect to \(\beta \) of \(\lambda \sum _{i = 1}^c {\left\| {{\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}} \right\| _2^2} \) is
\( \frac{\partial }{{\partial \beta }}\left( {\lambda \sum _{i = 1}^c {\left\| {{{\mathbf{Z}}_i}\beta } \right\| _2^2} } \right) = 2\lambda \left[ {\sum _{i = 1}^c {{\mathbf{Z}}_i^\mathrm{T}{{\mathbf{Z}}_i}} } \right] \beta .\)
Eventually, the derivative with respect to \(\beta \) of \(\frac{1}{2}{\left( {y - {\mathbf{X}}\beta } \right) ^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) + \gamma \sum _{i = 1}^c {\sum _{j = 1,j \ne i}^c {\beta _i^\mathrm{T}{\mathbf{X}}_i^\mathrm{T}{{\mathbf{X}}_j}{\beta _j}} } + \lambda \sum _{i = 1}^c {\left\| {{\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}} \right\| _2^2} \) is
\( - {{\mathbf{X}}^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) + 2\gamma {{\mathbf{X}}^\mathrm{T}}{\mathbf{X}}\beta - 2\gamma {\mathbf{M}}\beta + 2\lambda \left[ {\sum _{i = 1}^c {{\mathbf{Z}}_i^\mathrm{T}{{\mathbf{Z}}_i}} } \right] \beta .\)
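Setting this derivative to zero yields a linear system in \(\beta \). The following NumPy sketch solves it under stated assumptions: the columns of X are grouped by class, `class_sizes` lists the number of training samples per class, and `W` is the symmetric weight matrix; the function and variable names are ours, not the paper's.

```python
import numpy as np

def solve_beta(X, y, class_sizes, W, gamma, lam):
    # Solve (X^T W X + 2*gamma*(X^T X - M) + 2*lam*sum_i Z_i^T Z_i) beta = X^T W y,
    # obtained by setting the derivative above to zero.
    n = X.shape[1]
    A = X.T @ W @ X + 2.0 * gamma * (X.T @ X)
    ZtZ = np.zeros((n, n))
    start = 0
    for n_i in class_sizes:
        idx = slice(start, start + n_i)
        X_i = X[:, idx]
        A[idx, idx] -= 2.0 * gamma * (X_i.T @ X_i)  # subtract block-diagonal M
        Z_i = X.copy()
        Z_i[:, idx] = 0.0                           # Z_i = X with class-i columns zeroed
        ZtZ += Z_i.T @ Z_i
        start += n_i
    A += 2.0 * lam * ZtZ
    return np.linalg.solve(A, X.T @ W @ y)
```

A hypothetical call is `beta = solve_beta(X, y, class_sizes, W, gamma=0.1, lam=0.1)`; classification would then typically follow the SRC/CRC rule of assigning \(y\) to the class \(i\) with the smallest class-specific residual \(\left\| {y - {{\mathbf{X}}_i}{\beta _i}} \right\| _2\).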
Appendix 2: Proof that the objective function is convex
In [49], a criterion is given for determining whether a function is convex. Specifically, suppose f is twice differentiable, namely, its second derivative or Hessian \({\nabla ^2}f\) exists and is continuous at each point in \({\mathbf{dom}}\,f\), where \({\mathbf{dom}}\,f\) is open. Then f is convex if and only if \({\mathbf{dom}}\,f\) is convex and the Hessian of f is positive semidefinite, i.e., \({\nabla ^2}f\left( x \right) \succeq 0\) for all \(x \in {\mathbf{dom}}\,f\). In addition, the following example helps explain and prove the convexity of our objective function.
Example 1
Consider the quadratic function \(f:{{\mathbf{R}}^n} \rightarrow {\mathbf{R}}\), with \({\mathbf{dom}}\,f = {{\mathbf{R}}^n}\), given by \(f\left( x \right) = \left( {1/2} \right) {x^\mathrm{T}}{\mathbf{P}}x + {q^\mathrm{T}}x + r\), where \({\mathbf{P}}\) is a symmetric \(n \times n\) matrix, \(q \in {{\mathbf{R}}^n}\), and \(r \in {\mathbf{R}}\). Since \({\nabla ^2}f\left( x \right) = {\mathbf{P}}\) for all x, f is convex if and only if \({\mathbf{P}} \succeq 0\).
Let \(g\left( \beta \right) = \frac{1}{2}{\left( {y - {\mathbf{X}}\beta } \right) ^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) + \gamma \sum _{i = 1}^c \sum _{j = 1,j \ne i}^c \beta _i^\mathrm{T}{\mathbf{X}}_i^\mathrm{T}{{\mathbf{X}}_j}{\beta _j} + \lambda \sum _{i = 1}^c {\left\| {{\mathbf{X}}\beta - {{\mathbf{X}}_i}{\beta _i}} \right\| _2^2} \).
Then, according to the aforementioned theorem and example, the function \(g\left( \beta \right) \) is convex provided that \({\nabla ^2}g\left( \beta \right) \succeq 0\), that is, \({\nabla ^2}g\left( \beta \right) \) is a positive semidefinite matrix. To determine whether a matrix is positive semidefinite, it suffices to verify that it is real symmetric and that all of its principal minors are nonnegative. From Eq. (7), we get \(\nabla g\left( \beta \right) = - {{\mathbf{X}}^\mathrm{T}}{\mathbf{W}}\left( {y - {\mathbf{X}}\beta } \right) + 2\gamma {{\mathbf{X}}^\mathrm{T}}{\mathbf{X}}\beta - 2\gamma {\mathbf{M}}\beta + 2\lambda \left[ {\sum _{i = 1}^c {{\mathbf{Z}}_i^\mathrm{T}{{\mathbf{Z}}_i}} } \right] \beta \), and hence \({\nabla ^2}g\left( \beta \right) = {{\mathbf{X}}^\mathrm{T}}{\mathbf{W}}{\mathbf{X}} + 2\gamma {{\mathbf{X}}^\mathrm{T}}{\mathbf{X}} - 2\gamma {\mathbf{M}} + 2\lambda \sum _{i = 1}^c {{\mathbf{Z}}_i^\mathrm{T}{{\mathbf{Z}}_i}} \). Because \({\nabla ^2}g\left( \beta \right) \) satisfies the above conditions for positive semidefiniteness, we conclude that our objective function is convex.
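As a practical complement to the principal-minor criterion, positive semidefiniteness can be checked numerically through the eigenvalues of the Hessian. A minimal sketch, assuming \({\mathbf{M}}\) and \(\sum _{i = 1}^c {\mathbf{Z}}_i^\mathrm{T}{{\mathbf{Z}}_i}\) are assembled as in the Appendix 1 sketch; the function name and tolerance are our assumptions.

```python
import numpy as np

def hessian_is_psd(X, W, M, ZtZ, gamma, lam, tol=1e-8):
    # Hessian of g from Appendix 2:
    # H = X^T W X + 2*gamma*(X^T X - M) + 2*lam*sum_i Z_i^T Z_i
    H = X.T @ W @ X + 2.0 * gamma * (X.T @ X - M) + 2.0 * lam * ZtZ
    H = 0.5 * (H + H.T)                         # symmetrize against round-off
    return np.linalg.eigvalsh(H).min() >= -tol  # all eigenvalues >= 0 (up to tol)
```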