Abstract
Clustering has been extensively explored in pattern recognition and data mining in order to facilitate various applications. Due to the presence of data noise, traditional clustering approaches may become vulnerable and unreliable, thereby degrading clustering performance. In this paper, we propose a robust spectral clustering approach, termed Non-negative Low-rank Self-reconstruction (NLS), which simultaneously a) explores the nonnegative low-rank properties of data correlation as well as b) adaptively models the structural sparsity of data noise. Specifically, in order to discover the intrinsic correlation among data, we devise a self-reconstruction approach to jointly consider the nonnegativity and low-rank property of data correlation matrix. Meanwhile, we propose to model data noise via a structural norm, i.e., ℓp,2-norm, which not only naturally conforms to genuine patterns of data noise in real-world situations, but also provides more adaptivity and flexibility to different noise levels. Extensive experiments on various real-world datasets illustrate the advantage of the proposed robust spectral clustering approach compared to existing clustering methods.
Similar content being viewed by others
References
Aha, D.W., Kibler, D.F., Albert, M.K.: Instance-based prediction of real-valued attributes, vol. 5 (1989)
Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J.: Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE T PAMI 19(7), 711–720 (1997)
Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation 15(6), 1373–1396 (2003)
Belkin, M., Niyogi, P., Sindhwani, V.: Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. JMLR 7, 2399–2434 (2006)
Borjigin, S, Guo, C.: Perturbation analysis for the normalized Laplacian matrices in the multiway spectral clustering method. Sci. China Inf. Sci. 57(11), 112102 (2014). https://doi.org/10.1007/s11432-014-5156-y
Cai, J.-F., Candès, E.J., Shen, Z.: A singular value thresholding algorithm for matrix completion. SIAM J Optim 20(4), 1956–1982 (2010)
Cendrowska, J.: Prism: an algorithm for inducing modular rules. International Journal of Man-Machine Studies 27(4), 349–370 (1987)
Chen, L., Xu, D., Tsang, I.-H., Li, X.: Spectral embedded hashing for scalable image retrieval. IEEE TCYB 44(7), 1180–1190 (2014)
De la Torre, F., Kanade, T.: Discriminative cluster analysis. In: ICML, pp. 241–248 (2006)
Deng, F., Lu, J., Wang, S., Pan, J., Zhang, L.: A distributed PDP model based on spectral clustering for improving evaluation performance. World Wide Web 22(4), 1555–1576 (2019)
Ding, C., Li, T.: Adaptive dimension reduction using discriminant analysis and k-means clustering. In: ICML, pp. 521–528 (2007)
Elhamifar, E., Vidal, R.: Sparse subspace clustering. In: CVPR, pp. 2790–2797 (2009)
Felzenszwalb, P., Huttenlocher, D.: Efficient graph-based image segmentation. IJCV 59(2), 167–181 (2004)
Filippone, M., Camastra, F., Masulli, F., Rovetta, S.: A survey of kernel and spectral methods for clustering. PR 41(1), 176–190 (2008)
Gordon, S., Greenspan, H., Goldberger, J.: Applying the information bottleneck principle to unsupervised clustering of discrete and continuous image representations. In: CVPR, pp. 370–377 (2003)
Graham, D., Allinson, N.: Characterising virtual eigensignatures for general purpose face recognition. NATO ASI series. Series F: Computer and System Sciences 163, 446–456 (1998)
Hammouda, K., Kamel, M.: Efficient phrase-based document indexing for web document clustering. TKDE 16(10), 1279–1296 (2004)
He, S., Wang, B., Wang, Z., Yang, Y., Shen, F., Huang, Z., Shen, H.T.: Bidirectional discrete matrix factorization hashing for image search. IEEE Transactions on Cybernetics (2019)
Hu, M., Yang, Y., Shen, F., Xie, N., Hong, R., Shen, H.T.: Collective reconstructive embeddings for cross-modal hashing. IEEE Trans. Image Process. 28(6), 2770–2784 (2019)
Huang, Q., Wang, T., Tao, D., Li, X.: Biclustering learning of trading rules. IEEE TCYB 45(10), 2287–2298 (2015)
Jain, A., Murty, M., Flynn, P.: Data clustering: a review. ACM Comput. Surv. 31(3), 264–323 (1999)
Jia, J., Yu, N., Hua, X.: Annotating personal albums via web mining. In: ACM Multimedia, pp. 459–468 (2008)
Jiang, D., Tang, C., Zhang, A.: Cluster analysis for gene expression data: a survey. TKDE 16(11), 1370–1386 (2004)
Li, J., Wang, J.: Real-time computerized annotation of pictures. TPAMI 30 (6), 985–1002 (2008)
Li, C., Chang, E., Garcia-Molina, H., Wiederhold, G.: Clustering for approximate similarity search in high-dimensional spaces. TKDE 14(4), 792–808 (2002)
Li, J., Lu, K., Huang, Z., Zhu, L., Shen, H.T.: Heterogeneous domain adaptation through progressive alignment. IEEE TNNLS 30(5), 1381–1391 (2018)
Li, J., Lu, K., Huang, Z., Zhu, L., Shen, H.T.: Transfer independently together: a generalized framework for domain adaptation. IEEE TCYB 49(6), 2144–2155 (2018)
Li, C., Xu, Z., Quiao, C., Luo, T.: Hierarchical clustering driven by cognitive features. Sci. China Inf. Sci. 57(1), 12109 (2014). https://doi.org/10.1007/s11432-013-4858-x
Liu, G., Lin, Z., Yan, S., Sun, J., Yu, Y., Ma, Y.: Robust recovery of subspace structures by low-rank representation. IEEE TPAMI 35(1), 171–184 (2013)
Liu, G., Xu, H., Tang, J., Liu, Q., Yan, S.: A deterministic analysis for lrr. IEEE TPAMI 38(3), 417–430 (2016)
Lyons, M., Akamatsu, S., Kamachi, M., Gyoba, J.: Coding facial expressions with gabor wavelets. In: Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition, pp. 200–205 (1998)
Meng, D., Leung, Y., Xu, Z.: Detecting intrinsic loops underlying data manifold. IEEE Trans. Knowl. Data Eng. 25(2), 337–347 (2013)
Nie, F., Xu, D., Tsang, I., Zhang, C.: Spectral embedded clustering. In: IJCAI, pp. 1181–1186 (2009)
Nie, F., Huang, H., Cai, X., Ding, C.H.: Efficient and robust feature selection via joint ℓ2, 1-norms minimization. In: NIPS, pp. 1813–1821 (2010)
Shi, J., Malik, J.: Normalized cuts and image segmentation. TPAMI 22(8), 888–905 (2000)
Strehl, A., Ghosh, J.: Cluster ensembles—a knowledge reuse framework for combining multiple partitions. JMLR 3, 583–617 (2003)
Wang, X., Zhang, L., Li, X., Ma, W.: Annotating images by mining image search results. TPAMI 30(11), 1919–1932 (2008)
Wang, F., Zhang, C., Li, T.: Clustering with local and global regularization. TKDE 21(12), 1665–1678 (2009)
Wang, Z., Na, C., Ma, Z., Chen, S., Song, L., Yang, Y.: Exploring nonnegative and low-rank correlation for noise-resistant spectral clustering. In: Apweb-Waim, pp. 12–26 (2019)
Wu, M., Scholkopf, B.: A local learning approach for clustering. NIPS 19, 1529–1536 (2007)
Xiao, Y., Zhu, Z., Zhao, Y., Wei, Y., Wei, S., Li, X.: Topographic nmf for data representation. IEEE TCYB 44(10), 1762–1771 (2014)
Yang, Y., Xu, D., Nie, F., Yan, S., Zhuang, Y.: Image clustering using local discriminant models and global integration. TIP 19(10), 2761–2773 (2010)
Yang, Y., Shen, H., Nie, F., Ji, R., Zhou, X.: Nonnegative spectral clustering with discriminative regularization. In: AAAI, pp. 555–560 (2011)
Yang, Y., Yang, Y., Shen, H.T., Zhang, Y., Du, X., Zhou, X.: Discriminative nonnegative spectral clustering with out-of-sample extension. IEEE TKDE 25(8), 1760–1771 (2013)
Yang, Y., Ma, Z., Yang, Y., Nie, F., Shen, H.T.: Multitask spectral clustering by exploring intertask correlation. IEEE Trans. Cyber. 45(5), 1083–1094 (2015)
Yang, Y., Shen, F., Shen, H.T., Li, H., Li, X.: Robust discrete spectral hashing for large-scale image semantic indexing. IEEE Transactions on Big Data 1(4), 162–171 (2015)
Yang, Y., Shen, F., Huang, Z., Shen, H.T.: A unified framework for discrete spectral clustering. In: IJCAI, pp. 2273–2279 (2016)
Yang, Y., Shen, F., Huang, Z., Shen, H.T., Li, X.: Discrete nonnegative spectral clustering. IEEE Trans. Knowl. Data Eng. 29(9), 1834–1845 (2017)
Ye, J., Zhao, Z., Liu, H.: Adaptive distance metric learning for clustering. In: CVPR, pp. 1–7 (2007)
Ye, J., Zhao, Z., Wu, M.: Discriminative k-means for clustering. NIPS 20, 1649–1656 (2007)
Yu, S., Shi, J.: Multiclass Spectral Clustering. In: ICCV, pp. 313–319 (2003)
Zelnik-Manor, L., Perona, P.: Self-tuning spectral clustering. NIPS 17, 1601–1608 (2004)
Zhang, Z., Liu, L., Shen, F., Shen, H.T., Shao, L.: Binary multi-view clustering. IEEE TPAMI 41(7), 1774–1782 (July 2019)
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Lin Zuo and Zheng Wang contributed equally to this work.
This article belongs to the Topical Collection: Special Issue on Web and Big Data 2019
Guest Editors: Jie Shao, Man Lung Yiu, and Toyoda Masashi
Appendices
Appendix A: Proof of Lemma 1
Proof
Inspired by [34], we consider the following function
where p ∈ (0, 2). We expect to show that when a > 0, f(a) ≥ 0. The first and second order derivatives of the function in (28) are \(f^{\prime }(a ) = 2pa - 2p{a^{p - 1}}\) and \(f^{\prime \prime }(a) = 2p - 2p(p - 1){a^{p - 2}}\), respectively. We can see that a = 1 is the only point that satisfies \(f^{\prime }(a)=0\). Also, when 0 < a < 1, \(f^{\prime }(a )<0\) and when a > 1, \(f^{\prime }(a )>0\). This means that f(a) is monotonically decreasing when 0 < a < 1 and monotonically increasing when a > 1. Moreover, we have \(f^{\prime \prime }(1)=2p(2-p)>0\). Therefore, for ∀a > 0, f(a) ≥ f(1) = 0.
Then, by substituting \(a = \frac {\| \tilde {\varepsilon }_{i} \|}{\|\varepsilon _{i}\|}\) into (28), we obtain the conclusion
□
Appendix B: Proof of Theorem 1
Proof
Denote \({\mathcal{L}}(\mathcal {E}) = \frac {1}{2} \left \| \mathcal {E} - \left (X - XW + \frac {P}{\alpha }\right ) \right \|_{F}^{2}\) and \(\lambda ^{\prime } = \lambda /\alpha \). Suppose \(\tilde {\mathcal {E}}\) is the optimized solution of the alternative problem (18), then we obtain the following conclusion:
Given the conclusion of Lemma 2, we finally arrive at
Hence, the value of the objective function in (17) monotonically decreases in each iteration. □
Rights and permissions
About this article
Cite this article
Wang, Z., Zuo, L., Ma, J. et al. Exploring nonnegative and low-rank correlation for noise-resistant spectral clustering. World Wide Web 23, 2107–2127 (2020). https://doi.org/10.1007/s11280-020-00802-1
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11280-020-00802-1