Multi-view clustering with exemplars for scientific mapping

Xiangfeng Meng¹,
Xinhai Liu²,³,
YunHai Tong¹,
Wolfgang Glänzel⁴,⁵ &
…
Shaohua Tan¹


Abstract

Scientific mapping has become an important subject in scientometrics. Journal clustering can provide insight into both the relations among journals and the evolution of research. In this paper, we apply the affinity propagation (AP) algorithm to the clustering of scientific journals. AP identifies clusters by detecting their representative points (exemplars) through message passing among the data points. Compared with other clustering algorithms, it provides a representative for each cluster and does not require the number of clusters to be specified in advance. Because the input of AP is a similarity matrix over the data points, it can be applied to many kinds of data sets with different similarity metrics. We extract similarity matrices from the journal data set in both the cross-citation view and the text view, and cluster the journals with AP in each view. Empirical analysis shows that the clustering results from the two single views are highly complementary. We therefore combine text information with cross-citation information using a simple average scheme and apply AP to perform multi-view clustering, which aims to obtain refined clusters by integrating information from multiple views. Experiments on the Web of Science journal data set verify that, with the text and citation views integrated, AP obtains better clustering results, as expected.
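The pipeline described in the abstract (average the per-view similarity matrices, then run AP on the result) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The six-journal toy views, the function names (`toy_view`, `average_views`, `affinity_propagation`), the preference of 0.0, the damping factor of 0.5 and the fixed iteration count are all assumptions made for illustration. The message updates are the standard responsibility and availability rules of AP.

```python
def toy_view(core, sibling, cross):
    """Hypothetical 6-journal similarity view (illustrative data only).

    Journals 0-2 and 3-5 form two groups; journals 0 and 3 are the most
    central member of their group.
    """
    group = [0, 0, 0, 1, 1, 1]
    hubs = {0, 3}
    n = len(group)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # diagonal is set later from the preference
            if group[i] != group[j]:
                S[i][j] = cross
            elif i in hubs or j in hubs:
                S[i][j] = core
            else:
                S[i][j] = sibling
    return S


def average_views(views):
    """Simple average scheme: element-wise mean of the view matrices."""
    n = len(views[0])
    return [[sum(v[i][j] for v in views) / len(views) for j in range(n)]
            for i in range(n)]


def affinity_propagation(S, preference, damping=0.5, iterations=200):
    """Return (exemplars, labels) for a square similarity matrix S."""
    n = len(S)
    S = [row[:] for row in S]
    for k in range(n):
        S[k][k] = preference  # self-similarity controls how many exemplars emerge
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                rival = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(k,k) = sum of positive r(i',k) over i' != k
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k) over i' not in {i,k})
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [], []  # no exemplar emerged; try a higher preference
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


# Two hypothetical views of the same journals, combined by simple averaging.
text_view = toy_view(core=1.0, sibling=0.8, cross=0.2)
citation_view = toy_view(core=0.8, sibling=0.4, cross=0.0)
combined = average_views([text_view, citation_view])
exemplars, labels = affinity_propagation(combined, preference=0.0)
print("exemplars:", exemplars)
print("labels:", labels)
```

Each label is the index of the exemplar journal that represents the cluster, so the output names a representative per cluster without a preset cluster count. Lowering the preference tends to give fewer clusters, and raising it tends to give more.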



Author information

Authors and Affiliations

Key Laboratory of Machine Perception (Ministry of Education), Center for Information Science, Peking University, 100871, Beijing, China
Xiangfeng Meng, YunHai Tong & Shaohua Tan
Credit Reference Center, People’s Bank of China, Beijing, China
Xinhai Liu
Center of Financial Intelligence Research, Peking University, Beijing, China
Xinhai Liu
Department of MSI, Center for R&D Monitoring (ECOOM), Katholieke Universiteit Leuven, Waaistraat 6, 3000, Leuven, Belgium
Wolfgang Glänzel
IRPS, Hungarian Academy of Sciences, Budapest, Hungary
Wolfgang Glänzel

Corresponding author

Correspondence to Xinhai Liu.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 50 kb)

About this article

Cite this article

Meng, X., Liu, X., Tong, Y. et al. Multi-view clustering with exemplars for scientific mapping. Scientometrics 105, 1527–1552 (2015). https://doi.org/10.1007/s11192-015-1682-7

Received: 19 December 2014
Published: 04 September 2015
Issue Date: December 2015
DOI: https://doi.org/10.1007/s11192-015-1682-7
