Multi-view clustering with exemplars for scientific mapping

Scientometrics

Abstract

Scientific mapping has become an important subject in scientometrics. Journal clustering can provide insights into both the internal relations among journals and the evolution of research. In this paper, we apply the affinity propagation (AP) algorithm to cluster scientific journals. AP identifies clusters by detecting their representative points (exemplars) through message passing among the data points. Compared with other clustering algorithms, it provides a representative for each cluster and does not require the number of clusters to be specified in advance. Because the input of AP is the similarity matrix among data points, it can be applied to various forms of data with different similarity metrics. We extract similarity matrices from the journal data set in both the cross-citation view and the text view, and use AP to cluster the journals in each view. Through empirical analysis, we conclude that the clustering results from the two single views are highly complementary. We therefore combine text information with cross-citation information using a simple averaging scheme and apply AP to conduct multi-view clustering, which aims to obtain refined clusters by integrating information from multiple views. Experiments on the Web of Science journal data set verify that, with the text and citation views integrated, AP obtains better clustering results, as expected.
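The pipeline summarized in the abstract (one similarity matrix per view, a simple average of the views, then AP on the fused matrix) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the toy matrices `text_view` and `citation_view`, the function names, the median-valued preference, and the damping factor of 0.5 are all assumptions made for the example. The message updates follow the standard responsibility and availability rules of Frey and Dueck (2007).

```python
from statistics import median

def average_views(views):
    """Fuse equally sized similarity matrices by their element-wise mean."""
    n = len(views[0])
    return [[sum(v[i][k] for v in views) / len(views) for k in range(n)]
            for i in range(n)]

def affinity_propagation(similarity, damping=0.5, iterations=200):
    """Plain-Python AP; returns (exemplar indices, exemplar label per point)."""
    n = len(similarity)
    S = [row[:] for row in similarity]
    # Assumption: the shared preference is the median off-diagonal similarity.
    preference = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        S[i][i] = preference
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            first = vals[best]
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                new = S[i][k] - (second if k == best else first)
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        # a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        # a(k,k) = sum over i' != k of max(0, r(i',k))
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            support = sum(pos) - pos[k]
            for i in range(n):
                new = support if i == k else min(0.0, R[k][k] + support - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:  # fallback: keep the single strongest candidate
        exemplars = [max(range(n), key=lambda k: R[k][k] + A[k][k])]
    labels = [i if i in exemplars else max(exemplars, key=lambda e: S[i][e])
              for i in range(n)]
    return exemplars, labels

# Hypothetical toy data: six "journals" placed on a line in two groups.
points = [0, 1, 2, 10, 11, 12]
base = [[-(a - b) ** 2 for b in points] for a in points]
# Each view distorts the underlying similarities in the opposite direction,
# so neither view is clean on its own but their average is.
noise = [[3 if (i + k) % 2 else -3 for k in range(6)] for i in range(6)]
text_view = [[base[i][k] + noise[i][k] for k in range(6)] for i in range(6)]
citation_view = [[base[i][k] - noise[i][k] for k in range(6)] for i in range(6)]

fused = average_views([text_view, citation_view])
exemplars, labels = affinity_propagation(fused)
print("exemplars:", exemplars)
print("labels:", labels)
```

Each entry of `labels` is the index of the exemplar that represents the corresponding point, so the exemplars double as cluster identifiers. A weighted combination of the views would only require changing `average_views`; the clustering step is unaffected because AP sees nothing but the fused similarity matrix.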



Author information

Corresponding author

Correspondence to Xinhai Liu.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 50 kb)

About this article

Cite this article

Meng, X., Liu, X., Tong, Y. et al. Multi-view clustering with exemplars for scientific mapping. Scientometrics 105, 1527–1552 (2015). https://doi.org/10.1007/s11192-015-1682-7

  • DOI: https://doi.org/10.1007/s11192-015-1682-7
