Abstract
The large number of publicly available bibliographic repositories on the web provides great opportunities to study the scientific behavior of scholars. This paper aims to study the way we collaborate, model the dynamics of collaborations and predict future collaborations among authors. We investigate collaborations in three disciplines (physics, computer science and information science) and different kinds of features that may influence the creation of collaborations. Path-based features are found to be particularly useful in predicting collaborations. Moreover, the combination of path-based and attribute-based features achieves almost the same performance as the combination of all features considered. Inspired by these findings, we propose an agent-based model to simulate the dynamics of collaborations. The model merges the ideas of network structure and node attributes by leveraging a random walk mechanism and interest similarity. Empirical results show that the model can reproduce a number of realistic and critical network statistics and patterns. We further apply the model to predict collaborations in an unsupervised manner and compare it with several state-of-the-art approaches. The proposed model achieves the best predictive performance among the random baseline and the other approaches compared. The results suggest that both network structure and node attributes may play an important role in shaping the evolution of collaboration networks.
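As a rough illustration of what merging network structure with node attributes can look like in an unsupervised setting, the sketch below ranks potential collaborators by multiplying a random-walk-with-restart visit frequency by the cosine similarity of interest vectors. It is not the model proposed in the paper: the function names, the toy co-authorship graph, the interest vectors, the restart probability and the multiplicative combination are all assumptions made for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two sparse interest vectors (topic -> weight)."""
    dot = sum(w * v.get(topic, 0.0) for topic, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def walk_visits(graph, start, steps, restart, rng):
    """Count node visits of a random walk that restarts at `start`."""
    visits = {}
    node = start
    for _ in range(steps):
        if rng.random() < restart or not graph[node]:
            node = start  # jump back to the source author
        else:
            node = rng.choice(sorted(graph[node]))  # move to a co-author
        visits[node] = visits.get(node, 0) + 1
    return visits


def score_candidates(graph, interests, author, steps=5000, restart=0.2, seed=0):
    """Rank non-neighbors of `author` by visit frequency times interest similarity.

    Hypothetical scoring rule, chosen only to show structure and attributes
    being combined; the paper's actual model may differ.
    """
    rng = random.Random(seed)
    visits = walk_visits(graph, author, steps, restart, rng)
    scores = {}
    for other, count in visits.items():
        if other == author or other in graph[author]:
            continue  # skip the author and existing collaborators
        similarity = cosine(interests[author], interests[other])
        scores[other] = (count / steps) * similarity
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy co-authorship network (undirected, stored as adjacency sets).
graph = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b"},
    "d": {"b", "e"},
    "e": {"d"},
}
# Toy research-interest vectors.
interests = {
    "a": {"ml": 1.0},
    "b": {"ml": 1.0, "physics": 1.0},
    "c": {"ml": 1.0},
    "d": {"physics": 1.0},
    "e": {"ml": 0.5, "physics": 0.5},
}

ranked = score_candidates(graph, interests, "a")
for name, score in ranked:
    print(name, round(score, 3))
```

In this toy setting, author `c` is both structurally close to `a` and shares its interests, so it is ranked first, while `d` is equally close but has no interests in common with `a` and receives a score of zero.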
Acknowledgments
This work is partially supported by grants from the Natural Science Foundation of China (Nos. 61277370, 61402075), the Natural Science Foundation of Liaoning Province, China (No. 201202031), and the Fundamental Research Funds for the Central Universities.
Cite this article
Sun, X., Lin, H., Xu, K. et al. How we collaborate: characterizing, modeling and predicting scientific collaborations. Scientometrics 104, 43–60 (2015). https://doi.org/10.1007/s11192-015-1597-3
DOI: https://doi.org/10.1007/s11192-015-1597-3