Abstract
This study addresses the problem of predicting the success of scientific collaborations from scholars' previous collaborations using machine learning techniques. Because exploiting the collaboration network is essential in collaborator discovery systems, this article attempts to understand how to exploit the information embedded in such networks. We use the link structure among scholars, and between scholars and concepts, to extract a set of features that are correlated with collaboration success and that increase prediction performance. We also examine the effect of using aggregate methods other than the average and maximum when computing collaboration features from the features of the members. A dataset extracted from Northwestern University’s SciVal Expert is used to evaluate the proposed approach. The results demonstrate that the proposed collaboration features increase prediction performance when combined with widely used features such as the h-index and average citation counts. Consequently, the introduced features are suitable for incorporation into collaborator discovery systems.
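As an illustration of computing collaboration features from member features, the following minimal sketch aggregates one per-member value (here a hypothetical h-index) into team-level features. Only the average and maximum are named in the abstract; the minimum, sum, and standard deviation, as well as the function and feature names, are assumptions made for this example and are not necessarily the aggregates studied in the article.

```python
from statistics import mean, pstdev

# Aggregators applied to a per-member feature. "avg" and "max" are named in
# the abstract; "min", "sum", and "std" are assumptions for illustration only.
AGGREGATORS = {
    "avg": mean,
    "max": max,
    "min": min,
    "sum": sum,
    "std": pstdev,  # population standard deviation
}


def collaboration_features(member_values, name="h_index"):
    """Aggregate one per-member feature into several team-level features."""
    if not member_values:
        raise ValueError("a collaboration needs at least one member")
    return {
        f"{name}_{agg}": float(fn(member_values))
        for agg, fn in AGGREGATORS.items()
    }


# Hypothetical h-index values for a three-member collaboration.
team = [12, 30, 7]
features = collaboration_features(team)
```

Each aggregate becomes one column of the feature vector for a collaboration, so adding an aggregator widens the feature space without requiring any new per-member data.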
Notes
Computer supported cooperative work.
www.linkedin.com.
Medical Subject Heading.
SciVal Expert assigns a unique identifier (uid) to each scholar.
Multilayer Perceptron.
Additional information
An erratum to this article is available at http://dx.doi.org/10.1007/s11192-016-2170-4.
Cite this article
Ghasemian, F., Zamanifar, K., Ghasem-Aqaee, N. et al. Toward a better scientific collaboration success prediction model through the feature space expansion. Scientometrics 108, 777–801 (2016). https://doi.org/10.1007/s11192-016-1999-x
DOI: https://doi.org/10.1007/s11192-016-1999-x