Abstract
This study critically evaluates gender assignment methods in academic contexts through a comparative analysis of several techniques: an SVM classifier, gender-guesser, genderize.io, and a classifier based on Cultural Consensus Theory. Emphasizing the importance of transparency, data sources, and methodological considerations, the research introduces nomquamgender, a cultural consensus-based method, and applies it to Teseo, a Spanish dissertation database. The results reveal a substantial reduction in the number of individuals with unknown gender compared with traditional methods relying on INE data. The nuanced differences in gender distribution underscore the importance of methodological choices in gender studies and call for transparent, comprehensive, and freely accessible methods to improve the accuracy and reliability of gender assignment in academic research. A reevaluation of gender imbalances in the doctoral system shows that they remain evident, although the trend clearly points toward their reduction. Finally, specific problems in some disciplines, including STEM fields, and in seniority roles are found to be worthy of attention in the near future.
Notes
Gender labels for Spanish affiliated researchers. Available online: https://zenodo.org/doi/10.5281/zenodo.11243211
References
Abramo, G., D’Angelo, C. A., & Di Costa, F. (2019). A gender analysis of top scientists’ collaboration behavior: Evidence from Italy. Scientometrics, 120(2), 405–418. https://doi.org/10.1007/s11192-019-03136-6
Aguillo-Caño, I. F. (2022a). Ranking of researchers in Spain and Spaniards abroad. Edition 2023. https://www.webometrics.info/en/GoogleScholar/Spain
Aguillo-Caño, I. F. (2022b). Ranking de investigadoras españolas y extranjeras en España según Google Scholar. Jun 2022. https://www.webometrics.info/en/investigadoras
Andersen, J. P. (2023). Field-level differences in paper and author characteristics across all fields of science in Web of Science, 2000–2020. Quantitative Science Studies, 4(2), 394–422. https://doi.org/10.1162/qss_a_00246
Biblioteca de la Universidad de Oviedo (2022). Ranking of researchers in Spain (BCS -Uniovi). Biblioteca de ciencias de la salud de la Universidad de Oviedo. https://bcsuniovi.com/2022/12/22/ranking-of-researchers-in-spain/
Blackburn, H. (2017). The status of women in STEM in higher education: A review of the literature 2007–2017. Science & Technology Libraries, 36(3), 235–273. https://doi.org/10.1080/0194262X.2017.1371658
Blickenstaff, J. C. (2005). Women and science careers: Leaky pipeline or gender filter? Gender and Education, 17(4), 369–386. https://doi.org/10.1080/09540250500145072
Boekhout, H., van der Weijden, I., & Waltman, L. (2021). Gender differences in scientific careers: A large-scale bibliometric analysis. Preprint retrieved from https://doi.org/10.48550/arXiv.2106.12624
Borrego, Á., Barrios, M., Villarroya, A., & Ollé, C. (2010). Scientific output and impact of postdoctoral scientists: A gender perspective. Scientometrics, 83(1), 93–101. https://doi.org/10.1007/s11192-009-0025-y
Chan, H. F., & Torgler, B. (2020). Gender differences in performance of top cited scientists by field and country. Scientometrics, 125(3), 2421–2447. https://doi.org/10.1007/s11192-020-03733-w
Curiel-Marín, E., & Fernández-Cano, A. (2015). Análisis cienciométrico de tesis doctorales españolas en didáctica de las ciencias sociales (1976–2012). Revista Española De Documentación Científica, 38(4), 9. https://doi.org/10.3989/redc.2015.4.1282
El-Ouahi, J., & Larivière, V. (2023). On the lack of women researchers in the Middle East and North Africa. Scientometrics, 128(8), 4321–4348. https://doi.org/10.1007/s11192-023-04768-5
Etzkowitz, H., Kemelgor, C., Neuschatz, M., & Uzzi, B. (1992). Athena unbound: Barriers to women in academic science and engineering. Science and Public Policy, 19(3), 157–179. https://doi.org/10.1093/spp/19.3.157
Fell, C. B., & König, C. J. (2016). Is there a gender difference in scientific collaboration? A scientometric examination of co-authorships among industrial–organizational psychologists. Scientometrics, 108(1), 113–141. https://doi.org/10.1007/s11192-016-1967-5
Gaule, P., & Piacentini, M. (2018). An advisor like me? Advisor gender and post-graduate careers in science. Research Policy, 47(4), 805–813. https://doi.org/10.1016/j.respol.2018.02.011
Ghosh, R. (2022). Name based gender identification using machine learning and deep learning models. TechRxiv. https://doi.org/10.36227/techrxiv.21388140.v1
González-Salmón, E., & Robinson-García, N. (2024). A call for transparency in gender assignment approaches. Scientometrics, 129(4), 2451–2454. https://doi.org/10.1007/s11192-024-04995-4
Gulbranson, D. (2023). Nameparser: a simple Python module for parsing human names into their individual components. PyPI: The Python Package Index. https://pypi.org/project/nameparser/
Hernández-González, V., De Pano-Rodríguez, A., & Reverter-Masia, J. (2020). Spanish doctoral theses in physical activity and sports sciences and authors’ scientific publications (LUSTRUM 2013–2017). Scientometrics, 122(1), 661–679. https://doi.org/10.1007/s11192-019-03295-6
Holman, L., Stuart-Fox, D., & Hauser, C. E. (2018). The gender gap in science: How long until women are equally represented? PLoS Biology, 16(4), e2004956. https://doi.org/10.1371/journal.pbio.2004956
Huang, J., Gates, A. J., Sinatra, R., & Barabasi, A. L. (2020). Historical comparison of gender inequality in scientific careers across countries and disciplines. Proceedings of the National Academy of Sciences of the USA, 117(9), 4609–4616. https://doi.org/10.1073/pnas.1914221117
INE. (2021). Apellidos y nombres más frecuentes. Latest data from year 2020. Published: 20/05/2021. https://www.ine.es/uc/ijPGiEWy
Ioannidis, J. P., Boyack, K. W., Collins, T. A., & Baas, J. (2023). Gender imbalances among top-cited scientists across scientific disciplines over time through the analysis of nearly 5.8 million authors. PLoS Biology, 21(11), e3002385. https://doi.org/10.1371/journal.pbio.3002385
Karimi, F., Wagner, C., Lemmerich, F., Jadidi, M., & Strohmaier, M. (2016). Inferring gender from names on the Web: a comparative evaluation of gender detection methods. Proceedings of the 25th International Conference Companion on World Wide Web - WWW '16 Companion (pp. 53–54). https://doi.org/10.1145/2872518.2889385
Kim, L., Smith, D. S., Hofstra, B., & McFarland, D. A. (2022). Gendered knowledge in fields and academic careers. Research Policy, 51(1), 104411. https://doi.org/10.1016/j.respol.2021.104411
LaBerge, N., Wapman, K. H., Clauset, A., & Larremore, D. B. (2024). Gendered hiring and attrition on the path to parity for academic faculty. eLife, 13, RP93755. https://doi.org/10.7554/eLife.93755.1
Larivière, V., Ni, C., Gingras, Y., Cronin, B., & Sugimoto, C. R. (2013). Bibliometrics: Global gender disparities in science. Nature, 504, 211–213. https://doi.org/10.1038/504211a
Leo, M. S. (2021). Boy or girl? A machine learning web app to detect gender from name. Towards data science. 09 Sep 2021. https://towardsdatascience.com/boy-or-girl-a-machine-learning-web-app-to-detect-gender-from-name-16dc0331716c
Lin, Z., Yin, Y., Liu, L., & Wang, D. (2023). SciSciNet: A large-scale open data lake for the science of science research. Scientific Data, 10(1), 315. https://doi.org/10.1038/s41597-023-02198-9
Macaluso, B., Larivière, V., Sugimoto, T., & Sugimoto, C. R. (2016). Is science built on the shoulders of women? A study of gender differences in contributorship. Academic Medicine, 91(8), 1136–1142. https://doi.org/10.1097/ACM.0000000000001261
Malmasi, S., & Dras, M. (2014). A data-driven approach to studying given names and their gender and ethnicity associations. Proceedings of the Australasian Language Technology Association Workshop 2014 (pp. 145–149). https://aclanthology.org/U14-1021.pdf
Maz-Machado, A., Gutiérrez-Rubio, D., Madrid, M. J., & Pedrosa-Jesús, C. (2022). A look at doctoral theses in mathematics education at Andalusian Universities (2010–2020) from a gender perspective. TEM Journal, 11(3), 1007–1012. https://doi.org/10.18421/TEM113-03
Mihaljević, H., Tullney, M., Santamaría, L., & Steinfeldt, C. (2019). Reflections on gender analyses of bibliographic corpora. Frontiers in Big Data. https://doi.org/10.3389/fdata.2019.00029
Musi-Lechuga, B., Olivas-Ávila, J. A., & Buela-Casal, G. (2009). Producción científica de los programas de doctorado en psicología clínica y de la salud de España. International Journal of Clinical and Health Psychology, 9(1), 161–173.
Nicholas, D., Watkinson, A., Boukacem-Zeghmouri, C., Rodríguez-Bravo, B., Xu, J., Abrizah, A., Świgoń, M., & Herman, E. (2017). Early career researchers: Scholarly behaviour and the prospect of change. Learned Publishing, 30(2), 157–166. https://doi.org/10.1002/leap.1098
Olivas-Avila, J. A., & Musi-Lechuga, B. (2010). Doctoral theses production of the more productive Spanish psychology professors in the Web of Science. Psicothema, 22(4), 917–923.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
Prim-Espada, M. P., De Diego-Sastre, J. I., & Pérez-Fernández, E. (2010). Gender patterns in Spanish otolaryngologic doctoral theses. Acta Otorrinolaringologica (English Edition), 61(5), 358–364. https://doi.org/10.1016/S2173-5735(10)70065-X
Ramos-Pardo, F. J., & Sánchez-Antolín, P. (2017). Production of educational theory doctoral theses in Spain (2001–2015). Scientometrics, 112(3), 1615–1630. https://doi.org/10.1007/s11192-017-2435-6
Repiso-Caballero, R., Torres-Salinas, D., & Delgado-López-Cózar, E. (2011). Análisis de la investigación sobre radio en España: Una aproximación a través del análisis bibliométrico y de redes sociales de las tesis doctorales defendidas en España entre 1976–2008. Estudios Sobre El Mensaje Periodístico, 17(2), 417–430. https://doi.org/10.5209/rev_ESMP.2011.v17.n2.38123
Reybold, L. E., Brazer, S. D., Schrum, L., & Corda, K. W. (2012). The politics of dissertation advising: How early career women faculty negotiate access and participation. Innovative Higher Education, 37, 227–242. https://doi.org/10.1007/s10755-011-9200-1
Saeta-Pérez, I. (2016). Gender-guesser. 5 Dec 2016. https://pypi.org/project/gender-guesser/
Sánchez-Jiménez, R., Blázquez-Ochando, M., Montesi, M., & Botezan, I. (2017). La producción de tesis doctorales en España (1995–2014): Evolución, disciplinas, principales actores y comparación con la producción científica en WoS y Scopus. Revista Española De Documentación Científica, 40(4), e188. https://doi.org/10.3989/redc.2017.4.1409
Sánchez-Jiménez, R., Botezan, I., Barrasa-Rodríguez, J., Suárez-Figueroa, M. C., & Blázquez-Ochando, M. (2023). Gender imbalance in doctoral education: An analysis of the Spanish university system (1977–2021). Scientometrics, 128(4), 2577–2599. https://doi.org/10.1007/s11192-023-04648-y
Santamaría, L., & Mihaljević, H. (2018). Comparison and benchmark of name-to-gender inference services. PeerJ Computer Science, 4, e156. https://doi.org/10.7717/peerj-cs.156
Schiebinger, L. (1987). The history and philosophy of women in science: A review essay. Signs, 12(2), 305–332.
Spoon, K., LaBerge, N., Wapman, K. H., Zhang, S., Morgan, A. C., Galesic, M., Fosdick, B. K., Larremore, D. B., & Clauset, A. (2023). Gender and retention patterns among U.S. faculty. Science Advances, 9(42), eadi2205. https://doi.org/10.1126/sciadv.adi2205
Surawicz, C. M. (2016). Women in leadership: Why so few and what to do about it. Journal of the American College of Radiology, 13(12), 1433–1437. https://doi.org/10.1016/j.jacr.2016.08.026
Van Buskirk, I., Clauset, A., & Larremore, D. B. (2023). An open-source cultural consensus approach to name-based gender classification. Proceedings of the International AAAI Conference on Web and Social Media, 17, 866–877. https://doi.org/10.1609/icwsm.v17i1.22195
Villarroya, A., Barrios, M., Borrego, A., & Frías, A. (2008). PhD theses in Spain: A gender study covering the years 1990–2004. Scientometrics, 77(3), 469–483. https://doi.org/10.1007/s11192-007-1965-8
West, J. D., Jacquet, J., King, M. M., Correll, S. J., & Bergstrom, C. T. (2013). The role of gender in scholarly authorship. PLoS ONE, 8(7), e66212. https://doi.org/10.1371/journal.pone.0066212
White, K. (2004). The leaking pipeline: Women postgraduate and early career researchers in Australia. Tertiary Education and Management, 10(3), 227–241. https://doi.org/10.1080/13583883.2004.9967129
Zuckerman, H., & Cole, J. R. (1975). Women in American science. Minerva, 13(1), 82–102.
About this article
Cite this article
Matias-Rayme, N., Botezan, I., Suárez-Figueroa, M.C. et al. Gender assignment in doctoral theses: revisiting Teseo with a method based on cultural consensus theory. Scientometrics 129, 4553–4572 (2024). https://doi.org/10.1007/s11192-024-05079-z
DOI: https://doi.org/10.1007/s11192-024-05079-z