DOI: 10.1145/3477495.3531879
short-paper
Open access

Assessing Scientific Research Papers with Knowledge Graphs

Published: 07 July 2022

Abstract

In recent decades, the growing scale of scientific research has produced numerous novel findings. Reproducing these findings is the foundation of future research. However, because experiments are complex, manually assessing scientific research is laborious and time-intensive, especially in the social and behavioral sciences. Although reproducibility studies have garnered increasing attention in the research community, there is still no systematic way to evaluate scientific research at scale. In this paper, we propose a novel approach to automatically assessing scientific publications by constructing a knowledge graph (KG) that captures a holistic view of the research contributions. Specifically, during KG construction, we combine information from two perspectives: micro-level features that capture knowledge from published articles, such as sample sizes, effect sizes, and experimental models, and macro-level features that comprise relationships between entities, such as authorship and reference information. We then use language models and knowledge graph embeddings to learn low-dimensional representations of the entities (nodes in the KG), which are then used for the assessments. A comprehensive set of experiments on two benchmark datasets shows the usefulness of leveraging KGs for scoring scientific research.
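
The construction described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the paper and author identifiers, the feature values, the relation names (`authored_by`, `cites`, `has_*`), and the choice to turn each micro-level value into its own node. The abstract does not name a specific embedding model, so a TransE-style distance is swapped in here as one common option, and the embeddings are random and untrained.

```python
import math
import random

# Hypothetical toy data: every identifier and value below is invented.
# Macro-level triples: relationships between entities (authorship, references).
macro_triples = [
    ("paper:A", "authored_by", "author:X"),
    ("paper:B", "authored_by", "author:X"),
    ("paper:A", "cites", "paper:B"),
]

# Micro-level features: values extracted from each published article.
micro_features = {
    "paper:A": {"sample_size": 120, "effect_size": 0.45, "model": "ANOVA"},
    "paper:B": {"sample_size": 48, "effect_size": 0.12, "model": "t-test"},
}


def micro_to_triples(features):
    """Turn per-paper attribute values into (head, relation, tail) triples."""
    triples = []
    for paper, attrs in sorted(features.items()):
        for name, value in sorted(attrs.items()):
            triples.append((paper, f"has_{name}", f"{name}:{value}"))
    return triples


def build_graph(macro, micro):
    """Merge both perspectives into one triple list plus its vocabularies."""
    triples = list(macro) + micro_to_triples(micro)
    entities = sorted({t[0] for t in triples} | {t[2] for t in triples})
    relations = sorted({t[1] for t in triples})
    return triples, entities, relations


def init_embeddings(names, dim, rng):
    """Random low-dimensional vectors; a real pipeline would train these."""
    return {n: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for n in names}


def transe_distance(h, r, t):
    """TransE-style distance ||h + r - t||_2; lower means more plausible."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


rng = random.Random(0)
triples, entities, relations = build_graph(macro_triples, micro_features)
ent_emb = init_embeddings(entities, 8, rng)
rel_emb = init_embeddings(relations, 8, rng)

# Distance of each triple under the untrained embeddings.
for head, rel, tail in triples:
    d = transe_distance(ent_emb[head], rel_emb[rel], ent_emb[tail])
    print(f"{head} --{rel}--> {tail}: {d:.3f}")
```

In a trained setting, the vector for each paper node would serve as the input to a downstream scoring model. Representing numeric values as discrete nodes is the simplest option and loses their ordering, so a literal-aware embedding method may be a better fit for features such as sample size.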

Supplementary Material

MP4 File (SIGIR22-sp1936.mp4)
A short version of the presentation video of the research work on constructing knowledge graphs to assess scientific publications.

Published In

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2022
3569 pages
ISBN: 9781450387323
DOI: 10.1145/3477495
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. knowledge graph
2. reproducibility
3. social and behavioral sciences

Conference

SIGIR '22

Acceptance Rates

Overall acceptance rate: 792 of 3,983 submissions (20%)