DOI: 10.1145/3534678.3539023
research-article
Public Access

ChemicalX: A Deep Learning Library for Drug Pair Scoring

Published: 14 August 2022 Publication History

Abstract

In this paper, we introduce ChemicalX, a PyTorch-based deep learning library that provides a range of state-of-the-art models for the drug pair scoring task. The primary objective of the library is to make deep drug pair scoring models accessible to machine learning researchers and practitioners in a streamlined framework. The design of ChemicalX reuses existing high-level model training utilities, geometric deep learning layers, and deep chemistry layers from the PyTorch ecosystem. Our system provides neural network layers, custom pair scoring architectures, data loaders, and batch iterators for end users. We showcase these features with example code snippets and case studies that highlight the characteristics of ChemicalX. A range of experiments on real-world drug-drug interaction, polypharmacy side effect, and combination synergy prediction tasks demonstrates that the models available in ChemicalX are effective at solving the pair scoring task. Finally, we show that ChemicalX can be used to train and score machine learning models on large drug pair datasets with hundreds of thousands of compounds on commodity hardware.
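
To make the pair scoring task concrete, the following is a minimal sketch in plain Python, not the ChemicalX API. It assumes each example consists of feature vectors for a left drug, a right drug, and a biological context, plus a binary label, and it swaps the deep architectures of the paper for a simple logistic model over the concatenated features. All names (`DrugPair`, `batch_iterator`, `LogisticPairScorer`) and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random
from dataclasses import dataclass
from typing import Iterator, List, Sequence


@dataclass
class DrugPair:
    """One labeled example of the pair scoring task (hypothetical layout)."""
    left: List[float]     # features of the first drug, e.g. a fingerprint
    right: List[float]    # features of the second drug
    context: List[float]  # features of the context, e.g. a cell line
    label: float          # 1.0 if the pair shows the effect, else 0.0


def batch_iterator(pairs: Sequence[DrugPair], batch_size: int) -> Iterator[List[DrugPair]]:
    """Yield consecutive mini-batches; the last batch may be smaller."""
    for start in range(0, len(pairs), batch_size):
        yield list(pairs[start:start + batch_size])


class LogisticPairScorer:
    """Logistic model over concatenated [left, right, context] features.

    This stands in for a neural pair scoring architecture; it is an
    assumption made for illustration, not a model from the library.
    """

    def __init__(self, in_dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.weights = [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
        self.bias = 0.0

    @staticmethod
    def features(pair: DrugPair) -> List[float]:
        return pair.left + pair.right + pair.context

    def score(self, pair: DrugPair) -> float:
        """Return the predicted probability that the pair shows the effect."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, self.features(pair)))
        return 1.0 / (1.0 + math.exp(-z))

    def train_step(self, batch: List[DrugPair], lr: float = 0.5) -> float:
        """One gradient step on mean binary cross-entropy; returns the pre-update loss."""
        grad_w = [0.0] * len(self.weights)
        grad_b = 0.0
        loss = 0.0
        for pair in batch:
            p = self.score(pair)
            p_safe = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
            loss -= pair.label * math.log(p_safe) + (1.0 - pair.label) * math.log(1.0 - p_safe)
            error = p - pair.label  # derivative of the loss w.r.t. the logit
            for i, x in enumerate(self.features(pair)):
                grad_w[i] += error * x
            grad_b += error
        n = len(batch)
        self.weights = [w - lr * g / n for w, g in zip(self.weights, grad_w)]
        self.bias -= lr * grad_b / n
        return loss / n


# Toy data (made up): the effect occurs only when both drugs carry feature 1.
toy_pairs = [DrugPair([a], [b], [1.0], a * b) for a in (0.0, 1.0) for b in (0.0, 1.0)]

scorer = LogisticPairScorer(in_dim=3)
epoch_losses = []
for _ in range(200):
    losses = [scorer.train_step(batch) for batch in batch_iterator(toy_pairs, batch_size=2)]
    epoch_losses.append(sum(losses) / len(losses))
```

After training, `scorer.score(pair)` returns a probability for any left drug, right drug, and context triple. Concatenation makes the score depend on the order of the two drugs; an order-invariant design would need a symmetric combination of the two drug representations instead.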

    Information & Contributors

    Information

    Published In

    KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
    August 2022
    5033 pages
    ISBN: 9781450393850
    DOI: 10.1145/3534678
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. chemistry
    2. deep learning
    3. neural networks

    Qualifiers

    • Research-article

    Funding Sources

    • DARPA

    Conference

    KDD '22
    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 335
    • Downloads (Last 6 weeks): 50
    Reflects downloads up to 10 Nov 2024

    Citations

    Cited By

    • (2024) FacGNN: Multi-faceted Fairness Enhancement for GNN through Adversarial and Contrastive Learning. 2024 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN60899.2024.10649939. Online publication date: 30-Jun-2024
    • (2023) Digital Research Environment(DRE)-enabled Artificial Intelligence (AI) to facilitate early stage drug development. Frontiers in Pharmacology 14. DOI: 10.3389/fphar.2023.1115356. Online publication date: 24-Mar-2023
    • (2023) SAE-SV: A Stacked-AutoEncoder and Soft Voting Joint Approach Based on Small Dataset with High Dimensions for Inhibitory Potency Prediction. Proceedings of the 2023 4th International Symposium on Artificial Intelligence for Medicine Science, 1170-1175. DOI: 10.1145/3644116.3644315. Online publication date: 20-Oct-2023
    • (2023) Analyzing and Improving Prediction of Spatiotemporal Signal Data Using Grid Search on Graph Convolutional Networks. 2023 9th International Conference on Web Research (ICWR), 294-299. DOI: 10.1109/ICWR57742.2023.10139174. Online publication date: 3-May-2023
    • (2023) Artificial intelligence and cheminformatics tools: a contribution to the drug development and chemical science. Journal of Biomolecular Structure and Dynamics 42:12, 6523-6541. DOI: 10.1080/07391102.2023.2234039. Online publication date: 11-Jul-2023
    • (2022) Improving Fairness in Graph Neural Networks via Mitigating Sensitive Attribute Leakage. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 1938-1948. DOI: 10.1145/3534678.3539404. Online publication date: 14-Aug-2022
    • (2022) Imbalanced Graph Classification via Graph-of-Graph Neural Networks. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2067-2076. DOI: 10.1145/3511808.3557356. Online publication date: 17-Oct-2022
    • (2022) Degree-Related Bias in Link Prediction. 2022 IEEE International Conference on Data Mining Workshops (ICDMW), 757-758. DOI: 10.1109/ICDMW58026.2022.00103. Online publication date: Nov-2022
