ChemicalX: A Deep Learning Library for Drug Pair Scoring

Authors:

Benedek Rozemberczki,

Charles Tapley Hoyt,

Piotr Grabowski,

Andriy Nikolov,

Sebastian Nilsson,

Michael Ughetto,

Benjamin M. Gyori

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Pages 3819 - 3828

https://doi.org/10.1145/3534678.3539023

Published: 14 August 2022

Abstract

In this paper, we introduce ChemicalX, a PyTorch-based deep learning library that provides a range of state-of-the-art models for the drug pair scoring task. The primary objective of the library is to make deep drug pair scoring models accessible to machine learning researchers and practitioners in a streamlined framework. The design of ChemicalX reuses existing high-level model training utilities, geometric deep learning, and deep chemistry layers from the PyTorch ecosystem. Our system provides neural network layers, custom pair scoring architectures, data loaders, and batch iterators for end users. We showcase these features with example code snippets and case studies to highlight the characteristics of ChemicalX. A range of experiments on real-world drug-drug interaction, polypharmacy side effect, and combination synergy prediction tasks demonstrates that the models available in ChemicalX are effective at solving the pair scoring task. Finally, we show that ChemicalX can be used to train and score machine learning models on large drug pair datasets with hundreds of thousands of compounds on commodity hardware.
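
To make the drug pair scoring task concrete, the following is a minimal sketch in plain Python. It is not the ChemicalX API: the class `ToyPairScorer`, the helper `batches`, the feature dimensions, and the binary feature vectors are all hypothetical and chosen only for illustration. It assumes each drug is given as a fixed-length feature vector and that each pair comes with a context vector (for example, a cell line or side effect descriptor), and it shows a shared drug encoder, a scoring head over the concatenated pair, and a simple batch iterator.

```python
import math
import random
from typing import Iterator, List, Sequence, Tuple

Triple = Tuple[Sequence[float], Sequence[float], Sequence[float]]


def make_weights(n_in: int, n_out: int, rng: random.Random) -> List[List[float]]:
    """Return an n_out x n_in matrix of small random weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def dense_relu(weights: List[List[float]], x: Sequence[float]) -> List[float]:
    """Apply a bias-free linear layer followed by a ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


class ToyPairScorer:
    """Hypothetical, untrained pair scorer (illustration only)."""

    def __init__(self, drug_dim: int, context_dim: int,
                 hidden_dim: int = 8, seed: int = 0) -> None:
        rng = random.Random(seed)
        # One encoder is shared by the left and right drug.
        self.encoder = make_weights(drug_dim, hidden_dim, rng)
        # The head scores [left encoding, right encoding, context].
        head_dim = 2 * hidden_dim + context_dim
        self.head = [rng.uniform(-0.1, 0.1) for _ in range(head_dim)]

    def score(self, left: Sequence[float], right: Sequence[float],
              context: Sequence[float]) -> float:
        """Return a value in (0, 1) for the pair in the given context."""
        h_left = dense_relu(self.encoder, left)
        h_right = dense_relu(self.encoder, right)
        features = h_left + h_right + list(context)  # list concatenation
        logit = sum(w * f for w, f in zip(self.head, features))
        return 1.0 / (1.0 + math.exp(-logit))


def batches(triples: List[Triple], batch_size: int) -> Iterator[List[Triple]]:
    """Yield consecutive slices of (left, right, context) triples."""
    for start in range(0, len(triples), batch_size):
        yield triples[start:start + batch_size]


# Made-up feature vectors; real inputs would come from a drug pair dataset.
triples = [
    ([1, 0, 1, 0], [0, 1, 1, 0], [1, 0]),
    ([0, 0, 1, 1], [1, 1, 0, 0], [0, 1]),
    ([1, 1, 1, 0], [0, 0, 0, 1], [1, 0]),
]
scorer = ToyPairScorer(drug_dim=4, context_dim=2)
all_scores = []
for batch in batches(triples, batch_size=2):
    all_scores.extend(scorer.score(l, r, c) for l, r, c in batch)
```

Because the two encodings are concatenated, this toy scorer is sensitive to the order of the drugs in a pair. The weights are random and untrained, so the scores carry no chemical meaning; the sketch only illustrates the data flow from a batch of pairs to one score per pair.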

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Neural networks

Published In

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 2022

5033 pages

ISBN: 9781450393850

DOI: 10.1145/3534678

General Chairs: Aidong Zhang (University of Virginia), Huzefa Rangwala (Amazon/George Mason University)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

DARPA

Conference

KDD '22

KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 14 - 18, 2022

Washington DC, USA

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
