Convolutional neural network scoring and minimization in the D3R 2017 community challenge

Jocelyn Sunseri¹, Jonathan E. King¹, Paul G. Francoeur¹ & David Ryan Koes¹ (ORCID: orcid.org/0000-0002-9043-6267)

Abstract

We assess the ability of our convolutional neural network (CNN)-based scoring functions to perform several common tasks in the domain of drug discovery. These include correctly identifying ligand poses near and far from the true binding mode when given a set of reference receptors and classifying ligands as active or inactive using structural information. We use the CNN to re-score or refine poses generated using a conventional scoring function, AutoDock Vina, and compare the performance of each of these methods to using the conventional scoring function alone. Furthermore, we assess several ways of choosing appropriate reference receptors in the context of the D3R 2017 community benchmarking challenge. We find that our CNN scoring function outperforms Vina on most tasks without requiring manual inspection by a knowledgeable operator, but that the pose prediction target chosen for the challenge, Cathepsin S, was particularly challenging for de novo docking. However, the CNN provided best-in-class performance on several virtual screening tasks, underscoring the relevance of deep learning to the field of drug discovery.
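The re-scoring idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Pose` record, the score fields, the 2.0 Å "near native" cutoff and all numeric values below are hypothetical, chosen only to show how the same set of docked poses can be ranked by a conventional score and then re-ranked by a second, learned score.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Pose:
    """One docked ligand pose (all fields are illustrative assumptions)."""
    name: str
    vina_score: float  # assumed convention: lower (more negative) is better
    cnn_score: float   # assumed convention: higher is better
    rmsd: float        # RMSD to the reference binding mode, in angstroms


def top_pose(poses: List[Pose], key: Callable[[Pose], float], higher_is_better: bool) -> Pose:
    """Return the best-ranked pose under the given scoring key."""
    return sorted(poses, key=key, reverse=higher_is_better)[0]


def is_near_native(pose: Pose, cutoff: float = 2.0) -> bool:
    """Label a pose as 'near' the true binding mode (cutoff is an assumption)."""
    return pose.rmsd <= cutoff


# Made-up toy values for illustration only; these are not results from the paper.
poses = [
    Pose("pose_a", vina_score=-9.1, cnn_score=0.35, rmsd=4.8),
    Pose("pose_b", vina_score=-8.7, cnn_score=0.91, rmsd=1.2),
    Pose("pose_c", vina_score=-7.9, cnn_score=0.60, rmsd=2.6),
]

# Rank once by the conventional score, then re-rank the same poses by the learned score.
best_by_vina = top_pose(poses, key=lambda p: p.vina_score, higher_is_better=False)
best_by_cnn = top_pose(poses, key=lambda p: p.cnn_score, higher_is_better=True)

print(best_by_vina.name, is_near_native(best_by_vina))
print(best_by_cnn.name, is_near_native(best_by_cnn))
```

In this toy setup the two rankings disagree about the top pose, which is the situation where re-scoring can change the outcome of a pose prediction task. Refinement (minimizing a pose with respect to the score) is a separate step not shown here.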

Acknowledgements

J.S. is supported by a fellowship from The Molecular Sciences Software Institute under NSF Grant ACI-1547580. This work is supported by R01GM108340 from the National Institute of General Medical Sciences and by a GPU donation from the NVIDIA corporation.

Author information

Authors and Affiliations

Department of Computational & Systems Biology, School of Medicine, University of Pittsburgh, 3501 Fifth Avenue, Suite 3064, Biomedical Science Tower 3 (BST3), Pittsburgh, PA, 15260, USA
Jocelyn Sunseri, Jonathan E. King, Paul G. Francoeur & David Ryan Koes

Corresponding author

Correspondence to David Ryan Koes.

About this article

Cite this article

Sunseri, J., King, J.E., Francoeur, P.G. et al. Convolutional neural network scoring and minimization in the D3R 2017 community challenge. J Comput Aided Mol Des 33, 19–34 (2019). https://doi.org/10.1007/s10822-018-0133-y

Received: 26 May 2018
Accepted: 06 July 2018
Published: 10 July 2018
Issue Date: 15 January 2019
DOI: https://doi.org/10.1007/s10822-018-0133-y
