
SB-Net: Synergizing CNN and LSTM networks for uncovering retrosynthetic pathways in organic synthesis

Published: 21 November 2024

Abstract

Retrosynthesis is central to synthesizing target products: it guides the design of reaction pathways, which is crucial for drug and material discovery. Current models often neglect multi-scale feature extraction, which limits how effectively they use molecular descriptors. We propose SB-Net, a deep-learning architecture tailored for retrosynthesis prediction, to address this gap. SB-Net combines CNN and Bi-LSTM architectures to capture multi-scale molecular features. It processes one-hot encoded descriptors and extended-connectivity fingerprints (ECFP) in parallel branches, whose outputs are merged through dense layers. Experimental results on the USPTO-50k dataset show that SB-Net achieves 73.6 % top-1 and 94.6 % top-10 accuracy. Its versatility is validated on MetaNetX, with accuracies of 52.8 % top-1, 74.3 % top-3, 79.8 % top-5, and 83.5 % top-10, indicating that the model is also effective for bioretrosynthesis prediction. With implications for drug discovery and synthesis planning, SB-Net offers a robust deep-learning model for retrosynthesis prediction.
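
The top-k accuracies quoted in the abstract can be read as the fraction of test molecules whose correct answer appears among the model's k highest-ranked candidates. The sketch below illustrates that metric only; it is not the authors' evaluation code. The function name `top_k_accuracy`, the toy labels (`"T1"`, `"T2"`, ...), and the assumption that each prediction is a ranked list of candidate labels are all hypothetical.

```python
def top_k_accuracy(ranked_predictions, targets, k):
    """Fraction of examples whose true label is among the first k ranked candidates.

    ranked_predictions: list of lists, each ordered from most to least likely.
    targets: list of true labels, one per example.
    """
    if len(ranked_predictions) != len(targets):
        raise ValueError("ranked_predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one example is required")
    hits = sum(
        1
        for candidates, truth in zip(ranked_predictions, targets)
        if truth in candidates[:k]
    )
    return hits / len(targets)


# Hypothetical toy data: four target molecules, three ranked candidates each.
ranked = [
    ["T3", "T1", "T7"],  # correct label T1 is ranked second
    ["T2", "T5", "T9"],  # correct label T2 is ranked first
    ["T4", "T8", "T6"],  # correct label T6 is ranked third
    ["T1", "T2", "T3"],  # correct label T9 is not among the candidates
]
truth = ["T1", "T2", "T6", "T9"]

print(top_k_accuracy(ranked, truth, 1))  # 0.25
print(top_k_accuracy(ranked, truth, 3))  # 0.75
```

Because a hit at rank 1 is also a hit at every larger k, top-k accuracy can only stay equal or grow as k increases, which matches the ordering of the reported top-1 to top-10 figures.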


Highlights

Retrosynthesis aids in synthesizing target products, crucial for drug and material discovery.
Current models lack multi-scale feature extraction, limiting effectiveness.
SB-Net, a proposed deep-learning model, excels in retrosynthesis prediction.
SB-Net combines CNN and Bi-LSTM, achieving high accuracy on various datasets.
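
As a rough illustration of what a one-hot encoded input to a CNN or Bi-LSTM branch can look like, the sketch below turns a SMILES string into a fixed-length matrix with one row per character position. This is an assumption made for illustration: the abstract does not specify how SB-Net builds its one-hot descriptors, and the helper names `build_vocab` and `one_hot_encode` are hypothetical. Character-level tokenization is also a simplification, since it splits multi-character atoms such as `Cl` or `Br`.

```python
def build_vocab(smiles_list):
    """Map every character seen in the SMILES strings to an index (0 is padding)."""
    chars = sorted({ch for smiles in smiles_list for ch in smiles})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def one_hot_encode(smiles, vocab, max_len):
    """Return a max_len x (len(vocab) + 1) matrix of 0/1 values.

    Positions past the end of the string are encoded as the padding column 0;
    strings longer than max_len are truncated. Unknown characters raise KeyError.
    """
    width = len(vocab) + 1
    matrix = [[0] * width for _ in range(max_len)]
    for pos in range(max_len):
        column = vocab[smiles[pos]] if pos < len(smiles) else 0
        matrix[pos][column] = 1
    return matrix


# Hypothetical toy vocabulary built from ethanol and formaldehyde SMILES.
vocab = build_vocab(["CCO", "C=O"])
encoded = one_hot_encode("CCO", vocab, max_len=4)
for row in encoded:
    print(row)
```

Each row contains exactly one 1, so a sequence model sees the molecule as an ordered series of symbols, while a fingerprint such as ECFP would summarize substructures in a single fixed-length vector for the second branch.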


Published In

Computational Biology and Chemistry, Volume 112, Issue C
Oct 2024
820 pages

Publisher

Elsevier Science Publishers B. V.
Netherlands

Author Tags

1. Retrosynthesis prediction
2. Drug discovery
3. Convolutional neural network
4. Bidirectional LSTM
5. Multi-scale features

Qualifiers

• Research-article
