Available online at www.sciencedirect.com

ScienceDirect

Machine learning for protein folding and dynamics


Frank Noé1, Gianni De Fabritiis2,3 and Cecilia Clementi4

Many aspects of the study of protein folding and dynamics have been affected by the recent advances in machine learning. Methods for the prediction of protein structures from their sequences are now heavily based on machine learning tools. The way simulations are performed to explore the energy landscape of protein systems is also changing as force-fields are starting to be designed by means of machine learning methods. These methods are also used to extract the essential information from large simulation datasets and to enhance the sampling of rare events such as folding/unfolding transitions. While significant challenges still need to be tackled, we expect these methods to play an important role in the study of protein folding and dynamics in the near future. We discuss here the recent advances on all these fronts and the questions that need to be addressed for machine learning approaches to become mainstream in protein simulation.

Addresses
1 Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
2 Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), Doctor Aiguader 88, 08003 Barcelona, Spain
3 Institucio Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain
4 Center for Theoretical Biological Physics, and Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, United States

Corresponding author: Clementi, Cecilia (cecilia@rice.edu)

Current Opinion in Structural Biology 2020, 60:77–84
This review comes from a themed issue on Folding and binding
Edited by Shachi Gosavi and Ben Schuler
https://doi.org/10.1016/j.sbi.2019.12.005
0959-440X/© 2019 Elsevier Ltd. All rights reserved.

Introduction
During the last couple of decades advances in artificial intelligence and machine learning have revolutionized many application areas such as image recognition and language translation. The key to this success has been the design of algorithms that can extract complex patterns and highly non-trivial relationships from large amounts of data and abstract this information in the evaluation of new data.

In the last few years these tools and ideas have also been applied to, and in some cases have revolutionized, problems in fundamental sciences, where the discovery of patterns and hidden relationships can lead to the formulation of new general principles. In the case of protein folding and dynamics, machine learning has been used for multiple purposes [1,2,3,4,5,6].

As protein sequences contain all the necessary information to reach the folded structure, it is natural to ask if the ideas and algorithms that have proved very useful to associate labels to images can also help to associate a folded structure to a protein sequence. Indeed, protein structure prediction has greatly benefitted from the influx of ideas from machine learning, as has been demonstrated in the CASP competitions in the last few years, where several groups have used machine learning approaches of different kinds [1,2,7,3], and the AlphaFold team from DeepMind won the 2018 competition by a margin [8,9].

In addition to protein structure prediction, machine learning methods can help address other questions regarding protein dynamics. Physics-based approaches to protein folding usually involve the design of an energy function that guides the dynamics of the protein on its conformational landscape from the unfolded to the folded state. Different ideas have been used in the past several decades to design such energy functions, from first-principle atomistic force fields [10,11] to simplified coarse-grained effective potential energies [12] encoding physical principles such as, for instance, the energy landscape theory of protein folding [13,14]. In this context, neural networks can help design these energy functions to take into account multi-body terms that are not easily modeled analytically [5].

Another aspect where machine learning has made a significant impact is the analysis of protein simulations. Even if we had an accurate protein force-field and we could simulate the dynamics of a protein long enough to sample its equilibrium distribution, there is still the problem of extracting the essential information from the simulation, and of relating it to experimental measurements. In this case, unsupervised learning methods can help to extract metastable states from high dimensional simulation data and to connect them to measurable observables [15].

In the following we review the recent contributions of machine learning in the advancement of these different aspects of the study of protein folding and dynamics. As the field is rapidly evolving, most probably these contributions will become even more significant in the near future.

Machine learning for protein structure prediction
Structure prediction consists in the inference of the folded structure of a protein from the sequence information. The most recent successes of machine learning for protein structure prediction arose with the application of deep learning to evolutionary information [16,17]. It has long been known that the mutation of one amino acid in a protein usually requires the mutation of a contacting amino acid in order to preserve the functional structure [18–21] and that the co-evolution of mutations contains information on amino acid distances in the three dimensional structure of the protein. Initial methods [16,22] to extract this information from co-evolution data were based on standard machine learning approaches, but later methods based on deep residual networks have been shown to perform better in inferring possible contact maps [1,2]. More recently, it has been shown that it is possible to predict distance matrices [4] from co-evolutionary information instead of just contact maps. This result was accomplished by using a probabilistic neural network to predict inter-residue distance distributions. From a complete distance matrix, it is relatively straightforward to obtain a protein structure, but of course the prediction of the distance matrix from co-evolution data is neither perfect nor complete. Yet, in [7] it was shown that, if at least 32–64 sequences are available for a protein family, then these data are sufficient to obtain the fold class for 614 protein families with currently unknown structures, when the co-evolutionary information is integrated in the Rosetta structure prediction approach. The authors concede that this is not yet equivalent to obtaining the crystal structure to the accuracy that would be useful, for instance, for drug discovery. However, it still represents a major achievement in structure prediction.

Every two years, the performance of the different methods for structure prediction is assessed in the CASP (Critical Assessment of Techniques for Protein Structure Prediction) competition, where a set of sequences with structures yet to be released are given to participants to predict the structure blindly. The extent of the impact of machine learning in structure prediction has been quite visible in the latest CASP competitions. The typical methodology in previous CASP editions for the top ranked predictions has been to use very complex workflows based on protein threading and some method for structure optimization like Rosetta [23]. Protein threading consists in selecting parts of the sequence for which there are good templates in the PDB and stitching them together [24]. A force-field can then be used to relax this object into a protein structure. The introduction of co-evolution information in the form of contact map prediction provided a boost in the performance, at the expense of even more complex workflows. Historically the difference between top predictors in CASP has been minimal, indicating that there was not a clearly better method, but rather an incremental improvement of the workflows. This situation created, to a certain extent, a barrier to entry for new ideas and models. However, in the latest edition of CASP (CASP13), the group of AlphaFold [9] ranked first with a very simplified workflow [8], heavily based on machine learning methods. The approach extended the contact and distance matrix predictions to predict histograms of distances between amino acids using a very deep residual network on co-evolutionary data. This approach made it possible to take into account implicitly the possible errors and inaccuracy in the prediction itself. In addition, it used an autoencoder architecture derived from previous work on drawing [25] to replace threading altogether and generate the structure directly from the sequence and distance histograms. The use of an autoencoder guarantees an implicit, but much more elegant, threading of the available structural information in the PDB to the predicted structure. In a second approach from the same group, a knowledge-based potential derived from the distance histograms was also used. The potential was simply minimized to obtain converged structures. This last protein-specific potential minimization might look surprising at first, but it is actually very similar to well-known structure-based models for protein folding [26,13].

An alternative and interesting machine learning approach for structure prediction, which also offers wider applicability, is to use end-to-end differentiable models [27,3,28]. While the performance of these methods does not yet reach the performance of co-evolution based methods for cases where co-evolutionary information is abundant, they can be applied to protein design, and in cases where co-evolution data is missing. In [27], a single end-to-end network is proposed that is composed of multiple transformations from the sequence to the protein backbone angles and finally to three-dimensional coordinates, on which a loss function is computed in terms of root mean square deviations against known structures. In [3] a sequence-conditioned energy function is parameterized by a deep neural network and Langevin dynamics is used to generate samples from the distribution. In [28] a generative adversarial model is used to produce realistic Cα distance matrices on blocks of up to 128 residues, then standard methods are used to recreate the backbone and side chain structure from there. Incidentally, a variational autoencoder was also tested as a baseline with comparable results. This model is not conditioned on sequence, so it is useful for generating new structures and for in-painting missing parts in a crystal structure.

Folding proteins with machine learned force fields
State-of-the-art force fields can reproduce with reasonable accuracy the thermodynamical and structural
properties of globular proteins [10] or intrinsically disordered proteins (IDPs) [11]. Generally, force fields are designed by first assigning a functional form for all the different types of interactions (e.g. electrostatic, van der Waals, etc.) between the atoms of different types, then optimizing the parameters in these interactions to reproduce some reference data as well as possible.

In the last few years, a new approach to the design of force-fields has emerged that takes advantage of machine learning tools [29,30]. The idea is to use either a deep neural network or some other machine learning model to represent the classical energy function of a system as a function of the atomic coordinates, instead of specifying a functional form a priori [31]. The model can then be trained on the available data to ‘learn’ to reproduce some desired properties, such as energies and forces as obtained from quantum mechanical calculations. As a neural network is a universal function approximator, this approach has the significant advantage that it can approximate a large number of possible functional forms for the energy, instead of being constrained by a predefined one, and can in principle include multi-body correlations that are generally ignored in classical force-fields. The downside of this increased flexibility, however, is that a very large amount of data is needed to train the machine learning model, as the model may extrapolate poorly in regions of the conformational space where data are not available. So far, a large number of quantum chemical calculations have been used to train such force-fields, but in principle experimental data could also be included [32].

The machine learning approach to force field design has evolved rapidly in the last decade, but it has so far mostly been tested on small organic molecules. Some of the proposed methods are tailored to reproduce the thermodynamics of specific molecules (e.g. [33]), while others attempt to design transferable force-fields that are trained on a large number of small molecules and could in principle be used to simulate a much larger molecule such as a protein (e.g. [34,35]). Indeed, quantum mechanical calculations on water, amino acids, and small peptides have been included in the latest generation of machine-learned classical force-fields (e.g. the development version of the ANI potential [36]). We are aware of one instance where a machine-learned force-field has been used to simulate a 50 ns molecular dynamics trajectory of a cellulose-binding domain protein (1EXG) in its folded state. Recently, a transferable machine-learned force-field has been tested on polypeptides. However, machine-learned force-fields have not (yet) been used for protein folding simulations, nor have they been used to predict thermodynamic or kinetic properties. While we believe that this will be possible and machine-learned force-fields will be widely used in protein simulations in the near future, at the moment there are still some significant challenges that need to be overcome towards this goal [6].

One fundamental challenge lies in the modeling of long-range interactions. If only quantum calculations on small molecules are used in the training of force-fields, interactions on scales larger than these molecules could easily be missed in the training. The locality of the machine-learned force-fields could be insufficient to capture electrostatic interactions, or long-range van der Waals interactions [37]. This problem could be addressed by separating the long-range effects in the force-field. For instance, atomic partial charges could be learned [38] simultaneously with local energy terms and used in electrostatic interactions that could be added to the machine-learned energy part to obtain a total energy that is used in the training.

Another main challenge lies in the software used for the simulations. Calculating energies and forces for a protein configuration by means of a trained neural network is several orders of magnitude faster than obtaining these quantities ab initio with quantum mechanical calculations, but it is still slower than with a standard classical force-field. In order to simulate protein folding, molecular dynamics trajectories of at least microseconds are needed and this timescale is not currently accessible with machine-learned force-fields. Research in this area has so far mostly focused on obtaining an accurate representation for the energy and forces for molecules, and tests have been performed on small systems, mostly as a proof of concept. As this field matures, we believe that significant efforts will also be made to optimize the software for practical applications, and molecular dynamics simulation with machine-learned force-fields will become a viable alternative to current approaches. Additionally, the whole arsenal of methods that have been developed to enhance the sampling of protein configurational landscapes with classical force-fields (e.g. [39,40]) can also be used with machine-learned force-fields to reach longer timescales and larger system sizes.

Machine learning of coarse-grained protein folding models
In parallel to efforts for the design of atomistic force-fields, machine learning has also been used to obtain coarser models [42,43,5] that could be applied to study larger systems and longer timescales with reduced computational resources. Coarse-grained models map groups of atoms onto effective interacting ‘beads’ and assign an effective energy function between the beads to try to reproduce some properties of a protein system. Different properties could be targeted, and different strategies have been used to design coarse-grained models, either starting from atomistic simulations (bottom-up) (e.g. [44,45]), experimental data (top-down) (e.g. [46]) or enforcing general ‘rules’ such as the minimal frustration principle for protein folding [13,14]. In principle, the same ideas used in the design of atomistic force-fields from quantum mechanical data can be used to make the next step in resolution and design coarse-grained
molecular models from all-atom molecular simulations [12]. One main problem in the design of models at a resolution coarser than atomistic is the fact that, by renormalizing local degrees of freedom, multi-body terms emerge in the effective energy function even if only pairwise interactions were used in a reference atomistic force-field. Such multi-body terms should then be taken into account in the energy function of the coarse-grained model to correctly reproduce the thermodynamics and dynamics of the model at finer resolution. Attempts have been made to include these terms in coarse-grained models, but it is challenging to define suitable and general functional forms to capture these effects in an effective energy function. For this reason, neural networks appear as a natural choice for the design of coarse-grained potentials, as they can automatically capture non-linearities and multi-body terms while remaining agnostic about their specific functional form. Indeed, in the last few years, several groups have attempted to use machine learning methods to design coarse-grained potentials for different systems [42,43,5]. Most recently, CGnet (see Figure 1), a neural network for coarse-grained molecular force-fields, has been proposed and has been used to model the folding/unfolding dynamics of a small protein [5]. The CGnet applications presented so far have been system-specific. However, ideas similar to those used in the design of transferable atomistic force-fields from quantum mechanical data could also be used to try to obtain more transferable coarse-grained models. In general, transferability remains an outstanding issue in the design of coarse models [47] and requiring it may decrease the ability to reproduce faithfully the properties of specific systems. So far, the challenges in the definition of general and multi-body functional forms for coarse-grained models have not allowed a rigorous investigation of the trade-off between transferability and accuracy for such models. The use of machine learning tools to design effective potential energy functions may soon make it possible to explore this question systematically.

Machine learning for analysis and enhanced simulation of protein dynamics
Machine learning has been quite impactful in the analysis of simulations of protein dynamics. In this context, two closely related aims are: first, the extraction of collective variables (CVs) associated with the slowest dynamical processes and of the metastable states (that can be defined from the knowledge of the slow CVs) from given protein molecular dynamics (MD) simulation data [15]; and second, enhancing the simulations so as to increase the number of rare event transitions between them.

Shallow machine learning methods such as Markov state models (MSMs) [48] and Master-equation models [49], which model the transitions between metastable states via a Markovian transition or rate matrix, are a cornerstone for the extraction of slow CVs, metastable states and their statistics. A key advantage of MSMs is that they can be estimated from short MD simulations started from an arbitrary (non-equilibrium) distribution, and yet make predictions of the equilibrium distribution and long-timescale kinetics. While more complex models, for example, including memory, are conceivable, MSMs are simpler to estimate, easier to interpret and are motivated by the observation that if they are built in the slow

Figure 1

[Figure: (a) free energy (kT) as a function of the first and second TIC, with minima labeled a, b and c, and its projection on the first TIC; (b) network schematic with the labels x, featurization, Free Energy Net, Prior energy, U(x), grad_x and f(x); (c) structures for the minima a, b and c.]

(a) Folding free energy landscape of the protein Chignolin as obtained with a coarse-grained model that uses a neural network to represent the effective energy (CGnet). Top panel: Free energy as obtained from CGnet, as a function of the first two collective coordinates obtained with the Time-Lagged Independent Component Analysis (TICA) method [41]. Bottom panel: Projection of the free energy on the first TICA coordinate. (b) The CGnet neural network architecture. (c) Representative Chignolin configurations in the three minima from (a). Figure adapted from [5].
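
The data flow labeled in panel (b) of Figure 1, in which a prior energy and a network output are summed into U(x) and the force f(x) is obtained as the negative gradient with respect to x, can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration and is not taken from [5]: the single scalar feature x, the harmonic prior, the tiny tanh network, the parameter values, and the use of a mean squared force error as the training loss.

```python
import math

def prior_energy(x, k=1.0, x0=1.0):
    """Harmonic prior keeping the feature x near x0 (hypothetical choice)."""
    return 0.5 * k * (x - x0) ** 2

def net_energy(x, params):
    """One-hidden-layer tanh network mapping the feature x to an energy."""
    w1, b1, w2 = params
    return sum(c * math.tanh(a * x + b) for a, b, c in zip(w1, b1, w2))

def energy(x, params):
    """Total coarse-grained energy U(x) = prior + network output."""
    return prior_energy(x) + net_energy(x, params)

def force(x, params, k=1.0, x0=1.0):
    """Force f(x) = -dU/dx, differentiated analytically."""
    w1, b1, w2 = params
    d_net = sum(c * a * (1.0 - math.tanh(a * x + b) ** 2)
                for a, b, c in zip(w1, b1, w2))
    return -(k * (x - x0) + d_net)

def force_matching_loss(xs, ref_forces, params):
    """Mean squared difference between predicted and reference forces."""
    return sum((force(x, params) - f) ** 2
               for x, f in zip(xs, ref_forces)) / len(xs)

# Toy data: reference forces generated from a "true" parameter set,
# standing in for forces mapped from an atomistic simulation.
true_params = ([1.5, -0.7], [0.2, 0.1], [0.8, -0.4])
xs = [0.1 * i for i in range(-20, 21)]
ref_forces = [force(x, true_params) for x in xs]

# A perturbed parameter set gives a strictly larger loss than the true one.
trial_params = ([1.0, -0.7], [0.2, 0.1], [0.8, -0.4])
loss_true = force_matching_loss(xs, ref_forces, true_params)
loss_trial = force_matching_loss(xs, ref_forces, trial_params)
```

In a realistic setting x would be a vector of internal coordinates, the gradient would come from automatic differentiation, and the parameters would be optimized by gradient descent on the loss; the sketch only shows how the energy, the force and the loss relate to each other.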

CVs of the molecule, the error made by the Markovian approximation is close to zero for practical purposes [48].

For this reason, much method development has been carried out in the past 10–15 years in order to optimize the pipeline for the construction of MSMs, that is: finding suitable molecular features to work with [50], reducing the dimensionality of feature space [51,52], clustering the resulting space [53,49], estimating the MSM transition matrix [54] and coarse-graining it [55,56]. While all steps of this pipeline have significantly improved over time, constructing MSMs this way is still very error prone and depends on significant expert knowledge. A critical step forward was the advent of the variational approach of conformation dynamics (VAC) [57] and later the more general variational approach of Markov processes (VAMP) [58]. These principles define loss functions that the best approximation to the slow CVs should minimize, and can thus be used to search over the space of features, discretization and transition matrices variationally [50]. Recently, VAMPnets have been proposed that use neural networks to find the optimal slow CVs and few-state MSM transition matrices by optimizing the VAMP score [59] (Figure 2a), and hence replace the entire human-built MSM pipeline by a single end-to-end learning framework. VAMPnets have been demonstrated on several benchmark problems including protein folding (Figure 2b) and have been shown to learn high-quality MSMs without significant human intervention (Figure 2c). When used with an output layer that does not perform a classification, VAMPnets can be trained to approximate directly the spectral components of the Markov propagator [59,60].

The aim of enhancing MD sampling is closely connected to identifying the metastable states or slow CVs of a given molecular system. As the most severe sampling problems are due to the rare-event transitions between the most long-lived states, such as folding/unfolding transitions, identifying such states or the corresponding slow CVs on the fly can help to speed up the sampling. So-called adaptive sampling methods perform MD simulation in multiple rounds, and select the starting states for the new round based on a model of the slow CVs or metastable states found so far. Adaptive sampling for protein simulations has been performed using MSMs [61,62] and with neural network approximations of slow CVs [63,64]. Since adaptive sampling uses unbiased (but short) MD

Figure 2

[Figure: (a) VAMPnet schematic: x_t and x_{t+τ} each pass through an encoder E to give y_t and y_{t+τ}, which feed the VAMP score and the Markov model P(τ); (b) contact maps (residue no. versus residue no.) labeled All States, Folded and Unfolded, and Folded 1, Folded 2, Intermediate, Unfolded and Misfolded; (c) a 5 x 5 grid of panels labeled i->j, with values between 0 and 1 plotted against time in µs.]

VAMPnet and application to NTL9 protein folding. (a) A VAMPnet [59] includes an encoder E which transforms each molecular configuration x_t to a latent space of ‘slow reaction coordinates’ y_t, and is trained on pairs (y_t, y_{t+τ}) sampled from the MD simulation using the VAMP score [58]. (b) Hierarchical decomposition of the NTL9 protein state space by a network with two and five output nodes. Mean contact maps are shown for all MD samples grouped by the network, along with the fraction of samples in that group. 3D structures are shown for the five-state decomposition, residues involved in α-helices or β-sheets in the folded state are colored identically across the different states. If the encoder performs a classification, the dynamical propagator P(τ) is a Markov state model. (c) Chapman–Kolmogorov test comparing long-time predictions of the Koopman model estimated at τ = 320 ns and estimates at longer lag times. Figure modified from [59].
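
The Chapman–Kolmogorov test mentioned in panel (c) of Figure 2 compares the prediction P(τ)^k of a model estimated at lag τ with a direct estimate at lag kτ. The following pure-Python sketch shows the idea on a synthetic two-state trajectory. The transition matrix, trajectory length, seed and function names are all assumptions chosen for illustration, and the estimator is a plain row-normalized count matrix rather than the estimators of [54] or the Koopman model of [59].

```python
import random

def simulate_chain(T, n_steps, seed=0):
    """Sample a discrete trajectory from a row-stochastic matrix T."""
    rng = random.Random(seed)
    state, traj = 0, [0]
    for _ in range(n_steps):
        r, acc = rng.random(), 0.0
        for j, p in enumerate(T[state]):
            acc += p
            if r < acc:
                state = j
                break
        traj.append(state)
    return traj

def estimate_msm(traj, n_states, lag):
    """Count transitions at the given lag and normalize each row."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(traj[:-lag], traj[lag:]):
        counts[a][b] += 1
    return [[c / sum(row) for c in row] for row in counts]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Synthetic two-state system standing in for a discretized MD trajectory.
T_true = [[0.9, 0.1], [0.2, 0.8]]
traj = simulate_chain(T_true, 50000)

T1 = estimate_msm(traj, 2, lag=1)      # model estimated at lag 1
T2_pred = matmul(T1, T1)               # model prediction for lag 2
T2_est = estimate_msm(traj, 2, lag=2)  # direct estimate at lag 2

# Largest element-wise deviation between prediction and direct estimate.
ck_error = max(abs(T2_pred[i][j] - T2_est[i][j])
               for i in range(2) for j in range(2))
```

A small deviation indicates that the dynamics are consistent with the Markovian assumption at the chosen lag time. For real simulation data, the states would come from clustering or from a network output, and statistical uncertainties would be reported alongside the comparison.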

trajectories it is possible to reconstruct the equilibrium kinetics using MSMs, VAMPnets or similar methods. Recently, adaptive sampling has been used to sample protein-protein association and dissociation reversibly in all-atom resolution, involving equilibrium timescales of hours [65].

An alternative to adaptive sampling is to use enhanced sampling methods that speed up rare event sampling by introducing bias potentials, higher temperatures, etc., such as umbrella sampling, replica-exchange or metadynamics. Since these methods typically work in a space of few collective variables, they are also sensitive to poor choices of collective variables, which can lead to sampling that is either not enhanced, or even slower than the original dynamics. Machine learning has an important role here as it can help these methods by learning optimal choices of collective variables iteratively during sampling. For example, shallow machine learning methods have been used to adapt the CV space during Metadynamics [66,67], and adversarial and deep learning have been used to adapt the CV space during variationally enhanced sampling (VES, [68]) [69,70]. A completely different approach to predicting equilibrium properties of a protein system is the Boltzmann Generator [71], which trains a deep generative neural network to directly sample the equilibrium distribution of a many-body system defined by an energy function, without using MD simulation.

Since enhanced sampling changes the thermodynamic state of the simulation, it is suitable for the reconstruction of the equilibrium distribution at a target thermodynamic state by means of reweighting Boltzmann probabilities, but generally loses information about the equilibrium kinetics. Ways to recover the kinetics include: first, extrapolating to the equilibrium kinetics of rare event transitions by exploiting the Arrhenius relation [72]; second, learning a model of the full kinetics and thermodynamics by combining probability reweighting and MSM estimators in a multi-ensemble Markov model [73]; or third, reweighting transition pathways [74]. Machine learning, and particularly deep learning, has not been used much in these methods, but certainly has potential to improve them.

Conclusions
Machine learning can provide a new set of tools to advance the field of molecular sciences, including protein folding and structure prediction. Nonetheless, physical and chemical knowledge and intuition will remain invaluable in the foreseeable future to design the methods and interpret the results obtained. In particular, machine learning can help us to extract new patterns from the data that are not immediately evident, but in virtually all areas reviewed above, machine learning methods that incorporate the relevant physical symmetries, invariances and conservation laws perform better than black-box methods. Furthermore, a trained scientist is still essential to provide meaning to the patterns and use them to formulate general principles.

Conflict of interest
Nothing declared.

Acknowledgements
We gratefully acknowledge funding from European Research Council (ERC CoG 772230 ‘ScaleCell’ to FN), the Deutsche Forschungsgemeinschaft (CRC1114/A04 and GRK2433 DAEDALUS to FN), the MATH+ Berlin Mathematics research center (AA1-6 and EF1-2 to FN), the Einstein Foundation in Berlin (visiting fellowship to CC), the National Science Foundation (grants CHE-1265929, CHE-1740990, CHE-1900374, and PHY-1427654 to CC), the Welch Foundation (grant C-1570 to CC), MINECO (Unidad de Excelencia María de Maeztu MDM-2014-0370 and BIO2017-82628-P to GDF), FEDER (to GDF), and the European Union’s Horizon 2020 research and innovation program under grant agreement no. 675451 (CompBioMed project to GDF).

References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest

1. Ma J, Wang S, Wang Z, Xu J: Protein contact prediction by integrating joint evolutionary coupling analysis and supervised learning. Bioinformatics 2015, 31:3506-3513.

2. Wang S, Sun S, Li Z, Zhang R, Xu J: Accurate de novo prediction of protein contact map by ultra-deep learning model. PLoS Comput Biol 2017, 13:1-34.

3. Ingraham J, Riesselman A, Sander C, Marks D: Learning protein structure with a differentiable simulator. International Conference on Learning Representations 2019.
End-to-end-differentiable model for protein structure prediction solely from amino acid sequence information.

4. Xu J: Distance-based protein folding powered by deep learning. Proc Natl Acad Sci U S A 2019, 116:16856-16865.

5. Wang J, Olsson S, Wehmeyer C, Pérez A, Charron NE, de Fabritiis G, Noé F, Clementi C: Machine learning of coarse-grained molecular dynamics force fields. ACS Cent Sci 2019, 5:755-767.
Coarse-grained models are extracted from atomistic simulations by using neural networks to capture multi-body terms in the effective energy function.

6. Noé F, Tkatchenko A, Müller K-R, Clementi C: Machine learning for molecular simulation. Ann Rev Phys Chem 2020, 71: arXiv:1911.02792 http://dx.doi.org/10.1146/annurev-physchem-042018-052331 (in press).

7. Ovchinnikov S, Park H, Varghese N, Huang P-S, Pavlopoulos GA, Kim DE, Kamisetty H, Kyrpides NC, Baker D: Protein structure determination using metagenome sequence data. Science 2017, 355:294-298.
Rosetta structure prediction guided by residue-residue contacts inferred from evolutionary information and metagenome sequence data is used to generate structural models for 614 protein families with unknown structures.

8. Evans R, Jumper J, Kirkpatrick J, Sifre L, Green TFG, Qin C, Zidek A, Nelson A, Bridgland A, Penedones H, Petersen S, Simonyan K, Crossan S, Jones DT, Silver D, Kavukcuoglu K, Hassabis D, Senior AW: De novo structure prediction with deep-learning based scoring. Thirteenth Critical Assessment of Techniques for Protein Structure Prediction 2018.

9. Alphafold: Using AI for Scientific Discovery. https://deepmind.com/blog/alphafold/.

10. Lindorff-Larsen K, Maragakis P, Piana S, Eastwood MP, Dror RO, Shaw DE: Systematic validation of protein force fields against experimental data. PLoS ONE 2012, 7:e32131.

Current Opinion in Structural Biology 2020, 60:77–84 www.sciencedirect.com



11. Robustelli P, Piana S, Shaw DE: Developing a molecular dynamics force field for both folded and disordered protein states. Proc Natl Acad Sci U S A 2018, 115:E4758-E4766.

12. Clementi C: Coarse-grained models of protein folding: toy-models or predictive tools? Curr Opin Struct Biol 2008, 18:10-15.

13. Clementi C, Nymeyer H, Onuchic JN: Topological and energetic factors: what determines the structural details of the transition state ensemble and “en-route” intermediates for protein folding? Investigation for small globular proteins. J Mol Biol 2000, 298:937-953.

14. Davtyan A, Schafer NP, Zheng W, Clementi C, Wolynes PG, Papoian GA: AWSEM-MD: Protein structure prediction using coarse-grained physical potentials and bioinformatically based local structure biasing. J Phys Chem B 2012, 116:8494-8503.

15. Noé F, Clementi C: Collective variables for the study of long-time kinetics from molecular trajectories: theory and methods. Curr Opin Struct Biol 2017, 43:141-147.

16. Marks DS, Colwell LJ, Sheridan R, Hopf TA, Pagnani A, Zecchina R, Sander C: Protein 3D structure computed from evolutionary sequence variation. PLoS ONE 2011, 6:e28766.

17. Ovchinnikov S, Kamisetty H, Baker D: Robust and accurate prediction of residue-residue interactions across protein interfaces using evolutionary information. Elife 2014, 3:e02030.

18. Altschuh D, Lesk A, Bloomer A, Klug A: Correlation of co-ordinated amino acid substitutions with function in viruses related to tobacco mosaic virus. J Mol Biol 1987, 193:693-707.

19. Göbel U, Sander C, Schneider R, Valencia A: Correlated mutations and residue contacts in proteins. Proteins 1994, 18:309-317.

20. Weigt M, White RA, Szurmant H, Hoch JA, Hwa T: Identification of direct residue contacts in protein-protein interaction by message passing. Proc Natl Acad Sci U S A 2008, 106:67-72.

21. Szurmant H, Weigt M: Inter-residue, inter-protein and inter-family coevolution: bridging the scales. Curr Opin Struct Biol 2018, 50:26-32.

22. Morcos F, Pagnani A, Lunt B, Bertolino A, Marks DS, Sander C, Zecchina R, Onuchic JN, Hwa T, Weigt M: Direct-coupling analysis of residue coevolution captures native contacts across many protein families. Proc Natl Acad Sci U S A 2011, 108:E1293-E1301.

23. Raman S, Vernon R, Thompson J, Tyka M, Sadreyev R, Pei J, Kim D, Kellogg E, DiMaio F, Lange O, Kinch L, Sheffler W, Kim B-H, Das R, Grishin NV, Baker D: Structure prediction for CASP8 with all-atom refinement using Rosetta. Proteins 2009, 77:89-99.

24. Jones DT, Taylor WR, Thornton JM: A new approach to protein fold recognition. Nature 1992, 358:86-89.

25. Gregor K, Danihelka I, Graves A, Rezende DJ, Wierstra D: Draw: A Recurrent Neural Network for Image Generation. 2015. arXiv:1502.04623.

26. Taketomi H, Ueda Y, Go N: Studies on protein folding, unfolding and fluctuations by computer simulation. I. The effect of specific amino acid sequence represented by specific inter-unit interactions. Int J Pept Protein Res 1975, 7:445-459.

27. AlQuraishi M: End-to-end differentiable learning of protein structure. Cell Syst 2019, 8:292-301.e3.
Neural network model for structure prediction without the use of co-evolutionary information.

28. Anand N, Huang P-S: Generative modeling for protein structures. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS’18. USA: Curran Associates Inc.; 2018, 7505-7516.

29. Behler J, Parrinello M: Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys Rev Lett 2007, 98:146401.

30. Rupp M, Tkatchenko A, Müller K-R, Lilienfeld OAV: Fast and accurate modeling of molecular atomization energies with machine learning. Phys Rev Lett 2012, 108:058301.

31. Schütt KT, Sauceda HE, Kindermans P-J, Tkatchenko A, Müller K-R: SchNet - a deep learning architecture for molecules and materials. J Chem Phys 2018, 148:241722.
New neural network architecture based on continuous convolutions to learn transferable force-fields from quantum chemical calculations.

32. Chen J, Chen J, Pinamonti G, Clementi C: Learning effective molecular models from experimental observables. J Chem Theory Comput 2018, 14:3849-3858.

33. Chmiela S, Tkatchenko A, Sauceda HE, Poltavsky I, Schütt KT, Müller K-R: Machine learning of accurate energy-conserving molecular force fields. Sci Adv 2017, 3:e1603015.

34. Smith JS, Isayev O, Roitberg AE: ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem Sci 2017, 8:3192-3203.
Neural networks are used to design transferable force-fields from a large amount of quantum chemical calculations.

35. Smith JS, Nebgen B, Lubbers N, Isayev O, Roitberg AE: Less is more: sampling chemical space with active learning. J Chem Phys 2018, 148:241733.

36. Isayev O: https://github.com/isayev/ASE_ANI.

37. Hermann J, DiStasio RA, Tkatchenko A: First-principles models for van der Waals interactions in molecules and materials: concepts, theory, and applications. Chem Rev 2017, 117:4714-4758.

38. Nebgen B, Lubbers N, Smith JS, Sifain AE, Lokhov A, Isayev O, Roitberg AE, Barros K, Tretiak S: Transferable dynamic molecular charge assignment using deep neural networks. J Chem Theory Comput 2018, 14:4687-4698.

39. Laio A, Parrinello M: Escaping free energy minima. Proc Natl Acad Sci U S A 2002, 99:12562-12566.

40. Preto J, Clementi C: Fast recovery of free energy landscapes via diffusion-map-directed molecular dynamics. Phys Chem Chem Phys 2014, 16:19181-19191.

41. Pérez-Hernández G, Paul F, Giorgino T, De Fabritiis G, Noé F: Identification of slow molecular order parameters for Markov model construction. J Chem Phys 2013, 139:015102.

42. John ST, Csányi G: Many-body coarse-grained interactions using Gaussian approximation potentials. J Phys Chem B 2017, 121:10934-10949.

43. Zhang L, Han J, Wang H, Car R, E W: DeePCG: Constructing Coarse-Grained Models Via Deep Neural Networks. 2018. arXiv:1802.08549.

44. Noid WG, Chu J-W, Ayton GS, Krishna V, Izvekov S, Voth GA, Das A, Andersen HC: The multiscale coarse-graining method. I. A rigorous bridge between atomistic and coarse-grained models. J Chem Phys 2008, 128:244114.

45. Shell MS: The relative entropy is fundamental to multiscale and inverse thermodynamic problems. J Chem Phys 2008, 129:144108.

46. Monticelli L, Kandasamy SK, Periole X, Larson RG, Tieleman DP, Marrink S-J: The MARTINI coarse-grained force field: extension to proteins. J Chem Theory Comput 2008, 4:819-834.

47. Noid WG: Perspective: coarse-grained models for biomolecular systems. J Chem Phys 2013, 139:090901.

48. Prinz J-H, Wu H, Sarich M, Keller BG, Senne M, Held M, Chodera JD, Schütte C, Noé F: Markov models of molecular kinetics: generation and validation. J Chem Phys 2011, 134:174105.

49. Buchete NV, Hummer G: Coarse master equations for peptide folding dynamics. J Phys Chem B 2008, 112:6057-6069.

50. Scherer MK, Husic BE, Hoffmann M, Paul F, Wu H, Noé F: Variational selection of features for molecular kinetics. J Chem Phys 2019, 150:194108.

51. Pérez-Hernández G, Paul F, Giorgino T, De Fabritiis G, Noé F: Identification of slow molecular order parameters for Markov model construction. J Chem Phys 2013, 139:015102.




52. Schwantes CR, Pande VS: Improvements in Markov state model construction reveal many non-native interactions in the folding of NTL9. J Chem Theory Comput 2013, 9:2000-2009.

53. Husic BE, Pande VS: Ward clustering improves cross-validated Markov state models of protein folding. J Chem Theory Comput 2017, 13:963-967.

54. Trendelkamp-Schroer B, Wu H, Paul F, Noé F: Estimation and uncertainty of reversible Markov models. J Chem Phys 2015, 143:174101.

55. Deuflhard P, Weber M: Robust Perron cluster analysis in conformation dynamics. In Linear Algebra Appl, vol 398C. Edited by Dellnitz M, Kirkland S, Neumann M, Schütte C. New York: Elsevier; 2005:161-184.

56. Noé F, Wu H, Prinz J-H, Plattner N: Projected and hidden Markov models for calculating kinetics and metastable states of complex molecules. J Chem Phys 2013, 139:184114.

57. Nüske F, Keller BG, Pérez-Hernández G, Mey ASJS, Noé F: Variational approach to molecular kinetics. J Chem Theory Comput 2014, 10:1739-1752.

58. Wu H, Noé F: Variational Approach for Learning Markov Processes from Time Series Data. 2017. arXiv:1707.04659.

59. Mardt A, Pasquali L, Wu H, Noé F: VAMPnets: deep learning of molecular kinetics. Nat Commun 2018, 9:5.
A deep learning framework for molecular kinetics using neural networks that extract Markov State Models directly from time series of molecular coordinates.

60. Chen W, Sidky H, Ferguson AL: Nonlinear Discovery of Slow Molecular Modes Using State-Free Reversible VAMPnets. 2019. arXiv:1902.03336.

61. Doerr S, Fabritiis GD: On-the-fly learning and sampling of ligand binding by high-throughput molecular simulations. J Chem Theory Comput 2014, 10:2064-2069.

62. Hruska E, Abella JR, Nüske F, Kavraki LE, Clementi C: Quantitative comparison of adaptive sampling methods for protein dynamics. J Chem Phys 2018, 149:244119.

63. Chen W, Ferguson AL: Molecular enhanced sampling with autoencoders: on-the-fly collective variable discovery and accelerated free energy landscape exploration. J Comput Chem 2018, 39:2079-2102.

64. Ribeiro JML, Bravo P, Wang Y, Tiwary P: Reweighted autoencoded variational Bayes for enhanced sampling (RAVE). J Chem Phys 2018, 149:072301.

65. Plattner N, Doerr S, Fabritiis GD, Noé F: Protein–protein association and binding mechanism resolved in atomic detail. Nat Chem 2017, 9:1005-1011.

66. McCarty J, Parrinello M: A variational conformational dynamics approach to the selection of collective variables in metadynamics. J Chem Phys 2017, 147:204109.

67. Sultan MM, Pande VS: tICA-metadynamics: accelerating metadynamics by using kinetically selected collective variables. J Chem Theory Comput 2017, 13:2440-2447.

68. Valsson O, Parrinello M: Variational approach to enhanced sampling and free energy calculations. Phys Rev Lett 2014, 113:090601.

69. Zhang J, Yang YI, Noé F: Targeted adversarial learning optimized sampling. ChemRxiv 2019 http://dx.doi.org/10.26434/chemrxiv.7932371.

70. Bonati L, Zhang Y-Y, Parrinello M: Neural Networks Based Variationally Enhanced Sampling. 2019. arXiv:1904.01305.

71. Noé F, Olsson S, Köhler J, Wu H: Boltzmann generators: sampling equilibrium states of many-body systems with deep learning. Science 2019, 365:eaaw1147.
Deep learning methods for the generation of equilibrium structures from the Boltzmann distribution without the need of molecular dynamics.

72. Tiwary P, Parrinello M: From metadynamics to dynamics. Phys Rev Lett 2013, 111:230602.

73. Wu H, Paul F, Wehmeyer C, Noé F: Multiensemble Markov models of molecular thermodynamics and kinetics. Proc Natl Acad Sci U S A 2016, 113:E3221-E3230.

74. Donati L, Keller BG: Girsanov reweighting for metadynamics simulations. J Chem Phys 2018, 149:072335.

