QSAR

 Quantitative Structure-Activity Relationships

 Can one predict activity (or properties, in
QSPR) simply on the basis of knowledge of
the structure of the molecule?
 In other words, if one systematically changes
a component, will it have a systematic effect
on the activity?
Choice of Model
 Can approach in two directions:
 Simple to complex model
 Complex to simple model
Simplest Model
 Linear relationship between X and Y
 $Y = mX + b$
 Minimize error by least squares:
 $\sum_i (Y_i - Y'_i)^2 = \sum_i [Y_i - (mX_i + b)]^2$

$Y'_i$ is the predicted value


Least Squares
Correlation coefficient

$-1 \le r \le 1$
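The least-squares fit and the correlation coefficient can be sketched in a few lines of Python. This is a minimal illustration using the standard closed-form expressions; the function name `least_squares_line` and the data points are hypothetical and not part of the original slides.

```python
import math

def least_squares_line(xs, ys):
    """Fit y = m*x + b by least squares; also return the correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx                    # slope that minimizes the squared error
    b = mean_y - m * mean_x          # intercept: the line passes through the means
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient, between -1 and 1
    return m, b, r

# Hypothetical data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
m, b, r = least_squares_line(xs, ys)
print(m, b, r)
```

For real data r falls strictly between -1 and 1, and r squared is the R² value usually reported with a fitted line.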
Another test
Is the line better than the mean?
[Figure: four scatter plots, each with a linear least-squares fit]
 A circle: y = 0.0676x - 0.3882, R² = 0.0045
 2 lines: y = 2.9562x - 0.2597, R² = 0.8686
 One bad point: y = 2.8515x - 31.647, R² = 0.9179
 Wrong model: y = 0.0008x + 275.11, R² = 0.978


Multiple Regression
 $Y = f(X_1, X_2, \dots, X_n)$
 Problems:
 Choice of model – linear, polynomial, etc.
 Visualization
 Interpretation
 Computationally demanding
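For the linear choice of model, multiple regression reduces to a least-squares problem that NumPy can solve directly. The sketch below assumes a linear model with an intercept; the descriptor matrix and the activities are hypothetical, generated from known coefficients so the result is easy to check.

```python
import numpy as np

# Hypothetical descriptor matrix: 5 molecules (rows) x 2 descriptors X1, X2 (columns)
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [2.0, 1.0],
              [1.0, 2.0]])

# Hypothetical activities generated from Y = 3 + 2*X1 - 1*X2 (no noise)
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1]

# Prepend a column of ones so the first fitted coefficient is the intercept
A = np.column_stack([np.ones(len(X)), X])

# Least-squares solution of A @ coeffs = y
coeffs, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
print(coeffs)  # intercept, coefficient of X1, coefficient of X2
```

With noisy experimental data the fitted coefficients would only approximate the underlying relationship, and the choice of model still has to be justified separately.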
Variable reduction
 Principal Component Analysis
Principal Component
 $PC_1 = a_{1,1}x_1 + a_{1,2}x_2 + \dots + a_{1,n}x_n$
 $PC_2 = a_{2,1}x_1 + a_{2,2}x_2 + \dots + a_{2,n}x_n$
 Keep only those components that possess the largest variation
 PCs are orthogonal to each other
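One common way to obtain the principal components is a singular value decomposition of the mean-centered descriptor matrix. The sketch below takes that route; the function name `principal_components` and the descriptor values are hypothetical.

```python
import numpy as np

def principal_components(X, n_keep):
    """Return loadings (the a_ij coefficients), scores and variances of the top components."""
    Xc = X - X.mean(axis=0)                        # center each descriptor
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = S ** 2 / (X.shape[0] - 1)          # variance carried by each component
    loadings = Vt[:n_keep]                         # row k holds a_{k,1} ... a_{k,n}
    scores = Xc @ loadings.T                       # PC values for every molecule
    return loadings, scores, variances[:n_keep]

# Hypothetical data: 5 molecules x 3 descriptors, with x2 roughly twice x1
X = np.array([[2.0, 4.1, 1.00],
              [3.0, 6.0, 0.90],
              [4.0, 7.9, 1.10],
              [5.0, 10.1, 1.00],
              [6.0, 12.0, 0.95]])
loadings, scores, var = principal_components(X, 2)
print(var)                      # components are ordered by decreasing variance
print(loadings @ loadings.T)    # close to the identity: the PCs are orthogonal
```

Because the first two descriptors are strongly correlated in this example, almost all of the variance ends up in the first component, which is the situation where variable reduction pays off.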
Exploring QSAR
 Download the NONLIN program
 http://www.trinity.edu/sbachrac/drugdesign2007/
 Unzip and install it on your computer
 Read the Read.Me and Nonlin.doc
documentation
 Open the HeatForm.NLR file with any
word processor
Running NONLIN
 Start an MS-DOS window
 Change to the directory where the code is
 cd /d d:\nonlin
 Execute the program with the data file
 nonlin heatForm > output
Assignment
 Propose a QSAR scheme to predict the
ΔHf of the alkanes
Early Examples
 Hammett (1930s-1940s)
C6H5COOH ⇌ C6H5COO⁻ + H⁺   ($K_0$)
p-X-C6H4COOH ⇌ p-X-C6H4COO⁻ + H⁺   ($K_p$)
m-X-C6H4COOH ⇌ m-X-C6H4COO⁻ + H⁺   ($K_m$)

$\sigma_{para} = \log_{10}(K_p / K_0)$
$\sigma_{meta} = \log_{10}(K_m / K_0)$
Hammett (cont.)
 Now suppose we have a related series
X-C6H4CH2COOH ⇌ X-C6H4CH2COO⁻ + H⁺   ($K'_x$)

$\log_{10}(K'_x / K'_0) = \rho\sigma$

 σ reflects sensitivity to the substituent
 ρ reflects sensitivity to a different system
Hammett (cont.)
 Linear Free Energy Relationship
$\Delta G = -2.303RT \log_{10} K$
So
$\Delta G - \Delta G_0 = -2.303RT\,\sigma$
and
$\Delta G' - \Delta G'_0 = -2.303RT\,\rho\sigma$
Therefore
$\Delta G' - \Delta G'_0 = \rho\,(\Delta G - \Delta G_0)$
Free-Wilson Analysis
 $\log(1/C) = \sum_i a_i + \mu$
where C is the concentration giving a defined response (so log 1/C is the predicted activity),
$a_i$ is the contribution per group, and
μ is the activity of the reference
Free-Wilson example
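[Structure not recovered: parent compound containing Br and N, as the HCl salt, with variable substituents X and Y; activity of analogs]

log 1/C = -0.30 [m-F] + 0.21 [m-Cl] + 0.43 [m-Br] + 0.58 [m-I] + 0.45 [m-Me]
          + 0.34 [p-F] + 0.77 [p-Cl] + 1.02 [p-Br] + 1.43 [p-I] + 1.26 [p-Me] + 7.82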

Problems: at least two substituent positions must be varied, and the model can
only predict the activity of new combinations of the
substituents used in the analysis.
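The fitted Free-Wilson equation can be evaluated with a simple lookup. In this sketch the group contributions and the reference value 7.82 are copied from the equation above; the function name is hypothetical, and an unsubstituted position (H) is assumed to contribute zero.

```python
# Group contributions a_i, copied from the fitted equation in the example
CONTRIBUTIONS = {
    "m-F": -0.30, "m-Cl": 0.21, "m-Br": 0.43, "m-I": 0.58, "m-Me": 0.45,
    "p-F": 0.34, "p-Cl": 0.77, "p-Br": 1.02, "p-I": 1.43, "p-Me": 1.26,
}
MU = 7.82  # activity of the reference compound

def free_wilson_log_inv_c(substituents):
    """Predicted log 1/C for a list such as ["m-Cl", "p-Br"]; H positions are left out."""
    # A substituent that was not in the fitted set raises KeyError:
    # the model cannot extrapolate beyond the groups used in the analysis.
    return MU + sum(CONTRIBUTIONS[s] for s in substituents)

print(round(free_wilson_log_inv_c(["m-Cl", "p-Br"]), 2))  # 0.21 + 1.02 + 7.82 = 9.05
```

The `KeyError` for an unknown group is deliberate here, since it mirrors the limitation that only combinations of already-studied substituents can be predicted.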
Hansch Analysis

$\log(1/C) = a\pi + b\sigma + c$

where
$\pi(x) = \log P_{RX} - \log P_{RH}$

and log P is the logarithm of the octanol/water partition coefficient

This is also a linear free energy relation


Molecular Descriptors
 Simple rules for describing some aspect of a molecule
 Structure
 Property
 2D descriptors only use the atoms and connection
information of the molecule
 Internal 3D descriptors use 3D coordinate information
about each molecule; however, they are invariant to
rotations and translations of the conformation
 External 3D descriptors also use 3D coordinate
information but also require an absolute frame of
reference (e.g., molecules docked into the same
receptor).
Descriptor examples
 Physical Properties
 MW
 log P (octanol/water partition)
 bp, mp
 Dipole moment
 solubility
Descriptor examples
 Structural descriptors
 2D
 Atom/Bond counts
 Number non-H atoms
 Number of rotatable bonds
 Number of each functional group
 2C chains, 3C chains, 4C chains, 5C chains, etc.
 Rings and their size
 3D
 Number of accessible conformations
 Surface area
Topological Descriptors
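 Wiener Path Index
[Figure: branched seven-atom skeleton with atoms numbered 1-7]
Distance Matrix
     1  2  3  4  5  6  7
1    0  1  2  3  4  2  3
2    1  0  1  2  3  1  2
3    2  1  0  1  2  2  1
4    3  2  1  0  1  3  2
5    4  3  2  1  0  4  3
6    2  1  2  3  4  0  3
7    3  2  1  2  3  3  0

$w = \sum_i \sum_{j>i} d_{ij}$,  w = 46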
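The Wiener index can be computed from a bond list by a breadth-first search from every atom. In the sketch below the bond list is an assumption: it is inferred from the distance matrix above (atoms at distance 1 are bonded), which corresponds to the carbon skeleton of 2,3-dimethylpentane. The function names are hypothetical.

```python
from collections import deque

def distance_matrix(n_atoms, bonds):
    """Topological distances (number of bonds) between all atom pairs, via BFS."""
    neighbors = {i: [] for i in range(1, n_atoms + 1)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    dist = {}
    for start in neighbors:
        seen = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dist[start] = seen
    return dist

def wiener_index(n_atoms, bonds):
    """Sum of d_ij over all pairs with j > i."""
    dist = distance_matrix(n_atoms, bonds)
    return sum(dist[i][j]
               for i in range(1, n_atoms + 1)
               for j in range(i + 1, n_atoms + 1))

# Bond list inferred from the distance matrix (assumption): chain 1-2-3-4-5,
# with atom 6 attached to atom 2 and atom 7 attached to atom 3
BONDS = [(1, 2), (2, 3), (3, 4), (4, 5), (2, 6), (3, 7)]
dist = distance_matrix(7, BONDS)
print(wiener_index(7, BONDS))  # 46
```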
Topological Descriptors
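Randić Index
[Figure: branched seven-atom skeleton annotated with vertex valences, bond values and edge terms]
 Valence at vertex: 1, 2 or 3
 Bond values as product of the above: 3, 3, 9, 3, 6, 2
 Edge term as reciprocal of square root of the above bond values: .577, .577, .333, .577, .408, .707
 Sum of edge terms: 3.179 (from the rounded edge terms)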
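The same recipe (vertex valence, bond product, reciprocal square root, sum) is short to code. The bond list below is an assumption, chosen because it reproduces the bond values 3, 3, 9, 3, 6, 2 shown above; it is the carbon skeleton of 2,3-dimethylpentane. The function name is hypothetical.

```python
import math

def randic_index(bonds):
    """Sum over bonds of 1 / sqrt(valence_i * valence_j)."""
    valence = {}
    for a, b in bonds:
        # Valence at a vertex = number of bonds that touch it
        valence[a] = valence.get(a, 0) + 1
        valence[b] = valence.get(b, 0) + 1
    return sum(1.0 / math.sqrt(valence[a] * valence[b]) for a, b in bonds)

# Assumed skeleton: chain 1-2-3-4-5 with branches 6 (on atom 2) and 7 (on atom 3)
BONDS = [(1, 2), (2, 3), (3, 4), (4, 5), (2, 6), (3, 7)]
print(round(randic_index(BONDS), 3))
```

At full precision the sum comes out near 3.181; the 3.179 on the slide is what results from adding the edge terms after rounding each to three decimals.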
Predict bp of alkanes
[Figure: boiling point (bp) vs. Wiener Index for alkanes, with linear fit y = 1.5225x + 7.2917, R² = 0.9547]
3D Molecular Descriptors
 Potential energy
 Solvation energy
 Water accessible surface area
 Water accessible surface area of all
atoms with positive (negative) partial
charge
Pharmacophore
 Specification of the spatial arrangement
of a small number of atoms or
functional groups
 With the model in hand, search
databases for molecules that fit this
spatial environment
Creating a Pharmacophore

[Figure: chemical structures containing O and OH groups; image not recovered]
3D Pharmacophore searching
 With the pharmacophore in hand,
search databases of 3-D molecular
structures for molecules
that fit
 Can rank these “hits” using the scoring
system described later
Pharmacophore Descriptors
 Number of acidic atoms
 Number of basic atoms
 Number of hydrogen bond donor atoms
 Number of hydrophobic atoms
 Sum of VDW surface areas of hydrophobic atoms
Lipinski’s Rule of 5
 Potential drug candidates should
 Have 5 or fewer H-bond donors (expressed as the
sum of OHs and NHs)
 Have a MW < 500
 Have a log P less than 5
 Have 10 or fewer H-bond acceptors (expressed as
the sum of Ns and Os)

Adv. Drug Delivery Rev., 1997, 23, 3
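The four criteria are easy to apply as a filter once the descriptor values are known. The sketch below takes the values as plain numbers (computing them from a structure is outside its scope) and uses the thresholds as stated on the slide; the function name and the example values are hypothetical.

```python
def lipinski_violations(h_donors, mol_weight, log_p, h_acceptors):
    """Return a list describing which of the four criteria a molecule fails."""
    violations = []
    if h_donors > 5:
        violations.append("more than 5 H-bond donors")
    if mol_weight >= 500:
        violations.append("MW not below 500")
    if log_p >= 5:
        violations.append("log P not below 5")
    if h_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    return violations

# Hypothetical descriptor values for two candidate molecules
print(lipinski_violations(h_donors=1, mol_weight=180.2, log_p=1.2, h_acceptors=4))
print(lipinski_violations(h_donors=6, mol_weight=650.0, log_p=5.5, h_acceptors=12))
```

Returning the list of failed criteria, rather than a single pass/fail flag, makes it easy to apply a looser policy such as tolerating one violation.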


Docking
 Interact a ligand with a receptor
 Need to do the following
 A) select appropriate ligands
 B) select appropriate conformation of receptor
 C) select appropriate conformations of ligands
 D) combine the ligand and receptor (docking)
 E) evaluate these combinations and rank order
them
Selection of Ligands
 Want drug-like molecules
 250< MW < 500
 Lipinski’s rules
 Search through databases
 Available Chemicals Directory (ACD)
 World Drug Index
 NCI Drug database
 In-house databases
Receptor Conformation
 Usually the receptor is assumed to be static
 Get structure from X-ray or NMR
experiment
 Protein Data Bank (http://www.rcsb.org/pdb/)
41385 Structures
Ligand Conformation
 Rigid or flexible
 If rigid, optimize the structure then use it
throughout the docking procedure
 If flexible, can
 A) create a set of low energy conformations and
then use this set as a collection of rigid structures
in docking
 B) optimize structure within active site of receptor,
i.e. dock and optimize together
Docking
 Place ligand in appropriate location for
interacting with the receptor
 Methodological problem:
 1) No best method for defining shape
 2) No general solution for packing irregular
objects (the knapsack problem)
Docking Algorithmic Components
 Receptor and ligand description (keep in mind
relative errors of structures, etc.)
 Bind the ligand to the receptor
(configuration/conformation search)
 Geometric search (match ligand and receptor site
descriptions)
 Search for minimum energy: molecular dynamics
(MD) or Monte Carlo (MC)
 Evaluation of the dock (ΔGbind), also called
scoring
Descriptor Matching Method
DOCK program
 1) Generate molecular surface for receptor
 2) Generate spheres to fill the active site (usually 30-50 spheres)
 3) Match sphere centers to the ligand atoms
(originally just the lowest-energy conformer; now multiple
conformers are used, but each is still rigid). Generates 10K orientations per
ligand. Shape-driven!
 4) Score the interaction
Fragment-Joining Method
FlexX, LUDI
 Place base fragments into microstates
of the active site (fragments can be small
molecules like benzene, formaldehyde,
formamide, naphthol, etc.)
 Optimize position of the base fragment
 Join fragments with small connecting
chains made of CH2, CO, CONH, etc.
Scoring (evaluation of the dock)
 Want to quickly evaluate the strength of
the interaction between ligand and
receptor
 Full free energy computation
 Expensive
 Requires excellent force fields
 Empirical method
 Fast and cheap
 Requires fitting to a broad set of ligand/receptor
complexes
Empirical Scoring
 Method of Bohm (LUDI, FlexX, etc.)
$\Delta G_{bind} = \Delta G_0 + \Delta G_{hb} \sum_{\text{h-bonds}} f(\Delta R, \Delta\alpha) + \Delta G_{ion} \sum_{\text{ion}} f(\Delta R, \Delta\alpha) + \Delta G_{lipo} A_{lipo} + \Delta G_{rot}\, \mathrm{NROT}$

ΔG0: reduction in binding energy due to loss of rotation and translation of the ligand
ΔGhb: contribution from an ideal hydrogen bond
ΔGion: contribution from ionic interactions
ΔGlipo: contribution from lipophilic interactions
ΔGrot: contribution from freezing rotations within the ligand
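Bohm Method (cont.)
 f(ΔR, Δα) are penalty functions for non-ideal
interactions: distances too short/long, angles
not linear
$f(\Delta R, \Delta\alpha) = f_1(\Delta R)\, f_2(\Delta\alpha)$

$f_1(\Delta R) = \begin{cases} 1, & \Delta R < 0.2\ \text{Å} \\ 1 - (\Delta R - 0.2)/0.4, & 0.2 \le \Delta R \le 0.6\ \text{Å} \\ 0, & \Delta R > 0.6\ \text{Å} \end{cases}$

$f_2(\Delta\alpha) = \begin{cases} 1, & \Delta\alpha < 30° \\ 1 - (\Delta\alpha - 30)/50, & 30° \le \Delta\alpha \le 80° \\ 0, & \Delta\alpha > 80° \end{cases}$

ΔR is the deviation from the ideal H...O/N distance of 1.9 Å
Δα is the deviation from the ideal N/O-H…O/N angle of 180°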
Bohm Method (cont.)
 Alipo is the lipophilic contact surface,
evaluated by a coarse grid of boxes
 NROT is the number of rotatable bonds
– acyclic sp3-sp3, sp3-sp2 and sp2-sp2. No
terminal groups or flexibility of rings
incorporated.
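The piecewise penalty functions f1 and f2 defined earlier for hydrogen bonds and ionic interactions can be written down directly. This is a minimal sketch of those two functions and their product only; the function names are hypothetical, and the deviations are assumed to be supplied as non-negative magnitudes.

```python
def f1(delta_r):
    """Distance penalty; delta_r is the deviation (in Å) from the ideal 1.9 Å."""
    if delta_r < 0.2:
        return 1.0
    if delta_r <= 0.6:
        return 1.0 - (delta_r - 0.2) / 0.4   # falls linearly from 1 to 0
    return 0.0

def f2(delta_alpha):
    """Angle penalty; delta_alpha is the deviation (in degrees) from the ideal 180°."""
    if delta_alpha < 30.0:
        return 1.0
    if delta_alpha <= 80.0:
        return 1.0 - (delta_alpha - 30.0) / 50.0   # falls linearly from 1 to 0
    return 0.0

def hbond_penalty(delta_r, delta_alpha):
    """Combined penalty f = f1 * f2 for one hydrogen bond or ionic contact."""
    return f1(delta_r) * f2(delta_alpha)

print(hbond_penalty(0.1, 10.0))   # near-ideal geometry: full contribution
print(hbond_penalty(0.4, 55.0))   # halfway down both ramps
print(hbond_penalty(0.7, 0.0))    # too long: no contribution
```

Each hydrogen bond or ionic contact then contributes its ΔG term scaled by this factor, so a distorted geometry is down-weighted smoothly instead of being counted or discarded outright.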

H.-J. Bohm, J. Comput.-Aided Mol. Des., 1994, 8, 243-256


Scoring alternatives
 Many variations on Bohm scheme
 Buried Polar term, desolvation term, different
forms for the lipophilic term, include metal
bonding, etc.
 Combine scoring functions, i.e. QSAR with
scoring functions as variables
 Use empirical score to select set of hits, then
refine with free energy minimization
