Lecture 04 Alignment


Sequence Alignment

Lecture No. 04
Sequence Alignment
Procedure for comparing two or more
sequences by searching for a series of
individual characters or character patterns
that are in the same order in the sequences.
Sequence Alignment
Two sequences are aligned by writing them across a
page in two rows.
Identical or similar characters are placed in the same
column, and non-identical characters can either be
placed in the same column as a mismatch or opposite a
gap in the other sequence.
In an optimal alignment, non-identical characters and
gaps are placed to bring as many identical or
similar characters as possible into vertical register.
Example sequence alignment
Task: align “abcdef” with “abdgf”
Write second sequence below the first
abcdef
abdgf
Move sequences to give maximum match between
them
Show characters that match using vertical bar
Example sequence alignment
abcdef
||
abdgf
Insert a gap between b and d in the lower
sequence to allow d and f to align
Example sequence alignment
abcdef
|| | |
ab-dgf
Note e and g don’t match
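The row of vertical bars can be produced mechanically once the two rows are aligned. A minimal Python sketch, assuming both rows already have the same length and use "-" for a gap; the function name match_line is illustrative and not part of the lecture:

```python
def match_line(top: str, bottom: str, gap: str = "-") -> str:
    """Return the row of vertical bars drawn between two aligned sequences."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    # A bar marks a column where both rows hold the same non-gap character.
    return "".join(
        "|" if a == b and a != gap else " "
        for a, b in zip(top, bottom)
    )


top, bottom = "abcdef", "ab-dgf"
print(top)
print(match_line(top, bottom))
print(bottom)
```

Columns holding a gap or a mismatch (such as e over g) are left blank, which is how the slide marks them.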
Sequence Alignment
Sequence alignment is easy with sufficiently
closely related sequences.
Below a certain level of identity, sequence
alignment may become meaningless.
The twilight zone for sequence identity is ~30%.
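The level of identity can be computed directly from an alignment. A small Python sketch, assuming the rows are already aligned with "-" as the gap character and that identity is measured over all alignment columns (conventions for the denominator differ, so treat this choice as an assumption); percent_identity is an illustrative name:

```python
def percent_identity(top: str, bottom: str, gap: str = "-") -> float:
    """Percentage of alignment columns holding identical, non-gap characters."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    if not top:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in zip(top, bottom) if a == b and a != gap)
    # Assumption: the denominator is the number of alignment columns, gaps included.
    return 100.0 * matches / len(top)


print(round(percent_identity("abcdef", "ab-dgf"), 1))  # 66.7
```

The abcdef/ab-dgf example has 4 identical columns out of 6, well above the ~30% level mentioned above.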

Global vs. Local Alignment
We distinguish:
Global alignment algorithms, which optimize the
overall alignment between two sequences
Local alignment algorithms, which seek only
relatively conserved pieces of sequence
Alignment stops at the ends of regions of strong
similarity
Favors finding conserved nucleotide patterns in DNA
sequences or amino acid patterns in protein
sequences.
Global Alignment
In global alignment, an attempt is made to align the
entire sequence, using as many characters as
possible, up to both ends of each sequence.
Sequences that are quite similar and approximately
the same length are suitable candidates for
global alignment.
Vertical bars between the sequences indicate the
presence of identical amino acids.
Local Alignment
Stretches of sequence with the highest
density of matches are aligned.
It is more suitable for aligning sequences
that are similar along some of their lengths
but dissimilar in others, sequences that
differ in length, or sequences that share
a conserved region or domain.
Global vs. Local Alignment
Global
LGPSSKQTGKGS-SRIWDN
|    |  |||   |  |
LN-ITKSAGKGAIMRLGDA
Local
--------GKG--------
        |||
--------GKG--------
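The difference between the two modes can be seen by scoring the same pair of sequences both ways. The sketch below uses dynamic programming in the style of Needleman-Wunsch (global) and Smith-Waterman (local). The scoring values (match +1, mismatch -1, gap -1) and the function name alignment_scores are assumptions made for illustration, not values given in the lecture:

```python
def alignment_scores(s, t, match=1, mismatch=-1, gap=-1):
    """Return (global_score, local_score) for sequences s and t."""
    rows, cols = len(s) + 1, len(t) + 1
    glob = [[0] * cols for _ in range(rows)]  # global: must span both sequences
    loc = [[0] * cols for _ in range(rows)]   # local: may start and stop anywhere
    # In the global table, leading gaps are charged; the local table stays at 0.
    for i in range(1, rows):
        glob[i][0] = i * gap
    for j in range(1, cols):
        glob[0][j] = j * gap
    best_local = 0
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            glob[i][j] = max(glob[i - 1][j - 1] + sub,
                             glob[i - 1][j] + gap,
                             glob[i][j - 1] + gap)
            # The floor of 0 lets a local alignment restart after a poor region.
            loc[i][j] = max(0,
                            loc[i - 1][j - 1] + sub,
                            loc[i - 1][j] + gap,
                            loc[i][j - 1] + gap)
            best_local = max(best_local, loc[i][j])
    return glob[-1][-1], best_local


# The two protein fragments from the slide, with gap characters removed.
g, l = alignment_scores("LGPSSKQTGKGSSRIWDN", "LNITKSAGKGAIMRLGDA")
print("global score:", g)
print("local score:", l)
```

Under these assumed scores the local score can never be lower than the global one, because the full end-to-end alignment is itself one of the candidates the local search considers.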
Why Do You Choose
Local vs Global
Choose a local alignment when
• DNA sequences encode genes with introns
• Amino acid sequences encoding proteins

Choose a global alignment when
• Sequences can be seen to be very similar
• Similar regions are in the same order and orientation.
Significance of Sequence
Alignment
It is used
• To find whether two or more genes or proteins are evolutionarily
related to each other
• To find structurally or functionally similar regions within proteins
• To highlight conserved regions/sites
• To highlight variable regions/sites
• To uncover changes in gene structure
• To summarize sequence information
Sequence Alignment Methods

Pair-wise alignment: Compare two
sequences
Multiple sequence alignment: Compare
more than two sequences
Methods for Pair-wise Alignment
Dot matrix analysis
Dynamic Programming
Word or k-tuple methods (FASTA and
BLAST)
Sequence comparison with
dot matrices
Goal: Graphically display regions of
similarity between two sequences (e.g.,
domains in common between two proteins
of suspected similar function)
Sequence comparison
with dot matrices
Basic Method: For two sequences of
lengths M and N, lay out an M by N grid
(matrix) with one sequence across the top
and one sequence down the left side. For
each position in the grid, compare the
sequence elements at the top (column) and
to the left (row). If and only if they are the
same, place a dot at that position.
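The basic method translates almost line for line into code. A minimal Python sketch, assuming exact character identity as the only criterion for a dot; the names dot_matrix and render, and the choice of "*" and "." as symbols, are illustrative:

```python
def dot_matrix(top: str, left: str):
    """Grid with one row per character of `left` and one column per character of `top`."""
    # A cell is True (a dot) if and only if the row and column characters are the same.
    return [[r == c for c in top] for r in left]


def render(top: str, left: str) -> str:
    """Draw the grid as text: '*' where characters match, '.' elsewhere."""
    lines = ["  " + top]
    for r, row in zip(left, dot_matrix(top, left)):
        lines.append(r + " " + "".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


print(render("abcdef", "abdgf"))
```

Regions of similarity show up as runs of dots along a diagonal; a gap in one sequence shifts the run onto a neighbouring diagonal.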
Examples for protein sequences
(Sequence)
abcdaefghbijklcmnopd
abcdaefghbijklcmnopd
Examples for protein sequences
(Sequence)
abcdeedcbafghijklmno
abcdeedcbafghijklmno
Examples for protein sequences
(Sequence)
abcdaefghbijklcmnopd
abcdefghijklmnopqrst
Uses for dot matrices
Dot matrices can be used to align two proteins
or two nucleic acid sequences
They can also be used to find amino acid repeats within a
protein by comparing a protein sequence to
itself
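When a sequence is compared with itself, the main diagonal is always filled, so the informative dots are the ones off the diagonal. A short Python sketch that lists them, assuming exact matches and zero-based positions; because the self-comparison matrix is symmetric, only the half above the diagonal is reported. The name repeat_dots is illustrative:

```python
def repeat_dots(seq: str):
    """Off-diagonal dots (i, j), i < j, from comparing seq with itself."""
    return [
        (i, j)
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] == seq[j]
    ]


seq = "abcdaefghbijklcmnopd"  # example sequence from the slides
for i, j in repeat_dots(seq):
    print(f"{seq[i]!r} occurs at positions {i} and {j}")
```

Isolated dots like these mark single repeated characters; a repeated stretch of several residues would appear as a run of dots parallel to the main diagonal.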
