Edit Dist
Edit Dist
Edit Dist
Edit
Distance
An alignment:
-AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---
TAG-CTATCAC--GACCGC--GGTCGATTTGCCCGAC
D(i-1,j) + 1
D(i,j)= min D(i,j-1) + 1
D(i-1,j-1) + 2; if X(i) Y(j)
0; if X(i) = Y(j)
Termination:
D(N,M) is distance
The Edit Distance Table
N 9
O 8
I 7
T 6
N 5
E 4
T 3
N 2
I 1
# 0 1 2 3 4 5 6 7 8 9
# E X E C U T I O N
The Edit Distance Table
N 9
O 8
I 7
T 6
N 5
E 4
T 3
N 2
I 1
# 0 1 2 3 4 5 6 7 8 9
# E X E C U T I O N
Edit Distance
N 9
O 8
I 7
T 6
N 5
E 4
T 3
N 2
I 1
# 0 1 2 3 4 5 6 7 8 9
# E X E C U T I O N
The Edit Distance Table
N 9 8 9 10 11 12 11 10 9 8
O 8 7 8 9 10 11 10 9 8 9
I 7 6 7 8 9 10 9 8 9 10
T 6 5 6 7 8 9 8 9 10 11
N 5 4 5 6 7 8 9 10 11 10
E 4 3 4 5 6 7 8 9 10 9
T 3 4 5 6 7 8 7 8 9 8
N 2 3 4 5 6 7 8 7 8 7
I 1 2 3 4 5 6 7 6 7 8
# 0 1 2 3 4 5 6 7 8 9
# E X E C U T I O N
Minimum
Edit Computing Minimum Edit
Distance
Distance
Minimum
Edit Backtrace for Computing
Alignments
Distance
Computing alignments
T 6
N 5
E 4
T 3
N 2
I 1
# 0 1 2 3 4 5 6 7 8 9
# E X E C U T I O N
MinEdit with Backtrace
Adding Backtrace to Minimum Edit Distance
D(i-1,j) + 1 deletion
xN
Every non-decreasing path
corresponds to
an alignment
of the two sequences
x0
An optimal alignment is composed
y0 yM of optimal subalignments
SLIDE ADAPTED FROM SERAFIM BATZOGLOU WITH PERMISSION
Result of Backtrace
Two strings and their alignment:
Performance
Time:
O(nm)
Space:
O(nm)
Backtrace
O(n+m)
Minimum
Edit Backtrace for Computing
Alignments
Distance
Minimum
Edit Weighted Minimum Edit
Distance
Distance
Weighted Edit Distance
Why would we add weights to the computation?
Spell Correction: some letters are more likely to be
mistyped than others
Biology: certain kinds of deletions or insertions are more
likely than others
Confusion matrix for spelling errors
Weighted Min Edit Distance
Initialization:
D(0,0) = 0
D(i,0) = D(i-1,0) + del[x(i)]; 1 < i
D(0,j) = D(0,j-1) + ins[y(j)]; 1 < j M
Recurrence Relation:
D(i-1,j) + del[x(i)]
D(i,j)= min D(i,j-1) + ins[y(j)]
D(i-1,j-1) + sub[x(i),y(j)]
Termination:
D(N,M) is distance
Minimum
Edit Weighted Minimum Edit
Distance
Distance
Minimum
Edit Minimum Edit Distance in
Computational Biology
Distance