INTRODUCTION
2.1 Definition of the Problem
Modern estimation theory is applied in many areas such as
• Radar
• Sonar
• Speech
• Image analysis
• Biomedicine and biomedical engineering
• Communications
• Control
• Seismology, etc.
The problem is to estimate the values of a group of parameters. As an example, consider a radar system in which we are interested in determining the range 𝑅 of an aircraft. To determine the range, we transmit an electromagnetic pulse that is reflected by the aircraft, causing an echo to be received by the antenna 𝜏0 seconds later. The range is determined by the equation 𝜏0 = 2𝑅/𝑐, where 𝑐 is the speed of electromagnetic propagation. The received echo is corrupted by environmental, electromagnetic, and thermal noise; moreover, the electronic system itself introduces an additional time delay. The radar system converts the continuous waveform to digital form by sampling the received data, and a computer processes the resulting time series to estimate the value of the range.
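As a minimal sketch of this procedure, the following simulates a sampled received waveform, estimates the round-trip delay by cross-correlating with the transmitted pulse, and converts it to range via 𝑅 = 𝑐𝜏0/2. The pulse shape, sampling rate, noise level, and true range are assumed values for illustration; only the relation 𝜏0 = 2𝑅/𝑐 comes from the text.

```python
import numpy as np

c = 3e8            # speed of electromagnetic propagation (m/s)
fs = 10e6          # sampling rate (Hz) -- assumed
R_true = 15e3      # true range (m) -- assumed
tau0 = 2 * R_true / c                  # round-trip delay (s)
delay_samples = int(round(tau0 * fs))  # delay in samples

rng = np.random.default_rng(0)
pulse = np.ones(20)                        # simple rectangular pulse (assumed)
n_total = 4000
received = rng.normal(0.0, 0.3, n_total)   # thermal/environmental noise
received[delay_samples:delay_samples + len(pulse)] += pulse  # delayed echo

# Estimate the delay as the lag that maximizes the cross-correlation
# between the received data and the known pulse shape.
corr = np.correlate(received, pulse, mode="valid")
tau_hat = np.argmax(corr) / fs
R_hat = c * tau_hat / 2
print(R_hat)
```

With the noise level used here, the correlation peak reliably falls at the true delay, so the range estimate is close to the assumed 15 km.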
Like the radar system, all the abovementioned systems are faced with the problem of
extracting values of parameters.
• This is the problem of parameter estimation, which is the subject of this course.
• We need to determine the estimator 𝑔.
• We need to determine the length of the data, 𝑁.
• How close is 𝜃̂ to 𝜃?
𝑝(𝑥[0]; 𝜃) = (1/√(2𝜋𝜎²)) exp[−(𝑥[0] − 𝜃)²/(2𝜎²)]
• Based on the observations {𝑥[0], 𝑥[1], ⋯ , 𝑥[𝑁 − 1]}, we would like to estimate 𝐴.
• Intuitively, since 𝐴 is the average level of 𝑥[𝑛], it would be reasonable to estimate 𝐴 as
𝐴̂ = (1/𝑁) ∑_{𝑛=0}^{𝑁−1} 𝑥[𝑛]
[Figure: a realization of the noisy data 𝑥[𝑛]]
Another estimator is
𝐴̌ = 𝑥[0]
Intuitively, it will not perform well, since it does not make use of all the data; there is no averaging to reduce the noise effects. However, for this data set, 𝐴̌ = 0.95 turns out to be closer to the true value than 𝐴̂. Can we conclude that 𝐴̌ is a better estimator? The answer is, of course, no.
• Since an estimator is a function of the data, which are random variables, it too is a random variable.
𝐸[𝐴̌] = 𝐸(𝑥[0]) = 𝐴
var(𝐴̌) = var(𝑥[0]) = 𝜎 2
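A quick Monte Carlo experiment makes this concrete: both estimators are unbiased, but the variance of the sample mean is 𝜎²/𝑁, far smaller than the variance 𝜎² of the single-sample estimator. The values of 𝐴, 𝜎, 𝑁, and the number of trials below are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
A, sigma, N, trials = 1.0, 1.0, 100, 20000

# Generate many independent realizations of x[n] = A + w[n].
x = A + sigma * rng.standard_normal((trials, N))
A_hat = x.mean(axis=1)     # sample-mean estimator for each realization
A_check = x[:, 0]          # single-sample estimator x[0]

# Both empirical means are close to A, but var(A_hat) ~ sigma^2/N
# while var(A_check) ~ sigma^2.
print(A_hat.mean(), A_check.mean())
print(A_hat.var(), A_check.var())
```

On any single realization 𝐴̌ may happen to be closer, but averaged over many realizations the sample mean is far more reliable, which is the sense in which it is the better estimator.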
The shorthand notation 𝑥 ~ 𝒩(𝜇𝑥, 𝜎𝑥²) is often used. Consider an iid random process 𝑥(𝑛), 𝑛 = 0, 1, ⋯ , 𝑁 − 1, each sample of which is distributed as 𝒩(𝜇𝑥, 𝜎𝑥²). Then 𝑥(𝑛) is distributed according to
𝑝(𝑥(𝑛)) = (1/(2𝜋𝜎𝑥²)^{𝑁/2}) exp[−(1/(2𝜎𝑥²)) ∑_{𝑛=0}^{𝑁−1} (𝑥(𝑛) − 𝜇𝑥)²]
Now consider the model 𝑥(𝑛) = 𝑑(𝑛) + 𝑤(𝑛), where 𝑤(𝑛) ~ 𝒩(0, 𝜎²) is an iid random process. Then the pdf of 𝑥(𝑛) can be written as
𝑝(𝑥(𝑛)) = (1/(2𝜋𝜎²)^{𝑁/2}) exp[−(1/(2𝜎²)) ∑_{𝑛=0}^{𝑁−1} (𝑥(𝑛) − 𝑑(𝑛))²]
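Because the samples are independent, this joint pdf is simply the product of 𝑁 univariate Gaussian pdfs with means 𝑑(𝑛). The sketch below checks this numerically for an assumed signal 𝑑(𝑛) (a sinusoid chosen purely for illustration) and an assumed noise variance.

```python
import numpy as np

rng = np.random.default_rng(2)
N, sigma = 50, 0.5
d = np.sin(2 * np.pi * 0.05 * np.arange(N))   # assumed known signal d(n)
x = d + sigma * rng.standard_normal(N)        # x(n) = d(n) + w(n)

def joint_pdf(x, d, sigma):
    # Closed-form joint pdf of the N samples, as given above.
    N = len(x)
    return (2 * np.pi * sigma**2) ** (-N / 2) * np.exp(
        -np.sum((x - d) ** 2) / (2 * sigma**2))

def univariate_pdf(xi, mu, sigma):
    # Single-sample Gaussian pdf N(mu, sigma^2).
    return np.exp(-(xi - mu) ** 2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

p_joint = joint_pdf(x, d, sigma)
p_prod = np.prod(univariate_pdf(x, d, sigma))
print(p_joint, p_prod)   # the two values agree
```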
Notice that 𝐂𝐱 is an 𝑛 × 𝑛 symmetric matrix with [𝐂𝐱]𝑖𝑗 = 𝐸{(𝑥𝑖 − 𝐸[𝑥𝑖])(𝑥𝑗 − 𝐸[𝑥𝑗])} = cov(𝑥𝑖, 𝑥𝑗). 𝐂𝐱 is assumed to be positive definite so that it is invertible. If 𝐂𝐱 is a diagonal matrix, then the random variables are uncorrelated.
If the random variables are uncorrelated, 𝑝(𝐱) factors into a product of 𝑁 univariate Gaussian pdfs, and hence the random variables are also independent.
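This factorization can be verified numerically: for a diagonal 𝐂𝐱, the multivariate Gaussian pdf evaluated at a point equals the product of the univariate Gaussian pdfs of the components. The means, variances, and evaluation point below are assumed example values.

```python
import numpy as np

mu = np.array([1.0, -2.0, 0.5])       # assumed means
var = np.array([0.5, 2.0, 1.0])       # assumed diagonal of C_x
C = np.diag(var)
x = np.array([1.2, -1.5, 0.0])        # assumed evaluation point

n = len(x)
diff = x - mu
# Full multivariate Gaussian pdf with covariance C.
p_multi = np.exp(-0.5 * diff @ np.linalg.inv(C) @ diff) / np.sqrt(
    (2 * np.pi) ** n * np.linalg.det(C))
# Product of the n univariate Gaussian pdfs.
p_prod = np.prod(np.exp(-diff**2 / (2 * var)) / np.sqrt(2 * np.pi * var))
print(p_multi, p_prod)   # identical up to floating-point error
```

Note that this equivalence of uncorrelatedness and independence is special to the Gaussian distribution; it does not hold in general.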
If 𝐱 is linearly transformed as
𝐲 = 𝐀𝐱 + 𝐛
where 𝐀 is 𝑚 × 𝑛 and 𝐛 is 𝑚 × 1, with 𝑚 ≤ 𝑛 and 𝐀 full rank (so that 𝐂𝐲 is nonsingular), then
𝐲 is also distributed according to a multivariate Gaussian distribution with
𝐸[𝐲] = 𝛍𝐲 = 𝐀𝛍𝐱 + 𝐛
and
𝐸[(𝐲 − 𝛍𝐲)(𝐲 − 𝛍𝐲)ᵀ] = 𝐂𝐲 = 𝐀𝐂𝐱𝐀ᵀ
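These two moment relations can be checked empirically by drawing many samples of 𝐱 and comparing the sample mean and covariance of 𝐲 = 𝐀𝐱 + 𝐛 with 𝐀𝛍𝐱 + 𝐛 and 𝐀𝐂𝐱𝐀ᵀ. The particular 𝐀, 𝐛, 𝛍𝐱, and 𝐂𝐱 below are assumed example values.

```python
import numpy as np

rng = np.random.default_rng(3)
mu_x = np.array([1.0, 0.0, -1.0])
C_x = np.array([[2.0, 0.5, 0.0],
                [0.5, 1.0, 0.3],
                [0.0, 0.3, 1.5]])     # symmetric positive definite
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])      # 2 x 3, full rank (m <= n)
b = np.array([0.5, -0.5])

# Draw many samples of x ~ N(mu_x, C_x) and transform them.
x = rng.multivariate_normal(mu_x, C_x, size=200000)
y = x @ A.T + b

mu_y_theory = A @ mu_x + b            # A mu_x + b
C_y_theory = A @ C_x @ A.T            # A C_x A^T
print(y.mean(axis=0), mu_y_theory)    # sample mean vs. theory
print(np.cov(y.T), C_y_theory)        # sample covariance vs. theory
```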
2.5 HOMEWORK 2:
Research homework:
1. What is an independent and identically distributed (iid) random process?
a. Definition
b. Properties
c. What is a Gaussian iid process
2. What is a circularly symmetric iid Gaussian process? How does it differ from an ordinary iid Gaussian process?