Nothing Special   »   [go: up one dir, main page]

Zero Padding

Download as doc, pdf, or txt
Download as doc, pdf, or txt
You are on page 1of 15

One of the fundamental principles of discrete signals is that "zero padding" in one domain results in an

increased sampling rate in the other domain. For example, the most common form of zero padding is
to append a string of zero-valued samples to the end of some time-domain sequence. Looking at a
simple example of this notion, Figure 1(a) shows a Hanning window sequence w(n) defined by 32 time
samples. If we take a 32-point DFT of w(n), and just look at the magnitudes, we get the W dB(m)
spectral magnitude plot shown in Figure 1(b).

Figure 1
If, on the other hand, we zero pad (append) 96 zero-valued samples to the end of w(n), well have the
128-sample w(n) sequence shown in Figure 1(c). The spectral magnitude plot of a 128-point DFT of
w(n) is shown in Figure 1(d), where we can see the more detailed structure of the Fourier transform
of a Hanning window. So, in this case, we can say "zero padding in the time domain results in an
increased sampling rate in the frequency domain". Here the zero padding increased our frequencydomain sampling (resolution) by a factor of four (128/32).
For each sample in Figure 1(b), we have four samples in Figure 1(d).
Many folk call this process "spectral interpolation".
OK, thats time-domain zero padding. No big deal, youve probably seen this before. Now lets see
what frequency-domain zero padding (FDZP) will do for us by way of an example. Think of an 8sample real discrete sequence comprising the sum of a 1 kHz and a 2 kHz sinusoids defined as:

x(n)=sin(2pi1000nts) + 0.5sin(2pi2000nts + 3pi/4),


where ts is the sample period (1/fs), and the fs sample rate is 8000 samples/second. Our eight x(n)
samples are shown as the black dots in Figure 2.

Figure 2
Well call x(n)s 8-point DFT the frequency-domain sequence X(m), the real and imaginary parts of
which are shown in Figure 3(a) and 3(b). OK, heres where the zero padding comes in. If we insert 8
zero-valued complex samples in the middle of X(m), well have a new complex spectrum whose real
and imaginary parts are shown in Figure 3(c) and 3(d). Realize, now, that a complex zero is merely
0 + j0.
That "middle of X(m)" phrase means just prior to half the sample rate, or 4 kHz in this example.
Because of the way the X(m)s sample index numbering is mapped to the frequency-domain axis, we
can think of this fs/2 = 4 kHz point as the highest positive frequency in the spectrum. Thus we insert
the zeros after the first N/2 spectral samples, where N is the length of X(m), in order to maintain
spectral symmetry. (Were assuming that the 4 kHz, X(N/2), spectral component is zero, or at least
negligibly small, in magnitude.)
OK, lets call this new 16-sample discrete spectrum X(m).

Figure 3
Now, heres the slick part. If we take a 16-point inverse DFT of X(m), well get the interpolated time
sequence shown in Figure 4. Check it out! In between each of the original x(n) samples (shaded dots),
weve calculated the intermediate time samples (the black dots). All 16 dots in Figure 4 represent the
new interpolated 16-sample x(n) sequence. In this case, we can say "zero padding in the frequency
domain results in an increased sampling rate in the time domain".

Figure 4
A few things to keep in mind about this FDZP technique:
1.) Notice the agreeable property that the FDZPs interpolated time samples match the original time
samples. (The shaded dots in Figure 4.)
2.) We must not append zeros to the end of the X(m) sequence, as occurs in time-domain zero
padding. The complex zero padding must take place exactly in the middle of the original X(m)
sequence, with the middle frequency sample being fs/2.
3.) The new time sequence x(n), the inverse DFT of X(m), is complex. However, if we stuffed the
zeros properly X(m) will symmetrical and x(n)s imaginary parts should all be zero (other than very
small computational errors). Thus Figure 4 is a plot of the real parts of x(n).

4.) Notice how the amplitudes of the new x(n) time sequence were reduced by a factor of two in our
example. (This amplitude reduction can, of course, be avoided by doubling either the X(m) or the
x(n) amplitudes.)
5.) In Figure 4 we interpolated by a factor of two. Had we stuffed, say, 24 zeros into the X(m)
sequence, we could perform interpolation by a factor of four using the inverse fast Fourier transform
(IFFT). The point here is that the number of stuffed zeros must result in an X(m) sequence whose
length is an integer power of two if you want to use the efficient radix-2 inverse FFT algorithm. If your
new X(m) sequences length is not an integer power of two, youll have to use the inverse discrete
Fourier (IDFT) transform to calculate your interpolated time-domain samples.
Although I haven't gone through a mathematical analysis of this scheme, the fact that it's called
"exact interpolation" in the DSP literature is reasonable for periodic time-domain sequences. Think of
the standard technique used to perform time-domain interpolation using a low-pass FIR filter. In that
method, zero-valued samples are inserted between each of the original time samples, and then the
new (lengthened) sequence is applied to an FIR filter to attenuate the spectral images caused by the
inserted zeros. The accuracy of that FIR filter interpolation technique depends on the quality of the
filter. The greater the low-pass FIR stopband attenuation, the greater the image rejection, and the
more accurate the time-domain interpolation. In this FDZP technique, for periodic time signals, image
rejection is ideal in the sense that the spectral images all have zero amplitudes.
If our original time-domain sequence is not periodic, then the FDZP scheme exhibits the Gibbs'
phenomenon in that there will be errors at the beginning and end of the interpolated time samples.
When we try to approximate discontinuities in the time-domain, with a finite number of values in the
frequency-domain, ripples occur just before and after the approximated (interpolated) discontinuity in
the time-domain. Of course, if your original time sequence is very large, perhaps you can discard
some of the initial and final erroneous interpolated time samples. From a practical standpoint, it's a
good idea to model this FDZP technique to see if it meets the requirements of your application.
One last thought here. For those readers with Law Degrees, don't try to cheat and use this FDZP
technique to compensate for failing to meet the Nyquist sampling criterion when your x(n) time
samples were originally obtained. This interpolation technique won't help because if you violated
Nyquist to get x(n), your X(m) DFT samples will be invalid due to aliasing errors. This FDZP scheme
only works if Nyquist was satisfied by the original x(n).

FFT algorithms are tailored to the specific length of the input signal. When the input
signals length is a large prime number or when only a subset of FFT algorithms is
available (when, for instance, all we have is the radix-2 algorithm, which processes
input vectors with lengths of a power of 2) it is customary to extend the length of the

signal to match the algorithmic requirements. This is usually achieved by zero


padding, i.e. the length-N data vector is extended to a chosen length M by appending
(M - N) zeros to it. Now, the maximum resolution of an N-point DFT, i.e. the
separation between frequency components, is 2N. By extending the signal to a
longer length M, we are indeed reducing the separation between frequency
components. One may think that this artificial increase in resolution allows the DFT to
show finer details of the input signals spectrum. It is not so.
The M-point DFT X(M) of an N-point data vector x, obtained via zero-padding, can
be obtained directly from the canonical N-point DFT of the vector X(N) via a simple
matrix multiplication:
(4.86)

where the M N matrix MM,N is given by

where WN is the standard DFT matrix and WM is the M N matrix obtained by keeping
just the first N columns of the standard DFT matrix WM. The fundamental meaning
of (4.86) is that, by zero padding, we are adding no information to the spectral
representation of a finite-length signal. Details of the spectrum which were not
apparent in an N-point DFT are still not apparent in a zero-padded version of the
same. It can be shown that (4.86) is a form of Lagrangian interpolation of the original
DFT samples; therefore the zero-padded DFT is more attractive in a cosmetic
fashion since the new points, when plotted, show a smooth curve between the original
DFT points (and this is how plots such as the one in Figure 4.18 are obtained).
Zero Padding Applications
Zero padding in the time domain is used extensively in practice to compute heavily interpolated spectra by taking the DFT of the zero-padded signal.
Such spectral interpolation is ideal when the original signal is time limited (nonzero only over some finite duration spanned by the orignal samples).
Note that the time-limited assumption directly contradicts our usual assumption of periodic extension. As mentioned in 6.7, the interpolation of a
periodic signal's spectrum from its harmonics is always zero; that is, there is no spectral energy, in principle, between the harmonics of a periodic
signal, and a periodic signal cannot be time-limited unless it is the zero signal. On the other hand, the interpolation of a time-limited signal's spectrum is
nonzero almost everywhere between the original spectral samples. Thus, zero-padding is often used when analyzing data from a non-periodic signal in
blocks, and each block, or frame, is treated as a finite-duration signal which can be zero-padded on either side with any number of zeros. In summary,
the use of zero-padding corresponds to the time-limited assumption for the data frame, and more zero-padding yields denser interpolation of the
frequency samples around the unit circle.
Sometimes people will say that zero-padding in the time domain yields higher spectral resolution in the frequency domain. However, signal processing
practitioners should not say that, because ``resolution'' in signal processing refers to the ability to ``resolve'' closely spaced features in a spectrum
analysis (see Book IV [70] for details). The usual way to increase spectral resolution is to take a longer DFT without zero padding--i.e., look at more
data. In the field of graphics, the term resolution refers to pixel density, so the common terminology confusion is reasonable. However, remember that
in signal processing, zero-padding in one domain corresponds to a higher interpolation-density in the other domain--not a higher resolution.

Ideal Spectral Interpolation


Using Fourier theorems, we will be able to show (7.4.12) that zero padding in the time domain gives exact bandlimited interpolation in the frequency
domain.7.9In other words, for truly time-limited signals
, taking the DFT of the entire nonzero portion of
extended by zeros yields exact
interpolation of the complex spectrum--not an approximation (ignoring computational round-off error in the DFT itself). Because the fast Fourier
transform (FFT) is so efficient, zero-padding followed by an FFT is a highly practical method for interpolating spectra of finite-duration signals, and is
used extensively in practice.

Before we can interpolate a spectrum, we must be clear on what a ``spectrum'' really is. As discussed in Chapter 6, the spectrum of a signal
frequency

That is,

is defined as a complex number

at

computed using the inner product

is the unnormalized coefficient of projection of

onto the sinusoid

at frequency

. When

for
, we obtain the special set of spectral samples known as the DFT. For other values of
, we obtain spectral points
in between the DFT samples. Interpolating DFT samples should give the same result. It is straightforward to show that this ideal form of interpolation is
what we call bandlimited interpolation, as discussed further in Appendix D and in Book IV [70] of this series.

Zero-padding in the time domain corresponds to interpolation in the Fourier domain. It is frequently used in audio, for
example for picking peaks in sinusoidal analysis.
While it doesn't increase the resolution, which really has to do with the window shape and length. As mentioned by
@svenkatr, taking the transform of a signal that's not periodic in the DFT size is like multiplying with a rectangular window,
equivalent in turn to convolving it's spectrum with the transform of the rectangle function (a sinc), which has high energy in
sidelobes (off-center frequencies), making the true sinusoidal peaks harder to find. This is known as spectral leakage.
But I disagree with @svenkatr that zero-padding is causing the rectangular windowing, they are separate issues. If you
multiply your non-periodic signal by a suitable window (like the Hann or Hamming) that has appropriate length to have the
frequency resolution that you need and then zero-pad for interpolation in frequency, things should work out just fine.
By the way, zero-padding is not the only interpolation method that can be used. For example, in estimating parameters of
sinusoidal peaks (amplitude, phase, frequency) in the DFT, local quadratic interpolation (take 3 points around a peak and fit
a parabola) can be used because it is more computationally efficient than padding to the exact frequency resolution that you
want (would mean a much larger DFT size).

FFT Zero Padding


Posted by Shannon Hilbert in Digital Signal Processing on 4-22-13
The Fast Fourier Transform (FFT) is one of the most used tools in electrical engineering
analysis, but certain aspects of the transform are not widely understoodeven by
engineers who think they understand the FFT. Some of the most commonly
misunderstood concepts are zero-padding, frequency resolution, and how to choose
the right Fourier transform size.
This article will explore zero-padding the Fourier transformhow to do it correctly and
what is actually happening. The exploration will cover of the following topics:
1. Zero Padding

2. FFT Frequency Resolution


1. Waveform Frequency Resolution
2. FFT Resolution
3. Frequency Domain Resolution Concept Exploration
4. Choosing the Right FFT Size.

Zero Padding
Zero padding is a simple concept; it simply refers to adding zeros to end of a timedomain signal to increase its length. The example 1 MHz and 1.05 MHz real-valued
sinusoid waveforms we will be using throughout this article is shown in the following
plot:

The time-domain length of this waveform is 1000 samples. At the sampling rate of 100
MHz, that is a time-length of 10 us. If we zero pad the waveform with an additional
1000 samples (or 10 us of data), the resulting waveform is produced:

There are a few reasons why you might want to zero pad time-domain data. The most
common reason is to make a waveform have a power-of-two number of samples.
When the time-domain length of a waveform is a power of two, radix-2 FFT algorithms,
which are extremely efficient, can be used to speed up processing time. FFT algorithms
made for FPGAs also typically only work on lengths of power two.
While its often necessary to stick to powers of two in your time-domain waveform
length, its important to keep in mind how doing that affects the resolution of your
frequency-domain output.

FFT Frequency Resolution


There are two aspects of FFT resolution. Ill call the first one waveform frequency
resolution and the second one FFT resolution. These are not technical names, but I
find them helpful for the sake of this discussion. The two can often be confused
because when the signal is not zero padded, the two resolutions are equivalent.
The waveform frequency resolution is the minimum spacing between two
frequencies that can be resolved. The FFT resolution is the number of points in the
spectrum, which is directly proportional to the number points used in the FFT.
It is possible to have extremely fine FFT resolution, yet not be able to resolve two
coarsely separated frequencies.

It is also possible to have fine waveform frequency resolution, but have the peak
energy of the sinusoid spread throughout the entire spectrum (this is called FFT
spectral leakage).
The waveform frequency resolution is defined by the following equation:

where T is the time length of the signal with data. Its important to note here that you
should not include any zero padding in this time! Only consider the actual data
samples.
Its important to make the connection here that the discrete time Fourier transform
(DTFT) or FFT operates on the data as if it were an infinite sequence with zeros on
either side of the waveform. This is why the FFT has the distinctive sinc function shape
at each frequency bin.
You should recognize the waveform resolution equation 1/T is the same as the space
between nulls of a sinc function.
The FFT resolution is defined by the following equation:

Frequency Domain Resolution Concept Exploration


Considering our example waveform with 1 V-peak sinusoids at 1 MHz and 1.05 MHz,
lets start exploring these concepts.
Lets start off by thinking about what we should expect to see in a power spectrum.
Since both sinusoids have 1 Vpeak amplitudes, we should expect to see spikes in the
frequency domain with 10 dBm amplitude at both 1 MHz and 1.05 MHz.
The original time-domain signal shown in the first plot with a length of 1000 samples
(10 us). A 1000-point FFT used on the time-domain signal is shown in the next figure:

Two distinct peaks are not shown, and the single wide peak has an amplitude of about
11.4 dBm. Clearly these results dont give an accurate picture of the spectrum. There is
not enough resolution in the frequency domain to see both peaks.
Lets try to resolve the two peaks in the frequency domain by using a larger FFT, thus
adding more points to the spectrum along the frequency axis. Lets use a 7000-point
FFT. This is done by zero padding the time-domain signal with 6000 zeros (60 us). The
zero-padded time-domain signal is shown here:

The resulting frequency-domain data, shown as a power spectrum, is shown here:

Although weve added many more frequency points, we still cannot resolve the two
sinuoids; we are also still not getting the expected power.
Taking a closer look at what this plot is telling us, we see that all we have done by
adding more FFT points is to more clearly define the underlying sinc function arising
from the waveform frequency resolution equation. You can see that the sinc nulls are
spaced at about 0.1 MHz.
Because our two sinusoids are spaced only 0.05 MHz apart, no matter how many FFT
points (zero padding) we use, we will never be able to resolve the two sinusoids.
Lets look at what the resolution equations are telling us. Although the FFT resolution is
about 14 kHz (more than enough resoution), the waveform frequency resolution is
only 100 kHz. The spacing between signals is 50 kHz, so we are being limited by the
waveform frequency resolution.
To resolve the spectrum properly, we need to increase the amount of time-domain
data we are using. Instead of zero padding the signal out to 70 us (7000 points), lets
capture 7000 points of the waveform. The time-domain and domain results are shown
here, respectively.

The resulting frequency-domain data, shown as a power spectrum, is shown here:

With the expanded time-domain data, the waveform frequency resolution is now
about 14 kHz as well. As seen in the power spectrum plot, the two sinusoids are not
seen. The 1 MHz signal is clearly represented and is at the correct power level of 10
dBm, but the 1.05 MHz signal is wider and not showing the expected power level of 10
dBm. What gives?

What is happening with the 1.05 MHz signal is that we dont have an FFT point at 1.05
MHz, so the energy is split between multiple FFT bins.
The spacing between FFT points follows the equation:

where nfft is the number of FFT points and fs is the sampling frequency.
In our example, were using a sampling frequency of 100 MHz and a 7000-point FFT.
This gives us a spacing between points of 14.28 kHz. The frequency of 1 MHz is a
multiple of the spacing, but 1.05 MHz is not. The closest frequencies to 1.05 MHz are
1.043 MHz 1.057 MHz, so the energy is split between the two FFT bins.
To solve this issue, we can choose the FFT size so that both frequencies are single
points along the frequency axis. Since we dont need finer waveform frequency
resolution, its okay to just zero pad the time-domain data to adjust the FFT point
spacing.
Adding an additional 1000 zeros (10 us) to the time-domain signal gives us a spacing of
12.5 kHz, and both 1 MHz and 1.05 MHz are integer multiples of the spacing. The
resulting spectrum is shown in the following figure.

Now both frequencies are resolved and at the expected power of 10 dBm.

For the sake of overkill, you can always add more points to your FFT through zero
padding (ensuring that you have the correct waveform resolution) to see the shape of
the FFT bins as well. This is shown in the following figure:

Choosing the Right FFT Size


Three considerations should factor into your choice of FFT size, zero padding, and
time-domain data length.
1. What waveform frequency resolution do you need?
2. What FFT resolution do you need?
3. Does your choice of FFT size allow you to inspect particular frequencies of
interest?
1) The waveform frequency resolution should be smaller than the minimum spacing
between
frequencies
of
interest.
2) The FFT resolution should at least support the same resolution as your waveform
frequency resolution. Additionally, some highly-efficient implementations of the FFT
require that the number of FFT points be a power of two.
3) You should ensure that there are enough points in the FFT, or the FFT has the
correct spacing set, so that your frequencies of interest are not split between multiple
FFT points.
One final thought on zero padding the FFT:

If you apply a windowing function to your waveform, the windowing function needs to
be applied before zero padding the data. This ensures that your real waveform data
starts and ends at zero, which is the point of most windowing functions.

You might also like