21AI601 - COMPUTER VISION

Unit I & LP2 - FOURIER TRANSFORM, CONVOLUTION AND FILTERING,
IMAGE ENHANCEMENT, RESTORATION, HISTOGRAM PROCESSING

1. FOURIER TRANSFORM
 The Fourier transform maps a signal into its component frequencies. It does not change the
original signal, only its representation.
 The Fourier transform is an important image-processing tool used to decompose an
image into its frequency components.
 The input to the Fourier transform is the image in the spatial domain (x, y).
 The output of the Fourier transform represents the image in the frequency domain.
 In the frequency domain image, each point represents a particular frequency contained in
the spatial domain image.
 If an image has many high-frequency components (edges, stripes, corners), there will be
many high-magnitude points at high-frequency locations in the frequency domain.
 The Fourier transform is a mathematical tool used in signal processing and computer vision
to analyze and manipulate signals or images in the frequency domain.
 It decomposes a signal or an image into its constituent frequencies, revealing information
about the different frequency components present in the data.
 In computer vision, the Fourier transform can be applied to both one-dimensional signals
(such as audio signals) and two-dimensional signals (such as images).
 The Discrete Fourier Transform (DFT) is commonly used in practice, and efficient
algorithms like the Fast Fourier Transform (FFT) are often employed for its computation.
 The FFT significantly speeds up the calculation of the Fourier transform, making it
practical for real-time applications in computer vision.
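To make this concrete, the following is a minimal sketch (not from the notes) of computing an image's frequency-domain representation with NumPy; the file name input.png is only a placeholder:

```python
import cv2
import numpy as np

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)   # spatial domain (x, y)

F = np.fft.fft2(img)             # 2D DFT, computed with the FFT algorithm
F_shifted = np.fft.fftshift(F)   # move the zero-frequency (DC) term to the centre

# Log-scaled magnitude spectrum for display; +1 avoids log(0)
magnitude = 20 * np.log(np.abs(F_shifted) + 1)
```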
1.1 Uses of Fourier transform in computer vision
1.1.1 Image Filtering and Enhancement:
The Fourier transform is commonly used for frequency-based image filtering. By transforming an
image into the frequency domain, one can apply filters that selectively enhance or suppress certain
frequency components. This is useful for tasks such as noise removal or enhancing specific
features in an image.
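A hedged sketch of such a frequency-domain filter, reusing img and F_shifted from the sketch above; the cut-off radius of 30 is an arbitrary assumption to tune per image:

```python
import numpy as np

rows, cols = img.shape
crow, ccol = rows // 2, cols // 2
radius = 30                                               # assumed cut-off frequency

# Ideal circular low-pass mask: True inside the radius, False outside
y, x = np.ogrid[:rows, :cols]
mask = (x - ccol) ** 2 + (y - crow) ** 2 <= radius ** 2

filtered = F_shifted * mask                               # zero out high frequencies
result = np.fft.ifft2(np.fft.ifftshift(filtered)).real   # back to the spatial domain
```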
1.1.2 Edge Detection:
Edges in an image correspond to changes in intensity, and these changes often manifest as high-
frequency components in the Fourier domain. By applying a Fourier transform, it's possible to
identify and analyze the high-frequency components, aiding in edge detection algorithms.
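Continuing the previous sketch, inverting the low-pass mask yields an ideal high-pass filter, which keeps only the high-frequency content (edges and fine detail):

```python
# Keep only high frequencies; the surviving structure is dominated by edges
edges = np.fft.ifft2(np.fft.ifftshift(F_shifted * ~mask)).real
```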
1.1.3 Pattern Recognition:
Fourier descriptors, which are coefficients obtained from the Fourier transform, can be used for
representing the shape of objects in an image. These descriptors can then be used for pattern
recognition and shape matching.
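A minimal sketch of computing Fourier descriptors, assuming contour is an (N, 2) array of boundary points (for example, a squeezed contour from cv2.findContours):

```python
import numpy as np

def fourier_descriptors(contour, n_coeffs=10):
    # Treat each boundary point (x, y) as the complex number x + iy
    z = contour[:, 0] + 1j * contour[:, 1]
    coeffs = np.fft.fft(z)
    # Dropping the DC term gives translation invariance;
    # dividing by |coeffs[1]| gives scale invariance
    descriptors = coeffs[1:n_coeffs + 1]
    return np.abs(descriptors) / np.abs(coeffs[1])
```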
1.1.4 Texture Analysis:
Fourier analysis can be used to extract information about the texture of an image by analyzing its
frequency components. This information is useful in tasks such as texture classification and
segmentation.
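One simple texture feature is the radially averaged power spectrum; a sketch, again assuming F_shifted from the earlier example:

```python
import numpy as np

power = np.abs(F_shifted) ** 2
rows, cols = power.shape
y, x = np.ogrid[:rows, :cols]
r = np.hypot(x - cols // 2, y - rows // 2).astype(int)   # radial frequency bin

# Mean power at each radius; coarse textures dominate the low bands, fine ones the high
counts = np.bincount(r.ravel())
radial_profile = np.bincount(r.ravel(), weights=power.ravel()) / np.maximum(counts, 1)
```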
1.1.5 Image Compression:
Fourier transform-based methods are used in image compression techniques. Transforming the
image into the frequency domain allows for the removal of less significant frequency components,
resulting in compression without significant loss of perceived image quality.
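A toy sketch of the idea, reusing img from the first example: transform, discard the weakest coefficients, and reconstruct. (Practical codecs such as JPEG use the closely related discrete cosine transform on 8x8 blocks.) The 5% retention rate is an arbitrary assumption:

```python
import numpy as np

F = np.fft.fft2(img)
keep = 0.05                                       # assumed: keep top 5% of coefficients
threshold = np.quantile(np.abs(F), 1 - keep)
F_compressed = np.where(np.abs(F) >= threshold, F, 0)
reconstructed = np.fft.ifft2(F_compressed).real
```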
1.1.6 Registration and Alignment:
In medical imaging and remote sensing, Fourier transforms can be used for image registration and
alignment. By comparing the frequency content of different images, one can determine the spatial
transformations required to align them.
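A minimal sketch of phase correlation, a Fourier-based registration method, for estimating the translation between two same-sized grayscale arrays img1 and img2 (both assumed here):

```python
import numpy as np

F1 = np.fft.fft2(img1)
F2 = np.fft.fft2(img2)

# Normalised cross-power spectrum; the small constant avoids division by zero
cross_power = (F1 * np.conj(F2)) / (np.abs(F1 * np.conj(F2)) + 1e-8)
correlation = np.fft.ifft2(cross_power).real

# The peak gives the (dy, dx) shift aligning img2 with img1;
# shifts past the halfway point appear wrapped around (negative shifts)
dy, dx = np.unravel_index(np.argmax(correlation), correlation.shape)
```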

2. CONVOLUTION AND FILTERING


2.1 Convolution
 Convolution is a mathematical operation that combines two functions to produce a third
function. In the context of image processing, convolution is typically performed between
an input image and a kernel or filter.
 The kernel is a small matrix that slides over the image, and at each position, it computes
the weighted sum of the pixel values in the neighborhood defined by the kernel.
 The result is a new image, often called the convolved or filtered image, where each pixel
is a linear combination of the neighboring pixels in the original image.
 In Computer Vision, convolution is generally used to extract or create a feature map (with
the help of kernels) out of the input image.

2.2 Basic Terminologies

In a typical illustration of convolution, the blue matrix is the input and the green matrix is the
output, while a kernel moves through the input matrix to extract the features.

2.2.1 Input Matrix


An image is made up of pixels, where each pixel value lies in the inclusive range [0, 255]. So we
can represent an image as a matrix, where every position represents a pixel. The pixel value
represents how bright it is: a value of 0 is black and 255 is white (highest brightness). A
grayscale image has a single matrix of pixels, i.e. it doesn't have any colour, whereas a coloured
(RGB) image has 3 channels, and each channel represents its colour density.
For example, a grayscale image of shape (24, 16) has height = 24 and width = 16. Similarly, a
coloured (RGB) image has 3 channels and can be represented as a matrix of shape (height,
width, channels).

Now we know the first input to the convolution operator, but how do we transform this input to
get the output feature matrix? This is where the term 'kernel' comes in: the kernel acts on the
input image to produce the required output.

2.2.2 Kernel
In an image, neighbouring pixels tend to have similar values. Kernels harness this property of
locality. A kernel is a small matrix that uses the power of localisation to extract the required
features from the given image (input). Generally, a kernel is much smaller than the input image.
We have different kernels for different tasks such as blurring, sharpening or edge detection.

The convolution happens between the input image and the given kernel. It is the sliding dot product
between the kernel and the localized section of the input image.
The dimension of the region of interest (ROI) is always equal to the kernel’s dimension. We move
this ROI after every dot product and continue to do so until we have the complete output matrix.
2.2.3 Stride

Stride is a parameter that controls how far the ROI moves across the image at each step. A
stride greater than 1 is used to decrease the output dimension. Intuitively, it skips part of the
overlap between successive dot products, which reduces the final shape of the output, as the
sketch below shows.
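A minimal sketch of the sliding dot product described above, with a stride parameter. (Strictly, this computes cross-correlation, the form used in CNNs; true convolution would first flip the kernel.)

```python
import numpy as np

def convolve2d(image, kernel, stride=1):
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            roi = image[i * stride : i * stride + kh,
                        j * stride : j * stride + kw]   # region of interest
            output[i, j] = np.sum(roi * kernel)         # sliding dot product
    return output

# Example: a 3x3 mean (blur) kernel with stride 2 shrinks an 8x8 input to 3x3
blurred = convolve2d(np.random.rand(8, 8), np.ones((3, 3)) / 9, stride=2)
```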

2.3 Filtering
 Filtering refers to the process of applying a filter, or kernel, to an image using convolution.
 Filters can be designed to perform various operations on an image, such as blurring,
sharpening, edge detection, or noise reduction.
 Within a filter, the kernels may all be the same or differ from one another; a specific kernel
is chosen to extract a specific feature.
 Some common types of filters used in computer vision include:
 Gaussian Filter: Used for blurring and noise reduction by averaging the pixel values in the
neighborhood.
 Sobel Filter: Used for edge detection by computing the gradient magnitude of the image.
 Median Filter: Used for noise reduction by replacing each pixel value with the median
value in its neighborhood.
 Laplacian Filter: Used for edge detection and sharpening by highlighting regions of rapid
intensity change.
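A brief sketch applying these filters with OpenCV, assuming img is a grayscale image loaded as in the earlier example:

```python
import cv2

gaussian = cv2.GaussianBlur(img, (5, 5), sigmaX=1.0)   # blurring / noise reduction
sobel_x = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)    # horizontal gradient (edges)
median = cv2.medianBlur(img, 5)                        # salt-and-pepper noise removal
laplacian = cv2.Laplacian(img, cv2.CV_64F)             # second-derivative edges
```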
2.4 Applications

Feature Extraction: Convolutional neural networks (CNNs) use convolutional layers to extract
features from images, enabling tasks such as object detection and recognition.
Image Enhancement: Filters can be applied to enhance the visual quality of images by improving
contrast, reducing noise, or sharpening details.
Edge Detection: Filters such as Sobel and Prewitt are commonly used for detecting edges in
images, which is essential for tasks like image segmentation.
Texture Analysis: Convolution with texture kernels can be used to analyze and classify textures
in images.

3. IMAGE ENHANCEMENT
 Image enhancement techniques aim to improve the visual appearance of an image by
emphasizing certain features or reducing noise or other artifacts. Some common methods
include:
 Contrast Enhancement: Adjusting the contrast of an image to make it more visually
appealing or to highlight certain features.
 Brightness Adjustment: Changing the overall brightness level of an image.
 Sharpness Enhancement: Enhancing the edges and details in an image to make it appear
sharper.
 Color Correction: Adjusting the color balance of an image to remove color casts or improve
color fidelity.
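A minimal sketch of two of these operations with OpenCV, assuming img is an 8-bit grayscale image; the alpha, beta and weighting values are arbitrary assumptions to tune:

```python
import cv2

# Linear contrast/brightness adjustment: out = alpha * pixel + beta, clipped to [0, 255]
enhanced = cv2.convertScaleAbs(img, alpha=1.3, beta=15)

# Sharpness enhancement by unsharp masking: add back the detail removed by a blur
blurred = cv2.GaussianBlur(img, (5, 5), 1.0)
sharpened = cv2.addWeighted(img, 1.5, blurred, -0.5, 0)
```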

4. IMAGE RESTORATION
 Image restoration techniques are used to recover the original image from degraded or
distorted versions.
 This is often done by modeling the degradation process and applying inverse operations to
reconstruct the original image. Some common restoration techniques include:
 Noise Reduction: Removing noise from an image caused by factors such as sensor noise
or compression artifacts.
 Blur Removal: Deblurring techniques are used to remove blurring caused by motion,
defocus, or other factors.
 Super-Resolution: Increasing the resolution of an image beyond its original size by
inferring additional detail from neighboring pixels or multiple low-resolution images.
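As one example, a sketch of noise reduction with OpenCV's non-local means denoiser, assuming noisy is an 8-bit grayscale image; h controls the filter strength:

```python
import cv2

restored = cv2.fastNlMeansDenoising(noisy, None, h=10,
                                    templateWindowSize=7, searchWindowSize=21)
```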
5. HISTOGRAM PROCESSING
 Histogram processing involves analyzing and manipulating the histogram of an image,
which represents the distribution of pixel intensities.
 Histogram equalization is a common technique used to improve the contrast of an image
by redistributing pixel intensities to cover the entire dynamic range more evenly.
 Other histogram-based techniques include:
 Histogram Matching: Modifying the histogram of an image to match a desired histogram,
often used for color correction or style transfer.
 Histogram Stretching: Expanding the dynamic range of pixel intensities to improve
contrast.
 Histogram Specification: Specifying a desired histogram shape and adjusting the image
histogram to match it.
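A minimal sketch of histogram equalization and stretching with OpenCV and NumPy, assuming gray is an 8-bit grayscale image:

```python
import cv2
import numpy as np

# Equalization: redistributes intensities to cover [0, 255] more evenly
equalized = cv2.equalizeHist(gray)

# Stretching: linearly map the occupied range [lo, hi] onto the full
# dynamic range (assumes hi > lo)
lo, hi = gray.min(), gray.max()
stretched = ((gray.astype(np.float32) - lo) / (hi - lo) * 255).astype(np.uint8)
```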
