CN102289668A

CN102289668A - Binarization processing method for adaptive text images based on pixel neighborhood features

Info

Publication number: CN102289668A
Application number: CN2011102641444A
Authority: CN
Inventors: 谭洪舟; 朱雄泳; 杨劲
Original assignee: Individual
Current assignee: Guangzhou Kansig Electronics Technology Inc
Priority date: 2011-09-07
Filing date: 2011-09-07
Publication date: 2011-12-21

Abstract

The invention discloses a method for binarizing an adaptive text image based on pixel neighborhood features using a global threshold and a local threshold. First, the brightness of the text image is adjusted globally, increasing the proportion of the darker part of the image and improving the gray-level contrast between the target characters and the background. Second, a bicubic interpolation algorithm is used to scale the text image, widening the gaps between character strokes. Third, the stroke width d of the target characters is measured statistically, and the size w of the neighborhood calculation template is determined from it. Fourth, the text information of the text image is divided into text blocks according to the determined template size. Finally, the global threshold and the local threshold are combined to binarize each text block point by point. The method separates character strokes from the background effectively, avoids broken strokes and artifacts, preserves stroke connectivity, reduces the number of pixels for which a local threshold must be computed, and greatly improves computation speed.

Description

Binarization processing method of self-adaptive character image based on pixel neighborhood characteristics

Technical Field

The invention relates to a binarization processing method of a character image, in particular to a binarization processing method of an adaptive character image based on pixel neighborhood characteristics by utilizing global and local thresholds.

Background

Binarization is a basic technique in digital image processing and a preprocessing step for many other image processing techniques. It is widely used in applications such as automatic target recognition (ATR), image analysis, text enhancement, and optical character recognition (OCR). Most existing binarization methods are thresholding methods, and in any given application the choice of threshold determines how much of the image's characteristic information is retained. Automatic threshold selection is therefore well worth studying: a good automatic threshold selection method not only preserves the useful information in an image but also reduces time overhead.

The key to image binarization is how the threshold is selected. According to how the threshold is applied to the pixels, thresholding methods fall into three main categories:

(1) Global thresholding: the whole image is binarized using a single threshold T (the global threshold). The global threshold T is typically determined from the histogram or spatial distribution of the gray levels of the image, and the gray level of each pixel in the image is compared with T. If the value is larger than T, the pixel is assigned the foreground color; otherwise it is assigned the background color. Typical global thresholding methods include the Otsu method and the maximum entropy method. Global thresholding works well when the gray levels of the target and the background differ clearly, but it tends to lose detail, and it rarely gives satisfactory results when the image contains many shadows or its gray-level variation is complex.

(2) Local thresholding: the threshold for a pixel is determined by the gray value of the current pixel and the gray-level characteristics of the points around it. A neighborhood of the point under consideration is defined, and a neighborhood calculation template is used to compare the gray level of that point with its neighborhood points. Typical local thresholding methods include the Bernsen method and the Niblack method. Local thresholding can handle more complex conditions and is more widely applied than global thresholding. However, it often ignores the boundary feature information of the image, so that distinct areas in the original image merge into one large area after binarization and some important information is lost from the binarized result. In some applications, such as medical image segmentation and particle analysis, the result image must retain the boundary feature information well, because it is very important for subsequent image analysis.

(3) Dynamic thresholding: when the illumination is not uniform or the background gray level changes greatly, different thresholds must be determined automatically according to the coordinate position of each pixel. The threshold selected by this method depends not only on the gray values of the pixel and its surrounding pixels but also on the coordinate position of the pixel. Dynamic threshold binarization can process poor-quality images, even images with a unimodal histogram. However, it usually needs to calculate a threshold for every pixel in the image, that is, a threshold surface (usually a curved surface) for the whole image, so the amount of calculation is very large and the calculation speed is generally slow. This time cost, together with a certain amount of distortion, has hindered the development of the method to some extent. Iterative methods are a commonly used dynamic threshold determination technique.

Disclosure of Invention

In view of the above disadvantages, the present invention provides a binarization processing method for an adaptive text image based on pixel neighborhood characteristics by using global and local thresholds, comprising:

a) carrying out global brightness adjustment on the text image, and improving the gray scale contrast of the target character and the background in the text image;

b) adaptively selecting the size of a neighborhood calculation template;

c) dividing the text information of the text image into text blocks according to the size of the selected neighborhood calculation template;

d) performing point-by-point binarization processing on each text block using a combination of global and local thresholds.

Between step a) and step b), the method further comprises:

ab) scaling the text image using a bicubic interpolation algorithm.

The step a) comprises the following steps:

a1) defining the pixel value of any point of the text image as the brightness value Y of the point, normalizing the brightness value Y, I represents the brightness value of the point after normalization,

I＝Y/255；

a2) calculating the overall gray-level average I_al of the image,

$$I_{al} = \frac{\sum_{\rho \in I} \log(I(\rho))}{N}$$

where ρ ∈ I means that the point ρ lies in the domain of the text image, and N is the number of pixels in the text image;

a3) defining a global luminance dynamic range compression coefficient γ;

a4) performing global brightness adjustment on the text image using the global luminance dynamic range compression coefficient γ,

I′＝I^γ；

a5) after brightness adjustment, mapping the gray values of the text image to the gray range (0-255) shown by a display,

where r_max and r_min are the maximum and minimum gray values of the text image, respectively, and f is the gray value of the mapped text image.
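The brightness adjustment of step a) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function name `adjust_brightness` and the `eps` guard against log(0) are hypothetical, `gamma` is supplied by the caller because the formula for γ is not preserved in the text, and the final mapping is assumed to be a linear min-max stretch.

```python
import math

def adjust_brightness(gray, gamma, eps=1e-6):
    """Adjust a grayscale image (list of rows, values 0-255) globally.

    gamma is passed in because the text does not preserve the formula
    that derives it from the log-average I_al.
    """
    # a1) normalize the brightness Y to I = Y / 255
    norm = [[y / 255.0 for y in row] for row in gray]

    # a2) I_al = sum(log(I(p))) / N; eps is an assumption that avoids log(0)
    n = sum(len(row) for row in norm)
    i_al = sum(math.log(max(i, eps)) for row in norm for i in row) / n

    # a4) global adjustment I' = I ** gamma
    adj = [[i ** gamma for i in row] for row in norm]

    # a5) map back to 0-255; a linear min-max stretch is assumed here
    r_min = min(min(row) for row in adj)
    r_max = max(max(row) for row in adj)
    span = (r_max - r_min) or 1.0  # guard against a flat image
    mapped = [[round(255 * (v - r_min) / span) for v in row] for row in adj]
    return mapped, i_al
```

With `gamma` below 1 the mid-gray values are raised; with `gamma` equal to 1 the power step leaves the normalized values unchanged.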

The step b) comprises the following steps:

b1) defining the maximum and minimum gray values in the text image as r_max and r_min, setting the initial value T_0 of the global threshold from them, and setting the initial value of the iteration count k to 0;

b2) dividing the text image into target-character and background parts according to the threshold T_k, and calculating the pixel counts of the target characters and the background and their gray-level averages avg_f^k and avg_b^k;

b3) Solving a new threshold value:

T_{k + 1} = ({avg}_{f}^{k} + {avg}_{b}^{k}) / 2;

b4) if T_{k+1} = T_k or k > 100, ending; otherwise setting k = k + 1 and going to b2);

b5) randomly selecting 100 points in the text image, and initializing the count v of stroke-width calculations to 0;

b6) setting a length threshold lengthT according to the font size in the text image, and initializing the number n of valid character-stroke widths to 0;

b7) if the pixel value f(x, y) of the point under consideration (x, y) satisfies f(x, y) < T_{k+1}, extending from the point through black pixels along the horizontal and vertical directions until leaving the target area, counting the lengths Hl and Vl of the black-pixel runs in the horizontal and vertical directions, and setting v = v + 1;

b8) if Hl < lengthT or Vl < lengthT, taking the smaller of the two lengths as the character-stroke width width(n) of the point and setting n = n + 1; otherwise discarding the point;

b9) if v >= 100, exiting; otherwise taking the next point and going to b7);

b10) sorting the widths of all valid target-character strokes, taking a value from the sorted widths as the stroke width d, and thereby selecting the size w of the neighborhood calculation template, where w = 2d + 1.
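Steps b5) to b10) can be sketched in Python as follows. This is a hedged illustration: the names `run_length` and `template_size` are hypothetical, runs are extended in both directions from the sampled point, the value taken from the sorted widths is assumed to be the median, and the loop simply draws a fixed number of random points and skips background points rather than tracking the counters v and n separately.

```python
import random

def run_length(gray, x, y, dx, dy, t):
    """Length of the run of dark pixels (value < t) through (x, y) along (dx, dy)."""
    h, w = len(gray), len(gray[0])
    length = 1
    for sign in (1, -1):                      # walk both ways from the point
        cx, cy = x + sign * dx, y + sign * dy
        while 0 <= cx < w and 0 <= cy < h and gray[cy][cx] < t:
            length += 1
            cx, cy = cx + sign * dx, cy + sign * dy
    return length

def template_size(gray, t, length_t, samples=100, seed=0):
    """Estimate the stroke width d and return (d, w) with w = 2d + 1."""
    rng = random.Random(seed)                 # fixed seed only for repeatability
    h, w = len(gray), len(gray[0])
    widths = []
    for _ in range(samples):
        x, y = rng.randrange(w), rng.randrange(h)
        if gray[y][x] >= t:                   # background point: skip it
            continue
        hl = run_length(gray, x, y, 1, 0, t)  # horizontal run Hl
        vl = run_length(gray, x, y, 0, 1, t)  # vertical run Vl
        if hl < length_t or vl < length_t:    # keep only plausible stroke widths
            widths.append(min(hl, vl))
    if not widths:
        return None                           # no valid stroke sample found
    widths.sort()
    d = widths[len(widths) // 2]              # median: an assumption
    return d, 2 * d + 1
```

A sampled point inside a long vertical stroke has a short horizontal run and a long vertical run, so the smaller of the two is the stroke thickness.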

The step d) comprises the following steps:

d1) comparing the gray value f(x, y) of the point under consideration (x, y) with the global threshold T_{k+1}; if f(x, y) is less than or equal to T_{k+1}, going to d2); otherwise setting the gray value g(x, y) of the target pixel to 255, continuing to scan the next point, and going to d1);

d2) calculating the average gray level avg(x, y) within the w x w template centered on the point under consideration;

d3) comparing the gray value f(x, y) of the point under consideration with the average gray level avg(x, y) obtained in step d2); if f(x, y) is larger than avg(x, y), setting the gray value g(x, y) of the target pixel to 255; otherwise setting g(x, y) to 0; then continuing to scan the next point and going to d1).
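Steps d1) to d3) can be sketched in Python as follows. The function name `binarize` is hypothetical, and clipping the w x w window at the image borders is an assumption, since the text does not say how border pixels are handled.

```python
def binarize(gray, t_global, w):
    """Point-by-point binarization combining a global threshold and a local mean."""
    h, wd = len(gray), len(gray[0])
    r = w // 2
    out = [[255] * wd for _ in range(h)]
    for y in range(h):
        for x in range(wd):
            f = gray[y][x]
            if f > t_global:                  # d1) above the global threshold: 255
                continue
            # d2) mean gray level in the w x w window, clipped at the borders
            ys = range(max(0, y - r), min(h, y + r + 1))
            xs = range(max(0, x - r), min(wd, x + r + 1))
            avg = sum(gray[j][i] for j in ys for i in xs) / (len(ys) * len(xs))
            # d3) brighter than the local mean gives 255, otherwise 0
            out[y][x] = 255 if f > avg else 0
    return out
```

Because the local mean is only computed for pixels at or below the global threshold, bright background pixels cost a single comparison each.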

The beneficial effects of the invention are as follows. Global brightness adjustment of the text image improves the gray-level contrast between the target characters and the background, giving a better-quality text image in preparation for the subsequent binarization. Magnifying the text image with a bicubic interpolation algorithm widens the gaps between character strokes, which makes it easier to determine the size of the neighborhood calculation template afterwards. Selecting the size of the neighborhood calculation template adaptively widens the range of application of the method. Finally, binarizing each text block point by point with a combination of global and local thresholds separates character strokes from the background effectively, avoids broken strokes and artifacts, preserves stroke connectivity, reduces the number of pixels in the text image for which a local threshold must be calculated, and greatly improves the operation speed.

Drawings

FIG. 1 is a block diagram of a method for binarization processing of an adaptive text image based on pixel neighborhood characteristics according to the present invention;

FIG. 2 is a flow chart of a method for adaptively selecting a neighborhood calculation template according to the present invention;

FIG. 3 is a flow chart of the point-by-point binarization processing of the invention combining global and local thresholds;

FIG. 4 is a schematic diagram of a text image to be binarized according to the present invention;

FIG. 5 is a schematic diagram of a text image after global brightness adjustment according to the present invention;

FIG. 6 is a schematic diagram of the text image after magnification according to the present invention;

FIG. 7 is a schematic diagram of the text image after the point-by-point binarization processing combining global and local thresholds.

Detailed Description

The invention is further described below with reference to the drawings.

As shown in FIG. 1, the binarization processing method of the adaptive text image based on pixel neighborhood characteristics of the present invention includes: 1) performing global brightness adjustment on the text image to improve the gray-level contrast between the target characters and the background; 2) scaling the text image using a bicubic interpolation algorithm; 3) adaptively selecting the size of the neighborhood calculation template; 4) dividing the text information of the text image into text blocks according to the determined size of the neighborhood calculation template; 5) performing point-by-point binarization processing on each text block using a combination of global and local thresholds.

The following steps are described in detail:

1) Performing global brightness adjustment on the text image: the brightness of the original text image is adjusted globally according to its overall brightness level, increasing the proportion of the darker part of the image and improving the gray-level contrast between the target characters and the background. The specific steps are as follows:

11) defining the pixel value of any point of the text image as the brightness value Y of the point, firstly carrying out normalization processing on the brightness value Y, wherein I represents the brightness value of the point after normalization,

I＝Y/255；

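12) calculating the overall gray-level average I_al of the text image,

$$I_{al} = \frac{\sum_{\rho \in I} \log(I(\rho))}{N}$$

where ρ ∈ I means that the point ρ lies in the domain of the text image, and N is the number of pixels in the text image;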

13) defining a global luminance dynamic range compression coefficient γ, where γ is an increasing function of the overall gray-level average I_al of the image and represents the curvature of the global dynamic range compression curve: the lower the average, the greater the curvature of the curve and the more the darker part is stretched; when the gray-level average is large, γ is taken as 1 and no global dynamic range compression is applied to the image;

14) global brightness adjustment is carried out on the text image by utilizing the global brightness dynamic range compression degree coefficient gamma,

I′＝I^γ；

15) after brightness adjustment, mapping the gray values of the text image to the gray range (0-255) shown by a display.

2) Scaling the text image using a bicubic interpolation algorithm to increase the gaps between character strokes.
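Bicubic scaling can be illustrated with the following pure-Python sketch. The text does not specify the interpolation kernel or the border handling, so several choices here are assumptions: the cubic convolution kernel with a = -0.5, pixel-center coordinate alignment, and replication of edge pixels. The names `cubic_weight` and `bicubic_resize` are hypothetical.

```python
import math

def cubic_weight(t, a=-0.5):
    """Cubic convolution kernel; a = -0.5 is a common choice, assumed here."""
    t = abs(t)
    if t <= 1:
        return (a + 2) * t**3 - (a + 3) * t**2 + 1
    if t < 2:
        return a * t**3 - 5 * a * t**2 + 8 * a * t - 4 * a
    return 0.0

def bicubic_resize(gray, scale):
    """Resize a grayscale image (list of rows) by `scale` with bicubic interpolation."""
    h, w = len(gray), len(gray[0])
    out_h, out_w = int(h * scale), int(w * scale)
    out = []
    for oy in range(out_h):
        sy = (oy + 0.5) / scale - 0.5          # source coordinate (pixel centers)
        y0 = math.floor(sy)
        row = []
        for ox in range(out_w):
            sx = (ox + 0.5) / scale - 0.5
            x0 = math.floor(sx)
            acc = 0.0
            for j in range(y0 - 1, y0 + 3):    # 4 x 4 neighborhood
                wy = cubic_weight(sy - j)
                yy = min(max(j, 0), h - 1)     # replicate edge pixels
                for i in range(x0 - 1, x0 + 3):
                    xx = min(max(i, 0), w - 1)
                    acc += wy * cubic_weight(sx - i) * gray[yy][xx]
            row.append(min(255, max(0, round(acc))))  # clamp overshoot
        out.append(row)
    return out
```

In practice an image library's bicubic resize would normally be used instead; the sketch only shows how each output pixel is a weighted sum of a 4 x 4 block of source pixels.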

3) Adaptively selecting the size of the neighborhood calculation template: the stroke width d of the target characters is measured statistically, and the size w of the neighborhood calculation template is determined from it.

First, a global iterative algorithm is used to calculate a global threshold. The method first selects an initial threshold T_0 computed from r_max and r_min, the maximum and minimum gray values in the image, and sets the initial value of the iteration count k to 0. According to the threshold T_k, the image is segmented into a target part (gray values less than T_k) and a background part (gray values not less than T_k). The pixel counts of the target part and the background part and their gray-level averages avg_f^k and avg_b^k are calculated, the mean of the target and background averages is then used as the new threshold, and so on. When the threshold no longer changes, i.e. T_{k+1} = T_k, or when k exceeds 100, the iteration stops; a stable state is generally reached after several iterations.
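The iterative threshold calculation described above can be sketched in Python as follows. The function name `iterative_threshold` is hypothetical, and the initial threshold is assumed to be the midpoint of r_max and r_min, because the formula for the initial value is not preserved in the text.

```python
def iterative_threshold(gray, max_iter=100):
    """Iterative global threshold for a grayscale image given as a list of rows."""
    pixels = [p for row in gray for p in row]
    # Initial threshold: midpoint of the gray range (an assumption)
    t = (max(pixels) + min(pixels)) / 2.0
    for _ in range(max_iter):                # cap the number of iterations
        fg = [p for p in pixels if p < t]    # target characters (darker)
        bg = [p for p in pixels if p >= t]   # background
        if not fg or not bg:                 # flat image: nothing to split
            return t
        avg_f = sum(fg) / len(fg)
        avg_b = sum(bg) / len(bg)
        t_next = (avg_f + avg_b) / 2.0       # T_{k+1} = (avg_f + avg_b) / 2
        if t_next == t:                      # threshold has stabilized
            return t
        t = t_next
    return t
```

For an image with two well-separated gray-level clusters, the threshold settles between the two cluster means after a few iterations.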

Then the template width is determined. 100 points are randomly selected in the text image, and each point is classified using the global threshold T_{k+1}. If the point is in the background area, it is not considered further. If the point is in the target area (a black pixel), runs are extended from it along the horizontal and vertical directions until they leave the target area, and the lengths of the black-pixel runs in the two directions are counted. The point is discarded when the lengths in both the vertical and horizontal directions exceed a length threshold (set according to the font size in the text image); otherwise the smaller of the two lengths is taken as the character-stroke width at that point. The widths of all valid character strokes are sorted, a value is taken from them as the stroke width d, and the template width w is determined from it (w = 2d + 1).

As shown in fig. 2, the specific process is as follows:

31) defining the maximum and minimum gray values in the text image as r_max and r_min, setting the initial value T_0 of the global threshold from them, and setting the initial value of the iteration count k to 0;

32) dividing the text image into target-character and background parts according to the threshold T_k, and calculating the pixel counts of the target characters and the background and their gray-level averages avg_f^k and avg_b^k;

33) Solving a new threshold value:

T_{k + 1} = ({avg}_{f}^{k} + {avg}_{b}^{k}) / 2;

34) if T_{k+1} = T_k or k > 100, ending; otherwise setting k = k + 1 and going to 32);

35) randomly selecting 100 points in the text image, and initializing the count v of stroke-width calculations to 0;

36) setting a length threshold lengthT according to the font size in the text image, and initializing the number n of valid character-stroke widths to 0;

37) if the pixel value f(x, y) of the point under consideration (x, y) satisfies f(x, y) < T_{k+1}, extending from the point through black pixels along the horizontal and vertical directions until leaving the target area, counting the lengths Hl and Vl of the black-pixel runs in the horizontal and vertical directions, and setting v = v + 1;

38) if Hl < lengthT or Vl < lengthT, taking the smaller of the two lengths as the character-stroke width width(n) of the point and setting n = n + 1; otherwise discarding the point;

39) if v >= 100, exiting; otherwise taking the next point and going to 37);

310) sorting the widths of all valid target-character strokes, taking a value from the sorted widths as the stroke width d, and thereby selecting the size w of the neighborhood calculation template, where w = 2d + 1.

4) Dividing the text information of the text image into text blocks according to the determined size of the neighborhood calculation template.

5) Performing point-by-point binarization processing on each text block using a combination of global and local thresholds. If the gray value f(x, y) of the point under consideration (x, y) is greater than T_{k+1}, the gray value g(x, y) of the target pixel is set to 255. Otherwise, the average gray level avg(x, y) within the w x w template centered on the point is calculated; if f(x, y) is larger than avg(x, y), g(x, y) is set to 255, and otherwise g(x, y) is set to 0.

As shown in fig. 3, the specific process is as follows:

51) comparing the gray value f(x, y) of the point under consideration (x, y) with the global threshold T_{k+1}; if f(x, y) is less than or equal to T_{k+1}, going to 52); otherwise setting the gray value g(x, y) of the target pixel to 255, continuing to scan the next point, and going to 51);

52) calculating the average gray level avg(x, y) within the w x w template centered on the point under consideration;

53) comparing the gray value f(x, y) of the point under consideration with the average gray level avg(x, y) obtained in step 52); if f(x, y) is larger than avg(x, y), setting the gray value g(x, y) of the target pixel to 255; otherwise setting g(x, y) to 0; then continuing to scan the next point and going to 51).

By adopting the method, the binary character image with higher quality can be obtained by traversing each pixel point of the original image.

The specific embodiment is as follows:

For a poor-quality image of the address portion of a second-generation resident identity card, 377 x 164 pixels in size, as shown in FIG. 4, the global brightness is first adjusted adaptively to obtain an image with improved gray-level contrast between the target and the background, as shown in FIG. 5. The brightness-adjusted image is then magnified by a certain scale factor using a bicubic interpolation algorithm, widening the gaps between character strokes, as shown in FIG. 6. Next, the stroke width d of the target characters is measured on FIG. 6 to determine the size w of the neighborhood calculation template, and the text information is divided into text blocks according to the adaptively obtained template size. Finally, FIG. 6 is binarized point by point, text block by text block, using a combination of global and local thresholds, so that the target characters are clearly extracted from the background and noise is filtered out, yielding a high-quality binarized text image, as shown in FIG. 7. Table 1 compares the OCR results for this image with those for images binarized by other binarization algorithms.

TABLE 1

The above description is only a preferred embodiment of the present invention. The invention is not limited to this embodiment, and slight structural changes may be made in implementation. Any changes or modifications that do not depart from the spirit and scope of the invention and that fall within the claims and their technical equivalents are also intended to be covered by the invention.

Claims

1. A binarization processing method of a self-adaptive character image based on pixel neighborhood characteristics is characterized by comprising the following steps:

b) adaptively selecting the size of a neighborhood calculation template;

2. The binarization processing method for the adaptive character image based on the pixel neighborhood characteristics as claimed in claim 1, wherein between the step a) and the step b), further comprising:

3. The binarization processing method for the adaptive character image based on the pixel neighborhood characteristics as claimed in claim 1, wherein the step a) comprises:

I＝Y/255；

a2) calculating the overall gray-level average I_al of the image,

$$I_{al} = \frac{\sum_{\rho \in I} \log(I(\rho))}{N}$$

I′＝I^γ；

4. The binarization processing method for the adaptive character image based on the pixel neighborhood characteristics as claimed in claim 1, wherein the step b) comprises:

The initial value of the iteration times k is 0;

b2) dividing the text image into target-character and background parts according to the threshold T_k, and calculating the pixel counts of the target characters and the background and their gray-level averages avg_f^k and avg_b^k;

b3) Solving a new threshold value:

T_{k + 1} = ({avg}_{f}^{k} + {avg}_{b}^{k}) / 2;

b9) if v >= 100, exiting; otherwise taking the next point and going to b7);

5. The binarization processing method for the adaptive character image based on the pixel neighborhood characteristics as claimed in claim 4, wherein the step d) comprises: