CN107133929A - Low-quality document image binarization method based on background estimation and energy minimization - Google Patents
- Publication number
- CN107133929A CN201710289747.7A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/90—Dynamic range modification of images or parts thereof
- G06T5/94—Dynamic range modification of images or parts thereof based on local image properties, e.g. for local contrast enhancement
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/60—Analysis of geometric attributes
- G06T7/62—Analysis of geometric attributes of area, perimeter, diameter or volume
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20024—Filtering details
- G06T2207/20028—Bilateral filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30176—Document
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Geometry (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a low-quality document image binarization method based on background estimation and energy minimization. A color document image is first converted to grayscale; the image is then denoised with bilateral filtering; the image background is estimated; background subtraction and image enhancement are performed; an energy function and a network graph are constructed; and finally an augmenting-path-based graph cut algorithm minimizes the energy function. The invention significantly improves document image binarization under complex backgrounds and is suitable for document images with multi-color writing, gradual stroke changes, ink bleed-through, stains or texture on the page, uneven illumination, or low contrast.
Description
Technical Field
The invention belongs to the technical field of digital image processing, pattern recognition and machine learning, and particularly relates to a low-quality document image binarization method based on background estimation and energy minimization.
Background
Document Analysis and Recognition (DAR) technology is widely applied in ancient book digitization, layout analysis and character recognition, video subtitle extraction, text information retrieval and similar fields. It mainly comprises image acquisition, binarization, skew correction, character segmentation and recognition. Image binarization is one of the key preprocessing steps: it converts a grayscale image into a binary image so that the character foreground is separated from the document background. Because the quality of binarization directly affects the performance of the whole DAR system, many researchers have studied it in recent years and proposed numerous algorithms. Binarization of low-quality document images nevertheless remains a challenge because of factors such as poor image contrast, ink saturation, page smearing, or non-uniform illumination.
Binarization algorithms can be roughly classified into global threshold methods and local threshold methods. A global threshold method uses a single threshold to divide a document image into two classes, characters (foreground) and background. The Otsu algorithm, for example, selects the optimal threshold from the gray histogram of the image so that the between-class variance of the foreground and background pixels after thresholding is maximal. Global thresholding segments well when foreground and background differ strongly, i.e. when the histogram is clearly bimodal, but it loses some or even all foreground details on low-quality document images.
A local threshold method (also called an adaptive threshold method) sets different thresholds in different parts of the image by sliding a window over the document image. Algorithms such as Niblack, Sauvola and Wolf use the gray mean and variance in the neighborhood of each pixel to construct a threshold surface, and their performance depends on the size of the sliding window, the thickness of the character strokes and so on. The window size must be adjusted dynamically for document images of different quality to obtain the best thresholding result, and when the image contrast is low these methods produce a large number of noise points or misclassifications.
In addition, researchers at home and abroad have proposed more complex algorithms, such as local contrast methods, background estimation with stroke edge detection, Laplacian energy methods and convolutional neural network methods. However, none of these methods handles binarization well for complex document backgrounds with low contrast, ink saturation, gradual illumination changes, smudges and textures.
Disclosure of Invention
To solve the above technical problems, the invention provides a low-quality document image binarization method based on background estimation and energy minimization. It significantly improves document image binarization under complex backgrounds and is suitable for document images with multi-color writing, gradual stroke changes, ink bleed-through, stains or texture on the page, uneven illumination, low contrast and the like.
The technical scheme adopted by the invention is a low-quality document image binarization method based on background estimation and energy minimization, characterized by comprising the following steps:
Step 1: perform grayscale preprocessing on the color document image;
Step 2: denoise the image with bilateral filtering;
Step 3: estimate the image background, which specifically comprises the following sub-steps:
Step 3.1: apply the stroke width transform to the image processed in step 2;
Step 3.2: calculate the simulated distance and the imaging height;
Step 3.3: weaken the dark features in the image processed in step 2 through two morphological closing operations;
Step 3.4: down-sample and up-sample the image, combining the results of step 3.2 and step 3.3;
Step 4: perform background subtraction and image enhancement, which specifically comprises the following sub-steps:
Step 4.1: background subtraction;
Calculate the absolute difference between the bilaterally filtered image from step 2 and the background estimate from step 3. A pixel whose gray level is zero in the difference image is a high-confidence background pixel, and its gray value is set to 255;
Step 4.2: histogram equalization;
Invert the non-zero pixels of the background-subtracted image to obtain their gray values, then apply histogram equalization to the whole image to increase the contrast between foreground and background;
Step 5: construct the energy function;
Step 6: construct the network graph;
Step 7: minimize the energy function with an augmenting-path-based graph cut algorithm.
Compared with existing algorithms, the method has the following notable advantages:
(1) the invention uses the minimum-mean method for grayscale preprocessing of the color document image. The resulting grayscale image is color independent, which both increases the contrast between foreground and background pixels and reduces the gray variance among foreground pixels;
(2) the invention uses a nonlinear bilateral filtering algorithm for image denoising. Because it considers both the spatial proximity and the gray similarity of pixels, it reduces noise while preserving edges;
(3) the stroke width in the document image is estimated with the stroke width transform. Stroke characteristics are essentially unique to characters (although some degradation factors may still interfere and must be removed by subsequent operations), and the transform applies to texts in different languages;
(4) based on a visual acuity test model, the method uses morphological closing to estimate the image background and applies histogram equalization to the background-subtracted image, which effectively suppresses degradation factors and enhances local contrast;
(5) the invention binarizes the document image with a max-flow/min-cut combinatorial optimization algorithm. The graph cut algorithm is general, practical and fast (close to real time), and suits a wide range of degraded low-quality document images.
Drawings
FIG. 1 is a flow chart of an embodiment of the invention;
FIG. 2 is a schematic diagram of the resolution angle in the visual acuity test model of an embodiment of the invention.
Detailed Description
To help those of ordinary skill in the art understand and implement the invention, it is described in further detail below with reference to the accompanying drawings and embodiments. The embodiments described here only illustrate and explain the invention and do not limit it.
The main idea of the invention is as follows: as a target image moves farther from an observer, less and less of its detail (stroke) information can be observed, while the perceived gray level and depth of the background are not affected by distance. The approximate background of the image can therefore be estimated by simulating the viewing of the image from a distance. An energy function is then constructed for the image with the estimated background removed, and binarization is achieved with a graph cut algorithm.
Referring to fig. 1, the method for binarizing a low-quality document image based on background estimation and energy minimization provided by the present invention includes the following steps:
Step 1: minimum-mean graying;
The invention uses a minimum-mean method for grayscale preprocessing of the color document image f(x, y). The calculation formula is:
$$f_{gray}(x,y)=\frac{1}{2}\left[\min_i\big(f_i(x,y)\big)+\frac{1}{3}\sum_i f_i(x,y)\right],$$
where $f_i(x,y)$ are the R, G and B color component images and $f_{gray}(x,y)$ is the transformed grayscale image.
The resulting grayscale image is color independent: the contrast between foreground and background pixels is high, while the gray difference among foreground pixels is small.
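The minimum-mean conversion, f_gray = 1/2 [min_i f_i + 1/3 Σ_i f_i], can be sketched in a few lines of plain Python. This is a minimal illustration only: the helper name `min_mean_gray`, the list-of-tuples image layout and the sample pixel values are assumptions, not part of the patent.

```python
def min_mean_gray(rgb):
    """Apply f_gray = 0.5 * (min(R, G, B) + (R + G + B) / 3) to every pixel.

    `rgb` is a list of rows; each row is a list of (R, G, B) tuples.
    """
    return [[0.5 * (min(px) + sum(px) / 3.0) for px in row] for row in rgb]


# Hypothetical sample: white paper, pure blue ink, black ink.
page = [[(255, 255, 255), (0, 0, 255), (0, 0, 0)]]
gray = min_mean_gray(page)
```

A plain channel average would map the blue pixel to 85; the minimum term pulls it down to 42.5, closer to black ink, which is one way to read the "color independence" described above.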
Step 2: bilateral filtering and denoising;
The invention uses a nonlinear bilateral filtering algorithm for image denoising. The output pixel value $\hat{f}(i,j)$ depends on a weighted combination of the pixel values f(k, l) in the neighborhood S:
$$\hat{f}(i,j)=\frac{\sum_{(k,l)\in S} f(k,l)\,w(i,j,k,l)}{\sum_{(k,l)\in S} w(i,j,k,l)},$$
where the weighting factor w(i, j, k, l) is the product of the domain kernel $d(i,j,k,l)$ and the range kernel $r(i,j,k,l)$, i.e.
$$w(i,j,k,l)=d(i,j,k,l)\cdot r(i,j,k,l)=\exp\left(-\frac{(i-k)^2+(j-l)^2}{2\sigma_d^2}-\frac{\|f(i,j)-f(k,l)\|^2}{2\sigma_r^2}\right),$$
and $\sigma_d^2$ and $\sigma_r^2$ are the Gaussian distance variance and the Gaussian gray variance, respectively.
Because the bilateral filter considers both the spatial proximity and the gray similarity of pixels, it denoises while preserving edges.
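A direct, unoptimized sketch of this weighted average follows, assuming the usual Gaussian form for both kernels. The window radius and the two sigma values are illustrative assumptions, as is the choice to clip the neighborhood S at the image border.

```python
import math


def bilateral_filter(img, radius=2, sigma_d=2.0, sigma_r=25.0):
    """Bilateral filter on a 2-D list of gray values.

    The neighborhood S is a (2 * radius + 1)^2 window clipped at the borders.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            num = den = 0.0
            for k in range(max(0, i - radius), min(h, i + radius + 1)):
                for l in range(max(0, j - radius), min(w, j + radius + 1)):
                    # Domain kernel: spatial closeness of (k, l) to (i, j).
                    d = math.exp(-((i - k) ** 2 + (j - l) ** 2) / (2 * sigma_d ** 2))
                    # Range kernel: gray-level similarity to the centre pixel.
                    r = math.exp(-((img[i][j] - img[k][l]) ** 2) / (2 * sigma_r ** 2))
                    num += img[k][l] * d * r
                    den += d * r
            out[i][j] = num / den
    return out


# Hypothetical noisy step edge: dark stroke (about 20) beside bright paper (about 200).
noisy = [[18, 22, 20, 201, 198, 202],
         [21, 19, 23, 199, 203, 197]]
smooth = bilateral_filter(noisy)
```

Pixels across the edge differ by about 180 gray levels, so their range weight is negligible and the edge survives while the small fluctuations on each side are averaged out.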
Step 3: image background estimation;
Step 3.1, stroke width transform (SWT): the Canny operator is used to detect edges in the bilaterally filtered grayscale image. For each edge pixel p, the corresponding edge pixel q is searched for along the gradient direction of p. The Euclidean distance $\|p-q\|$ between the two points is the stroke width estimate of every pixel on the path [p, q], unless a pixel has already been assigned a smaller width value. The stroke width SWE of the image is the mathematical expectation of the stroke width estimates of all non-zero pixels:
$$SWE=\frac{1}{n}\sum_{s(x,y)\neq 0}s(x,y),$$
where n is the total number of non-zero pixels in the stroke width transform output image s(x, y).
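Given an SWT output s(x, y), the image-level estimate SWE is just the mean of its non-zero entries. The sketch below assumes the transform has already been computed; the sample array and the helper name `stroke_width_estimate` are hypothetical.

```python
def stroke_width_estimate(swt):
    """Mean of the non-zero entries of a stroke-width-transform output s(x, y)."""
    widths = [v for row in swt for v in row if v != 0]
    if not widths:
        raise ValueError("SWT output contains no stroke pixels")
    return sum(widths) / len(widths)


# Hypothetical SWT output; 0 marks pixels that lie on no stroke path.
s = [[0, 3, 3, 0],
     [0, 4, 4, 0],
     [0, 0, 5, 5]]
swe = stroke_width_estimate(s)
```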
Step 3.2, calculating the simulated distance and the imaging height: in the visual acuity test model, the minimum resolution angle of the human eye (1′) corresponds to the smallest image that can be perceived, as shown in FIG. 2. The contrast of a low-quality document image is usually lower than that of the binary optotypes on an eye chart, so the minimum visual angle of the corresponding target is usually larger than in the acuity test; moreover, the thicker the strokes, the greater the observation distance at which stroke details can no longer be perceived. The resolution angle corresponding to the stroke width of the document image is therefore assumed to be 3′, and the simulated observation distance $d_0$ is determined from the stroke width estimated in step 3.1:
$$d_0 = SWE \times \cot\theta,$$
where θ is the observation resolution angle, here 3′.
Because the crystalline lens of the human eye resembles a convex lens, the height $h_i$ of the retinal image of a target at distance $d_0$ follows from the lens imaging law and the focal length equation:
$$h_i=\frac{f}{d_0-f}\,h_0,$$
where f is the distance between the lens and the retina, i.e. the focal length of the lens (about 17 mm), and $h_0$ is the original height of the target image.
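The two relations d0 = SWE × cot θ and h_i = f / (d0 − f) × h0 can be evaluated as below. The text does not say how the 17 mm focal length is reconciled with a stroke width measured in pixels, so the conversion through a 300 dpi scan resolution is purely an assumption for illustration, as are the function names and sample values.

```python
import math


def simulated_distance(swe, theta_arcmin=3.0):
    """d0 = SWE * cot(theta), with theta given in arc minutes."""
    theta = math.radians(theta_arcmin / 60.0)
    return swe / math.tan(theta)


def imaged_height(h0, d0, f):
    """h_i = f / (d0 - f) * h0; all three lengths must share one unit."""
    return f / (d0 - f) * h0


# Assumption: express the 17 mm focal length in pixels of a 300 dpi scan.
f_px = 17.0 / 25.4 * 300
d0 = simulated_distance(4.0)            # stroke width of 4 pixels
h_i = imaged_height(2000.0, d0, f_px)   # page height of 2000 pixels
```

With these sample values the page shrinks to roughly a twentieth of its height, which is what removes stroke-level detail in the later down-sampling step.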
Step 3.3, morphological closing: dark features (character strokes) in the document image are weakened through two morphological closing operations, both using circular structuring elements. The diameter of the structuring element in the first closing is set to the stroke width of the image, and the diameter in the second closing is 12 pixels larger than the stroke width.
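A grayscale closing is a dilation (local maximum) followed by an erosion (local minimum) over the structuring element. The sketch below is a slow reference version under stated assumptions: the disk discretization, the decision to ignore out-of-image pixels at the border, and the helper names are all illustrative choices.

```python
def disk_offsets(diameter):
    """Offsets (dy, dx) of a discrete disk with the given diameter in pixels."""
    r = diameter / 2.0
    n = int(r)
    return [(dy, dx) for dy in range(-n, n + 1) for dx in range(-n, n + 1)
            if dy * dy + dx * dx <= r * r]


def _morph(img, offsets, pick):
    """Apply `pick` (max or min) over the structuring element at every pixel."""
    h, w = len(img), len(img[0])
    return [[pick(img[y + dy][x + dx] for dy, dx in offsets
                  if 0 <= y + dy < h and 0 <= x + dx < w)
             for x in range(w)] for y in range(h)]


def closing(img, diameter):
    """Grayscale closing: dilation (local max) followed by erosion (local min)."""
    offsets = disk_offsets(diameter)
    return _morph(_morph(img, offsets, max), offsets, min)


def weaken_strokes(img, swe, extra=12):
    """Two closings: diameter SWE first, then diameter SWE + 12."""
    return closing(closing(img, swe), swe + extra)


# Hypothetical 7x7 page (gray 200) with a one-pixel-wide dark stroke (gray 30).
page = [[30 if x == 3 else 200 for x in range(7)] for _ in range(7)]
closed = weaken_strokes(page, 3)
```

In this toy case the stroke is narrower than the first structuring element, so the closing replaces it entirely with the surrounding paper value.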
Step 3.4, image down-sampling and up-sampling: when the distance to the target image is $d_0$, the height of the observed image is $h_i$. The image after the morphological closing operations is therefore scaled down to height $h_i$ by bilinear down-sampling, and the scaled image is then restored to its original size by bilinear interpolation. The resulting image is the estimated background image. The aspect ratio is kept unchanged during scaling.
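The shrink-then-restore step can be sketched with a small bilinear resampler. The pixel-centre coordinate mapping and the rounding of h_i to a whole number of rows are implementation assumptions; `resize_bilinear` and `estimate_background` are hypothetical names.

```python
def resize_bilinear(img, new_h, new_w):
    """Resample a 2-D list of gray values with bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(new_h):
        # Map the output pixel centre back into source coordinates, clamped.
        sy = min(max((y + 0.5) * h / new_h - 0.5, 0.0), h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(new_w):
            sx = min(max((x + 0.5) * w / new_w - 0.5, 0.0), w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def estimate_background(closed, h_i):
    """Shrink to height h_i (aspect ratio kept), then restore the original size."""
    h, w = len(closed), len(closed[0])
    small_h = max(1, round(h_i))
    small_w = max(1, round(w * small_h / h))
    small = resize_bilinear(closed, small_h, small_w)
    return resize_bilinear(small, h, w)


# Hypothetical closed image with a horizontal illumination gradient.
closed_img = [[100 + 10 * x for x in range(8)] for _ in range(6)]
bg = estimate_background(closed_img, 2.4)
```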
Step 4: background subtraction and image enhancement;
Step 4.1, background subtraction: the absolute difference between the bilaterally filtered image and the background estimate is calculated. A pixel whose gray level is zero in the difference image is a high-confidence background pixel, and its gray value is set to 255 (white).
Step 4.2, histogram equalization: the non-zero pixels of the background-subtracted image are inverted to obtain their gray values, and histogram equalization is then applied to the whole image to increase the contrast between foreground and background.
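Steps 4.1 and 4.2 can be sketched together as follows. The inversion is taken to mean 255 minus the difference, and the equalization uses the common cumulative-histogram mapping for 8-bit images; both readings, along with the function names and sample values, are assumptions.

```python
def subtract_background(filtered, background):
    """|filtered - background|; zero differences become 255, others are inverted."""
    out = []
    for f_row, b_row in zip(filtered, background):
        row = []
        for f, b in zip(f_row, b_row):
            diff = abs(int(round(f)) - int(round(b)))
            row.append(255 if diff == 0 else 255 - diff)
        out.append(row)
    return out


def equalize_histogram(img):
    """8-bit histogram equalization through the cumulative histogram."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    cdf, running = [0] * 256, 0
    for g in range(256):
        running += hist[g]
        cdf[g] = running
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # flat image: nothing to stretch
        return [row[:] for row in img]
    lut = [max(0, round((c - cdf_min) * 255 / (total - cdf_min))) for c in cdf]
    return [[lut[v] for v in row] for row in img]


# Hypothetical filtered image and a flat background estimate.
filtered = [[200, 120, 200], [198, 60, 200]]
background = [[200, 200, 200], [200, 200, 200]]
enhanced = equalize_histogram(subtract_background(filtered, background))
```

After subtraction the two stroke pixels sit at 175 and 115; equalization spreads the occupied gray levels over the full range, so the darkest stroke pixel maps to 0 and the confident background stays at 255.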
Step 5: constructing the energy function;
The Laplacian energy function consists of data terms and boundary terms. A data term is the cost of assigning a label to a pixel; for example, $L^0_{ij}$ ($L^1_{ij}$) is the cost of assigning pixel $p_{ij}$ the label 0 (1). A boundary term is the discontinuity cost of adjacent pixels, i.e. the cost of giving two adjacent pixels different labels.
The Laplacian of the image responds where the gray level changes abruptly. When the Laplacian value at a pixel is positive, the pixel generally lies in a trough (dark region) of the grayscale image; conversely, when the Laplacian value is negative, the pixel lies on a peak (bright region). The data terms of the Laplacian energy function are therefore defined as:
$$L^0_{ij}=-\nabla^2 I_{ij}+\phi;$$
$$L^1_{ij}=\nabla^2 I_{ij}+\phi;$$
where $\nabla^2 I_{ij}$ is the Laplacian value at pixel $p_{ij}$.
The boundary terms are divided into boundary terms in the horizontal direction and boundary terms in the vertical direction. The invention uses the Canny edge detection operator to determine them: pixels near an edge are very likely to be discontinuous, so the discontinuity cost between the pixels on the two sides of an edge can be set directly to zero. In the boundary terms, $E_{ij}$ denotes the edge detection result at pixel $p_{ij}$, $I_{ij}$ denotes the gray value of pixel $p_{ij}$, and c is an arbitrary constant (c > 0).
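The data terms L0 = −∇²I + φ and L1 = ∇²I + φ (the form given in claim 8) can be computed as sketched below. The 4-neighbour discretization of the Laplacian, the replicated borders and the default φ = 0 are assumptions made for illustration; the boundary terms are not covered here.

```python
def laplacian(img):
    """4-neighbour discrete Laplacian with replicated borders."""
    h, w = len(img), len(img[0])

    def at(y, x):
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [[at(y - 1, x) + at(y + 1, x) + at(y, x - 1) + at(y, x + 1) - 4 * at(y, x)
             for x in range(w)] for y in range(h)]


def data_terms(img, phi=0.0):
    """Return (L0, L1): per-pixel costs of assigning label 0 and label 1."""
    lap = laplacian(img)
    l0 = [[-v + phi for v in row] for row in lap]
    l1 = [[v + phi for v in row] for row in lap]
    return l0, l1


# Hypothetical patch: one dark pixel (a "trough") surrounded by bright paper.
patch = [[200, 200, 200],
         [200, 50, 200],
         [200, 200, 200]]
l0, l1 = data_terms(patch)
```

At the dark centre pixel the Laplacian is positive, so L0 is strongly negative and L1 strongly positive: label 0 (ink) is the cheap choice there, which matches the trough and peak reading of the Laplacian sign.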
Step 6: constructing the network graph;
Each pixel $p_{ij}$ of the image forms an intermediate node of the network graph, and two terminal nodes s and t are added. An edge connecting intermediate nodes is called an n-link, and its weight is determined by the boundary terms of the energy function; an edge connecting an intermediate node to a terminal node is called a t-link, and its weight is determined by the data terms of the energy function. Accordingly, the edges $(p_{ij}, s)$ and $(p_{ij}, t)$ are weighted by the data terms, and the edges $(p_{ij}, p_{i+1,j})$ and $(p_{ij}, p_{i,j+1})$ are weighted by the boundary terms.
Step 7: minimizing the energy function with an augmenting-path-based graph cut algorithm;
Two search trees S and T are built on the network graph, rooted at the source s and the sink t respectively. Tree nodes are of two kinds, active and passive. An active node can acquire free nodes through unsaturated edges and turn them into active nodes, which is how the trees grow.
Step 7.1, growth stage: the two trees grow until their active nodes meet, which yields a path from the source to the sink;
Step 7.2, augmentation stage: the path found in step 7.1 is augmented. Augmentation saturates at least one edge; the child nodes attached to saturated edges become orphans, and trees S and T split into several subtrees;
Step 7.3, adoption stage: a new parent is sought for each orphan. If no parent satisfying the conditions exists, the orphan becomes a free node. This continues until all orphans have been processed.
The three stages are repeated until the two trees can no longer grow and are separated by saturated edges. This yields the minimum cut of the graph, i.e. the minimum of the energy function, and thus the final binarization of the image.
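The sketch below shows the same max-flow/min-cut idea on a tiny pixel graph. It deliberately swaps in a simpler augmenting-path method (breadth-first shortest paths, Edmonds-Karp) instead of the two-tree growth, augmentation and adoption scheme. Several choices are assumptions: source links carry the label-1 cost and sink links the label-0 cost, each pixel's pair of costs is shifted to be non-negative, and every n-link has the same weight c (no Canny-based zeroing).

```python
from collections import deque


def min_cut_labels(l0, l1, c=1.0):
    """Label each pixel 0 or 1 by a minimum s-t cut (Edmonds-Karp max-flow)."""
    h, w = len(l0), len(l0[0])
    s, t = "s", "t"
    cap = {}

    def add_edge(u, v, c_uv, c_vu=0.0):
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0.0) + c_uv
        cap[v][u] = cap[v].get(u, 0.0) + c_vu

    for i in range(h):
        for j in range(w):
            base = min(l0[i][j], l1[i][j])        # shift so capacities are >= 0
            add_edge(s, (i, j), l1[i][j] - base)  # cut if the pixel takes label 1
            add_edge((i, j), t, l0[i][j] - base)  # cut if the pixel takes label 0
            if i + 1 < h:
                add_edge((i, j), (i + 1, j), c, c)
            if j + 1 < w:
                add_edge((i, j), (i, j + 1), c, c)

    while True:
        # Breadth-first search for a shortest augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, residual in cap[u].items():
                if residual > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        # Collect the path, find its bottleneck, and push that much flow.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow

    # Pixels still reachable from s are on the source side and get label 0.
    return [[0 if (i, j) in parent else 1 for j in range(w)] for i in range(h)]


# Hypothetical 1x3 strip: dark pixel, ambiguous pixel, bright pixel.
l0 = [[-10.0, 1.0, 10.0]]
l1 = [[10.0, -1.0, -10.0]]
labels = min_cut_labels(l0, l1, c=3.0)
```

For this strip the cut separates the first pixel from the other two, giving energy −21 + 3 = −18, lower than any other labeling. The two-tree algorithm described in step 7 reaches the same minimum while reusing its search trees between augmentations.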
It should be understood that parts of the specification not set forth in detail are well within the prior art.
It should be understood that the above description of the preferred embodiments is given for clarity and not for any purpose of limitation, and that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. A low-quality document image binarization method based on background estimation and energy minimization, characterized by comprising the following steps:
Step 1: perform grayscale preprocessing on the color document image;
Step 2: denoise the image with bilateral filtering;
Step 3: estimate the image background, which specifically comprises the following sub-steps:
Step 3.1: apply the stroke width transform to the image processed in step 2;
Step 3.2: calculate the simulated distance and the imaging height;
Step 3.3: weaken the dark features in the image processed in step 2 through two morphological closing operations;
Step 3.4: down-sample and up-sample the image, combining the results of step 3.2 and step 3.3;
Step 4: perform background subtraction and image enhancement, which specifically comprises the following sub-steps:
Step 4.1: background subtraction;
calculate the absolute difference between the bilaterally filtered image from step 2 and the background estimate from step 3, wherein a pixel whose gray level is zero in the difference image is a high-confidence background pixel and its gray value is set to 255;
Step 4.2: histogram equalization;
invert the non-zero pixels of the background-subtracted image to obtain their gray values, then apply histogram equalization to the whole image to increase the contrast between foreground and background;
Step 5: construct the energy function;
Step 6: construct the network graph;
Step 7: minimize the energy function with an augmenting-path-based graph cut algorithm.
2. The low-quality document image binarization method based on background estimation and energy minimization according to claim 1, characterized in that: in step 1, grayscale preprocessing is performed on the color document image f(x, y) with a minimum-mean method, the preprocessing formula being:
$$f_{gray}(x,y)=\frac{1}{2}\left[\min_i\big(f_i(x,y)\big)+\frac{1}{3}\sum_i f_i(x,y)\right],$$
where $f_i(x,y)$ are the R, G and B color component images and $f_{gray}(x,y)$ is the transformed grayscale image.
3. The low-quality document image binarization method based on background estimation and energy minimization according to claim 1, characterized in that: in step 2, a nonlinear bilateral filtering algorithm is used for image denoising, and the output pixel value $\hat{f}(i,j)$ depends on a weighted combination of the pixel values f(k, l) in the neighborhood S, the calculation formula being:
$$\hat{f}(i,j)=\frac{\sum_{(k,l)\in S} f(k,l)\,w(i,j,k,l)}{\sum_{(k,l)\in S} w(i,j,k,l)},$$
where the weighting factor w(i, j, k, l) is the product of the domain kernel $d(i,j,k,l)$ and the range kernel $r(i,j,k,l)$, i.e.
$$w(i,j,k,l)=d(i,j,k,l)\cdot r(i,j,k,l)=\exp\left(-\frac{(i-k)^2+(j-l)^2}{2\sigma_d^2}-\frac{\|f(i,j)-f(k,l)\|^2}{2\sigma_r^2}\right),$$
and $\sigma_d^2$ and $\sigma_r^2$ are the Gaussian distance variance and the Gaussian gray variance, respectively.
4. The low-quality document image binarization method based on background estimation and energy minimization according to claim 1, characterized in that: in step 3.1, the Canny operator is used to detect edges in the bilaterally filtered grayscale image; for each edge pixel p, the corresponding edge pixel q is searched for along the gradient direction of p; the Euclidean distance $\|p-q\|$ between the two points is the stroke width estimate of every pixel on the path [p, q], unless a pixel has already been assigned a smaller width value; the stroke width SWE of the image is the mathematical expectation of the stroke width estimates of all non-zero pixels, the calculation formula being:
$$SWE=\frac{1}{n}\sum_{s(x,y)\neq 0}s(x,y),$$
where n is the total number of non-zero pixels in the stroke width transform output image s(x, y).
5. The low-quality document image binarization method based on background estimation and energy minimization according to claim 1, characterized in that: in step 3.2, the simulated observation distance $d_0$ is determined from the stroke width SWE estimated in step 3.1, the calculation formula being:
$$d_0 = SWE \times \cot\theta,$$
where θ is the observation resolution angle;
the height $h_i$ of the retinal image of a target at distance $d_0$ is obtained from the lens imaging law and the focal length equation, the calculation formula being:
$$h_i=\frac{f}{d_0-f}\,h_0,$$
where f is the distance between the lens of the human eye and the retina, i.e. the focal length of the lens, and $h_0$ is the original height of the target image.
6. The low-quality document image binarization method based on background estimation and energy minimization according to claim 1, characterized in that: in step 3.3, circular structuring elements are used in both closing operations; the diameter of the structuring element in the first closing is set to the stroke width of the image, and the diameter in the second closing is 12 pixels larger than the stroke width of the image.
7. The low-quality document image binarization method based on background estimation and energy minimization according to claim 1, characterized in that: in step 3.4, when the distance to the target image is $d_0$, the height of the observed image is $h_i$; the image after the morphological closing operations is therefore scaled down to height $h_i$ by bilinear down-sampling; the scaled image is then restored to its original size by bilinear interpolation, and the resulting image is the estimated background image; the aspect ratio is kept unchanged during scaling.
8. The low-quality document image binarization method based on background estimation and energy minimization according to claim 1, characterized in that: in step 5, the specific form of the Laplacian energy function is:
$$L^{0}_{ij} = -\nabla^{2} I_{ij} + \phi;$$
$$L^{1}_{ij} = \nabla^{2} I_{ij} + \phi;$$
wherein the data term represents the cost of assigning a label to a pixel: $L^{0}_{ij}$ and $L^{1}_{ij}$ are the costs of assigning pixel $p_{ij}$ the label 0 and the label 1, respectively, and $\nabla^{2} I_{ij}$ is the Laplacian value at pixel $p_{ij}$; the boundary term represents the discontinuity cost of adjacent pixels, namely the cost incurred when two adjacent pixels are given different labels, and is divided into a boundary term in the horizontal direction and a boundary term in the vertical direction; $E_{ij}$ is the edge detection result at pixel $p_{ij}$, $I_{ij}$ is the image value at pixel $p_{ij}$, and $c$ is an arbitrary constant with $c > 0$.
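The two data terms can be evaluated per pixel as in the sketch below. The 5-point stencil with replicated borders is an assumed discretization of the Laplacian, and `phi` is treated as a caller-supplied scalar or array because the claim text does not define it; both function names are hypothetical.

```python
import numpy as np

def laplacian(img: np.ndarray) -> np.ndarray:
    """5-point discrete Laplacian with edge-replicated borders (assumed)."""
    p = np.pad(img.astype(float), 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4.0 * p[1:-1, 1:-1])

def data_terms(img: np.ndarray, phi=0.0):
    """Return (L0, L1): the costs of giving each pixel label 0 or label 1."""
    lap = laplacian(img)
    l0 = -lap + phi   # cost of label 0
    l1 = lap + phi    # cost of label 1
    return l0, l1

# On a flat image the Laplacian vanishes, so both costs reduce to phi.
flat = np.full((5, 7), 90.0)
l0, l1 = data_terms(flat, phi=2.0)
```

Because the two terms differ only in the sign of the Laplacian, their sum is always 2φ, which gives a quick sanity check on any implementation.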
9. The binarization method for low-quality document images based on background estimation and energy minimization as claimed in claim 8, wherein the specific implementation procedure of step 6 is as follows: each pixel $p_{ij}$ of the image forms an intermediate node of a network graph, and two additional terminal nodes $s$ and $t$ are added; an edge connecting two intermediate nodes is called an n-link, and its weight is determined by the boundary term of the energy function; an edge connecting an intermediate node to a terminal node is called a t-link, and its weight is determined by the data term of the energy function; accordingly, the weights of the edges $(p_{ij}, s)$ and $(p_{ij}, t)$ are taken from the data term, and the weights of the edges $(p_{ij}, p_{i+1,j})$ and $(p_{ij}, p_{i,j+1})$ are taken from the boundary term.
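The graph layout of claim 9 can be sketched as a dictionary of edge capacities. The weight arrays are supplied by the caller, since this text does not state which data term goes to which terminal or which boundary term goes to which direction. Storing each n-link in both directions with the same capacity is an assumption, and the function name is hypothetical.

```python
import numpy as np

def build_graph(w_s, w_t, w_i_next, w_j_next):
    """Build {(u, v): capacity} for a pixel grid with terminals 's' and 't'.

    w_s[i, j]      weight of the t-link between 's' and pixel (i, j)
    w_t[i, j]      weight of the t-link between pixel (i, j) and 't'
    w_i_next[i, j] weight of the n-link between (i, j) and (i + 1, j)
    w_j_next[i, j] weight of the n-link between (i, j) and (i, j + 1)
    """
    rows, cols = w_s.shape
    edges = {}
    for i in range(rows):
        for j in range(cols):
            p = (i, j)
            edges[("s", p)] = float(w_s[i, j])
            edges[(p, "t")] = float(w_t[i, j])
            if i + 1 < rows:
                q = (i + 1, j)
                edges[(p, q)] = edges[(q, p)] = float(w_i_next[i, j])
            if j + 1 < cols:
                q = (i, j + 1)
                edges[(p, q)] = edges[(q, p)] = float(w_j_next[i, j])
    return edges

# A 2 x 3 grid with unit weights everywhere.
ones = np.ones((2, 3))
graph = build_graph(ones, ones, ones, ones)
```

For a grid of R rows and C columns this yields 2RC t-links and 2[(R - 1)C + R(C - 1)] directed n-links, so the 2 x 3 example has 12 + 14 = 26 entries.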
10. The binarization method for low-quality document images based on background estimation and energy minimization according to any one of claims 1-9, characterized in that the specific implementation process of step 7 is as follows: two search trees S and T are established on the network graph, with their root nodes located at the source $s$ and the sink $t$ respectively; the nodes of the search trees are of two types, active nodes and passive nodes, wherein an active node can grow its tree by acquiring free nodes across unsaturated edges, the acquired nodes becoming active nodes;
step 7.1: growth stage;
the two trees grow continuously until active nodes of the two trees meet, at which point a path from the source to the sink has been found;
step 7.2: augmentation stage;
the path obtained in step 7.1 is augmented; the augmentation saturates at least one edge, the child nodes attached through the saturated edges become orphan nodes, and the trees S and T are split into several subtrees;
step 7.3: adoption stage;
a new parent node is sought for each orphan node; if no parent node satisfying the conditions exists, the orphan node becomes a free node; this continues until all orphan nodes have been processed;
step 7.4: steps 7.1 to 7.3 are executed repeatedly until the two trees can no longer grow and are separated only by saturated edges, which yields the minimum cut of the graph, namely the minimum of the energy function, and thereby the final binarization of the image.
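The result that step 7 produces, a minimum cut separating the source from the sink, can be illustrated on a tiny graph. The sketch below deliberately swaps in the simpler Edmonds-Karp method (breadth-first augmenting paths) instead of the two-tree growth, augmentation and adoption procedure described in the claim. It computes the same maximum flow and minimum cut but is not the claimed algorithm. The node names and capacities are made up for illustration.

```python
from collections import defaultdict, deque

def min_cut(capacity, source, sink):
    """Return (max_flow, nodes on the source side of a minimum cut)."""
    residual = defaultdict(lambda: defaultdict(float))
    for (u, v), c in capacity.items():
        residual[u][v] += c
        residual[v][u] += 0.0   # make sure the reverse edge exists

    def reachable():
        """Breadth-first search over edges with remaining capacity."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    if v == sink:
                        return parent
                    queue.append(v)
        return parent

    flow = 0.0
    while True:
        parent = reachable()
        if sink not in parent:
            # No augmenting path left: the reached nodes form the source side.
            return flow, set(parent)
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

# Two "pixels" a and b: a is strongly tied to s, b is strongly tied to t.
capacity = {("s", "a"): 5.0, ("a", "t"): 1.0,
            ("s", "b"): 1.0, ("b", "t"): 5.0,
            ("a", "b"): 1.0, ("b", "a"): 1.0}
flow, source_side = min_cut(capacity, "s", "t")
```

In this example the cut severs the edges a→t, a→b and s→b, so pixel a stays with the source and pixel b with the sink; which side corresponds to which binarization label depends on how the t-link weights were assigned.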
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710289747.7A CN107133929B (en) | 2017-04-27 | 2017-04-27 | Low-quality document image binarization method based on background estimation and energy minimization |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107133929A (en) | 2017-09-05 |
CN107133929B (en) | 2019-06-11 |
Family
ID=59716294
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201710289747.7A | Expired - Fee Related | CN107133929B (en) | 2017-04-27 | 2017-04-27 | Low-quality document image binarization method based on background estimation and energy minimization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107133929B (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106295648A (en) * | 2016-07-29 | 2017-01-04 | 湖北工业大学 | Low-quality document image binarization method based on multispectral imaging technology |
Non-Patent Citations (3)
Title |
---|
Nicholas R. Howe et al.: "A Laplacian Energy for Document Binarization", 2011 International Conference on Document Analysis and Recognition * |
Rafael G. Mesquita et al.: "Parameter tuning for document image binarization using a racing", Expert Systems with Applications * |
Xiong Wei (熊炜) et al.: "Research on binarization algorithms for low-quality document images", Computer Applications and Software * |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107705300A (en) * | 2017-09-28 | 2018-02-16 | 成都大熊智能科技有限责任公司 | A kind of method that blank page detection is realized based on morphological transformation |
CN108830186A (en) * | 2018-05-28 | 2018-11-16 | 腾讯科技(深圳)有限公司 | Method for extracting content, device, equipment and the storage medium of text image |
CN109918482A (en) * | 2019-03-14 | 2019-06-21 | 西安航空学院 | A kind of Students' Innovation plan of starting an undertaking the system of analysis and appraisal |
CN111754416A (en) * | 2019-03-29 | 2020-10-09 | 通用电气精准医疗有限责任公司 | System and method for background noise reduction in magnetic resonance images |
CN111681175A (en) * | 2020-05-09 | 2020-09-18 | 浙江大学 | Preprocessing method for scanning gray document image |
CN111583157A (en) * | 2020-05-13 | 2020-08-25 | 杭州睿琪软件有限公司 | Image processing method, system and computer readable storage medium |
CN112488107A (en) * | 2020-12-04 | 2021-03-12 | 北京华录新媒信息技术有限公司 | Video subtitle processing method and processing device |
CN112836541A (en) * | 2021-02-03 | 2021-05-25 | 华中师范大学 | Automatic acquisition and identification method and device for 32-bit bar code of cigarette |
CN112836541B (en) * | 2021-02-03 | 2022-06-03 | 华中师范大学 | Automatic acquisition and identification method and device for 32-bit bar code of cigarette |
CN112837329A (en) * | 2021-03-01 | 2021-05-25 | 西北民族大学 | Tibetan ancient book document image binarization method and system |
CN112837329B (en) * | 2021-03-01 | 2022-07-19 | 西北民族大学 | Tibetan ancient book document image binarization method and system |
CN113129246A (en) * | 2021-04-19 | 2021-07-16 | 厦门喵宝科技有限公司 | Document picture processing method and device and electronic equipment |
CN114283156A (en) * | 2021-12-02 | 2022-04-05 | 珠海移科智能科技有限公司 | Method and device for removing document image color and handwriting |
CN114283156B (en) * | 2021-12-02 | 2024-03-05 | 珠海移科智能科技有限公司 | Method and device for removing document image color and handwriting |
CN117436058A (en) * | 2023-10-10 | 2024-01-23 | 国网湖北省电力有限公司 | Electric power information safety protection system |
Also Published As
Publication number | Publication date |
---|---|
CN107133929B (en) | 2019-06-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107133929B (en) | Low-quality document image binarization method based on background estimation and energy minimization | |
CN112819772B (en) | High-precision rapid pattern detection and recognition method | |
CN109872285B (en) | Retinex low-illumination color image enhancement method based on variational constraint | |
Khan et al. | An efficient contour based fine-grained algorithm for multi category object detection | |
Roa'a et al. | Generation of high dynamic range for enhancing the panorama environment | |
CN111415316A (en) | Defect data synthesis algorithm based on generation of countermeasure network | |
CN110782477A (en) | Moving target rapid detection method based on sequence image and computer vision system | |
CN111275643A (en) | True noise blind denoising network model and method based on channel and space attention | |
CN108765325A (en) | Small unmanned aerial vehicle blurred image restoration method | |
CN114331886B (en) | Image deblurring method based on depth features | |
CN107610144A (en) | A kind of improved IR image segmentation method based on maximum variance between clusters | |
CN109961416B (en) | Business license information extraction method based on morphological gradient multi-scale fusion | |
CN114219732A (en) | Image defogging method and system based on sky region segmentation and transmissivity refinement | |
Shah et al. | An iterative approach for shadow removal in document images | |
Liu et al. | Texture filtering based physically plausible image dehazing | |
CN115272306B (en) | Solar cell panel grid line enhancement method utilizing gradient operation | |
Fazlali et al. | Single image rain/snow removal using distortion type information | |
Wang et al. | Straight lane line detection based on the Otsu-Canny algorithm | |
Zhang et al. | Dehazing with improved heterogeneous atmosphere light estimation and a nonlinear color attenuation prior model | |
Gasparyan et al. | Iterative Retinex-Based Decomposition Framework for Low Light Visibility Restoration | |
CN114358137A (en) | Automatic image correction method for file scanning piece based on deep learning | |
Chengtao et al. | Improved dark channel prior dehazing approach using adaptive factor | |
Soumya et al. | Enhancement and segmentation of historical records | |
Subramani et al. | A novel binarization method for degraded tamil palm leaf images | |
CN117315702B (en) | Text detection method, system and medium based on set prediction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20190611 |