CN115035539B - Document anomaly detection network model construction method and device, electronic equipment and medium - Google Patents

Document anomaly detection network model construction method and device, electronic equipment and medium

Info

Publication number
CN115035539B
Authority
CN
China
Prior art keywords
document
image
network model
abnormal
detection network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210964812.2A
Other languages
Chinese (zh)
Other versions
CN115035539A (en)
Inventor
冯德亮
孙铁
陈奕均
毛奔
冯伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Bank Co Ltd
Priority to CN202210964812.2A
Publication of CN115035539A
Application granted
Publication of CN115035539B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19107 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147 Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Processing Or Creating Images (AREA)
  • Facsimiles In General (AREA)

Abstract

The embodiments of the present application provide a method and a device for constructing a document anomaly detection network model, an electronic device, and a medium, and belong to the technical field of artificial intelligence. The method comprises the following steps: randomly selecting a text area based on a normal document image, and generating a document abnormal image sample set according to the text area, wherein the document abnormal image sample set comprises a plurality of document abnormal image samples; performing document abnormal marking on each document abnormal image sample to obtain marked image samples, and generating a marking information file corresponding to each marked image sample; extracting a first sample number of marking information files from the plurality of marking information files, and generating a training image index list from the extracted files; and training an initial document anomaly detection network model according to the size of the real bounding boxes, the training image index list, and the corresponding document abnormal training images to obtain the document anomaly detection network model. Document anomaly detection can therefore be performed on a document image through the model, which improves the degree of automation and the accuracy of document anomaly detection.

Description

Document anomaly detection network model construction method and device, electronic equipment and medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for constructing a document anomaly detection network model, an electronic device, and a medium.
Background
At present, document images include both images photographed by users and images obtained by scanning. Document images uploaded by clients often contain overlapping characters and occluded characters, and the industry does not yet offer a scheme that separates the characters and directly restores the occluded information correctly. It is therefore desirable to provide a solution for analyzing abnormal text conditions in a document image, such as text overlap and text occlusion.
Disclosure of Invention
In order to solve the technical problem, embodiments of the present application provide a method and an apparatus for constructing a document anomaly detection network model, an electronic device, and a medium.
In a first aspect, an embodiment of the present application provides a method for constructing a document anomaly detection network model, where the method includes:
randomly selecting a character area based on a normal document image, and generating a document abnormal image sample set according to the character area, wherein the document abnormal image sample set comprises a plurality of document abnormal image samples;
carrying out abnormal marking on each abnormal document image to obtain a plurality of marked image samples, and generating marking information files corresponding to the marked image samples;
determining a first sample number of a document abnormal image training set, extracting marking information files of the first sample number from a plurality of marking information files, and generating a training image index list according to the marking information files of the first sample number;
and constructing an initial document anomaly detection network model based on a YOLO framework, and training the initial document anomaly detection network model according to the size of a real boundary box, the training image index list and a document anomaly training image corresponding to the training image index list to obtain the document anomaly detection network model.
In a second aspect, an embodiment of the present application provides a document abnormality detection method for a document image, where the method includes:
inputting a document image to be detected into a document anomaly detection network model, wherein the document anomaly detection network model is obtained according to the document anomaly detection network model construction method provided by the first aspect;
detecting the document image to be detected through the document abnormality detection network model to obtain a document abnormality output result, wherein the document abnormality output result comprises document abnormality coordinate information, object confidence, category probability and a category to which the document abnormality output result belongs;
and generating a document abnormal detection result according to the document abnormal coordinate information, the object confidence coefficient, the class probability and the belonged class.
In a third aspect, an embodiment of the present application provides a document anomaly detection network model building apparatus, where the apparatus includes:
a selecting module, configured to randomly select a text area based on a normal document image and generate a document abnormal image sample set according to the text area, wherein the document abnormal image sample set comprises a plurality of document abnormal image samples;
the marking module is used for carrying out document abnormal marking on each document abnormal image to obtain a plurality of marked image samples and generating marking information files corresponding to the marked image samples;
the determining module is used for determining the first sample number of a document abnormal image training set, extracting marking information files of the first sample number from the marking information files, and generating a training image index list according to the marking information files of the first sample number;
and the training module is used for constructing an initial document abnormity detection network model based on a YOLO framework, and training the initial document abnormity detection network model according to the size of a real boundary box, the training image index list and a document abnormity training image corresponding to the training image index list to obtain a document abnormity detection network model.
In a fourth aspect, an embodiment of the present application provides an apparatus for detecting document anomalies of a document image, where the apparatus includes:
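an input module, configured to input a document image to be detected into a document anomaly detection network model, wherein the document anomaly detection network model is obtained according to the document anomaly detection network model construction method provided by the first aspect;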
the detection module is used for detecting the document image to be detected through the document abnormity detection network model to obtain a document abnormity output result, and the document abnormity output result comprises document abnormity coordinate information, object confidence coefficient, class probability and the class of the document abnormity output result;
and the generation module is used for generating a document abnormity detection result according to the document abnormity coordinate information, the object confidence coefficient, the class probability and the belonged class.
In a fifth aspect, an embodiment of the present application provides an electronic device, which includes a memory and a processor, where the memory is used to store a computer program, and when the processor runs the computer program, the computer program executes the method for constructing a document anomaly detection network model according to the first aspect, or executes the method for detecting document anomalies of a document image according to the second aspect.
In a sixth aspect, an embodiment of the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program runs on a processor, the computer program executes the method for constructing a document abnormality detection network model provided in the first aspect, or executes the method for detecting document abnormality of a document image provided in the second aspect.
According to the document anomaly detection network model construction method and device, the electronic device, and the medium provided by the present application, document anomaly detection can be performed on a document image through the document anomaly detection network model. This avoids the influence on detection of factors such as the difficulty of locating very small characters and interference from partially missing characters, and improves the degree of automation and the accuracy of document anomaly detection.
Drawings
In order to more clearly explain the technical solutions of the present application, the drawings needed to be used in the embodiments are briefly introduced below, and it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope of protection of the present application. Like components are numbered similarly in the various figures.
FIG. 1 is a flow chart illustrating a document anomaly detection network model building method according to an embodiment of the present application;
FIG. 2 is another schematic flow chart of a document anomaly detection network model construction method provided in the embodiment of the present application;
FIG. 3 is a flow chart illustrating a document abnormality detection method for a document image according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a document anomaly detection network model building device according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a document abnormality detection apparatus for a document image according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Reference numerals: 400-document anomaly detection network model construction device, 401-selection module, 402-marking module, 403-determination module, 404-training module, 500-document anomaly detection device for a document image, 501-input module, 502-detection module, 503-generation module, 600-electronic device, 601-transceiver, 602-processor, and 603-memory.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present application, and not all embodiments.
The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
Hereinafter, the terms "including", "having", and their derivatives, which may be used in various embodiments of the present application, are intended only to indicate specific features, numbers, steps, operations, elements, components, or combinations of the foregoing, and should not be construed as excluding in advance the existence of, or the possibility of adding, one or more other features, numbers, steps, operations, elements, components, or combinations of the foregoing.
Furthermore, the terms "first," "second," "third," and the like are used solely to distinguish one from another, and are not to be construed as indicating or implying relative importance.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the various embodiments of this application belong. The terms (such as those defined in commonly used dictionaries) should be interpreted as having a meaning that is consistent with their contextual meaning in the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein in various embodiments.
Example 1
The embodiment of the disclosure provides a method for constructing a document anomaly detection network model.
Referring to fig. 1, the method for constructing the document anomaly detection network model includes:
Step S101, randomly selecting a text area based on a normal document image, and generating a document abnormal image sample set according to the text area.
In this embodiment, one or more position areas are automatically selected at random in a normal document image; the position of the text can be confirmed through Optical Character Recognition (OCR), and a text content set composed of the text contents within the area is obtained. The document abnormal image sample set comprises a plurality of document abnormal image samples. It should be noted that the normal document image may be a scan file obtained by scanning a normal document, or a photo of a normal document taken by a user, which is not limited herein.
In one embodiment, the generating a document abnormal image sample set according to the text area in step S101 includes:
determining the character position of the character area through an OCR (optical character recognition), and acquiring a text content set corresponding to the character position;
obtaining, through OpenCV image processing, the background color and the font color of the text content set in the normal document image, calculating the font size from the width and height of the text position and the number of text lines, and constructing an edge frame according to the background color, the font color, and the font size;
and generating a document abnormal image sample according to the edge frame and the original text frame of the normal document image.
In this embodiment, OpenCV provides image processing algorithms such as binarization, erosion, filtering, and blurring, and these algorithms can be used to obtain the background color and font color of the text content set in the normal document image. For example, if the collected text content is "today's weather is sunny", OpenCV image processing may determine that, in the original normal document image, the background of "today's weather is sunny" is black and the font color is yellow. The font size indicates the size of the text, for example a size-4 font. The edge frame is slid over the original text frame of the normal document image, and different document abnormal image samples are generated by controlling the intersection over union (IoU) of the edge frame and the original text frame.
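The font-size step can be illustrated with a small sketch. The text only states that the font size is calculated from the width and height of the text position and the number of text lines; the exact formula is not given, so the rule below (the smaller of the per-line height and the per-character width, which suits roughly square glyphs) and the function name `estimate_font_size` are assumptions made purely for illustration.

```python
def estimate_font_size(box, lines):
    """Estimate a glyph size in pixels for text occupying `box`.

    box   -- (x_min, y_min, x_max, y_max) of the text position, e.g. from OCR
    lines -- list of text lines found inside the box

    Assumption: glyphs are roughly square, so the size is bounded both by
    the height available per line and by the width available per character.
    """
    if not lines or not any(lines):
        raise ValueError("need at least one non-empty line")
    x_min, y_min, x_max, y_max = box
    per_line_height = (y_max - y_min) / len(lines)
    per_char_width = (x_max - x_min) / max(len(line) for line in lines)
    return min(per_line_height, per_char_width)
```

For a 200 × 60 pixel box holding three lines whose longest line has ten characters, this sketch returns 20 pixels.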
In an embodiment, the generating the document abnormal image sample according to the edge frame and the original text frame of the normal document image includes:
constructing the text overlap sample by calculating the intersection over union (IoU) of the edge frame and the original text frame; and/or,
and shielding the normal document image through a preset text box according to the text content set to obtain the text shielding sample.
In this embodiment, the text overlap sample is generated by moving the edge frame and using its intersection over union (IoU) with the original text frame to control the area in which the text overlaps. Using the text content set, a mask occlusion is applied at random within a certain range to the left and right through the preset text box to generate the text occlusion sample. Each document abnormal image sample includes one or more abnormal regions, and the abnormal regions may be text overlapping regions or text occlusion regions.
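As a rough illustration of controlling overlap through the intersection over union, the sketch below computes the IoU of two boxes and slides a frame to the right until the IoU drops to a chosen ceiling. The helper `slide_until_iou`, the one-pixel step, the purely horizontal motion, and the assumption that the edge frame has the same size as the original text frame are all hypothetical choices, not details taken from the text.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def slide_until_iou(text_box, max_iou, step=1):
    """Slide a same-sized edge frame rightwards until IoU <= max_iou.

    Hypothetical helper: returns the first frame position whose IoU with
    the original text frame no longer exceeds `max_iou`.
    """
    if max_iou < 0 or step <= 0:
        raise ValueError("max_iou must be >= 0 and step must be > 0")
    x1, y1, x2, y2 = text_box
    dx = 0
    while iou((x1 + dx, y1, x2 + dx, y2), text_box) > max_iou:
        dx += step
    return (x1 + dx, y1, x2 + dx, y2)
```

A lower `max_iou` pushes the frame further away and yields a lighter overlap; a higher value keeps the frames nearly coincident.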
Step S102, performing document abnormal marking on each document abnormal image to obtain a plurality of marked image samples, and generating a marking information file corresponding to each marked image sample.
In one embodiment, the marking information file includes a custom image object name, an image file path, an image size, and document abnormal coordinate information.
Exemplarily, marking is performed with the open-source labelImg tool: the occlusion position area is framed with a rectangle, the object name of the occlusion area is set to hid, and a marking information file is then generated according to the occlusion position area and the occlusion area object name. The marking information file may be a marking xml file. For example, the marking xml file contains the custom image object name photo1, the image file name, the image file path, the width, height, and depth corresponding to the image size, the minimum x coordinate, minimum y coordinate, maximum x coordinate, and maximum y coordinate of the occlusion region position, and the occlusion region object name hid. The marking xml file is placed in a separate folder.
It should be noted that the marking process and the saving process of the overlapped text have similar steps to those of the marking process and the saving process of the shielded text, marking contents marked on the same image are all saved in the same marking xml file, and the shielded and overlapped object name tags corresponding to the marking process can be represented as 0 and 1, or be named as hid, etc., and are not limited herein.
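To make the contents of a marking xml file concrete, the following sketch writes and reads a reduced file with Python's standard `xml.etree.ElementTree`. The element names (`annotation`, `filename`, `path`, `size`, `object`, `bndbox`, `xmin` and so on) follow the Pascal VOC layout that labelImg commonly writes; treating that layout as the one used here is an assumption, and `build_marking_xml` and `read_regions` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def build_marking_xml(filename, path, size, regions):
    """Serialise one image's marked regions to an XML string.

    size    -- (width, height, depth) of the image
    regions -- list of (object_name, (xmin, ymin, xmax, ymax))
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    ET.SubElement(root, "path").text = path
    size_el = ET.SubElement(root, "size")
    for tag, value in zip(("width", "height", "depth"), size):
        ET.SubElement(size_el, tag).text = str(value)
    # All regions marked on the same image go into the same file.
    for name, coords in regions:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        box = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), coords):
            ET.SubElement(box, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


def read_regions(xml_text):
    """Return [(object_name, (xmin, ymin, xmax, ymax)), ...] from the XML."""
    root = ET.fromstring(xml_text)
    regions = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.find(tag).text)
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        regions.append((obj.find("name").text, coords))
    return regions
```

The same reader can later supply the coordinate information needed when the marking files are loaded for training.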
Step S103, determining a first sample number of a document abnormal image training set, extracting marking information files of the first sample number from the marking information files, and generating a training image index list according to the marking information files of the first sample number.
Exemplarily, the ratio of training samples to test samples is set to 9:1, and the first sample number and the second sample number are determined according to this ratio. Marking information files of the first sample number are extracted from the plurality of marking information files, the marked image samples corresponding to these files form the training image index list, the corresponding document abnormal images are looked up according to the training image index list, and the document abnormal image training set is generated from the images found.
The marking information files are extracted at random. The image file name is determined from each extracted marking information file, and the name with its suffix (such as jpg) removed is used as the training image index. The training image indexes are stored in a txt file, one per line, to obtain the training image index list. The marking label content and the original image corresponding to each marking information file can both be obtained from the training image index list.
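The split and the index list can be sketched as follows. The function name `split_index`, the fixed random seed, and the assumption that each marking file shares its base name with its image (so that stripping the suffix of either gives the same index) are illustrative choices only.

```python
import random
from pathlib import Path


def split_index(marking_files, train_ratio=0.9, seed=0):
    """Randomly split marking files and return (train_index, test_index).

    Each index entry is the file name with its suffix removed, assuming
    the marking file and its image share the same base name.
    """
    rng = random.Random(seed)          # fixed seed keeps the split repeatable
    shuffled = list(marking_files)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_ratio)
    train = [Path(name).stem for name in shuffled[:n_train]]
    test = [Path(name).stem for name in shuffled[n_train:]]
    return train, test


def to_index_text(index):
    """Render an index list as txt content, one index per line."""
    return "\n".join(index) + "\n"
```

With ten marking files and the 9:1 ratio, nine indexes land in the training list and one in the test list.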
Step S104, constructing an initial document anomaly detection network model based on a YOLO framework, and training the initial document anomaly detection network model according to the size of a real bounding box, the training image index list, and the document abnormal training images corresponding to the training image index list to obtain the document anomaly detection network model.
Exemplarily, on the basis of the YOLO framework, a cross-stage partial connection (CSP) is added to each large residual block of Darknet53, corresponding to layer0 to layer104, to form the backbone network (backbone). Spatial pyramid pooling is added to enlarge the receptive field of the network: 5 × 5, 9 × 9, and 13 × 13 max pooling is applied to layer107 to obtain layer108, layer110, and layer112 respectively; after pooling, the layers are concatenated (concatenate) to form layer114, which is reduced to 512 channels through a 1 × 1 convolution. On the basis of FPN, a downsampling (downsample) operation is added after upsampling (upsample) to realize feature fusion, thereby obtaining the initial document anomaly detection network model.
Training is performed on four M60 graphics cards. According to the size of the video memory, the images input into the initial document anomaly detection network model are resized to 416 × 416. For training the model parameters, Mosaic data augmentation, label smoothing, CIOU, cosine annealing decay of the learning rate, and the Mish activation function are adopted. In addition, because the features extracted by the backbone network are generic, freezing the backbone during training speeds up training and also prevents the weights from being damaged in the initial stage of training. Exemplarily, 200 epochs are trained (an epoch is one pass of the training process). For the first 100 epochs, the initial learning rate is set to 1e-3 and batch_size (the size of each batch of data) is 4, in an attempt to increase the training speed and reduce video memory usage; for the last 100 epochs, the initial learning rate is set to 1e-4 and batch_size is 2.
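The two-phase schedule with cosine annealing can be written out as a short sketch. The standard cosine annealing formula is used; the minimum learning rate of 0 and the choice to restart the cosine curve at the start of each 100-epoch phase are assumptions, since the text gives only the initial learning rates and batch sizes.

```python
import math


def cosine_lr(epoch, total_epochs, lr_init, lr_min=0.0):
    """Cosine-annealed learning rate for `epoch` in [0, total_epochs)."""
    cosine = math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr_init - lr_min) * (1 + cosine)


# Phase values taken from the text; restarting per phase is an assumption.
PHASES = [
    {"epochs": 100, "lr_init": 1e-3, "batch_size": 4},
    {"epochs": 100, "lr_init": 1e-4, "batch_size": 2},
]


def schedule(phases=PHASES):
    """Return one (learning_rate, batch_size) pair per training epoch."""
    rows = []
    for phase in phases:
        for epoch in range(phase["epochs"]):
            lr = cosine_lr(epoch, phase["epochs"], phase["lr_init"])
            rows.append((lr, phase["batch_size"]))
    return rows
```

In a real training loop the learning rate for each epoch would be handed to the optimizer; that wiring depends on the deep learning framework and is omitted here.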
Referring to fig. 2, the training the initial document anomaly detection network model according to the size of the real bounding box, the training image index list, and the document anomaly training image corresponding to the training image index list in step S104 includes:
step S1041, loading marking information files corresponding to the training image index list through the initial document anomaly detection network model, acquiring document anomaly coordinate information of the loaded marking information files, and taking the size information of the real bounding box as input data of K-means clustering;
step S1042, training the initial document abnormality detection network model according to the document abnormality coordinate information, the size information of the real bounding box and the document abnormality training image.
Exemplarily, the marking xml files in the marking folder are loaded to obtain the minimum x coordinate, minimum y coordinate, maximum x coordinate, and maximum y coordinate of each marked occlusion and overlap region position. The width and height of each real bounding box (ground-truth bounding box) derived from these coordinates are then used as the input data of K-means clustering. Because the real bounding boxes differ in size across scenes of different sizes, it is necessary to normalize the width and height of each bounding box by the width and height of the image.
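A minimal sketch of this clustering step is given below in plain Python. The text states only that normalized widths and heights feed K-means; using 1 − IoU as the distance (a common choice for YOLO anchor selection), the mean as the cluster centre, and the names `normalise` and `kmeans_anchors` are assumptions for illustration.

```python
import random


def normalise(box, image_w, image_h):
    """Width and height of a (xmin, ymin, xmax, ymax) box, scaled to [0, 1]."""
    xmin, ymin, xmax, ymax = box
    return ((xmax - xmin) / image_w, (ymax - ymin) / image_h)


def wh_iou(a, b):
    """IoU of two (w, h) sizes placed at a common corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(sizes, k, iters=100, seed=0):
    """Cluster (w, h) sizes into k anchors, sorted by area."""
    rng = random.Random(seed)
    centers = rng.sample(sizes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for size in sizes:
            # Highest IoU is the same as smallest (1 - IoU) distance.
            best = max(range(k), key=lambda i: wh_iou(size, centers[i]))
            clusters[best].append(size)
        new_centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centers[i]          # keep the old centre if empty
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])
```

The resulting anchors are in normalized units and would be multiplied by the network input size (416 in the example above) before use.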
It should be added that, the method for constructing a document anomaly detection network model provided in this embodiment further includes:
storing each document abnormal image into an image folder;
and storing each marking information file into a marking folder, wherein each marking information file under the marking folder corresponds to each abnormal document image under the image folder one by one.
In this embodiment, the text overlap samples and the text occlusion samples are saved in the same image folder. The automatically generated text overlap samples and text occlusion samples each number in the tens of thousands, and may be more numerous, which is not limited herein. Exemplarily, the marking xml files are placed in the marking folder.
It is further added that the method for constructing a document anomaly detection network model provided in this embodiment further includes:
determining a second sample number of the document abnormal image test set;
extracting marking information files with the second sample number from the plurality of marking information files, and generating a test image index list according to the marking information files with the second sample number;
and determining, through the document anomaly detection network model, the false detection rate and the missed detection rate of the document abnormal results according to the document abnormal coordinate information corresponding to the test image index list and the document abnormal test images.
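The text does not define the two rates, so the sketch below adopts one plausible reading: a prediction that matches no marked region counts as a false detection, a marked region matched by no prediction counts as a missed detection, and a match requires an IoU of at least 0.5. The definitions, the threshold, and the function name `detection_error_rates` are all assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_error_rates(predicted, truth, iou_threshold=0.5):
    """Return (false_detection_rate, missed_detection_rate) for one test set.

    Each marked box can be matched by at most one predicted box.
    """
    unmatched = list(truth)
    false_detections = 0
    for box in predicted:
        match = next((t for t in unmatched if iou(box, t) >= iou_threshold), None)
        if match is None:
            false_detections += 1
        else:
            unmatched.remove(match)
    false_rate = false_detections / len(predicted) if predicted else 0.0
    miss_rate = len(unmatched) / len(truth) if truth else 0.0
    return false_rate, miss_rate
```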
The method for constructing the document abnormality detection network model provided by this embodiment is to construct an initial document abnormality detection network model based on a YOLO framework, train the initial document abnormality detection network model according to the size of a real bounding box, the training image index list, and a document abnormality training image corresponding to the training image index list to obtain the document abnormality detection network model, and perform document abnormality detection on a document image through the document abnormality detection network model to improve the automation degree and accuracy of document abnormality detection.
Example 2
The embodiment of the disclosure provides a document abnormality detection method for a document image.
Referring to fig. 3, the document abnormality detection method of the document image includes:
step S301, inputting a document image to be detected to the document abnormity detection network model.
In this embodiment, the document anomaly detection network model is obtained according to the document anomaly detection network model construction method provided in Embodiment 1. For the detailed process, see Embodiment 1; it is not repeated here.
Step S302, detecting the document image to be detected through the document abnormity detection network model to obtain a document abnormity output result, wherein the document abnormity output result comprises document abnormity coordinate information, object confidence, class probability and the class.
Exemplarily, the object confidence (confidence) combines the probability that a bounding box contains an object with the accuracy of the box position (that is, whether the occlusion area is tightly enclosed). It is expressed as Pr(hid) × IOU, where IOU is the intersection over union between the predicted box and the real box. During training, Pr(hid) × IOU of the label is 1; during prediction, the confidence is the predicted value. When predicting whether an occlusion is present (occlusion region object name hid), a conditional probability value for the document abnormal category is also provided, namely the class probability given the confidence, so the final score (scores) is the confidence multiplied by the class probability. A preliminary screening keeps the predictions whose class probability is greater than the preset parameter 0.5. The remaining predictions in each category are then sorted by confidence multiplied by class probability, the prediction with the maximum score (scores) is taken, and non-maximum suppression removes predictions whose degree of overlap with it is greater than the preset parameter 0.4, yielding the final optimal prediction results in each category.
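The screening and suppression described above can be sketched in plain Python. The thresholds 0.5 and 0.4 come from the text, and each detection is laid out as a tuple of four box coordinates, object confidence, class probability, and class; the function name `filter_and_nms` and the greedy per-class loop are an illustrative reading rather than the exact procedure used.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(detections, class_threshold=0.5, nms_threshold=0.4):
    """Keep the best non-overlapping detections in each class.

    detections -- list of (x1, y1, x2, y2, obj_conf, class_conf, class_pred)
    """
    kept = []
    for cls in sorted({d[6] for d in detections}):
        # Preliminary screening on the class probability.
        candidates = [d for d in detections
                      if d[6] == cls and d[5] > class_threshold]
        # Sort by score = object confidence x class probability.
        candidates.sort(key=lambda d: d[4] * d[5], reverse=True)
        while candidates:
            best = candidates.pop(0)
            kept.append(best)
            # Drop boxes that overlap the kept box too strongly.
            candidates = [d for d in candidates
                          if iou(best[:4], d[:4]) <= nms_threshold]
    return kept
```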
Step S303, generating a document anomaly detection result according to the document anomaly coordinate information, the object confidence, the category probability, and the category to which the anomaly belongs.
In one embodiment, step S303 includes:
determining an anomaly regression position according to the document anomaly coordinate information;
determining a document anomaly score according to the product of the object confidence and the category probability;
and determining the number of document anomalies in each category according to the category to which each anomaly belongs.
Exemplarily, the document anomaly output result is (x1, y1, x2, y2, obj_conf, class_conf, class_pred), whose elements respectively represent the minimum x coordinate, the minimum y coordinate, the maximum x coordinate, the maximum y coordinate, the object confidence, the category probability, and the category to which the anomaly belongs. For example, the number of document occlusion anomalies is determined by finding the results whose category is the occluded-region object hid. If the number of such results is 0, there is no document occlusion anomaly; if it is N, the number of document occlusion anomalies is N. The position of each document occlusion is obtained directly from the corresponding x1, y1, x2, y2 of the output result, and the final score (scores) is obj_conf × class_conf of the output result.
The above example yields the document occlusion regression position, the number of document occlusion anomalies, and the prediction score. The document overlap regression position, the number of document overlap anomalies, and the corresponding prediction score are obtained in a similar way, which is not repeated here.
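As an illustration of how the output tuples might be turned into positions, scores, and per-category counts, consider the following Python sketch. The names `CLASS_NAMES`, `summarize_detections`, and `count_anomalies` are hypothetical, and the id-to-name mapping is an assumption: only the object name hid appears in the text.

```python
from collections import defaultdict

# Hypothetical id-to-name mapping; "hid" is from the text, "overlap" is assumed.
CLASS_NAMES = {0: "hid", 1: "overlap"}


def summarize_detections(detections):
    """Group (x1, y1, x2, y2, obj_conf, class_conf, class_pred) tuples by category."""
    summary = defaultdict(list)
    for x1, y1, x2, y2, obj_conf, class_conf, class_pred in detections:
        name = CLASS_NAMES.get(int(class_pred), str(class_pred))
        summary[name].append(
            {
                "box": (x1, y1, x2, y2),         # anomaly regression position
                "score": obj_conf * class_conf,  # final score
            }
        )
    return dict(summary)


def count_anomalies(summary, name):
    """Number of anomalies of one category; 0 means none were found."""
    return len(summary.get(name, []))
```

For instance, `count_anomalies(summarize_detections(dets), "hid")` returns the number of occlusion anomalies N described above.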
In this way, abnormal conditions such as document overlap and document occlusion are detected, with the false detection rate and the missed detection rate both close to or below 0.1. This provides diagnostic capability for document images uploaded by users and quality assurance at the document image upload level.
According to the document anomaly detection method for a document image provided in this embodiment, document anomaly detection is performed on the document image through the document anomaly detection network model, which avoids the influence of factors such as difficult localization of very small characters and interference from partially missing characters, and improves the degree of automation and the accuracy of document anomaly detection.
Example 3
In addition, the embodiment of the disclosure provides a document anomaly detection network model construction device.
As shown in fig. 4, the document abnormality detection network model building apparatus 400 includes:
a selecting module 401, configured to randomly select a text region based on a normal document image, and generate a document abnormal image sample set according to the text region, where the document abnormal image sample set includes a plurality of document abnormal image samples;
a marking module 402, configured to perform document abnormal marking on each document abnormal image to obtain multiple marked image samples, and generate marking information files corresponding to each marked image sample;
a determining module 403, configured to determine a first number of samples of a document abnormal image training set, extract the marking information files of the first number of samples from the plurality of marking information files, and generate a training image index list according to the marking information files of the first number of samples;
the training module 404 is configured to construct an initial document anomaly detection network model based on a YOLO framework, and train the initial document anomaly detection network model according to the size of a real bounding box, the training image index list, and a document anomaly training image corresponding to the training image index list, so as to obtain a document anomaly detection network model.
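The role of the determining module 403 can be illustrated with a short Python sketch that draws a given number of marking information files and returns them as index lists. The function name `build_index_lists`, the random shuffling, the fixed seed, and the choice to keep the training and test lists disjoint are all assumptions made for illustration; the text only states that a first (and, in a later embodiment, a second) number of marking information files is extracted.

```python
import random


def build_index_lists(marking_files, n_train, n_test=0, seed=0):
    """Return (train_list, test_list) drawn without overlap from marking_files."""
    if n_train + n_test > len(marking_files):
        raise ValueError("not enough marking information files for the requested split")
    shuffled = list(marking_files)
    # A seeded shuffle keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    train_list = shuffled[:n_train]
    test_list = shuffled[n_train:n_train + n_test]
    return train_list, test_list
```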
In an embodiment, the training module 404 is further configured to load a marking information file corresponding to the training image index list through the initial document anomaly detection network model, obtain document anomaly coordinate information of the loaded marking information file, and use size information of the real bounding box as input data of K-means clustering;
and training the initial document abnormality detection network model according to the document abnormality coordinate information, the size information of the real boundary box and the document abnormality training image.
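A minimal Python sketch of K-means clustering over real bounding box sizes is given below. It assumes the distance measure 1 − IoU between (width, height) pairs, which is common practice for computing YOLO anchor boxes but is not stated in this embodiment; the names `wh_iou` and `kmeans_anchors`, the seeded initialization, and the iteration limit are likewise hypothetical.

```python
import random


def wh_iou(size, centroid):
    """IoU of two boxes that share the same top-left corner, given as (w, h)."""
    inter = min(size[0], centroid[0]) * min(size[1], centroid[1])
    union = size[0] * size[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(sizes, k, iters=100, seed=0):
    """Cluster (w, h) pairs into k anchor sizes, sorted by area."""
    rng = random.Random(seed)
    centroids = rng.sample(list(sizes), k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for wh in sizes:
            # Highest IoU is equivalent to the smallest distance 1 - IoU.
            nearest = max(range(k), key=lambda i: wh_iou(wh, centroids[i]))
            clusters[nearest].append(wh)
        updated = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                updated.append((mean_w, mean_h))
            else:
                updated.append(centroids[i])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids, key=lambda c: c[0] * c[1])
```

The resulting sizes could serve as prior boxes when training the YOLO-based model on the document anomaly coordinate information.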
In an embodiment, the selecting module 401 is further configured to determine a text position of the text region through OCR, and obtain a text content set corresponding to the text position;
acquiring the background and font color of the text content set in the normal document image through an OPENCV image processing algorithm, calculating the font size through the width and height of the character position and the line number of the text, and constructing an edge frame according to the background, the font color and the font size;
and generating a document abnormal image sample according to the edge frame and the original text frame of the normal document image.
In an embodiment, the document abnormal image sample includes a text overlapping sample and/or a text occlusion sample, and the selecting module 401 is further configured to construct the text overlapping sample by calculating the intersection over union (IoU) of the edge frame and the original text frame; and/or
occlude the normal document image with a preset text box according to the text content set to obtain the text occlusion sample.
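One way to use the intersection over union when constructing a text overlapping sample is sketched below in Python: candidate edge frames are produced by shifting the original text frame at random, and a candidate is accepted when its IoU with the original frame falls in a target range. The names `box_iou` and `make_overlap_frame`, the shifting strategy, the IoU range (0.1, 0.6), and the retry limit are assumptions for illustration and are not specified in this embodiment.

```python
import random


def box_iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def make_overlap_frame(text_frame, iou_range=(0.1, 0.6), max_tries=100, seed=0):
    """Return an edge frame overlapping text_frame, or None if none was found."""
    rng = random.Random(seed)
    x1, y1, x2, y2 = text_frame
    w, h = x2 - x1, y2 - y1
    for _ in range(max_tries):
        # Shift the original frame by at most one width / height.
        dx, dy = rng.uniform(-w, w), rng.uniform(-h, h)
        candidate = (x1 + dx, y1 + dy, x2 + dx, y2 + dy)
        if iou_range[0] <= box_iou(candidate, text_frame) <= iou_range[1]:
            return candidate
    return None
```

The text drawn inside the accepted edge frame would then overlap the original text, and the frame coordinates can be recorded as the anomaly position in the marking information file.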
In one embodiment, the document anomaly detection network model building apparatus 400 further includes:
the storage module is used for storing the abnormal images of the documents into an image folder;
and storing each marking information file into a marking folder, wherein each marking information file under the marking folder corresponds to each abnormal document image under the image folder one by one.
In one embodiment, the marking information file includes a custom image object name, an image file path, an image size, and document abnormal coordinate information.
In one embodiment, the document anomaly detection network model building apparatus 400 further includes:
a determining module 403, configured to determine a second sample number of the document abnormal image test set;
extracting marking information files with the second sample number from the plurality of marking information files, and generating a test image index list according to the marking information files with the second sample number;
and determining, through the document anomaly detection network model, the false detection rate and the missed detection rate of the document anomaly results according to the document anomaly coordinate information corresponding to the test image index list and the document anomaly test images.
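The embodiment does not define how the two rates are computed. The following Python sketch shows one plausible box-level definition, stated here purely as an assumption: a prediction matches a marked anomaly of the same category when their IoU is at least 0.5, the false detection rate is the share of predictions without a match, and the missed detection rate is the share of marked anomalies without a match. The names `_iou` and `error_rates` and the input layout (dictionaries mapping an image id to a list of (x1, y1, x2, y2, category) tuples) are hypothetical.

```python
def _iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def error_rates(pred_by_image, truth_by_image, iou_thres=0.5):
    """Return (false_detection_rate, missed_detection_rate) over the test images."""
    n_pred = n_truth = n_matched = 0
    for image_id, truths in truth_by_image.items():
        preds = pred_by_image.get(image_id, [])
        n_pred += len(preds)
        n_truth += len(truths)
        unmatched = list(truths)
        for p in preds:
            # Greedy matching against the best remaining box of the same category.
            same_class = [t for t in unmatched if t[4] == p[4]]
            best = max(same_class, key=lambda t: _iou(p, t), default=None)
            if best is not None and _iou(p, best) >= iou_thres:
                unmatched.remove(best)
                n_matched += 1
    false_rate = (n_pred - n_matched) / n_pred if n_pred else 0.0
    missed_rate = (n_truth - n_matched) / n_truth if n_truth else 0.0
    return false_rate, missed_rate
```

Only images listed in `truth_by_image` (that is, those in the test image index list) are counted in this sketch.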
The document abnormality detection network model construction device 400 provided in this embodiment can implement the document abnormality detection network model construction method provided in embodiment 1, and is not described herein again to avoid repetition.
The document anomaly detection network model construction device provided by the embodiment constructs an initial document anomaly detection network model based on a YOLO framework, trains the initial document anomaly detection network model according to the size of a real boundary box, the training image index list and a document anomaly training image corresponding to the training image index list to obtain a document anomaly detection network model, and can perform document anomaly detection on a document image through the document anomaly detection network model to improve the automation degree and accuracy of document anomaly detection.
Example 4
In addition, the embodiment of the disclosure provides a document abnormality detection device for a document image.
As shown in fig. 5, the document abnormality detection apparatus 500 of the document image includes:
an input module 501, configured to input a document image to be detected to a document anomaly detection network model, where the document anomaly detection network model is obtained according to the document anomaly detection network model construction method provided in embodiment 1;
the detection module 502 is configured to detect the document image to be detected through the document anomaly detection network model to obtain a document anomaly output result, where the document anomaly output result includes document anomaly coordinate information, object confidence, category probability, and a category to which the document anomaly output result belongs;
and a generating module 503, configured to generate a document anomaly detection result according to the document anomaly coordinate information, the object confidence, the category probability, and the category to which the document anomaly detection result belongs.
In one embodiment, the generating module 503 is further configured to determine an abnormal regression position according to the document abnormal coordinate information;
determining a document abnormal score according to the product of the object confidence and the category probability;
and determining the number of abnormal documents in each category according to the belonged category.
Example 5
Furthermore, an embodiment of the present disclosure provides an electronic device, including a memory and a processor, where the memory stores a computer program, and the computer program executes, when running on the processor, the method for constructing a document abnormality detection network model provided in embodiment 1 or the method for detecting document abnormality of a document image provided in embodiment 2.
Specifically, referring to fig. 6, the electronic device 600 includes a transceiver 601, a bus interface, and a processor 602. When running on the processor 602, the computer program performs the method for constructing the document anomaly detection network model provided in embodiment 1; specifically, the processor 602 is configured to perform: randomly selecting a text area based on a normal document image, and generating a document abnormal image sample set according to the text area, wherein the document abnormal image sample set comprises a plurality of document abnormal image samples;
carrying out document abnormal marking on each document abnormal image to obtain a plurality of marked image samples, and generating marking information files corresponding to the marked image samples;
determining a first sample number of a document abnormal image training set, extracting marking information files of the first sample number from a plurality of marking information files, and generating a training image index list according to the marking information files of the first sample number;
and constructing an initial document anomaly detection network model based on a YOLO framework, and training the initial document anomaly detection network model according to the size of a real boundary box, the training image index list and a document anomaly training image corresponding to the training image index list to obtain a document anomaly detection network model.
In addition, when running on the processor, the computer program executes the method for detecting document abnormality of a document image provided in embodiment 2; specifically, the processor 602 is further configured to perform: inputting a document image to be detected into a document anomaly detection network model, wherein the document anomaly detection network model is obtained according to the document anomaly detection network model construction method provided by embodiment 1;
detecting the document image to be detected through the document abnormality detection network model to obtain a document abnormality output result, wherein the document abnormality output result comprises document abnormality coordinate information, object confidence, category probability and a category to which the document abnormality output result belongs;
and generating a document abnormal detection result according to the document abnormal coordinate information, the object confidence coefficient, the class probability and the belonged class.
In the embodiment of the present invention, the electronic device 600 further includes: a memory 603. In FIG. 6, the bus architecture may include any number of interconnected buses and bridges, with one or more processors represented by processor 602 and various circuits of memory represented by memory 603 linked together. The bus architecture may also link together various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. The bus interface provides an interface. The transceiver 601 may be a number of elements including a transmitter and a receiver that provide a means for communicating with various other apparatus over a transmission medium. The processor 602 is responsible for managing the bus architecture and general processing, and the memory 603 may store data used by the processor 602 in performing operations.
The electronic device 600 provided in the embodiment of the present invention may execute the steps of the method for constructing a network model for detecting document abnormalities in the foregoing method embodiment 1, or the steps of the method for detecting document abnormalities in a document image in embodiment 2, which are not described again.
The electronic device provided in this embodiment constructs an initial document anomaly detection network model based on a YOLO framework, trains the initial document anomaly detection network model according to the size of a real bounding box, the training image index list, and a document anomaly training image corresponding to the training image index list to obtain a document anomaly detection network model, and can perform document anomaly detection on a document image through the document anomaly detection network model, thereby improving the automation degree and accuracy of document anomaly detection.
Example 6
The present application also provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the method for constructing a document abnormality detection network model provided in embodiment 1, or implements the method for detecting document abnormality of a document image provided in embodiment 2.
In this embodiment, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
The computer-readable storage medium provided in this embodiment may implement the method for constructing a network model for detecting document anomalies provided in embodiment 1, or implement the method for detecting document anomalies of a document image provided in embodiment 2, and is not described herein again to avoid repetition.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or terminal comprising the element.
Through the description of the foregoing embodiments, it is clear to those skilled in the art that the method of the foregoing embodiments may be implemented by software plus a necessary general hardware platform, and certainly may also be implemented by hardware, but in many cases, the former is a better implementation. Based on such understanding, the technical solutions of the present application or portions thereof that contribute to the prior art may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (12)

1. A method for constructing a document anomaly detection network model is characterized by comprising the following steps:
randomly selecting a character area based on a normal document image, and generating a document abnormal image sample set according to the character area, wherein the document abnormal image sample set comprises a plurality of document abnormal image samples;
carrying out document abnormal marking on each document abnormal image to obtain a plurality of marked image samples, and generating marking information files corresponding to the marked image samples;
determining a first sample number of a document abnormal image training set, extracting marking information files of the first sample number from a plurality of marking information files, and generating a training image index list according to the marking information files of the first sample number;
constructing an initial document anomaly detection network model based on a YOLO framework, and training the initial document anomaly detection network model according to the size of a real bounding box, the training image index list and a document anomaly training image corresponding to the training image index list to obtain a document anomaly detection network model;
the training of the initial document abnormality detection network model according to the size of the real bounding box, the training image index list and the document abnormality image corresponding to the training image index list comprises the following steps:
loading marking information files corresponding to the training image index list through the initial document anomaly detection network model, acquiring document anomaly coordinate information of the loaded marking information files, and taking the size information of the real bounding box as input data of K-means clustering;
and training the initial document abnormality detection network model according to the document abnormality coordinate information, the size information of the real boundary box and the document abnormality training image.
2. The method of claim 1, wherein generating a document abnormal image sample according to the character area comprises:
determining the character position of the character area through an OCR (optical character recognition), and acquiring a text content set corresponding to the character position;
acquiring the background and font color of the text content set in the normal document image through an OPENCV image processing algorithm, calculating the font size through the width and height of the character position and the line number of the text, and constructing an edge frame according to the background, the font color and the font size;
and generating a document abnormal image sample according to the edge frame and the original text frame of the normal document image.
3. The method according to claim 2, wherein the document abnormal image samples comprise text overlapping samples and/or text occlusion samples, and the generating of the document abnormal image samples according to the edge frame and the original text frame of the normal document image comprises:
constructing the text overlapping sample by calculating the intersection over union of the edge frame and the original text frame; and/or
occluding the normal document image with a preset text box according to the text content set to obtain the text occlusion sample.
4. The method of claim 1, further comprising:
storing each document abnormal image into an image folder;
and storing each marking information file into a marking folder, wherein each marking information file under the marking folder corresponds to each abnormal document image under the image folder one by one.
5. The method of claim 1, wherein the marking information file includes custom image object name, image file path, image size, and document anomaly coordinate information.
6. The method according to claim 1, characterized in that it comprises:
determining a second sample number of the document abnormal image test set;
extracting marking information files with the second sample number from the plurality of marking information files, and generating a test image index list according to the marking information files with the second sample number;
and determining, through the document anomaly detection network model, the false detection rate and the missed detection rate of the document anomaly results according to the document anomaly coordinate information corresponding to the test image index list and the document anomaly test images.
7. A document abnormality detection method for a document image, characterized by comprising:
inputting a document image to be detected into a document anomaly detection network model, wherein the document anomaly detection network model is obtained according to the document anomaly detection network model construction method of any one of claims 1-6;
detecting the document image to be detected through the document anomaly detection network model to obtain a document anomaly output result, wherein the document anomaly output result comprises document anomaly coordinate information, object confidence, category probability and a category to which the document anomaly output result belongs;
and generating a document abnormal detection result according to the document abnormal coordinate information, the object confidence coefficient, the class probability and the belonged class.
8. The method of claim 7, wherein generating the document anomaly detection result according to the document anomaly coordinate information, the object confidence level, the class probability and the belonged class comprises:
determining an abnormal regression position according to the abnormal coordinate information of the document;
determining a document abnormal score according to the product of the object confidence and the category probability;
and determining the number of abnormal documents in each category according to the category to which the document belongs.
9. An apparatus for constructing a document anomaly detection network model, the apparatus comprising:
a selecting module, configured to randomly select a character area based on a normal document image and generate a document abnormal image sample set according to the character area, wherein the document abnormal image sample set comprises a plurality of document abnormal image samples;
the marking module is used for carrying out document abnormal marking on each document abnormal image to obtain a plurality of marked image samples and generating marking information files corresponding to the marked image samples;
the determining module is used for determining the first sample number of the document abnormal image training set, extracting the marking information files of the first sample number from the marking information files, and generating a training image index list according to the marking information files of the first sample number;
the training module is used for constructing an initial document abnormity detection network model based on a YOLO framework, and training the initial document abnormity detection network model according to the size of a real bounding box, the training image index list and a document abnormity training image corresponding to the training image index list to obtain a document abnormity detection network model;
the training module is also used for loading marking information files corresponding to the training image index list through the initial document abnormity detection network model, obtaining document abnormity coordinate information of the loaded marking information files, and taking the size information of the real bounding box as input data of K-means clustering;
and training the initial document abnormality detection network model according to the document abnormality coordinate information, the size information of the real boundary box and the document abnormality training image.
10. An apparatus for detecting document abnormality of a document image, the apparatus comprising:
an input module, configured to input a document image to be detected to a document anomaly detection network model, where the document anomaly detection network model is obtained according to the document anomaly detection network model construction method according to any one of claims 1 to 6;
the detection module is used for detecting the document image to be detected through the document abnormity detection network model to obtain a document abnormity output result, and the document abnormity output result comprises document abnormity coordinate information, object confidence, class probability and the class to which the document abnormity output result belongs;
and the generation module is used for generating a document abnormity detection result according to the document abnormity coordinate information, the object confidence coefficient, the class probability and the belonged class.
11. An electronic device comprising a memory and a processor, wherein the memory stores a computer program, and the computer program, when run on the processor, executes the method for constructing a document abnormality detection network model according to any one of claims 1 to 6 or the method for detecting document abnormality of a document image according to claim 7 or 8.
12. A computer-readable storage medium characterized by storing a computer program which, when run on a processor, executes the document abnormality detection network model construction method of any one of claims 1 to 6, or executes the document abnormality detection method of a document image of claim 7 or 8.
CN202210964812.2A 2022-08-12 2022-08-12 Document anomaly detection network model construction method and device, electronic equipment and medium Active CN115035539B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210964812.2A CN115035539B (en) 2022-08-12 2022-08-12 Document anomaly detection network model construction method and device, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210964812.2A CN115035539B (en) 2022-08-12 2022-08-12 Document anomaly detection network model construction method and device, electronic equipment and medium

Publications (2)

Publication Number Publication Date
CN115035539A CN115035539A (en) 2022-09-09
CN115035539B true CN115035539B (en) 2022-10-28

Family

ID=83130080

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210964812.2A Active CN115035539B (en) 2022-08-12 2022-08-12 Document anomaly detection network model construction method and device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN115035539B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116361193B (en) * 2023-05-16 2023-08-22 福昕鲲鹏(北京)信息科技有限公司 Method and device for testing layout document text selection
CN116305172B (en) * 2023-05-23 2023-08-04 北京安天网络安全技术有限公司 OneNote document detection method, oneNote document detection device, oneNote document detection medium and OneNote document detection equipment
CN118041763B (en) * 2024-04-12 2024-08-09 中国移动紫金(江苏)创新研究院有限公司 CDN log data processing method, device, equipment, medium and program product

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114299528A (en) * 2021-12-27 2022-04-08 万达信息股份有限公司 Information extraction and structuring method for scanned document
CN114419641A (en) * 2022-03-15 2022-04-29 腾讯科技(深圳)有限公司 Training method and device of text separation model, electronic equipment and storage medium
CN114638957A (en) * 2022-03-14 2022-06-17 北京感易智能科技有限公司 Text separation method and device, electronic equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7581171B2 (en) * 2004-01-06 2009-08-25 Microsoft Corporation Positionally encoded document image analysis and labeling
US10936974B2 (en) * 2018-12-24 2021-03-02 Icertis, Inc. Automated training and selection of models for document analysis
US11361528B2 (en) * 2020-08-11 2022-06-14 Nationstar Mortgage LLC Systems and methods for stamp detection and classification

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114299528A (en) * 2021-12-27 2022-04-08 万达信息股份有限公司 Information extraction and structuring method for scanned document
CN114638957A (en) * 2022-03-14 2022-06-17 北京感易智能科技有限公司 Text separation method and device, electronic equipment and storage medium
CN114419641A (en) * 2022-03-15 2022-04-29 腾讯科技(深圳)有限公司 Training method and device of text separation model, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Paragraph-based representation of texts: A complex networks approach; Henrique F. de Arruda et al.; 《Information Processing & Management》; 20190531; Vol. 56 (No. 3); 479-494 *
Research on natural scene text detection based on convolutional networks; Liu Yifan; 《China Master's Theses Full-text Database》; 20211215; I138-467 *

Also Published As

Publication number Publication date
CN115035539A (en) 2022-09-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant