CN110599463B - Tongue image detection and positioning algorithm based on lightweight cascade neural network - Google Patents
- Publication number
- CN110599463B (application CN201910789517.6A)
- Authority
- CN
- China
- Prior art keywords
- sample
- tongue
- candidate
- picture
- positive
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/24 — Pattern recognition; classification techniques
- G06T7/0012 — Image analysis; biomedical image inspection
- G06T7/70 — Image analysis; determining position or orientation of objects or cameras
- G16H50/20 — ICT specially adapted for medical diagnosis; computer-aided diagnosis, e.g. based on medical expert systems
- G06T2207/20081 — Indexing scheme for image analysis; training/learning
- G06T2207/20084 — Indexing scheme for image analysis; artificial neural networks [ANN]
- G06T2207/20112 — Indexing scheme for image analysis; image segmentation details
- G06T2207/20132 — Indexing scheme for image analysis; image cropping
Abstract
The invention provides a tongue image detection and positioning algorithm based on a lightweight cascade neural network, comprising the following steps: inputting an acquired tongue image picture; randomly cutting the labeled tongue picture to obtain positive and negative samples with classification labels and putting them into the first-layer network; extracting features from the sample pictures input to the first-layer network, classifying them, storing the coordinate information of the candidate frames classified as positive, and cutting the original picture according to the output candidate frames to obtain the input samples of the second-layer network; training and classifying the sample pictures in the second-layer network, outputting the candidate frames classified as positive, and correspondingly cutting the original picture to obtain the input samples of the third-layer network; and training on the sample pictures in the third-layer network, the classification labels and the candidate-frame coordinate information being obtained after training. The invention can screen out candidate frames with low confidence and accurately position the tongue image.
Description
Technical Field
The invention relates to the technical field of image detection algorithms, in particular to a tongue image detection and positioning algorithm based on a lightweight cascade neural network.
Background
Tongue diagnosis is one of the important diagnostic methods of traditional Chinese medicine. In recent years, with the development of image processing and machine learning, research on computerized tongue diagnosis systems has received increasing attention. A complete computerized tongue diagnosis system is divided into three parts: tongue image acquisition, tongue body segmentation, and tongue image recognition and analysis. The recognition and classification of tongue image information are completed using image processing and machine learning techniques, and a diagnostic result is finally obtained.
Researchers have focused on the quantification and standardization of tongue diagnosis and have suggested capturing canonical tongue images with standard tongue imaging devices to reduce the effects of environmental factors such as differences in lighting conditions, tongue position, and image size. Constant illumination and a fixed tongue position do reduce the difficulty of subsequent tongue segmentation and tongue image recognition; however, since a computerized tongue diagnosis system should be able to run on more platforms, of which the internet is representative, algorithms previously proposed for standard tongue images acquired with stationary tongue instruments may face significant challenges. In addition, existing detection algorithms such as VGG-based detectors and Faster R-CNN have complex network structures and large models, which is not conducive to porting the algorithms to practical deployments.
In practice, traditional Chinese medicine diagnosis and treatment scenarios are complex and the backgrounds of the pictures to be analyzed are diverse. Accurately locating the position of the tongue in an image is the prerequisite for tongue image analysis, and tongue localization with fixed coordinates cannot meet the requirement of accurately locating the tongue region. Detecting whether a tongue image is present in a picture and accurately positioning it are therefore the practical functions provided by the proposed algorithm.
Disclosure of Invention
In view of this, the problem to be solved by the present invention is to provide a tongue image detection and positioning algorithm based on a lightweight cascade neural network.
In order to solve the above technical problems, the invention adopts the following technical scheme. A tongue image detection and positioning algorithm based on a lightweight cascade neural network comprises the following steps:
Step 1) labeling the coordinate position of a target area of a tongue picture sample picture;
Step 2) randomly cutting tongue photo sample pictures to obtain positive and negative samples with classification labels, and putting the positive and negative samples into a first layer network;
step 3) carrying out feature extraction, classification and judgment on the sample picture input in the step 2), outputting coordinate information of candidate frames classified as positive, correspondingly cutting an original picture to obtain a carefully selected sample, and scaling the sample to a specified size and inputting the sample into a second-layer network;
Step 4) training the sample pictures in the step 3) to obtain classified candidate frames, outputting coordinate information of the candidate frames classified as positive, correspondingly cutting the original picture to obtain carefully selected samples, scaling the samples to a specified size (larger than the size used in step 3), and putting the samples into a third-layer network;
and 5) training the sample pictures in the step 4), and obtaining the coordinate information of the classification labels and the candidate frames after training.
In the present invention, preferably, before the candidate frames output by step 3) enter the second-layer network and the candidate frames output by step 4) enter the third-layer network, the candidate frames are screened and frame regression is performed on them, so as to reject candidate frames with low confidence.
In the present invention, preferably, step 2) distinguishes positive from negative local samples by their IoU values.
In the present invention, preferably, the algorithm step of the training of the step 4) and the step 5) includes:
step S1), preparing a tongue picture sample picture scaled to a window size as a positive sample;
step S2) sampling negative samples from tongue picture sample pictures that contain no positive sample, i.e. no tongue;
Step S3) calculating the values of the 10 integral channels for the positive and negative samples;
step S4), randomly generating a feature pool F, and calculating the error rate of a sampled sample;
Step S5) initializing the sample set D and setting the maximum number of iterations k_max;
Step S6) initializing the iteration weights W_k(i) = 1/n, i = 1…n, and looping k from 1 to k_max;
Step S7) selecting features from the feature pool according to the sample-set weights W_k(i) and forming a two-layer decision tree C_k;
Step S8) calculating the training error, i.e. the error of C_k under the sample-set weights W_k(i);
step S9) calculating the weight of the weak classifier and updating the sample weights;
step S10) returning the weak classifiers C_k and their corresponding weights, and combining the weak classifiers into a strong classifier.
In the present invention, preferably, the method for screening candidate frames includes:
Step T1) extracting the candidate frame list W and sorting it by score;
Step T2) initializing the filtered candidate frame list W_l to empty;
step T3) judging whether len(W) > 0 holds; if so, executing the NMS algorithm to screen the candidate frames; if not, returning the filtered candidate frame list W_l built up from step T2) as the result.
In the present invention, preferably, the nms algorithm includes:
Step U1) selecting the first (highest-scoring) candidate frame w in the candidate frame list W, adding it and its score s to the filtered list W_l, and removing it from the candidate frame list W;
step U2) calculating the overlap ratio between the first candidate frame w and each remaining candidate frame w_i;
step U3) removing from the candidate frame list W all candidate frames whose overlap with the first candidate frame w exceeds the threshold of 0.3.
In the present invention, preferably, after candidate frames with low confidence are eliminated, the coordinate information of the remaining candidate frames is corrected as follows: a three-dimensional coordinate correction vector is defined; given a candidate frame (x, y, w, h), with upper-left corner coordinates (x, y) and width and height (w, h), the corrected candidate frame is obtained by applying the correction vector to (x, y, w, h).
For the confidence vector {c_1, c_2, …, c_n}, the indicator I(c_n > t) takes the value 1 when c_n > t and 0 otherwise; the mean of the corrections whose confidence exceeds the threshold t is taken as the coordinate information of the corrected candidate frame.
In the present invention, preferably, in step 2) the same sample is sent to the classification network for classification multiple times by means of random cropping.
In the present invention, preferably, the sample size of the second layer network is larger than the sample size of the first layer network, and the sample size of the third layer network is larger than the sample size of the second layer network.
In the invention, preferably, the input tongue picture sample pictures are scaled to multiple sizes; multi-scale sample testing effectively improves the robustness and accuracy of the classifier.
The invention has the following advantages and positive effects. The deep learning network in the algorithm is composed of three small convolutional neural networks connected in series; a classifier with a better classification effect is obtained by cascading weak classifiers to detect tongue images. First, the labeled tongue picture samples are randomly cropped to obtain positive and negative samples with classification labels, which are put into the first-layer network. The features of the input samples are extracted and classified, and the coordinates (in the original picture) of the samples recognized as positive are stored as candidate frame information. Through hard-example mining, candidate frames with low confidence are rejected before the remainder enter the second-layer network for further training. The second-layer samples are larger than the first-layer samples and carry more picture information for training, yielding more accurately classified candidate frames. Confidence screening further reduces the number of candidate frames entering the third layer; after training is completed, the classification labels and frame coordinates are obtained, realizing tongue picture detection and accurate positioning. For imported samples with complex backgrounds and complex picture structure, the method can judge whether a tongue image is present and frame the tongue region of interest; in actual detection the accuracy of color tongue image detection exceeds 90% and accurate positioning exceeds 80%.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a schematic diagram of a sample image of a tongue image detection and localization algorithm based on a lightweight cascaded neural network of the present invention;
Fig. 2 is a schematic diagram of feature extraction classification judgment of tongue image detection and positioning algorithm based on lightweight cascade neural network.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It will be understood that when an element is referred to as being "fixed to" another element, it can be directly on the other element or intervening elements may also be present. When a component is considered to be "connected" to another component, it can be directly connected to the other component or intervening components may also be present. When an element is referred to as being "disposed on" another element, it can be directly on the other element or intervening elements may also be present. The terms "vertical," "horizontal," "left," "right," and the like are used herein for illustrative purposes only.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
As shown in fig. 1 and 2, the present invention provides a tongue image detection and positioning algorithm based on a lightweight cascade neural network, comprising:
Step 1) labeling the coordinate position of a target area of a tongue photo sample picture;
Step 2) randomly cutting tongue photo sample pictures to obtain positive and negative samples with classification labels, and putting the positive and negative samples into a first layer network;
step 3) carrying out feature extraction, classification and judgment on the sample picture input in the step 2), outputting coordinate information of candidate frames classified as positive, correspondingly cutting an original picture to obtain a carefully selected sample, and scaling the sample to a specified size and inputting the sample into a second-layer network;
Step 4) training the sample pictures in the step 3) to obtain classified candidate frames, outputting coordinate information of the candidate frames classified as positive, correspondingly cutting the original picture to obtain carefully selected samples, scaling the samples to a specified size (larger than the size of the step 3), and putting the samples into a third layer network;
and 5) training the sample pictures in the step 4), and obtaining the coordinate information of the classification labels and the candidate frames after training.
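As an illustration, the five steps of the embodiment can be sketched as a minimal cascade loop. This is a simplified sketch only: `net1`/`net2`/`net3` are hypothetical scoring stubs standing in for the three small convolutional networks, and all thresholds and sizes are illustrative, not patent values.

```python
import random
from dataclasses import dataclass

@dataclass
class Box:
    x: float
    y: float
    w: float
    h: float
    score: float = 0.0

def random_crops(img_w, img_h, n=20, seed=0):
    """Step 2: randomly crop candidate windows from the sample picture."""
    rng = random.Random(seed)
    boxes = []
    for _ in range(n):
        w = rng.uniform(0.2, 0.6) * img_w
        h = rng.uniform(0.2, 0.6) * img_h
        boxes.append(Box(rng.uniform(0, img_w - w), rng.uniform(0, img_h - h), w, h))
    return boxes

def stage(boxes, classify, keep_thresh):
    """Steps 3-5: score each crop; keep candidates classified as positive.
    Kept coordinates still refer to the original picture, so the next
    layer can re-crop the original at a larger input size."""
    kept = []
    for b in boxes:
        b.score = classify(b)
        if b.score >= keep_thresh:
            kept.append(b)
    return kept

# Hypothetical scorers (real ones would be CNNs with growing input sizes,
# first-layer < second-layer < third-layer).
def net1(b): return 1.0 if b.w * b.h > 1000 else 0.0
def net2(b): return net1(b)
def net3(b): return net1(b)

candidates = random_crops(640, 480)
for net, t in ((net1, 0.6), (net2, 0.7), (net3, 0.8)):
    candidates = stage(candidates, net, t)
```

Because each kept box stores original-picture coordinates, every layer re-crops the original image at a larger input size, which is what lets the later, stronger classifiers see more picture information.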
In this embodiment, further, before the candidate frames output by step 3) enter the second-layer network and the candidate frames output by step 4) enter the third-layer network, the candidate frames are screened and frame regression is performed on them, so as to reject candidate frames with low confidence.
In this embodiment, further, step 2) distinguishes positive from negative local samples by their IoU values. The IoU value is defined as the ratio of the intersection area of the segmented region and the target region to their union area; a larger IoU value corresponds to a better segmentation, and vice versa. Given a group of tongue picture sample pictures, the experiments use the mean IoU over all pictures in the group as the group's IoU value.
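For reference, the IoU criterion can be computed as in the following sketch; the positive-sample threshold value is an illustrative assumption, not taken from the patent.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

# A crop whose IoU with the annotated tongue region exceeds a chosen
# threshold is labeled positive (the threshold value here is illustrative).
POS_THRESH = 0.65
overlap = iou((0, 0, 10, 10), (5, 5, 10, 10))   # intersection 25, union 175
is_positive = overlap > POS_THRESH
```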
In this embodiment, further, the algorithm steps of the training in the step 4) and the step 5) include:
step S1), preparing a tongue picture sample picture scaled to a window size as a positive sample;
step S2) sampling negative samples from tongue picture sample pictures that contain no positive sample, i.e. no tongue;
Step S3) calculating the values of the 10 integral channels for the positive and negative samples;
step S4), randomly generating a feature pool F, and calculating the error rate of a sampled sample;
Step S5) initializing the sample set D and setting the maximum number of iterations k_max;
Step S6) initializing the iteration weights W_k(i) = 1/n, i = 1…n, and looping k from 1 to k_max;
Step S7) selecting features from the feature pool according to the sample-set weights W_k(i) and forming a two-layer decision tree C_k;
Step S8) calculating the training error, i.e. the error of C_k under the sample-set weights W_k(i);
step S9) calculating the weight of the weak classifier and updating the sample weights;
step S10) returning the weak classifiers C_k and their corresponding weights, and combining the weak classifiers into a strong classifier.
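Steps S5)-S10) follow the classic AdaBoost pattern. The sketch below substitutes one-dimensional decision stumps for the two-layer decision trees C_k and a synthetic one-feature dataset for the 10-channel features, purely for illustration; no constant here comes from the patent.

```python
import math

def train_stump(xs, ys, w):
    """Pick the threshold/polarity with the lowest weighted error (step S7)."""
    best = (None, None, float("inf"))
    for t in sorted(set(xs)):
        for pol in (1, -1):
            err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                      if (pol * (1 if xi >= t else -1)) != yi)
            if err < best[2]:
                best = (t, pol, err)
    return best

def adaboost(xs, ys, k_max=5):
    n = len(xs)
    w = [1.0 / n] * n                        # step S6: uniform initial weights
    ensemble = []
    for _ in range(k_max):
        t, pol, err = train_stump(xs, ys, w)
        err = max(err, 1e-10)                # step S8: weighted training error
        alpha = 0.5 * math.log((1 - err) / err)  # step S9: classifier weight
        ensemble.append((t, pol, alpha))
        # step S9 (cont.): up-weight misclassified samples, then renormalize
        w = [wi * math.exp(-alpha * yi * pol * (1 if xi >= t else -1))
             for xi, yi, wi in zip(xs, ys, w)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble                          # step S10: the strong classifier

def predict(ensemble, x):
    # g(x) is the weighted vote; 0 stands in for the decision threshold θ
    g = sum(a * pol * (1 if x >= t else -1) for t, pol, a in ensemble)
    return 1 if g >= 0 else -1

xs = [1, 2, 3, 8, 9, 10]      # synthetic one-feature samples
ys = [-1, -1, -1, 1, 1, 1]    # negative / positive labels
model = adaboost(xs, ys)
```

Each round up-weights the samples the previous weak classifier got wrong, so the next one focuses on the hard cases; the final score g(x) is thresholded to decide positive versus negative.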
In this embodiment, further, the method for screening the candidate frame includes:
Step T1) extracting the candidate frame list W and sorting it by score;
Step T2) initializing the filtered candidate frame list W_l to empty;
step T3) judging whether len(W) > 0 holds; if so, executing the NMS algorithm to screen the candidate frames; if not, returning the filtered candidate frame list W_l built up from step T2) as the result.
In this embodiment, further, the nms algorithm includes:
Step U1) selecting the first (highest-scoring) candidate frame w in the candidate frame list W, adding it and its score s to the filtered list W_l, and removing it from the candidate frame list W;
step U2) calculating the overlap ratio between the first candidate frame w and each remaining candidate frame w_i;
step U3) removing from the candidate frame list W all candidate frames whose overlap with the first candidate frame w exceeds the threshold of 0.3.
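Taken together, steps T1)-T3) and U1)-U3) amount to greedy non-maximum suppression, sketched below. The 0.3 overlap threshold comes from the text; the tuple box representation is an assumption for illustration.

```python
def iou_xywh(a, b):
    """Overlap ratio of two (x, y, w, h, ...) boxes."""
    ax, ay, aw, ah = a[:4]
    bx, by, bw, bh = b[:4]
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, overlap_thresh=0.3):
    """Greedy NMS over (x, y, w, h, score) tuples."""
    W = sorted(boxes, key=lambda b: b[4], reverse=True)  # T1: sort by score
    W_l = []                                             # T2: filtered list, empty
    while W:                                             # T3: while len(W) > 0
        w0 = W.pop(0)                                    # U1: take the top box
        W_l.append(w0)
        # U2/U3: drop every remaining box overlapping w0 above the threshold
        W = [b for b in W if iou_xywh(w0, b) <= overlap_thresh]
    return W_l

boxes = [(0, 0, 10, 10, 0.9), (1, 1, 10, 10, 0.8), (50, 50, 10, 10, 0.7)]
kept = nms(boxes)   # the two heavily overlapping boxes collapse to one
```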
In this embodiment, further, after candidate frames with low confidence are eliminated, the coordinate information of the remaining candidate frames is corrected as follows: a three-dimensional coordinate correction vector is defined; given a candidate frame (x, y, w, h), with upper-left corner coordinates (x, y) and width and height (w, h), the corrected candidate frame is obtained by applying the correction vector to (x, y, w, h).
For the confidence vector {c_1, c_2, …, c_n}, the indicator I(c_n > t) takes the value 1 when c_n > t and 0 otherwise; the mean of the corrections whose confidence exceeds the threshold t is taken as the coordinate information of the corrected candidate frame.
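A sketch of this confidence-gated averaging follows. The patent's exact correction-vector formula appears only in a figure not reproduced here, so the (dx, dy, ds) parameterization below (shift proportional to box size plus a common scale factor) is purely an illustrative assumption.

```python
def refine(box, corrections, confidences, t=0.7):
    """Average the corrected boxes whose confidence exceeds threshold t,
    per the indicator I(c_n > t); falls back to the input box if none pass."""
    x, y, w, h = box
    kept = [(x + dx * w, y + dy * h, w * ds, h * ds)   # assumed parameterization
            for (dx, dy, ds), c in zip(corrections, confidences) if c > t]
    if not kept:
        return box
    n = len(kept)
    return tuple(sum(v[i] for v in kept) / n for i in range(4))

box = (10.0, 10.0, 20.0, 20.0)
corrections = [(0.1, 0.0, 1.0), (-0.1, 0.0, 1.0), (0.5, 0.5, 2.0)]
confidences = [0.9, 0.8, 0.3]   # the third correction is gated out by I(c > t)
refined = refine(box, corrections, confidences)
```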
In this embodiment, further, in step 2) the same sample is sent to the classification network for classification multiple times by means of random cropping. Random cropping effectively improves the classification of difficult tongue picture samples, i.e. samples that are locally similar and can be distinguished only by attending to local information. The random cropping scheme therefore lets the classifier learn global information while attending to local information, which helps improve the classifier's effect.
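The random-cropping idea can be sketched as producing several jittered views of one labeled region; the jitter magnitude and view count below are illustrative, not patent values.

```python
import random

def random_crop_views(img_w, img_h, box, n=5, jitter=0.2, seed=0):
    """Produce n jittered crops of the same labeled region, so the classifier
    sees both global and local views of one sample."""
    rng = random.Random(seed)
    x, y, w, h = box
    crops = []
    for _ in range(n):
        dx = rng.uniform(-jitter, jitter) * w
        dy = rng.uniform(-jitter, jitter) * h
        nw = w * rng.uniform(1 - jitter, 1 + jitter)
        nh = h * rng.uniform(1 - jitter, 1 + jitter)
        nx = min(max(0.0, x + dx), img_w - nw)   # clamp crop inside the picture
        ny = min(max(0.0, y + dy), img_h - nh)
        crops.append((nx, ny, nw, nh))
    return crops

views = random_crop_views(640, 480, (100, 100, 200, 150))
```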
In this embodiment, further, the sample size of the second layer network is larger than the sample size of the first layer network, and the sample size of the third layer network is larger than the sample size of the second layer network.
In this embodiment, further, the input tongue picture samples are scaled to multiple sizes; multi-scale sample testing effectively improves the robustness and accuracy of the classifier.
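Multi-scale testing can be sketched as scanning an image pyramid; the minimum side length and downscale factor below are illustrative assumptions, not patent values.

```python
def scale_pyramid(img_w, img_h, min_side=48, factor=0.707):
    """Generate the (w, h) sizes at which the input picture is rescanned."""
    sizes = []
    w, h = float(img_w), float(img_h)
    while min(w, h) >= min_side:        # stop once the shorter side is too small
        sizes.append((round(w), round(h)))
        w *= factor
        h *= factor
    return sizes

sizes = scale_pyramid(640, 480)
```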
The working principle and process of the invention are as follows. In the algorithm, the weak classifiers obtained from three small convolutional neural networks are cascaded into a classifier with a better classification effect. First, the labeled tongue picture samples are randomly cropped to obtain positive and negative samples with classification labels, which are put into the first-layer network. The features of the input samples are extracted and classified, and the coordinates (in the original picture) of the samples recognized as positive are stored as candidate frame information. Through hard-example mining, candidate frames with low confidence are rejected before the remainder enter the second-layer network for further training. The second-layer samples are larger than the first-layer samples and carry more picture information for training, yielding more accurately classified candidate frames. Confidence screening reduces the number of candidate frames entering the third layer; after training, the classification labels and frame coordinates are obtained.
The idea behind the algorithm's feature selection is to obtain a strong classifier with a better effect by combining weak classifiers. Training is iterative: each round adaptively increases the weights of the samples misclassified in the previous iteration, trains a new weak classifier with these weights as reference, and adds it to the set of weak classifiers. At test time, g(x) is computed and compared against a given threshold θ to classify positive and negative samples. The training specifically comprises the following steps: preparing tongue picture samples scaled to the window size as positive samples; sampling negative samples from tongue picture samples that contain no positive sample (no tongue) and calculating the values of the 10 integral channels for the positive and negative samples; randomly generating a feature pool F and calculating the error rate of the sampled samples; initializing the sample set D and setting the maximum number of iterations k_max; initializing the iteration weights W_k(i) = 1/n, i = 1…n, and looping k from 1 to k_max; selecting features from the feature pool according to the sample-set weights W_k(i) and forming a two-layer decision tree C_k; calculating the training error, i.e. the error of C_k under the weights W_k(i); calculating the weight of the weak classifier and updating the sample weights; and returning the weak classifiers C_k and their corresponding weights, combining them into a strong classifier.
The algorithm is sensitive to outliers and noise: mislabeled annotation data can easily degrade the trained model. To mitigate this influence, the labeling of the entire dataset was rechecked and the annotations were corrected. In addition, because the weak classifiers obtained from the three small convolutional neural networks are cascaded into a classifier with a better classification effect, the convolutional networks used are small, and the candidate frames are filtered layer by layer through three cascaded networks of different input sizes, the model achieves high precision while maintaining speed.
The foregoing describes the embodiments of the present invention in detail, but the description is only a preferred embodiment of the present invention and should not be construed as limiting the scope of the invention. All equivalent changes and modifications within the scope of the present invention are intended to be covered by this patent.
Claims (7)
1. A tongue image detection and positioning algorithm based on a lightweight cascade neural network is characterized by comprising the following steps:
Step 1) labeling the coordinate position of the target area in a tongue photo sample picture;
Step 2) randomly cropping the tongue photo sample pictures to obtain positive and negative samples with classification labels, and feeding the positive and negative samples into a first-layer network;
Step 3) performing feature extraction, classification, and judgment on the sample pictures input in step 2), outputting the coordinate information of the candidate boxes classified as positive, cropping the original picture accordingly to obtain refined samples, and scaling the samples to a specified size as input to a second-layer network;
Step 4) training on the sample pictures from step 3) to obtain classified candidate boxes, outputting the coordinate information of the candidate boxes classified as positive, cropping the original picture accordingly to obtain refined samples, and scaling the samples to a size larger than that in step 3) as input to a third-layer network;
wherein the sample size of the second-layer network is larger than that of the first-layer network, and the sample size of the third-layer network is larger than that of the second-layer network;
Step 5) training on the sample pictures from step 4), and obtaining the classification labels and the coordinate information of the candidate boxes after training;
wherein the training in step 4) and step 5) comprises the following algorithm steps:
Step S1) preparing tongue picture sample pictures scaled to the window size as positive samples;
Step S2) sampling negative samples from regions of the tongue picture sample pictures that contain no positive sample, and from pictures without a tongue;
Step S3) computing the values of the 10 integral channels for the positive and negative samples;
Step S4) randomly generating a feature pool F, and computing the error rate on the sampled samples;
Step S5) initializing the sample set D and the maximum iteration count k_max;
Step S6) initializing the iteration weights W_k(i) = 1/n, i = 1…n, and iterating k from 1 to k_max;
Step S7) selecting features from the feature pool according to the weights W_k(i) of the sample set D, and forming a two-level decision tree C_k;
Step S8) computing the training error of C_k under the weights W_k(i) of the sample set D;
Step S9) computing the weight of the weak classifier, and updating the sample weights;
Step S10) returning the weak classifiers C_k and their corresponding weights, and combining the weak classifiers into a strong classifier.
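Steps S5) through S10) follow the standard AdaBoost scheme. The sketch below illustrates that loop in generic form; it is not the patent's implementation — the integral-channel features and the two-level decision trees are abstracted into arbitrary weak learners returning ±1, and the weight-update formula is the usual discrete AdaBoost one.

```python
import math

def adaboost_train(samples, labels, weak_learners, k_max):
    """labels are +1/-1; weak_learners are functions sample -> +1/-1."""
    n = len(samples)
    w = [1.0 / n] * n                       # S6) initial weights W_k(i) = 1/n
    strong = []                             # collected (alpha_k, C_k) pairs
    for _ in range(k_max):
        # S7)-S8) choose the weak learner with minimal weighted error
        best, best_err = None, float("inf")
        for h in weak_learners:
            err = sum(wi for wi, x, y in zip(w, samples, labels) if h(x) != y)
            if err < best_err:
                best, best_err = h, err
        if best_err >= 0.5:                 # no better than chance: stop early
            break
        eps = max(best_err, 1e-10)
        alpha = 0.5 * math.log((1 - eps) / eps)   # S9) weak-classifier weight
        # S9) re-weight samples: misclassified samples gain weight
        w = [wi * math.exp(-alpha * y * best(x))
             for wi, x, y in zip(w, samples, labels)]
        z = sum(w)
        w = [wi / z for wi in w]
        strong.append((alpha, best))        # S10) collect the weak classifier
    def classify(x):                        # S10) weighted vote = strong classifier
        return 1 if sum(a * h(x) for a, h in strong) >= 0 else -1
    return classify
```

The strong classifier is simply the sign of the alpha-weighted vote of the selected weak classifiers, which is what "combining the weak classifiers into a strong classifier" amounts to.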
2. The tongue image detection and positioning algorithm based on the lightweight cascade neural network according to claim 1, wherein before step 3) inputs samples to the second-layer network and before step 4) inputs samples to the third-layer network, bounding-box regression is performed on the candidate boxes, and candidate boxes with low confidence are removed.
3. The tongue image detection and positioning algorithm based on the lightweight cascade neural network according to claim 1, wherein in step 2) the local samples are classified as positive or negative according to their IoU values.
4. The tongue image detection and positioning algorithm based on the lightweight cascade neural network according to claim 2, wherein the method for screening the candidate boxes comprises:
Step T1) extracting a candidate box list W and sorting it by score;
Step T2) initializing the filtered candidate box list W_l to be empty;
Step T3) judging whether l(W) > 0 holds, where l(W) is the length of the list W; if so, executing the nms algorithm to screen the candidate boxes; if not, returning to step T2).
5. The tongue image detection and positioning algorithm based on the lightweight cascade neural network according to claim 4, wherein the nms algorithm comprises:
Step U1) selecting the first candidate box w in the candidate box list W, adding it together with its score s to the filtered list W_l, and removing it from the list W;
Step U2) computing the overlap ratio between the first candidate box w and each remaining candidate box w_i;
Step U3) removing from the list W all candidate boxes whose overlap with the first candidate box w exceeds the threshold of 0.3.
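Steps T1)-T3) and U1)-U3) together are greedy non-maximum suppression. The sketch below is a generic illustration with the 0.3 threshold from the claim; the box tuple layout and the use of IoU as the overlap ratio are assumptions, not taken from the patent text.

```python
def iou(a, b):
    """Intersection over union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter) if inter else 0.0

def nms(boxes, thr=0.3):
    """boxes: list of (x1, y1, x2, y2, score)."""
    remaining = sorted(boxes, key=lambda b: b[4], reverse=True)  # T1) sort by score
    kept = []                                # T2) filtered list W_l starts empty
    while remaining:                         # T3) loop while l(W) > 0
        w = remaining.pop(0)                 # U1) take the highest-scoring box
        kept.append(w)
        # U2)-U3) drop boxes overlapping w by more than the threshold
        remaining = [b for b in remaining if iou(w[:4], b[:4]) <= thr]
    return kept
```

Each pass keeps the best-scoring box and discards its near-duplicates, so overlapping detections of the same tongue collapse to a single box.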
6. The tongue image detection and positioning algorithm based on the lightweight cascade neural network according to claim 1, wherein step 2) adopts random cropping so that multiple crops of the same picture are fed into the classification network for classification.
7. The tongue image detection and positioning algorithm based on the lightweight cascade neural network according to claim 1, wherein the input tongue image sample pictures are scaled to multiple scales, and testing with multi-scale samples effectively improves the robustness and accuracy of the classifier.
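Multi-scale scaling as in claim 7 is commonly realized as an image pyramid. The sketch below is one plausible construction, not the patent's: the minimum size of 12 and the scale factor of 0.709 are assumptions borrowed from common cascade detectors.

```python
# Hypothetical image pyramid for multi-scale testing: rescale the input by a
# fixed factor until the shorter side drops below the detector's window size.
def pyramid_sizes(width, height, min_size=12, factor=0.709):
    sizes = []
    scale = 1.0
    while min(width, height) * scale >= min_size:
        sizes.append((int(width * scale), int(height * scale)))
        scale *= factor
    return sizes
```

Running the classifier over every size in the pyramid lets a fixed-size window match tongues of varying apparent size, which is the robustness gain the claim refers to.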
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910789517.6A CN110599463B (en) | 2019-08-26 | 2019-08-26 | Tongue image detection and positioning algorithm based on lightweight cascade neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110599463A CN110599463A (en) | 2019-12-20 |
CN110599463B true CN110599463B (en) | 2024-09-03 |
Family
ID=68855693
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910789517.6A Active CN110599463B (en) | 2019-08-26 | 2019-08-26 | Tongue image detection and positioning algorithm based on lightweight cascade neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110599463B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111242038B (en) * | 2020-01-15 | 2024-06-07 | 北京工业大学 | Dynamic tongue fibrillation detection method based on frame prediction network |
CN111414995B (en) * | 2020-03-16 | 2023-05-19 | 北京君立康生物科技有限公司 | Detection processing method and device for micro-target colony, electronic equipment and medium |
CN111489332B (en) * | 2020-03-31 | 2023-03-17 | 成都数之联科技股份有限公司 | Multi-scale IOF random cutting data enhancement method for target detection |
CN111598833B (en) * | 2020-04-01 | 2023-05-26 | 江汉大学 | Method and device for detecting flaws of target sample and electronic equipment |
CN112860867B (en) * | 2021-02-25 | 2022-07-12 | 电子科技大学 | Attribute selecting method and storage medium for Chinese question-answering system based on convolution neural network |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109410168A (en) * | 2018-08-31 | 2019-03-01 | 清华大学 | For determining the modeling method of the convolutional neural networks model of the classification of the subgraph block in image |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104537379A (en) * | 2014-12-26 | 2015-04-22 | 上海大学 | High-precision automatic tongue partition method |
KR101809819B1 (en) * | 2016-02-23 | 2017-12-18 | 정종율 | Method and system for tongue diagnosis based on image of tongue |
CN105930798B (en) * | 2016-04-21 | 2019-05-03 | 厦门快商通科技股份有限公司 | The tongue picture towards mobile phone application based on study quickly detects dividing method |
US10997727B2 (en) * | 2017-11-07 | 2021-05-04 | Align Technology, Inc. | Deep learning for tooth detection and evaluation |
CN109637660B (en) * | 2018-12-19 | 2024-01-23 | 新绎健康科技有限公司 | Tongue diagnosis analysis method and system based on deep convolutional neural network |
Non-Patent Citations (2)
Title |
---|
"Research on the AdaBoost algorithm for regional recognition in TCM tongue diagnosis images" (「AdaBoost算法在中医舌诊图像分区识别中的研究」); Zhang Meng et al.; Journal of Chinese Computer Systems (《小型微型计算机系统》); Vol. 29, No. 6; pp. 1149-1153, sections 1.2-1.3 *
"Face detection with multi-cascaded convolutional neural networks" (「多级联卷积神经网络人脸检测」); Yu Fei et al.; Journal of Wuyi University (Natural Science Edition) (《五邑大学学报(自然科学版)》); Vol. 32, No. 3; pp. 49-56, 71, section 1.2 *
Also Published As
Publication number | Publication date |
---|---|
CN110599463A (en) | 2019-12-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110599463B (en) | Tongue image detection and positioning algorithm based on lightweight cascade neural network | |
CN110348319B (en) | Face anti-counterfeiting method based on face depth information and edge image fusion | |
Zhao et al. | Cloud shape classification system based on multi-channel cnn and improved fdm | |
CN110543837B (en) | Visible light airport airplane detection method based on potential target point | |
CN110334706B (en) | Image target identification method and device | |
CN107133616B (en) | Segmentation-free character positioning and identifying method based on deep learning | |
CN107316036B (en) | Insect pest identification method based on cascade classifier | |
JP6395481B2 (en) | Image recognition apparatus, method, and program | |
CN107273832B (en) | License plate recognition method and system based on integral channel characteristics and convolutional neural network | |
CN105844621A (en) | Method for detecting quality of printed matter | |
CN105574550A (en) | Vehicle identification method and device | |
CN106408030A (en) | SAR image classification method based on middle lamella semantic attribute and convolution neural network | |
CN110263712A (en) | A kind of coarse-fine pedestrian detection method based on region candidate | |
CN106203237A (en) | The recognition methods of container-trailer numbering and device | |
CN110334703B (en) | Ship detection and identification method in day and night image | |
CN108734200B (en) | Human target visual detection method and device based on BING (building information network) features | |
CN109902576B (en) | Training method and application of head and shoulder image classifier | |
CN112365497A (en) | High-speed target detection method and system based on Trident Net and Cascade-RCNN structures | |
CN110008899B (en) | Method for extracting and classifying candidate targets of visible light remote sensing image | |
CN114140665A (en) | Dense small target detection method based on improved YOLOv5 | |
US11741153B2 (en) | Training data acquisition apparatus, training apparatus, and training data acquiring method | |
CN104050460B (en) | The pedestrian detection method of multiple features fusion | |
CN113221956A (en) | Target identification method and device based on improved multi-scale depth model | |
CN111444816A (en) | Multi-scale dense pedestrian detection method based on fast RCNN | |
CN114898290A (en) | Real-time detection method and system for marine ship |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||