Detailed Description
The following description of the embodiments of the present invention is made clearly and fully with reference to the accompanying drawings. It is evident that the embodiments described are some, but not all, embodiments of the invention. All other embodiments obtainable by those skilled in the art based on the embodiments of the invention without inventive effort are intended to fall within the scope of the invention.
Fig. 1 is a schematic flow chart of a method for constructing a symmetrical full convolution neural network model according to an embodiment of the present invention. As shown in FIG. 1, the method includes S101-S103.
S101, performing a blocking process on the original fundus image.
The original fundus image may be a color image, a grayscale image, or the like; RGB three-channel color images are generally used. The original fundus image may be acquired from a preset dataset, such as the training set of the DRIVE library. In an embodiment, the original fundus image may also be obtained by performing data augmentation on fundus images in a preset dataset, for example, on the fundus images in the DRIVE library training set.
In one embodiment, as shown in FIG. 2, step S101 includes the following steps S1011-S1013.
S1011, determining the data scale. The data scale may be selected through test experiments, i.e., the total number of tiles into which all of the original images in the DRIVE library training set are to be divided, such as 500000 tiles.
S1012, determining the number and size of the tiles into which each original fundus image is to be segmented according to the data scale. That is, the number and size of the tiles for each original fundus image are determined from the data scale.
S1013, randomly segmenting the original fundus image and a preset standard image which has undergone fundus blood vessel segmentation according to the determined tile number and size. The preset standard image refers to a standard image manually segmented by an expert in a preset dataset, for example, in the DRIVE library training set. Each original fundus image corresponds to one such preset standard image, and the two are segmented in the same manner.
Assume the data scale is n, the original fundus image size is w×h, and the tile size is a×b. Each original fundus image then needs to be randomly segmented into n/60 tiles (60 being the number of images after augmentation of the training set), and the tiles may overlap.
The center point of the tile should satisfy:

a/2 < x_center < w − a/2

b/2 < y_center < h − b/2

After the center point is randomly selected, the range of values of the image block is:

Patch = (x_center − a/2 : x_center + a/2, y_center − b/2 : y_center + b/2)

wherein x_center and y_center are the X-axis and Y-axis coordinates of the tile center point, respectively; w and h are the width and height of the original fundus image, respectively; and a and b are the width and height of the tile, respectively.
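The random segmentation described above can be sketched in NumPy as follows (a minimal sketch; the function name and parameters are illustrative, not from the original specification):

```python
import numpy as np

def extract_random_patches(image, label, num_patches, a, b, rng=None):
    """Randomly crop num_patches aligned a*b tiles from an original fundus
    image and its expert-segmented standard image (label).

    The tile center (x_center, y_center) is drawn so that the tile lies
    fully inside the w*h image:
        a/2 < x_center < w - a/2,   b/2 < y_center < h - b/2.
    Tiles may overlap, as stated in the description above.
    """
    rng = rng if rng is not None else np.random.default_rng()
    h, w = image.shape[:2]
    patches, label_patches = [], []
    for _ in range(num_patches):
        x_center = rng.integers(a // 2, w - a // 2)
        y_center = rng.integers(b // 2, h - b // 2)
        x0, y0 = x_center - a // 2, y_center - b // 2
        # identical crop applied to the image and its standard image
        patches.append(image[y0:y0 + b, x0:x0 + a])
        label_patches.append(label[y0:y0 + b, x0:x0 + a])
    return np.stack(patches), np.stack(label_patches)
```

Because image and label are cropped with the same coordinates, each tile stays paired with its ground-truth segmentation.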
S102, performing whitening treatment on the segmented original fundus image to obtain an original fundus image block.
The original fundus image after the blocking processing consists of the corresponding tiles. Fundus images are typically affected by factors such as the illumination intensity of the acquisition environment, centerline reflection, and the acquisition equipment, so noise is easily introduced and the contrast between blood vessels and the background is reduced. To reduce the influence of these factors and extract the invariant information in the picture, whitening processing is required to convert the pixel values of the fundus image to zero mean and unit variance.
In one embodiment, as shown in FIG. 3, step S102 includes the following steps S1021-S1022.
S1021, calculating pixel average values and variances under different channels of the original fundus image after the blocking processing.
For example, the pixel mean μ and variance δ² of each channel of the original fundus image after the blocking processing are calculated. For the red channel:

μ_r = (1/R) Σ_r Z_r,   δ_r² = (1/R) Σ_r (Z_r − μ_r)²

and likewise for the green and blue channels, wherein R, G and B are the numbers of red, green and blue channel pixel points, respectively; r, g and b index the current pixel points of the red, green and blue channels; Z_r, Z_g and Z_b are the values of the current pixel points; and μ_r, μ_g and μ_b are the mean values of the red, green and blue channel pixels, respectively.
S1022, subtracting the pixel mean value of the channel from each pixel value in each channel of the original fundus image after the blocking processing, and dividing by the standard deviation of that channel, so as to obtain an original fundus image block, i.e., Z′ = (Z − μ)/δ per channel, as shown in formula (7).
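Steps S1021-S1022 amount to per-channel standardization of each tile; a minimal NumPy sketch (the small epsilon guarding against a zero standard deviation is an implementation assumption, not from the specification):

```python
import numpy as np

def whiten_patch(patch):
    """Per-channel whitening: subtract the channel mean and divide by the
    channel standard deviation, giving zero mean and unit variance."""
    patch = patch.astype(np.float64)
    mu = patch.mean(axis=(0, 1), keepdims=True)    # per-channel mean
    sigma = patch.std(axis=(0, 1), keepdims=True)  # per-channel std dev
    return (patch - mu) / (sigma + 1e-8)           # eps avoids division by 0
```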
Fig. 4 is a comparison diagram of an original fundus image after the blocking processing and the same image after the whitening processing provided in the embodiment of the present invention. As shown in fig. 4, the left part is a block-processed original fundus image and the right part is the corresponding whitened image. It can be seen that the contrast between blood vessels and the background is significantly enhanced after processing. In images with dim illumination or excessive noise, tiny blood vessels that are hardly identifiable by the naked eye in the original fundus image become clearly visible after whitening. This plays an important role in improving the later segmentation accuracy of blood vessels, and ultimately improves the precision of the model on focus images and on images with dim or uneven illumination.
S103, inputting the original fundus image block into a preset symmetrical full convolution neural network for training to obtain a preset symmetrical full convolution neural network model. Each hidden layer in the model processes the feature maps input to that layer and simultaneously processes all feature maps output before that layer, so that the network takes an original fundus image block as input and outputs the fundus blood vessel segmentation result of each pixel of that block.
The preset symmetrical full convolution neural network is a densely connected, structurally symmetrical full convolution neural network.
Fig. 5 is a schematic structural diagram of a structurally symmetrical full convolutional neural network according to an embodiment of the present invention. As shown in fig. 5, the network includes an input layer, hidden layers and an output layer; the layers between the input layer and the output layer are all hidden layers, and several of the hidden layers form a symmetric structure. The hidden layers are divided into a downsampling part and an upsampling part. The downsampling part is formed by alternately combining convolution layers and pooling layers, and contracts the path of the input image during network training so as to capture global information. The upsampling part is formed by alternately combining convolution layers and deconvolution layers, and expands the path of the downsampled feature map during training so that each pixel point is accurately located. Unlike a general convolutional neural network, the preset symmetrical full convolution neural network contains no fully connected layer; its output layer performs a classification operation (calculating the probability that each pixel point is a background point or a blood vessel point) on each pixel of the upsampled feature map, which has the same size as the original image, through a preset activation function such as the softmax activation function. The network is therefore end-to-end: an image is input, and an image of the same size is output.
Fig. 6 is a schematic diagram of a dense connection network according to an embodiment of the present invention. As can be seen in fig. 6, each layer processes the feature map output from the previous layer, and processes all the feature maps output before the layer.
On the basis of a full convolution neural network model of a symmetrical structure, the introduction of a dense connection mechanism has the following advantages:
(1) In the hidden layers of a deep learning network model, the output of the upper layer is generally used as the input of the lower layer, so a network with N hidden layers has N connections. However, when the number of network layers increases beyond a certain point, the connection between front and rear layers weakens as it lengthens, which may cause the problem of vanishing gradients. Under the dense connection mechanism, the input of each hidden layer is the set of outputs of all previous layers, and the feature map of the current layer is directly passed as input to all subsequent network layers, so a network with N hidden layers has N(N+1)/2 connections. This effectively alleviates the vanishing gradient problem.
(2) The problem of overfitting can be alleviated to a certain extent. The amount of data in the fundus image database employed herein is relatively small, so overfitting is likely to occur during network training. Each layer of a deep learning network applies a nonlinear transformation to the input data; as the number of layers increases, the complexity of the transformation accumulates and is greatest at the last layer. The classifier of a general neural network depends directly on the output of this last layer, so a decision function with good generalization performance is difficult to obtain, causing overfitting. Under the dense connection mechanism, the network can also comprehensively utilize the low-complexity features close to the input layer, so it more easily obtains a decision function with better generalization performance.
Fig. 7 is a schematic structural diagram of a preset symmetric full convolution neural network according to an embodiment of the present invention. The preset symmetrical full convolution neural network is symmetric as a whole, with the hidden layers composed of a downsampling part and an upsampling part; in addition, the input of each network layer is the superposition of all feature maps output by the preceding layers, so that the deeper layers can reuse the features extracted by the front layers.
The downsampling part and the upsampling part in the hidden layers of the model are each composed of several cycle units, as shown in fig. 8, which is a hidden layer decomposition schematic diagram of the preset symmetric full convolution neural network provided by the embodiment of the invention. The downsampling part has three downsampling cycle units, and the upsampling part has three upsampling cycle units. Each downsampling cycle unit corresponds to one hidden layer, and each upsampling cycle unit also corresponds to one hidden layer.
In an embodiment, if the number of fundus images in the preset data set is insufficient, it is necessary to increase the number of fundus images in the data set. As shown in fig. 1, before step S101, the method further includes:
S101a, acquiring fundus images in a preset data set, and performing augmentation processing on the acquired fundus images to obtain original fundus images.
In one embodiment, as shown in FIG. 9, step S101a includes the following steps S1011a-S1013a.
S1011a, rotating each fundus image in the preset data set by a random angle.
S1012a, brightness of the picture is adjusted for each fundus image after rotation using Gamma correction.
S1013a, each fundus image after brightness adjustment and a fundus image in a preset data set are taken as an original fundus image.
The preset data set may be the DRIVE library training set. The Gamma correction formula is f(img_i) = img_i^γ, wherein img_i represents the pixel value at a certain point i. Fig. 10 is a schematic diagram of Gamma correction according to an embodiment of the present invention. The effect of Gamma correction can be derived in conjunction with fig. 10:
1) When γ < 1 (e.g. 0.5 < γ < 1), as shown by the broken line in fig. 10, the dynamic range in the low gray scale region becomes larger, enhancing the contrast of the image there; in the high gray scale region the dynamic range becomes smaller, and the gray value of the entire image increases.
2) When γ > 1 (e.g. 1 < γ < 1.5), as shown by the solid line in fig. 10, the dynamic range of the low gray region becomes smaller and that of the high gray region larger, so the contrast of the low gray region is reduced, the contrast of the high gray region is improved, and the gray value of the entire image decreases.
After data augmentation, each picture in the training set is expanded into three pictures: the original image, the augmented image obtained with γ < 1, and the augmented image obtained with γ > 1. Assuming that 20 images in the training set are acquired, 60 images are obtained after the augmentation processing for use in training the model.
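The augmentation steps above can be sketched as follows. This is a minimal sketch assuming pixel values normalized to [0, 1]; the particular γ values 0.8 and 1.2 are illustrative picks from the ranges 0.5 < γ < 1 and 1 < γ < 1.5 stated above, not values fixed by the specification:

```python
import numpy as np

def gamma_correct(img, gamma):
    """Apply Gamma correction f(img_i) = img_i**gamma to an image whose
    pixel values lie in [0, 1]."""
    return np.clip(img, 0.0, 1.0) ** gamma

def augment(images, gamma_low=0.8, gamma_high=1.2):
    """Expand each image into three: the original, one with gamma < 1
    (larger dynamic range in low-gray regions) and one with gamma > 1."""
    out = []
    for img in images:
        out.extend([img,
                    gamma_correct(img, gamma_low),
                    gamma_correct(img, gamma_high)])
    return out
```

With 20 training images as input, `augment` returns the 60 images described above.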
In one embodiment, as shown in FIG. 11, step S103 includes the following steps S1031-S1038.
S1031, randomly extracting fundus image blocks of a preset proportion from the original fundus image blocks as training samples.
Assume the model training stage comprises 570000 image blocks in total; 90% of the data is randomly selected for training in each training round, and during training the data may be input into the preset symmetrical full convolution neural network in batches to reduce training time. The remaining 10% of the data is used for validation. Assuming the image block size is 48×48, and since the color space information of the image is retained, i.e., the images of the different channels are acquired, the actual input size is 48×48×3.
S1032, inputting the acquired training samples into a plurality of downsampling circulation units in a preset symmetrical full-convolution neural network for processing, wherein each downsampling circulation unit corresponds to a hidden layer in the preset symmetrical full-convolution neural network, each downsampling circulation unit carries out convolution processing on the characteristic images input by the layer, carries out convolution processing on all the output characteristic images before the layer, and carries out pooling processing on all the characteristic images after the convolution processing.
Each downsampling cycle unit corresponds to one hidden layer in the preset symmetric full convolutional neural network. The step of inputting the acquired training samples into the plurality of downsampling cycle units for processing comprises: inputting the acquired training samples into the first downsampling cycle unit for processing; inputting the feature map processed by the first downsampling cycle unit into the second downsampling cycle unit for processing; and inputting the feature map processed by the second downsampling cycle unit into the third downsampling cycle unit for processing. The specific processing procedure of each downsampling cycle unit is the same: each unit performs convolution processing on the feature maps input to that layer, performs convolution processing on all feature maps output before that layer, and performs pooling processing on all feature maps after the convolution processing.
Specifically, the specific processing procedure of each downsampling cycle unit will be described taking as an example the step of inputting the acquired training samples to the first downsampling cycle unit for processing. The specific processing procedure of each downsampling cycle unit can be seen in fig. 8 and 12. Each downsampling cycle unit in fig. 8 includes: conv2d (downsampling the first convolutional layer), add (superposition processing), batch_normalization (normalization processing), activation (activation function processing), conv2d (downsampling the second convolutional layer), max_pooling (downsampling pooling layer).
As shown in fig. 12, the step of inputting the acquired training samples to the first downsampling cycle unit for processing includes the following steps S1032a-S1032f.
S1032a, the acquired training samples are input to the downsampled first convolution layer in the downsampling cyclic unit to perform convolution processing.
The downsampled first convolution layer extracts features from the feature map output by the previous layer. For example, the convolution kernel may be 3×3.
S1032b, obtaining all the output feature graphs before the layer, and superposing the obtained feature graphs.
Superposition means that the acquired feature maps are processed together as a whole.
S1032c, the superimposed feature map is normalized.
Normalization forcibly pulls the input distribution of each hidden layer neuron, which is gradually mapped toward the saturated limits of the nonlinear function's value interval, back to a standard normal distribution with mean 0 and variance 1. The input values of the nonlinear transformation function thus fall into a region that is sensitive to the input, which avoids the vanishing gradient problem and improves the convergence rate of the network.
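The normalization step can be sketched as follows. This is a simplified sketch: the learnable scale and shift parameters of full batch normalization, and its train/inference distinction, are omitted for clarity:

```python
import numpy as np

def batch_normalize(x, eps=1e-5):
    """Normalize a batch of feature maps (N, H, W, C) to zero mean and
    unit variance per channel, as described above."""
    mu = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return (x - mu) / np.sqrt(var + eps)  # eps keeps the divisor nonzero
```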
S1032d, the normalized feature map is activated using an activation function.
Among them, the activation function is the ReLU activation function, currently the most commonly used in deep learning.
S1032e, the activated feature map is input to the downsampled second convolution layer in the downsampling cyclic unit to perform convolution processing.
The second convolution extracts features from the superposition of all front-layer inputs; the convolution kernel size may be, for example, 3×3.
S1032f, inputting the feature map processed by the downsampling first convolution layer and the feature map processed by the downsampling second convolution layer into a downsampling pooling layer in the downsampling circulation unit to carry out pooling processing, thus completing the processing of the downsampling circulation unit.
Wherein the downsampling pooling layer uses the maximum pooling method.
Each downsampling cycle unit thus comprises two convolution operations: the first convolution extracts features from the input of the upper layer, and the second convolution extracts features from the superposition of all front-layer inputs. The convolution kernel is 3×3 because the kernel must be larger than 1×1 to enlarge the receptive field, and an even-sized kernel cannot keep the input and output feature map sizes equal even when padding is added symmetrically on both sides of the feature map; a 3×3 kernel is therefore generally selected.
The image block area is reduced to 1/4 of the original size after each downsampling operation, i.e., each spatial dimension is halved.
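The pooling step and its size reduction can be sketched in NumPy (a minimal sketch of 2×2 max pooling with stride 2; deep learning frameworks provide equivalent built-in layers):

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2: each spatial dimension is halved,
    so the feature map area shrinks to 1/4 after each downsampling step.
    Works for (H, W) or (H, W, C) arrays."""
    h, w = (x.shape[0] // 2) * 2, (x.shape[1] // 2) * 2
    x = x[:h, :w]  # crop odd edges so the 2x2 windows tile exactly
    return x.reshape(h // 2, 2, w // 2, 2, *x.shape[2:]).max(axis=(1, 3))
```

For example, a 48×48×3 feature map becomes 24×24×3, matching the sizes in Table 1.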
S1033, inputting the feature images processed by the plurality of downsampling circulation units into upsampling circulation units symmetrical to the downsampling circulation units for processing, wherein each upsampling circulation unit corresponds to a hidden layer in a preset symmetrical full convolution neural network, each upsampling circulation unit carries out upsampling processing on the feature images input by the layer, carries out convolution processing on the feature images after upsampling processing, and carries out convolution processing on all the feature images output before the layer.
Each upsampling loop unit corresponds to a hidden layer in the preset symmetric full convolutional neural network. The step of inputting the feature map processed by the plurality of downsampling cycle units into an upsampling cycle unit symmetrical to the downsampling cycle unit for processing comprises the following steps: inputting the feature images processed by the plurality of downsampling cycle units into a first upsampling cycle unit for processing; the feature map processed by the first upsampling circulation unit is input to the second upsampling circulation unit for processing, and the feature map processed by the second upsampling circulation unit is input to the third upsampling circulation unit for processing. Wherein the specific processing procedure of each up-sampling loop unit is the same. Each up-sampling circulation unit carries out up-sampling processing (also called deconvolution processing) on the feature map input by the layer, carries out convolution processing on the feature map after up-sampling processing, and carries out convolution processing on all the feature maps output before the layer.
Specifically, the specific processing procedure of each up-sampling loop unit will be described taking as an example the step of inputting the acquired training sample to the first up-sampling loop unit for processing. The specific processing procedure of each up-sampling loop unit can be seen in fig. 8 and 13. Each up-sampling loop unit in fig. 8 includes: up_sampling (upsampling process), conv2d (upsampling first convolutional layer), add (superimposition process), batch_normalization (normalization process), activation (activation function process), conv2d (upsampling second convolutional layer).
As shown in fig. 13, the step of inputting the acquired feature map to the first upsampling cycle unit for processing includes the following steps S1033a-S1033f.
And S1033a, performing up-sampling processing on the acquired characteristic diagram. The upsampling process may also be understood as a deconvolution process, which may interpolate the acquired features using interpolation algorithms. Such as using bilinear interpolation.
S1033b, inputting the acquired feature map to the up-sampling first convolution layer in the up-sampling loop unit to perform convolution processing.
The upsampling first convolution layer extracts features from the feature map output from the previous layer. For example, the convolution kernel may be 3×3.
And S1033c, acquiring all the output feature maps before the layer, and superposing the acquired feature maps.
And S1033d, normalizing the superimposed characteristic diagram.
The purpose and effect of normalization are described above and are not repeated herein.
S1033e, activating the normalized feature map using an activation function.
Among them, the activation function is the ReLU activation function, currently the most commonly used in deep learning.
S1033f, inputting the activated characteristic diagram into an up-sampling second convolution layer in the up-sampling circulation unit to carry out convolution processing, and thus completing the processing of the first up-sampling circulation unit.
The second convolution extracts features from the superposition of all front-layer inputs. The convolution kernel may be, for example, 3×3.
The upsampling cycle unit corresponds to the downsampling cycle unit, and its internal structure is similar. The image block area is increased to 4 times that of the upper layer after each upsampling operation, i.e., each spatial dimension is doubled.
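The size doubling can be sketched with nearest-neighbour upsampling (a minimal sketch; the bilinear interpolation mentioned above is a common alternative that frameworks provide as a built-in layer):

```python
import numpy as np

def upsample_2x(x):
    """Nearest-neighbour 2x upsampling: each spatial dimension is doubled,
    so the feature map grows to 4 times the area of the previous layer."""
    return x.repeat(2, axis=0).repeat(2, axis=1)
```

For example, a 12×12 feature map becomes 24×24, matching the upsampling sizes in Table 1.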
Suppose the preset symmetrical full convolution neural network comprises three downsampling cycle units and three upsampling cycle units, and the input image block size is 48×48. Then the image block input to the first downsampling cycle unit is 48×48, and the feature map obtained after its processing is 24×24, which is the input size of the second downsampling cycle unit; the feature map after the second unit is 12×12, which is the input size of the third downsampling cycle unit. When the feature map processed by the three downsampling cycle units is input to the first upsampling cycle unit, the input size is 12×12 and the output is 24×24, which is the input size of the second upsampling cycle unit; the output of the second upsampling cycle unit is 48×48, which is the input size of the third upsampling cycle unit. The convolution kernel size of each convolution layer in each downsampling and upsampling unit is 3×3. See Table 1 for the specific input parameters of each hidden layer; Layer_1, Layer_2 and Layer_3 each correspond to one cycle unit.
TABLE 1 Hidden layer input parameters

| Downsampling layer | Feature map size | Upsampling layer | Feature map size | Convolution kernel size |
| Layer_1 | 48×48 | Layer_1 | 12×12 | 3×3 |
| Layer_2 | 24×24 | Layer_2 | 24×24 | 3×3 |
| Layer_3 | 12×12 | Layer_3 | 48×48 | 3×3 |
S1034, inputting the feature images processed by the up-sampling circulation units into an output layer in a preset symmetrical full convolution neural network for processing to obtain a predicted value corresponding to each pixel in the training sample.
S1035, calculating errors according to the predicted value corresponding to each pixel in the training sample and the real label of each pixel of the training sample.
Wherein the error is calculated using a cross entropy cost function.
The cross entropy cost function is:

C = −(1/n) Σ_x [ y ln a + (1 − y) ln(1 − a) ]   (8)

a = σ(z)   (9)

z = Σ_j w_j x_j + b   (10)
wherein: n is the total number of training sets, x is the input, w is the weight of the input, b is the bias value, z is the weighted sum of the inputs, σ is the activation function, a is the actual output of the neuron, y is the desired output, and C is the cross entropy cost function.
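The cost computation can be sketched directly from the formula (a minimal sketch; the clipping of a away from 0 and 1 is a numerical-safety assumption, not part of the formula):

```python
import numpy as np

def cross_entropy_cost(a, y):
    """Cross-entropy cost C = -(1/n) * sum_x [y*ln(a) + (1-y)*ln(1-a)],
    where a is the actual output of the neuron and y the desired output."""
    a = np.clip(a, 1e-12, 1 - 1e-12)  # keep log() finite
    return -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))
```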
S1036, judging whether the error has reached the minimum.
S1037, if the error does not reach the minimum, updating the network parameters in the preset symmetrical full convolution neural network through a gradient descent algorithm, and calling the symmetrical full convolution network with the updated network parameters as the preset symmetrical full convolution neural network. Then, the process returns to step S1031.
In training the neural network, w and b are updated by the gradient descent algorithm, so the derivatives of the cost function with respect to w and b need to be calculated:

∂C/∂w_j = (1/n) Σ_x x_j (σ(z) − y)   (11)

∂C/∂b = (1/n) Σ_x (σ(z) − y)   (12)

Then w and b are updated (η being the learning rate):

w → w − η ∂C/∂w   (13)

b → b − η ∂C/∂b   (14)

The weights of each layer of the network are then updated by back propagation:
using the error delta of the next layer l+1 To represent the error delta of the current layer l The method comprises the following steps:
δ l =((w l+1 ) T δ l+1 )⊙σ'(z l ) (15)
formulas (13) - (16) are the process of weight and bias updating, and the letters in (13) (14) are as above; where l represents the network layer delta represents the error, delta
l Indicating the error of the first layer,
representing the weights on the connections from k neurons of layer (l-1) to j neurons of layer l.
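The error propagation of equation (15) can be sketched as follows, assuming the sigmoid σ of equation (9) (a minimal sketch for one backward step; function names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    """Derivative sigma'(z) = sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1 - s)

def backprop_error(delta_next, w_next, z):
    """Equation (15): delta_l = ((w_{l+1})^T delta_{l+1}) ⊙ sigma'(z_l),
    where ⊙ is the element-wise (Hadamard) product."""
    return (w_next.T @ delta_next) * sigmoid_prime(z)
```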
S1038, if the error is minimized, taking the trained symmetrical full convolution neural network model as a preset symmetrical full convolution neural network model.
To see what the preset symmetric full convolutional neural network model has learned, the learned convolution operators are visualized as shown in fig. 14. Each convolution operator is equivalent to extracting one feature from the image; when the convolution operation is performed on the input values, the parameters in the convolution operator (namely, the learned w) weaken uninteresting inputs and strengthen interesting ones.
As can be seen from fig. 14, this convolution operator roughly resembles a network of blood vessels, with white portions representing large weights and black portions representing small or negative weights. The convolution operator in effect gives greater weight to pixels in vessel-shaped portions while penalizing background points. Finally, the probability that a pixel point is a vascular point is predicted through softmax.
According to the embodiment of the method, each hidden layer in the preset symmetrical full convolution neural network processes the feature maps input to that layer and simultaneously processes all feature maps output before that layer, so that an original fundus image block is input and the fundus blood vessel segmentation result of each pixel of that block is output. Because each hidden layer constructed by the embodiment of the invention processes all previously output feature maps in addition to the output of the previous layer, it avoids the problem found in other full convolution neural networks, where taking only the upper layer's output as the lower layer's input weakens the connection between front and rear layers as the network deepens and may cause gradients to vanish. Meanwhile, the trained model avoids the situation in other neural networks where the classifier depends directly on the output of the last layer, making a decision function with good generalization performance difficult to obtain and causing overfitting; that is, the preset symmetrical full convolution neural network model in the embodiment of the invention alleviates the overfitting problem and improves the generalization capability of the model.
Fig. 15 is a schematic flowchart of a fundus image vessel segmentation method provided in an embodiment of the present invention. As shown in fig. 15, the method includes S201-S204.
S201, a blocking process is performed on the target fundus image. The target fundus image may be a test image in a preset dataset, such as a test image in the DRIVE library test set. Suppose the 20 original images in the DRIVE test set are acquired; each image is divided into blocks from left to right and from top to bottom in sequence, with the same block size as in step S101. There is no overlap between tiles, so each full-size image is split into n small tiles, where:

n = (new_w / a) × (new_h / b)

wherein new_w and new_h are the width and height of the image after padding, and a and b are the width and height of the image block, respectively;
further, the values of new_w and new_h are determined according to the following rules:
if w% a=0, new_w=w; w is the width of the original image and% is the remainder operation.
else new_w= (w/a+1) a; division is floor division, i.e. the result of the division operation takes only the integer part of the quotient;
in the same way, the processing method comprises the steps of,
if h% b=0, new_h=h; h is the width of the original image;
else new_h=(h/b+1)*b。
i.e. the manner of determination is as follows: if w% a=0, then new_w=w is determined; if w% a=0 does not hold, new_w= (w/a+1) ×a, where w/a is floor division; if h% b=0, then new_h=h is determined; if h% b=0 does not hold, then new_h= (h/b+1) b is determined.
The 20 × n tiles obtained after this processing are put into an array in sequence.
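The padding and tiling rules above can be sketched as follows. This is an illustrative sketch, not the patented implementation: zero padding of the bottom/right edges and row-major tile order are assumptions, and the function names are hypothetical.

```python
import numpy as np

def padded_size(w, a):
    # new_w = w if w % a == 0, otherwise (w // a + 1) * a (floor division)
    return w if w % a == 0 else (w // a + 1) * a

def split_into_tiles(img, a, b):
    """img: array of shape (h, w[, channels]); a, b: tile width and height."""
    h, w = img.shape[:2]
    new_w, new_h = padded_size(w, a), padded_size(h, b)
    pad = [(0, new_h - h), (0, new_w - w)] + [(0, 0)] * (img.ndim - 2)
    padded = np.pad(img, pad)                 # zero-pad bottom/right edges
    tiles = [padded[y:y + b, x:x + a]         # left-to-right, top-to-bottom
             for y in range(0, new_h, b)
             for x in range(0, new_w, a)]
    assert len(tiles) == (new_w // a) * (new_h // b)   # n tiles per image
    return tiles
```

For example, an 11 × 10 image with 4 × 4 tiles is padded to 12 × 12 and split into n = 3 × 3 = 9 tiles.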
S202, whitening processing is performed on the segmented target fundus image to obtain a target fundus image block.
The target fundus image after the blocking processing refers to the corresponding tiles. For the specific method of the whitening processing, refer to the description above; it is not repeated here.
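A minimal sketch of the whitening step described earlier, assuming it means subtracting each channel's pixel mean and dividing by its standard deviation per tile; the `eps` guard is an added assumption to protect constant channels.

```python
import numpy as np

def whiten(tile, eps=1e-8):
    """tile: array of shape (h, w, channels). Returns the whitened tile."""
    mean = tile.mean(axis=(0, 1), keepdims=True)  # per-channel pixel mean
    std = tile.std(axis=(0, 1), keepdims=True)    # per-channel standard deviation
    return (tile - mean) / (std + eps)            # eps guards constant channels
```

After whitening, each channel of the tile has approximately zero mean and unit variance.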
S203, the target fundus image block is input into the constructed preset symmetrical full convolution neural network model to obtain a fundus blood vessel segmentation result for each pixel of the target fundus image block.
The constructed preset symmetrical full convolution neural network model may be the preset symmetrical full convolution neural network model constructed by any of the embodiments above.
S204, the fundus blood vessel segmentation results of each pixel of the target fundus image blocks are re-stitched to obtain the fundus blood vessel segmentation result of the target fundus image.
The 20 × n feature maps output by the preset symmetrical full convolution neural network model are re-stitched according to the size of the original images to obtain the fundus blood vessel segmentation result of the target fundus image.
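The re-stitching step can be sketched as the inverse of the tiling in S201. This is an assumption-based illustration (row-major tile order, padding cropped away); the function name is hypothetical.

```python
import numpy as np

def stitch_tiles(tiles, h, w):
    """tiles: list of (b, a) arrays in left-to-right, top-to-bottom order;
    h, w: height and width of the original (unpadded) image."""
    b, a = tiles[0].shape[:2]
    cols = -(-w // a)                    # ceil(w / a): tiles per row
    rows = -(-h // b)                    # ceil(h / b): tiles per column
    full = np.block([[tiles[r * cols + c] for c in range(cols)]
                     for r in range(rows)])
    return full[:h, :w]                  # crop away the padded border
```

Given the tiles of one image, `stitch_tiles(tiles, h, w)` returns an h × w segmentation map aligned with the original image.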
In this embodiment of the method, the fundus blood vessel segmentation result is obtained through the preset symmetrical full convolution neural network model; because the accuracy and generalization capability of that model are improved, the accuracy and precision of the fundus blood vessel segmentation result are improved accordingly.
Fig. 16 is a schematic block diagram of a symmetrical full convolution neural network model construction device provided by an embodiment of the present invention. As shown in fig. 16, the apparatus includes units for executing the symmetrical full convolution neural network model construction method described above. Specifically, as shown in fig. 16, the apparatus 30 includes a blocking processing unit 301, a whitening processing unit 302, and a training unit 303.
A blocking processing unit 301 for performing a blocking process on the original fundus image.
In an embodiment, the blocking processing unit 301 includes a scale determining unit, a tile size determining unit, and an image blocking processing unit. The scale determining unit is configured to determine the data scale. The tile size determining unit is configured to determine, according to the data scale, the number and the size of the blocks into which each original fundus image needs to be divided. The image blocking processing unit is configured to randomly segment the original fundus image and a preset standard image that has undergone fundus blood vessel segmentation according to the determined tile number and size.
A whitening processing unit 302 for performing whitening processing on the segmented original fundus image to obtain an original fundus image block.
In one embodiment, the whitening processing unit 302 includes a mean variance calculating unit and a mean variance processing unit. The mean variance calculating unit is configured to calculate the mean and variance of the pixels in each channel of the original fundus image after the blocking processing. The mean variance processing unit is configured to subtract, from each pixel value in each channel of the original fundus image after the blocking processing, the pixel mean of that channel, and to divide the result by the standard deviation of that channel, so as to obtain an original fundus image block.
The training unit 303 is configured to input the original fundus image blocks into a preset symmetrical full convolution neural network for training to obtain a preset symmetrical full convolution neural network model, wherein each hidden layer in the model processes the feature map input to that layer and, at the same time, processes all feature maps output before that layer, so that the input is an original fundus image block and the output is a fundus blood vessel segmentation result for each pixel of that block.
In an embodiment, as shown in fig. 16, the symmetrical full convolution neural network model building apparatus further includes an image augmentation unit 301a. The image augmentation unit 301a is configured to acquire a fundus image in a preset data set, and perform data augmentation processing on the acquired fundus image to obtain an original fundus image.
In an embodiment, the image augmentation unit 301a includes a rotation unit, a correction adjustment unit, and an original image determination unit. The rotation unit is configured to rotate each fundus image in the preset dataset by an angle. The correction adjustment unit is configured to adjust the brightness of each rotated fundus image by Gamma correction. The original image determination unit is configured to take each brightness-adjusted fundus image, together with the fundus images in the preset dataset, as the original fundus images.
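The augmentation the units above describe can be sketched as follows. This is an illustrative sketch only: the rotation here is a simple 90-degree stand-in, the gamma value is an arbitrary example, and the convention that gamma < 1 brightens a [0, 1]-scaled image is an assumption about the Gamma correction used.

```python
import numpy as np

def rotate90(img, times=1):
    # Stand-in rotation: multiples of 90 degrees around the image plane.
    return np.rot90(img, k=times, axes=(0, 1))

def gamma_correct(img, gamma=0.8):
    """img: float array scaled to [0, 1]; gamma < 1 brightens, > 1 darkens."""
    return np.clip(img, 0.0, 1.0) ** gamma
```

For example, a pixel value of 0.25 becomes 0.25 ** 0.5 = 0.5 under gamma = 0.5, so dark fundus regions are lifted while the image geometry is varied by the rotation.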
In one embodiment, as shown in fig. 17, the training unit 303 includes a sample acquisition unit 3031, a downsampling unit 3032, an upsampling unit 3033, an output unit 3034, a calculation unit 3035, a judgment unit 3036, an updating unit 3037, and a model determination unit 3038. The sample acquisition unit 3031 is configured to randomly extract a preset proportion of fundus image blocks from the original fundus image blocks as training samples. The downsampling unit 3032 is configured to input the acquired training samples into a plurality of downsampling cycle units in the preset symmetrical full convolution neural network for processing, wherein each downsampling cycle unit corresponds to a hidden layer of the network; each downsampling cycle unit performs convolution processing on the feature map input to its layer, performs convolution processing on all feature maps output before its layer, and performs pooling processing on all the convolved feature maps. The upsampling unit 3033 is configured to input the feature maps processed by the plurality of downsampling cycle units into upsampling cycle units symmetrical to the downsampling cycle units for processing, wherein each upsampling cycle unit corresponds to a hidden layer of the network; each upsampling cycle unit performs upsampling processing on the feature map input to its layer, performs convolution processing on the upsampled feature map, and performs convolution processing on all feature maps output before its layer. The output unit 3034 is configured to input the feature maps processed by the upsampling cycle units into an output layer of the network for processing, so as to obtain a predicted value corresponding to each pixel in the training sample.
The calculation unit 3035 is configured to calculate an error from the predicted value corresponding to each pixel in the training sample and the real label of each pixel. The judgment unit 3036 is configured to judge whether the error has reached a minimum. The updating unit 3037 is configured, if the error has not reached the minimum, to update the network parameters of the preset symmetrical full convolution neural network through a gradient descent algorithm, to take the symmetrical full convolution network with the updated parameters as the preset symmetrical full convolution neural network, and then to trigger the sample acquisition unit 3031 again. The model determination unit 3038 is configured to take the trained symmetrical full convolution neural network as the preset symmetrical full convolution neural network model if the error has reached the minimum.
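The control flow of units 3035 to 3038 (compute an error, check whether it has stopped decreasing, otherwise update the parameters by gradient descent) can be sketched on a toy problem. The quadratic loss, learning rate, and tolerance below are stand-ins, not the patented network.

```python
def gradient_descent(loss, grad, p, lr=0.1, tol=1e-9, max_steps=10_000):
    """Repeat: compute the error (unit 3035); if it no longer decreases
    within tol, stop (units 3036/3038); otherwise apply a gradient descent
    update (unit 3037) and iterate again."""
    prev = float("inf")
    for _ in range(max_steps):
        err = loss(p)
        if prev - err < tol:        # error has reached a minimum: stop
            break
        prev = err
        p -= lr * grad(p)           # gradient descent parameter update
    return p, err
```

Minimizing the toy loss (p - 3)^2 from p = 0 converges to p close to 3, illustrating the stop-or-update loop.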
The downsampling unit 3032 includes a plurality of downsampling cycle units, the first of which receives the acquired training samples input into the preset symmetrical full convolution neural network for processing. Each downsampling cycle unit includes a downsampling first convolution unit, a first superposition unit, a first normalization unit, a first activation unit, a downsampling second convolution unit, and a pooling unit. The downsampling first convolution unit is configured to input the acquired training samples into the downsampling first convolution layer of the downsampling cycle unit for convolution processing. The first superposition unit is configured to acquire all feature maps output before the layer and superpose the acquired feature maps. The first normalization unit is configured to normalize the superposed feature maps. The first activation unit is configured to activate the normalized feature maps with an activation function. The downsampling second convolution unit is configured to input the activated feature maps into the downsampling second convolution layer of the downsampling cycle unit for convolution processing. The pooling unit is configured to input the feature maps processed by the downsampling first convolution layer and the downsampling second convolution layer into the downsampling pooling layer of the downsampling cycle unit for pooling processing, thereby completing the processing of the downsampling cycle unit.
The upsampling unit 3033 includes a plurality of upsampling cycle units, which receive the feature maps processed by the plurality of downsampling cycle units for processing. Each upsampling cycle unit includes an upsampling processing unit, an upsampling first convolution unit, a second superposition unit, a second normalization unit, a second activation unit, and an upsampling second convolution unit. The upsampling processing unit is configured to upsample the acquired feature maps. The upsampling first convolution unit is configured to input the upsampled feature maps into the upsampling first convolution layer of the upsampling cycle unit for convolution processing. The second superposition unit is configured to acquire all feature maps output before the layer and superpose the acquired feature maps. The second normalization unit is configured to normalize the superposed feature maps. The second activation unit is configured to activate the normalized feature maps with an activation function. The upsampling second convolution unit is configured to input the activated feature maps into the upsampling second convolution layer of the upsampling cycle unit for convolution processing, thereby completing the processing of the upsampling cycle unit.
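The channel and spatial-size bookkeeping of one downsampling cycle unit and its symmetric upsampling cycle unit can be sketched as follows. This is an assumption-laden illustration, not the patented network: same-padding convolutions, 2 × 2 pooling, 2 × upsampling, 32 output channels, and a 48 × 48 tile are all arbitrary choices; feature maps are tracked only as (channels, height, width) tuples.

```python
def conv(shape, out_channels):
    # same-padding convolution: spatial size unchanged, channel count replaced
    _, h, w = shape
    return (out_channels, h, w)

def concat(shapes):
    # superposition of feature maps along the channel axis
    assert all(s[1:] == shapes[0][1:] for s in shapes)
    return (sum(s[0] for s in shapes),) + shapes[0][1:]

def pool(shape):
    c, h, w = shape
    return (c, h // 2, w // 2)       # 2x2 pooling halves height and width

def upsample(shape):
    c, h, w = shape
    return (c, h * 2, w * 2)         # upsampling doubles height and width

x = (3, 48, 48)                      # an RGB image tile
earlier = []                         # all feature maps output before the layer

# downsampling cycle unit: first convolution, superposition of earlier outputs,
# normalization/activation (shape-preserving), second convolution, pooling
f = conv(x, 32)
f = concat([f] + earlier) if earlier else f
f = conv(f, 32)
earlier.append(f)
d = pool(f)
assert d == (32, 24, 24)

# symmetric upsampling cycle unit: upsample, first convolution, superposition
# of earlier outputs, normalization/activation, second convolution
u = conv(upsample(d), 32)
u = concat([u] + earlier)            # channels add up: 32 + 32 = 64
u = conv(u, 32)
assert u == (32, 48, 48)
```

The superposition step is where each layer sees all earlier outputs, which is the dense connectivity credited above with easing gradient flow.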
Fig. 18 is a schematic block diagram of a fundus image vessel segmentation apparatus provided in an embodiment of the present invention. As shown in fig. 18, the apparatus includes units for executing the fundus image vessel segmentation method described above. Specifically, as shown in fig. 18, the apparatus 40 includes a blocking processing unit 401, a whitening processing unit 402, a model using unit 403, and a stitching unit 404.
A blocking processing unit 401 for performing blocking processing on the target fundus image.
A whitening processing unit 402 for performing whitening processing on the target fundus image after the blocking processing to obtain a target fundus image block.
The model using unit 403 is configured to input the target fundus image block into the constructed preset symmetric full convolution neural network model, so as to obtain a fundus blood vessel segmentation result of each pixel of the target fundus image block.
A stitching unit 404 configured to re-stitch the fundus blood vessel segmentation results of each pixel of the target fundus image blocks to obtain the fundus blood vessel segmentation result of the target fundus image.
The above-described apparatus may be implemented in the form of a computer program which is executable on a computer device as shown in fig. 19.
Fig. 19 is a schematic block diagram of a computer device according to an embodiment of the present invention. The device may be a terminal, such as a mobile terminal, a PC, an iPad, or the like. The device 50 includes a processor 502, a memory, and a network interface 503 connected by a system bus 501, where the memory may include a non-volatile storage medium 504 and an internal memory 505.
The non-volatile storage medium 504 may store an operating system 5041 and a computer program 5042. The computer program 5042 stored in the nonvolatile storage medium, when executed by the processor 502, can implement the method for constructing the symmetrical full convolution neural network model described above. The processor 502 is used to provide computing and control capabilities to support the operation of the overall device. The internal memory 505 provides an environment for the execution of a computer program in a non-volatile storage medium that, when executed by the processor 502, causes the processor 502 to perform the symmetric full convolutional neural network model building method described above. The network interface 503 is used for network communication. It will be appreciated by persons skilled in the art that the structure shown in fig. 19 is merely a block diagram of a portion of the structure associated with the present inventive arrangements and does not constitute a limitation of the apparatus to which the present inventive arrangements are applied, and that a particular apparatus may include more or less components than those shown in the drawings, or may combine certain components, or have a different arrangement of components.
Wherein the processor 502 is configured to execute a computer program stored in a memory to implement any embodiment of the method for constructing a symmetric full convolutional neural network model.
Another embodiment of the invention also provides a computer device. In this embodiment, the device is likewise a terminal, such as a mobile terminal, a PC, an iPad, or the like, and has the same structure as the computer device shown in fig. 19. It differs from the computer device of fig. 19 in that, when the computer program stored in its non-volatile storage medium is executed by the processor 502, any of the embodiments of the fundus image vessel segmentation method described above is implemented.
It should be appreciated that in an embodiment of the invention, the processor 502 may be a central processing unit (Central Processing Unit, CPU), or may be another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Those skilled in the art will appreciate that all or part of the flow of the methods of the above embodiments may be accomplished by a computer program instructing the relevant hardware. The computer program may be stored in a storage medium, which may be a computer-readable storage medium. The computer program is executed by at least one processor in the computer system to implement the flow steps of the method embodiments described above.
Accordingly, the present invention also provides a storage medium. The storage medium may be a computer readable storage medium. The storage medium stores a computer program which, when executed by a processor, implements any of the embodiments of the symmetric full convolution neural network model construction methods described previously.
In another embodiment of the present invention, there is also provided a storage medium, which may be a computer-readable storage medium storing a computer program which, when executed by a processor, implements any of the embodiments of the fundus image vessel segmentation method described above.
The storage medium may be a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a magnetic disk, an optical disk, or any other computer-readable storage medium that can store program code.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus, device, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is merely a logical function division, and other divisions may be used in practice. Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working procedures of the apparatus, device, and units described above may refer to the corresponding procedures in the foregoing method embodiments, and are not repeated here. While the invention has been described with reference to certain preferred embodiments, those skilled in the art may make various changes and substitutions of equivalents without departing from the scope of the invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.