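WO2021159643A1 - Method and device for detecting optic cup and optic disc positioning points based on eye OCT images - Google Patents
Method and device for detecting optic cup and optic disc positioning points based on eye OCT images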
- Publication number
- WO2021159643A1 (PCT/CN2020/093585)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- optic
- cup
- eye
- oct image
- optic disc
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10101—Optical tomography; Optical coherence tomography [OCT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30041—Eye; Retina; Ophthalmic
Definitions
- This application belongs to the field of image processing technology, and in particular relates to a method, a device, a terminal device, and a computer-readable storage medium for detecting a positioning point of an optic cup and an optic disc based on an OCT image of an eye.
- OCT: optical coherence tomography
- Optic disc morphology evaluation parameters are very important indicators in ophthalmological diagnosis.
- Optic disc morphology evaluation parameters include, but are not limited to, optic disc area, optic cup area, rim area, vertical rim area, horizontal rim volume, average cup to disc ratio (Cup to Disc Ratio, CDR), horizontal and vertical CDR, etc.
- the embodiments of the present application provide a method, a device, a terminal device, and a computer-readable storage medium for detecting positioning points of the optic cup and optic disc based on an OCT image of the eye.
- this detection scheme addresses the low accuracy and low efficiency of optic cup and optic disc positioning point detection.
- in a first aspect, an embodiment of the present application provides a method for detecting positioning points of the optic cup and optic disc based on an OCT image of the eye, including: acquiring an OCT image of the eye; and detecting the OCT image of the eye using a preset detection model to obtain the coordinates of two anchor points of the optic cup and the coordinates of two anchor points of the optic disc in the OCT image of the eye. The detection model includes a first network branch and a second network branch: the first network branch is used to extract a plurality of feature maps of different scales from the OCT image of the eye, and the second network branch is used to extract, according to those feature maps, the coordinates of the two anchor points of the optic cup and the coordinates of the two anchor points of the optic disc.
- an embodiment of the present application provides an optic cup and optic disc positioning point detection device based on an OCT image of the eye, including: an acquisition module for acquiring an OCT image of the eye; and a detection module for detecting the OCT image of the eye using a preset detection model to obtain the coordinates of two anchor points of the optic cup and the coordinates of two anchor points of the optic disc in the OCT image of the eye;
- the detection model includes a first network branch and a second network branch;
- the first network branch is used to extract a plurality of feature maps of different scales from the eye OCT image;
- the second network branch is used to extract, according to the plurality of feature maps of different scales, the coordinates of the two anchor points of the optic cup and the coordinates of the two anchor points of the optic disc in the eye OCT image.
- an embodiment of the present application provides a terminal device, including: a memory, a processor, and a computer program stored in the memory and runnable on the processor. When the processor executes the computer program, it implements the following: obtaining an OCT image of the eye; and using a preset detection model to detect the OCT image of the eye to obtain the coordinates of two positioning points of the optic cup and the coordinates of two positioning points of the optic disc in the OCT image of the eye;
- the detection model includes a first network branch and a second network branch;
- the first network branch is used to extract multiple feature maps of different scales from the eye OCT image;
- the second network branch is used to extract, according to the multiple feature maps of different scales, the coordinates of two positioning points of the optic cup and the coordinates of two positioning points of the optic disc in the OCT image of the eye.
- an embodiment of the present application provides a computer-readable storage medium that stores a computer program that implements the method described in the first aspect when the computer program is executed by a processor.
- the embodiments of the present application provide a computer program product, which when the computer program product runs on a terminal device, causes the terminal device to execute the method described in the first aspect.
- the optic cup and optic disc positioning points of the OCT image of the eye are detected through a preset detection model.
- on the one hand, the positioning point detection result is obtained directly by passing the OCT image of the eye through the detection model, which greatly improves detection efficiency;
- on the other hand, because the detection model extracts multiple features of different scales from the OCT image of the eye, positioning point detection of the optic cup and the optic disc is realized more accurately.
- FIG. 1 is a schematic flowchart of a method for detecting positioning points of an optic cup and an optic disc based on an OCT image of an eye according to an embodiment of the present application.
- FIG. 2 is a schematic structural diagram of a detection model used in a method for detecting positioning points of an optic cup and an optic disc based on an OCT image of an eye provided by an embodiment of the present application.
- FIG. 3 is a schematic flow chart of preprocessing the original ocular OCT image in the method for detecting positioning points of the optic cup and optic disc based on the ocular OCT image provided by an embodiment of the present application.
- FIG. 4 is a schematic diagram of marking an OCT image of the eye in the method for detecting positioning points of the optic cup and optic disc based on the OCT image of the eye provided by an embodiment of the present application.
- FIG. 5 is a schematic structural diagram of a first network branch adopted in a method for detecting positioning points of an optic cup and an optic disc based on an OCT image of an eye provided by an embodiment of the present application.
- FIG. 6 is a schematic structural diagram of module 1 of the first network branch adopted in the method for detecting positioning points of the optic cup and optic disc based on the OCT image of the eye provided by an embodiment of the present application.
- FIG. 7 is a schematic structural diagram of module 2 of the first network branch adopted in the method for detecting positioning points of the optic cup and optic disc based on the OCT image of the eye according to an embodiment of the present application.
- FIG. 8 is a schematic structural diagram of module 3 of the first network branch adopted in the method for detecting positioning points of the optic cup and optic disc based on the OCT image of the eye according to an embodiment of the present application.
- FIG. 9 is a schematic structural diagram of the module 4 of the first network branch adopted in the method for detecting positioning points of the optic cup and optic disc based on the OCT image of the eye provided by an embodiment of the present application.
- FIG. 10 is a schematic structural diagram of a second network branch used in a method for detecting positioning points of an optic cup and an optic disc based on an OCT image of an eye provided by an embodiment of the present application.
- FIG. 11 is a schematic diagram of the structure of the first sub-network of the second network branch used in the method for detecting the positioning point of the optic cup and optic disc based on the OCT image of the eye provided by an embodiment of the present application.
- FIG. 12 is a schematic structural diagram of a second sub-network of a second network branch used in a method for detecting positioning points of the optic cup and optic disc based on an OCT image of an eye provided by an embodiment of the present application.
- FIG. 13 is a schematic diagram of the structure of the attention module of the second sub-network of the second network branch adopted in the method for detecting the positioning point of the optic cup and optic disc based on the OCT image of the eye according to an embodiment of the present application.
- FIG. 14 is a schematic diagram of the cup ellipse and the optic disc ellipse obtained in the method for detecting the positioning point of the optic cup and optic disc based on the OCT image of the eye provided by an embodiment of the present application.
- FIG. 15 is a schematic structural diagram of an optic cup and optic disc positioning point detection device based on an OCT image of an eye provided by an embodiment of the present application.
- FIG. 16 is a schematic structural diagram of a terminal device to which a method for detecting positioning points of an optic cup and an optic disc based on an OCT image of an eye provided by an embodiment of the present application is applicable.
- the term “if” can be construed, depending on the context, as “when”, “once”, “in response to determining”, or “in response to detecting”.
- similarly, the phrase “if determined” or “if [described condition or event] is detected” can be interpreted, depending on the context, as “once determined”, “in response to determination”, “once [described condition or event] is detected”, or “in response to detection of [described condition or event]”.
- Optic disc morphology evaluation parameters are very important indicators in ophthalmological diagnosis, and the positioning detection of the optic cup and optic disc in the OCT image of the eye is the basis for obtaining the optic disc morphological evaluation parameters. Therefore, the embodiments of the present application provide a method for detecting the positioning points of the optic cup and the optic disc based on the OCT image of the eye, so as to realize the accurate and efficient detection of the positioning points of the optic cup and the optic disc in the OCT image of the eye.
- the method for detecting positioning points of the optic cup and optic disc based on the OCT image of the eye provided by the embodiments of the present application can also be applied in the field of digital medicine, for disease risk assessment and the establishment of electronic patient information files, which helps realize medical informatization.
- it provides accurate, efficient, and highly practical diagnostic opinions for ophthalmic diagnosis.
- FIG. 1 shows an implementation flow chart of a method for detecting positioning points of an optic cup and an optic disc based on an OCT image of an eye provided by an embodiment of the present application.
- the method is applied to terminal equipment.
- the optic cup and optic disc positioning point detection method based on eye OCT images provided by the embodiments of this application can be applied to ophthalmic OCT devices, mobile phones, tablets, wearable devices, vehicle-mounted devices, augmented reality (AR)/virtual reality (VR) devices, laptops, ultra-mobile personal computers (UMPC), netbooks, and personal digital assistants (PDA).
- AR: augmented reality
- VR: virtual reality
- UMPC: ultra-mobile personal computer
- PDA: personal digital assistant
- the embodiments of this application do not impose any restrictions on the specific types of terminal devices.
- the method includes step S110 to step S130.
- the specific implementation principle of each step is as follows.
- S110 Acquire an OCT image of the eye.
- the OCT image of the eye is the image on which optic cup and optic disc positioning point detection is to be performed; it may be an original frame of an eye OCT image.
- the OCT image of the eye may be obtained by an OCT device scanning the eye of the subject under test in real time.
- the eye OCT image may be obtained by the terminal device from the OCT device in real time, or it may be a pre-stored eye OCT image obtained from the internal or external memory of the terminal device.
- the OCT device collects the OCT image of the subject's eye in real time and sends it to the terminal device, and the terminal device obtains the OCT image.
- alternatively, the OCT device collects the OCT image of the subject's eye and sends it to the terminal device;
- the terminal device first stores the OCT image in a database and then obtains the OCT image of the subject's eye from the database.
- the terminal device obtains the OCT image of the eye and then directly performs the subsequent step S120, that is, it detects the positioning points of the optic cup and the optic disc in the OCT image of the eye.
- alternatively, after acquiring the OCT image of the eye, the terminal device first crops the image to a preset size, such as 512*512, and then proceeds to step S120, that is, it detects the positioning points of the optic cup and optic disc in the preprocessed OCT image of the eye.
- when the user wants to detect the optic cup and optic disc positioning points of a selected frame of an eye OCT image, the user enables the positioning point detection function of the terminal device by clicking a specific physical button and/or virtual button; the terminal device then automatically processes the selected frame according to steps S110 to S120 to obtain the positioning point detection result.
- alternatively, the user first enables the positioning point detection function by clicking a specific physical button and/or virtual button and then selects a frame of an eye OCT image; the terminal device then automatically processes that image according to steps S110 to S120 to obtain the positioning point detection result.
- S120 Detect the ocular OCT image using a preset detection model, and obtain two positioning point coordinates of the optic cup and two positioning point coordinates of the optic disc in the ocular OCT image.
- Step S120 is a step of using a preset detection model to perform positioning point detection on an OCT image of the eye, and determining two positioning point coordinates of the optic cup and two positioning point coordinates of the optic disc in the eye OCT image.
- the detection model includes a first network branch and a second network branch.
- the first network branch is used to extract multiple feature maps of different scales from the eye OCT image,
- and the second network branch is used to extract, according to the multiple feature maps of different scales, the coordinates of the two anchor points of the optic cup and the coordinates of the two anchor points of the optic disc in the eye OCT image.
- the detection model may be a deep learning network model
- the deep learning network model may be a deep learning network model based on machine learning technology in artificial intelligence.
- when the OCT image of the eye is input to the deep learning network model, the model outputs the coordinates of the two anchor points of the optic cup and the coordinates of the two anchor points of the optic disc in the OCT image of the eye.
- the training process of the detection model includes: obtaining a sample data set that includes a plurality of sample images, each of which is an eye OCT sample image marked with optic cup and optic disc positioning points; and using the sample data set to train the key point detection model, adjusting the weights of the model during training until the output of the model with adjusted weights meets a preset condition or the number of training iterations reaches a preset number, at which point training stops.
- each sample image is an OCT sample image of the eye marked with the positioning points of the optic cup and optic disc.
- the sample image is an eye OCT image obtained by preprocessing the original eye OCT image and then labeling the optic cup and optic disc positioning points.
- preprocessing includes but is not limited to operations such as interpolation and truncation.
- the original OCT image usually obtained is 1024 pixels (corresponding to an actual 6 mm) * 768 pixels (corresponding to an actual 3.01 mm).
- first, the OCT image is interpolated to 1200*462, so that 1 pixel represents 5 μm (micrometers), which is convenient for labeling; then 200 pixels are truncated from each of the two sides, so the preprocessed OCT image has a resolution of 800*462.
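The size bookkeeping of this preprocessing step can be checked with a short sketch. It is only an illustration under stated assumptions: the helper names `pixel_pitch_um` and `truncate_sides` are hypothetical, the image is a plain row-major list of rows, and the interpolation itself is not implemented (a dummy 1200*462 array stands in for the interpolated image).

```python
def pixel_pitch_um(extent_mm, pixels):
    """Physical size of one pixel in micrometers (hypothetical helper)."""
    return extent_mm * 1000.0 / pixels


def truncate_sides(image, margin):
    """Drop `margin` columns from the left and the right of a row-major image."""
    return [row[margin:len(row) - margin] for row in image]


# Width: 6 mm spread over the 1200 interpolated columns.
pitch = pixel_pitch_um(6.0, 1200)

# Dummy image with 462 rows and 1200 columns, standing in for the interpolated scan.
interpolated = [[0] * 1200 for _ in range(462)]
preprocessed = truncate_sides(interpolated, 200)
size = (len(preprocessed[0]), len(preprocessed))  # (width, height)
```

With these numbers the horizontal pitch comes out at 5 μm per pixel and the truncated image is 800 pixels wide and 462 pixels high, matching the figures in the text.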
- the labeling result includes four anchor points: two optic disc anchor point coordinates and two optic cup anchor point coordinates.
- the coordinates of the two optic disc anchor points are: optic disc anchor point 1, the coordinates are (x1, y1); optic disc anchor point 2, the coordinates are (x2, y2).
- the coordinates of the two optic cup anchor points are: optic cup anchor point 1, the coordinates are (x3, y3); optic cup anchor point 2, the coordinates are (x4, y4).
- the labeling complies with clinical norms.
- the optic disc anchor points are the end points of the retinal pigment epithelium (RPE). The line connecting the optic cup anchor points is parallel to the line connecting the optic disc anchor points and intersects the inner limiting membrane (ILM) at the optic cup anchor points (x3, y3) and (x4, y4). The distance d between the optic cup line and the optic disc line is the clinically used 110 μm; since 1 pixel represents 5 μm, this distance is 22 pixels.
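The geometric part of this labeling rule can be sketched as follows. This is a minimal illustration, not the labeling tool itself: `um_to_pixels` and `offset_line` are hypothetical helpers, the example disc coordinates are made up, and the sign of the offset (which side of the disc line the cup line lies on in image coordinates) is an assumption. Intersecting the shifted line with the ILM boundary to get (x3, y3) and (x4, y4) is not shown, since it needs the ILM contour.

```python
import math


def um_to_pixels(distance_um, um_per_pixel=5.0):
    """Convert a physical distance to pixels at the given pixel pitch."""
    return distance_um / um_per_pixel


def offset_line(p1, p2, offset_px):
    """Shift the line p1-p2 by offset_px along its unit normal (parallel line)."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("the two points must be distinct")
    nx, ny = -dy / length, dx / length
    shift = (nx * offset_px, ny * offset_px)
    return ((p1[0] + shift[0], p1[1] + shift[1]),
            (p2[0] + shift[0], p2[1] + shift[1]))


d_px = um_to_pixels(110.0)  # 110 um at 5 um per pixel

# Made-up disc anchor points (x1, y1), (x2, y2) on a horizontal line.
disc_line = ((100.0, 300.0), (500.0, 300.0))
# Assumption: the cup line lies toward smaller y (the top of the image).
cup_line = offset_line(disc_line[0], disc_line[1], -d_px)
```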
- the labeled sample images are stored in a preset database as a sample data set.
- a sample data set is obtained from a preset database, the sample image is used as input, and the annotation result in the sample image is used as the target anchor point, and a detection model of the optic cup and optic disc anchor point is established.
- the weights of the model are adjusted during training; when the output of the model with adjusted weights meets the preset accuracy threshold, or the number of iterations reaches the preset iteration threshold, the model training process is stopped.
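The two stopping conditions can be expressed as a small loop. This is a sketch under stated assumptions: `train_until` and `fake_step` are hypothetical names, and `step_fn` stands in for one weight update followed by an accuracy evaluation, neither of which the text specifies in detail.

```python
def train_until(step_fn, accuracy_threshold, max_iterations):
    """Call step_fn(iteration) until its returned accuracy reaches the
    threshold or the iteration cap is hit; report which condition fired."""
    accuracy = 0.0
    for iteration in range(1, max_iterations + 1):
        accuracy = step_fn(iteration)  # one weight update plus evaluation
        if accuracy >= accuracy_threshold:
            return iteration, accuracy, "threshold"
    return max_iterations, accuracy, "max_iterations"


def fake_step(iteration):
    """Toy stand-in for a training step: accuracy grows by 0.1 per iteration."""
    return min(1.0, iteration / 10.0)


stopped_by_accuracy = train_until(fake_step, 0.9, 100)
stopped_by_cap = train_until(fake_step, 0.9, 5)
```

Returning the reason alongside the iteration count makes it easy to tell a converged run from one that simply ran out of iterations.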
- the sample images are divided into a training sample set, a validation sample set, and a test sample set, and a back propagation algorithm is used to train the deep learning network model according to these three sets.
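A simple way to produce the three sets is a shuffled split. The text does not give the split proportions, so the 70/15/15 ratio, the fixed seed, and the helper name `split_samples` below are all assumptions made for illustration.

```python
import random


def split_samples(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle the samples reproducibly and cut them into train/val/test lists.
    The fractions are illustrative defaults, not values from the source."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Sample identifiers stand in for labeled eye OCT images.
train_set, val_set, test_set = split_samples(range(100))
```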
- the above-mentioned labeled sample images can also be stored in a blockchain.
- blockchain storage allows data to be shared between different platforms and also prevents the data from being tampered with.
- Blockchain is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
- the blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
- the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
- the process of training the detection model can be implemented locally on the terminal device, or on other devices that communicate with the terminal device.
- the successfully trained detection model is deployed on the terminal device side, whether it was trained locally or on another device.
- the detection of the position of the optic cup and optic disc on the obtained OCT image of the eye can be realized on the terminal device.
- the OCT image of the eye to be detected obtained in the process of detecting the positioning point of the optic cup and optic disc can also be used to increase the sample size in the sample data set, and to perform further optimization of the detection model on the terminal device or other equipment.
- the preset detection model includes a first network branch and a second network branch.
- the first network branch is used to extract a plurality of feature maps of different scales of the OCT image of the eye.
- the first network branch is an improved Xception network, which is used to extract image information of different scales in the target image.
- the structure of the improved Xception network is shown in Figure 5.
- the first network branch includes cascaded modules module 1a, module2a, module3a, and module4a. The output of module4a is upsampled 4 times (upsample ×4), concatenated (concat) with the output of module2a, and input into module2b; the output of module3a is concatenated with the output of module2b and input into module3b;
- the output of module3b and the output of module4a are concatenated (concat) and input into module4b; the output of module4b is upsampled 4 times (upsample ×4), concatenated (concat) with the output of module2b, and input into module2c. module 1a, module2a, module2b, and module2c output four feature maps with different scales.
- the feature map output by module 1a has a scale of 256 × 256 and 8 channels; the feature map output by module 2a has a scale of 128 × 128 and 48 channels; the feature map output by module 2b has a scale of 64 × 64 and 48 channels; the feature map output by module 2c has a scale of 32 × 32 and 48 channels.
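The "upsample ×4, then concat" pattern used between modules can be illustrated on tiny arrays. This is a sketch, not the patented network: a feature map is represented as a list of channels, each a 2-D list; nearest-neighbour upsampling is an assumption (the text does not name the interpolation method); and `upsample_nearest` and `concat_channels` are hypothetical helpers.

```python
def upsample_nearest(channel, factor):
    """Nearest-neighbour upsampling of one 2-D channel (a list of rows)."""
    out = []
    for row in channel:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def concat_channels(a, b):
    """Concatenate two feature maps along the channel axis."""
    shape_a = (len(a[0]), len(a[0][0]))
    shape_b = (len(b[0]), len(b[0][0]))
    if shape_a != shape_b:
        raise ValueError("spatial sizes must match before concatenation")
    return a + b


fine = [[[0] * 8 for _ in range(8)] for _ in range(2)]  # 2 channels, 8 x 8
coarse = [[[1, 2], [3, 4]]]                             # 1 channel, 2 x 2
merged = concat_channels(fine, [upsample_nearest(c, 4) for c in coarse])
```

After the ×4 upsampling the coarse map has the same spatial size as the fine one, so the two can be stacked channel-wise; the merged map here has 3 channels of size 8 × 8.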
- module 1 including module 1a
- module 2 including module 2a, module2b and module2c
- module 3 including module3a and module3b
- module 4 including module4a and module4b
- FIG. 6 shows the structure diagram of module 1.
- module 1 includes a convolutional layer and a BN (Batch Normalization) layer.
- the activation function behind the BN layer is the ReLU function;
- the convolution kernel of the convolution layer is 3 × 3, the stride is 2 × 2, and the number of channels is 8.
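The effect of the 2 × 2 stride on the spatial size can be computed directly. The sketch below assumes "same" padding and a 512 × 512 input (the example preset size mentioned earlier); the padding mode is not stated in the text, and `conv_output_size` is a hypothetical helper.

```python
import math


def conv_output_size(input_size, stride, kernel=3, padding="same"):
    """Output length of a strided convolution along one spatial axis."""
    if padding == "same":
        return math.ceil(input_size / stride)
    if padding == "valid":
        return (input_size - kernel) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")


# Assumed 512-pixel input side, 3 x 3 kernel, stride 2.
side_same = conv_output_size(512, 2)
side_valid = conv_output_size(512, 2, padding="valid")
```

Under these assumptions the "same" case gives a 256-pixel side, which is consistent with the 256 × 256 scale listed for the output of module 1a.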
- FIG. 7 shows a schematic diagram of the structure of module 2.
- module 2 includes five parts, the first part to the fifth part; the second to fifth parts have the same network structure.
- the first part includes three cascaded convolutional layers (the first, second, and third convolutional layers) and a fourth convolutional layer; the first convolutional layer and the second convolutional layer are each followed by a BN layer and a ReLU activation function.
- the output of the third convolutional layer and the output of the fourth convolutional layer are summed and then input into the second part.
- the second part includes a ReLU activation function and three cascaded convolutional layers; the data input to the second part and the data output by the three convolutional layers are summed and then input into the third part, and so on, until the output of the fifth part is the output of module 2.
- Figure 8 shows a schematic diagram of the structure of module 3
- Figure 9 shows a schematic diagram of the structure of module 4.
- the overall structure of module2, module3, and module4 is similar. The difference lies in the size of the convolution kernel and the number of module repetitions. Please refer to Figure 8-9, which will not be repeated here.
- the first network branch uses the structure of the main module of the original Xception, and the changes to the original Xception include reducing the number of channels, increasing the number of module repetitions, and increasing feature cascading (or aggregation).
- reducing the number of channels greatly reduces the amount of calculation, the system resource usage, and the cost of computing power.
- to compensate, the number of module repetitions is increased on the one hand, and feature cascading is added on the other.
- the channel counts of the original Xception (64, 128, 256, 728) are reduced to 8, 48, 96, and 192 to form a lightweight Xception network.
- reducing the number of channels leads to insufficient feature extraction, so the feature cascade operation is added.
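A rough feel for the saving can be had by counting weights. The sketch below is an illustration only: it counts the weights of one depthwise separable 3 × 3 convolution (the building block Xception is based on) between consecutive channel widths, ignores biases, BN parameters, and module repetitions, and is not a cost model of the actual network. The helper name `separable_conv_weights` is hypothetical.

```python
def separable_conv_weights(c_in, c_out, kernel=3):
    """Weights of a depthwise (kernel x kernel per input channel) plus
    pointwise (1 x 1) convolution, biases ignored."""
    return kernel * kernel * c_in + c_in * c_out


original = [64, 128, 256, 728]   # channel counts of the original Xception
lightweight = [8, 48, 96, 192]   # reduced channel counts from the text


def stage_weights(widths):
    """One separable convolution between each pair of consecutive widths."""
    return [separable_conv_weights(a, b) for a, b in zip(widths, widths[1:])]


original_total = sum(stage_weights(original))
lightweight_total = sum(stage_weights(lightweight))
```

Under this simplified count the reduced widths need roughly a tenth of the weights, which illustrates why the channel reduction lowers the computational load.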
- Feature cascading works as follows: the feature extraction network with the reduced number of channels is copied three times; for ease of description, the copies are called a multi-level network. Each network has multiple convolutional layers, and each layer outputs features with a different resolution; these are called multi-layer features.
- the multi-level networks are connected in series, and the features extracted from each level of the network are transferred to the next level as input, and at the same time, the features of the corresponding level of the previous level are merged together to realize the multiplexing of the features.
- This cascading operation combines features of different resolutions multiple times, fully extracting effective information.
- 1) module1a, module2a, module3a, and module4a belong to a different level from module2b, module3b, and module4b; having multiple networks of different levels makes it possible to fully extract image information of different scales;
- 2) the structure uses multiple concatenations; for example, module2b uses the outputs of module2a and module4a at the same time,
- so that features of different resolutions are merged, which realizes feature multiplexing and makes effective use of network features at different levels.
- the second network branch is used to extract, based on the feature maps of multiple different scales, the coordinates of the two anchor points of the optic cup and the coordinates of the two anchor points of the optic disc in the eye OCT image.
- the second network branch includes a first sub-network and a second sub-network.
- the first sub-network is used to roughly detect the optic cup and optic disc positioning points of the OCT image of the eye.
- the second sub-network is used to precisely detect the cup and disc positioning point of the OCT image of the eye.
- the first sub-network is a global network (GlobalNet) with added feature cascading;
- the second sub-network is a segmentation network (RefineNet) with an added attention mechanism.
- the first sub-network takes the feature maps of different scales output by the first network branch as input and adds feature cascading.
- the global network can locate simple key points by extracting image features.
- FIG. 11 shows a schematic diagram of the structure of the first sub-network.
- the first sub-network includes 7 convolutional layers. The output of module 2c passes through the first convolutional layer (convolution kernel 3 × 3, 256 channels) and 2 times upsampling (upsample), and is then concatenated with the output of module 2b as the input of the second convolutional layer (convolution kernel 3 × 3, 128 channels);
- the output of the second convolutional layer is upsampled 2 times (upsample) and concatenated with the output of module 2a as the input of the third convolutional layer (convolution kernel 3 × 3, 64 channels);
- the output of the third convolutional layer is upsampled 2 times (upsample) and concatenated with the output of module 1a as the input of the fourth convolutional layer (64 channels);
- the outputs of the second convolutional layer, the third convolutional layer and the third convolutional layer respectively pass through a fifth convolutional layer (convolution kernel 1 × 1, 4 channels);
- the three outputs are global_out1, global_out2, and global_out3.
- the second sub-network takes the outputs of the first sub-network at different scales as input, and the features are already highly dense. By adding an attention mechanism, the features can be filtered according to their importance, which can effectively improve the reliability of the final result.
- Fig. 12 shows a schematic diagram of the structure of the second sub-network.
- three convolutional layers of the second sub-network are connected to the three outputs of the first sub-network, respectively: the first convolutional layer (convolution kernel 1 × 1, 128 channels), the second convolutional layer (convolution kernel 1 × 1, 128 channels), and the third convolutional layer (convolution kernel 1 × 1, 256 channels);
- the output of the second convolutional layer is connected to the first attention module;
- the output of the third convolutional layer is connected to the second attention module;
- the output of the second attention module sequentially passes through the fourth convolutional layer (convolution kernel 1 × 1, 128 channels) and the third attention module;
- the output of the first attention module after 2 times upsampling and the output of the first convolutional layer are concatenated (concat), and then input into the fifth convolutional layer (convolution kernel).
- the attention modules comprise the first to third attention modules, whose structure is shown in FIG. 13.
- the attention module includes a global average pooling layer (Global average pooling) and 2 fully connected layers (Dense out); a ReLU activation function is placed between the two fully connected layers, and the second fully connected layer is followed by a Sigmoid activation function.
- the output of the Sigmoid function is structurally adjusted (reshaped) and multiplied with the input data to form the output of the attention module.
- the global average pooling layer averages the feature map globally and outputs a value, that is, a tensor of W*H*D becomes a tensor of 1*1*D.
- This layer performs feature compression along the spatial dimension, so that there is a global receptive field on the feature channel, and the output dimension matches the number of input feature channels.
- the fully connected layers, the activation functions and the reshape step generate a weight for each feature channel through parameters, and these parameters are learned to explicitly model the correlation between feature channels.
- the final multiplication operation is a re-calibration operation.
- the output weights are regarded as the importance of each feature channel after feature selection, and they are applied to the earlier features channel by channel through multiplication, which completes the recalibration of the original features along the channel dimension.
- the attention module in the embodiment of the present application only needs to learn one weight, which is multiplied by the original convolution.
- the attention module uses SE-Net (Squeeze-and-Excitation Networks).
- SE-Net explicitly models the interdependence between feature channels. Rather than introducing a new spatial dimension to fuse feature channels, it adopts a feature recalibration strategy: the importance of each feature channel is learned automatically, and according to this importance useful features are promoted and features of little use to the current task are suppressed. Filtering features by importance in this way can effectively improve the final result.
- the positioning points of the optic cup and the optic disc of the OCT image of the eye are detected through a preset detection model.
- on the one hand, the positioning point detection results of the OCT image of the eye can be obtained directly through the detection model, which greatly improves detection efficiency; on the other hand, because the detection model extracts multiple features of different scales from the OCT image of the eye, the positioning points of the optic cup and optic disc are detected more accurately.
- steps S130 to S160 are further included.
- S140 Determine the length of the optic cup and the length of the optic disc in each OCT image of the eye according to the coordinates of the two positioning points of the optic cup in each OCT image and the coordinates of the two positioning points of the optic disc.
- S150 Form at least three first line segments from the optic cup lengths at at least three different angles, co-center the at least three first line segments, project them onto the same plane, and fit the optic cup ellipse;
- form at least three second line segments from the optic disc lengths at at least three different angles, co-center the at least three second line segments, project them onto the same plane, and fit the optic disc ellipse.
- the OCT images include images at 0 degrees, 45 degrees, 90 degrees, 135 degrees, etc., so the calculated optic cup length and optic disc length are the lengths at each of these angles; for example, the vertical direction is the length at 90 degrees, and the horizontal direction is the length at 0 degrees.
- the optic cup length and optic disc length at 0 degrees, 45 degrees, 90 degrees, and 135 degrees can be obtained first.
- 8 line segments can be constructed. These 8 line segments share the same center, and they are projected onto the same plane, as shown in Figure 14.
- the ellipse parameters can be obtained, so the area of the optic cup can be obtained, that is, the area of the smaller ellipse (the optic cup ellipse).
- the optic disc is similar. Fitting the large outer ellipse shown in Figure 14 can find the optic disc area, that is, the area of the larger ellipse (optic disc ellipse).
- the cup-to-disc area ratio is the ratio of the optic cup area to the optic disc area, that is, the area of the smaller ellipse divided by the area of the larger ellipse; the horizontal cup-to-disc ratio is the ratio of the optic cup length at 0 degrees to the optic disc length at 0 degrees;
- the vertical cup-to-disc ratio is the ratio of the optic cup length at 90 degrees to the optic disc length at 90 degrees.
- the embodiments of the present application can directly obtain the positioning point detection result of the OCT image of the eye through the detection model, which greatly improves the detection efficiency.
- the morphological parameters of the optic cup and optic disc are obtained by fitting.
- the method is simple, efficient and easy to implement.
- because the detection model extracts multiple features of different scales from the OCT image of the eye, the positioning points of the optic cup and optic disc are detected more accurately, which also improves the accuracy of the optic cup and optic disc morphological parameters.
- in addition, based on the optic cup and optic disc positioning points at multiple different angles, more and richer optic cup and optic disc morphological parameters of the ocular OCT image are obtained, which makes the solution of the present application applicable to different scenarios and more adaptable.
- An embodiment of the present application also provides a method for grading glaucoma.
- the method for grading glaucoma uses the foregoing embodiments to obtain the four-dimensional morphological parameters of the optic cup area, the optic disc area, the cup-to-disk area ratio, and the vertical cup-to-disk ratio.
- the five-dimensional GCC features are: upper GCC thickness, lower GCC thickness, average GCC thickness, focal loss volume (FLV) and global loss volume (GLV).
- the four-dimensional RNFL thickness characteristics of RNFL thickness include: upper RNFL thickness, lower RNFL thickness, nasal RNFL thickness, and temporal RNFL thickness.
- the 5-dimensional GCC feature and the 4-dimensional RNFL thickness feature can be directly read from the OCT image acquisition instrument.
- the glaucoma classification model may be a classification model based on machine learning. For example, a decision tree model based on Xgboost.
- the classification results of the glaucoma classification model include: no glaucoma, low-risk, intermediate-risk, and high-risk.
- This example is a four-category model.
- it can also be a two-category model, a three-category model, or a classification model of more categories.
- the embodiments of the present application integrate multiple parameters to improve the accuracy of classification.
- grading based on the glaucoma grading model can complete decision-making within a few seconds, reducing system resource occupation and greatly improving the grading efficiency.
- FIG. 15 shows a structural block diagram of the optic cup and optic disc positioning point detection device based on the ocular OCT image provided by an embodiment of the present application; for ease of description, only the parts related to the embodiments of the present application are shown.
- the device includes: an acquisition module 151, used to acquire an OCT image of the eye; a detection module 152, used to detect the positioning points of the optic cup and optic disc of the OCT image of the eye through a preset detection model.
- on the one hand, the positioning point detection results can be obtained directly from the OCT image of the eye through the detection model, which greatly improves the detection efficiency; on the other hand, because the detection model extracts multiple features of different scales from the eye OCT image, the positioning points of the optic cup and optic disc are detected more accurately.
- FIG. 16 is a schematic structural diagram of a terminal device provided by an embodiment of this application.
- the terminal device 16 of this embodiment includes: at least one processor 160 (only one processor is shown in FIG. 16), a memory 161, and a computer program 162 stored in the memory 161 and executable on the at least one processor 160; the processor 160 implements the steps in the foregoing method embodiments when executing the computer program 162, for example, step S110 to step S120 shown in FIG. 1.
- the terminal device may include, but is not limited to, a processor 160 and a memory 161.
- FIG. 16 is only an example of the terminal device 16 and does not constitute a limitation on the terminal device 16. It may include more or less components than shown in the figure, or a combination of certain components, or different components.
- for example, the terminal device may also include input and output devices, network access devices, buses, and the like.
- the so-called processor 160 may be a central processing unit (Central Processing Unit, CPU), or may be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- the memory 161 may be an internal storage unit of the terminal device 16, for example, a hard disk or a memory of the terminal device 16.
- the memory 161 may also be an external storage device of the terminal device 16, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card (Flash Card) equipped on the terminal device 16.
- the memory 161 may also include both an internal storage unit of the terminal device 16 and an external storage device.
- the memory 161 is used to store the computer program and other programs and data required by the terminal device 16.
- the memory 161 can also be used to temporarily store data that has been output or will be output.
- the embodiments of the present application also provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in each of the foregoing method embodiments can be realized.
- the computer-readable storage medium may be non-volatile or volatile.
- the embodiments of the present application provide a computer program product.
- when the computer program product runs on a mobile terminal, the mobile terminal implements the steps in the foregoing method embodiments.
- if the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the computer program can be stored in a computer-readable storage medium. When executed by the processor, the steps of the foregoing method embodiments can be implemented.
- the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
- the computer-readable medium may include at least: any entity or device capable of carrying computer program code to the photographing device/terminal device, recording medium, computer memory, read-only memory (Read-Only Memory, ROM), and random access memory (Random Access Memory, RAM), electric carrier signal, telecommunications signal, and software distribution medium.
- examples of such media include a USB flash drive, a removable hard disk, a magnetic disk, or an optical disc.
- in some jurisdictions, according to legislation and patent practice, computer-readable media cannot be electrical carrier signals and telecommunication signals.
- the disclosed terminal device and method may be implemented in other ways.
- the terminal device embodiments described above are only illustrative.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Image Analysis (AREA)
- Eye Examination Apparatus (AREA)
Abstract
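This application is applicable to the field of image processing technology and provides a method, an apparatus and a terminal device for detecting positioning points of the optic cup and optic disc based on an ocular OCT image, including: acquiring an ocular OCT image; detecting the ocular OCT image using a preset detection model to obtain the coordinates of two positioning points of the optic cup and the coordinates of two positioning points of the optic disc in the ocular OCT image; the detection model includes a first network branch and a second network branch, the first network branch is used to extract feature maps of multiple different scales from the ocular OCT image, and the second network branch is used to extract, from the feature maps of multiple different scales, the coordinates of the two positioning points of the optic cup and the coordinates of the two positioning points of the optic disc in the ocular OCT image. This application achieves accurate and efficient localization of the optic cup and optic disc in ocular OCT images.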
Description
本申请要求于2020年2月11日提交中国专利局,申请号为2020100872265、发明名称为“基于眼部OCT图像的视杯和视盘定位点检测方法及装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
本申请属于图像处理技术领域,尤其涉及一种基于眼部OCT图像的视杯和视盘定位点检测方法、装置、终端设备及计算机可读存储介质。
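Optical coherence tomography (Optical Coherence Tomography, OCT) is a promising new tomographic imaging technology that has developed rapidly in recent years, with particularly attractive prospects for in vivo detection and imaging of biological tissue. OCT images acquired with this technology are non-invasive, radiation-free, high-resolution and highly sensitive, and can be acquired safely and efficiently, so they are becoming increasingly important in ophthalmic diagnosis.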
视盘形态评估参数是眼科诊断中非常重要的指标。视盘形态评估参数包括但不限于视盘面积、视杯面积、盘沿面积、垂直盘沿面积、水平盘沿容积、平均杯盘比(Cup to Disc Ratio,CDR)、水平和垂直CDR等。
但在目前,发明人意识到,基于OCT图像的视盘形态评估参数的测量大多仍依赖于人工测量以及机器半自动测量。因此,亟需一种基于眼部OCT图像的视盘形态评估参数检测方案。
本申请实施例提供了一种基于眼部OCT图像的视杯和视盘定位点检测方法、装置、终端设备及计算机可读存储介质,提供了一种基于眼部OCT图像的视杯和视盘定位点检测方案,解决对视杯和视盘定位点的准确率和效率低的问题。
第一方面,本申请实施例提供了一种基于眼部OCT图像的视杯和视盘定位点检测方法,包括:获取眼部OCT图像;使用预设的检测模型对所述眼部OCT图像进行检测,得到所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标;所述检测模型包括第一网络分支和第二网络分支,所述第一网络分支用于提取所述眼部OCT图像多个不同尺度的特征图,所述第二网络分支用于根据多个不同尺度的所述特征图提取所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标。
第二方面,本申请实施例提供了一种基于眼部OCT图像的视杯和视盘定位点检测装置,包括:获取模块,用于获取眼部OCT图像;检测模块,用于使用预设的检测模型对所述眼部OCT图像进行检测,得到所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标;所述检测模型包括第一网络分支和第二网络分支,所述第一网络分支用于提取所述眼部OCT图像多个不同尺度的特征图,所述第二网络分支用于根据多个不同尺度的所述特征图提取所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标。
第三方面,本申请实施例提供了一种终端设备,包括:存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序,所述处理器执行所述计算机程序时实现:获取眼部OCT图像;使用预设的检测模型对所述眼部OCT图像进行检测,得到所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标;所述检测模型包括第一网络分支和第二网络分支,所述第一网络分支用于提取所述眼部OCT图像多个不同尺度的特征图,所述第二网络分支用于根据多个不同尺度的所述特征图提取所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标。
第四方面,本申请实施例提供了一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序被处理器执行时实现如第一方面所述的方法。
第五方面,本申请实施例提供了一种计算机程序产品,当计算机程序产品在终端设备上运行时,使得终端设备执行如第一方面所述的方法。
在本申请实施例中,通过预设的检测模型对眼部OCT图像的视杯和视盘定位点进行检测,一方面,直接通过检测模型对眼部OCT图像进行检测就能获得定位点检测结果,大大提高了检测的效率;另一方面,由于检测模型提取了眼部OCT图像多个不同尺度的特征,更加准确地实现了视杯和视盘的定位点检测。
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。
图1是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法的流程示意图。
图2是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法中采用的检测模型的结构示意图。
图3是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法中对原始眼部OCT图像进行预处理的流程示意图。
图4是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法中对眼部OCT图像进行标记的示意图。
图5是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法中采用的第一网络分支的结构示意图。
图6是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法中采用的第一网络分支的module1的结构示意图。
图7是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法中采用的第一网络分支的module2的结构示意图。
图8是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法中采用的第一网络分支的module3的结构示意图。
图9是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法中采用的第一网络分支的module4的结构示意图。
图10是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法中采用的第二网络分支的结构示意图。
图11是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法中采用的第二网络分支的第一子网络的结构示意图。
图12是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法中采用的第二网络分支的第二子网络的结构示意图。
图13是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法中采用的第二网络分支的第二子网络的注意力模块的结构示意图。
图14是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法中获得的视杯椭圆和视盘椭圆的示意图。
图15是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测装置的结构示意图。
图16是本申请一实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法所适用于的终端设备的结构示意图。
为了使本技术领域的人员更好地理解本申请方案,下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚,完整地描述,显然,所描述的实施例仅仅是本申请一部分的实施例,而不是全部的实施例。基于本申请中的实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,所获得的所有其他实施例,都应当属于本申请保护的范围。需要说明的是,在不冲突的情况下,本申请中的实施例及实施例中的特征可以相互组合。
以下描述中,为了说明而不是为了限定,提出了诸如特定系统结构、技术之类的具体细节,以便透彻理解本申请实施例。然而,本领域的技术人员应当清楚,在没有这些具体细节的其它实施例中也可以实现本申请。在其它情况中,省略对众所周知的系统、装置、电路以及方法的详细说明,以免不必要的细节妨碍本申请的描述。
应当理解,当在本申请说明书和所附权利要求书中使用时,术语“包括”指示所描述特征、整体、步骤、操作、元素和/或组件的存在,但并不排除一个或多个其它特征、整体、步骤、操作、元素、组件和/或其集合的存在或添加。
还应当理解,在本申请说明书和所附权利要求书中使用的术语“和/或”是指相关联列出的项中的一个或多个的任何组合以及所有可能组合,并且包括这些组合。
如在本申请说明书和所附权利要求书中所使用的那样,术语“如果”可以依据上下文被解释为“当...时”或“一旦”或“响应于确定”或“响应于检测到”。类似地,短语“如果确定”或“如果检测到[所描述条件或事件]”可以依据上下文被解释为意指“一旦确定”或“响应于确定”或“一旦检测到[所描述条件或事件]”或“响应于检测到[所描述条件或事件]”。
另外,在本申请说明书和所附权利要求书的描述中,术语“第一”、“第二”、“第三”等仅用于区分描述,而不能理解为指示或暗示相对重要性。
在本申请说明书中描述的参考“一个实施例”或“一些实施例”等意味着在本申请的一个或多个实施例中包括结合该实施例描述的特定特征、结构或特点。由此,在本说明书中的不同之处出现的语句“在一个实施例中”、“在一些实施例中”、“在其他一些实施例中”、“在另外一些实施例中”等不是必然都参考相同的实施例,而是意味着“一个或多个但不是所有的实施例”,除非是以其他方式另外特别强调。术语“包括”、“包含”、“具有”及它们的变形都意味着“包括但不限于”,除非是以其他方式另外特别强调。
视盘形态评估参数是眼科诊断中非常重要的指标,而对眼部OCT图像中视杯和视盘进行定位检测是获得视盘形态评估参数的基础。因此,本申请实施例提供一种基于眼部OCT图像的视杯和视盘定位点检测方法,实现对眼部OCT图像中视杯和视盘定位点的准确高效检测。
本申请实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法可同样适用于数字医疗领域,用于疾病风险评估、建立患者电子信息档案,有助于实现医疗信息化,可为眼科诊断提供准确高效的诊断意见,实用性高。
图1示出了本申请实施例提供的一种基于眼部OCT图像的视杯和视盘定位点检测方法的实现流程图。所述方法应用于终端设备。本申请实施例提供的基于眼部OCT图像的视杯和视盘定位点检测方法可以应用于眼科OCT设备、手机、平板电脑、可穿戴设备、车载设备、增强现实(augmented reality,AR)/虚拟现实(virtual reality,VR)设备、笔记本电脑、超级移动个人计算机(ultra-mobile personal computer,UMPC)、上网本、个人数字助理(personal
digital assistant,PDA)、独立的服务器、分布式服务器、服务器集群或云服务器等终端设备上,本申请实施例对终端设备的具体类型不作任何限制。如图1所示,所述方法包括步骤S110至步骤S130。各个步骤的具体实现原理如下。
S110,获取眼部OCT图像。
其中,眼部OCT图像为需要进行视杯和视盘定位点检测的对象,眼部OCT图像可以为一帧原始的眼部OCT图像。
当终端设备为OCT设备时,眼部OCT图像可以为OCT设备实时扫描待测人体的眼部得到的眼部OCT图像。
当终端设备不为OCT设备时,眼部OCT图像可以为终端设备从OCT设备实时获取到的眼部OCT图像,还可以为从终端设备的内部或外部存储器中获取到的预先存储的眼部OCT图像。
在一个非限定性的示例中,OCT设备实时采集待测人体眼部的OCT图像,发送OCT图像给终端设备,终端设备获取OCT图像。
在另一个非限定性的示例中,OCT设备采集待测人体眼部的OCT图像发送给终端设备,终端设备先在数据库中存储该OCT图像,再从数据库中获取该待测人体的眼部OCT图像。
在本申请一些实施例中,终端设备获取眼部OCT图像,在获取眼部OCT图像后,直接进行后续的步骤S120,即对眼部OCT图像中的视杯和视盘定位点进行检测。
在本申请一些实施例中,终端设备获取眼部OCT图像,在获取眼部OCT图像后,先将眼部OCT图像预裁减成预设大小,例如512*512,再进行后续的步骤S120,即对预处理后的眼部OCT图像中的视杯和视盘定位点进行检测。
在本申请一种非限定性使用场景中,当用户想要对某选定的一帧眼部OCT图像进行视杯和视盘定位点检测时,通过点击终端设备特定的物理按键和/或虚拟按键的方式启用终端设备的定位点检测功能,此时,所述终端设备会对选定的该帧眼部OCT图像自动按照步骤S110至步骤S120的过程进行处理,得到定位点检测结果。
在本申请另一种非限定性使用场景中,当用户想要对某一帧眼部OCT图像进行视杯和视盘定位点检测时,可以通过点击特定的物理按键和/或虚拟按键的方式启用终端设备的定位点检测功能,并选定一帧眼部OCT图像,则所述终端设备会对眼部OCT图像自动按照步骤S110至步骤S120的过程进行处理,得到定位点检测结果。
此处可以理解的是,点击按键和选定一帧眼部OCT图像的顺序可以互换,本申请实施例适用但不限于这两种不同的使用场景。
S120,使用预设的检测模型对所述眼部OCT图像进行检测,得到所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标。
步骤S120为使用预设的检测模型对眼部OCT图像进行定位点检测的步骤,确定所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标。
其中,如图2所示,所述检测模型包括第一网络分支和第二网络分支。所述第一网络分支用于提取所述眼部OCT图像多个不同尺度的特征图,所述第二网络分支用于根据多个不同尺度的所述特征图提取所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标。
在本申请实施例中,检测模型可以为深度学习网络模型,深度学习网络模型可以为以人工智能中机器学习技术为基础的深度学习网络模型。
当眼部OCT图像输入深度学习网络模型,深度学习网络模型输出眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标。
其中,检测模型的训练过程包括:获取样本数据集,所述样本数据集中包括多个样本图像,每个样本图像为进行了视杯和视盘定位点标注的眼部OCT样本图像;使用所述样本数据集对关键点检测模型进行训练,在训练过程中对所述关键点检测模型的权重进行调整,直至调整权重后的所述关键点检测模型的输出结果满足预设条件,或者训练过程的迭代次数达到预设迭代次数,则停止训练。
作为本申请一非限制性示例,获取大量眼部OCT图像作为样本图像,形成样本数据集;每个样本图像为进行了视杯和视盘定位点标注的眼部OCT样本图像。
为了获得良好的标记精度,以便训练出表现更优的检测模型,在本申请一些实施例中,样本图像为对原始眼部OCT图像进行了预处理后,再进行了视杯和视盘定位点标注的眼部OCT图像。
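Understandably, preprocessing includes, but is not limited to, operations such as interpolation and truncation. For example, as shown in FIG. 3, the original OCT image is typically 1024 (1 pixel corresponds to an actual 6 mm) × 768 (1 pixel corresponds to an actual 3.01 mm). The OCT image is first interpolated to 1200×462 so that 1 pixel represents 5 μm, which makes annotation much more convenient; the two sides are then truncated, removing 200 pixels from the left and 200 pixels from the right, so that the preprocessed OCT image has a resolution of 800×462.
The preprocessed ocular OCT images are then annotated, mainly by doctors based on their experience. In line with the clinical definitions of the optic cup and optic disc, several doctors precisely annotate the positioning points of the optic cup and optic disc in different OCT images, and a single expert doctor finally reviews all annotations to ensure accuracy and consistent rules. A schematic of the annotation result is shown in FIG. 4. As shown in FIG. 4, the annotation result includes four positioning points: two optic disc positioning point coordinates and two optic cup positioning point coordinates. The two optic disc positioning points are: optic disc positioning point 1 at (x1, y1) and optic disc positioning point 2 at (x2, y2). The two optic cup positioning points are: optic cup positioning point 1 at (x3, y3) and optic cup positioning point 2 at (x4, y4). The annotation follows clinical conventions. The optic disc positioning points are the ends of the retinal pigment epithelium (RPE); the optic cup line is parallel to the optic disc line and intersects the internal limiting membrane (ILM) at the optic cup positioning points (x3, y3) and (x4, y4). The distance d between the optic cup line and the optic disc line is taken as 110 μm according to clinical practice; with 1 pixel representing 5 μm, this distance is 22 pixels.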
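The sizes quoted above can be checked with a little arithmetic. The sketch below assumes one reading of the figures that makes them mutually consistent, namely that the 1024-pixel width spans 6 mm and that each of the 768 rows is 3.01 μm tall; this reading is an assumption, as are the function names, which are hypothetical and not taken from the application.

```python
def resampled_size(width_mm=6.0, rows=768, row_pitch_um=3.01, target_um=5.0):
    """Image size after resampling to an isotropic pixel pitch of target_um.

    Assumption: the image width spans width_mm and each row is row_pitch_um tall.
    """
    new_width = round(width_mm * 1000.0 / target_um)
    new_height = round(rows * row_pitch_um / target_um)
    return new_width, new_height


def cropped_width(width, margin=200):
    """Width left after cutting `margin` pixels from both the left and the right."""
    return width - 2 * margin


def um_to_pixels(distance_um, pixel_um=5.0):
    """Convert a physical distance to pixels at the given pixel pitch."""
    return round(distance_um / pixel_um)


width, height = resampled_size()
print(width, height)         # interpolated size
print(cropped_width(width))  # width after truncating both sides
print(um_to_pixels(110.0))   # cup line to disc line distance d in pixels
```

Under these assumptions the three printed results reproduce the 1200×462 interpolated size, the 800-pixel truncated width and the 22-pixel distance stated in the text.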
将标注好的样本图像存储至预设的数据库作为样本数据集。
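The sample data set is obtained from the preset database; with the sample images as input and the annotation results in the sample images as target positioning points, an optic cup and optic disc positioning point detection model is built. During training, the weights of the model are adjusted until the output of the model with the adjusted weights meets a preset accuracy threshold, or the number of iterations reaches a preset iteration threshold, at which point training stops.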
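The two stopping conditions described above (an accuracy threshold and an iteration budget) can be sketched as a small loop. This is only an illustration of the control flow: `train`, `make_toy_step` and the toy accuracy curve are hypothetical stand-ins, and a real `step` would run back-propagation on the detection model.

```python
def train(step, accuracy_threshold, max_iterations):
    """Call `step()` until its accuracy meets the threshold or the budget is used.

    `step` performs one weight update and returns the current accuracy.
    Returns (iterations_run, last_accuracy).
    """
    accuracy = 0.0
    for iteration in range(1, max_iterations + 1):
        accuracy = step()
        if accuracy >= accuracy_threshold:
            return iteration, accuracy
    return max_iterations, accuracy


def make_toy_step(gain=0.1):
    """Hypothetical stand-in for a real update: accuracy creeps towards 1.0."""
    state = {"accuracy": 0.0}

    def step():
        state["accuracy"] += gain * (1.0 - state["accuracy"])
        return state["accuracy"]

    return step


# Stops early because the accuracy threshold is met.
print(train(make_toy_step(), accuracy_threshold=0.9, max_iterations=1000))
# Stops because the iteration budget is exhausted first.
print(train(make_toy_step(), accuracy_threshold=0.9, max_iterations=5))
```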
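Optionally, the sample images are divided into a training sample set, a validation sample set and a test sample set, and the deep learning network model is trained with the back-propagation algorithm based on the training sample set, the validation sample set and the test sample set.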
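A minimal sketch of such a three-way split follows. The application does not state the proportions, so the 70/15/15 fractions, the fixed seed and the function name `split_samples` are assumptions chosen only for illustration.

```python
import random


def split_samples(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle the samples reproducibly and cut them into train/val/test lists.

    The fractions are illustrative assumptions; whatever is left after the
    training and validation parts becomes the test set.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


train_set, val_set, test_set = split_samples(range(100))
print(len(train_set), len(val_set), len(test_set))
```

Shuffling before cutting keeps each subset representative when the images were stored in acquisition order.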
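Optionally, the annotated sample images may also be stored on a blockchain; blockchain storage allows the data to be shared across different platforms and also protects it from tampering.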
区块链是分布式数据存储、点对点传输、共识机制、加密算法等计算机技术的新型应用模式。区块链本质上是一个去中心化的数据库,是一串使用密码学方法相关联产生的数据块,每一个数据块中包含了一批次网络交易的信息,用于验证其信息的有效性(防伪)和生成下一个区块。区块链可以包括区块链底层平台、平台产品服务层以及应用服务层。
需要说明的是,训练检测模型的过程可以在终端设备本地实现,还可以在与终端设备进行通信连接的其他设备上实现,当在终端设备侧部署成功训练好的检测模型,或者其他设备将训练好的检测模型推送至终端设备并部署成功后,可在终端设备上实现对获取到的眼部OCT图像进行视杯视盘定位点的检测。需要说明的是,在进行视杯视盘定位点的检测过程中获得的待检测的眼部OCT图像还可以用以增加样本数据集中的样本量,在终端设备或其他设备端执行检测模型的进一步优化,将进一步优化的检测模型部署到终端设备中以替换之前的检测模型。通过这种方式优化了检测模型,进一步提高了检测模型的表现。
在本申请实施例中,预设的检测模型包括第一网络分支和第二网络分支。
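The first network branch is used to extract feature maps of multiple different scales from the ocular OCT image. The first network branch is a modified Xception network used to extract image information at different scales from the target image; its structure is shown in FIG. 5. As shown in FIG. 5, the first network branch includes the cascaded blocks (modules) module 1a, module 2a, module 3a and module 4a. The output of module 4a is upsampled 4× (upsample×4) and concatenated (concat) with the output of module 2a as the input of module 2b; the output of module 3a is concatenated with the output of module 2b as the input of module 3b; the output of module 3b is concatenated with the output of module 4a as the input of module 4b; and the output of module 4b is upsampled 4× and concatenated with the output of module 2b as the input of module 2c. Module 1a, module 2a, module 2b and module 2c output four feature maps at different scales: module 1a outputs a 256×256 feature map with 8 channels; module 2a outputs a 128×128 feature map with 48 channels; module 2b outputs a 64×64 feature map with 48 channels; and module 2c outputs a 32×32 feature map with 48 channels.
Module 1 (including module 1a), module 2 (including module 2a, module 2b and module 2c), module 3 (including module 3a and module 3b) and module 4 (including module 4a and module 4b) have the structures shown in FIGS. 6 to 9, respectively.
FIG. 6 is a schematic diagram of the structure of module 1. As shown in FIG. 6, module 1 includes one convolutional layer and one BN (Batch Normalization) layer with an activation function. The activation function following the BN layer is the ReLU function; the convolutional layer has a 3×3 convolution kernel, a stride of 2×2 and 8 channels.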
如图7所示为module 2的结构示意图,如图7所示,module 2包括5个部分,第一部分至第五部分,第二部分至第五部分这四个部分具备相同的网络结构。其中,第一个部分包括级联的第一卷积层,第二卷积层和第三卷积层这三个卷积层,以及一个第四卷积层;第一卷积层和第二卷积层后都带有BN层和激活函数ReLU函数,第三卷积层的输出与第四卷积层的输出先进行和运算后输入第二部分。第二部分包括激活函数ReLU函数,级联的三个卷积层,输入第二部分的数据,与经过三个卷积层后的数据进行和运算后输入第三部分。依此类推,直至第五部分得到module 2的输出。
图8所示为module 3的结构示意图,图9所示为module 4的结构示意图。module2,module3,module4整体结构类似,区别在于卷积核的大小以及模块重复的次数。请参照图8-图9所示,此处不再赘述。
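In the embodiments of this application, the first network branch reuses the structure of the main modules of the original Xception. The changes to the original Xception are: fewer channels, more module repetitions, and added feature cascading (or aggregation). Reducing the number of channels greatly reduces the amount of computation, the system resources occupied and the cost of computing power; to compensate for the accuracy lost by reducing the channels, the number of module repetitions is increased and feature cascading is added.
Specifically, the channel numbers of the original Xception, 64, 128, 256 and 728, are reduced to 8, 48, 96 and 192, forming a lightweight Xception network. Because fewer channels lead to insufficient feature extraction, a feature cascading operation is added. Specifically, the feature extraction network with the smaller channel numbers is replicated three times; for ease of description this is called a multi-stage network. Each network has multiple convolutional layers, and each layer outputs features at a different resolution, called multi-level features. The stages are connected in series: the features extracted by each stage are passed to the next stage as input and are fused with the features of the corresponding level of the previous stage, so that features are reused. This cascading fuses features of different resolutions multiple times and fully extracts the useful information.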
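To give a rough sense of how much the channel reduction saves, the sketch below counts the weights of a chain of ordinary 3×3 convolutions for both channel lists. This is a deliberate simplification and an assumption: Xception is built from depthwise separable convolutions and repeated modules, so the real counts differ; only the order of magnitude of the ratio is meant to be indicative. The helper name `conv_weights` is hypothetical.

```python
def conv_weights(channels, kernel=3):
    """Weights in a chain of ordinary kernel x kernel convolutions (no biases).

    Each consecutive pair (c_in, c_out) contributes kernel * kernel * c_in * c_out.
    """
    return sum(kernel * kernel * c_in * c_out
               for c_in, c_out in zip(channels, channels[1:]))


original = conv_weights([64, 128, 256, 728])  # original Xception channel numbers
reduced = conv_weights([8, 48, 96, 192])      # reduced channel numbers
print(original, reduced, round(original / reduced, 1))
```

In this simplified count the reduced network has roughly a tenth of the weights, which is the budget that the extra module repetitions and feature cascading then spend.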
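The main advantages of this cascading are: 1) module 1a, module 2a, module 3a, module 4a and module 2b, module 3b, module 4b belong to different stages; with networks at several stages, image information at different scales can be fully extracted; 2) the structure fuses features of different resolutions in several ways, for example module 2b uses both the output of module 2a and the upsampled output of module 4a, which reuses features and makes effective use of the features of the different stages. The second network branch is used to extract, from the feature maps of multiple different scales, the coordinates of the two optic cup positioning points and the coordinates of the two optic disc positioning points in the ocular OCT image.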
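As an example of this application, as shown in FIG. 10, the second network branch includes a first sub-network and a second sub-network; the first sub-network is used for coarse detection of the optic cup and optic disc positioning points of the ocular OCT image, and the second sub-network is used for fine detection of the optic cup and optic disc positioning points of the ocular OCT image.
The first sub-network is a global network (GlobalNet) with added feature cascading; the second sub-network is a segmentation network (RefineNet) with an added attention mechanism.
The first sub-network takes the feature maps of different scales output by the first network branch as input and adds feature cascading. By extracting image features, the global network can locate the simple key points.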
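FIG. 11 is a schematic diagram of the structure of the first sub-network. Note that module 1a, module 2a, module 2b and module 2c in FIG. 11 correspond to the outputs of the first network branch at different scales. The first sub-network includes 7 convolutional layers. The output of module 2c passes through the first convolutional layer (3×3 convolution kernel, 256 channels) and 2× upsampling (upsample), and is then concatenated (concat) with the output of module 2b as the input of the second convolutional layer (3×3 convolution kernel, 128 channels); the output of the second convolutional layer is upsampled 2× and concatenated with the output of module 2a as the input of the third convolutional layer (3×3 convolution kernel, 64 channels); the output of the third convolutional layer is upsampled 2× and concatenated with the output of module 1a as the input of the fourth convolutional layer (3×3 convolution kernel, 64 channels). The outputs of the second convolutional layer, the third convolutional layer and the third convolutional layer each pass through a fifth convolutional layer (1×1 convolution kernel, 4 channels), giving the three outputs of different scales generated by the first sub-network: global_out1, global_out2 and global_out3.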
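The repeated "upsample 2×, then concatenate with a skip input" step of the first sub-network can be illustrated on nested lists. This is a shape-bookkeeping sketch only: the application does not say which interpolation the upsampling uses, so nearest-neighbour is an assumption, and the helper names are hypothetical. Tiny tensors are used so the example runs instantly.

```python
def upsample2(feature):
    """Nearest-neighbour 2x upsampling of a [channel][row][col] nested list."""
    out = []
    for channel in feature:
        rows = []
        for row in channel:
            wide = [value for value in row for _ in range(2)]  # repeat columns
            rows.append(wide)
            rows.append(list(wide))                            # repeat rows
        out.append(rows)
    return out


def concat(a, b):
    """Channel-wise concatenation; both inputs must share height and width."""
    assert len(a[0]) == len(b[0]) and len(a[0][0]) == len(b[0][0])
    return a + b


def shape(feature):
    """(channels, height, width) of a nested-list feature map."""
    return len(feature), len(feature[0]), len(feature[0][0])


def zeros(channels, size):
    return [[[0.0] * size for _ in range(size)] for _ in range(channels)]


coarse = zeros(2, 2)  # stands in for a low-resolution feature map
skip = zeros(3, 4)    # stands in for a skip input at twice the resolution
print(shape(concat(upsample2(coarse), skip)))
```

With the sizes given in the text, the same step takes the 32×32 map derived from module 2c to 64×64 so that it can be concatenated with the 64×64 output of module 2b, and so on up to 256×256.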
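The second sub-network takes the outputs of the first sub-network at different scales as input, where the features are already highly dense; adding an attention mechanism allows the features to be filtered by importance, which can effectively improve the reliability of the final result.
FIG. 12 is a schematic diagram of the structure of the second sub-network. As shown in FIG. 12, the three convolutional layers of the second sub-network are connected to the three outputs of the first sub-network; in order, they are the first convolutional layer (1×1 convolution kernel, 128 channels), the second convolutional layer (1×1 convolution kernel, 128 channels) and the third convolutional layer (1×1 convolution kernel, 256 channels). The output of the second convolutional layer is connected to the first attention module, and the output of the third convolutional layer is connected to the second attention module. The output of the second attention module passes in sequence through the fourth convolutional layer (1×1 convolution kernel, 128 channels), the third attention module and 4× upsampling; it is then concatenated (concat) with the 2×-upsampled output of the first attention module and with the output of the first convolutional layer, and the concatenation is fed into the fifth convolutional layer (1×1 convolution kernel, 4 channels), whose output is the optic cup and optic disc positioning point detection result.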
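The attention modules, namely the first to third attention modules, have the structure shown in FIG. 13. Each attention module includes a global average pooling layer (Global average pooling) and 2 fully connected layers (Dense out); a ReLU activation function is placed between the two fully connected layers, and the second fully connected layer is followed by a Sigmoid activation function. The output of the Sigmoid function is reshaped and then multiplied with the input data to give the output of the attention module.
The global average pooling layer averages each feature map globally and outputs a single value, turning a W*H*D tensor into a 1*1*D tensor. This layer compresses the features along the spatial dimensions, so that each feature channel has a global receptive field and the output dimension matches the number of input feature channels.
The fully connected layers, the activation functions and the reshape step generate a weight for each feature channel through parameters, and these parameters are learned to explicitly model the correlation between feature channels.
The final multiplication is a recalibration operation: the output weights are regarded as the importance of each feature channel after feature selection and are applied to the earlier features channel by channel through multiplication, completing the recalibration of the original features along the channel dimension. The attention module in the embodiments of this application only needs to learn a weight, which is multiplied by the original convolution.
The attention module adopts SE-Net (Squeeze-and-Excitation Networks). SE-Net explicitly models the interdependence between feature channels. Rather than introducing a new spatial dimension to fuse feature channels, it adopts a feature recalibration strategy: the importance of each feature channel is learned automatically, and according to this importance useful features are promoted and features of little use to the current task are suppressed. Filtering features by importance in this way can effectively improve the final result.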
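The squeeze, excitation and recalibration steps described above can be written out on plain nested lists. This is a minimal sketch under stated assumptions, not the implementation used in the application: the weights are passed in rather than learned, biases are omitted, and the function name `se_attention` and the hidden-layer size are hypothetical.

```python
import math


def se_attention(feature_map, w1, w2):
    """Squeeze-and-excitation style channel attention on nested lists.

    feature_map: list of D channels, each an H x W nested list.
    w1: hidden x D weights of the first dense layer (biases omitted).
    w2: D x hidden weights of the second dense layer (biases omitted).
    Returns (recalibrated_feature_map, channel_weights).
    """
    # Squeeze: global average pooling gives one scalar per channel (1*1*D).
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]
    # Excitation: Dense -> ReLU -> Dense -> Sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    weights = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
               for row in w2]
    # Recalibration: scale every value of a channel by that channel's weight.
    scaled = [[[value * gate for value in row] for row in ch]
              for ch, gate in zip(feature_map, weights)]
    return scaled, weights


features = [[[1.0, 1.0], [1.0, 1.0]],   # channel 0
            [[3.0, 3.0], [3.0, 3.0]]]   # channel 1
scaled, gates = se_attention(features, w1=[[1.0, 0.0]], w2=[[0.0], [0.0]])
print(gates, scaled[1][0][0])
```

With the all-zero second layer used here every channel receives the neutral Sigmoid gate, so the example only demonstrates the data flow; learned weights would produce different gates per channel.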
可以理解的是,此处描述的深度学习网络模型仅为示例性描述,不能解释为对发明的具体限制。
本申请实施例中,通过预设的检测模型对眼部OCT图像的视杯和视盘定位点进行检测,一方面,直接通过检测模型就能获得眼部OCT图像的定位点检测结果,大大提高了检测的效率;另一方面,由于检测模型提取了眼部OCT图像多个不同尺度的特征,更加准确地实现了视杯和视盘的定位点检测。
可选地,在上述任一实施例的基础上,也就是说,在获得一个眼部OCT图像中视杯和视盘定位点检测结果的基础上,本申请一些其他实施例中,在上述图1所示实施例的步骤S120之后,还包括步骤S130至S160。
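S130: obtain the coordinates of the two optic cup positioning points and the coordinates of the two optic disc positioning points in ocular OCT images at at least three different angles; the at least three different angles include 0 degrees and 90 degrees.
S140: from the coordinates of the two optic cup positioning points and the coordinates of the two optic disc positioning points in each ocular OCT image, determine the optic cup length and the optic disc length in each ocular OCT image.
S150: form at least three first line segments from the optic cup lengths at the at least three different angles, co-center the at least three first line segments, project them onto the same plane, and fit the optic cup ellipse; form at least three second line segments from the optic disc lengths at the at least three different angles, co-center the at least three second line segments, project them onto the same plane, and fit the optic disc ellipse.
S160: from the optic cup lengths at the at least three different angles, the optic disc lengths at the at least three angles, the optic cup ellipse and the optic disc ellipse, obtain morphological parameters such as the optic cup area, optic disc area, cup-to-disc area ratio, vertical cup-to-disc ratio and horizontal cup-to-disc ratio.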
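Step S140 amounts to measuring the distance between the two positioning points of each structure. A minimal sketch follows, assuming the coordinates are in pixels of a preprocessed image in which 1 pixel represents 5 μm in both directions; the function name and the sample coordinates are hypothetical.

```python
import math


def segment_length_um(p1, p2, pixel_um=5.0):
    """Distance between two positioning points given in pixels, in micrometres.

    Assumes an isotropic pixel pitch of pixel_um.
    """
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1]) * pixel_um


# Hypothetical positioning points for one scan angle.
disc_length = segment_length_um((100, 200), (400, 200))
cup_length = segment_length_um((190, 178), (310, 178))
print(disc_length, cup_length)
```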
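In the embodiments of this application, the OCT images include images at 0 degrees, 45 degrees, 90 degrees, 135 degrees, etc., so the calculated optic cup length and optic disc length are the lengths at each of these angles; for example, the vertical direction is the length at 90 degrees, and the horizontal direction is the length at 0 degrees.
In a non-limiting example using four different angles, the optic cup length and optic disc length at 0 degrees, 45 degrees, 90 degrees and 135 degrees are obtained first. From these optic cup lengths and optic disc lengths, 8 line segments can be constructed; these 8 line segments share the same center and are projected onto the same plane, as shown in FIG. 14. Taking the optic cup lengths as an example: projecting the co-centered line segments onto the same plane gives 8 points, and fitting these 8 points to an ellipse, the small inner ellipse shown in FIG. 14, gives the ellipse parameters, so the optic cup area can be obtained as the area of the smaller ellipse (the optic cup ellipse). The optic disc is handled in the same way: fitting the large outer ellipse shown in FIG. 14 gives the optic disc area, that is, the area of the larger ellipse (the optic disc ellipse). The cup-to-disc area ratio is the ratio of the optic cup area to the optic disc area, that is, the area of the smaller ellipse divided by the area of the larger ellipse; the horizontal cup-to-disc ratio is the ratio of the optic cup length at 0 degrees to the optic disc length at 0 degrees; the vertical cup-to-disc ratio is the ratio of the optic cup length at 90 degrees to the optic disc length at 90 degrees.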
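The fitting step can be sketched as follows. The application does not specify the fitting algorithm, so this example assumes a least-squares fit of a centered conic A·x² + B·x·y + C·y² = 1 to the segment endpoints, whose enclosed area is 2π/√(4AC − B²); the function names are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_centered_ellipse(lengths_by_angle):
    """Least-squares fit of A*x^2 + B*x*y + C*y^2 = 1 to segment endpoints.

    lengths_by_angle maps a scan angle in degrees to a segment length. Every
    segment is centred on the origin, so its endpoints lie at +/- L/2 along
    the scan direction. At least three distinct angles are needed.
    """
    rows = []
    for angle_deg, length in lengths_by_angle.items():
        theta = math.radians(angle_deg)
        for sign in (1.0, -1.0):
            x = sign * 0.5 * length * math.cos(theta)
            y = sign * 0.5 * length * math.sin(theta)
            rows.append((x * x, x * y, y * y))
    # Normal equations (R^T R) p = R^T 1, solved with Cramer's rule.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    v = [sum(r[i] for r in rows) for i in range(3)]
    d = det3(m)
    params = []
    for k in range(3):
        mk = [[v[i] if j == k else m[i][j] for j in range(3)] for i in range(3)]
        params.append(det3(mk) / d)
    return tuple(params)


def ellipse_area(a, b, c):
    """Area enclosed by A*x^2 + B*x*y + C*y^2 = 1 (requires 4AC - B^2 > 0)."""
    return 2.0 * math.pi / math.sqrt(4.0 * a * c - b * b)


def cup_disc_parameters(cup_lengths, disc_lengths):
    """Areas and ratios from cup and disc lengths keyed by angle (0 and 90 required)."""
    cup_area = ellipse_area(*fit_centered_ellipse(cup_lengths))
    disc_area = ellipse_area(*fit_centered_ellipse(disc_lengths))
    return {
        "cup_area": cup_area,
        "disc_area": disc_area,
        "area_ratio": cup_area / disc_area,
        "horizontal_ratio": cup_lengths[0] / disc_lengths[0],
        "vertical_ratio": cup_lengths[90] / disc_lengths[90],
    }


# Hypothetical lengths in micrometres at the four scan angles.
cup = {0: 600.0, 45: 620.0, 90: 640.0, 135: 620.0}
disc = {0: 1500.0, 45: 1550.0, 90: 1600.0, 135: 1550.0}
print(cup_disc_parameters(cup, disc))
```

With exactly three angles the conic passes through all endpoints; with four or more it is a least-squares compromise, which is why a fit rather than an exact solve is used here.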
本申请实施例,一方面,通过检测模型就能直接获得眼部OCT图像的定位点检测结果,大大提高了检测效率,在此基础上,根据多个角度的视杯视盘定位点结果的进行线段投影,通过拟合获取了视杯视盘形态参数,方法简便高效,易于实施;另一方面,由于检测模型提取了眼部OCT图像多个不同尺度的特征,更加准确地实现了视杯和视盘的定位点检测,也就提高了视杯视盘形态参数获得的准确性;再一方面,基于多个不同角度的视杯和视盘定位点,获得了更多,更丰富的眼部OCT图像的视杯视盘形态参数,使得本申请的方案能适用于不同的场景,更具适应性。
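An embodiment of this application also provides a glaucoma grading method. After the four morphological parameters (optic cup area, optic disc area, cup-to-disc area ratio and vertical cup-to-disc ratio) are obtained using the foregoing embodiments, the method further includes: obtaining 5-dimensional ganglion cell complex (ganglion cell complex, GCC) features and 4-dimensional retinal optic nerve fiber layer (retinal optic nerve fiber layer, RNFL) thickness features; combining the 4-dimensional morphological parameters, the 5-dimensional GCC features and the 4-dimensional RNFL features into a 13-dimensional input feature, and feeding it into a glaucoma grading model trained with a machine learning method to obtain the glaucoma grading result.
The 5-dimensional GCC features are: upper GCC thickness, lower GCC thickness, average GCC thickness, focal loss volume (FLV) and global loss volume (GLV).
The 4-dimensional RNFL thickness features are: upper RNFL thickness, lower RNFL thickness, nasal RNFL thickness and temporal RNFL thickness.
The 5-dimensional GCC features and the 4-dimensional RNFL thickness features can be read directly from the OCT image acquisition instrument.
The 4-dimensional morphological parameters, the 5-dimensional GCC features and the 4-dimensional RNFL features are combined into a 13-dimensional input feature, which is fed into the glaucoma grading model trained with a machine learning method to obtain the glaucoma grading result.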
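Assembling the 13-dimensional input is mostly a matter of fixing the feature order once and reusing it for training and inference. The sketch below shows one way to do that; the feature key names, the sample values and `build_feature_vector` are hypothetical, and the classifier itself is left out.

```python
MORPHOLOGY = ("cup_area", "disc_area", "cup_disc_area_ratio", "vertical_cup_disc_ratio")
GCC = ("gcc_upper", "gcc_lower", "gcc_average", "flv", "glv")
RNFL = ("rnfl_upper", "rnfl_lower", "rnfl_nasal", "rnfl_temporal")
FEATURE_ORDER = MORPHOLOGY + GCC + RNFL  # 4 + 5 + 4 = 13 features


def build_feature_vector(morphology, gcc, rnfl):
    """Merge the three feature groups into one list in a fixed order."""
    merged = {**morphology, **gcc, **rnfl}
    missing = [name for name in FEATURE_ORDER if name not in merged]
    if missing:
        raise ValueError(f"missing features: {missing}")
    return [float(merged[name]) for name in FEATURE_ORDER]


# Hypothetical values, for illustration only.
vector = build_feature_vector(
    {"cup_area": 0.3, "disc_area": 1.9, "cup_disc_area_ratio": 0.16,
     "vertical_cup_disc_ratio": 0.4},
    {"gcc_upper": 95.0, "gcc_lower": 94.0, "gcc_average": 94.5, "flv": 0.5, "glv": 3.0},
    {"rnfl_upper": 120.0, "rnfl_lower": 125.0, "rnfl_nasal": 75.0, "rnfl_temporal": 70.0},
)
print(len(vector))
```

Raising an error on a missing feature keeps a partially read instrument export from silently shifting the columns seen by the grading model.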
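The glaucoma grading model may be a classification model based on machine learning, for example a decision tree model based on Xgboost.
For example, the grading results of the glaucoma grading model include: no glaucoma, low risk, intermediate risk and high risk. This example is a four-class model; it may also be a two-class model, a three-class model, or a classification model with more classes.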
可以理解的是,本领域技术人员可以在本申请实施例的教导下根据实际实施情况选用合适的分级模型,分级模型的分类结果也可以根据实际情况进行选择设置,本申请对此不做具体限定。
本申请实施例综合了多种参数,提高了分类的准确率。另外,基于青光眼分级模型进行分级,可在数秒内完成决策,减少了系统资源占用,极大提高了分级效率。
应理解,上述实施例中各步骤的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
对应于上文实施例所述的基于眼部OCT图像的视杯和视盘定位点检测方法,图15示出了本申请实施例提供的基于眼部OCT图像的视杯和视盘定位点检测装置的结构框图,为了便于说明,仅示出了与本申请实施例相关的部分。
参照图15,该装置包括:获取模块151,用于获取眼部OCT图像;检测模块152,用于通过预设的检测模型对眼部OCT图像的视杯和视盘定位点进行检测,一方面,直接通过检测模型对眼部OCT图像就能获得定位点检测结果,大大提高了检测的效率;另一方面,由于检测模型提取了眼部OCT图像多个不同尺度的特征,更加准确地实现了视杯和视盘的定位点检测。
需要说明的是,上述模块/单元之间的信息交互、执行过程等内容,由于与本申请方法实施例基于同一构思,其具体功能及带来的技术效果,具体可参见方法实施例部分,此处不再赘述。
所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,仅以上述各功能单元、模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能单元、模块完成,即将所述装置的内部结构划分成不同的功能单元或模块,以完成以上描述的全部或者部分功能。实施例中的各功能单元、模块可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中,上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。另外,各功能单元、模块的具体名称也只是为了便于相互区分,并不用于限制本申请的保护范围。上述系统中单元、模块的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
图16为本申请一实施例提供的终端设备的结构示意图。如图16所示,该实施例的终端设备16包括:至少一个处理器160(图16中仅示出一个处理器)、存储器161以及存储在所述存储器161中并可在所述至少一个处理器160上运行的计算机程序162,所述处理器100执行所述计算机程序162时实现上述各个方法实施例中的步骤。例如图1所示的步骤S110至步骤S120。
所述终端设备可包括但不仅限于处理器160、存储器161。本领域技术人员可以理解,图16仅仅是终端设备16的示例,并不构成对终端设备16的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件,例如所述心电图机还可以包括输入输出设备、网络接入设备、总线等。
所称处理器160可以是中央处理单元(Central Processing Unit,CPU),还可以是其他通用处理器、数字信号处理器
(Digital Signal Processor,DSP)、专用集成电路
(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列 (Field-Programmable Gate Array,FPGA) 或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
所述存储器161可以是所述终端设备16的内部存储单元,例如终端设备16的硬盘或内存。所述存储器161也可以是所述终端设备16的外部存储设备,例如所述终端设备16上配备的插接式硬盘,智能存储卡(Smart Media Card, SMC),安全数字(Secure Digital, SD)卡,闪存卡(Flash Card)等。进一步地,所述存储器161还可以既包括所述终端设备16的内部存储单元也包括外部存储设备。所述存储器161用于存储所述计算机程序以及所述终端设备16所需的其他程序和数据。所述存储器161还可以用于暂时地存储已经输出或者将要输出的数据。
本申请实施例还提供了一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序被处理器执行时实现可实现上述各个方法实施例中的步骤。其中,该计算机可读存储介质可以是非易失性,也可以是易失性。
本申请实施例提供了一种计算机程序产品,当计算机程序产品在移动终端上运行时,使得移动终端执行时实现可实现上述各个方法实施例中的步骤。
所述集成的模块/单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请实现上述实施例方法中的全部或部分流程,可以通过计算机程序来指令相关的硬件来完成,所述的计算机程序可存储于一计算机可读存储介质中,该计算机程序在被处理器执行时,可实现上述各个方法实施例的步骤。其中,所述计算机程序包括计算机程序代码,所述计算机程序代码可以为源代码形式、对象代码形式、可执行文件或某些中间形式等。所述计算机可读介质至少可以包括:能够将计算机程序代码携带到拍照装置/终端设备的任何实体或装置、记录介质、计算机存储器、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、电载波信号、电信信号以及软件分发介质。例如U盘、移动硬盘、磁碟或者光盘等。在某些司法管辖区,根据立法和专利实践,计算机可读介质不可以是电载波信号和电信信号。
在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述或记载的部分,可以参见其它实施例的相关描述。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
在本申请所提供的实施例中,应该理解到,所揭露的终端设备和方法,可以通过其它的方式实现。例如,以上所描述的终端设备实施例仅仅是示意性的。另一点,所显示或讨论的相互之间的耦合或直接耦合或通讯连接可以是通过一些接口,装置或单元的间接耦合或通讯连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
以上所述实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围,均应包含在本申请的保护范围之内。
Claims (20)
- 一种基于眼部OCT图像的视杯和视盘定位点检测方法,其中,包括:获取眼部OCT图像;使用预设的检测模型对所述眼部OCT图像进行检测,得到所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标;所述检测模型包括第一网络分支和第二网络分支,所述第一网络分支用于提取所述眼部OCT图像多个不同尺度的特征图,所述第二网络分支用于根据多个不同尺度的所述特征图提取所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标。
- 如权利要求1所述的视杯和视盘定位点检测方法,其中,所述检测模型的训练过程,包括:获取样本数据集,所述样本数据集中包括多个样本图像,每个样本图像为进行了视杯和视盘定位点标注的眼部OCT样本图像;使用所述样本数据集对关键点检测模型进行训练,在训练过程中对所述关键点检测模型的权重进行调整,直至调整权重后的所述关键点检测模型的输出结果满足预设条件,或者训练过程的迭代次数达到预设迭代次数,则停止训练。
- 如权利要求1或2所述的视杯和视盘定位点检测方法,其中,所述第二网络分支包括第一子网络和第二子网络,所述第一子网络用于粗检测所述眼部OCT图像的视杯和视盘定位点;所述第二子网络用于精检测所述眼部OCT图像的视杯和视盘定位点。
- 如权利要求3所述的视杯和视盘定位点检测方法,其中,所述第一子网络为增加特征级联的GlobalNet;所述第二子网络为增加注意力机制的RefineNet。
- 如权利要求1所述的视杯和视盘定位点检测方法,其中,还包括:获取至少三个不同角度的眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标;所述至少三个不同角度中包括0度和90度;根据至少三个所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标,计算视杯视盘形态参数。
- 如权利要求5所述的视杯和视盘定位点检测方法,其中,所述根据至少三个所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标,计算视杯视盘形态参数,包括:根据至少三个不同角度下的所述视杯长度形成至少三个第一线段,将至少三个第一线段共中心并投影到同一个平面,拟合出视杯椭圆;根据至少三个不同角度下的所述视盘长度形成至少三个第二线段,将至少三个第二线段共中心并投影到同一个平面,拟合出视盘椭圆;根据至少三个不同角度下的所述视杯长度,至少三个角度下的所述视盘长度,所述视杯椭圆,和所述视盘椭圆,得到视杯视盘形态参数。
- 如权利要求5或6所述的视杯和视盘定位点检测方法,其中,所述视杯视盘形态参数包括:视杯面积、视盘面积、杯盘面积比、垂直杯盘比和水平杯盘比中的至少一个。
- 如权利要求2所述的视杯和视盘定位点检测方法,其中,所述获取样本数据集,包括:获取原始眼部OCT图像,对所述原始眼部OCT图像进行预处理得到预处理后的眼部OCT图像,所述预处理包括插值和截断;对所述预处理后的眼部OCT图像进行视杯和视盘标注得到样本图像,将所述样本图像存储至预设的数据库作为样本数据集。
- 如权利要求1所述的视杯和视盘定位点检测方法,其中,所述检测模型的训练过程,包括:获取样本图像,根据所述样本图像确定出训练样本集、验证样本集和测试样本集;基于所述训练样本集、所述验证样本集和所述测试样本集,利用反向传播算法训练深度学习网络模型,以得到所述检测模型。
- 一种基于眼部OCT图像的视杯和视盘定位点检测装置,其中,包括:获取模块,用于获取眼部OCT图像;检测模块,用于使用预设的检测模型对所述眼部OCT图像进行检测,得到所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标;所述检测模型包括第一网络分支和第二网络分支,所述第一网络分支用于提取所述眼部OCT图像多个不同尺度的特征图,所述第二网络分支用于根据多个不同尺度的所述特征图提取所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标。
- 一种终端设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序,其中,所述处理器执行所述计算机程序时实现:获取眼部OCT图像;使用预设的检测模型对所述眼部OCT图像进行检测,得到所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标;所述检测模型包括第一网络分支和第二网络分支,所述第一网络分支用于提取所述眼部OCT图像多个不同尺度的特征图,所述第二网络分支用于根据多个不同尺度的所述特征图提取所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标。
- 如权利要求11所述的终端设备,其中,所述处理器执行所述计算机程序时实现:获取样本数据集,所述样本数据集中包括多个样本图像,每个样本图像为进行了视杯和视盘定位点标注的眼部OCT样本图像;使用所述样本数据集对关键点检测模型进行训练,在训练过程中对所述关键点检测模型的权重进行调整,直至调整权重后的所述关键点检测模型的输出结果满足预设条件,或者训练过程的迭代次数达到预设迭代次数,则停止训练。
- 如权利要求11或12所述的终端设备,其中,所述第二网络分支包括第一子网络和第二子网络,所述第一子网络用于粗检测所述眼部OCT图像的视杯和视盘定位点;所述第二子网络用于精检测所述眼部OCT图像的视杯和视盘定位点。
- 如权利要求13所述的终端设备,其中,所述第一子网络为增加特征级联的GlobalNet;所述第二子网络为增加注意力机制的RefineNet。
- 如权利要求11所述的终端设备,其中,所述处理器执行所述计算机程序时实现:获取至少三个不同角度的眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标;所述至少三个不同角度中包括0度和90度;根据至少三个所述眼部OCT图像中视杯的两个定位点坐标,以及视盘的两个定位点坐标,计算视杯视盘形态参数。
- 如权利要求15所述的终端设备,其中,所述处理器执行所述计算机程序时实现:根据至少三个不同角度下的所述视杯长度形成至少三个第一线段,将至少三个第一线段共中心并投影到同一个平面,拟合出视杯椭圆;根据至少三个不同角度下的所述视盘长度形成至少三个第二线段,将至少三个第二线段共中心并投影到同一个平面,拟合出视盘椭圆;根据至少三个不同角度下的所述视杯长度,至少三个角度下的所述视盘长度,所述视杯椭圆,和所述视盘椭圆,得到视杯视盘形态参数。
- 如权利要求15或16所述的终端设备,其中,所述视杯视盘形态参数包括:视杯面积、视盘面积、杯盘面积比、垂直杯盘比和水平杯盘比中的至少一个。
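- The terminal device according to claim 12, wherein the processor, when executing the computer program, implements: obtaining an original eye OCT image and preprocessing the original eye OCT image to obtain a preprocessed eye OCT image, the preprocessing including interpolation and truncation; and annotating the optic cup and optic disc in the preprocessed eye OCT image to obtain a sample image, and storing the sample image in a preset database as the sample data set.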
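- The terminal device according to claim 11, wherein the processor, when executing the computer program, implements: obtaining sample images and determining a training sample set, a validation sample set, and a test sample set from the sample images; and training a deep learning network model with the backpropagation algorithm, based on the training sample set, the validation sample set, and the test sample set, to obtain the detection model.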
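- A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 9.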
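
The ellipse-fitting and morphological-parameter steps recited in claims 5 to 7 (and mirrored in claims 15 to 17) can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function names are hypothetical, the ellipse is assumed to be a centered conic `A*x^2 + B*x*y + C*y^2 = 1` fitted by ordinary least squares, and the 0-degree scan is assumed to be horizontal and the 90-degree scan vertical. Lengths may be in any consistent unit, for example pixels or millimetres.

```python
import math


def segment_length(p1, p2):
    """Distance between the two positioning points detected in one scan."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_centered_ellipse(angles_deg, lengths):
    """Least-squares fit of A*x^2 + B*x*y + C*y^2 = 1 to co-centered segments.

    A segment of length L at angle t contributes the endpoint
    (L/2 * cos t, L/2 * sin t); the opposite endpoint yields the same equation.
    """
    if len(angles_deg) != len(lengths) or len(lengths) < 3:
        raise ValueError("need one length per angle and at least three angles")
    scale = max(lengths)  # normalise so the singularity check is unit-free
    rows = []
    for angle, length in zip(angles_deg, lengths):
        t = math.radians(angle)
        x = 0.5 * length / scale * math.cos(t)
        y = 0.5 * length / scale * math.sin(t)
        rows.append((x * x, x * y, y * y))
    # Normal equations (D^T D) p = D^T 1, solved with Cramer's rule.
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs = [sum(r[i] for r in rows) for i in range(3)]
    det = _det3(normal)
    if abs(det) < 1e-12:
        raise ValueError("the given angles do not determine an ellipse")
    coeffs = []
    for k in range(3):
        replaced = [[rhs[i] if j == k else normal[i][j] for j in range(3)]
                    for i in range(3)]
        coeffs.append(_det3(replaced) / det / (scale * scale))  # undo the scaling
    return tuple(coeffs)  # (A, B, C)


def ellipse_area(coeffs):
    """Area of the centered ellipse: pi / sqrt(A*C - B^2 / 4)."""
    a, b, c = coeffs
    disc = a * c - b * b / 4.0
    if a <= 0 or disc <= 0:
        raise ValueError("fitted conic is not an ellipse")
    return math.pi / math.sqrt(disc)


def length_at(angles_deg, lengths, target_deg):
    """Return the measured length of the scan taken at target_deg."""
    for angle, length in zip(angles_deg, lengths):
        if math.isclose(angle, target_deg, abs_tol=1e-9):
            return float(length)
    raise ValueError(f"no scan at {target_deg} degrees")


def cup_disc_parameters(angles_deg, cup_lengths, disc_lengths):
    """Compute the five morphological parameters named in the claims."""
    cup_area = ellipse_area(fit_centered_ellipse(angles_deg, cup_lengths))
    disc_area = ellipse_area(fit_centered_ellipse(angles_deg, disc_lengths))
    return {
        "cup_area": cup_area,
        "disc_area": disc_area,
        "area_ratio": cup_area / disc_area,
        # Assumption: 0 degrees is the horizontal scan, 90 degrees the vertical.
        "horizontal_cdr": length_at(angles_deg, cup_lengths, 0.0)
        / length_at(angles_deg, disc_lengths, 0.0),
        "vertical_cdr": length_at(angles_deg, cup_lengths, 90.0)
        / length_at(angles_deg, disc_lengths, 90.0),
    }
```

As a sanity check, calling `cup_disc_parameters([0, 45, 90], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0])` describes a circular cup of diameter 1 inside a circular disc of diameter 2, so both linear cup-to-disc ratios come out at 0.5 and the area ratio at approximately 0.25. With exactly three angles the conic is determined exactly; additional angles turn the fit into a genuine least-squares problem.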
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010087226.5 | 2020-02-11 | ||
CN202010087226.5A CN111311565B (zh) | 2020-02-11 | | Method and apparatus for detecting optic cup and optic disc positioning points based on eye OCT images |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021159643A1 (zh) | 2021-08-19 |
Family
ID=71160064
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/093585 WO2021159643A1 (zh) | 2020-02-11 | 2020-05-30 | Method and apparatus for detecting optic cup and optic disc positioning points based on eye OCT images |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2021159643A1 (zh) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160000315A1 (en) * | 2010-08-05 | 2016-01-07 | Carl Zeiss Meditec, Inc. | Automated analysis of the optic nerve head: measurements, methods and representations |
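CN109829894A (zh) * | 2019-01-09 | 2019-05-31 | Ping An Technology (Shenzhen) Co., Ltd. | Segmentation model training method, OCT image segmentation method, apparatus, device, and medium |
CN110120047A (zh) * | 2019-04-04 | 2019-08-13 | Ping An Technology (Shenzhen) Co., Ltd. | Image segmentation model training method, image segmentation method, apparatus, device, and medium |
CN110298850A (zh) * | 2019-07-02 | 2019-10-01 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Fundus image segmentation method and apparatus |
CN110327013A (zh) * | 2019-05-21 | 2019-10-15 | Beijing Zhizhen Internet Technology Co., Ltd. | Fundus image detection method, apparatus, device, and storage medium |
CN110889826A (zh) * | 2019-10-30 | 2020-03-17 | Ping An Technology (Shenzhen) Co., Ltd. | Method, apparatus, and terminal device for segmenting lesion regions in eye OCT images |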
- 2020-05-30: WO application PCT/CN2020/093585 filed (publication WO2021159643A1, zh); status: active, Application Filing
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
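CN113870270A (zh) * | 2021-08-30 | 2021-12-31 | Beijing University of Technology | Optic cup and optic disc segmentation method for fundus images under a unified framework |
CN113870270B (zh) * | 2021-08-30 | 2024-05-28 | Beijing University of Technology | Optic cup and optic disc segmentation method for fundus images under a unified framework |
CN113837104A (zh) * | 2021-09-26 | 2021-12-24 | Dalian Smart Fishery Technology Co., Ltd. | Underwater fish target detection method, apparatus, and storage medium based on a convolutional neural network |
CN113837104B (zh) * | 2021-09-26 | 2024-03-15 | Dalian Smart Fishery Technology Co., Ltd. | Underwater fish target detection method, apparatus, and storage medium based on a convolutional neural network |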
Also Published As
Publication number | Publication date |
---|---|
CN111311565A (zh) | 2020-06-19 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Hacisoftaoglu et al. | Deep learning frameworks for diabetic retinopathy detection with smartphone-based retinal imaging systems | |
Keel et al. | Development and validation of a deep‐learning algorithm for the detection of neovascular age‐related macular degeneration from colour fundus photographs | |
Christopher et al. | Performance of deep learning architectures and transfer learning for detecting glaucomatous optic neuropathy in fundus photographs | |
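JP7058373B2 (ja) | Lesion detection and localization method, apparatus, device, and storage medium for medical images | |
WO2021068523A1 (zh) | Method, apparatus, electronic device, and storage medium for locating the macular center in fundus images | |
WO2018201647A1 (zh) | Retinopathy severity grade detection method, apparatus, and storage medium | |
WO2021082691A1 (zh) | Method, apparatus, and terminal device for segmenting lesion regions in eye OCT images | |
CN113784656A (zh) | Optical device and associated apparatus for biometric identification and health status determination | |
US11967181B2 (en) | Method and device for retinal image recognition, electronic equipment, and storage medium | |
WO2022088665A1 (zh) | Lesion segmentation method, apparatus, and storage medium | |
WO2022242392A1 (zh) | Blood vessel image classification processing method, apparatus, device, and storage medium | |
EP3972522A1 (fr) | Method for generating a model of a dental arch | |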
US20220198831A1 (en) | System for determining one or more characteristics of a user based on an image of their eye using an ar/vr headset | |
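WO2021190656A1 (zh) | Method and apparatus, server, and storage medium for locating the macular center in fundus images | |
CN113160226A (zh) | Classification and segmentation method and system for OCT images of AMD lesions based on a bidirectional guidance network | |
WO2021159643A1 (zh) | Method and apparatus for detecting optic cup and optic disc positioning points based on eye OCT images | |
WO2022205779A1 (zh) | Method, apparatus, and terminal device for processing multimodal eye examination data | |
CN114842270A (zh) | Target image classification method, apparatus, electronic device, and medium | |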
Franco et al. | Glaucoma patient screening from online retinal fundus images via Artificial Intelligence | |
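CN106446805A (zh) | Method and system for segmenting the optic cup in fundus photographs | |
CN110610480B (zh) | MCASPP neural network model based on the attention mechanism for optic cup and optic disc segmentation in fundus images | |
CN114782337B (zh) | Artificial-intelligence-based OCT image recommendation method, apparatus, device, and medium | |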
Li et al. | Development of a deep learning-based image eligibility verification system for detecting and filtering out ineligible fundus images: a multicentre study | |
US20210407096A1 (en) | System for estimating primary open-angle glaucoma likelihood | |
CN113658097A (zh) | Method and apparatus for training a fundus image quality enhancement model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20919080; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: PCT application non-entry in European phase | Ref document number: 20919080; Country of ref document: EP; Kind code of ref document: A1 |