WO2019082202A1 - A fundus image quality assessment system - Google Patents
A fundus image quality assessment system
- Publication number
- WO2019082202A1 (PCT application PCT/IN2018/050681)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- fundus image
- training
- quality value
- quality
- image
Classifications
- G06T7/0012 — Image analysis; Inspection of images; Biomedical image inspection
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/30041 — Eye; Retina; Ophthalmic
- G06T2207/30168 — Image quality inspection
Definitions
- the invention relates to the field of image quality evaluation. More particularly, the invention relates to detecting the quality of a fundus image using machine learning, which in turn aids ophthalmology.
- Vision is an important survival attribute for a human, making the eyes one of the most vital sensory organs. Though most eye diseases are not fatal, failure of proper diagnosis and treatment of an eye disease may lead to vision loss. Some of the common eye diseases are, for example, conjunctivitis, optic nerve conditions such as glaucoma, cataract, retinal conditions, conditions due to diabetes, etc. Analysis of fundus images of a patient is a very convenient way of screening and monitoring eye diseases. Early diagnosis of eye diseases through regular screening may prevent visual loss and blindness amongst patients. Image quality is an attribute of an image that relates to an assumed image degradation as compared to an ideal image reference. Good quality fundus images are essential for better manual and/or computer-aided screening and detection of eye diseases. Poor quality fundus images, for example, too dark, too bright, blurred or hazy images, may result in incorrect and false prediction of the eye diseases. This may cause critical consequences to the patient.
- the present invention discloses a fundus image quality assessment system.
- the fundus image quality assessment system comprises a storage unit adapted to store a training fundus image dataset; a generator adapted to generate the training fundus image dataset and a ground-truth file comprising: partitioning each of the training fundus images into a predefined number of elements based on a structure, wherein the partitioned elements comprise a primary element and one or more secondary elements; analyzing the primary element with each of the secondary elements sequentially for the partitioned training fundus image; and determining a quality value label of the analyzed training fundus image; a quality assessment means adapted to train a convolutional network based on the generated training fundus image dataset and the ground-truth file; compute the quality value of an input fundus image; and assess the input fundus image based on the computed quality value, a user defined threshold and/or an image capture device characteristics.
- the quality value label associated with a training fundus image in the training fundus image dataset is either 'good' or 'bad'.
- the quality value label of the training fundus image is based on the gradable efficiency of the training fundus image.
- the ground-truth file comprises the quality value label and a training fundus image identifier for each of the training fundus image.
- the quality factors are, for example, darkness, light, contrast, color accuracy, tone reproduction, distortion, an exposure accuracy, sharpness, noise, lens flare, etc.
- the training fundus image identifier of a training fundus image is, for example, a name or an identity assigned to the training fundus image.
- the user defined threshold is a user defined parameter to vary the quality value label of the input fundus image.
- the image capture device characteristics of an image capture device is a resolution, an illumination factor, a field of view or the like.
- the image capture device is, for example, a fundus camera, a camera attached to a smartphone, etc.
- Figure 1 illustrates a block diagram of a fundus image quality assessment system in accordance with the invention;
- Figure 2 exemplarily illustrates a convolutional network of a quality assessment means to compute a quality value of an input fundus image; and
- Figure 3 illustrates a flowchart for determination of the quality value of the input fundus image by the fundus image quality assessment system in accordance with the invention.
- FIG. 1 illustrates a block diagram of a fundus image quality assessment system 1000 in accordance with the invention.
- the fundus image quality assessment system 1000 comprises a storage unit 107 adapted to store a training fundus image dataset.
- the fundus image quality assessment system 1000 comprises a generator 101 adapted to generate the training fundus image dataset.
- the training fundus image dataset comprises a plurality of training fundus images.
- the generator 101 performs the following steps for each of the training fundus images.
- the generator 101 partitions a training fundus image into a predefined number of elements based on a structure, for example, a square grid, a hexagonal grid, a circular grid, an irregular shaped grid, etc.
- the partitioned elements comprise a primary element and one or more secondary elements.
- the generator 101 analyzes the primary element with each of the secondary elements sequentially.
- the generator 101 determines a quality value label of the training fundus image based on the analysis.
- the fundus image quality assessment system 1000 comprises a quality assessment means 105 adapted to train a convolutional network based on the generated training fundus image dataset.
- the quality assessment means 105 is adapted to identify the quality value of an input fundus image.
- the quality assessment means 105 is adapted to assess the input fundus image based on the identified quality value of the input fundus image, a user defined threshold and/or an image capture device characteristics.
- the user defined threshold is a user defined parameter to vary the quality value label of the input fundus image.
- the quality value label defines the quality value of the input fundus image.
- the image capture device characteristics is a resolution, an illumination factor, a field of view or the like.
- the image capture device characteristics defines the characteristics of an image capture device.
- the image capture device is, for example, a fundus camera, a camera attached to a smartphone, etc. The image capture device is used to capture the input fundus image.
- the input fundus image, herein, refers to a two-dimensional array of digital image data; however, this is merely illustrative and not limiting of the scope of the invention.
- a training fundus image is also a two-dimensional array of digital image data and is used for the purpose of training the fundus image quality assessment system 1000.
- the term 'training' generally refers to a process of developing the fundus image quality assessment system 1000 for the classification of the fundus images based on the quality value.
- the input fundus image is used for subsequent assessment by the fundus image quality assessment system 1000.
- the storage unit 107 is, for example, a database to store a structured collection of data.
- the storage unit 107 may be an internal part of the fundus image quality assessment system 1000.
- the storage unit 107 may be remotely located and accessed via a network.
- the storage unit 107 may be, for example, removable and/or non-removable data storage such as a tape, a magnetic disk, an optical disk, a flash memory card, etc.
- the storage unit 107 may comprise, for example, random access memory (RAM), read only memory (ROM), electrically erasable programmable read-only memory (EEPROM), a digital versatile disk (DVD), a compact disk (CD), a flash memory, a magnetic tape, a magnetic disk storage, or any combination thereof that can be used to store and access information and is a part of the fundus image quality assessment system 1000.
- the storage unit 107 stores the training fundus image dataset.
- the training fundus image dataset comprises multiple training fundus images.
- Each training fundus image has a quality value label associated with the training fundus image.
- the quality value label and a training fundus image identifier for each training fundus image are stored in a ground-truth file in the storage unit 107.
- the quality value label represents a quality value of the training fundus image.
- the quality value is a measure of perceived image degradation as compared to an ideal image reference based on amounts of a plurality of quality factors.
- the quality factors are, for example, darkness, light, contrast, color accuracy, tone reproduction, distortion, an exposure accuracy, sharpness, noise, lens flare, etc.
- the training fundus image identifier of a training fundus image is, for example, a name or an identity assigned to the training fundus image.
- the quality value label associated with a training fundus image in the training fundus image dataset is either 'good' or 'bad'.
- a training fundus image with the quality value label as 'good' indicates the quality value of the training fundus image with all the quality factors above a quality threshold.
- the training fundus image with the quality value label as 'bad' indicates the quality value of the training fundus image with a minimum number of the quality factors below the quality threshold.
- the quality value label may also be termed as either 'low-quality' or 'high-quality' based on the quality value of the training fundus image with a number of quality factors above/below a quality threshold.
- the quality value label may be of five levels - 'bad', 'poor', 'fair', 'good' and 'excellent'.
- the quality value label may be a numeric value representing the degree of quality of the training fundus image based on the values of each of the associated quality factors. The quality value and the associated quality value label of the training fundus image is based on the gradable efficiency of the training fundus image.
- the quality value label defines the quality value and thus the terms 'quality value' and 'quality value label' may be used interchangeably herein. It will be appreciated that the quality value labels are merely exemplary, and, for example, other labels could be selected for different scenarios.
- the training fundus images in the training fundus image dataset are used for training the fundus image quality assessment system 1000 to assess the input fundus image subsequently.
- the training fundus images are collected from one or more input devices.
- the input device is, for example, a camera incorporated into a mobile device such as a smartphone, a server, a network of personal computers, or simply a personal computer, a mainframe, a tablet computer, etc.
- the training fundus images are located in the training fundus image dataset which is stored in the storage unit 107 of the fundus image quality assessment system 1000.
- the generator 101 generates the training fundus image dataset.
- the input to the generator 101 is the training fundus images stored in the storage unit 107.
- the generator 101 performs the following steps on each training fundus image.
- the generator 101 partitions the training fundus image into a predefined number of elements based on a structure.
- the structure is one of a square grid, a hexagonal grid, a circular grid, an irregular shaped grid, or the like.
- the structure is partitioned into elements which may be symmetrical or asymmetrical in shape.
- the partitioned elements comprise a primary element and one or more secondary elements.
- the primary element and the one or more secondary elements together constitute the structure.
- the generator 101 chooses any random element among the partitioned elements as the primary element.
- the remaining partitioned elements are the secondary elements.
- the generator 101 partitions the training fundus image such that the structure comprises at least two elements.
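- As an illustration of this partitioning step, the following is a minimal Python sketch, assuming a square grid and a NumPy image array; the grid size and the random choice of the primary element are illustrative, not prescribed by the description.

```python
import random
import numpy as np

def partition_image(image: np.ndarray, rows: int = 3, cols: int = 3):
    """Partition a fundus image into a grid of elements.

    Returns the list of elements and the index of a randomly chosen
    primary element; the remaining elements are the secondary elements.
    """
    h, w = image.shape[:2]
    elements = []
    for r in range(rows):
        for c in range(cols):
            elements.append(image[r * h // rows:(r + 1) * h // rows,
                                  c * w // cols:(c + 1) * w // cols])
    primary_index = random.randrange(len(elements))  # any random element
    return elements, primary_index
```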
- the generator 101 analyzes the primary element with each of the secondary elements sequentially. That is, the generator 101 analyzes two of the partitioned elements at a time. One of the elements being analyzed is the primary element and the other is one of the remaining secondary elements. The generator 101 considers, for example, a neighboring element of the primary element as a first secondary element to start the analysis. The generator 101 considers the next secondary element in sequence after each analysis.
- the generator 101 determines an element-quality value for each of the secondary elements and the primary element.
- the element-quality value for a partitioned element represents a measure of perceived element-image degradation as compared to an ideal image reference based on amounts of a plurality of quality factors.
- a label associated with the element-quality value represents the quality measure of the element.
- the label assigned to the element-quality value is chosen from one of the multiple quality value labels which can be assigned to the training fundus image. For example, the label for a partitioned element is one of 'good' or 'bad' with the quality value labels assignable for the training fundus image being 'good' and 'bad'.
- the element-quality value is based on the quality factors for the corresponding element.
- the quality factors are, for example, darkness, light, contrast, color accuracy, tone reproduction, distortion, exposure accuracy, sharpness, noise, lens flare, etc. For example, when all the quality factors for the element are above a threshold, the label associated with the element-quality value is 'good'. Similarly, when a minimum number of quality factors for the element are below a threshold, the label associated with the element-quality value is 'bad'.
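- The description does not specify how each quality factor is measured. The sketch below assumes two common proxies, variance of the Laplacian for sharpness and mean gray level for brightness, purely to make the element-quality labeling concrete; the thresholds are hypothetical.

```python
import cv2
import numpy as np

SHARPNESS_THRESHOLD = 100.0   # hypothetical: variance of the Laplacian
BRIGHTNESS_RANGE = (40, 220)  # hypothetical: acceptable mean gray level

def element_quality(element: np.ndarray) -> str:
    """Label one partitioned element 'good' or 'bad' from its quality factors.

    Sharpness and brightness stand in for the fuller set of factors
    (contrast, noise, lens flare, ...) named in the description.
    """
    gray = cv2.cvtColor(element, cv2.COLOR_BGR2GRAY)
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
    brightness = gray.mean()
    ok = (sharpness >= SHARPNESS_THRESHOLD
          and BRIGHTNESS_RANGE[0] <= brightness <= BRIGHTNESS_RANGE[1])
    return 'good' if ok else 'bad'
```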
- the generator 101 determines the quality value label of the training fundus image based on the analysis.
- the generator 101 stores the quality value label of the training fundus image in the ground-truth file along with the training fundus image identifier in the storage unit 107.
- if the generator 101 determines the element-quality value of the primary element or the secondary element under analysis as 'bad' and detects a presence of a region of interest in the determined 'bad' element under analysis, then the generator 101 terminates the analysis.
- the region of interest refers to an optic disc and/or a macula of the fundus.
- the generator 101 detects the optic disc and/or the macula in the training fundus image using one or more of the known image processing algorithms.
- the generator 101 determines the quality value and thus the associated quality value label of the training fundus image as 'bad'.
- the generator 101 is adapted to determine the quality value label of the training fundus image as 'good' based on the following steps.
- if the generator 101 determines the element-quality value of the primary element or a secondary element under analysis as 'bad' and detects an absence of the region of interest in the analyzed 'bad' element, the generator 101 considers the analyzed 'bad' element as a new primary element. The generator 101 then repeats the above steps one or more times, until each of the remaining secondary elements is analyzed with the new primary element. If the generator 101 encounters a secondary element under analysis with a 'bad' element-quality value, then the generator 101 terminates the analysis. Otherwise, the generator 101 continues and completes the analysis with the remaining secondary elements and determines the quality value of the training fundus image as 'good'.
- a minimum of two elements with an element-quality value of 'bad' is required to determine the quality value of the training fundus image as 'bad'.
- a minimum of two 'bad' elements makes the training fundus image non-gradable.
- the generator 101 marks the quality value label of the training fundus image as 'good'.
- the generator 101 determines the quality value label of the training fundus image as 'good' when the element-quality value of each of the partitioned elements is 'good'. In other words, the generator 101 starts the analysis with the primary element and one of the secondary elements and continues to analyze each of the secondary elements sequentially with the primary element and to find no 'bad' element-quality value elements in the structure. The generator 101 then determines the quality value label of the training fundus image as 'good'.
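- Taken together, the analysis rules above admit a compact sketch. Building on the element_quality sketch above, the following is one possible reading; contains_roi is an assumed optic disc/macula detector, which the description defers to known image processing algorithms.

```python
def label_training_image(elements, primary_index, contains_roi) -> str:
    """Determine the image-level quality value label from element analysis.

    contains_roi(element) is an assumed detector for the optic disc
    and/or macula. A 'bad' element holding the region of interest, or a
    second 'bad' element anywhere, makes the whole image 'bad'.
    """
    bad_seen = element_quality(elements[primary_index]) == 'bad'
    if bad_seen and contains_roi(elements[primary_index]):
        return 'bad'  # 'bad' primary element with the ROI: terminate
    for i, element in enumerate(elements):
        if i == primary_index:
            continue
        if element_quality(element) == 'bad':
            if contains_roi(element):
                return 'bad'  # 'bad' element with the ROI: terminate
            if bad_seen:
                return 'bad'  # minimum of two 'bad' elements: non-gradable
            bad_seen = True   # this element becomes the new primary
    return 'good'
```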
- in an example, the training fundus image is partitioned into a square grid of nine equal elements, and the generator 101 randomly considers the top left corner element as the primary element.
- the remaining eight elements are the secondary elements.
- the elements are numbered row-wise from 1-9 starting from the first row.
- consider three quality factors: sharpness, brightness and noise. Consider the sharpness of the middle row to be low. That is, the 4th, 5th and 6th elements are of low sharpness, with the 5th element comprising the optic disc and macula of the fundus.
- all the quality factors are to be above a threshold for an element-quality value to be 'good'.
- the sharpness, brightness and noise values of the top row and the bottom row are such that the quality value of these two rows is 'good'.
- the sharpness of the middle row is less than the threshold and the other two quality factors are above the threshold.
- the generator 101 analyses two elements at a time. First, the generator 101 analyses the 1st element, which is the primary element, and the 2nd element, which is a secondary element and a horizontally neighboring element of the 1st element. The generator 101 determines the element-quality value of the 1st and the 2nd elements.
- the generator 101 determines the element-quality value of the 1st and the 2nd elements as 'good' since all the quality factors are above the threshold.
- the generator 101 next analyses the 1st and the 3rd elements.
- the generator 101 determines the element-quality value of the 1st and the 3rd elements as 'good' since all the quality factors are above the threshold.
- the generator 101 further analyses the 1st and the 4th elements.
- the generator 101 determines the element-quality value as 'good' for the 1st element and the element-quality value as 'bad' for the 4th element. Since the 4th element is of low sharpness, the quality factor for sharpness is below the threshold.
- the generator 101 checks for the presence of the region of interest in the 4th element since the element-quality value is 'bad'.
- the generator 101 does not detect the presence of the region of interest in the 4th element based on standard image processing algorithms.
- the generator 101 now considers the 4th element as the new primary element.
- the generator 101 continues the analysis with the 4th element and the next sequential secondary element, that is, the 5th element.
- the generator 101 analyses the 4th and the 5th elements.
- the generator 101 determines the element-quality value as 'bad' for the 5th element. Since the 5th element is of low sharpness, the quality factor for sharpness is below the threshold. Now, the generator 101 checks for the presence of the region of interest in the 5th element since the element-quality value is 'bad'.
- the generator 101 detects the presence of the optic disc and macula in the 5th element based on standard image processing algorithms. Since the 5th element's element-quality value is 'bad' and it comprises the optic disc and the macula, the generator 101 terminates the analysis.
- the generator 101 determines the quality value and the associated quality value label of the training fundus image as 'bad'.
- an annotator generates the training fundus image dataset and the ground-truth file by classifying each of the training fundus images based on the quality value using an annotation platform.
- the annotation platform is a graphical user interface (GUI) used by the annotator to interact with the fundus image quality assessment system 1000.
- the annotator uses the annotation platform to access the training fundus images.
- the annotator is usually a trained specialist in accurately recognizing the quality of a training fundus image by partitioning the training fundus image into the predefined number of elements based on the structure.
- the annotator manually analyses the partitioned elements based on the element-quality values of each of the elements and the location of the region of interest in the partitioned elements. For example, the annotator divides the training fundus image into nine equal elements and analyses each of the elements to determine the quality value of the training fundus image. The annotator considers multiple quality factors comprising the presence of one or more artifacts while analyzing the partitioned elements to finally determine the quality value label of the training fundus image. The annotator determines the element-quality value of the elements to determine the quality value of the training fundus image.
- the annotator determines the quality value of the training fundus image as 'bad'.
- the annotator considers a minimum of two elements with 'bad' element-quality value with an absence of the region of interest to determine the quality value of the training fundus image as 'bad'. If the annotator determines the element-quality value of all the elements as 'good', then the annotator classifies the quality value of the training fundus image as 'good'.
- the annotator annotates the same set of training fundus images which is used by the fundus image quality assessment system 1000 to generate the training fundus image dataset and the ground-truth file.
- the annotator annotates a different set of training fundus images and adds the annotated training fundus images information to the training fundus image dataset and the ground-truth file.
- the quality value labels based on which the training fundus images are annotated remains the same.
- the generated training fundus image dataset and the ground-truth file are an input to a pre-processing means 102.
- the annotator annotated training fundus image dataset and the corresponding ground-truth file are an input to the pre-processing means 102.
- the generated training fundus image dataset and the ground-truth file from the generator 101 and annotator annotated training fundus images and the corresponding ground-truth file are combined and fed as an input to the pre-processing means 102.
- the pre-processing means 102 processes each of the training fundus images in the training fundus image dataset. For each training fundus image, the pre-processing means 102 performs the following steps.
- the preprocessing means 102 separates any text matter present at the border of the training fundus image.
- the pre-processing means 102 adds a border to the training fundus image with border pixel values as zero.
- the pre-processing means 102 increases the size of the training fundus image by a predefined number of pixels, for example, 20 pixels width and height. The additional pixels added are of a zero value.
- the preprocessing means 102 next converts the training fundus image from an RGB color image to a grayscale image.
- the pre-processing means 102 then binarizes the training fundus image using histogram analysis.
- the pre-processing means 102 applies repetitive morphological dilation with a rectangular element of size [5, 5] to smooth the binarized training fundus image.
- the pre-processing means 102 acquires all connected regions, such as the retina and text matter, of the smoothed training fundus image to separate text matter present in the training fundus image from a foreground image.
- the pre-processing means 102 determines the largest region among the acquired connected regions as the retina.
- the retina is assumed to be the connected element with the largest region.
- the pre-processing means 102 calculates a corresponding bounding box for the retina.
- the pre-processing means 102 thus identifies the retina from the training fundus image.
- once the pre-processing means 102 identifies the retina in the training fundus image, the pre-processing means 102 further blurs the training fundus image using a Gaussian filter.
- the pre-processing means 102 compares an image width and an image height of the blurred training fundus image based on Equation 1.
- the pre-processing means 102 calculates a maximum pixel value of a left half, a maximum pixel value of a right half and a maximum background pixel value for the blurred training fundus image when the image width and the image height of the blurred identified retina satisfies the Equation 1.
- the maximum background pixel value (Max_background pixel value) is given by the below Equation 2.
- the term 'max_pixel_left' in Equation 2 is the maximum pixel value of the left half of the blurred identified retina.
- the term 'max_pixel_right' in Equation 2 is the maximum pixel value of the right half of the blurred training fundus image.
- Max_background_pixel_value = max(max_pixel_left, max_pixel_right) — Equation 2
- the pre-processing means 102 further extracts foreground pixel values from the blurred training fundus image by considering pixel values which satisfy the below Equation 3.
- the pre-processing means 102 calculates a bounding box using the extracted foreground pixel values from the blurred training fundus image.
- the preprocessing means 102 processes the bounding box to obtain a resized image using cubic interpolation of shape [256, 256, 3].
- the pre-processing means 102 uses this final resized image for training the convolutional network. This is the pre-processed training fundus image which is the output obtained from the pre-processing means 102.
- the pre-processing means 102 stores the pre-processed training fundus images in a pre-processed training fundus image dataset.
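- A hedged OpenCV sketch of this pre-processing pipeline is given below. The 20-pixel zero border, the [5, 5] dilation element and the 256 x 256 cubic resize follow the description; Otsu thresholding stands in for the unspecified histogram analysis, and the Gaussian blur and Equations 1-3 steps are omitted since the equation bodies are not reproduced in this text.

```python
import cv2
import numpy as np

def preprocess(img: np.ndarray) -> np.ndarray:
    """Pre-process a fundus image along the lines described above."""
    # Zero-valued border of 20 pixels on every side.
    img = cv2.copyMakeBorder(img, 20, 20, 20, 20,
                             cv2.BORDER_CONSTANT, value=0)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Binarization; Otsu stands in for the unspecified histogram analysis.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Repetitive morphological dilation with a [5, 5] rectangular element.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    smooth = cv2.dilate(binary, kernel, iterations=3)
    # The largest connected region is assumed to be the retina.
    _, _, stats, _ = cv2.connectedComponentsWithStats(smooth)
    retina = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    x, y = stats[retina, cv2.CC_STAT_LEFT], stats[retina, cv2.CC_STAT_TOP]
    w, h = stats[retina, cv2.CC_STAT_WIDTH], stats[retina, cv2.CC_STAT_HEIGHT]
    # Crop the retina bounding box and resize to [256, 256, 3] cubically.
    return cv2.resize(img[y:y + h, x:x + w], (256, 256),
                      interpolation=cv2.INTER_CUBIC)
```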
- the ground-truth file associated with the training fundus image dataset holds good even for the pre-processed training fundus image dataset.
- a segregation means 103 divides the pre-processed training fundus image dataset into two sets - a learning set and a validation set.
- the pre-processed training fundus images in the learning set are termed learning fundus images and the pre-processed training fundus images in the validation set are termed validation fundus images for simplicity.
- the learning set is used by the convolutional network to learn to assess the learning fundus images based on the quality value label associated with each of the learning fundus images.
- the validation set is typically used to test the accuracy of the convolutional network.
- the segregation means 103 transmits the learning set to an augmentation means 104 and the validation set to the quality assessment means 105.
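- A minimal sketch of the segregation step follows; the 80/20 split ratio and the fixed seed are assumptions, as the description does not fix either.

```python
import random

def segregate(images, labels, validation_fraction=0.2, seed=7):
    """Split the pre-processed dataset into a learning set and a validation set."""
    indices = list(range(len(images)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - validation_fraction))
    learn, valid = indices[:cut], indices[cut:]
    return ([images[i] for i in learn], [labels[i] for i in learn],
            [images[i] for i in valid], [labels[i] for i in valid])
```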
- the input to the augmentation means 104 is the learning set from the segregation means 103.
- the augmentation means 104 randomly shuffles the learning fundus images to divide the learning set into a plurality of batches. Each batch is a collection of a predefined number of learning fundus images.
- the augmentation means 104 randomly samples each batch of learning fundus images.
- the augmentation means 104 processes each batch of the learning fundus images using affine transformations.
- the augmentation means 104 translates and rotates the learning fundus images in the batch randomly based on a coin flip analogy.
- the augmentation means 104 also adjusts the color and brightness of each of the learning fundus images in the batch randomly based on the results of the coin flip analogy.
- the output of the augmentation means 104 is batches of augmented learning fundus images, which are an input to the quality assessment means 105.
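- The coin-flip augmentation might look like the following sketch; the transform ranges (rotation angle, translation offsets, brightness shift) are illustrative, not taken from the description.

```python
import random
import numpy as np
import cv2

def augment_batch(batch):
    """Randomly translate, rotate and re-light a batch of learning fundus images."""
    out = []
    for img in batch:
        h, w = img.shape[:2]
        if random.random() < 0.5:  # coin flip: rotation
            m = cv2.getRotationMatrix2D((w / 2, h / 2),
                                        random.uniform(-15, 15), 1.0)
            img = cv2.warpAffine(img, m, (w, h))
        if random.random() < 0.5:  # coin flip: translation
            m = np.float32([[1, 0, random.randint(-10, 10)],
                            [0, 1, random.randint(-10, 10)]])
            img = cv2.warpAffine(img, m, (w, h))
        if random.random() < 0.5:  # coin flip: brightness adjustment
            img = cv2.convertScaleAbs(img, alpha=1.0,
                                      beta=random.randint(-25, 25))
        out.append(img)
    return out
```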
- the quality assessment means 105 receives batch wise augmented learning fundus images from the augmentation means 104.
- the quality assessment means 105 communicates with the storage unit 107 to access the corresponding quality value labels associated with the augmented learning fundus images.
- the quality assessment means 105 trains the convolutional network using the batches of augmented learning fundus images.
- the convolutional network is a class of deep artificial neural networks that can be applied to analyzing visual imagery.
- the convolutional network comprising 'n' convolutional stacks applies a convolution operation to the input and passes an intermediate result to a next layer.
- Each convolutional stack comprises a plurality of convolutional layers.
- a first convolution stack is configured to convolve pixels from an input with a filter to generate a first feature map.
- the first convolutional stack also comprises a first subsampling layer configured to reduce a size and variation of the first feature map.
- the first convolutional layer of the first convolutional stack is configured to convolve pixels from the input with a plurality of filters to generate the first feature stack with multiple feature maps.
- the first convolutional stack passes an intermediate result to the next layer.
- each convolutional stack comprises a sub-sampling layer configured to reduce a size (width and height) of the feature stack.
- the segregation means 103 groups the validation fundus images of the validation set into a plurality of batches. Each batch comprises multiple validation fundus images.
- the segregation means 103 transmits batches of validation fundus images as an input to the quality assessment means 105.
- the quality assessment means 105 receives batch wise segregated validation fundus images from the segregation means 103.
- the validation fundus images are not augmented and are fed to the quality assessment means 105.
- the convolutional network validates each of the validation fundus images in each batch of the validation set. The result of the validation is compared against a corresponding quality value label of the validation fundus image by referring to the ground-truth file to evaluate a convolutional network performance for the batch of validation set.
- the quality assessment means 105 optimizes the convolutional network parameters using an optimizer, for example, a Nadam optimizer which is an Adam optimizer with Nesterov Momentum.
- the optimizer iteratively optimizes the parameters of the convolutional network during multiple iterations using the learning set.
- each iteration refers to a batch of the learning set.
- the quality assessment means 105 evaluates a convolutional network performance after a predefined number of iterations on the validation set.
- each iteration refers to a batch of the validation set.
- the quality assessment means 105 trains the convolutional network based on the augmented learning set and tests the convolutional network based on the segregated validation set. Upon completion of training and validation of the convolution network based on the convolutional network performance, the fundus image quality assessment system 1000 is ready to assess the input fundus image based on the quality of the input fundus image.
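- One way to realize this training step with Keras is sketched below; build_network is the hypothetical constructor sketched alongside Figure 2 further below, and the batch size, epoch count and label arrays are assumptions.

```python
import numpy as np
import tensorflow as tf

# build_network() is the hypothetical constructor sketched with Figure 2
# below; any Keras model ending in a two-way softmax fits this loop.
model = build_network()
model.compile(
    optimizer=tf.keras.optimizers.Nadam(),  # Adam with Nesterov momentum
    loss='categorical_crossentropy',
    metrics=['accuracy'],
)

# Augmented learning batches train the network; the un-augmented
# validation set measures convolutional network performance.
model.fit(
    np.stack(augmented_learning_images),                # hypothetical arrays
    tf.keras.utils.to_categorical(learning_labels, 2),  # 'good'/'bad' as 0/1
    batch_size=32,                                      # assumed batch size
    epochs=10,                                          # assumed budget
    validation_data=(np.stack(validation_images),
                     tf.keras.utils.to_categorical(validation_labels, 2)),
)
```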
- the pre-processing means 102 of the fundus image quality assessment system 1000 receives the input fundus image from one of the input devices, for example, a fundus camera.
- the input fundus image is the fundus image of a patient which is captured using a camera of a smart phone.
- the pre-processing means 102 processes the input fundus image similar to that of the training fundus image.
- the preprocessed input fundus image is the output of the preprocessing means 102.
- the preprocessed input fundus image from the preprocessing means 102 is the input to a test time augmentation means 108.
- the test time augmentation means 108 converts the preprocessed input fundus image into a plurality of test time images, for example, twenty test time images, using deterministic augmentation.
- the test time images of the preprocessed input fundus image are, for example, duplicate versions of the preprocessed input fundus image.
- the test time augmentation means 108 follows the same process as the augmentation means 104 to augment the test time images of the preprocessed input fundus image, except that the augmentations are deterministic.
- the output of the test time augmentation means 108 are the deterministically augmented twenty test time images of the preprocessed input fundus image.
- the output of the test time augmentation means 108 is fed as an input to the quality assessment means 105.
- the quality assessment means 105 uses the convolutional network to obtain the quality value associated with the input fundus image.
- the convolutional network comprising 'n' convolutional stacks processes each of the deterministically augmented twenty test time images of the preprocessed input fundus image. The predicted probabilities of the twenty test time images are averaged to conclude a final prediction result.
- the output of the convolutional network provides the probability for each of the quality value labels associated with the input fundus image.
- the convolutional network provides a numeric value within the range [0, 1] for each of the quality value labels associated with the input fundus image.
- Each of the quality value labels in turn represents a corresponding quality value associated with the input fundus image.
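- The test time averaging can be sketched as follows; deterministic_augment is an assumed counterpart of the augmentation means 104 that applies a fixed transform per view index.

```python
import numpy as np

def predict_with_tta(model, image, n_views=20):
    """Average the network's predictions over deterministic test time views."""
    views = np.stack([deterministic_augment(image, k)  # assumed helper
                      for k in range(n_views)])
    probabilities = model.predict(views)  # shape (n_views, 2)
    return probabilities.mean(axis=0)     # final ['goodness', 'badness'] result
```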
- the quality assessment means 105 also considers the user defined threshold and/or the image capture device characteristics to assess the quality value of the input fundus image.
- the output of the quality assessment means 105 is one of the quality value labels based on the probability values provided by the convolutional network and the user defined threshold and the image capture device characteristics. In an example, equal weightage is provided to the probability values provided by the convolutional network, the user defined threshold and the image capture device characteristics for assessing the quality value of the input fundus image.
- the quality assessment means 105 assesses whether the preprocessed input fundus image is, for example, either 'good' or 'bad', based on the probability values provided by the convolutional network, the user defined threshold and/or the image capture device characteristics.
- the user defined threshold is a gradable quality value of the input fundus image as defined by the user.
- the user defined threshold is a numeric value within the range of [0, 1], where 0 defines a least value and 1 defines a best value of the gradable quality value of the input fundus image as defined by the user.
- the quality assessment means 105 of the system 1000 receives the user defined threshold from the user of the system 1000 via the input device.
- the user defined threshold is variable based on user requirements.
- the user of the system 1000 defines the user defined threshold based on the user's gradable efficiency.
- the quality value of the input fundus image may be varied by varying the user defined threshold based on the user's grading experience and efficiency. This in turn increases the flexibility of the system 1000 to cater to the different needs of experienced and novice medical practitioners during evaluation of the input fundus image.
- the fundus image quality assessment system 1000 considers the image capture device characteristics of an image capture device as one of the parameters to assess the quality of the input fundus image.
- the image capture device characteristics is a resolution, an illumination factor, a field of view or any combination thereof.
- the image capture device is, for example, a fundus camera, a camera attached to a smartphone, etc., used to capture the input fundus image.
- the fundus image quality assessment system 1000 considers a manufacturer and version of the image capture device to determine a predefined score for the image capture device characteristics of the image capture device. This predefined score for the image capture device characteristics is used to assess the quality of the input fundus image.
- the predefined score for the image capture device characteristics denotes a superiority of the image capture device characteristics.
- the predefined score for the image capture device characteristics is a numeric value within the range of [0, 1], where 0 defines a least value and 1 defines a highest value of the predefined score.
- the predefined score for the image capture device characteristics for multiple manufacturers of image capture device is initially stored in the storage unit 107 by an operator of the fundus image quality assessment system 1000. By considering the image capture device characteristics of an image capture device to assess the quality of the input fundus image, the flexibility of the system 1000 is increased, thereby providing customized results for the input fundus image captured using the image capture device of multiple manufacturers.
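- One illustrative reading of the equal-weightage example above is sketched below; the description does not pin down the exact combination formula, so the averaging and the 0.5 cut-off are assumptions.

```python
def assess_quality(network_good_probability: float,
                   user_threshold: float,
                   device_score: float) -> str:
    """Combine the network output, user defined threshold and device score.

    All three inputs lie in [0, 1]; the equal-weight average and the 0.5
    cut-off are assumptions, not taken from the description.
    """
    combined = (network_good_probability + user_threshold + device_score) / 3.0
    return 'good' if combined >= 0.5 else 'bad'
```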
- the quality assessment means 105 assesses the input fundus image based on the factors - the probability values provided by the convolutional network, the user defined threshold and/or the image capture device characteristics.
- the quality assessment means 105 assigns the quality value label to the input fundus image based on the assessment.
- the quality value label defines the quality measure of the input fundus image for further grading of the input fundus image.
- the user defined threshold is user defined to increase flexibility of the system 1000.
- the user defined threshold is the variable factor which may be used to vary the quality value label of the input fundus image to conveniently suit the requirements of the user, for example, medical practitioner.
- the user defined threshold and/or the image capture device characteristics may not be considered, and only the probability values provided by the convolutional network are used by the quality assessment means 105 to assess the quality value of the input fundus image.
- Figure 2 exemplarily illustrates the convolutional network of the quality assessment means 105 to compute the quality value of the input fundus image.
- the deterministically augmented twenty test time images of the preprocessed input fundus image are the input to a first convolutional stack (CS1) of the convolutional network which is a part of the quality assessment means 105.
- Each of the deterministically augmented twenty test time images of the preprocessed input fundus image is processed by the convolutional network.
- the deterministically augmented test time image is, for example, represented as a matrix of width 224 pixels and height 224 pixels with 3 channels. That is, the deterministically augmented test time image is a representative array of pixel values of 224 x 224 x 3.
- the first convolution stack (CS1) is configured to convolve pixels from the deterministically augmented test time image with a filter to generate a first feature map.
- the first convolutional stack (CS1) also comprises a first subsampling layer configured to reduce a size and variation of the first feature map.
- the output of the first convolutional stack (CS1) is a reduced input fundus image represented as a matrix of width 64 pixels and height 64 pixels with n1 channels. That is, the output is a representative array of pixel values of 64 x 64 x n1.
- This is the input to a second convolutional stack (CS2), which again convolves the representative array of pixel values 64 x 64 x n1 to generate a second feature map.
- the second convolutional stack (CS2) comprises a second subsampling layer configured to reduce a size and variation of the second feature map to a representative array of pixel values of 16 x 16 x n2, n2 being the number of channels.
- the representative array of pixel values of 16 x 16 x n2 is an input to a third convolutional stack (CS3).
- the third convolutional stack (CS3) convolves the representative array of pixel values 16 x 16 x n2 to generate a third feature map.
- the third convolutional stack (CS3) comprises a third subsampling layer configured to reduce a size and variation of the third feature map to a representative array of pixel values of 8 x 8 x n3, n3 representing the number of channels.
- a fourth convolutional stack (CS4) convolves the representative array of pixel values 8 x 8 x n3 to generate a fourth feature map.
- the fourth convolutional stack (CS4) comprises a fourth subsampling layer configured to reduce a size and variation of the fourth feature map.
- a probability block (P) provides a probability of the quality value associated with the input fundus image. The predicted probabilities of the twenty test time images are averaged over to get a final prediction result. The final prediction result is the probability of the quality value of the input fundus image which are two values within a range [0,1] indicating the gradable quality measure - a 'goodness' and a 'badness' of the input fundus image.
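- The stack-by-stack shape progression of Figure 2 can be mirrored in a hedged Keras sketch. The filter counts n1 to n4 and kernel sizes are assumptions; the input is taken as the 256 x 256 x 3 pre-processed image, which makes the stated 64 x 64 x n1, 16 x 16 x n2 and 8 x 8 x n3 outputs fall out of 4x, 4x and 2x subsampling (the text's 224 x 224 x 3 test time matrix would change these factors).

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_network(n1=32, n2=64, n3=128, n4=256):
    """Sketch of the four-stack convolutional network of Figure 2."""
    return models.Sequential([
        # CS1: convolution + subsampling -> 64 x 64 x n1
        layers.Conv2D(n1, 3, padding='same', activation='relu',
                      input_shape=(256, 256, 3)),
        layers.MaxPooling2D(pool_size=4),
        # CS2: -> 16 x 16 x n2
        layers.Conv2D(n2, 3, padding='same', activation='relu'),
        layers.MaxPooling2D(pool_size=4),
        # CS3: -> 8 x 8 x n3
        layers.Conv2D(n3, 3, padding='same', activation='relu'),
        layers.MaxPooling2D(pool_size=2),
        # CS4 and the probability block P
        layers.Conv2D(n4, 3, padding='same', activation='relu'),
        layers.GlobalAveragePooling2D(),
        layers.Dense(2, activation='softmax'),  # 'goodness' and 'badness'
    ])
```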
- the quality assessment means 105 assesses the input fundus image based on the factors - the probability values provided by the convolutional network, the user defined threshold and/or the image capture device characteristics. The quality assessment means 105 assigns a quality value label to the input fundus image based on the assessment.
- the output of the quality assessment means 105 is transmitted to a display 106.
- the quality value label associated with the input fundus image and the captured input fundus image are displayed to a user via the display 106.
- a pop-up box is displayed on a screen with a set of instructions to the user to capture an alternative fundus image of a patient.
- the fundus image quality assessment system 1000 may also generate a report based on the quality value label of the input fundus image which may be communicated to the patient via an electronic mail. The report could also be stored in the storage unit 107 of the fundus image quality assessment system 1000.
- FIG. 3 illustrates a flowchart for determination of the quality value of the input fundus image by the fundus image quality assessment system 1000 in accordance with the invention.
- the generator 101 generates the training fundus image dataset and a ground-truth file.
- Step 301 comprises sub-steps 301a, 301b and 301c which the generator 101 performs on each of the training fundus images in the training fundus image dataset.
- the generator 101 partitions each of the training fundus images into a predefined number of elements based on a structure, wherein the partitioned elements comprise a primary element and one or more secondary elements.
- the structure is, for example, a square grid, a hexagonal grid, a circular grid, an irregular shaped grid, etc.
- the generator 101 analyzes the primary element with each of the secondary elements sequentially for the partitioned training fundus image.
- the generator 101 determines the quality value label of the analyzed training fundus image.
- the quality value label represents a measure of perceived image degradation as compared to an ideal image reference based on amounts of multiple quality factors.
- the quality factors are, for example, darkness, light, contrast, color accuracy, tone reproduction, distortion, an exposure accuracy, sharpness, noise, lens flare, etc.
- the training fundus image identifier of a training fundus image is, for example, a name or identity assigned to the training fundus image.
- the quality assessment means 105 trains the convolutional network based on the generated training fundus image dataset and the ground-truth file.
- the quality assessment means 105 computes the quality value of an input fundus image.
- the quality assessment means 105 assesses the input fundus image based on the identified quality value of the fundus image, the user defined threshold and/or the image capture device characteristics.
- the fundus image quality assessment system 1000 assigns the quality value label based on the identified quality value of the fundus image, the user defined threshold and/or the image capture device characteristics.
- the fundus image quality assessment system 1000 assesses the input fundus image as either 'good' or 'bad' based on the identified quality value of the fundus image, the user defined threshold and/or the image capture device characteristics.
- the input fundus image of 'good' quality value is gradable by doctors to determine a medical condition in the input fundus image.
- the input fundus image of 'bad' quality is not gradable by doctors.
- the user defined threshold may be varied to vary the quality value of the input fundus image based on the doctor's grading experience.
- the fundus image quality assessment system 1000 may further display a message to an operator of the fundus image quality assessment system 1000 to retake another fundus image of the patient in case of 'bad' quality value fundus image.
- the fundus image quality assessment system 1000, using the convolutional network, thus provides an accurate assessment of the image quality of the fundus of a patient's eye.
- the fundus image quality assessment system 1000 focuses on assessing the entire fundus image as a whole to detect the quality of the fundus image. This improves efficiency and reduces errors in identifying various medical conditions when compared to manual diagnosis of retinal images.
- the fundus image quality assessment system 1000 acts as an important tool to decide on the quality of the fundus images which in turn may be used by medical practitioner in the detection and monitoring progression of several optic nerve diseases such as diabetic retinopathy, glaucoma, macular degeneration, etc.
- the fundus image quality assessment system 1000 immediately provides a report to the operator of the system 1000 about the quality value of the captured fundus image of a patient.
- the fundus image quality assessment system 1000 instantly prompts the operator to capture another fundus image of the patient. This eliminates any delays during the analysis stage of the fundus images by doctors for any medical conditions due to unavailability of 'good' quality fundus images associated with the patient.
- the present invention described above may be configured to work in a network environment comprising a computer in communication with one or more devices.
- the present invention may be implemented by computer programmable instructions stored on one or more computer readable media and executed by a processor of the computer.
- the computer comprises the processor, a memory unit, an input/output (I/O) controller, and a display communicating via a data bus.
- the computer may comprise multiple processors to increase a computing capability of the computer.
- the processor is an electronic circuit which executes computer programs. The processor executes the instructions to assess the input fundus image.
- the memory unit for example, comprises a ROM and a RAM.
- the memory unit stores the instructions for execution by the processor.
- the storage unit 107 is the memory unit.
- the memory unit stores the training fundus image dataset and the ground-truth file.
- the memory unit may also store intermediate, static and temporary information required by the processor during the execution of the instructions.
- the computer comprises one or more input devices, for example, a keyboard such as an alphanumeric keyboard, a mouse, a joystick, etc.
- the I/O controller controls the input and output actions performed by a user.
- the data bus allows communication between modules of the computer.
- the computer directly or indirectly communicates with the devices via an interface, for example, a local area network (LAN), a wide area network (WAN), Ethernet, the Internet, a token ring, Bluetooth connectivity, or the like.
- each of the devices adapted to communicate with the computer may comprise computers with, for example, Sun® processors, IBM® processors, Intel® processors, AMD® processors, etc.
- the computer readable media comprises, for example, CDs, DVDs, floppy disks, optical disks, magnetic-optical disks, ROMs, RAMs, EEPROMs, magnetic cards, application specific integrated circuits (ASICs), or the like.
- Each of the computer readable media is coupled to the data bus.
Abstract
A fundus image quality assessment system (1000) is disclosed. The system (1000) comprises a storage unit (107) adapted to store a training fundus image dataset; a generator (101) to generate the training fundus image dataset and a ground-truth file comprising: partitioning each of the training fundus images into a predefined number of elements based on a structure, wherein the partitioned elements comprise a primary element and one or more secondary elements; analyzing the primary element with each of the secondary elements sequentially for the partitioned training fundus image; and determining a quality value label of the analyzed training fundus image; a quality assessment means (105) to train a convolutional network based on the generated training fundus image dataset and the ground-truth file; compute the quality value of an input fundus image; and assess the input fundus image based on the computed quality value, a user defined threshold, and/or an image capture device characteristics.
Description
Title of the Invention :-
A Fundus Image Quality Assessment System
Technical field of the invention
[0001] The invention relates to the field of image quality evaluation. More particularly, the invention relates to detecting the quality of a fundus image using machine learning, which in turn aids ophthalmology.
Background of the invention
[0002] Vision is an important survival attribute for a human, making the eyes one of the most vital sensory organs. Though most eye diseases are not fatal, failure of proper diagnosis and treatment of an eye disease may lead to vision loss. Some of the common eye diseases are, for example, conjunctivitis, optic nerve conditions such as glaucoma, cataract, retinal conditions, conditions due to diabetes, etc. Analysis of fundus images of a patient is a very convenient way of screening and monitoring eye diseases. Early diagnosis of eye diseases through regular screening may prevent visual loss and blindness amongst patients. Image quality is an attribute of an image that relates to an assumed image degradation as compared to an ideal image reference. Good quality fundus images are essential for better manual and/or computer-aided screening and detection of eye diseases. Poor quality fundus images, for example, too dark, too bright, blurred or hazy images, may result in incorrect and false prediction of the eye diseases. This may cause critical consequences to the patient.
[0003] In recent times, computer-aided screening systems assist doctors to improve the quality of examination of fundus images for screening of eye diseases. Machine learning (ML) algorithms on data are used to extract and evaluate information. Systems apply ML algorithms to ensure a faster mode of efficient rejection of poor quality fundus images, which enhances screening of eye diseases. But currently, the systems available for detection of the quality of fundus images involving ML algorithms are complex and of high cost. This limits the reach of medical eye screening and diagnosis to the common man. A simple and cost-effective solution involving effective use of ML algorithms, enabling the systems to access concealed insights for effective quality segregation of fundus images, is thus essential.
Summary of invention
[0004] The present invention discloses a fundus image quality assessment system. The fundus image quality assessment system comprises a storage unit adapted to store a training fundus image dataset; a generator adapted to generate the training fundus image dataset and a ground-truth file comprising: partitioning each of the training fundus images into a predefined number of elements based on a structure, wherein the partitioned elements comprise a primary element and one or more secondary elements; analyzing the primary element with each of the secondary elements sequentially for the partitioned training fundus image; and determining a quality value label of the analyzed training fundus image; a quality assessment means adapted to train a convolutional network based on the generated training fundus image dataset and the ground-truth file; compute the quality value of an input fundus image; and assess the input fundus image based on the computed quality value, a user defined threshold and/or an image capture device characteristics.
[0005] In an embodiment, the quality value label associated with a training fundus image in the training fundus image dataset is either 'good' or 'bad'. The quality value label of the training fundus image is based on the gradable efficiency of the training fundus image. The ground-truth file comprises the quality value label and a training fundus image identifier for each of the training fundus image. The quality factors are, for example, darkness, light, contrast, color accuracy, tone reproduction, distortion, an exposure accuracy, sharpness, noise, lens flare, etc. The training fundus image identifier of a training fundus image is, for example, a name or an identity assigned to the training fundus image. The user defined threshold is a user defined parameter to vary the quality value label of the input fundus image. The image capture device characteristics of an image capture device is a resolution, an illumination factor, a field of view or the like. The image capture device is, for example, a fundus camera, a camera attached to a smartphone, etc.
Brief description of the drawings
[0006] The present invention is described with reference to the accompanying figures. The accompanying figures, which are incorporated herein, are given by way of illustration only and form part of the specification together with the description to explain how to make and use the invention, in which,
[0007] Figure 1 illustrates a block diagram of a fundus image quality assessment system in accordance with the invention;
[0008] Figure 2 exemplarily illustrates a convolutional network of a quality assessment means to compute a quality value of an input fundus image; and [0009] Figure 3 illustrates a flowchart for determination of the quality value of the input fundus image by the fundus image quality assessment system in accordance with the invention.
Detailed description of the invention
[00010] Figure 1 illustrates a block diagram of a fundus image quality assessment system 1000 in accordance with the invention. The fundus image quality assessment system 1000 comprises a storage unit 107 adapted to store a training fundus image dataset. The fundus image quality assessment system 1000 comprises a generator 101 adapted to generate the training fundus image dataset. The training fundus image dataset comprises a plurality of training fundus images. The generator 101 performs the following steps for each of the training fundus images. The generator 101 partitions a training fundus image into a predefined number of elements based on a structure, for example, a square grid, a hexagonal grid, a circular grid, an irregular shaped grid, etc. The partitioned elements comprise a primary element and one or more secondary elements. The generator 101 analyzes the primary element with each of the secondary elements sequentially. The generator 101 determines a quality value label of the training fundus image based on the analysis. The fundus image quality assessment system 1000 comprises a quality assessment means 105 adapted to train a convolutional network based on the generated training fundus image dataset. The quality assessment means 105 is adapted to identify the quality value of an input fundus image. The quality assessment means 105 is adapted to assess the input fundus image based on the identified quality value of the input fundus image, a user defined threshold and/or image capture device characteristics.
[00011] The user defined threshold is a user defined parameter to vary the quality value label of the input fundus image. The quality value label defines the quality value of the input fundus image. The image capture device characteristics include a resolution, an illumination factor, a field of view, or the like, and define the characteristics of the image capture device. The image capture device is, for example, a fundus camera, a camera attached to a smartphone, etc., and is used to capture the input fundus image.
[00012] The input fundus image, herein, refers to a two-dimensional array of digital image data, however, this is merely illustrative and not limiting of the scope of the invention. A training fundus image is also a two-dimensional array of digital image data and is used for the purpose of training the fundus image quality assessment system 1000. In this invention, the term 'training' generally refers to a process of developing the fundus image quality assessment system 1000 for the classification of the fundus images based on the quality value. The input fundus image is used for subsequent assessment by the fundus image quality assessment system 1000.
[00013] The storage unit 107 is, for example, a database to store a structured collection of data. In an embodiment, the storage unit 107 may be an internal part of the fundus image quality assessment system 1000. In another embodiment, the storage unit 107 may be remotely located and accessed via a network. The storage unit 107 may be, for example, removable and/or non-removable data storage such as a tape, a magnetic disk, an optical disk, a flash memory card, etc. The storage unit 107 may comprise, for example, random access memory (RAM), read only memory (ROM), electrically erasable programmable read-only memory (EEPROM), a digital versatile disk (DVD), a compact disk (CD), a flash memory, a magnetic tape, a magnetic disk storage, or any combination thereof that can be used to store and access information as a part of the fundus image quality assessment system 1000.
[00014] The storage unit 107 stores the training fundus image dataset. The training fundus image dataset comprises multiple training fundus images, each with an associated quality value label. The quality value label and a training fundus image identifier for each of the training fundus images are stored in a ground-truth file in the storage unit 107. The quality value label represents a quality value of the training fundus image. The quality value is a measure of perceived image degradation as compared to an ideal image reference based on amounts of a plurality of quality factors. The quality factors are, for example, darkness, light, contrast, color accuracy, tone reproduction, distortion, an exposure accuracy, sharpness, noise, lens flare, etc. The training fundus image identifier of a training fundus image is, for example, a name or an identity assigned to the training fundus image.
[00015] In an embodiment, the quality value label associated with a training fundus image in the training fundus image dataset is either 'good' or 'bad'. For instance, a training fundus image with the quality value label as 'good' indicates the quality value of the training fundus image with all the quality factors above a quality threshold. Similarly, the training fundus image with the quality value label as 'bad' indicates the quality value of the training fundus image with a minimum number of the quality factors below the quality threshold. In another embodiment, the quality value label may also be termed as either 'low-quality' or 'high-quality' based on the quality value of the training fundus image with a number of quality factors
above/below a quality threshold. In another embodiment, the quality value label may be of five levels - 'bad', 'poor', 'fair', 'good' and 'excellent'. In another embodiment, the quality value label may be a numeric value representing the degree of quality of the training fundus image based on the values of each of the associated quality factors. The quality value and the associated quality value label of the training fundus image is based on the gradable efficiency of the training fundus image. The quality value label defines the quality value and thus the terms 'quality value' and 'quality value label' may be used interchangeably herein. It will be appreciated that the quality value labels are merely exemplary, and, for example, other labels could be selected for different scenarios.
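By way of illustration only, the binary labelling rule described above may be sketched in Python as follows; the factor names, the per-factor threshold and the minimum count of failing factors are illustrative assumptions, not values fixed by the invention:

```python
# Illustrative labelling rule. QUALITY_THRESHOLD and MIN_BAD_FACTORS
# are hypothetical values, not prescribed by the invention.
QUALITY_THRESHOLD = 0.5   # hypothetical per-factor quality threshold
MIN_BAD_FACTORS = 1       # hypothetical minimum count of failing factors

def quality_value_label(factors):
    """Return 'good' when every quality factor clears the threshold,
    'bad' when at least MIN_BAD_FACTORS factors fall below it."""
    failing = [name for name, score in factors.items()
               if score < QUALITY_THRESHOLD]
    return "bad" if len(failing) >= MIN_BAD_FACTORS else "good"

# Example: low sharpness alone makes the image 'bad'.
print(quality_value_label({"sharpness": 0.3, "contrast": 0.8, "noise": 0.9}))
```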
[00016] The training fundus images in the training fundus image dataset are used for training the fundus image quality assessment system 1000 to assess the input fundus image subsequently. In an embodiment, the training fundus images are collected from one or more input devices. The input device is, for example, a camera incorporated into a mobile device such as a smartphone, a server, a network of personal computers, or simply a personal computer, a mainframe, a tablet computer, etc. The training fundus images are located in the training fundus image dataset which is stored in the storage unit 107 of the fundus image quality assessment system 1000.
[00017] The generator 101 generates the training fundus image dataset. The input to the generator 101 is the training fundus images stored in the storage unit 107. The generator 101 performs the following steps on each of the training fundus images. The generator 101 partitions the training fundus image into a predefined number of elements based on a structure. The structure is one of a square grid, a hexagonal grid, a circular grid, an irregular shaped grid, or the like. The structure is partitioned into elements which may be symmetrical or asymmetrical in shape. The partitioned elements comprise a primary element and one or more secondary elements. The primary element and the one or more secondary elements together constitute the structure. The generator 101 chooses any random element among the partitioned elements as the primary element; the remaining partitioned elements are the secondary elements. The generator 101 partitions the training fundus image such that the structure comprises at least two elements.
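A minimal sketch of the partitioning step, assuming a square grid (one of the structures named above) of three rows and three columns; the grid size and the helper name are illustrative:

```python
import random
import numpy as np

def partition_square_grid(image, rows=3, cols=3):
    """Split an H x W x 3 fundus image into a rows x cols square grid and
    pick one element at random as the primary element; the remaining
    elements are the secondary elements."""
    h, w = image.shape[:2]
    elements = [image[r * h // rows:(r + 1) * h // rows,
                      c * w // cols:(c + 1) * w // cols]
                for r in range(rows) for c in range(cols)]
    primary_idx = random.randrange(len(elements))  # random primary element
    return elements, primary_idx

# Example on a dummy image: nine elements, one chosen as primary.
elements, primary_idx = partition_square_grid(np.zeros((900, 900, 3)))
print(len(elements), primary_idx)
```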
[00018] The generator 101 analyzes the primary element with each of the secondary elements sequentially. That is, the generator 101 analyzes two of the partitioned elements at a time; one of the elements being analyzed is the primary element and the other is one of the remaining secondary elements. The generator 101 considers, for example, a neighboring element of the primary element as the first secondary element to start the analysis. The generator 101 considers the next sequential secondary element after each analysis.
[00019] In an example, consider a structure comprising six rows with five symmetric rectangles in each row. The leftmost two rectangles of the first row are considered as the primary element and the secondary element respectively for the analysis by the generator 101. The generator 101 continues the analysis keeping the first rectangle constant and considering the third rectangle in the first row for the next analysis, and so on. The generator 101 navigates row-wise to consider the subsequent secondary elements for analysis.
[00020] The generator 101 determines an element-quality value for each of the secondary elements and the primary element. The element-quality value for a partitioned element represents a measure of perceived element-image degradation as compared to an ideal image reference based on amounts of a plurality of quality factors. A label associated with the element-quality value represents the quality measure of the element. The label assigned to the element-quality value is chosen from one of the multiple quality value labels which can be assigned to the training fundus image. For example, the label for a partitioned element is one of 'good' or 'bad' when the quality value labels assignable for the training fundus image are 'good' and 'bad'. The element-quality value is based on the quality factors for the corresponding element. The quality factor is, for example, darkness, light, contrast, color accuracy, tone reproduction, distortion, an exposure accuracy, sharpness, noise, lens flare, etc. For example, when all the quality factors for the element are above a threshold, the label associated with the element-quality value is 'good'. Similarly, when a minimum number of quality factors for the element are below a threshold, the label associated with the element-quality value is 'bad'.
[00021] The generator 101 determines the quality value label of the training fundus image based on the analysis. The generator 101 stores the quality value label of the training fundus image in the ground-truth file along with the training fundus image identifier in the storage unit 107.
[00022] In an embodiment, when the generator 101 determines the element-quality value of the primary element or the secondary element under analysis as 'bad' and detects a presence of the region of interest in the determined 'bad' element under analysis, the generator 101 terminates the analysis. The region of interest refers to an optic disc and/or a macula of the fundus. The generator 101 detects the optic disc and/or the macula in the training fundus image using one or more known image processing algorithms. Upon termination of the analysis, the generator 101 determines the quality value, and thus the associated quality value label, of the training fundus image as 'bad'.
[00023] In an embodiment, the generator 101 is adapted to determine the quality value label of the training fundus image as 'good' based on the following steps. When the generator 101 determines the element-quality value of the primary element or a secondary element under analysis as 'bad' and detects an absence of the region of interest in the analyzed 'bad' element, the generator 101 considers the analyzed 'bad' element as a new primary element. The generator 101 then repeats the above steps one or more times, until each of the remaining secondary elements is analyzed with the new primary element. If the generator 101 encounters a secondary element under analysis with a 'bad' element-quality value, the generator 101 terminates the analysis. If the generator 101 does not encounter a secondary element under analysis with a 'bad' element-quality value, the generator 101 continues and completes the analysis with the remaining secondary elements and determines the quality value of the training fundus image as 'good'. That is, a minimum of two elements with an element-quality value of 'bad' is required to determine the quality value of the training fundus image as 'bad'; a minimum of two 'bad' elements makes the training fundus image non-gradable.
[00024] Even if the element-quality value of one of the partitioned elements is 'bad', the absence of the region of interest in that partitioned element makes the training fundus image gradable. The training fundus image can be easily graded to identify any presence of an eye disease. In this case, the generator 101 marks the quality value label of the training fundus image as 'good'.
[00025] The generator 101 determines the quality value label of the training fundus image as 'good' when the element-quality value of each of the partitioned elements is 'good'. In other words, the generator 101 starts the analysis with the primary element and one of the secondary elements, continues to analyze each of the secondary elements sequentially with the primary element, and finds no elements with a 'bad' element-quality value in the structure. The generator 101 then determines the quality value label of the training fundus image as 'good'.
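The analysis of paragraphs [00022]-[00025] may be sketched as follows. The callables element_quality and contains_roi stand in for the per-element quality check and the optic disc/macula detector, neither of which is specified in detail here; because each element's quality is judged independently, the pairwise primary/secondary walk reduces to a single pass that tolerates at most one ROI-free 'bad' element:

```python
def label_training_image(elements, element_quality, contains_roi):
    """Return the quality value label of a partitioned training fundus
    image. element_quality(e) returns 'good' or 'bad' for one element;
    contains_roi(e) reports whether the element holds the optic disc
    and/or macula. Both callables are assumed to be supplied."""
    bad_without_roi = 0
    for element in elements:
        if element_quality(element) != "bad":
            continue
        if contains_roi(element):
            return "bad"   # a 'bad' element covers the region of interest
        bad_without_roi += 1
        if bad_without_roi >= 2:
            return "bad"   # two 'bad' elements make the image non-gradable
    return "good"
```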
[00026] In an example, consider the structure comprising two parallel horizontal lines and two vertical lines cutting across the two parallel horizontal lines such that the training fundus image is divided into nine square elements. The generator 101 randomly considers the top left corner element as the primary element. The remaining eight elements are the secondary elements. The elements are numbered row-wise from 1-9 starting from the first row. Consider three quality factors - sharpness, brightness and noise. Consider the sharpness of the middle row to be low; that is, the 4th, 5th and 6th elements are of low sharpness, with the 5th element comprising the optic disc and macula of the fundus. Consider that all the quality factors must be above a threshold for an element-quality value to be 'good'. The sharpness, brightness and noise values of the top row and the bottom row are such that the quality value of these two rows is 'good'. The sharpness of the middle row is less than the threshold and the other two quality factors are above the threshold.
[00027] The generator 101 analyses two elements at a time. First, the generator 101 analyses the 1st element, which is the primary element and the 2nd element, which is a secondary element and horizontally neighboring element of the 1st element. The generator 101 determines the element-quality value of the 1st and the 2nd elements.
[00028] The generator 101 determines the element-quality value of the 1st and the 2nd elements as 'good' since all the quality factors are above the threshold. The generator 101 next analyses the 1st and the 3rd elements. The generator 101 determines the element-quality value of the 1st and the 3rd elements as 'good' since all the quality factors are above the threshold.
[00029] The generator 101 further analyses the 1st and the 4th elements. The generator 101 determines the element-quality value as 'good' for the 1st element and as 'bad' for the 4th element; since the 4th element is of low sharpness, its quality factor for sharpness is below the threshold. The generator 101 now checks for the presence of the region of interest in the 4th element, since its element-quality value is 'bad'. The generator 101 does not detect the presence of the region of interest in the 4th element based on standard image processing algorithms. The generator 101 therefore considers the 4th element as the new primary element and continues the analysis with the 4th element and the next sequential secondary element, that is, the 5th element.
[00030] The generator 101 analyses the 4th and the 5th elements. The generator 101 determines the element-quality value as 'bad' for the 5th element, since the 5th element is of low sharpness and its quality factor for sharpness is below the threshold. The generator 101 now checks for the presence of the region of interest in the 5th element, since its element-quality value is 'bad'. The generator 101 detects the presence of the optic disc and macula in the 5th element based on standard image processing algorithms. Since the 5th element's element-quality value is 'bad' and the element comprises the optic disc and the macula, the generator 101 terminates the analysis. The generator 101 determines the quality value and the associated quality value label of the training fundus image as 'bad'.
[00031] In an embodiment, an annotator generates the training fundus image dataset and the ground-truth file by classifying each of the training fundus images based on the quality value using an annotation platform. The annotation platform is a graphical user interface (GUI) used by the annotator to interact with the fundus image quality assessment system 1000. The annotator uses the annotation platform to access the training fundus images. The annotator is usually a trained specialist in accurately
recognizing the quality of a training fundus image by partitioning the training fundus image into the predefined number of elements based on the structure.
[00032] The annotator manually analyses the partitioned elements based on the element-quality values of each of the elements and the location of the region of interest in the partitioned elements. For example, the annotator divides the training fundus image into nine equal elements and analyses each of the elements to determine the quality value of the training fundus image. The annotator considers multiple quality factors, including the presence of one or more artifacts, while analyzing the partitioned elements to finally determine the quality value label of the training fundus image. The annotator determines the element-quality value of the elements to determine the quality value of the training fundus image. If any one of the elements has a 'bad' element-quality value and comprises the region of interest, then the annotator determines the quality value of the training fundus image as 'bad'. The annotator requires a minimum of two elements with a 'bad' element-quality value and an absence of the region of interest to determine the quality value of the training fundus image as 'bad'. If the annotator determines the element-quality value of all the elements as 'good', then the annotator classifies the quality value of the training fundus image as 'good'.
[00033] In an embodiment, the annotator annotates the same set of training fundus images which is used by the fundus image quality assessment system 1000 to generate the training fundus image dataset and the ground-truth file. In another embodiment, the annotator annotates a different set of training fundus images and adds the annotated training fundus images information to the training fundus image dataset and the ground-truth file. The quality value labels based on which the training fundus images are annotated remains the same.
[00034] The generated training fundus image dataset and the ground-truth file are an input to a pre-processing means 102. In an embodiment, the annotator-annotated training fundus image dataset and the corresponding ground-truth file are the input to the pre-processing means 102. In another embodiment, the generated training fundus image dataset and ground-truth file from the generator 101 and the annotator-annotated training fundus images with the corresponding ground-truth file are combined and fed as an input to the pre-processing means 102.
[00035] The pre-processing means 102 processes each of the training fundus images in the training fundus image dataset by performing the following steps. The pre-processing means 102 separates any text matter present at the border of the training fundus image. The pre-processing means 102 adds a border to the training fundus image with border pixel values of zero, increasing the size of the training fundus image by a predefined number of pixels, for example, 20 pixels in width and height; the additional pixels have a value of zero. The pre-processing means 102 next converts the training fundus image from an RGB color image to a grayscale image. The pre-processing means 102 then binarizes the training fundus image using histogram analysis. The pre-processing means 102 applies repetitive morphological dilation with a rectangular element of size [5, 5] to smooth the binarized training fundus image. The pre-processing means 102 acquires all connected regions, such as the retina and text matter, of the smoothed training fundus image to separate text matter present in the training fundus image from the foreground image. The pre-processing means 102 determines the largest of the acquired connected regions to be the retina, since the retina is assumed to be the connected element with the largest region. The pre-processing means 102 calculates a corresponding bounding box for the retina. The pre-processing means 102 thus identifies the retina in the training fundus image.
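An illustrative sketch of these retina-localization steps using OpenCV; Otsu thresholding is used here as one form of histogram analysis, and the padding width and dilation count are assumptions:

```python
import cv2
import numpy as np

def locate_retina(img_bgr):
    """Pad the border with zeros, convert to grayscale, binarize,
    smooth by repeated dilation, and return the padded image together
    with the bounding box of the largest connected region (the retina)."""
    padded = cv2.copyMakeBorder(img_bgr, 20, 20, 20, 20,
                                cv2.BORDER_CONSTANT, value=0)
    gray = cv2.cvtColor(padded, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    smooth = cv2.dilate(binary, kernel, iterations=3)  # repetitive dilation
    _, _, stats, _ = cv2.connectedComponentsWithStats(smooth)
    # Label 0 is the background; the retina is assumed to be the largest
    # remaining connected region.
    retina = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    x, y, w, h = stats[retina, :4]  # bounding box of the retina
    return padded, (x, y, w, h)
```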
[00036] Once the pre-processing means 102 identifies the retina in the training fundus image, the pre-processing means 102 further blurs the training fundus image using a Gaussian filter. The pre-processing means 102 compares an image width and an image height of the blurred training fundus image based on Equation 1.
image width > 1.2 × (image height) — Equation 1
[00037] The pre-processing means 102 calculates a maximum pixel value of a left half, a maximum pixel value of a right half and a maximum background pixel value for the blurred training fundus image when the image width and the image height of the blurred identified retina satisfy Equation 1. The maximum background pixel value (max_background_pixel_value) is given by Equation 2 below. The term 'max_pixel_left' in Equation 2 is the maximum pixel value of the left half of the blurred identified retina. The term 'max_pixel_right' in Equation 2 is the maximum pixel value of the right half of the blurred training fundus image.
max_background_pixel_value = max(max_pixel_left, max_pixel_right) — Equation 2
[00038] The pre-processing means 102 further extracts foreground pixel values from the blurred training fundus image by considering pixel values which satisfy Equation 3 below.

all pixel values > max_background_pixel_value + 10 — Equation 3
[00039] The pre-processing means 102 calculates a bounding box using the extracted foreground pixel values of the blurred training fundus image. The pre-processing means 102 processes the bounding box to obtain a resized image of shape [256, 256, 3] using cubic interpolation. This final resized image is the pre-processed training fundus image, which is the output of the pre-processing means 102 and is used for training the convolutional network. The pre-processing means 102 stores the pre-processed training fundus images in a pre-processed training fundus image dataset. The ground-truth file associated with the training fundus image dataset holds good for the pre-processed training fundus image dataset as well.
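Equations 1-3 and the final resize may be sketched as follows; the Gaussian kernel size and the border-strip reading of 'left half'/'right half' (sampling the regions outside the retina that exist when Equation 1 holds) are assumptions:

```python
import cv2
import numpy as np

def crop_and_resize(img_bgr):
    """Blur, apply Equations 1-3 to find the foreground, and resize the
    foreground bounding box to [256, 256, 3] with cubic interpolation."""
    blurred = cv2.GaussianBlur(img_bgr, (5, 5), 0)    # assumed kernel size
    gray = cv2.cvtColor(blurred, cv2.COLOR_BGR2GRAY)
    h, w = gray.shape
    background = 0
    if w > 1.2 * h:                                    # Equation 1
        strip = w // 10                                # assumed strip width
        background = max(int(gray[:, :strip].max()),
                         int(gray[:, -strip:].max()))  # Equation 2
    mask = gray > background + 10                      # Equation 3
    ys, xs = np.nonzero(mask)
    crop = img_bgr[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    return cv2.resize(crop, (256, 256), interpolation=cv2.INTER_CUBIC)
```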
[00040] A segregation means 103 divides the pre-processed training fundus image dataset into two sets - a learning set and a validation set. Hereafter, for simplicity, the pre-processed training fundus images in the learning set are termed learning fundus images and those in the validation set are termed validation fundus images. The learning set is used to train the convolutional network to assess the learning fundus images based on the quality value label associated with each learning fundus image. The validation set is typically used to test the accuracy of the convolutional network.
[00041] The segregation means 103 transmits the learning set to an augmentation means 104 and the validation set to the quality assessment means 105. The input to the augmentation means 104 is the learning set from the segregation means 103. The augmentation means 104 randomly shuffles the learning fundus images to divide the learning set into a plurality of batches, each batch being a collection of a predefined number of learning fundus images. The augmentation means 104 randomly samples each batch of learning fundus images and processes each batch using affine transformations. The augmentation means 104 translates and rotates the learning fundus images in a batch randomly based on a coin flip analogy, and also adjusts the color and brightness of each learning fundus image in the batch randomly based on the results of the coin flip analogy. The outputs of the augmentation means 104 are batches of augmented learning fundus images, which form an input to the quality assessment means 105.
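An illustrative sketch of the coin-flip augmentation, where each transform is applied with probability 0.5; the rotation, translation and brightness ranges are assumptions:

```python
import random
import cv2
import numpy as np

def augment(img):
    """Apply each affine/photometric transform with probability 0.5,
    mirroring the coin flip analogy; parameter ranges are illustrative."""
    h, w = img.shape[:2]
    if random.random() < 0.5:   # random rotation
        m = cv2.getRotationMatrix2D((w / 2, h / 2),
                                    random.uniform(-15, 15), 1.0)
        img = cv2.warpAffine(img, m, (w, h))
    if random.random() < 0.5:   # random translation
        m = np.float32([[1, 0, random.randint(-10, 10)],
                        [0, 1, random.randint(-10, 10)]])
        img = cv2.warpAffine(img, m, (w, h))
    if random.random() < 0.5:   # random brightness adjustment
        img = cv2.convertScaleAbs(img, alpha=1.0,
                                  beta=random.randint(-20, 20))
    return img

def batches(images, batch_size):
    """Shuffle the learning set and yield fixed-size augmented batches."""
    random.shuffle(images)
    for i in range(0, len(images), batch_size):
        yield [augment(im) for im in images[i:i + batch_size]]
```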
[00042] The quality assessment means 105 receives batch-wise augmented learning fundus images from the augmentation means 104. The quality assessment means 105 communicates with the storage unit 107 to access the corresponding quality value labels associated with the augmented learning fundus images. The quality assessment means 105 trains the convolutional network using the batches of augmented learning fundus images.
[00043] In general, the convolutional network is a class of deep artificial neural networks that can be applied to analyzing visual imagery. The convolutional network, comprising 'n' convolutional stacks, applies a convolution operation to the input and passes an intermediate result to the next layer. Each convolutional stack comprises a plurality of convolutional layers. The first convolutional layer of the first convolutional stack is configured to convolve pixels from the input with a plurality of filters to generate a first feature stack with multiple feature maps. The first convolutional stack also comprises a first subsampling layer configured to reduce a size and variation of the first feature map, and passes an intermediate result to the next layer. Similarly, each convolutional stack comprises a subsampling layer configured to reduce the size (width and height) of its feature stack.
[00044] The segregation means 103 groups the validation fundus images of the validation set into a plurality of batches, each batch comprising multiple validation fundus images. The segregation means 103 transmits the batches of validation fundus images as an input to the quality assessment means 105. The quality assessment means 105 receives the batch-wise segregated validation fundus images from the segregation means 103. The validation fundus images are fed to the quality assessment means 105 without augmentation. The convolutional network validates each of the validation fundus images in each batch of the validation set. The result of the validation is compared against the corresponding quality value label of the validation fundus image, by referring to the ground-truth file, to evaluate a convolutional network performance for the batch of the validation set.
[00045] The quality assessment means 105 optimizes the convolutional network parameters using an optimizer, for example, a Nadam optimizer which is an Adam optimizer with Nesterov Momentum. The optimizer iteratively optimizes the parameters of the convolutional network during multiple iterations using the learning set. Here, each iteration refers to a batch of the learning set. The quality assessment means 105 evaluates a convolutional network performance after a predefined number of iterations on the validation set. Here, each iteration refers to a batch of the validation set.
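A minimal sketch of this optimization loop, using the Nadam optimizer available in Keras; the stand-in model, learning rate, evaluation interval and batch sources are assumptions (the actual network is the stack of convolutional stages described for Figure 2):

```python
import tensorflow as tf

# Stand-in two-class model; the actual network is the stack of
# convolutional stages described for Figure 2.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),  # 'good' / 'bad'
])
model.build((None, 224, 224, 3))
model.compile(optimizer=tf.keras.optimizers.Nadam(learning_rate=1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

EVAL_EVERY = 50   # assumed evaluation interval, in batches

def train(learning_batches, validation_batches):
    """One pass over the learning set, one batch per iteration, with
    periodic evaluation on the (unaugmented) validation batches."""
    for step, (x, y) in enumerate(learning_batches, start=1):
        model.train_on_batch(x, y)
        if step % EVAL_EVERY == 0:
            for vx, vy in validation_batches:
                model.test_on_batch(vx, vy)  # convolutional network performance
```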
[00046] Thus, the quality assessment means 105 trains the convolutional network based on the augmented learning set and tests the convolutional network
based on the segregated validation set. Upon completion of training and validation of the convolutional network based on the convolutional network performance, the fundus image quality assessment system 1000 is ready to assess the input fundus image based on its quality.
[00047] The pre-processing means 102 of the fundus image quality assessment system 1000 receives the input fundus image from one of the input devices, for example, a fundus camera. For example, the input fundus image is the fundus image of a patient captured using a camera of a smartphone. The pre-processing means 102 processes the input fundus image in the same manner as a training fundus image. The pre-processed input fundus image is the output of the pre-processing means 102 and the input to a test time augmentation means 108. The test time augmentation means 108 converts the pre-processed input fundus image into a plurality of test time images, for example, twenty test time images, using deterministic augmentation. The test time images of the pre-processed input fundus image are, for example, duplicate versions of the pre-processed input fundus image. The test time augmentation means 108 follows the same process as the augmentation means 104 to augment the test time images, except that the augmentations are deterministic. The output of the test time augmentation means 108 is the set of twenty deterministically augmented test time images of the pre-processed input fundus image, which is fed as an input to the quality assessment means 105. The quality assessment means 105 uses the convolutional network to obtain the quality value associated with the input fundus image. The convolutional network comprising 'n' convolutional stacks processes each of the twenty deterministically augmented test time images. The predicted probabilities of the twenty test time images are averaged to conclude a final prediction result.
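A sketch of the test time augmentation and probability averaging; the choice of fixed rotations as the deterministic transforms is an assumption, since the exact transforms are not specified:

```python
import cv2
import numpy as np

def deterministic_test_time_images(img, n=20):
    """Produce n deterministic variants of the pre-processed input image;
    fixed rotations are used here as the assumed transforms."""
    h, w = img.shape[:2]
    variants = []
    for k in range(n):
        m = cv2.getRotationMatrix2D((w / 2, h / 2), 360.0 * k / n, 1.0)
        variants.append(cv2.warpAffine(img, m, (w, h)))
    return variants

def predict_with_tta(model, img):
    """Run the trained network on all test time images and average the
    predicted probabilities into one value per quality value label."""
    batch = np.stack(deterministic_test_time_images(img)).astype("float32")
    return model.predict(batch, verbose=0).mean(axis=0)
```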
[00048] The output of the convolutional network provides the probability for each of the quality value labels associated with the input fundus image. The convolutional network provides a numeric value within the range [0, 1] for each of the quality value labels associated with the input fundus image. Each of the quality value labels in turn represents a corresponding quality value associated with the input fundus image.
[00049] The quality assessment means 105 also considers the user defined threshold and/or the image capture device characteristics to assess the quality value of the input fundus image. The output of the quality assessment means 105 is one of the quality value labels, based on the probability values provided by the convolutional network, the user defined threshold and the image capture device characteristics. In an example, equal weightage is provided to the probability values provided by the convolutional network, the user defined threshold and the image capture device characteristics for assessing the quality value of the input fundus image.
[00050] The quality assessment means 105 assesses whether the pre-processed input fundus image is, for example, either 'good' or 'bad', based on the probability values provided by the convolutional network, the user defined threshold and/or the image capture device characteristics. The user defined threshold is a gradable quality value of the input fundus image as defined by the user, for example, a numeric value within the range [0, 1], where 0 defines the least value and 1 the best value of the gradable quality of the input fundus image.
[00051] The quality assessment means 105 of the system 1000 receives the user defined threshold from the user of the system 1000 via the input device. The user
defined threshold is variable based on user requirements. For example, the user of the system 1000 defines the user defined threshold based on the user's gradable efficiency. In other words, the quality value of the input fundus image may be varied by varying the user defined threshold based on the user's grading experience and efficiency. This in turn increases the flexibility of the system 1000 to cater to the different needs of experienced and novice medical practitioners during evaluation of the input fundus image.
[00052] The fundus image quality assessment system 1000 considers the image capture device characteristics of an image capture device as one of the parameters to assess the quality of the input fundus image. The image capture device characteristics include a resolution, an illumination factor, a field of view, or any combination thereof. The image capture device is, for example, a fundus camera, a camera attached to a smartphone, etc., used to capture the input fundus image. In an embodiment, the fundus image quality assessment system 1000 considers a manufacturer and version of the image capture device to determine a predefined score for the image capture device characteristics of the image capture device. This predefined score is used to assess the quality of the input fundus image and denotes a superiority of the image capture device characteristics. The predefined score is a numeric value within the range [0, 1], where 0 defines the least value and 1 the highest value. For example, the predefined scores for the image capture device characteristics of multiple manufacturers of image capture devices are initially stored in the storage unit 107 by an operator of the fundus image quality assessment system 1000. By considering the image capture device characteristics of an image capture device to assess the quality of the input fundus image, the flexibility
of the system 1000 is increased, thereby providing customized results for the input fundus image captured using the image capture device of multiple manufacturers.
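How the three factors combine is not fixed above; one assumed reading, sketched below, weights the network's 'good' probability and the predefined device score equally and grades the result against the user defined threshold:

```python
def assess(prob_good, user_threshold, device_score):
    """All inputs lie in [0, 1]. prob_good is the convolutional network's
    averaged 'good' probability; device_score is the predefined score for
    the image capture device characteristics. Equal weighting of the two
    scores, graded against the user defined threshold, is an assumed
    combination rule."""
    combined = 0.5 * prob_good + 0.5 * device_score
    return "good" if combined >= user_threshold else "bad"

# Example: a strong prediction from a mid-range camera, lenient user.
print(assess(prob_good=0.9, user_threshold=0.6, device_score=0.5))
```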
[00053] Thus, the quality assessment means 105 assesses the input fundus image based on the factors - the probability values provided by the convolutional network, the user defined threshold and/or the image capture device characteristics. The quality assessment means 105 assigns the quality value label to the input fundus image based on the assessment. The quality value label defines the quality measure of the input fundus image for further grading of the input fundus image.
[00054] The user defined threshold is user defined to increase the flexibility of the system 1000. The user defined threshold is the variable factor which may be used to vary the quality value label of the input fundus image to conveniently suit the requirements of the user, for example, a medical practitioner. In an embodiment, the user defined threshold and/or the image capture device characteristics may not be considered, and only the probability values provided by the convolutional network are used by the quality assessment means 105 to assess the quality value of the input fundus image.
[00055] Figure 2 exemplarily illustrates the convolutional network of the quality assessment means 105 to compute the quality value of the input fundus image. The twenty deterministically augmented test time images of the pre-processed input fundus image are the input to a first convolutional stack (CS1) of the convolutional network, which is a part of the quality assessment means 105. Each of the test time images is processed by the convolutional network. A deterministically augmented test time image is, for example, represented as a matrix of width 224 pixels and height 224 pixels with 3 channels, that is, a representative array of pixel values of 224 x 224 x 3. The first convolutional stack (CS1) is configured to convolve pixels from the test time image with a filter to generate a first feature map. The first convolutional stack (CS1) also comprises a first subsampling layer configured to reduce a size and variation of the first feature map. The output of the first convolutional stack (CS1) is a reduced representation of width 64 pixels and height 64 pixels with n1 channels, that is, a representative array of pixel values of 64 x 64 x n1. This is the input to a second convolutional stack (CS2), which again convolves the 64 x 64 x n1 array to generate a second feature map. The second convolutional stack (CS2) comprises a second subsampling layer configured to reduce the size and variation of the second feature map to a representative array of pixel values of 16 x 16 x n2, n2 being the number of channels. The 16 x 16 x n2 array is the input to a third convolutional stack (CS3), which convolves it to generate a third feature map. The third convolutional stack (CS3) comprises a third subsampling layer configured to reduce the size and variation of the third feature map to a representative array of pixel values of 8 x 8 x n3, n3 being the number of channels. A fourth convolutional stack (CS4) convolves the 8 x 8 x n3 array to generate a fourth feature map and comprises a fourth subsampling layer configured to reduce the size and variation of the fourth feature map. A probability block (P) provides a probability of the quality value associated with the input fundus image. The predicted probabilities of the twenty test time images are averaged to obtain a final prediction result. The final prediction result is the probability of the quality value of the input fundus image, expressed as two values within the range [0, 1] indicating the gradable quality measure - a 'goodness' and a 'badness' of the input fundus image.
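The four-stack network of Figure 2 may be sketched in Keras as follows; the channel counts n1-n3, the number of convolutional layers per stack and the pool sizes are assumptions, so the pooling only approximates the 64 x 64, 16 x 16 and 8 x 8 feature-map sizes:

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_stack(filters, pool):
    """One convolutional stack: convolutional layers followed by a
    subsampling (pooling) layer that reduces the feature-map size."""
    return [layers.Conv2D(filters, 3, padding="same", activation="relu"),
            layers.Conv2D(filters, 3, padding="same", activation="relu"),
            layers.MaxPooling2D(pool)]

model = tf.keras.Sequential(
    conv_stack(32, 4)      # CS1 (patent: 64 x 64 x n1)
    + conv_stack(64, 4)    # CS2 (patent: 16 x 16 x n2)
    + conv_stack(128, 2)   # CS3 (patent: 8 x 8 x n3)
    + conv_stack(256, 2)   # CS4
    + [layers.GlobalAveragePooling2D(),
       layers.Dense(2, activation="softmax")]  # P: 'goodness' / 'badness'
)
model.build((None, 224, 224, 3))   # 224 x 224 x 3 test time images
model.summary()
```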
[00056] The quality assessment means 105 assesses the input fundus image based on the factors - the probability values provided by the convolutional network, the user defined threshold and/or the image capture device characteristics. The quality assessment means 105 assigns a quality value label to the input fundus image based on the assessment.
[00057] The output of the quality assessment means 105 is transmitted to a display 106. The quality value label associated with the input fundus image and the captured input fundus image are displayed to a user via the display 106. For example, when the quality value label of the input fundus image is 'bad', a pop-up box is displayed on a screen with a set of instructions to the user to capture an alternative fundus image of a patient. The fundus image quality assessment system 1000 may also generate a report based on the quality value label of the input fundus image which may be communicated to the patient via an electronic mail. The report could also be stored in the storage unit 107 of the fundus image quality assessment system 1000.
[00058] Figure 3 illustrates a flowchart for determination of the quality value of the input fundus image by the fundus image quality assessment system 1000 in accordance with the invention. At step 301, the generator 101 generates the training fundus image dataset and a ground-truth file. Step 301 comprises sub-steps 301a, 301b and 301c which the generator 101 performs on each of the training fundus images in the training fundus image dataset. At step 301a, the generator 101 partitions each of the training fundus images into a predefined number of elements based on a structure, wherein the partitioned elements comprises a primary element and one or more secondary elements. The structure is, for example, a square grid, a hexagonal grid, a circular grid, an irregular shaped grid, etc. At step 301b, the generator 101 analyzes the primary element with each of the secondary element sequentially for the partitioned training fundus image.
[00059] At step 301c, the generator 101 determines the quality value label of the analyzed training fundus image. The quality value label represents a measure of perceived image degradation as compared to an ideal image reference based on amounts of multiple quality factors. The quality factors are, for example, darkness, light, contrast, color accuracy, tone reproduction, distortion, an exposure accuracy, sharpness, noise, lens flare, etc. The training fundus image identifier of a training fundus image is, for example, a name or identity assigned to the training fundus image.
[00060] At step 302, the quality assessment means 105 trains the convolutional network based on the generated training fundus image dataset and the ground-truth file. At step 303, the quality assessment means 105 computes the quality value of an input fundus image. At step 304, the quality assessment means 105 assesses the input fundus image based on the identified quality value of the fundus image, the user defined threshold and/or the image capture device characteristics. The fundus image quality assessment system 1000 assigns the quality value label based on the identified quality value of the fundus image, the user defined threshold and/or the image capture device characteristics.
[00061] For example, the fundus image quality assessment system 1000 assesses the input fundus image as either 'good' or 'bad' based on the identified quality value of the fundus image, the user defined threshold and/or the image capture device characteristics. The input fundus image of 'good' quality value is gradable by doctors to determine a medical condition in the input fundus image. The input fundus image of 'bad' quality is not gradable by doctors. The user defined threshold may be varied to vary the quality value of the input fundus image based on the doctor's grading
experience. The fundus image quality assessment system 1000 may further display a message to an operator of the fundus image quality assessment system 1000 to retake another fundus image of the patient in case of a 'bad' quality value fundus image.
[00062] The fundus image quality assessment system 1000, using the convolutional network, thus provides an accurate level of image quality of the fundus of a patient's eye. In an embodiment, the fundus image quality assessment system 1000 focuses on assessing the entire fundus image as a whole to detect the quality of the fundus image. This improves efficiency and reduces errors in identifying various medical conditions when compared to manual diagnosis of retinal imaging. The fundus image quality assessment system 1000 acts as an important tool to decide on the quality of the fundus images, which in turn may be used by medical practitioners in the detection and monitoring of progression of several optic nerve diseases such as diabetic retinopathy, glaucoma, macular degeneration, etc.
[00063] The fundus image quality assessment system 1000 immediately provides a report to the operator of the system 1000 about the quality value of the captured fundus image of a patient. When the fundus image is of, for example, a 'bad' quality value label, the fundus image quality assessment system 1000 instantly prompts the operator to capture another fundus image of the patient. This eliminates any delays during the analysis stage of the fundus images by doctors for any medical conditions due to unavailability of 'good' quality fundus images associated with the patient.
[00064] The present invention described above, although described functionally, may be configured to work in a network environment comprising a computer in communication with one or more devices. The present invention may be implemented by computer programmable instructions stored on one or more computer readable media and executed by a processor of the computer. The computer comprises the processor, a memory unit, an input/output (I/O) controller, and a display communicating via a data bus. The computer may comprise multiple processors to increase the computing capability of the computer. The processor is an electronic circuit which executes computer programs. The processor executes the instructions to assess the input fundus image.
[00065] The memory unit, for example, comprises a ROM and a RAM. The memory unit stores the instructions for execution by the processor. In this invention, the storage unit 107 is the memory unit. The memory unit stores the training fundus image dataset and the ground-truth file. The memory unit may also store intermediate, static and temporary information required by the processor during the execution of the instructions. The computer comprises one or more input devices, for example, a keyboard such as an alphanumeric keyboard, a mouse, a joystick, etc. The I/O controller controls the input and output actions performed by a user. The data bus allows communication between modules of the computer. The computer directly or indirectly communicates with the devices via an interface, for example, a local area network (LAN), a wide area network (WAN), the Ethernet, the Internet, a token ring, a Bluetooth connection, or the like. Further, each of the devices adapted to communicate with the computer may comprise computers with, for example, Sun® processors, IBM® processors, Intel® processors, AMD® processors, etc.
[00066] The computer readable media comprises, for example, CDs, DVDs, floppy disks, optical disks, magnetic-optical disks, ROMs, RAMs, EEPROMs, magnetic cards, application specific integrated circuits (ASICs), or the like. Each of the computer readable media is coupled to the data bus.
[00067] The foregoing examples have been provided merely for the purpose of explanation and do not limit the present invention disclosed herein. While the invention has been described with reference to various embodiments, it is understood that the words are used for illustration and are not limiting. Those skilled in the art may effect numerous modifications thereto, and changes may be made, without departing from the scope and spirit of the invention in its aspects.
Claims
1. A fundus image quality assessment system 1000, said fundus image quality assessment system 1000 comprising: a storage unit 107 adapted to store a training fundus image dataset; a generator 101 adapted to generate the training fundus image dataset and a ground-truth file, the training fundus image dataset comprising a plurality of training fundus images; wherein said generator 101 is adapted to
- partition each of the training fundus images into a predefined number of elements based on a structure, wherein the partitioned elements comprise a primary element and one or more secondary elements;
- analyze the primary element with each of the secondary elements sequentially for the partitioned training fundus image;
- determine a quality value label of the analyzed training fundus image; a quality assessment means 105 adapted to train a convolutional network based on the generated training fundus image dataset and the ground-truth file; said quality assessment means 105 adapted to compute the quality value of an input fundus image; and said quality assessment means 105 adapted to assess the input fundus image based on the computed quality value, a user defined threshold and/or image capture device characteristics.
2. The system 1000 as claimed in claim 1, wherein the generator 101 is adapted to terminate the analysis and determine the quality value label of the training fundus image as bad by: determining an element-quality value of the primary element or a secondary element under analysis as bad; and detecting a presence of a region of interest in the element analyzed as bad.
3. The system 1000 as claimed in claim 1, wherein the generator 101 is adapted to determine the quality value label of the training fundus image as good by: determining an element-quality value of the primary element or a secondary element under analysis as bad; detecting an absence of a region of interest in the element analyzed as bad; and repeating the above two steps one or more times, until each of the secondary elements is analyzed with the primary element.
4. The system 1000 as claimed in claim 1, wherein the generator 101 is adapted to determine the quality value label of the training fundus image as good when an element-quality value of each of the partitioned elements is good.
5. The system 1000 as claimed in claim 1, wherein the user defined threshold is a user defined parameter to vary the quality value label of the input fundus image.
6. A method for assessing an input fundus image by a fundus image quality assessment system 1000, said method comprising:
generating a training fundus image dataset and a ground-truth file, the training fundus image dataset comprising a plurality of training fundus images; wherein said generating step comprises:
- partitioning each of the training fundus images into a predefined number of elements based on a structure, wherein the partitioned elements comprise a primary element and one or more secondary elements;
- analyzing the primary element with each of the secondary elements sequentially for the partitioned training fundus image;
- determining a quality value label of the analyzed training fundus image; training a convolutional network based on the generated training fundus image dataset and the ground-truth file; computing the quality value of an input fundus image; and assessing the input fundus image based on the identified quality value, a user defined threshold and/or image capture device characteristics.
7. The method as claimed in claim 6, wherein the analysis is terminated to determine the quality value label of the training fundus image as bad by:
determining an element-quality value of the primary element or a secondary element under analysis as bad; and detecting a presence of a region of interest in the element analyzed as bad.
8. The method as claimed in claim 6, wherein the quality value label of the training fundus image is determined as good by:
determining an element-quality value of the primary element or a secondary element under analysis as bad; detecting an absence of a region of interest in the element analyzed as bad; and repeating the above two steps one or more times, until each of the secondary elements is analyzed with the primary element.
9. The method as claimed in claim 6, wherein the quality value of the training fundus image is determined as good when an element-quality value of each of the partitioned elements is good.
10. The method as claimed in claim 6, wherein the user defined threshold is a user defined parameter to vary the quality value label of the input fundus image.
Applications Claiming Priority (2)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| IN201741037399 | 2017-10-23 | | |
| IN201741037399 | 2017-10-23 | | |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| WO2019082202A1 | 2019-05-02 |
Family
ID=66246270
Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/IN2018/050681 (WO2019082202A1) | A fundus image quality assessment system | 2017-10-23 | 2018-10-22 |
Country Status (1)

| Country | Link |
|---|---|
| WO (1) | WO2019082202A1 |
Patent Citations (3)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20170039412A1 * | 2013-10-22 | 2017-02-09 | Eyenuk, Inc. | Systems and methods for automated detection of regions of interest in retinal images |
| EP3186779A1 * | 2014-08-25 | 2017-07-05 | Agency For Science, Technology And Research (A*STAR) | Methods and systems for assessing retinal images, and obtaining information from retinal images |
| US20170270653A1 * | 2016-03-15 | 2017-09-21 | International Business Machines Corporation | Retinal image quality assessment, error identification and automatic quality correction |
Cited By (6)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111028226A | 2019-12-16 | 2020-04-17 | 北京百度网讯科技有限公司 | Method and device for algorithm transplantation |
| CN111242131A | 2020-01-06 | 2020-06-05 | 北京十六进制科技有限公司 | Method, storage medium and device for image recognition in intelligent marking |
| CN111242131B | 2020-01-06 | 2024-05-10 | 北京十六进制科技有限公司 | Method, storage medium and device for identifying images in intelligent paper reading |
| US20230223136A1 * | 2020-06-09 | 2023-07-13 | Koninklijke Philips N.V. | System and method for analysis of medical image data based on an interaction of quality metrics |
| WO2023155488A1 * | 2022-02-21 | 2023-08-24 | 浙江大学 | Fundus image quality evaluation method and device based on multi-source multi-scale feature fusion |
| US11842490B2 * | 2022-02-21 | 2023-12-12 | Zhejiang University | Fundus image quality evaluation method and device based on multi-source and multi-scale feature fusion |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18869904; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 18869904; Country of ref document: EP; Kind code of ref document: A1 |