Abstract
Intestinal parasites pose a widespread challenge in underdeveloped and developing countries, afflicting millions of individuals. Traditional manual light microscopy has been the gold-standard method for detecting these parasites, but it is not only expensive but also time-consuming and requires specialized expertise. Recent advances in deep learning have shown promise for overcoming these obstacles. However, deep learning models require labeled medical imaging data, which is both scarce and costly to generate, making it difficult to establish universal deep learning models that require extensive amounts of data. To improve the performance of deep learning, we employed a generative adversarial network to fabricate a synthetic dataset. Our framework exploits the potential of a Cycle Generative Adversarial Network (CycleGAN) and Faster RCNN to generate new datasets and detect intestinal parasites, respectively, on images of varying quality, leading to improved model generalizability and diversity. In this experiment, we evaluated the effectiveness of the combined CycleGAN + Faster RCNN pipeline using widely adopted evaluation metrics such as precision, recall, and F1-score. We demonstrate that the proposed framework effectively augments the image dataset and improves detection performance, achieving an F1-score of 0.95 and an mIoU of 0.97, both better than without data augmentation. This state-of-the-art approach sets the stage for further advancements in the field of medical image analysis. Additionally, we have built a new dataset, now publicly accessible, offering a broader range of classes and variability for future research and development.
Article Highlights
- In this paper, we propose an improved Faster-RCNN aimed at detecting intestinal parasites in complicated situations.
- We focus on dataset enhancement, with the aim of achieving both high performance and system generalization.
- The model was built using publicly available datasets.
- Our proposed strategy outperforms existing state-of-the-art methods.
1 Introduction
Intestinal parasitic diseases are among the most widespread infectious diseases, affecting millions of people globally, and they are particularly prevalent in underdeveloped regions where individuals live in unsanitary conditions. The World Health Organization reported that around 1.5 billion people were afflicted with soil-transmitted helminth infections in 2020 [1]. Human intestinal parasites, which cause conditions such as diarrhea, malnutrition, and anemia, particularly impact children and impede their growth; they can be categorized into three groups: helminths, protozoa, and ectoparasites [2]. These infections also affect physical and mental development, job performance, and education, potentially influencing the quality of the future population and a country's long-term growth [3]. The physical similarity between parasites and the presence of impurities in samples make it difficult to manually distinguish between different types of parasite eggs using a microscope [4, 5]. As a result, significant training is required to develop skilled experts to perform diagnoses. This manual evaluation is both labor-intensive and time-consuming, taking an experienced technician an average of 30 min to analyze a single sample [6]. Consequently, the development of an automated diagnostic faecal examination for parasitic diseases is essential to overcome the limitations of traditional diagnostic methods. Further, although most infected people exhibit no or only mild symptoms, it is important to recognize that a parasitic infection acquired during pregnancy may result in severe nerve damage and, in some cases, infant mortality [7]. Leishmaniasis is a neglected tropical disease spread by female phlebotomine sand flies, affecting over 700,000 people annually [8]. Moreover, trichomonad parasites found in the intestines and oral cavity cause the human disease trichomoniasis [9].
Machine learning methods have been used in several studies to analyze microscopic images containing parasite eggs/cysts. Support Vector Machines (SVM) [10, 11] and Artificial Neural Networks (ANN) [12, 13] are examples of such systems. Prior attempts towards automating the detection and estimation of intestinal parasites [14, 15] involved complex pipelines combining image processing and machine learning classification. These methods generally rely on features extracted from a set of measurements, especially intensity, dimension, and surface texture; as a result, considerable work is necessary during the feature extraction stage to fine-tune the features. Despite these efforts, none of these methods has achieved widespread acceptance, due to generalizability issues as well as difficulties with replication, comparison, and extension. Over the last decade, deep learning-based algorithms have improved as a result of advances in computing performance and the availability of image datasets [16]. Deep learning has proven extremely effective across a wide range of problems in a variety of disciplines, including text recognition, computer-aided diagnosis, face identification, and drug development [17, 18].
We applied the Faster-RCNN detector [19] as the foundation for our research, as it has exhibited favourable precision and speed on images compared to other deep models. Medical image analysis, however, presents distinct obstacles: supervised deep learning requires large training datasets, which can be difficult to obtain for medical images due to their high acquisition costs and the labour-intensive nature of manual annotation. To overcome these limitations, we propose expanding the baseline training dataset through data augmentation. While many data augmentation approaches use image transformations such as rotations and translations [20], we adopt a different, CycleGAN-based approach [21]: an unsupervised system capable of generating images based on annotated source images from a different modality. Our findings show that combining CycleGAN and Faster-RCNN provides an efficient and effective method for augmenting datasets and recognizing intestinal parasites in microscopy images.
Our work encapsulates several major contributions, which are summarized below:
- A fully automated proposal for dealing with low-quality intestinal parasite images captured using portable devices in clinical practice.
- An oversampling strategy that does not require a paired dataset, effectively capturing domain variability and improving dataset representativeness.
- A robust methodology for detecting parasites in data-scarce contexts, significantly improving on existing state-of-the-art methods.
- Extensive experimentation to validate our methodology, demonstrating its suitability, robustness, and uniqueness in augmenting intestinal parasite images with CycleGAN architectures and detecting them with Faster-RCNN.
The rest of the present manuscript is organized into the following sections: Sect. 2, "Related Work," reviews prior approaches relevant to our task. Section 3, "Methodology," details the proposed strategy, the experimental setup, and the specific parameters of each experiment. Section 4, "Evaluation Metrics," presents the metrics used to validate this work. Section 5, "Results and Discussion," reports the findings and a detailed analysis obtained after validating the proposed method, and Sect. 6 compares our framework against state-of-the-art object detection methods. Finally, Sect. 7, "Conclusions," summarizes our contributions and noteworthy aspects, emphasizing the importance of our findings validated through extensive experimentation.
2 Related work
2.1 Object detection
Various architectural designs that perform well in object detection tasks have inspired the development of deep convolutional neural networks, which modern methodologies use for detection, classification, and segmentation tasks in medical images. Here, we present an overview of current methodologies used in the field of microorganism analysis to detect parasite eggs/cysts in microscopy images. Waithe et al. [22] evaluated how well state-of-the-art neural network designs detect fluorescent cells in microscope images. von Chamier et al. [23] introduced ZeroCostDL4Mic, which allows researchers with no coding expertise to train and apply key deep learning networks to tasks including segmentation, object detection, and denoising. Kumar et al. [18] proposed an efficient and effective framework for intestinal parasite egg detection using YOLOv5, which achieved a mean average precision of approximately 97%. Deep learning-based detection methods are broadly classified into two approaches: two-stage and one-stage methods. In the former, models are trained separately for two distinct tasks: detecting regions of interest, and classifying and localizing objects. The Region-Based Convolutional Neural Network (R-CNN) algorithms are among the best in this area [13, 24]. These approaches use modules for feature extraction, classification, and regression, with region proposal handled by a distinct convolutional network in [4]. In the field of medical image analysis, regression forests have traditionally been among the most effective statistical detection methods [2], and as observed in [25, 26], these methodologies have been deployed in a cascaded fashion, moving from a global to a local context. One such AI platform enables non-programmers to apply AI to microscope image processing; in its evaluation, the ResNeXt-50 (32×4d) model outperformed the others with 96.83% accuracy and an F1-score of 96.82%, while MobileNet-V2 struck a balance between 95.72% accuracy and computational cost. Deep learning methods are rapidly gaining popularity in this domain: Faster RCNN has been used to recognize objects in parasite images [27], while Fast RCNN has been used to detect parasite eggs in medical images [28]. Our proposed framework applies deep learning in two steps. The first step performs image enhancement before input into the object detection model, achieved through a Cycle Generative Adversarial Network (CycleGAN) trained to convert low-resolution images into high-resolution ones. Object detection is then performed with a Faster-RCNN model using ResNet50 as its backbone.
2.2 Data augmentation
Research organizations have explored the use of CycleGAN, an unsupervised technique for synthesizing unpaired images from one domain to another [29]. CycleGAN has been a frequently used method for creating synthetic image datasets; its primary advantage is its ability to handle unpaired data, which is extremely useful in our situation, since acquiring images of multiple modalities for the same subject under identical conditions is usually not possible. CycleGAN has been used in prior studies, including [30], which used it to produce chest X-ray images for pneumonia detection, and [31], which used it to generate lung MRI images from CT images for lung tumour segmentation. CycleGAN is used to produce target-modality images from labeled source images, and the source labels are then translated to the target domain. Additionally, several proposals have explored synthetic image creation for over-sampling the original sample collection; these techniques, as demonstrated by Bouteldja et al. [32, 33] and Motamed et al. [34], make use of distinct GAN frameworks in similar contexts.
3 Methodology
The proposed approach is divided into two stages, data augmentation and object detection, as shown in Fig. 1. The first stage focuses on synthetic image synthesis with the CycleGAN algorithm; Sect. 3.2 provides more information about this stage. The second stage focuses on the detection of intestinal parasites in microscopy images with a customized Faster-RCNN algorithm; Sect. 3.3 explores the workings of this module. A high-level sketch of the resulting inference pipeline is given below.
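The following is a minimal, illustrative sketch of the two-stage inference pipeline, not the authors' exact code: a trained CycleGAN generator first enhances a low-quality image, and a Faster-RCNN detector then localizes parasite eggs/cysts in the enhanced image. The names `generator` and `detector` are assumed to be trained PyTorch modules.

```python
import torch

@torch.no_grad()
def detect_parasites(image, generator, detector):
    """image: float tensor of shape (3, H, W), scaled to [0, 1]."""
    generator.eval()
    detector.eval()
    enhanced = generator(image.unsqueeze(0)).squeeze(0)  # stage 1: CycleGAN enhancement
    predictions = detector([enhanced])[0]                # stage 2: Faster-RCNN detection
    return enhanced, predictions  # predictions holds boxes, labels, and scores
```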
3.1 Datasets
To evaluate the proposed framework, we obtained the intestinal parasite image dataset from Chulalongkorn University in Thailand. The total number of parasite images and their dimensions are shown in Table 1. The collection contains images obtained with different devices under different environmental conditions. The dataset includes 2,500 images categorized into 5 classes: 500 images of Ascaris lumbricoides (AL), 500 of Hookworm (HW), 500 of Fasciolopsis buski (FB), 500 of Taenia spp. (TS), and 500 of Hymenolepis nana (HN). These images, shown in Fig. 2, display distinct characteristics, with some showing clear definition and others showing blurriness or variations in lighting conditions. Furthermore, the resolution, color saturation, and contrast differ depending on the microscope used, and the amount of background debris also varies significantly among the images. We therefore propose a framework that transforms the input data so that the architecture does not suffer any reduction in performance or model generalization. To train the CycleGAN model, images from [35] were utilized. We standardized the image size to 416 × 416 for compatibility with the Faster-RCNN algorithm, and divided the dataset into training, validation, and testing sets with proportions of 70%, 20%, and 10%, respectively, as sketched in the snippet below.
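A minimal sketch of this preprocessing, resizing every image to 416 × 416 and splitting the data 70/20/10. The `<class>/<image>.jpg` directory layout and the paths are assumptions made for illustration.

```python
import random
from pathlib import Path
from PIL import Image

def prepare_dataset(src_dir, dst_dir, seed=42):
    paths = sorted(Path(src_dir).glob("*/*.jpg"))  # assumed <class>/<image>.jpg layout
    random.Random(seed).shuffle(paths)
    n = len(paths)
    splits = {"train": paths[: int(0.7 * n)],           # 70% training
              "val":   paths[int(0.7 * n): int(0.9 * n)],  # 20% validation
              "test":  paths[int(0.9 * n):]}            # 10% testing
    for split, items in splits.items():
        for p in items:
            out = Path(dst_dir) / split / p.parent.name
            out.mkdir(parents=True, exist_ok=True)
            # standardize to 416 x 416 for the Faster-RCNN input
            Image.open(p).convert("RGB").resize((416, 416)).save(out / p.name)
```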
3.2 Model architectures and training details
3.2.1 Augmentation methods
In this experiment, we implemented a deep neural network based CycleGAN algorithm to generate synthetic intestinal parasitic images. The cyclic nature of this algorithm lies in its reverse transformation, i.e. the architecture is capable of converting generated images back into the original images. CycleGAN architectures are widely employed in medical image analysis for image-to-image generation due to their robustness, flexibility, and encouraging results on related problems. The CycleGAN model has two generators, each paired with a discriminator. The key concept in CycleGAN is the cycle-consistency loss function, which is used to optimize the framework. It works as follows: the output of the first generator can serve as the input to the second generator, and the resulting image from the second generator should match the original image; similarly, the output of the second generator can be used as the input to the first generator, and the result should match the second generator's original input, as shown in Fig. 3.
CycleGAN operates at the batch level: it is given a set of images in domain X and another set of images in domain Y. The goal is to learn the mapping G: X → Y in such a way that the distribution of translated images G(X) closely approaches the distribution of images in domain Y, so that the generated images are indistinguishable from the original dataset. Like a standard Generative Adversarial Network, it applies an adversarial loss to this mapping function; Eq. (1) describes this loss and its associated discriminator \(D_Y\):

\[ \mathcal{L}_{GAN}(G, D_Y, X, Y) = \mathbb{E}_{y \sim p_{data}(y)}\big[\log D_Y(y)\big] + \mathbb{E}_{x \sim p_{data}(x)}\big[\log\big(1 - D_Y(G(x))\big)\big] \quad (1) \]

An analogous adversarial loss, \(\mathcal{L}_{GAN}(F, D_X, Y, X)\), is defined for the inverse mapping F: Y → X and its discriminator \(D_X\) (Eq. 2).
In this context, the generator G aims to produce images similar to those in domain Y, while the discriminator \(D_Y\) tries to distinguish images generated by G from genuine images y as effectively as possible. When the parameters of the generator G are updated, G attempts to minimize this objective, whereas \(D_Y\) aims to maximize it when its own parameters are updated. However, if the network's capacity is high enough, it may map the same collection of input images to any arbitrary arrangement of images in the target domain, so the adversarial loss alone does not guarantee that each input x and output y are meaningfully matched: many different mappings G can induce the same distribution over y, rendering the loss alone insufficient. To overcome this problem, CycleGAN couples the original and inverse mappings and employs a cycle-consistency loss to enforce a meaningful correspondence in both directions.
This study's CycleGAN model therefore incorporates two mapping functions, G: X → Y and F: Y → X, together with the related adversarial discriminators \(D_Y\) and \(D_X\). CycleGAN introduces two cycle-consistency constraints to further regularize the mapping: the forward cycle requires that an image travelling from one domain to the other and back recovers its initial state, x → G(x) → F(G(x)) ≈ x; similarly, the backward cycle requires that an image closely approximates y after moving from y to F(y) and back through G(F(y)). In the CycleGAN network, the overall loss is composed of several components. Assuming the least-squares formulation commonly used in CycleGAN implementations, the discriminator loss for X → Y is indicated in Eq. 3:

\[ \mathcal{L}_{D_Y} = \mathbb{E}_{y \sim p_{data}(y)}\big[(D_Y(y) - 1)^2\big] + \mathbb{E}_{x \sim p_{data}(x)}\big[D_Y(G(x))^2\big] \quad (3) \]

The discriminator loss for Y → X is indicated in Eq. 4:

\[ \mathcal{L}_{D_X} = \mathbb{E}_{x \sim p_{data}(x)}\big[(D_X(x) - 1)^2\big] + \mathbb{E}_{y \sim p_{data}(y)}\big[D_X(F(y))^2\big] \quad (4) \]

The cycle-consistency loss imposed on the generators is indicated in Eq. 5:

\[ \mathcal{L}_{cyc}(G, F) = \mathbb{E}_{x \sim p_{data}(x)}\big[\lVert F(G(x)) - x \rVert_1\big] + \mathbb{E}_{y \sim p_{data}(y)}\big[\lVert G(F(y)) - y \rVert_1\big] \quad (5) \]

The final CycleGAN loss is given by Eq. 6:

\[ \mathcal{L}(G, F, D_X, D_Y) = \mathcal{L}_{GAN}(G, D_Y, X, Y) + \mathcal{L}_{GAN}(F, D_X, Y, X) + \lambda\, \mathcal{L}_{cyc}(G, F) \quad (6) \]

The goal is to solve:

\[ G^{*}, F^{*} = \arg\min_{G, F}\; \max_{D_X, D_Y}\; \mathcal{L}(G, F, D_X, D_Y) \]
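To make the objective concrete, here is a hedged PyTorch sketch of the generator-side loss built from Eqs. (1)-(6). `G` and `F` are the two generators, `d_x` and `d_y` the discriminators; the least-squares adversarial terms are an assumption consistent with common CycleGAN implementations, and `lambda_cyc = 10` follows the loss weights reported in Sect. 3.2.

```python
import torch
import torch.nn.functional as nnf

def generator_loss(G, F, d_x, d_y, real_x, real_y, lambda_cyc=10.0):
    fake_y, fake_x = G(real_x), F(real_y)
    pred_y, pred_x = d_y(fake_y), d_x(fake_x)
    # adversarial terms: each generator tries to make its discriminator output 1
    adv = (nnf.mse_loss(pred_y, torch.ones_like(pred_y)) +
           nnf.mse_loss(pred_x, torch.ones_like(pred_x)))
    # cycle-consistency terms of Eq. (5): x -> G(x) -> F(G(x)) ≈ x, and vice versa
    cyc = nnf.l1_loss(F(fake_y), real_x) + nnf.l1_loss(G(fake_x), real_y)
    return adv + lambda_cyc * cyc  # generator side of the full objective in Eq. (6)
```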
Regarding the training configuration, the following parameters are applied for the CycleGAN setup. Training runs for about 200 epochs, with a fixed learning rate of 0.0002 for the first 100 epochs and a linear decay to zero over the remaining epochs. The training process uses the Adam optimizer (Kingma & Ba, 2014) with decay rates β1 = 0.5 and β2 = 0.999. Loss weights are set as follows: λA = 10.0, λB = 10.0, and λidt = 0.5. Table 2 outlines the hyper-parameter settings used to train the CycleGAN model. Hyper-parameter tuning is a crucial, iterative process that requires several rounds of experimentation, and it is essential to strike a balance between exploring new configurations and refining promising ones. In this regard, we found that CycleGAN converges comparatively quickly, which mitigates the risk of mode collapse, a common concern in GANs. Additionally, we observed that the Adam optimizer requires less hyper-parameter tuning than SGD.
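A minimal sketch of this training configuration, assuming the generators `G` and `F` are already constructed: Adam with β1 = 0.5 and β2 = 0.999, a learning rate of 2e-4 held fixed for 100 epochs, then decayed linearly to zero by epoch 200.

```python
import itertools
import torch

# G, F are the two CycleGAN generators (assumed already constructed)
optimizer = torch.optim.Adam(itertools.chain(G.parameters(), F.parameters()),
                             lr=2e-4, betas=(0.5, 0.999))

def linear_decay(epoch, flat_epochs=100, decay_epochs=100):
    # multiplier is 1.0 for the first 100 epochs, then falls linearly to 0 by epoch 200
    return 1.0 - max(0, epoch - flat_epochs) / float(decay_epochs)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=linear_decay)
# call scheduler.step() once per epoch after the optimizer updates
```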
3.3 Detection module
As mentioned in the section above, Faster RCNN is our detector, since it broadly shows a good balance between speed and accuracy. In this method, a Region Proposal Network (RPN) first slides over the shared convolutional feature map and, at each location, predicts candidate bounding boxes together with objectness scores; each proposal is then classified and its box coordinates refined by the detection head. We use Faster RCNN with ResNet50 as the backbone in our research, as shown in Fig. 4. We use scale-dependent box priors (anchors), which we learn from the training set, to improve prediction accuracy, and the architecture incorporates cross-layer connections between each pair of prediction layers, except for the output layer. The dataset is randomly partitioned into three subsets, with 70% of the samples allocated for training, 20% for validation, and 10% for testing. The model is initialized with weights from a network previously trained on the ImageNet dataset [36]. These weights were optimized over 200 epochs using the gradient-descent strategy of [37] with a mini-batch size of 4, a first-order momentum of 0.9, and a constant learning rate (α) of 0.01. Table 3 lists the fine-tuned hyper-parameters of the Faster-RCNN model; these can be used to reproduce our results with ease and provide a valuable reference point for training a network on other datasets with sample sizes comparable to ours.
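A sketch of this detector configuration using torchvision's Faster R-CNN with a ResNet50 backbone and ImageNet-pretrained weights; the detection head is resized for our 5 parasite classes plus background, and the SGD settings mirror those above. This is an illustrative setup under those assumptions, not the authors' exact code.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# ImageNet-pretrained Faster R-CNN with a ResNet50 backbone
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# replace the box predictor: 5 parasite classes + 1 background class
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=6)

# SGD with momentum 0.9 and a constant learning rate of 0.01, as described above
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
```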
4 Evaluation metrics
Different detection performance metrics were considered to offer comprehensive insight into the evaluation of the proposed methodology. Accuracy, F1-score, recall, precision, and mIoU were taken into account to provide an extensive evaluation. More specifically, the metrics are derived from True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN) as reference values. The framework demonstrates improvements across most of the metrics listed in Eqs. 7–10.
| Metric | Formula | Eq. |
| --- | --- | --- |
| Accuracy | \(\frac{TP+TN}{TP + TN + FP + FN}\) | (7) |
| Recall | \(\frac{TP}{TP + FN}\) | (8) |
| Precision | \(\frac{TP}{TP + FP}\) | (9) |
| F1-Score | \(2 \times \frac{Precision \times Recall}{Precision + Recall}\) | (10) |
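For clarity, the following is a direct translation of Eqs. (7)-(10) into code, given the confusion-matrix counts.

```python
def detection_metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # Eq. (7)
    recall = tp / (tp + fn)                      # Eq. (8)
    precision = tp / (tp + fp)                   # Eq. (9)
    f1 = 2 * precision * recall / (precision + recall)  # Eq. (10)
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}
```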
5 Results and discussion
The primary evaluation criteria, namely precision, recall, F1-score, and mean Intersection over Union (mIoU), were used to assess how well the proposed framework performs for every class in the intestinal parasitic dataset. In this context, we also leverage the standard notions of true positive, true negative, false positive, and false negative. The mIoU score was calculated by evaluating and assigning an IoU score to each egg type and then averaging the values across types. Moreover, we evaluated precision and recall to compute the F1-score for each egg type at IoU ≥ 0.5, and mAP@[0.5:0.95] denotes the mean Average Precision (mAP) averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05. The following scenarios were used to test the effectiveness of the proposed framework. First, we trained the Faster-RCNN on the original dataset, that is, without any prior image enhancement, and conducted tests on the original image domain; the results obtained on the original dataset are shown in Table 4. Next, we evaluated the proposed model's performance on a dataset enhanced with standard augmentation methods; we conducted this experiment to investigate whether such pre-processing could improve parasite egg/cyst detection. In addition, we applied a number of transformations and settings to the test input data in order to reproduce a wide range of variability, and tested the model with this modified dataset.
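As a reference for how these scores are computed, the following is a minimal IoU sketch of the kind used when matching predictions to ground truth. Boxes are assumed to be `(x1, y1, x2, y2)` tuples; a prediction counts as a true positive when IoU ≥ 0.5, and mAP@[0.5:0.95] repeats the matching over thresholds 0.5, 0.55, ..., 0.95.

```python
def iou(box_a, box_b):
    # intersection rectangle (empty intersections clamp to zero area)
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```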
These modified images were passed through the trained CycleGAN network before being transmitted to the detector network, without retraining either model.
Finally, we observed that detection performance improved significantly when the original dataset was supplemented with the synthetic dataset generated by the CycleGAN augmentation model. In general, the transformations CycleGAN introduces to the images include changes in brightness, rotation, color vibrancy, contrast, motion blur, and saturation. The Faster-RCNN architecture is trained on these images. Table 5 shows the averaged metrics obtained on the validation dataset; the Faster-RCNN trained on CycleGAN-enhanced images achieved the highest precision, recall, and F1-score. The accuracy and loss curves of the proposed framework are shown in Figs. 5 and 6, respectively. Figure 7 shows images fed into the proposed framework under different scenarios: the framework sometimes had difficulty making accurate predictions on the original dataset, an issue that is effectively addressed once the image is enhanced by the CycleGAN model.
6 Comparison against object detection state-of-the-art methods
We conducted a thorough evaluation of our methods against leading object detection techniques: the Single Shot Detector (SSD), AlexNet, ResNet, YOLOv5, and Faster R-CNN, as shown in Table 6. SSD is known for its lightweight architecture, which can recognize multiple objects in a single pass. Faster R-CNN, by contrast, requires two steps: first identifying regions of interest (ROIs), and then detecting objects within each ROI using convolutional neural networks (CNNs). Although this approach makes Faster R-CNN slower than other deep learning models, it is more accurate and robust. In our tests, SSD uses the VGG-16 backbone, while Faster R-CNN uses the ResNet50 architecture. It is worth noting that we found the You Only Look Once (YOLO) paradigm unsuitable for our application due to its difficulty in accurately detecting small objects in images. Table 7 compares the proposed framework with the other methods in terms of speed and memory usage.
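The speed figures of the kind reported in Table 7 can be obtained with a simple timing harness; the sketch below is one possible way to measure per-image latency under our assumptions, with warm-up iterations and CUDA synchronization to avoid skewed timings.

```python
import time
import torch

@torch.no_grad()
def measure_latency(model, image, warmup=5, runs=50):
    """Returns average seconds per image for a torchvision-style detector."""
    model.eval()
    for _ in range(warmup):          # warm-up: stabilize caches / cuDNN autotuning
        model([image])
    if torch.cuda.is_available():
        torch.cuda.synchronize()     # wait for queued GPU work before timing
    start = time.perf_counter()
    for _ in range(runs):
        model([image])
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs
```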
7 Conclusions and future work
Labeled medical imaging data is both scarce and expensive to generate, posing a major challenge to developing generalized deep learning models that require substantial amounts of data. To address this limitation, standard data augmentation techniques are commonly employed to enhance the generalizability of deep learning based models. Among more recent approaches, generative adversarial networks (GANs) have emerged as a novel method for data augmentation. In this context, we proposed a CycleGAN model to generate synthetic datasets and overcome the data scarcity problem. This data augmentation principle is based on translating low-resolution parasitic images from a normal scenario to higher resolution in a fully automatic way, generating a new synthetic intestinal parasitic image dataset, which is then merged into the original dataset to enlarge the training data. Further, images collected using portable devices are of lower quality and less detailed than those captured by stationary cameras. In this regard, the proposed research demonstrates the feasibility of converting lower-quality images into an enhanced, higher-resolution dataset to improve intestinal parasite egg/cyst detection. The goal of this technique is to make it easier to apply automatic screening methods and models in a realistic clinical scenario.
To validate our framework, we evaluated the performance of the Faster-RCNN network on the newly generated dataset and then tested it with previously unseen data. Our results demonstrated that the proposed detector, trained on high-quality data generated with the CycleGAN module, achieved enhanced performance. This experiment improved intestinal parasite detection performance to 0.97, 0.97, and 0.95 in terms of mIoU, precision, and F1-score, respectively. Moreover, the framework efficiently detects intestinal parasites in microscopic images affected by brightness variations, blurring, noise, and chrominance, and further experiments could evaluate the detection performance of other deep learning architectures. As the present effort focuses solely on locating a few types of parasites, our plans involve extending this approach to encompass other parasite types as well.
Data availability
The datasets analyzed during the current study are available in the ICIP 2022 Challenge: Parasitic Egg Detection and Classification in Microscopic Images (https://icip2022challenge.piclab.ai/).
References
N. Q. Viet, D. T. T. Tuyen, and T. H. Hoang, ‘Parasite worm egg automatic detection in microscopy stool image based on Faster R-CNN’, in ACM International Conference Proceeding Series, Association for Computing Machinery, Jan. 2019, pp. 197–202. doi: https://doi.org/10.1145/3310986.3311014.
Kumar S, Arif T, Alotaibi AS, Malik MB, Manhas J. Advances towards automatic detection and classification of parasites microscopic images using deep convolutional neural network: methods, models and research directions. Arch Comput Methods Eng. 2022. https://doi.org/10.1007/s11831-022-09858-w.
Zhang C, et al. Deep learning for microscopic examination of protozoan parasites. Comput Struct Biotechnol J. 2022;20:1036–43. https://doi.org/10.1016/j.csbj.2022.02.005.
Pho K, Mohammed Amin MK, Yoshitaka A. Segmentation-driven hierarchical retinanet for detecting protozoa in micrograph. Int J Semant Comput. 2019;13(3):393–413. https://doi.org/10.1142/S1793351X19400178.
Zibaei M, Bahadory S, Saadati H, Pourrostami K, Firoozeh F, Foroutan M. Intestinal parasites and diabetes: a systematic review and meta-analysis. New Microbes New Infect. 2023. https://doi.org/10.1016/j.nmni.2022.101065.
Holmström O, et al. Point-of-care mobile digital microscopy and deep learning for the detection of soil-transmitted helminths and Schistosoma haematobium. Glob Health Action. 2017. https://doi.org/10.1080/16549716.2017.1337325.
Attias M, Teixeira DE, Benchimol M, Vommaro RC, Crepaldi PH, De Souza W. The life-cycle of Toxoplasma gondii reviewed using animations. Parasit Vectors. 2020. https://doi.org/10.1186/S13071-020-04445-Z.
Tomiotto-Pellissier F, et al. Macrophage polarization in leishmaniasis: broadening horizons. Front Immunol. 2018. https://doi.org/10.3389/FIMMU.2018.02529.
Chen X, et al. Recent advances and clinical applications of deep learning in medical image analysis. Med Image Anal. 2022;79:102444. https://doi.org/10.1016/J.MEDIA.2022.102444.
Suykens JAK, Vandewalle J. Least squares support vector machine classifiers. Neural Proc Lett. 1999;9(3):293–300. https://doi.org/10.1023/A:1018628609742.
Borba VH, Martin C, Machado-Silva JR, Xavier SCC, de Mello FL, Iñiguez AM. Machine learning approach to support taxonomic species discrimination based on helminth collections data. Parasit Vectors. 2021;14(1):1–15. https://doi.org/10.1186/s13071-021-04721-6.
K. E. Delas Penas, E. A. Villacorte, P. T. Rivera, and P. C. Naval, ‘Automated detection of helminth eggs in stool samples using convolutional neural networks’, IEEE Region 10 Annual International Conference, Proceedings/TENCON, vol. 2020-Novem, pp. 750–755, 2020, doi: https://doi.org/10.1109/TENCON50793.2020.9293746.
Rosado L, da Costa JMC, Elias D, Cardoso JS. Mobile-based analysis of malaria-infected thin blood smears: automated species and life cycle stage determination. Sensors. 2017;17(10):2167. https://doi.org/10.3390/S17102167.
J. Larsson and R. Hedberg. Development of machine learning models for object identification of parasite eggs using microscopy. 2000. http://www.teknat.uu.se/student
Alva A, et al. Mathematical algorithm for the automatic recognition of intestinal parasites. PLoS ONE. 2017;12(4):e0175646. https://doi.org/10.1371/JOURNAL.PONE.0175646.
A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. Adv Neural Inf Process Syst, vol. 25, 2012.
Farooq MU, Ullah Z, Khan A, Gwak J. DC-AAE: dual channel adversarial autoencoder with multitask learning for KL-grade classification in knee radiographs. Comput Biol Med. 2023;167:107570. https://doi.org/10.1016/J.COMPBIOMED.2023.107570.
Kumar S, Arif T, Ahamad G, Chaudhary AA, Khan S, Ali MAM. An efficient and effective framework for intestinal parasite egg detection using YOLOv5. Diagnostics. 2023;13(18):2978. https://doi.org/10.3390/DIAGNOSTICS13182978.
Ren S, He K, Girshick R, Sun J. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell. 2015;39(6):1137–49. https://doi.org/10.1109/TPAMI.2016.2577031.
I. Correa, P. Drews, S. Botelho, M. S. De Souza, and V. M. Tavano, ‘Deep learning for microalgae classification’, Proceedings - 16th IEEE International Conference on Machine Learning and Applications, ICMLA 2017, vol. 2017-Decem, no. December, pp. 20–25, 2017, doi: https://doi.org/10.1109/ICMLA.2017.0-183.
J. Y. Zhu, T. Park, P. Isola, and A. A. Efros, ‘Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks’, Proceedings of the IEEE International Conference on Computer Vision, vol. 2017-October, pp. 2242–2251, Mar. 2017, doi: https://doi.org/10.1109/ICCV.2017.244.
Waithe D, Brown JM, Reglinski K, Diez-Sevilla I, Roberts D, Eggeling C. Object detection networks and augmented reality for cellular detection in fluorescence microscopy. J Cell Biol. 2020. https://doi.org/10.1083/JCB.201903166/VIDEO-2.
von Chamier L, Laine RF, Henriques R. Artificial intelligence for microscopy: what you should know. Biochem Soc Trans. 2019;47(4):1029–40. https://doi.org/10.1042/BST20180391.
Seo Y, Park B, Hinton A, Yoon SC, Lawrence KC. Identification of Staphylococcus species with hyperspectral microscope imaging and classification algorithms. J Food Meas Charact. 2016;10(2):253–63. https://doi.org/10.1007/S11694-015-9301-0/TABLES/3.
Liu R, Dai W, Wu T, Wang M, Wan S, Liu J. AIMIC: deep learning for microscopic image classification. Comput Methods Programs Biomed. 2022;226:107162. https://doi.org/10.1016/J.CMPB.2022.107162.
Pullan RL, Smith JL, Jasrasaria R, Brooker SJ. Global numbers of infection and disease burden of soil transmitted helminth infections in 2010. Parasit Vectors. 2014. https://doi.org/10.1186/1756-3305-7-37.
Li S, Du Z, Meng X, Zhang Y. Multi-stage malaria parasite recognition by deep learning. Gigascience. 2021;10(6):1–11. https://doi.org/10.1093/gigascience/giab040.
Yang F, Yu H, Silamut K, Maude RJ, Jaeger S, Antani S. Parasite detection in thick blood smears based on customized faster-RCNN on smartphones. Proc Appl Imag Pattern Recognit Workshop. 2019. https://doi.org/10.1109/AIPR47015.2019.9174565.
Yi X, Walia E, Babyn P. Generative adversarial network in medical imaging: a review. Med Image Anal. 2019;58:101552. https://doi.org/10.1016/J.MEDIA.2019.101552.
Motamed S, Rogalla P, Khalvati F. Data augmentation using generative adversarial networks (GANs) for GAN-based detection of Pneumonia and COVID-19 in chest X-ray images. Inform Med Unlocked. 2021;27:100779. https://doi.org/10.1016/j.imu.2021.100779.
Y. Chen, Y. Zhu, and Y. Chang, ‘CycleGAN Based Data Augmentation for Melanoma images Classification’, ACM International Conference Proceeding Series, pp. 115–119, 2020, doi: https://doi.org/10.1145/3430199.3430217.
P. Mayo, N. Anantrasirichai, T. H. Chalidabhongse, D. Palasuwan, and A. Achim. Detection of parasitic eggs from microscopy images and the emergence of a new dataset.
Bouteldja N, Hölscher DL, Bülow RD, Roberts ISD, Coppo R, Boor P. Tackling stain variability using CycleGAN-based stain augmentation. J Pathol Inform. 2022. https://doi.org/10.1016/j.jpi.2022.100140.
Motamed S, Rogalla P, Khalvati F. Data augmentation using generative adversarial networks (GANs) for GAN-based detection of Pneumonia and COVID-19 in chest X-ray images. Inform Med Unlocked. 2021. https://doi.org/10.1016/j.imu.2021.100779.
Naing KM, et al. Automatic recognition of parasitic products in stool examination using object detection approach. PeerJ Comput Sci. 2022. https://doi.org/10.7717/PEERJ-CS.1065.
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, 'ImageNet: a large-scale hierarchical image database', 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255, 2009. doi: https://doi.org/10.1109/CVPR.2009.5206848.
Cai Z. SA-GD: improved gradient descent learning strategy with simulated annealing. arXiv.org. 2021. https://doi.org/10.48550/arXiv.2107.07558.
Funding
The authors extend their appreciation to the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh in Saudi Arabia for funding and supporting this research partnership program no. PR-21–09-86.
Author information
Contributions
S Kumar: Supervision, Writing, review and editing; T Arif: Data curation; G Ahamad: Methodology; A Ahmad: Investigation, Validation; Mohamad Ali: Software; Asimul Islam: Formal analysis.
Ethics declarations
Ethics approval
No ethical permission was needed.
Consent to participate
All authors consent to participate in this publication.
Consent for publication
All authors consent to publish the manuscript.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kumar, S., Arif, T., Ahamad, G. et al. Improving faster R-CNN generalization for intestinal parasite detection using cycle-GAN based data augmentation. Discov Appl Sci 6, 261 (2024). https://doi.org/10.1007/s42452-024-05941-y