Abstract
Performance evaluation of machine-assisted interpretation of Gram stains from positive blood cultures
Manual microscopy of Gram stains from positive blood cultures (PBCs) is crucial for diagnosing bloodstream infections but remains labor intensive, time consuming, and subjective. This study aimed to evaluate a scan and analysis system that combines fully automated digital microscopy with deep convolutional neural networks (CNNs) to assist the interpretation of Gram stains from PBCs for routine laboratory use. The CNN was trained to classify images of Gram stains based on staining and morphology into seven different classes: background/false-positive, Gram-positive cocci in clusters (GPCCL), Gram-positive cocci in pairs (GPCP), Gram-positive cocci in chains (GPCC), rod-shaped bacilli (RSB), yeasts, and polymicrobial specimens. A total of 1,555 Gram-stained slides of PBCs were scanned, pre-classified, and reviewed by medical professionals. The results of assisted Gram stain interpretation were compared to those of manual microscopy and culture-based species identification by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). The comparison of assisted Gram stain interpretation and manual microscopy yielded positive/negative percent agreement values of 95.8%/98.0% (GPCCL), 87.6%/99.3% (GPCP/GPCC), 97.4%/97.8% (RSB), 83.3%/99.3% (yeasts), and 87.0%/98.5% (negative/false positive). The assisted Gram stain interpretation, when compared to MALDI-TOF MS species identification, yielded similar results. During the analytical performance study, assisted interpretation showed excellent reproducibility and repeatability. Any microorganism in PBCs should be detectable at the determined limit of detection of 10⁵ CFU/mL. Although the CNN-based interpretation of Gram stains from PBCs is not yet ready for clinical implementation, it has potential for future integration and advancement.
Bloodstream infections (BSIs) are a major cause of severe morbidity and mortality. Early administration of adequate anti-infective treatment is directly related to the survival of patients who are critically ill (1–5). Rapid identification of causative microorganisms and antimicrobial susceptibility testing (AST) are crucial for streamlining antimicrobial therapy. During BSIs, the quantity of microorganisms present in the blood can range from below 0.1 to 10⁴ CFU/mL (6, 7). Blood cultures are highly sensitive for detection of BSIs and are typically collected before administration of empiric antimicrobial therapy. In most microbiology laboratories, blood cultures are continuously monitored by automated incubators to detect the growth of bacteria and fungi. Gram stain and microscopy are the first steps performed on any positive blood culture (PBC) for characterizing causative microorganisms, before species identification and AST become available (8). Timely reporting of microscopy results has demonstrated a great impact on the administration of adequate antimicrobial therapy (9, 10).
Over the past decade, the automation of diagnostic workflows has made significant inroads into microbiology laboratories (11). Recent advances in microbiological diagnostic applications, supported by artificial intelligence (AI), hold the potential to expedite analyses, improve efficiency, enhance sensitivity, and reduce errors (12). A key area for AI implementation in medical diagnostics is the interpretation of image data (13). Most applications employ deep convolutional neural networks (CNNs) for visual pattern recognition. Consequently, CNNs have been successfully applied across various medical disciplines, including pathology, cardiology, dermatology, and microbiology (13, 14).
Manual microscopy can be labor intensive, time consuming, and subjective (15, 16). Automated microscopy and CNN-based interpretation have shown benefits in microbiology diagnostics, such as enhanced sensitivity and reduced workload for detecting acid-fast bacilli in respiratory samples (17). Additionally, a proof-of-concept study on Gram stain microscopy from PBCs demonstrated that automated imaging and image analysis could accurately classify different Gram staining reactions and morphologies with high sensitivity and specificity (18, 19). To our knowledge, no systems for automated microscopy and image analysis of Gram stains from PBCs are currently used in routine diagnostics within microbiology laboratories. In this study, we evaluated a scan and analysis system that combines fully automated digital microscopy with a CNN for assisted image analysis of Gram stains from PBCs.
Slide collection and manual microscopy
Between May 2020 and January 2021, a total of 1,730 Gram-stained slides of PBCs from routine diagnostics were collected at the Department of Infectious Diseases, University Hospital Heidelberg, Germany, and the Centre for Laboratory Medicine, Division of Human Microbiology, St. Gall, Switzerland. These samples were prepared from blood culture vials that were reported as positive by the BD BACTEC FX blood culture system. Following preparation, the slides of blood culture smears were subjected to the standard fixation and Gram-staining procedures of the respective laboratories. In Heidelberg, slides were manually heat-fixed and stained, whereas in St. Gall, slides were fixed manually with methanol and stained by an automated system (PREVI Color Gram; bioMérieux). Each day, several slides were randomly collected without prior assessment of staining quality. Slides were also not pre-selected based on organism morphology before collection. Gram stains from PBCs included samples from both aerobic and anaerobic blood culture vials (BD BACTEC Plus Aerobic/F, BD BACTEC Plus Anaerobic/F, and BD BACTEC Lytic/10 Anaerobic/F). In Heidelberg, a medical professional manually examined each slide using a 100× oil immersion objective as part of the routine laboratory workflow. In St. Gall, the Gram-stained slides were digitized and manually interpreted using Metafer (MetaSystems, Altlussheim). Gram staining reaction, arrangement, and morphology of the microorganisms were documented. Culture-based species identification by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) was available for each microorganism from PBCs.
The laboratories in Heidelberg and St. Gall serve as training facilities for trainees, and the experience of the technical staff ranges from several months to 25 years. In Heidelberg, 20 different technical staff members prepared the Gram stains of PBCs. Four residents specializing in clinical microbiology, each with a minimum of 3 years of training, and three consultants with 10–25 years of experience in clinical microbiology manually interpreted the Gram stains. In St. Gall, a dedicated team of six technical staff members performed routine streaking and staining procedures. Manual reading of digitized Gram stains from PBCs was conducted by two experienced technical staff members under supervision of a third technical staff member. All medical professionals performing Gram stain interpretation possessed extensive expertise in reading Gram stains.
CNN development and training
The blood culture application (BCA), developed for this study, is a customized software program that enables Metafer to scan and evaluate Gram stains from PBCs. A total of 314 slides, distinct from those used in the clinical validation phase, were utilized to generate training data for the BCA. These Gram stains originated from PBCs processed within the routine laboratory diagnostics in Heidelberg. Both heat fixation and staining were conducted manually, using the same methods and equipment as those applied to the Heidelberg slides of the clinical validation phase. The training slides were scanned at MetaSystems GmbH in Altlussheim, Germany, using the same scanner model as in both Heidelberg and St. Gall during the clinical validation phase. To ensure robust CNN training, these slides encompassed a broad spectrum of morphologies and staining qualities. From each captured field of view, smaller grid images measuring 144 × 144 pixels were generated. Suitable grid images were selected and manually classified to cover seven different classes: background/false positive, Gram-positive cocci in clusters (GPCCL), Gram-positive cocci in pairs (GPCP), Gram-positive cocci in chains (GPCC), rod-shaped bacilli (RSB), yeasts, and polymicrobial specimens (PMS). Images classified as PMS contained multiple morphologies, including any combination of Gram-positive cocci, rod-shaped bacilli, or yeasts. Owing to its complexity and the limited training data for PMS, this class was included solely to collect additional training data during the study. A total of 224,397 classified grid images were split at the slide level into 90% training images and 10% test images. Keras (version 2.2.4) with a TensorFlow backend (version 1.12.0) was then used to train a VGG16-based CNN with seven output classes (20–22).
Image augmentation techniques, such as flipping, rotation, random color and brightness shifting, as well as random zooming and spatial shifting, were used to enhance robustness and to prevent overfitting.
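The slide-level split described above can be illustrated with a minimal sketch. It is not the code used in the study: the function name `split_by_slide`, the `(slide_id, tile_id)` representation, and the toy data are assumptions for illustration. The sketch also assigns 10% of slides to the test set, which only approximates a 10% share of images when slides contribute different numbers of grid images.

```python
import random
from collections import defaultdict


def split_by_slide(tiles, test_fraction=0.1, seed=0):
    """Split (slide_id, tile_id) pairs so that no slide appears in both sets.

    Hypothetical helper: holds out a fraction of *slides*, not of images.
    """
    by_slide = defaultdict(list)
    for slide_id, tile_id in tiles:
        by_slide[slide_id].append((slide_id, tile_id))

    slide_ids = sorted(by_slide)
    random.Random(seed).shuffle(slide_ids)  # reproducible shuffle

    n_test = max(1, round(len(slide_ids) * test_fraction))
    test_slides = set(slide_ids[:n_test])

    train = [t for s in slide_ids if s not in test_slides for t in by_slide[s]]
    test = [t for s in slide_ids if s in test_slides for t in by_slide[s]]
    return train, test


# Toy data: 20 slides with 5 grid images each (illustrative only).
toy_tiles = [(f"slide{s:03d}", t) for s in range(20) for t in range(5)]
train_tiles, test_tiles = split_by_slide(toy_tiles)
train_slide_ids = {s for s, _ in train_tiles}
test_slide_ids = {s for s, _ in test_tiles}
```

Splitting by slide rather than by image keeps near-identical neighboring grid images from the same smear out of both sets at once, which would otherwise inflate test accuracy.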
BCA classification system
During development of the BCA, we made decisions regarding the implementation of different classes for image classification. We chose to classify all rod-shaped bacilli into a single class rather than dividing them into Gram-negative and Gram-positive bacilli. This decision was based on the relative rarity of true Gram-positive rods in PBCs. Although the CNN technically has the capability to classify Gram-positive and Gram-negative bacilli separately based on their staining, it cannot distinguish between over-decolorized, under-decolorized, and regularly stained microorganisms with the same morphology. To address our concern that separate classification might lead to misinterpretations, we established a combined class of rod-shaped bacilli to closely resemble the visual experience of manual microscopy. For similar reasons, no separate class for Gram-negative cocci was established. This decision stemmed from the fact that true Gram-negative cocci, such as Neisseria meningitidis, are relatively rare in PBCs. In our experience, most Gram-negative cocci observed in Gram stains are actually over-decolorized Gram-positive cocci.
Automated microscopy
In Heidelberg, the Gram stains used were leftovers from routine diagnostics. Non-coverslipped slides were cleaned of immersion oil and then scanned and imaged using Metafer. In St. Gall, Gram stains were directly digitized with Metafer as part of routine laboratory procedures. Both digitization and subsequent analysis of the captured images by the BCA took place at their respective locations. Metafer systems of the same model and software were employed throughout the study in Heidelberg, St. Gall, and Altlussheim.
In the scanning process, Metafer initially creates a low magnification sample overview using a 10 × Apochromat dry objective (Zeiss) and a CoolCube four color camera (4,096 × 3,000 pixels, 3.45 µm pixel size; MetaSystems). The software then automatically pre-selects suitable sample areas based on morphology and density criteria. Subsequently, immersion oil is automatically applied, and areas of interest are captured with a 40 × Apochromat oil immersion objective (Zeiss). Following capture, each camera image is divided into 560 image tiles (144 × 144 pixels each). The BCA evaluates these tiles for the presence of Gram-stained objects, classifying them into seven different classes. As outlined earlier, a CNN trained on PBC samples was utilized. For each slide, the data of detected objects are presented as a column chart, as illustrated in Fig. 1. In addition to the column chart, a gallery displays all the image tiles with detected objects, which can be sorted by various criteria, such as CNN score. The CNN score describes the confidence level with which the CNN assigns an object to a class. Additionally, captured camera fields can be reviewed, providing a visual experience like that of manual microscopy, complete with slide navigation and zoom magnification. The data provided by the system are not interpreted and must be evaluated by medical professionals before reporting.
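The tile count given above is consistent with a simple non-overlapping grid. The following sketch assumes that tiles do not overlap and that the leftover border of the camera image is discarded. This assumption is not stated in the text, but it reproduces the figure of 560 tiles for a 4,096 × 3,000 pixel image.

```python
CAMERA_WIDTH, CAMERA_HEIGHT = 4096, 3000  # camera image size in pixels
TILE_SIZE = 144                           # tile edge length in pixels


def tile_origins(width, height, tile):
    """Top-left corners of all full, non-overlapping tiles (assumed layout)."""
    return [
        (x, y)
        for y in range(0, height - tile + 1, tile)
        for x in range(0, width - tile + 1, tile)
    ]


origins = tile_origins(CAMERA_WIDTH, CAMERA_HEIGHT, TILE_SIZE)
columns = CAMERA_WIDTH // TILE_SIZE   # 28 full tiles per row
rows = CAMERA_HEIGHT // TILE_SIZE     # 20 full tiles per column
print(columns, rows, len(origins))    # 28 20 560
```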
Spiked blood cultures
Blood cultures were spiked with microorganisms to simulate BSIs and evaluate analytical performance. In Heidelberg, 8–10 mL of blood from healthy human donors was aseptically collected via venipuncture and immediately inoculated into blood culture vials. Additionally, the donors’ blood was inoculated on blood agar (Columbia agar, 5% sheep blood; BD) to detect potential contamination during the blood draw. Written consent was obtained from all donors. In St. Gall, blood cultures from routine diagnostics, which had been incubated for 5 days and reported as negative, were utilized. Gram stains were conducted on these negative blood cultures to confirm the absence of microorganisms prior to spiking. Fresh overnight cultures of microorganisms from the American Type Culture Collection (Manassas, VA, USA) were diluted in 0.9% sodium chloride and inoculated into the prepared blood culture vials to achieve a final concentration of approximately 20 CFU/mL. Following incubation and reporting as positive, Gram stains were prepared according to the respective laboratory protocols, as described earlier. Table 1 presents a summary of the microorganisms and type strains used.
TABLE 1
Class | Organism name | Type strain |
---|---|---|
GPCCL | Staphylococcus aureus | ATCC 29213 |
GPCP | Streptococcus pneumoniae | ATCC 49619 |
GPCC | Streptococcus agalactiae | ATCC 27956 |
RSB | Escherichia coli | ATCC 25922 |
Yeast | Candida albicans | ATCC 90029 |
Yeast | Candida glabrata | ATCC MYA-2950 |
Analysis of BCA analytical performance
When determining the accuracy of Gram stain interpretation, results are usually correlated to identification results of the microorganism from corresponding subcultures. However, results of manual microscopy are the information typically provided to clinicians at the time of blood culture positivity and Gram staining. Therefore, both manual microscopy and culture-based species identification by MALDI-TOF MS were used as reference standards for BCA-assisted interpretation to calculate positive percent agreement (PPA) and negative percent agreement (NPA) for the different classes. PPA denotes the proportion of true positives accurately identified, whereas NPA denotes the proportion of true negatives. We used the Wilson score interval method to calculate 95% confidence intervals (CIs) for PPA and NPA. In this study, true positives are instances where the BCA-assisted interpretation corresponds with the reference standard result, such as the correct classification of Staphylococcus aureus as GPCCL. Conversely, true negatives in the GPCCL class could include RSB that were correctly identified as not being GPCCL.
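As a minimal sketch of the calculation described above, PPA and NPA for one class can be derived from paired results, and a 95% CI can be obtained with the Wilson score formula. The function names and the toy result lists are hypothetical, and the code is not taken from the study.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z = 1.96 for 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def agreement_counts(reference, assisted, cls):
    """Return ((agreeing positives, reference positives),
               (agreeing negatives, reference negatives)) for one class."""
    tp = fn = tn = fp = 0
    for ref, res in zip(reference, assisted):
        if ref == cls:
            if res == cls:
                tp += 1
            else:
                fn += 1
        elif res == cls:
            fp += 1
        else:
            tn += 1
    return (tp, tp + fn), (tn, tn + fp)


# Toy paired results (illustrative only, not study data).
reference = ["GPCCL", "GPCCL", "RSB", "Yeasts", "RSB"]
assisted = ["GPCCL", "RSB", "RSB", "Yeasts", "RSB"]
(ppa_hits, ppa_n), (npa_hits, npa_n) = agreement_counts(reference, assisted, "GPCCL")
ppa = ppa_hits / ppa_n   # 1 of 2 reference GPCCL slides agree
npa = npa_hits / npa_n   # 3 of 3 non-GPCCL slides agree

# Interval for an overall agreement of 1,470 of 1,555 slides.
low, high = wilson_interval(1470, 1555)
```

Unlike the simple normal approximation, the Wilson interval stays within 0% to 100% and behaves sensibly for proportions near 100%, which matters for the high agreement values reported here.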
Gram stains from blood cultures spiked and reported as positive, as described earlier, were utilized to evaluate analytical performance unless otherwise stated. For the preparation of yeast samples, Candida glabrata was used in Heidelberg and Candida albicans was used in St. Gall.
To assess the precision and reliability of the BCA, both repeatability and reproducibility were evaluated. Results were deemed accurate when the BCA-assisted interpretation corresponded with the expected morphological class of the spiked microorganism, as indicated in Table 1.
Repeatability was defined as the BCA’s ability to consistently yield reliable results across multiple scans of the same sample under uniform conditions. These scans were conducted at MetaSystems, with evaluations conducted by five distinct operators. For each of the five morphological classes, three replicates were created. Over a span of 20 days, each replicate was scanned four times a day, with two consecutive scans at two different times. This led to a cumulative total of 240 scans for each class.
Reproducibility was defined as the ability to consistently obtain accurate BCA-assisted results when scanning different samples in Heidelberg, St. Gall, and Altlussheim. Different operators conducted the scanning and evaluation at their respective locations to ensure a blinded evaluation. For each class, 30 replicate Gram stains were prepared for each location. The replicates for both St. Gall and Altlussheim were prepared in St. Gall, whereas the replicates for Heidelberg were prepared on site. Over 5 consecutive days, six replicates from each class were randomly scanned at their respective locations.
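The scan counts implied by the repeatability and reproducibility designs can be checked with simple arithmetic. The variable names below are illustrative, and all values are those stated in the text.

```python
# Repeatability: scans per class at a single site.
replicates_per_class = 3
scans_per_day = 4          # two consecutive scans at two different times
study_days = 20
repeatability_scans = replicates_per_class * scans_per_day * study_days

# Reproducibility: slides per class across the three sites.
sites = ["Heidelberg", "St. Gall", "Altlussheim"]
replicates_per_site = 30
replicates_per_day = 6
scan_days = 5
reproducibility_slides = replicates_per_site * len(sites)
slides_scanned_per_site = replicates_per_day * scan_days

print(repeatability_scans)       # 240
print(reproducibility_slides)    # 90
print(slides_scanned_per_site)   # 30
```

The total of 90 slides per class corresponds to the denominators reported for most classes in the reproducibility results.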
The limit of detection (LOD) of the BCA for GPCCL, GPCP, GPCC, RSB, and yeasts was determined by testing serial dilutions of five different microorganisms prepared from positive aerobic and anaerobic blood culture vials (BD BACTEC Plus Aerobic/F and BD BACTEC Plus Anaerobic/F). To confirm the LOD, 20 replicates at the estimated LOD target level were tested. These target levels were determined by colony counts. The LOD was considered to be confirmed if ≥19/20 replicates yielded a positive result for the BCA under evaluation.
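The confirmation rule stated above reduces to a simple count. The sketch below uses a hypothetical helper name and encodes only the ≥19/20 criterion given in the text.

```python
def lod_confirmed(replicate_results, min_positive=19, n_required=20):
    """True if at least 19 of exactly 20 replicates were positive."""
    if len(replicate_results) != n_required:
        raise ValueError(f"expected {n_required} replicates")
    return sum(bool(r) for r in replicate_results) >= min_positive


# Illustrative runs: one missed replicate is tolerated, two are not.
one_miss = [True] * 19 + [False]
two_misses = [True] * 18 + [False] * 2
print(lod_confirmed(one_miss), lod_confirmed(two_misses))  # True False
```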
Metafer training for medical professionals
In Heidelberg, one resident with 3 years of training conducted BCA-assisted interpretations. In St. Gall, two technical staff members, each with at least 5 years of experience, performed result interpretations under the supervision of a consultant with 6 years of training. The St. Gall staff was familiar with digitized Gram stain reading from Metafer, as it had already been integrated into their routine laboratory workflow. All medical professionals participating in the study received a test set of 100 Gram stains, consisting of 20 slides of each class, before the study commenced. Initially, they were introduced to the graphical user interface of the BCA and briefed on result distribution based on probability values. They were then instructed to use their expertise to interpret the Gram stains. A system-knowledgeable expert from MetaSystems supervised this training. Medical professionals were considered sufficiently skilled to work with the system after successfully interpreting these 100 test slides. No further training was provided during the study, and participants were not allowed to review either the original slide or digitized images from the routine laboratory workflow. For clinical validation, medical professionals were instructed to interpret the images based solely on their expertise and the information provided by the BCA, without any additional clinical information. All participants were blinded to the results of routine laboratory microscopy and cultural species identification by MALDI-TOF MS.
Inclusion and exclusion criteria
Inclusion criteria were defined as
Monomicrobial or false-positive samples.
Saved BCA-assisted result interpretation.
Results for routine Gram stain interpretation and culture-based species identification were available.
Exclusion criteria were defined as
PMS: the CNN was not sufficiently trained to handle PMS.
Duplicate sample: the same sample was accidentally scanned twice, resulting in duplicated data entries.
Sample material out of scope: non-blood sample in blood culture vial.
Incomplete data sets: although the BCA analysis was conducted, the results were not saved, leading to incomplete data records.
Errors in reference standard procedure: results for routine Gram stain interpretation and/or culture-based species identification were unavailable.
Technical errors in scanning procedure: operational issues, such as failure to achieve initial focusing.
To prevent unintentional bias and due to the continuous influx of new samples, we did not rescan or reanalyze any specimens, despite the potential feasibility for some of the excluded samples.
Clinical validation
During the study period, 1,730 Gram stains, prepared from blood cultures reported as positive, were scanned and pre-classified by the BCA. These were then reviewed and interpreted by medical professionals. A total of 175 slides were excluded from the study, leaving 1,555 Gram stains, comprising monomicrobial and false-positive slides, for evaluation. Excluded samples and exclusion criteria are detailed in Table 2.
TABLE 2
Exclusion criteria | Number of samples |
---|---|
Polymicrobial sample | 73 |
Duplicate sample | 2 |
Sample material out of scope | 12 |
Incomplete data sets | 44 |
Errors in reference standard procedure | 2 |
Technical errors in Gram scanner procedure | 42 |
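The exclusion counts in Table 2 account for the difference between scanned and evaluated slides, as the following short check shows. The dictionary keys mirror the table rows, and the variable names are illustrative.

```python
scanned_slides = 1730

# Number of excluded samples per criterion, as listed in Table 2.
exclusions = {
    "Polymicrobial sample": 73,
    "Duplicate sample": 2,
    "Sample material out of scope": 12,
    "Incomplete data sets": 44,
    "Errors in reference standard procedure": 2,
    "Technical errors in Gram scanner procedure": 42,
}

excluded_total = sum(exclusions.values())
evaluated_slides = scanned_slides - excluded_total
print(excluded_total, evaluated_slides)  # 175 1555
```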
The results of assisted Gram stain interpretation were compared to those of manual microscopy and culture-based species identification by MALDI-TOF MS. To enhance diagnostic precision assessment, manual microscopy was also compared to MALDI-TOF MS. For comparison with culture-based identification, isolates identified by MALDI-TOF MS were classified into the class that best matched their expected morphology, arrangement, and Gram staining reaction, such as Staphylococcus epidermidis as GPCCL or Escherichia coli as RSB. Table 3 presents an overview of identified species and their respective classification. The two classes of GPCP and GPCC were combined for the analysis. Enterococci are typically considered to form short chains, while streptococci form pairs or long chains; however, in Gram stains, both genera show high variability in their morphology (23, 24). Single cocci, cocci in pairs, short chains, and long chains may appear concurrently, particularly when considering the influence of sample quality and antibiotic treatment (25–27). Manual microscopy showed 567 GPCCL, 218 GPCP/GPCC, 665 RSB, and 36 yeasts. Sixty-nine slides were identified as background/false positive. Morphologies based on species identification by MALDI-TOF MS were determined as 566 GPCCL, 218 GPCP/GPCC, 667 RSB, and 39 yeasts. Sixty-five slides originated from culture-negative blood cultures and were thus classified as false-positive. The outcomes of the comparative analysis of clinical validation results are presented in Table 4.
TABLE 3
Identified species and classification | Number of isolates |
---|---|
Gram-positive cocci in clusters | 566 |
Aerococcus urinae, Micrococcus luteus, and Staphylococcus spp. | |
Gram-positive cocci in pairs and chains | 218 |
Enterococcus spp., Gemella sanguinis, and Streptococcus spp. | |
Rod-shaped bacilli | |
Gram-negative bacilli | 644 |
Acinetobacter spp., Bacteroides fragilis, Brevundimonas spp., Citrobacter spp., Enterobacter spp., Escherichia coli, Fusobacterium nucleatum, Haemophilus influenzae, Klebsiella spp., Moraxella osloensis, Morganella morganii, Proteus spp., Pseudomonas spp., Raoultella ornithinolytica, Serratia marcescens, and Stenotrophomonas maltophilia | |
Gram-positive bacilli | 23 |
Actinobaculum schaalii, Actinomyces oris, Bacillus spp., Brevibacterium paucivorans, Clostridium tertium, Corynebacterium spp., Cutibacterium acnes, and Lactobacillus spp. | |
Yeasts | 39 |
Candida spp. and Cryptococcus neoformans | |
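The assignment of MALDI-TOF MS identifications to expected morphological classes can be sketched as a genus lookup. The mapping below covers only a subset of the genera listed in Table 3. The function name, the use of `None` for culture-negative vials, and the "N/FP" label for that case are assumptions made for illustration.

```python
# Subset of genera from Table 3 mapped to their expected class.
GENUS_TO_CLASS = {
    "Staphylococcus": "GPCCL",
    "Micrococcus": "GPCCL",
    "Aerococcus": "GPCCL",
    "Enterococcus": "GPCP/GPCC",
    "Streptococcus": "GPCP/GPCC",
    "Gemella": "GPCP/GPCC",
    "Escherichia": "RSB",
    "Klebsiella": "RSB",
    "Pseudomonas": "RSB",
    "Corynebacterium": "RSB",   # Gram-positive bacilli share the RSB class
    "Candida": "Yeasts",
    "Cryptococcus": "Yeasts",
}


def expected_class(species_name):
    """Expected class for a species name; None means no growth in culture."""
    if species_name is None:
        return "N/FP"
    genus = species_name.split()[0]
    return GENUS_TO_CLASS.get(genus)  # None if the genus is not in the subset


# Isolate counts from Table 3 plus the 65 culture-negative slides.
class_counts = {"GPCCL": 566, "GPCP/GPCC": 218, "RSB": 644 + 23, "Yeasts": 39, "N/FP": 65}
total_slides = sum(class_counts.values())
print(expected_class("Staphylococcus epidermidis"), total_slides)  # GPCCL 1555
```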
TABLE 4
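Comparison | Class | PPA (%) | 95% CI (PPA) | NPA (%) | 95% CI (NPA) |
---|---|---|---|---|---|
BCA-assisted interpretation to manual microscopy | GPCCL | 95.8 | 93.8–97.1 | 98.0 | 96.8–98.6 |
BCA-assisted interpretation to manual microscopy | GPCP/GPCC | 87.6 | 82.5–91.3 | 99.3 | 89.6–99.5 |
BCA-assisted interpretation to manual microscopy | RSB | 97.4 | 95.9–98.4 | 97.8 | 96.6–98.5 |
BCA-assisted interpretation to manual microscopy | Yeasts | 83.3 | 68.1–92.1 | 99.3 | 98.8–99.6 |
BCA-assisted interpretation to manual microscopy | Negative/false positive | 87.0 | 77.0–93.0 | 98.5 | 97.7–99.0 |
BCA-assisted interpretation to MALDI-TOF MS | GPCCL | 95.8 | 93.6–97.0 | 97.9 | 96.7–98.5 |
BCA-assisted interpretation to MALDI-TOF MS | GPCP/GPCC | 87.6 | 82.6–91.4 | 99.3 | 98.6–99.6 |
BCA-assisted interpretation to MALDI-TOF MS | RSB | 97.3 | 95.8–98.3 | 97.9 | 96.7–98.6 |
BCA-assisted interpretation to MALDI-TOF MS | Yeasts | 79.5 | 64.5–89.2 | 99.4 | 98.9–99.7 |
BCA-assisted interpretation to MALDI-TOF MS | Negative/false positive | 87.7 | 75.6–93.6 | 98.3 | 97.5–98.8 |
Manual microscopy to MALDI-TOF MS | GPCCL | 99.8 | 99.0–100.0 | 99.8 | 99.3–99.9 |
Manual microscopy to MALDI-TOF MS | GPCP/GPCC | 98.6 | 96.0–99.5 | 99.8 | 99.3–99.9 |
Manual microscopy to MALDI-TOF MS | RSB | 99.6 | 98.7–99.9 | 99.9 | 99.4–100.0 |
Manual microscopy to MALDI-TOF MS | Yeasts | 92.3 | 79.9–97.4 | 100.0 | 99.8–100.0 |
Manual microscopy to MALDI-TOF MS | Negative/false positive | 100.0 | 94.4–100.0 | 99.7 | 99.3–99.9 |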
Analysis of discrepant results comparing BCA-assisted interpretation with manual microscopy and cultural species identification by MALDI-TOF MS identified several factors likely contributing to errors in BCA-assisted interpretation. These factors included a low microbial load and the selection of areas of interest lacking microorganisms. Furthermore, staining variations, such as over-decolorization, and variability in microorganism morphology due to prior antibiotic treatment led to incorrect interpretations. A comprehensive analysis of all discrepant samples, encompassing the results of manual microscopy, BCA-assisted interpretation, and species identification by MALDI-TOF MS, is summarized in Table 5.
TABLE 5
Interpretation | |||||||
---|---|---|---|---|---|---|---|
MALDI-TOF MS species identification | Number of isolatesb | GPCCL | GPCP/GPCC | RSB | Yeasts | N/FP | Probable BCA-assisted interpretation explanation |
Candida albicans | 1 | B, C | A | BCA superior to manual microscopy | |||
Escherichia coli | 2 | A | B, C | ||||
Staphylococcus intermedius | 1 | B, C | A | ||||
Candida albicans | 2 | B | A, C | No microorganism in scanned fields of view and/or low microbial load | |||
Candida albicans | 1 | A, B | C | ||||
Candida dubliniensis | 1 | A, B | C | ||||
Candida glabrata | 3 | A, B | C | ||||
Candida tropicalis | 1 | A, B | C | ||||
Enterococcus faecium | 1 | A, B | C | ||||
Escherichia coli | 2 | A, B | C | ||||
Pseudomonas aeruginosa | 1 | A, B | C | ||||
Staphylococcus aureus | 6 | A, B | C | ||||
Staphylococcus epidermidis | 5 | A, B | C | ||||
Staphylococcus hominis | 2 | A, B | C | ||||
Streptococcus constellatus | 1 | B | A, C | ||||
Actinobaculum schaalii | 1 | A, C | B | Variation in microorganism morphology | |||
Cutibacterium acnes | 1 | C | A, B | ||||
Enterococcus faecalis | 3 | C | A, B | ||||
Enterococcus faecium | 2 | C | A, B | ||||
Staphylococcus aureus | 2 | A, B | C | ||||
Staphylococcus epidermidis | 5 | A, B | C | ||||
Staphylococcus hominis | 1 | A, B | C | ||||
Staphylococcus lugdunensis | 1 | A, B | C | ||||
Streptococcus agalactiae | 1 | C | A, B | ||||
Streptococcus anginosus | 1 | C | A, B | ||||
Streptococcus dysgalactiae | 1 | C | A, B | ||||
Streptococcus gallolyticus | 1 | C | A, B | ||||
Streptococcus oralis | 2 | C | A, B | ||||
Streptococcus salivarius | 1 | A, C | B | ||||
Streptococcus salivarius | 1 | C | A, B | ||||
Streptococcus sanguinis | 2 | C | A, B | ||||
Enterococcus faecalis | 4 | A, B | C | Variation in microorganism morphology and over-decolorization | |||
Staphylococcus aureus | 2 | A, B | C | ||||
Streptococcus constellatus | 1 | B | A, C | ||||
Streptococcus oralis | 1 | A, B | C | ||||
Streptococcus pneumoniae | 4 | A, B | C | ||||
Citrobacter freundii | 4 | A, B | C | Variation in microorganism morphology and under-decolorization | |||
Enterobacter spp. | 1 | A, B | C | ||||
Escherichia coli | 2 | C | A, B | ||||
Escherichia coli | 3 | A, B | C | ||||
Klebsiella pneumoniae | 2 | C | A, B | ||||
Klebsiella pneumoniae | 1 | A, B | C | ||||
No cultural growth | 7 | C | A, B | BCA might be superior, antibiotic treatment or fastidious microorganism resulting in no cultural growth. | |||
No cultural growth | 1 | C | A, B |
Analytical performance
Reproducibility results demonstrated high consistency, achieving 100% accuracy across all morphological classes, except for the class of GPCP. Detailed results can be found in Table 6.
TABLE 6
Class | Organism name | Reproducibility (%) |
---|---|---|
GPCCL | Staphylococcus aureus | 100 (90/90) |
GPCP | Streptococcus pneumoniae | 98.9 (89/90)b |
GPCC | Streptococcus agalactiae | 100 (89/89)c |
RSB | Escherichia coli | 100 (90/90) |
Yeast | Candida albicans | 100 (60/60) |
Yeast | Candida glabrata | 100 (30/30) |
Repeatability results showed no differences between operators, time points, or days of testing, yielding a 100% repeatability rate for each class tested.
The LOD, defined as the lowest concentration at which Metafer could reliably detect microorganisms in Gram stains from aerobic and anaerobic blood culture vials, was 10⁵ CFU/mL for each class. The only exception was C. albicans, which showed no detectable growth in anaerobic blood culture vials with the inoculum used.
For BSIs, early administration of adequate anti-infective treatment is crucial for patient survival and prognosis improvement (1, 3, 5). Blood cultures have the highest sensitivity for detection of causative microorganisms and are a fundamental component of microbiological diagnostics. The interpretation of Gram stains from PBCs provides the first microbiological information to guide the choice of antimicrobial therapy before species identification or AST becomes available. However, manual microscopy remains labor intensive, time consuming, and subjective (15, 16). Furthermore, microbiology laboratories are challenged by staff shortages, increased sample volumes, and cost containment pressures. Concurrently, there is a need to reduce reporting times while maintaining high-quality results (16). Over the past decade, automation, digital microscopy, and, most recently, AI-assisted image analysis have been successfully implemented in various medical disciplines to support routine diagnostics (13, 14, 17).
The aim of this study was to assess the potential of an automated scanning and image analysis system in aiding the interpretation of Gram stains from PBCs within routine laboratory diagnostics. Manual microscopy of Gram stains from PBCs can be challenging due to effects of prior antimicrobial treatment, smear density, staining variability, artifacts, and sample distribution. These issues are also relevant in digital microscopy and CNN-based image analysis. For a CNN to provide reliable and consistent data, it must effectively manage slide-to-slide variability for subsequent analysis and interpretation. The CNN developed for this study was trained to classify microorganisms in Gram stains from PBCs based on their staining, arrangement, and morphology as Gram-positive cocci in clusters, pairs, or chains, as well as RSB and yeasts. Images containing multiple morphologies and/or different staining were classified as PMS, whereas those lacking detectable objects were classified as background/false positive. Given the complexity, the CNN was not sufficiently trained to classify PMS. Instead, this class was included to collect data to inform future advancements.
We hypothesize that trained microbiologists can identify the Gram staining reaction and morphology of microorganisms in Gram stains from PBCs with high accuracy. However, there is currently a lack of data on error rates for the manual interpretation of Gram stains from PBCs, and no established benchmarks exist. Two laboratories reported mean error rates of <1%, but yeasts were excluded from their studies (28, 29). Our study, which includes yeasts, yielded data that align with these published error rates. A comparison of manual microscopy results with corresponding subculture identification by MALDI-TOF MS in our data set revealed a similar error rate of <1% (10 of 1,555). Considering the results of other studies as well as our own data, we suggest that an error rate of 1%–2% might be a reasonable benchmark for the accuracy of manual Gram stain interpretations of PBCs (28–31). Compared to MALDI-TOF MS, the BCA-assisted interpretation demonstrated an error rate of 5.5% (85 of 1,555) and a correct identification rate of 94.5% (1,470 of 1,555). This error rate was higher than that of manual microscopy, and the BCA-assisted interpretation has not yet met the proposed benchmark.
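The error rates quoted above follow directly from the reported counts. The short calculation below uses illustrative variable names and takes the 2% upper end of the suggested benchmark range as the comparison value.

```python
TOTAL_SLIDES = 1555
BENCHMARK_PERCENT = 2.0  # upper end of the suggested 1%-2% benchmark


def percent(count, total):
    """Share of `count` in `total`, in percent."""
    return 100 * count / total


manual_error = percent(10, TOTAL_SLIDES)      # manual microscopy vs MALDI-TOF MS
bca_error = percent(85, TOTAL_SLIDES)         # BCA-assisted vs MALDI-TOF MS
bca_correct = percent(1470, TOTAL_SLIDES)

print(round(manual_error, 2))                 # 0.64
print(round(bca_error, 1), round(bca_correct, 1))  # 5.5 94.5
print(manual_error <= BENCHMARK_PERCENT, bca_error <= BENCHMARK_PERCENT)  # True False
```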
While the comparison of manual microscopy to MALDI-TOF MS yielded high agreement across all classes, BCA-assisted interpretation was inferior to both methods. In evaluating BCA-assisted interpretation against manual microscopy and MALDI-TOF MS, the highest accuracy was observed for the identification of GPCCL and RSB, the microorganisms most frequently isolated from PBCs (32, 33). However, the PPA and NPA of the BCA-assisted interpretation showed non-overlapping 95% CIs for almost all classes, including GPCCL and RSB, when compared to manual microscopy and MALDI-TOF MS. The PPA for GPCCL was 95.8% in comparison to manual microscopy and 95.8% compared to MALDI-TOF MS, while the NPA was 98.0% and 97.9%, respectively. For RSB, the PPA was 97.4% compared to manual microscopy and 97.3% compared to MALDI-TOF MS, with the NPA at 97.8% and 97.9%, respectively. The BCA-assisted interpretation yielded lower PPA and NPA for all other classes. In summary, BCA-assisted interpretation correctly identified 94.7% (1,472 of 1,555) of all slides compared to manual microscopy, resulting in an error rate of 5.3% (83 of 1,555). We therefore conclude that the error rate must be reduced for successful clinical implementation.
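The statement about non-overlapping CIs can be checked against the PPA intervals in Table 4. The sketch below compares the intervals for BCA-assisted interpretation versus manual microscopy with those for manual microscopy versus MALDI-TOF MS. It is an informal overlap check under that pairing assumption, not a formal statistical test.

```python
def intervals_overlap(a, b):
    """True if closed intervals a = (low, high) and b = (low, high) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


# 95% CIs of the PPA (%) from Table 4:
# (BCA-assisted vs manual microscopy, manual microscopy vs MALDI-TOF MS)
ppa_cis = {
    "GPCCL": ((93.8, 97.1), (99.0, 100.0)),
    "GPCP/GPCC": ((82.5, 91.3), (96.0, 99.5)),
    "RSB": ((95.9, 98.4), (98.7, 99.9)),
    "Yeasts": ((68.1, 92.1), (79.9, 97.4)),
    "Negative/false positive": ((77.0, 93.0), (94.4, 100.0)),
}

overlapping = [cls for cls, (bca, manual) in ppa_cis.items()
               if intervals_overlap(bca, manual)]
print(overlapping)  # ['Yeasts']
```

For the PPA, only the yeast class shows overlapping intervals, which reflects its small sample size and correspondingly wide CIs.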
The analytical performance study demonstrated excellent reproducibility and repeatability for BCA-assisted interpretation. The concentrations of bacteria and yeasts at the time of blood culture positivity range between 10⁶ and 10⁹ CFU/mL (34–36). With an LOD of 10⁵ CFU/mL, which is at least 10-fold below this range, the detection of any microorganism in PBCs should therefore be feasible.
Switching from manual microscopy to BCA-assisted interpretation of Gram stains marks a considerable change in established workflow routines. Initially, medical professionals accustomed to manual Gram stain microscopy may find the condensed image presentation within the separate classes unfamiliar. Consequently, initial training is essential. Training with 100 test slides, as described, seems to be adequate for working reliably with the system, since no substantial differences in error rates were observed between the start and end of the study.
The number of microorganisms that the CNN can classify varies across species in Gram stains from PBCs. Slides with a high microbial load, often encountered with RSB, may contain countless classifiable objects within a few camera fields. Conversely, yeast samples typically show a very low microbial load. In our study, the most frequent error in interpreting yeast samples was reporting them as false negative. We hypothesize that the analysis process could be improved by implementing thresholds for the number of detected objects. For slides with a low microbial load, extending the scanning process would allow Metafer to find more objects. In contrast, for slides with a high microbial load, the analysis could be expedited by ending it once a defined number of objects has been classified. These thresholds have already been developed for testing purposes. However, as this approach was not implemented during the study, comprehensive data are lacking.
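The threshold idea can be illustrated with a minimal sketch. Every name and numeric value below is hypothetical: the text does not specify the thresholds developed for testing, and the function is not part of the Metafer or BCA software. The sketch only shows the general logic of stopping early on high-load slides and extending the scan on low-load slides.

```python
from itertools import islice


def scan_with_thresholds(objects_per_field, base_fields=10, max_fields=40,
                         min_objects=5, max_objects=200):
    """Decide when to stop scanning a slide (illustrative thresholds only).

    objects_per_field: iterable giving the number of classifiable objects
    found in each successive camera field (hypothetical input).
    Returns (fields_scanned, objects_found, reason).
    """
    scanned = 0
    found = 0
    for count in islice(objects_per_field, max_fields):
        scanned += 1
        found += count
        # High microbial load: enough objects classified, stop early.
        if found >= max_objects:
            return scanned, found, "high_load"
        # Standard scan length reached (or exceeded) with enough objects.
        if scanned >= base_fields and found >= min_objects:
            reason = "standard" if scanned == base_fields else "extended"
            return scanned, found, reason
    # Scan limit reached without the minimum number of objects.
    return scanned, found, "insufficient"


# Dense slide, e.g. many rods per field: stops after two fields.
print(scan_with_thresholds([100] * 40))
# Sparse slide, e.g. few yeast cells: scan is extended beyond the base length.
print(scan_with_thresholds([0] * 15 + [1] * 25))
```

A slide that ends with the reason "insufficient" would be a natural candidate for manual review rather than an automatic negative result, although that design choice is likewise an assumption.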
The average duration of scanning and analysis by the BCA is 2.5 minutes per slide. We conservatively estimate that a medical professional takes approximately 15 seconds for subsequent result interpretation. Published data on the average speed of manual microscopy and result interpretation of PBCs are lacking. However, based on our experience, it typically takes only a few seconds, often less than a minute, for true PBCs. In contrast, manual microscopy of Gram stains from false PBCs is substantially more time consuming than that of true PBCs. In 1%–10% of blood cultures reported positive by automated blood culture systems, no microorganisms are present in the Gram stain, and the corresponding subcultures show no growth of microorganisms. The majority of these false PBCs have been attributed to CO2 production by metabolically active cells such as white blood cells, overfilling of blood culture vials, or system errors (37, 38). With further development, the BCA may reduce the time required for the final evaluation of Gram stains from false PBCs.
Although manual microscopy demonstrated extremely low error rates, a frequent error is the reporting of polymicrobial infections as monomicrobial (29). As described earlier, the CNN has not yet been sufficiently trained on PMS. Consequently, these samples were excluded after matching the results of both manual microscopy and BCA-assisted interpretation with the cultural species identification. Because the operators were not trained on PMS interpretation using the BCA, the reported results varied inconsistently among operators.
Fully automated microscopy and pre-classification by the BCA offer advantages; however, it is also important to acknowledge its current limitations. Development and validation of CNNs require extensive training data sets. Acquiring and curating such data sets are resource intensive and time consuming. In the context of our study, this process is further complicated by variations in staining, changes in microorganism morphology, antimicrobial treatment, microbial load, artifacts, and sample distribution. CNN-based evaluations rely on pre-defined criteria and algorithms to ensure consistent result reporting. Therefore, specific classes for PMS, Gram-negative cocci, and separate classes for Gram-negative and Gram-positive bacilli have not been implemented. As described earlier, this decision was also supported by considerations regarding the use of the BCA in laboratory diagnostics. However, these factors are potential sources of error during Gram stain analysis. We emphasize that a medical professional must evaluate the results of the BCA before reporting. Although unintentional, a potential selection bias might have been introduced regarding the staining quality of the evaluated slides. It is possible that, during the random selection of Gram stains, slides that appeared visually to be better stained were preferentially selected. The question of whether manual microscopy or culture-based species identification should be the reference standard is a minor limitation of our study.
In conclusion, this study evaluated the accuracy and analytical performance of the CNN-based BCA in assisting the manual interpretation of Gram stains from PBCs. The application has demonstrated its capability to classify microorganisms based on their Gram staining reaction, arrangement, and morphology. Although the results are promising, the BCA is not yet ready for implementation in clinical laboratories. Nevertheless, if the error rates can be lowered by future advancements, the use of BCA-assisted interpretation may become feasible for routine diagnostics of PBCs.
We would like to thank Niclas Buob, Dragan Rendic and Saranya Suriyanarayanan for excellent technical assistance at St. Gall microbiology laboratory. We would also like to thank Editage for English language editing.
This work was carried out in accordance with the Declaration of Helsinki. Informed consent was obtained from all blood donors, and the study was approved by the ethics committee of Heidelberg Medical School, Heidelberg, Germany (no. S-351/2020).
Articles from Journal of Clinical Microbiology are provided here courtesy of American Society for Microbiology (ASM)