Integrated Clinical, Climate, and Environmental Prediction Modeling for Diagnosis of Spotted Fever Group Rickettsioses in northern Tanzania

2024, medRxiv (Cold Spring Harbor Laboratory)

medRxiv preprint doi: https://doi.org/10.1101/2024.06.20.24309257; this version posted June 21, 2024. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Robert J. Williams1, Ben J. Brintz1,2, William L. Nicholson3, John A. Crump4,5,6,7,8, Ganga Moorthy5,9, Venace P. Maro7,8, Grace D. Kinabo7,8, James Ngocho7,8, Wilbrod Saganda10,11, Daniel T. Leung1,12*+, Matthew P. Rubach4,5,7,13*

1. Division of Infectious Diseases, Department of Internal Medicine, University of Utah, Salt Lake City, Utah.
2. Division of Epidemiology, Department of Internal Medicine, University of Utah, Salt Lake City, Utah.
3. Rickettsial Zoonoses Branch, Division of Vector-Borne Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia.
4. Division of Infectious Diseases and International Health, Department of Medicine, Duke University, Durham, North Carolina.
5. Duke Global Health Institute, Duke University, Durham, North Carolina.
6. Center for International Health, University of Otago, Dunedin, New Zealand.
7. Kilimanjaro Christian Medical Centre, Moshi, Tanzania.
8. Kilimanjaro Christian Medical University College, Moshi, Tanzania.
9. Division of Pediatric Infectious Diseases, Department of Pediatrics, Duke University, Durham, North Carolina.
10. Mawenzi Regional Referral Hospital, Moshi, Tanzania.
11. Ministry of Health, Community Development, Gender, Elderly, and Children, Dodoma, Tanzania.
12. Division of Microbiology and Immunology, Department of Pathology, University of Utah, Salt Lake City.
13. Programme in Emerging Infectious Diseases, Duke-National University of Singapore Medical School, Singapore, Singapore.

* contributed equally
+ corresponding author
Email: daniel.leung@utah.edu (D.T.L.)

NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.

Abstract

Spotted fever group rickettsioses (SFGR) pose a global threat as emerging zoonotic infectious diseases; however, timely and cost-effective diagnostic tools are currently limited. While traditional clinical prediction models focus on individual patient-level parameters, we hypothesize that for infectious diseases, the inclusion of location-specific parameters such as climate data may improve predictive ability. To create a prediction model, we used data from 449 patients presenting to two hospitals in northern Tanzania between 2007 and 2008, of whom 71 (15.8%) met criteria for acute SFGR based on a ≥4-fold rise in antibody titers between acute and convalescent serum samples. We fit random forest classifiers incorporating clinical and demographic data from hospitalized febrile participants as well as satellite-derived climate predictors from the Kilimanjaro Region. In cross-validation, a prediction model combining clinical, climate, and environmental predictors (20 predictors total) achieved a statistically non-significant increase in the area under the receiver operating characteristic curve (AUC) compared to clinical predictors alone [AUC: 0.72 (95% CI: 0.57-0.86) versus AUC: 0.64 (95% CI: 0.48-0.80)].
In conclusion, we derived and internally validated a diagnostic prediction model for acute SFGR, demonstrating that the inclusion of climate variables alongside clinical variables improved model performance, though this difference was not statistically significant. Novel strategies are needed to improve the diagnosis of acute SFGR, including the identification of diagnostic biomarkers that could enhance clinical prediction models.

Background

Spotted fever group rickettsioses (SFGR) are a group of illnesses caused by bacteria from the genus Rickettsia that include endemic, new, and emerging zoonotic infectious diseases with a worldwide distribution. In several African countries, SFGR has been identified as the infectious etiology in 5-22% of febrile hospital admissions.1-3 Prompt recognition and treatment of SFGR are important, as multiple studies have shown that delay in initiating tetracycline antimicrobials is associated with increased morbidity and mortality.4-6 However, current diagnostic methods do not allow for timely and accurate diagnosis.
The most sensitive reference standard diagnostic, a 4-fold rise in immunofluorescent antibody (IFA) titer between paired acute and convalescent serum samples, requires convalescent serum collection, and therefore by definition cannot establish the diagnosis at the time of presentation.7 In the case of R. africae and R. conorii, SFGR of importance in African countries, seroconversion may not occur until 4 weeks after illness onset.8 Rickettsia are intracellular species and do not circulate extensively in the bloodstream, limiting the sensitivity of polymerase chain reaction (PCR) on blood specimens to around 60%.7 Laboratory findings such as thrombocytopenia, hyponatremia, and elevated transaminases are supportive features, but cannot be relied on to guide early management: they are non-specific, and early in the course of illness the values are often within normal limits or only slightly outside the reference range.9 Clinical diagnosis is also unreliable, because the triad classically associated with SFGR (a history of a tick bite, rash, and fever) occurs in only a minority of cases.9-12 Incorporating tetracycline therapy into the empiric syndromic management of febrile illness in high-prevalence settings would not only exacerbate antimicrobial resistance, but also subject children to the risks of tetracycline therapy, including bone growth suppression and permanent tooth discoloration.13,14 Thus, more accurate, timely, and cost-effective tools are needed for diagnosis of SFGR.

Clinical Decision-Support Systems (CDSS) incorporating prediction models have the potential to improve management of infectious diseases.
CDSS have proven effective at enhancing therapeutic management and reducing unnecessary diagnostic tests in both high-income countries (HICs)15 and low- and middle-income countries (LMICs).16-18 Traditional predictive models generally incorporate clinical information that is obtained solely from the presenting patient. However, as with other zoonoses, the range and incidence of SFGR have been associated with climate and climate-related environmental factors.19,20 Thus, incorporating location-specific parameters, such as climate and environmental data, into a prediction model may increase diagnostic accuracy.18,21,22

In this study, we demonstrate a ‘proof of concept’ by integrating location-specific parameters into clinical prediction. Our overarching goal is to create an accessible and cost-effective CDSS that assists clinicians in diagnosing SFGR. To achieve this, we used data from a clinical study of febrile illness in an SFGR-endemic region in northern Tanzania23 to develop a clinical prediction model that incorporates climate and environmental data.

Methods

Study Design, Setting, and Data Source
For the derivation and validation of a prediction model, we used deidentified data from a study of participants presenting with febrile illness to two hospitals in the Kilimanjaro Region of northern Tanzania, 2007-2008.24,25 Prior to data collection, the Kilimanjaro Region had a population of 1,380,000 in a mostly rural and semirural setting,26,27 and Moshi, the administrative center of the Kilimanjaro Region, had a population of approximately 144,000.27 The climate is characterized by a long rainy period (March-May) and a short rainy period (November-December). Febrile participants presenting to Kilimanjaro Christian Medical Centre (KCMC) or Mawenzi Regional Hospital (MRH) in Moshi, Tanzania from September 2007 through August 2008 were eligible for enrollment. Complete study methods have been described elsewhere.24,25 KCMC is a tertiary care hospital serving several Regions in northern Tanzania; at the time of the study KCMC had 458 inpatient beds. MRH, the regional hospital for Kilimanjaro, had 300 beds at the time of the study. Together, KCMC and MRH served as major providers of hospital-based care in the Moshi area.

For pediatric participants aged 2 months to 12 years, inclusion criteria were a history of fever in the past 48 hours, a measured axillary temperature ≥37.5°C, or a rectal temperature ≥38°C. For adolescent and adult participants (≥13 years of age), the inclusion criterion was an oral temperature ≥38°C on admission to the hospital. All participants required paired sera for inclusion in this analysis.
Blood specimens were collected for a complete blood count (CBC) and serologic infectious disease diagnostics. Participants were also tested for HIV and malaria with rapid diagnostic testing. After obtaining informed consent, a trained study team member collected standardized demographic data, clinical history, and physical examination findings. Participants were asked to return 4-6 weeks after enrollment for collection of a convalescent serum sample. Acute and convalescent serum samples collected for SFGR testing were sent to the Rickettsial Zoonoses Branch of the US Centers for Disease Control and Prevention (US CDC). Serum samples were tested for SFGR by IgG IFA to R. conorii (Moroccan strain). SFGR was defined as a ≥4-fold increase in IFA titer to R. conorii between acute and convalescent serum in a participant. Participants with less than a 4-fold rise in IFA titer to R. conorii were considered to have non-SFGR febrile illness.

For each participant, a trained study team member collected standardized demographic data, clinical history, admission vital signs, and physical examination findings. As the presentation of infectious disease can differ between pediatric and adult participants, several of the clinical variables were only obtained for either pediatric or adult participants.
For our clinical prediction modeling we only included variables that were collected for both groups: age, heart rate, respiration rate, blood pressure, oxygen saturation, height, weight, body mass index (BMI), cough, diarrhea, emesis, hematochezia, dyspnea, seizures, crepitations, hepatomegaly, splenomegaly, pallor, lymphadenopathy, oral candidiasis, meningeal signs, HIV rapid diagnostic result, malaria rapid diagnostic result, and whether the participant resided in a rural setting. We recorded clinical symptoms and physical exam findings as binary variables and age, vital signs, and complete blood count (CBC) results as continuous variables.

Climate and Environmental data

We extracted climate and environmental data from MODIS (Moderate Resolution Imaging Spectroradiometer) satellite products for the Kilimanjaro Region, Tanzania. Environmental data included the normalized difference vegetation index (NDVI), the enhanced vegetation index (EVI), the normalized difference water index (NDWI), and evapotranspiration. Climate data included daytime and nighttime temperature. NDVI and EVI are based on a 16-day time series composite image at 1 km x 1 km spatial resolution and were obtained from MODIS product MOD13A2. Surface temperatures, acquired from MODIS product MOD11A1, are daily measurements at 1 km x 1 km spatial resolution. Evapotranspiration is based on an 8-day time series composite image at 500 m x 500 m spatial resolution and was obtained from MOD16A2GF.
Finally, using MOD09A1, a 500 m x 500 m 8-day composite time series, we calculated NDWI from near-infrared (NIR; MODIS band 2) and shortwave infrared (SWIR; MODIS band 6) reflectances.28,29

We consolidated all climate and environmental data within a uniform 16-day time series window; for shorter time series, we computed the mean for each 16-day window. For example, data from January 1 through January 16, 2007 constituted one window, followed by measurements from January 17 through February 2, 2007 in the subsequent window. To account for recent climate and environmental patterns that may influence SFGR incidence, we lagged each 16-day time series at one, two, and three months. Finally, we aligned the lagged measurements with the admission dates of study participants. For instance, a participant presenting on April 4, 2007 would have data from the windows containing March 4 (1-month lag), February 4 (2-month lag), and January 4 (3-month lag).

Statistical Analysis and Modeling

To compare acute SFGR versus non-SFGR febrile illness groups on univariate analysis, we used the Wilcoxon rank sum test due to the non-normal distribution of age data. We used Pearson’s Chi-squared test and Fisher’s exact test for categorical variables. We used the random forest algorithm to fit a model to predict the risk of participants having acute SFGR versus non-SFGR febrile illness.
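As a rough illustration of the satellite data preparation described in the Climate and Environmental data section, the following Python sketch computes NDWI from NIR and SWIR reflectance and locates the 16-day windows at one-, two-, and three-month lags before an admission date. It is not the authors' code (the analysis was performed in R). The function names, the (NIR - SWIR) / (NIR + SWIR) form of NDWI, the anchoring of windows at January 1 of each year, and the clamping of the lagged day in shorter months are all assumptions made for illustration.

```python
import calendar
from datetime import date, timedelta

WINDOW_DAYS = 16  # assumed fixed window length, anchored at January 1 of each year


def ndwi(nir: float, swir: float) -> float:
    """Normalized difference of NIR and SWIR reflectance (assumed NDWI form)."""
    if nir + swir == 0:
        raise ValueError("NIR + SWIR must be non-zero")
    return (nir - swir) / (nir + swir)


def window_bounds(d: date) -> tuple:
    """First and last day of the 16-day window that contains d."""
    jan1 = date(d.year, 1, 1)
    index = (d - jan1).days // WINDOW_DAYS
    start = jan1 + timedelta(days=index * WINDOW_DAYS)
    # The final window of a year is cut off at December 31.
    end = min(start + timedelta(days=WINDOW_DAYS - 1), date(d.year, 12, 31))
    return start, end


def months_before(d: date, months: int) -> date:
    """Same day of the month `months` earlier, clamped to that month's last day."""
    total = d.year * 12 + (d.month - 1) - months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def lagged_windows(admission: date, lags=(1, 2, 3)) -> dict:
    """Map each lag in months to the window containing the lagged date."""
    return {lag: window_bounds(months_before(admission, lag)) for lag in lags}


if __name__ == "__main__":
    for lag, (start, end) in lagged_windows(date(2007, 4, 4)).items():
        print(f"{lag}-month lag: {start} to {end}")
```

Under these assumptions, an admission on April 4, 2007 maps to windows that contain March 4, February 4, and January 4, matching the example in the text. Per-window means of daily or 8-day products would then be joined to participants on these window start dates.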
Random forests are a machine learning algorithm that constructs a multitude of decision trees and averages over them to obtain a prediction robust to nonlinearities and interactions between covariates; random forest algorithms have been widely applied in the biomedical sciences for both classification and regression.30,31

We excluded highly skewed binary predictors (predictors with 95% or more of values concentrated as either 0 or 1) from analysis. For the remaining predictors, we imputed missing data using the ‘missRanger’ package in R. To determine which predictors to include in our analysis, we fit two distinct models: one utilizing solely satellite-derived climate and environmental data and another incorporating clinical and demographic data. We used the ‘permimp’ package in R to assess variable importance using permutation-based methods. This method involves systematically shuffling or permuting the values of individual predictors to evaluate their impact on performance. Next, we identified the top 10 predictors from each model based on their respective permuted importance scores. We included these predictors in our final analysis.

To assess predictive performance for each random forest model, we used repeated cross-validation using 80% training/20% testing splits with 100 iterations. In each iteration, we trained models on 80% of the data, made predictions on the 20% test set, and obtained measures of performance. We determined overall model performance by averaging the area under the receiver operating characteristic curve (AUC) and confidence intervals across the 100 iterations. To determine the statistical significance of the difference in AUC between models we used a bootstrap method over 100 iterations, which involves resampling the data with replacement multiple times, creating bootstrap samples. For each bootstrap sample, we generated receiver operating characteristic (ROC) curves and computed the difference in AUC between the curves. We completed all analyses using R version 4.2.0, and model development/validation was completed in accordance with the Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) checklist (Supplement Table 1).

Research Ethics

The primary study was approved by the Kilimanjaro Christian Medical University College Health Research Ethics Committee, the Tanzania National Institutes for Medical Research National Health Research Ethics Coordinating Committee, and the Institutional Review Boards of Duke University Medical Center and the US CDC. The secondary data analysis was reviewed by the Institutional Review Board of the University of Utah and determined to be exempt (IRB_00164810). All minors had written informed consent given by a parent or guardian, and all adult participants provided their own written informed consent.
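The repeated 80/20 evaluation and the bootstrap comparison of AUCs described under Statistical Analysis and Modeling can be illustrated with a short, dependency-free Python sketch. It is not the authors' R code, and it omits model fitting entirely: it assumes predicted scores from two models are already available for the same test participants. The function names, the rank-based AUC, the paired resampling of participants, and the percentile interval are assumptions chosen for illustration.

```python
import random
from statistics import mean


def auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_splits(n, n_iter=100, test_frac=0.2, seed=0):
    """Yield (train_indices, test_indices) for repeated random 80/20 splits."""
    rng = random.Random(seed)
    n_test = max(1, round(n * test_frac))
    for _ in range(n_iter):
        idx = list(range(n))
        rng.shuffle(idx)
        yield idx[n_test:], idx[:n_test]


def bootstrap_auc_difference(labels, scores_a, scores_b, n_boot=100, seed=0):
    """Paired bootstrap of AUC(a) - AUC(b): mean difference and 95% percentile interval."""
    if len(set(labels)) < 2:
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample participants with replacement
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # AUC is undefined when a resample contains a single class
        diff = auc(y, [scores_a[i] for i in idx]) - auc(y, [scores_b[i] for i in idx])
        diffs.append(diff)
    diffs.sort()
    lower = diffs[int(0.025 * (n_boot - 1))]
    upper = diffs[int(round(0.975 * (n_boot - 1)))]
    return mean(diffs), (lower, upper)
```

In such a sketch, an interval for the AUC difference that includes zero would correspond to a non-significant improvement. Resampling the same participants for both models keeps the comparison paired.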
Results

Of the 870 participants enrolled in the study, 449 (51.6%) underwent follow-up for the collection of convalescent serum. Of these 449 participants, 71 (15.8%) met criteria for acute SFGR (Figure 1). We excluded the highly skewed predictors hematochezia, meningeal signs, and the malaria rapid diagnostic from our analysis. We found statistically significant differences in several clinical variables, including vital signs, clinical symptoms, and laboratory results, between acute SFGR and non-SFGR febrile illness (Table 1). Overall, acute SFGR participants were older (median age 24 versus 8 years, p-value=0.003) with significantly higher height and weight. Acute SFGR participants had a lower respiratory rate than non-SFGR febrile illness participants (median 28 versus 32 breaths per minute, p-value<0.001) and were more likely to reside in a rural setting (59% versus 45%, p-value=0.025). There were no significant differences in the CBC results between the two groups, although platelet count and hematocrit had p-values of 0.07 and 0.08, respectively.

We also found several significantly different climate and environmental predictors between acute SFGR and non-SFGR febrile illness participants (Supplemental Table S1).
Acute SFGR was associated with higher recent temperatures: significantly higher nighttime mean (odds ratio (OR): 1.17 [1.03-1.34], p-value = 0.01) and nighttime maximum (OR: 1.16 [1.02-1.32], p-value = 0.02) temperatures at a one-month lag, and higher daytime minimum temperature at a one-month lag (OR: 1.08 [1.00-1.17], p-value = 0.03). It was also associated with lower minimum NDWI (a proxy for plant water stress, where lower values signify increased plant stress) at one- and two-month lags (OR: 0.20 [0.05-0.71], p-value = 0.01; OR: 0.09 [0.02-0.38], p-value < 0.001). Additionally, acute SFGR participants had lower minimum evapotranspiration rates at a two-month lag (OR: 0.78 [0.64-0.96], p-value = 0.02) and higher maximum evapotranspiration rates at a three-month lag (OR: 1.06 [1.01-1.12], p-value = 0.03).

Performance of clinical, climate, and environmental predictors and parsimonious model selection

Table 2 lists the best performing clinical, climate, and environmental predictors, as well as the best performing predictors when combined. We first assessed model performance with only the ten best performing clinical predictors (AUC: 0.64, 95% CI: 0.48-0.80) and with only the ten best performing climate and environmental predictors (AUC: 0.61, 95% CI: 0.47-0.77). Next, we fit a model using the ten best performing clinical and the ten best performing climate and environmental predictors and assessed how this model compared to a model with only the ten best performing clinical predictors.
By combining clinical, climate, and environmental predictors, the AUC improved to 0.72 (95% CI: 0.57-0.86), though this improvement was not statistically significant (median p-value=0.3, 12% of p-values <0.05). A model with a sensitivity of 70%, 80%, and 90% had a specificity of 61%, 51%, and 33%, respectively. Conversely, a model with a specificity of 70%, 80%, and 90% had a sensitivity of 62%, 46%, and 29%, respectively.

To create a parsimonious model, we fit models by successively incorporating fewer of the best performing predictors. Model performance was relatively similar with 10, 15, and 20 predictors and began to decrease with fewer than 10 predictors (Figure 2). A model with the best performing 10 predictors had an AUC of 0.69 (95% CI: 0.54-0.84), a sensitivity of 66%, a specificity of 72%, a positive predictive value (PPV) of 93%, and a negative predictive value (NPV) of 33%. This model would include eight clinical predictors (respiration rate, platelet count, HIV rapid test result, red blood cell count, hemoglobin, admission temperature, oxygen saturation, and basophil count) and two climate and environmental predictors (minimum NDWI at a two-month lag and maximum evapotranspiration at a one-month lag).

Discussion

Using data from a two-center clinical study of febrile illness from northern Tanzania, we show the derivation and cross-validation of a diagnostic prediction model for SFGR, a febrile illness lacking an accurate laboratory diagnostic during acute illness. We also showed that the addition of satellite-derived climate and environmental predictors improved the predictive performance of clinical predictors alone.
A parsimonious model with ten predictors, including three vital signs, four results from CBC, two satellite-derived climate predictors, and a rapid HIV test, achieved an AUC of 0.69 (95% CI: 0.54-0.84) on cross-validation. While our predictive model offers an improvement over existing clinical prediction models published for SFGR,32 given the suboptimal performance of these models, there is a critical need for the exploration and validation of specific biomarkers that could enhance diagnostic precision of SFGR clinical prediction models and contribute to more effective management strategies in regions affected by this potentially fatal bacterial disease. We propose assessing candidate biomarkers (proteins, peptides, and nucleic acids), including routine clinical analytes (e.g., fibrinogen) and vetted translational research assays (e.g., endothelial activation markers such as angiopoietin-2), that are relevant to SFGR’s known pathophysiology of endothelial infection and inflammation.

The use of satellite imagery has been shown to facilitate modeling of population dynamics of ticks,33,34 the vector for transmission of SFGR. In our prediction modeling, satellite-derived climate and environmental predictors improved the AUC of our internally validated model. The optimum threshold for the parsimonious model resulted in a sensitivity of 66%, a specificity of 72%, PPV of 93%, and NPV of 33%. A PPV of 93% would allow clinicians to use this model to determine which patients should be started on empiric treatment for SFGR with tetracycline therapy. However, a sensitivity of 66% indicates that this model would miss about one-third of SFGR patients. Models with higher sensitivities had much lower specificities (i.e., high potential for false positive predictions). Given the dynamic nature of NDVI, EVI, and NDWI, which vary with land use changes and other anthropogenic impacts,35-37 external validation of the model is needed, as these fluctuations may limit the applicability of these indices within models spanning several years. For use in a clinical decision support tool, the most recent satellite-derived climate and environmental data could be gathered from online sources, based on smartphone-based detection of GPS location.

Similar to what has been reported in the literature, we found that a higher body temperature on admission was one of the top clinical predictors associated with acute SFGR infection.38,39 CBC results including thrombocytopenia, leukopenia, and lymphopenia have been shown to be significantly different between acute SFGR and non-SFGR febrile illness. In our model, platelet count, hemoglobin, and hematocrit were important CBC predictors; however, lymphopenia and leukopenia were not. Our model also found that respiration rate and oxygen saturation contribute to discrimination between acute SFGR and non-SFGR febrile illness.
While our dataset did not have complete information on classic SFGR symptoms including headache, myalgias, and rash,11 these symptoms have been found to occur in similar proportions among acute SFGR and non-SFGR febrile illness.9,10,40

A major limitation of our study is the lack of external validation. This, coupled with the fact that we constructed our model using data from a single endemic Region in northern Tanzania, potentially hinders the model’s generalizability to a broader population. Given the intricate interplay between vector and host, the climate and environmental indices that affect SFGR may vary between regions. Second, our model was constructed using a relatively small dataset, resulting in wide confidence intervals for the calculated cross-validated AUC. Finally, our model lacks other laboratory values that have been shown to be correlated with SFGR infection (e.g., sodium,38,41 transaminases,38,41-43 lactic dehydrogenase,41,43 fibrinogen44); inclusion of these laboratory parameters may have improved model performance.

Using an existing dataset of acute SFGR and non-SFGR febrile illness in Tanzania, we demonstrated proof of concept that inclusion of climate and environmental variables along with clinical variables improved clinical prediction models for identifying SFGR. Further research should expand upon this analysis by incorporating data from additional febrile cohorts, exploring the inclusion of clinical biomarkers, and assessing the performance of this model in diverse settings endemic for SFGR to ensure its generalizability.
Acknowledgements: This research was supported by the International Studies on AIDS Associated Co-infections, United States National Institutes of Health (U01 AI062563 to J.A.C. and V.P.M., K24 AI166087 to D.T.L., and R38 HL143605 to R.J.W. through Utah Stimulating Access to Research in Residency (StARR)). We acknowledge the Hubert-Yeargan Center for Global Health at Duke University for critical infrastructure support for the Kilimanjaro Christian Medical Centre-Duke University Collaboration.

Disclaimers: The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the US Centers for Disease Control and Prevention. Use of trade names and commercial sources is for identification only and does not imply endorsement by the US Department of Health and Human Services or the US Centers for Disease Control and Prevention.

References

1. Crump JA, Morrissey AB, Nicholson WL, et al. Etiology of severe non-malaria febrile illness in Northern Tanzania: a prospective cohort study. PLoS Negl Trop Dis. 2013;7(7):e2324.
2. Ndip LM, Fokam EB, Bouyer DH, et al. Detection of Rickettsia africae in patients and ticks along the coastal region of Cameroon. Am J Trop Med Hyg. Sep 2004;71(3):363-366.
3. Maina AN, Farris CM, Odhiambo A, et al. Q Fever, Scrub Typhus, and Rickettsial Diseases in Children, Kenya, 2011-2012. Emerg Infect Dis. May 2016;22(5):883-886.
4. Dalton MJ, Clarke MJ, Holman RC, et al. National surveillance for Rocky Mountain spotted fever, 1981-1992: epidemiologic summary and evaluation of risk factors for fatal outcome. Am J Trop Med Hyg. May 1995;52(5):405-413.
5. Kirkland KB, Wilkinson WE, Sexton DJ. Therapeutic delay and mortality in cases of Rocky Mountain spotted fever. Clin Infect Dis. May 1995;20(5):1118-1121.
6. Regan JJ, Traeger MS, Humpherys D, et al. Risk factors for fatal outcome from Rocky Mountain spotted fever in a highly endemic area-Arizona, 2002-2011. Clin Infect Dis. Jun 1 2015;60(11):1659-1666.
7. Biggs HM, Behravesh CB, Bradley KK, et al. Diagnosis and Management of Tickborne Rickettsial Diseases: Rocky Mountain Spotted Fever and Other Spotted Fever Group Rickettsioses, Ehrlichioses, and Anaplasmosis - United States. MMWR Recomm Rep. May 13 2016;65(2):1-44.
8. Fournier PE, Jensenius M, Laferl H, Vene S, Raoult D. Kinetics of antibody responses in Rickettsia africae and Rickettsia conorii infections. Clin Diagn Lab Immunol. Mar 2002;9(2):324-328.
9. Traeger MS, Regan JJ, Humpherys D, et al. Rocky Mountain spotted fever characterization and comparison to similar illnesses in a highly endemic area-Arizona, 2002-2011. Clin Infect Dis. Jun 1 2015;60(11):1650-1658.
10. Prabhu M, Nicholson WL, Roche AJ, et al. Q fever, spotted fever group, and typhus group rickettsioses among hospitalized febrile patients in northern Tanzania. Clin Infect Dis. Aug 2011;53(4):e8-15.
11. Cohen R, Finn T, Babushkin F, et al. Spotted Fever Group Rickettsioses in Israel, 2010-2019. Emerg Infect Dis. Aug 2021;27(8):2117-2126.
12. Delisle J, Mendell NL, Stull-Lane A, Bloch KC, Bouyer DH, Moncayo AC. Human infections by multiple spotted fever group rickettsiae in Tennessee. Am J Trop Med Hyg. 2016;94(6):1212.
13. Cross R, Ling C, Day NP, McGready R, Paris DH. Revisiting doxycycline in pregnancy and early childhood--time to rebuild its reputation? Expert Opin Drug Saf. 2016;15(3):367-382.
14. Wormser GP, Wormser RP, Strle F, Myers R, Cunha BA. How safe is doxycycline for young children or for pregnant or breastfeeding women? Diagn Microbiol Infect Dis. Mar 2019;93(3):238-242.
15. Bright TJ, Wong A, Dhurjati R, et al. Effect of Clinical Decision-Support Systems. Ann Intern Med. Jul 3 2012;157(1):29-43.
16. Bilal S, Nelson E, Meisner L, et al. Evaluation of Standard and Mobile Health-Supported Clinical Diagnostic Tools for Assessing Dehydration in Patients with Diarrhea in Rural Bangladesh. Am J Trop Med Hyg. 2018;99(1):171-179.
17. Tuon FF, Gasparetto J, Wollmann LC, Moraes TP. Mobile health application to assist doctors in antibiotic prescription - an approach for antibiotic stewardship. Braz J Infect Dis. Nov-Dec 2017;21(6):660-664.
18. Garbern SC, Nelson EJ, Nasrin S, et al. External validation of a mobile clinical decision support system for diarrhea etiology prediction in children: A multicenter study in Bangladesh and Mali. Elife. Feb 9 2022;11.
19. Kerins JL, Dorevitch S, Dworkin MS. Spotted Fever Group Rickettsioses (SFGR): weather and incidence in Illinois. Epidemiol Infect. Sep 2017;145(12):2466-2472.
20. Zhang WJ, Cai XY, Yang C, et al. Cervical necrotizing fasciitis due to methicillin-resistant Staphylococcus aureus: a case report. Int J Oral Maxillofac Surg. Aug 2010;39(8):830-834.
21. Fine AM, Brownstein JS, Nigrovic LE, et al. Integrating Spatial Epidemiology Into a Decision Model for Evaluation of Facial Palsy in Children. Arch Pediatr Adolesc Med. 2011;165(1):61-67.
22. Nelson EJ, Khan AI, Keita AM, et al. Improving Antibiotic Stewardship for Diarrheal Disease With Probability-Based Electronic Clinical Decision Support: A Randomized Crossover Trial. JAMA Pediatr. Oct 1 2022;176(10):973-979.
23. Pisharody S, Rubach MP, Carugati M, et al. Incidence Estimates of Acute Q Fever and Spotted Fever Group Rickettsioses, Kilimanjaro, Tanzania, from 2007 to 2008 and from 2012 to 2014. Am J Trop Med Hyg. Dec 20 2021;106(2):494-503.
24. Crump JA, Ramadhani HO, Morrissey AB, et al. Invasive bacterial and fungal infections among hospitalized HIV-infected and HIV-uninfected adults and adolescents in northern Tanzania. Clin Infect Dis. Feb 1 2011;52(3):341-348.
25. Crump JA, Ramadhani HO, Morrissey AB, et al. Invasive bacterial and fungal infections among hospitalized HIV-infected and HIV-uninfected children and infants in northern Tanzania. Trop Med Int Health. Jul 2011;16(7):830-837.
26. Biggs HM, Hertz JT, Munishi OM, et al. Estimating leptospirosis incidence using hospital-based surveillance and a population-based health care utilization survey in Tanzania. PLoS Negl Trop Dis. 2013;7(12):e2589.
27. National Bureau of Statistics of the United Republic of Tanzania. Tanzania Census 2002: Analytical Report. 2006.
28. Chen D, Huang J, Jackson TJ. Vegetation water content estimation for corn and soybeans using spectral indices derived from MODIS near- and short-wave infrared bands. Remote Sens Environ. Oct 15 2005;98(2):225-236.
29. Jackson TJ, Chen D, Cosh M, et al. Vegetation water content mapping using Landsat data derived normalized difference water index for corn and soybeans. Remote Sens Environ. Sep 30 2004;92(4):475-482.
30. Peng SY, Chuang YC, Kang TW, Tseng KH. Random forest can predict 30-day mortality of spontaneous intracerebral hemorrhage with remarkable discrimination. Eur J Neurol. Jul 2010;17(7):945-950.
31. Sarica A, Cerasa A, Quattrone A. Random Forest Algorithm for the Classification of Neuroimaging Data in Alzheimer's Disease: A Systematic Review. Front Aging Neurosci. 2017;9:329.
32. Lopez DM, de Mello FL, Giordano Dias CM, et al. Evaluating the Surveillance System for Spotted Fever in Brazil Using Machine-Learning Techniques. Front Public Health. 2017;5:323.
33. Randolph SE. Ticks and tick-borne disease systems in space and from space. Adv Parasitol. 2000;47:217-243.
34. Estrada-Peña A. Geostatistics and remote sensing using NOAA-AVHRR satellite imagery as predictive tools in tick distribution and habitat suitability estimations for Boophilus microplus (Acari: Ixodidae) in South America. National Oceanographic and Atmosphere Administration-Advanced Very High Resolution Radiometer. Vet Parasitol. Feb 1 1999;81(1):73-82.
35. Aburas MM, Abdullah SH, Ramli MF, Ash'aari ZH. Measuring land cover change in Seremban, Malaysia using NDVI index. Procedia Environmental Sciences. 2015;30:238-243.
36. Lunetta RS, Knight JF, Ediriwickrema J, Lyon JG, Worthy LD. Land-cover change detection using multi-temporal MODIS NDVI data. Geospatial Information Handbook for Water Resources and Watershed Management, Volume II: CRC Press; 2022:65-88.
37. Wang G, Peng W, Zhang L, Zhang J, Xiang J. Vegetation EVI Changes and Response to Natural Factors and Human Activities Based on Geographically and Temporally Weighted Regression. Global Ecology and Conservation. 2023:e02531.
38. Buckingham SC, Marshall GS, Schutze GE, et al. Clinical and laboratory features, hospital course, and outcome of Rocky Mountain spotted fever in children. J Pediatr. Feb 2007;150(2):180-184, 184.e181.
39. Silva-Ramos CR, Hidalgo M, Faccini-Martínez ÁA. Clinical, epidemiological, and laboratory features of Rickettsia parkeri rickettsiosis: A systematic review. Ticks Tick Borne Dis. Jul 2021;12(4):101734.
40. Faruque LI, Zaman RU, Gurley ES, et al. Prevalence and clinical presentation of Rickettsia, Coxiella, Leptospira, Bartonella and chikungunya virus infections among hospital-based febrile patients from December 2008 to November 2009 in Bangladesh. BMC Infect Dis. 2017;17(1):1-12.
41. Antón E, Font B, Muñoz T, Sanfeliu I, Segura F. Clinical and laboratory characteristics of 144 patients with Mediterranean spotted fever. Eur J Clin Microbiol Infect Dis. Feb 2003;22(2):126-128.
42. Mahara F. Japanese spotted fever: report of 31 cases and review of the literature. Emerg Infect Dis. Apr-Jun 1997;3(2):105-111.
43. Miyashima Y, Iwamuro M, Shibata M, et al. Prediction of Disseminated Intravascular Coagulation by Liver Function Tests in Patients with Japanese Spotted Fever. Intern Med. Jan 15 2018;57(2):197-202.
44. Rao AK, Schapira M, Clements ML, et al. A prospective study of platelets and plasma proteolytic systems during the early stages of Rocky Mountain spotted fever. N Engl J Med. Apr 21 1988;318(16):1021-1028.

Table 1. Clinical characteristics and complete blood count results for febrile participants with and without spotted fever group rickettsioses, northern Tanzania, 2007-2008.

Characteristic | Overall, N=449¹ | Non-SFGR febrile illness, N=378¹ | Acute SFGR, N=71¹ | p-value²
Age, years | 9 (2, 32) | 8 (1, 31) | 24 (4, 39) | 0.003
Rural | 211 (47%) | 168 (45%) | 42 (59%) | 0.03
Admission temperature, °C | 38.40 (38.00, 39.10) | 38.40 (38.00, 39.00) | 38.60 (38.00, 39.20) | 0.4
Heart rate, beats per minute | 112 (97, 131) | 112 (98, 132) | 108 (96, 124) | 0.2
Respiration rate, breaths per minute | 32 (24, 44) | 32 (26, 45) | 28 (24, 35) | <0.001
Oxygen saturation, % | 96.0 (94.0, 98.0) | 96.0 (94.0, 98.0) | 97.0 (94.0, 98.0) | 0.5
Systolic blood pressure, mmHg | 108 (100, 120) | 108 (100, 120) | 108 (100, 117) | 0.4
Diastolic blood pressure, mmHg | 68 (60, 74) | 68 (60, 74) | 67 (60, 73) | 0.7
BMI | 17.2 (14.4, 21.6) | 17.0 (14.3, 21.0) | 19.2 (15.6, 22.5) | 0.06
Weight, kg | 22 (10, 55) | 20 (10, 55) | 49 (14, 61) | 0.01
Height, m | 1.25 (0.82, 1.62) | 1.19 (0.81, 1.62) | 1.54 (1.05, 1.63) | 0.01
Cough | 293 (65%) | 252 (67%) | 41 (58%) | 0.1
Oral Candida | 43 (9.6%) | 39 (10%) | 4 (5.6%) | 0.2
Crepitations | 196 (44%) | 165 (44%) | 31 (44%) | >0.9
Hepatomegaly | 34 (7.6%) | 27 (7.2%) | 7 (9.9%) | 0.4
Diarrhea | 90 (20%) | 77 (20%) | 13 (18%) | 0.7
Emesis | 132 (29%) | 109 (29%) | 23 (32%) | 0.5
Dyspnea | 152 (34%) | 132 (35%) | 20 (28%) | 0.3
Seizure | 41 (9.2%) | 37 (9.8%) | 4 (5.6%) | 0.3
Pallor | 32 (7.2%) | 27 (7.2%) | 5 (7.0%) | >0.9
Lymphadenopathy | 39 (8.7%) | 34 (9.0%) | 5 (7.0%) | 0.6
Rapid Malaria Diagnostic | 19 (4.2%) | 16 (4.2%) | 3 (4.2%) | >0.9
Rapid HIV Diagnostic | | | | 0.13
  Positive | 103 (23%) | 92 (24%) | 11 (15%) |
  Indeterminate | 8 (1.8%) | 8 (2.1%) | 0 (0%) |
White blood cell count, K/µL | 9 (6, 13) | 9 (6, 13) | 7 (5, 13) | 0.1
Red blood cell count, M/µL | 4.28 (3.66, 4.79) | 4.24 (3.59, 4.79) | 4.43 (3.97, 4.82) | 0.2
Hemoglobin, g/dL | 10.80 (9.00, 12.40) | 10.70 (9.00, 12.28) | 11.30 (9.95, 12.75) | 0.1
Hematocrit, % | 32 (28, 37) | 32 (27, 36) | 33 (30, 37) | 0.08
Platelet count, K/µL | 267 (161, 391) | 275 (164, 404) | 225 (130, 360) | 0.07
Neutrophil, % | 62 (42, 76) | 62 (42, 76) | 65 (45, 79) | 0.3
Lymphocyte, % | 25 (16, 43) | 26 (16, 44) | 22 (13, 42) | 0.2
Monocyte, % | 8.5 (5.9, 11.5) | 8.5 (5.9, 11.6) | 7.8 (5.3, 10.5) | 0.3
Eosinophil, % | 0.20 (0.00, 0.90) | 0.20 (0.00, 1.00) | 0.20 (0.00, 0.70) | 0.3
Basophil, % | 0.80 (0.40, 1.20) | 0.80 (0.40, 1.20) | 0.80 (0.45, 1.10) | >0.9

¹ n (%); Median (IQR)
² Pearson's Chi-squared test; Wilcoxon rank sum test; Fisher's exact test

Table 2.
Best-performing predictors by permutation importance. Best Performing Climate and Best Performing Clinical Environmental Predictors Predictors AUC: 0.61 (95% CI: 0.47-0.77) AUC: 0.64 (95% CI: 0.48-0.80) Predictors Permutation Predictors Permutation of of Importance Importance Minimum NDWI 8.0e-05 Respiration rate 0.00380 (2) Daytime maximum temperature (2) Maximum NDVI (2) Daytime minimum temperature (1) Minimum evapotranspiration (3) Minimum EVI (2) Daytime mean temperature (2) Daytime maximum temperature (1) Maximum evapotranspiration (1) Maximum NDVI (1) 3.6e-05 2.9e-05 2.6e-05 2.5e-05 Admission temperature Oxygen saturation Red blood cell count HIV Hemoglobin Platelet count 1.8e-05 Cough Best Performing Combined Predictors AUC: 0.72 (95% CI: 0.57-0.86) Predictors Permutation of Importance Respiration Rate Platelet 0.00330 0.00240 0.00171 0.00165 0.00239 HIV Hemoglobin Red blood cell count Oxygen Saturation Admission temperature Minimum EVI (2) 0.00019 1.8e-05 Hematocrit 0.00001 1.8e-05 Basophil count -0.00112 0.00437 count 0.00359 0.00223 1.9e-05 1.9e-05 climate, environmental, and clinic 0.00119 0.00119 0.00060 0.00046 0.00031 0.00027 Maximum Evapotranspiration (1) Minimum NDWI (2) Maximum NDVI (2) Daytime maximum temperature (2) Daytime mean temperature (2) Maximum NDVI (1) Hematocrit Minimum evapotranspiration (3) Daytime maximum temperature (1) 0.00023 0.00017 0.00013 0.00008 0.00008 0.00001 0.00001 0.00000 -0.00003 Daytime minimum temperature (1) Basophil count Cough -0.00010 -0.00020 -0.00031

7,507 admissions to KCMC and MRH screened for eligibility from 17 September 2007 through 25 August 2008
    6,177 (82.3%) did not meet eligibility criteria
    1,330 (17.7%) met eligibility criteria
        460 (34.6%) were not enrolled
        870 (65.4%) enrolled in study
            421 (48.4%) not tested by spotted fever group rickettsioses IFA
            449 (51.6%) tested by spotted fever group rickettsioses IFA 4 to 6 weeks post-hospitalization

Figure 1. Study flow diagram. Screening and enrollment of patients hospitalized at KCMC and MRH. KCMC: Kilimanjaro Christian Medical Centre; MRH: Mawenzi Regional Referral Hospital; IFA: immunofluorescence assay.

[Figure 2 plot: "AUC by Number of Predictors"; x-axis: Number of Predictors (5 to 20); y-axis: AUC (0.4 to 0.8)]

Figure 2. Average AUC (solid line) and 95% confidence intervals (dotted lines) from cross-validation (100 iterations) for each model by number of predictors included in the model. AUC: area under the receiver operating characteristic curve.

Supplemental Table S1. TRIPOD checklist.
Section/Topic | Item | Checklist Item | Page

Title and abstract
Title | 1 | Identify the study as developing and/or validating a multivariable prediction model, the target population, and the outcome to be predicted. | 1
Abstract | 2 | Provide a summary of objectives, study design, setting, participants, sample size, predictors, outcome, statistical analysis, results, and conclusions. | 2

Introduction
Background and objectives | 3a | Explain the medical context (including whether diagnostic or prognostic) and rationale for developing or validating the multivariable prediction model, including references to existing models. | 2-3
Background and objectives | 3b | Specify the objectives, including whether the study describes the development or validation of the model or both. | 3

Methods
Source of data | 4a | Describe the study design or source of data (e.g., randomized trial, cohort, or registry data), separately for the development and validation data sets, if applicable. | 3-4
Source of data | 4b | Specify the key study dates, including start of accrual; end of accrual; and, if applicable, end of follow-up. | 3-4
Participants | 5a | Specify key elements of the study setting (e.g., primary care, secondary care, general population) including number and location of centres. | 3-4
Participants | 5b | Describe eligibility criteria for participants. | 3-4
Participants | 5c | Give details of treatments received, if relevant. | N/A
Outcome | 6a | Clearly define the outcome that is predicted by the prediction model, including how and when assessed. | 4
Outcome | 6b | Report any actions to blind assessment of the outcome to be predicted. | N/A
Predictors | 7a | Clearly define all predictors used in developing or validating the multivariable prediction model, including how and when they were measured. | 4
Predictors | 7b | Report any actions to blind assessment of predictors for the outcome and other predictors. | N/A
Sample size | 8 | Explain how the study size was arrived at. | 7
Missing data | 9 | Describe how missing data were handled (e.g., complete-case analysis, single imputation, multiple imputation) with details of any imputation method. | 5
Statistical analysis methods | 10a | Describe how predictors were handled in the analyses. | 5
Statistical analysis methods | 10b | Specify type of model, all model-building procedures (including any predictor selection), and method for internal validation. | 5
Statistical analysis methods | 10d | Specify all measures used to assess model performance and, if relevant, to compare multiple models. | 5
Risk groups | 11 | Provide details on how risk groups were created, if done. | N/A

Results
Participants | 13a | Describe the flow of participants through the study, including the number of participants with and without the outcome and, if applicable, a summary of the follow-up time. A diagram may be helpful. | 7
Participants | 13b | Describe the characteristics of the participants (basic demographics, clinical features, available predictors), including the number of participants with missing data for predictors and outcome. | 6-8
Model development | 14a | Specify the number of participants and outcome events in each analysis. | 6-8
Model development | 14b | If done, report the unadjusted association between each candidate predictor and outcome. | N/A
Model specification | 15a | Present the full prediction model to allow predictions for individuals (i.e., all regression coefficients, and model intercept or baseline survival at a given time point). | 9-10
Model specification | 15b | Explain how to use the prediction model. | 9-11
Model performance | 16 | Report performance measures (with CIs) for the prediction model. | 9-10

Discussion
Limitations | 18 | Discuss any limitations of the study (such as nonrepresentative sample, few events per predictor, missing data). | 11-12
Interpretation | 19b | Give an overall interpretation of the results, considering objectives, limitations, and results from similar studies, and other relevant evidence. | 11-12
Implications | 20 | Discuss the potential clinical use of the model and implications for future research. | 11-12

Other information
Supplementary information | 21 | Provide information about the availability of supplementary resources, such as study protocol, Web calculator, and data sets. | 12-14
Funding | 22 | Give the source of funding and the role of the funders for the present study. | 12

Supplemental Table S2. MODIS satellite-derived climate and environmental predictors for acute SFGR and non-SFGR febrile illness, Kilimanjaro region, Tanzania, 2007-2008.

Satellite predictor (months lagged) | Overall, N=449¹ | Non-SFGR febrile illness, N=378¹ | Acute SFGR, N=71¹ | p-value²
Mean EVI (1) | 0.28 (0.22, 0.30) | 0.28 (0.23, 0.30) | 0.28 (0.22, 0.29) | 0.2
Mean NDVI (1) | 0.47 (0.42, 0.52) | 0.48 (0.42, 0.53) | 0.47 (0.42, 0.50) | 0.5
Maximum EVI (1) | 0.71 (0.65, 0.73) | 0.71 (0.65, 0.73) | 0.71 (0.69, 0.72) | >0.9
Maximum NDVI (1) | 0.927 (0.923, 0.937) | 0.927 (0.923, 0.941) | 0.927 (0.923, 0.933) | 0.9
Minimum EVI (1) | -0.16 (-0.19, -0.12) | -0.16 (-0.19, -0.12) | -0.16 (-0.18, -0.12) | 0.073
Minimum NDVI (1) | -0.167 (-0.188, -0.158) | -0.167 (-0.188, -0.149) | -0.167 (-0.186, -0.158) | >0.9
Mean NDWI (1) | 0.22 (0.21, 0.30) | 0.23 (0.20, 0.30) | 0.22 (0.21, 0.24) | 0.4
Minimum NDWI (1) | 1.11 (1.01, 1.29) | 1.12 (1.03, 1.29) | 1.10 (0.93, 1.14) | 0.008
Maximum NDWI (1) | 0.25 (0.22, 0.29) | 0.25 (0.22, 0.29) | 0.24 (0.22, 0.26) | 0.087
Mean evapotranspiration (1) | 13.7 (11.1, 17.5) | 13.7 (11.1, 17.5) | 12.9 (12.0, 14.6) | 0.2
Minimum evapotranspiration (1) | 3.95 (3.25, 4.60) | 3.95 (3.25, 5.00) | 3.90 (3.25, 4.60) | 0.6
Maximum evapotranspiration (1) | 61 (56, 63) | 61 (58, 63) | 61 (55, 64) | 0.4
Nighttime Mean Temp. (1) | 288.04 K (286.27, 289.02) | 288.04 K (286.27, 288.93) | 288.86 K (287.42, 290.59) | 0.008
Nighttime Maximum Temp. (1) | 295.25 K (293.53, 295.86) | 294.95 K (293.53, 295.86) | 295.76 K (294.34, 297.77) | 0.005
Nighttime Minimum Temp. (1) | 266.7 K (265.0, 269.2) | 266.4 K (265.0, 269.2) | 266.8 K (264.9, 269.2) | 0.4
Daytime Mean Temp. (1) | 304.1 K (298.7, 306.3) | 303.9 K (298.7, 306.4) | 304.1 K (298.8, 306.3) | 0.14
Daytime Maximum Temp. (1) | 317.8 K (309.8, 321.1) | 317.7 K (309.8, 321.1) | 318.1 K (310.5, 321.0) | 0.3
Daytime Minimum Temp. (1) | 275.1 K (273.7, 278.6) | 275.1 K (273.7, 278.3) | 276.5 K (275.1, 279.5) | 0.023
Mean EVI (2) | 0.28 (0.22, 0.29) | 0.28 (0.22, 0.29) | 0.26 (0.23, 0.28) | 0.2
Mean NDVI (2) | 0.47 (0.43, 0.50) | 0.47 (0.43, 0.50) | 0.45 (0.43, 0.50) | 0.5
Maximum EVI (2) | 0.71 (0.69, 0.74) | 0.71 (0.69, 0.74) | 0.72 (0.69, 0.74) | 0.2
Maximum NDVI (2) | 0.927 (0.923, 0.936) | 0.927 (0.923, 0.936) | 0.928 (0.923, 0.933) | 0.7
Minimum EVI (2) | -0.16 (-0.18, -0.12) | -0.16 (-0.18, -0.12) | -0.15 (-0.17, -0.10) | 0.10
Minimum NDVI (2) | -0.167 (-0.188, -0.158) | -0.167 (-0.188, -0.158) | -0.164 (-0.186, -0.149) | 0.3
Mean NDWI (2) | 0.22 (0.18, 0.24) | 0.22 (0.15, 0.25) | 0.22 (0.21, 0.23) | >0.9
Minimum NDWI (2) | 1.12 (1.03, 1.22) | 1.12 (1.05, 1.29) | 1.12 (1.01, 1.14) | 0.011
Maximum NDWI (2) | 0.25 (0.22, 0.27) | 0.25 (0.22, 0.27) | 0.24 (0.22, 0.26) | 0.7
Mean evapotranspiration (2) | 13.22 (11.47, 15.43) | 13.71 (11.47, 15.43) | 13.22 (12.88, 14.55) | 0.4
Minimum evapotranspiration (2) | 3.90 (3.25, 4.55) | 3.90 (3.25, 4.55) | 3.70 (3.25, 4.15) | 0.038
Maximum evapotranspiration (2) | 61.1 (58.0, 63.1) | 61.1 (58.2, 63.1) | 60.0 (55.4, 63.1) | 0.3
Nighttime Mean Temp. (2) | 288.54 K (287.23, 288.93) | 288.04 K (287.23, 288.93) | 288.54 K (287.36, 289.05) | 0.5
Nighttime Maximum Temp. (2) | 295.25 K (293.53, 296.41) | 295.25 K (293.53, 295.86) | 295.53 K (293.57, 297.30) | 0.2
Nighttime Minimum Temp. (2) | 266.8 K (265.1, 270.0) | 266.8 K (265.1, 270.0) | 266.8 K (264.5, 270.0) | 0.6
Daytime Mean Temp. (2) | 304.1 K (299.8, 306.7) | 304.1 K (299.8, 306.7) | 304.9 K (303.6, 306.7) | 0.2
Daytime Maximum Temp. (2) | 318.1 K (310.5, 321.1) | 318.1 K (310.5, 321.1) | 320.0 K (317.6, 321.1) | 0.15
Daytime Minimum Temp. (2) | 275.6 K (273.7, 278.6) | 275.4 K (273.7, 278.6) | 275.6 K (273.6, 278.3) | >0.9
Mean EVI (3) | 0.26 (0.23, 0.29) | 0.26 (0.22, 0.29) | 0.25 (0.24, 0.28) | 0.5
Mean NDVI (3) | 0.45 (0.41, 0.49) | 0.45 (0.41, 0.49) | 0.45 (0.44, 0.52) | 0.11
Maximum EVI (3) | 0.71 (0.69, 0.73) | 0.71 (0.65, 0.73) | 0.71 (0.69, 0.74) | 0.5
Maximum NDVI (3) | 0.927 (0.923, 0.936) | 0.927 (0.923, 0.936) | 0.927 (0.918, 0.936) | 0.3
Minimum EVI (3) | -0.16 (-0.18, -0.12) | -0.16 (-0.18, -0.12) | -0.16 (-0.18, -0.12) | >0.9
Minimum NDVI (3) | -0.169 (-0.188, -0.158) | -0.169 (-0.188, -0.158) | -0.183 (-0.188, -0.158) | 0.5
Mean NDWI (3) | 0.22 (0.15, 0.23) | 0.22 (0.15, 0.23) | 0.22 (0.15, 0.23) | >0.9
Minimum NDWI (3) | 1.12 (1.05, 1.26) | 1.11 (1.05, 1.26) | 1.13 (1.06, 1.29) | 0.15
Maximum NDWI (3) | 0.24 (0.22, 0.27) | 0.24 (0.22, 0.27) | 0.24 (0.22, 0.28) | 0.2
Mean evapotranspiration (3) | 12.93 (11.42, 14.55) | 12.88 (11.36, 14.55) | 13.71 (11.44, 15.43) | 0.5
Minimum evapotranspiration (3) | 3.90 (3.25, 4.35) | 3.90 (3.25, 4.35) | 4.00 (3.25, 4.55) | 0.8
Maximum evapotranspiration (3) | 61.1 (57.2, 63.7) | 61.1 (56.4, 63.1) | 62.4 (58.0, 67.4) | 0.024
Nighttime Mean Temp. (3) | 288.72 K (287.23, 289.16) | 288.54 K (287.23, 289.16) | 288.72 K (287.35, 288.97) | 0.15
Nighttime Maximum Temp. (3) | 295.53 K (293.57, 296.41) | 295.53 K (293.57, 296.41) | 295.74 K (293.66, 296.41) | 0.4
Nighttime Minimum Temp. (3) | 266.8 K (265.4, 269.2) | 266.8 K (265.5, 269.2) | 266.7 K (265.1, 269.2) | 0.5
Daytime Mean Temp. (3) | 304.89 K (299.81, 306.66) | 304.59 K (299.81, 306.66) | 306.14 K (300.47, 307.02) | 0.10
Daytime Maximum Temp. (3) | 319.3 K (311.0, 321.4) | 318.7 K (311.0, 321.4) | 320.3 K (311.7, 321.4) | 0.14
Daytime Minimum Temp. (3) | 276.8 K (274.0, 278.7) | 276.8 K (273.7, 278.7) | 276.6 K (274.0, 278.7) | 0.8

¹ Median (IQR)
² Wilcoxon rank sum test
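For readers less familiar with the units and indices in Supplemental Table S2, the following sketch shows how a land surface temperature in kelvin converts to degrees Celsius and how the vegetation and water indices are conventionally defined from band reflectances. It assumes the standard textbook definitions: NDVI from near-infrared and red, NDWI in its near-infrared and short-wave infrared form, and EVI with the coefficients commonly used for MODIS. The scaling, aggregation, and processing actually applied to the study's MODIS products are not described in this section, and the reflectance values below are hypothetical.

```python
def kelvin_to_celsius(kelvin):
    """Convert a temperature from kelvin to degrees Celsius."""
    return kelvin - 273.15


def normalized_difference(band_a, band_b):
    """Generic normalized difference index: (a - b) / (a + b)."""
    total = band_a + band_b
    if total == 0:
        raise ValueError("band reflectances sum to zero; index is undefined")
    return (band_a - band_b) / total


def ndvi(nir, red):
    """Normalized difference vegetation index from near-infrared and red reflectance."""
    return normalized_difference(nir, red)


def ndwi(nir, swir):
    """Normalized difference water index from near-infrared and short-wave infrared."""
    return normalized_difference(nir, swir)


def evi(nir, red, blue):
    """Enhanced vegetation index with the commonly used MODIS coefficients (assumed)."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


if __name__ == "__main__":
    # Median nighttime mean temperature at a 1-month lag, taken from Table S2.
    print(f"288.04 K = {kelvin_to_celsius(288.04):.2f} degrees C")

    # Hypothetical surface reflectances for a vegetated pixel.
    nir, red, blue, swir = 0.45, 0.08, 0.04, 0.25
    print(f"NDVI = {ndvi(nir, red):.3f}")
    print(f"NDWI = {ndwi(nir, swir):.3f}")
    print(f"EVI  = {evi(nir, red, blue):.3f}")
```

On this scale, the median nighttime mean temperature of 288.04 K corresponds to about 14.9 °C.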