medRxiv preprint doi: https://doi.org/10.1101/2024.06.20.24309257; this version posted June 21, 2024. The copyright holder for this preprint
(which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
It is made available under a CC-BY-NC-ND 4.0 International license .
Integrated Clinical, Climate, and Environmental Prediction Modeling for Diagnosis of Spotted
Fever Group Rickettsioses in northern Tanzania.
Robert J. Williams1, Ben J. Brintz1,2, William L. Nicholson3, John A. Crump4,5,6,7,8, Ganga
Moorthy5,9, Venace P. Maro7,8, Grace D. Kinabo7,8, James Ngocho7,8, Wilbrod Saganda10,11,
Daniel T. Leung1,12*+, Matthew P. Rubach4,5,7,13*
1. Division of Infectious Diseases, Department of Internal Medicine, University of Utah, Salt
Lake City, Utah.
2. Division of Epidemiology, Department of Internal Medicine, University of Utah, Salt Lake
City, Utah.
3. Rickettsial Zoonoses Branch, Division of Vector-Borne Diseases, Centers for Disease
Control and Prevention, Atlanta, Georgia.
4. Division of Infectious Diseases and International Health, Department of Medicine, Duke
University, Durham, North Carolina.
5. Duke Global Health Institute, Duke University, Durham, North Carolina.
6. Center for International Health, University of Otago, Dunedin, New Zealand.
7. Kilimanjaro Christian Medical Centre, Moshi, Tanzania.
8. Kilimanjaro Christian Medical University College, Moshi, Tanzania
9. Division of Pediatric Infectious Diseases, Department of Pediatrics, Duke University,
Durham, North Carolina.
10. Mawenzi Regional Referral Hospital, Moshi, Tanzania.
11. Ministry of Health, Community Development, Gender, Elderly, and Children, Dodoma,
Tanzania.
12. Division of Microbiology and Immunology, Department of Pathology, University of Utah,
Salt Lake City.
13. Programme in Emerging Infectious Diseases, Duke-National University of Singapore
Medical School, Singapore, Singapore
* contributed equally
+ corresponding author Email: daniel.leung@utah.edu (D.T.L.)
NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.
Abstract
Spotted fever group rickettsioses (SFGR) pose a global threat as emerging
zoonotic infectious diseases; however, timely and cost-effective diagnostic
tools are currently limited. While traditional clinical prediction models focus
on individual patient-level parameters, we hypothesize that for infectious
diseases, the inclusion of location-specific parameters such as climate data
may improve predictive ability. To create a prediction model, we used
data from 449 patients presenting to two hospitals in northern Tanzania
between 2007 and 2008, of whom 71 (15.8%) met criteria for acute SFGR
based on a ≥4-fold rise in antibody titers between acute and convalescent
serum samples. We fit random forest classifiers incorporating clinical
and demographic data from hospitalized febrile participants as well as
satellite-derived climate predictors from the Kilimanjaro Region. In
cross-validation, a prediction model combining clinical, climate, and environmental
predictors (20 predictors total) achieved a statistically non-significant
increase in the area under the receiver operating characteristic curve
(AUC) compared to clinical predictors alone [AUC: 0.72 (95% CI:0.57-0.86)
versus AUC: 0.64 (95% CI:0.48-0.80)]. In conclusion, we derived and
internally validated a diagnostic prediction model for acute SFGR,
demonstrating that the inclusion of climate variables alongside clinical
variables improved model performance, though this difference was not
statistically significant. Novel strategies are needed to improve the
diagnosis of acute SFGR, including the identification of diagnostic
biomarkers that could enhance clinical prediction models.
Background
Spotted fever group rickettsioses (SFGR) are a group of illnesses caused
by bacteria from the genus Rickettsia that include endemic, new, and
emerging zoonotic infectious diseases with a worldwide distribution. In
several African countries, SFGR has been identified as the infectious
etiology in 5-22% of febrile hospital admissions.1-3 Prompt recognition and
treatment of SFGR are important as multiple studies have shown that
delay in initiation of tetracycline antimicrobials is associated with increased
morbidity and mortality.4-6 However, current diagnostic methods do not allow
for timely and accurate diagnosis. The most sensitive reference standard
diagnostic, a 4-fold rise in immunofluorescent antibody (IFA) titer between
paired acute and convalescent serum samples, requires convalescent serum
collection, and therefore by definition cannot establish the diagnosis at the
time of presentation.7 In the case of R. africae and R. conorii, SFGR of
importance in African countries, seroconversion may not occur until 4
weeks after illness onset.8
Rickettsia are intracellular species and do not circulate extensively in the
bloodstream, limiting the sensitivity of polymerase chain reaction (PCR) on
blood specimens to around 60%.7 Laboratory values such as
thrombocytopenia, hyponatremia, and elevated transaminases are supportive
features, but cannot be relied on to guide early management as they are
non-specific findings and often within normal limits or only slightly above
the reference range early in the course of illness.9 Clinical diagnosis is
also unreliable, because the triad classically associated with SFGR (a history
of tick bite, rash, and fever) occurs in only a minority of cases.9-12 Incorporating
tetracycline therapy into the empiric syndromic management of febrile
illness in high prevalence settings would not only exacerbate antimicrobial
resistance, but subject children to the risks of tetracycline therapy,
including bone growth suppression and permanent tooth discoloration.13,14
Thus, more accurate, timely, and cost-effective tools are needed for
diagnosis of SFGR.
Clinical Decision-Support Systems (CDSS) incorporating prediction models
have the potential to improve management of infectious diseases. CDSS
have proven effective at enhancing therapeutic management and reducing
unnecessary diagnostic tests in both high-income countries (HICs)15 and
low- and middle-income countries (LMICs).16-18 Traditional predictive models
generally incorporate clinical
information that is obtained solely from the presenting patient. However, as
with other zoonoses, the range and incidence of SFGR have been
associated with climate and climate-related environmental factors.19,20 Thus,
incorporating location-specific parameters, such as climate and environmental
data, into a prediction model may increase diagnostic accuracy.18,21,22
In this study, we demonstrate a ‘proof of concept’ by integrating location-specific
parameters into clinical prediction. Our overarching goal is to
create an accessible and cost-effective CDSS that assists clinicians in
diagnosing SFGR. To achieve this, we used data from a clinical study of
febrile illness in an SFGR-endemic region in northern Tanzania23 to
develop a clinical prediction model that incorporates climate and
environmental data.
Methods
Study Design, Setting, and Data Source
For the derivation and validation of a prediction model, we used de-identified data from a study of participants presenting with febrile illness to
two hospitals in Kilimanjaro Region of northern Tanzania, 2007-2008.24,25
Prior to data collection, the Kilimanjaro Region in northern Tanzania had a
population of 1,380,000 in a mostly rural and semirural setting;26,27 and
Moshi, the administrative center of the Kilimanjaro Region, had a
population of approximately 144,000.27 The climate is characterized by a
long rainy period (March-May) and a short rainy period (November-December).
Febrile participants presenting to Kilimanjaro Christian Medical
Centre (KCMC) or Mawenzi Regional Hospital (MRH) in Moshi, Tanzania
from September 2007 through August 2008 were eligible for enrollment.
Complete study methods have been described elsewhere.24,25 KCMC is a
tertiary care hospital serving several Regions in northern Tanzania; at the
time of the study KCMC had 458 inpatient beds. MRH, the regional
hospital for Kilimanjaro, had 300 beds at the time of the study.
Together, KCMC and MRH served as major providers of hospital-based care in the Moshi area.
For pediatric participants 2 months to 12 years, inclusion criteria were a
history of fever in the past 48 hours or a measured axillary temperature
≥ 37.5°C or rectal temperature ≥ 38°C. For adolescent and adult
participants (≥13 years of age), the inclusion criterion was an oral temperature
≥38°C on admission to the hospital. All participants required paired sera
for inclusion in this analysis. Blood specimens were collected for a
complete blood count (CBC) and serologic infectious disease diagnostics.
Participants were also tested for HIV and malaria with rapid diagnostic
testing. After obtaining informed consent, a trained study team member
collected standardized demographic data, clinical history, and physical
examination findings. Participants were asked to return 4-6 weeks after
enrollment for collection of a convalescent serum sample.
Acute and convalescent serum samples collected for SFGR testing were
sent to the Rickettsial Zoonoses Branch of the US Centers for Disease
Control and Prevention (US CDC). Serum samples were tested for SFGR
by IgG IFA to R. conorii (Moroccan strain). SFGR was defined as a ≥4-fold increase in IFA titer to R. conorii between acute and convalescent
serum in a participant. Participants with less than a 4-fold rise in IFA
titer to R. conorii were considered non-SFGR febrile illness.
For each participant, a trained study team member collected standardized
demographic data, clinical history, admission vital signs, and physical
examination findings. As the presentation of infectious disease can differ
between pediatric and adult participants, several of the clinical variables
were only obtained for either pediatric or adult participants. For our
clinical prediction modeling we only included variables that were collected
for both groups, which included: age, heart rate, respiration rate, blood
pressure, oxygen saturation, height, weight, body mass index (BMI), cough,
diarrhea, emesis, hematochezia, dyspnea, seizures, crepitations,
hepatomegaly, splenomegaly, pallor, lymphadenopathy, oral candidiasis,
meningeal signs, HIV rapid diagnostic result, malaria rapid diagnostic result,
and whether the participant resided in a rural setting. We recorded clinical
symptoms and physical exam findings as binary variables and age, vital
signs, and complete blood count (CBC) results as continuous variables.
Climate and Environmental data
We extracted climate and environmental data from the MODIS (Moderate
Resolution Imaging Spectroradiometer) satellite for the Kilimanjaro Region,
Tanzania. Environmental data included the normalized difference vegetation
index (NDVI), the enhanced vegetation index (EVI), the normalized
difference water index (NDWI), and evapotranspiration. Climate data included
daytime and nighttime temperature. NDVI and EVI are based on
a 16-day time series composite image at 1 km × 1 km spatial resolution
and were obtained from MODIS product MOD13A2. Surface temperatures, acquired
from MODIS product MOD11A1, are daily measurements at 1 km × 1 km spatial
resolution. Evapotranspiration is based on an 8-day time series composite
image at 500 m × 500 m spatial resolution and was obtained from MOD16A2GF.
Finally, using MOD09A1, a 500 m × 500 m 8-day composite time series,
we calculated NDWI from near-infrared (NIR; MODIS band 2) and shortwave
infrared (SWIR; MODIS band 6) reflectances.28,29 We consolidated all
climate and environmental data within a uniform 16-day time series
window; for products with shorter composite periods, we computed the mean for each 16-day
window. For example, data from January 1 through January 16, 2007
constituted one window, followed by measurements from January 17
through February 2, 2007 in the subsequent window. To account for
recent climate and environmental patterns that may influence SFGR
incidence, we lagged each 16-day time series at one, two, and three
months.
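The band arithmetic and the 16-day averaging described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: it assumes the standard normalized-difference form NDWI = (NIR - SWIR) / (NIR + SWIR), assumes that 16-day windows are counted from January 1 of each year, and uses hypothetical reflectance values and function names.

```python
from datetime import date
from statistics import mean


def ndwi(nir: float, swir: float) -> float:
    """Normalized difference of NIR and SWIR reflectance (assumed formula)."""
    total = nir + swir
    if total == 0:
        raise ValueError("NIR + SWIR must be non-zero")
    return (nir - swir) / total


def window_index(day: date, length: int = 16) -> int:
    """0-based index of the fixed-length window, counted from January 1 (assumption)."""
    return (day.timetuple().tm_yday - 1) // length


def window_means(observations, length: int = 16):
    """Average (date, value) pairs within each (year, window index) bucket."""
    buckets = {}
    for day, value in observations:
        buckets.setdefault((day.year, window_index(day, length)), []).append(value)
    return {key: mean(values) for key, values in sorted(buckets.items())}


# Hypothetical 8-day composites: the first two share one 16-day window.
composites = [
    (date(2007, 1, 1), ndwi(0.30, 0.20)),
    (date(2007, 1, 9), ndwi(0.32, 0.16)),
    (date(2007, 1, 17), ndwi(0.28, 0.28)),
]
means = window_means(composites)
```

Keying each bucket by year and window index keeps the 8-day products aligned with the 16-day vegetation index composites without any interpolation.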
Finally, we aligned the lagged measurements with the admission
dates of study participants. For instance, a participant presenting on April
4, 2007 would have data from the window containing March 4 (1-month
lag), February 4 (2-month lag), and January 4 (3-month lag).
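The lag alignment in the example above can be sketched as a small date calculation. This sketch assumes calendar-month lags with the day of month clamped at month end, and the same January 1 anchored 16-day windows; both conventions and all function names are assumptions for illustration rather than details given in the text.

```python
import calendar
from datetime import date


def months_before(day: date, months: int) -> date:
    """Same day of month `months` earlier, clamped to the end of shorter months."""
    index = day.year * 12 + (day.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def window_index(day: date, length: int = 16) -> int:
    """0-based index of the 16-day window, counted from January 1 (assumption)."""
    return (day.timetuple().tm_yday - 1) // length


def lagged_windows(admission: date, lags=(1, 2, 3)):
    """Map each lag in months to the (year, window index) holding the lagged date."""
    result = {}
    for lag in lags:
        lagged = months_before(admission, lag)
        result[lag] = (lagged.year, window_index(lagged))
    return result


# The admission date used as the example in the text.
lags = lagged_windows(date(2007, 4, 4))
```

Each returned key can then be used to look up the window mean of a climate or environmental series for that participant.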
Statistical Analysis and Modeling
To compare acute SFGR versus non-SFGR febrile illness groups on
univariate analysis, we used the Wilcoxon rank sum test due to the non-normal distribution of age data. We used Pearson’s Chi-squared test and
Fisher’s exact test for categorical variables. We used the random forest
algorithm to fit a model to predict risk of participants having acute SFGR
versus non-SFGR febrile illness. Random forest is a machine learning
algorithm that constructs a multitude of decision trees and averages over
them to obtain a prediction robust to nonlinearities and interactions
between covariates; random forest algorithms have been widely applied in
the biomedical sciences for both classification and regression.30,31
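The idea of building many trees and averaging over them can be illustrated with a deliberately simplified ensemble. The sketch below is not a full random forest and not the implementation used in this study: each "tree" is a depth-one stump fit on a bootstrap sample using one randomly chosen feature, whereas real random forests grow deeper trees and sample candidate features at every split. All names and the toy data are hypothetical.

```python
import random


def fit_stump(rows, labels, feature):
    """Best single-threshold split on one feature: (feature, threshold, p_left, p_right)."""
    best = None
    for t in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        if not left or not right:
            continue
        p_left, p_right = sum(left) / len(left), sum(right) / len(right)
        # Size-weighted p(1 - p), which is proportional to Gini impurity.
        impurity = len(left) * p_left * (1 - p_left) + len(right) * p_right * (1 - p_right)
        if best is None or impurity < best[0]:
            best = (impurity, feature, t, p_left, p_right)
    if best is None:  # no valid split: predict the sample prevalence
        p = sum(labels) / len(labels)
        return (feature, float("inf"), p, p)
    return best[1:]


def fit_forest(rows, labels, n_trees=50, seed=0):
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of rows
        feature = rng.randrange(n_features)         # one random feature per stump
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest


def predict_proba(forest, row):
    """Average the class-1 proportions predicted by the individual stumps."""
    votes = [p_l if row[f] <= t else p_r for f, t, p_l, p_r in forest]
    return sum(votes) / len(votes)


# Hypothetical toy data: feature 0 determines the label, feature 1 is noise.
rng = random.Random(7)
rows = [(rng.random(), rng.random()) for _ in range(200)]
labels = [int(x > 0.5) for x, _ in rows]
forest = fit_forest(rows, labels)
high = predict_proba(forest, (0.9, 0.5))
low = predict_proba(forest, (0.1, 0.5))
```

Averaging over bootstrap-trained learners is what makes the ensemble prediction smoother than any single tree.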
We excluded highly skewed binary predictors (those with 95% or more of
values concentrated as either 0 or 1) from analysis.
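The skew filter can be expressed in a few lines. This is a sketch under stated assumptions: binary predictors are coded 0/1, missing values are represented as None and ignored when computing proportions, and the column names and counts are hypothetical toy data.

```python
def is_highly_skewed(values, threshold=0.95):
    """True when at least `threshold` of the observed 0/1 values share one level."""
    observed = [v for v in values if v is not None]
    if not observed:
        return True  # nothing observed, so the column carries no information
    share_of_ones = sum(observed) / len(observed)
    return max(share_of_ones, 1 - share_of_ones) >= threshold


def drop_skewed(columns, threshold=0.95):
    """Keep only the binary columns that are not highly skewed."""
    return {
        name: values
        for name, values in columns.items()
        if not is_highly_skewed(values, threshold)
    }


# Hypothetical toy data for 100 participants.
columns = {
    "cough": [1] * 40 + [0] * 60,
    "hematochezia": [1] * 3 + [0] * 97,
}
kept = drop_skewed(columns)
```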
For the remaining predictors, we imputed missing data using the
‘missRanger’ package in R. To determine which predictors to include in
our analysis, we fit two distinct models: one using solely satellite-derived climate and environmental data and another incorporating clinical
and demographic data. We used the ‘permimp’ package in R to assess
the variable importance using permutation-based methods. This method
involves systematically shuffling or permuting the values of individual
predictors to evaluate their impact on performance. Next, we identified the
top 10 predictors from each model based on their respective permuted
importance scores. We included these predictors in our final analysis.
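The shuffling idea behind permutation importance can be sketched generically. This is a simple unconditional version that measures the mean drop in accuracy; it is not the algorithm implemented in the ‘permimp’ package, and the model, data, and function names are hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when one column is shuffled, reported per column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild each row with only the chosen column replaced.
            permuted = [r[:col] + (v,) + r[col + 1:] for r, v in zip(rows, shuffled)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def top_k(names, importances, k=10):
    """Names of the k predictors with the largest importance scores."""
    ranked = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical toy model that uses column 0 and ignores column 1.
data_rng = random.Random(1)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
labels = [int(r[0] > 0.5) for r in rows]


def predict(row):
    return int(row[0] > 0.5)


scores = permutation_importance(predict, rows, labels)
ranking = top_k(["informative", "noise"], scores, k=1)
```

A predictor the model never uses gets an importance of exactly zero, because shuffling it cannot change any prediction.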
To assess predictive performance for each random forest model, we used
repeated cross-validation using 80% training/20% testing splits with 100
iterations. In each iteration, we trained models on 80% of the data, made
predictions on the 20% test set, and obtained measures of performance.
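The repeated 80%/20% procedure can be sketched as a loop around any model-fitting function. The sketch below makes several assumptions not stated in the text: splits are unstratified random splits, iterations whose test set contains a single class are skipped, and a trivial one-feature scorer stands in for the random forest. All names and data are hypothetical.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: chance that a positive outscores a negative (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both classes")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_holdout_auc(rows, labels, fit, n_iter=100, test_frac=0.2, seed=0):
    """Mean test-set AUC over repeated random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    n_test = round(test_frac * len(rows))
    aucs = []
    for _ in range(n_iter):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        score = fit([rows[i] for i in train], [labels[i] for i in train])
        test_labels = [labels[i] for i in test]
        if len(set(test_labels)) < 2:
            continue  # AUC is undefined for a single-class test split
        aucs.append(auc(test_labels, [score(rows[i]) for i in test]))
    return sum(aucs) / len(aucs), aucs


def fit_one_feature(train_rows, train_labels):
    """Hypothetical stand-in model: score by feature 0, oriented toward the positives."""
    pos = [r[0] for r, y in zip(train_rows, train_labels) if y == 1]
    neg = [r[0] for r, y in zip(train_rows, train_labels) if y == 0]
    sign = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
    return lambda row: sign * row[0]


# Hypothetical toy data: one Gaussian feature shifted upward for positives.
data_rng = random.Random(42)
labels = [i % 2 for i in range(200)]
rows = [(data_rng.gauss(1.5 * y, 1.0),) for y in labels]
mean_auc, aucs = repeated_holdout_auc(rows, labels, fit_one_feature)
```

Because the model is refit inside every iteration, each AUC is computed only on participants the model did not see during training.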
We determined overall model performance by averaging the area under
the receiver operating characteristic curve (AUC) and confidence intervals
across the 100 iterations. To determine the statistical significance of the difference in AUC
between models, we used a bootstrap method over 100 iterations, which
involves resampling the data with replacement multiple times, creating
bootstrap samples. For each bootstrap sample, we generated receiver
operating characteristic (ROC) curves and computed the difference in AUC
between the curves. We completed all analyses using R version 4.2.0,
and model development/validation was completed in accordance with the
Transparent Reporting of a multivariable prediction model for Individual
Prognosis Or Diagnosis (TRIPOD) checklist (Supplement Table 1).
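The bootstrap comparison of two AUCs described above can be sketched as follows. The text does not state how the bootstrap differences were converted to a p-value, so this sketch assumes one common construction: the observed difference is divided by the standard deviation of the bootstrap differences and compared with a standard normal distribution. Function names and toy scores are hypothetical.

```python
import math
import random


def auc(labels, scores):
    """Rank-based AUC: chance that a positive outscores a negative (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(labels, scores_a, scores_b, n_boot=100, seed=0):
    """Paired bootstrap of AUC(a) - AUC(b): (observed difference, SD, two-sided p)."""
    rng = random.Random(seed)
    n = len(labels)
    observed = auc(labels, scores_a) - auc(labels, scores_b)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample participants with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # redraw if the resample lost one class
        a = auc(ys, [scores_a[i] for i in idx])
        b = auc(ys, [scores_b[i] for i in idx])
        diffs.append(a - b)
    mean_diff = sum(diffs) / n_boot
    sd = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n_boot - 1))
    if sd == 0:
        return observed, sd, (1.0 if observed == 0 else 0.0)
    z = observed / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return observed, sd, p_value


# Hypothetical toy scores: model A is informative, model B is pure noise.
data_rng = random.Random(3)
labels = [i % 2 for i in range(200)]
scores_a = [data_rng.gauss(2.0 * y, 1.0) for y in labels]
scores_b = [data_rng.gauss(0.0, 1.0) for _ in labels]
observed, sd, p_value = bootstrap_auc_difference(labels, scores_a, scores_b)
```

Resampling participants rather than scores keeps the two models paired, so the comparison reflects that both were evaluated on the same individuals.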
Research Ethics
The primary study was approved by the Kilimanjaro Christian Medical
University College Health Research Ethics Committee, the Tanzania
National Institutes for Medical Research National Health Research Ethics
Coordinating Committee, and the Institutional Review Boards of Duke University
Medical Center and the US CDC. The secondary data analysis was
reviewed by the Institutional Review Board of the University of Utah and
determined to be exempt (IRB_00164810). For all minors, written informed
consent was given by a parent or guardian, and all adult participants
provided their own written informed consent.
Results
Of the 870 participants enrolled in the study, 449 (51.6%) underwent
follow-up for the collection of convalescent serum. Of these 449
participants, 71 (15.8%) met criteria for acute SFGR (Figure 1).
We excluded the highly skewed predictors hematochezia, meningeal signs,
and the malaria rapid diagnostic from our analysis. We found statistically
significant differences in several clinical variables including vital signs,
clinical symptoms, and laboratory results between acute SFGR and non-SFGR
febrile illness (Table 1). Overall, acute SFGR participants were
older (median age 24 versus 8 years, p-value=0.003) with significantly
higher height and weight. Acute SFGR participants had a lower respiratory
rate than non-SFGR febrile illness participants (median 28 versus 32
breaths per minute, p-value<0.001) and were more likely to reside in a rural
setting (59% versus 45%, p-value=0.025). There were no significant
differences in the CBC results between the two groups, although platelet
count and hematocrit had a p-value of 0.07 and 0.08, respectively.
We also found several significantly different climate and environmental
predictors between acute SFGR and non-SFGR febrile illness participants
(Supplemental Table S1). Acute SFGR was associated with higher recent
temperatures: significantly higher nighttime mean (odds ratio [OR]: 1.17
[1.03-1.34], p-value = 0.01) and nighttime maximum (OR: 1.16 [1.02-1.32],
p-value = 0.02) temperature at one-month lag, and higher daytime minimum (OR:
1.08 [1.00-1.17], p-value = 0.03) temperature at one-month lag. It was also
associated with lower minimum NDWI (a proxy for plant water stress, where lower
values signify increased plant stress) at one- and two-month lags (OR: 0.20
[0.05-0.71], p-value = 0.01; OR: 0.09 [0.02-0.38], p-value < 0.001). Additionally,
acute SFGR participants had lower minimum evapotranspiration rates at a
two-month lag (OR: 0.78 [0.64-0.96], p-value = 0.02) and higher maximum
evapotranspiration rates at a three-month lag (OR: 1.06 [1.01-1.12], p-value = 0.03).
Performance of clinical, climate, and environmental predictors and
parsimonious model selection
Table 2 lists the best performing clinical, climate, and environmental
predictors, as well as the best performing predictors when combined. We
first assessed model performance with only the ten best performing clinical
predictors (AUC: 0.64; 95% CI: 0.48-0.80) and with only the ten best
performing climate and environmental predictors (AUC: 0.61; 95% CI: 0.47-0.77). Next, we fit a model using the ten best performing clinical and the
ten best performing climate and environmental predictors and assessed
how this model compared to a model with only the ten best performing
clinical predictors. By combining clinical, climate and environmental
predictors, the AUC improved to 0.72 (95% CI: 0.57-0.86), though this
improvement was not statistically significant (median p-value=0.3, 12% of p-values <0.05). At sensitivities of 70%, 80%, and 90%, the model had
specificities of 61%, 51%, and 33%, respectively. Conversely, at
specificities of 70%, 80%, and 90%, the model had sensitivities of 62%, 46%, and
29%, respectively.
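The trade-off between sensitivity and specificity at fixed operating points can be read off the ROC curve by sweeping the decision threshold. The sketch below assumes that a participant is classified as positive when the predicted score is at or above the threshold; the scores and labels are hypothetical toy values, not study data.

```python
def roc_points(labels, scores):
    """(threshold, sensitivity, specificity) for the rule 'score >= threshold'."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


def specificity_at_sensitivity(labels, scores, target):
    """Best specificity among thresholds whose sensitivity is at least `target`."""
    return max(spec for _, sens, spec in roc_points(labels, scores) if sens >= target)


def sensitivity_at_specificity(labels, scores, target):
    """Best sensitivity among thresholds whose specificity is at least `target`."""
    return max(sens for _, sens, spec in roc_points(labels, scores) if spec >= target)


# Hypothetical toy predictions for four positives and six negatives.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.5, 0.4, 0.2, 0.2, 0.1]
spec_at_75 = specificity_at_sensitivity(labels, scores, 0.75)
sens_at_80 = sensitivity_at_specificity(labels, scores, 0.80)
```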
To create a parsimonious model, we fit models by successively
incorporating fewer of the best performing predictors. Model performance
was relatively similar with 10, 15, and 20 predictors and began to
decrease with fewer than 10 predictors (Figure 2). A model with the best
performing 10 predictors had an AUC of 0.69 (95% CI: 0.54-0.84), a
sensitivity of 66%, a specificity of 72%, a positive predictive value (PPV)
of 93%, and a negative predictive value (NPV) of 33%.
This model would include eight clinical predictors (respiration rate, platelet
count, HIV rapid test result, red blood cell count, hemoglobin, admission
temperature, oxygen saturation, and basophil count) and two climate and
environmental predictors (minimum NDWI at a two-month lag and maximum
evapotranspiration at a one-month lag).
Discussion
Using data from a two-center clinical study of febrile illness from northern
Tanzania, we show the derivation and cross-validation of a diagnostic
prediction model for SFGR, a febrile illness lacking an accurate laboratory
diagnostic during acute illness. We also showed that the addition of
satellite-derived climate and environmental predictors improved the predictive
performance of clinical predictors alone. A parsimonious model with ten
predictors including three vital signs, four results from CBC, two satellite-derived
climate predictors, and a rapid HIV test achieved an AUC of 0.69
(95% CI: 0.54-0.84) on cross-validation. While our predictive model offers an
improvement over existing clinical prediction models published for SFGR,32
the performance of these models remains suboptimal. There is a critical need
for the exploration and validation of specific biomarkers that could enhance
the diagnostic precision of SFGR clinical prediction models and contribute to
more effective management strategies in regions affected by this potentially
fatal bacterial disease. We propose assessing candidate biomarkers,
including proteins, peptides, and nucleic acids, drawn from routine clinical
analytes (e.g., fibrinogen) and vetted translational research assays (e.g.,
endothelial activation markers such as angiopoietin-2) that are relevant to
SFGR’s known pathophysiology of endothelial infection and inflammation.
The use of satellite imagery has been shown to facilitate modeling of
population dynamics of ticks,33,34 the vector for transmission of SFGR. In
our prediction modeling, satellite-derived climate and environmental
predictors improved the AUC of our internally validated model. The
optimum threshold for the parsimonious model resulted in a sensitivity of
66%, a specificity of 72%, PPV of 93%, and NPV of 33%. A PPV of
93% would allow clinicians to use this model to determine which patients
should be started on empiric treatment for SFGR with tetracycline therapy.
However, a sensitivity of 66% indicates that this model would miss nearly
one-third of SFGR patients. Models with higher sensitivities had much
lower specificities (i.e., high potential for false positive predictions). Given
the dynamic nature of NDVI, EVI, and NDWI, which undergo variations
influenced by land use changes and other anthropogenic impacts,35-37
external validation of the model is needed, as these fluctuations may
impose limitations on their applicability within models spanning several
years. For use in a clinical decision support tool, the most recent
satellite-derived climate and environmental data could be gathered from
online sources, based on smartphone-based detection of GPS location.
Similar to what has been reported in the literature, we found that a
higher body temperature on admission was one of the top clinical
predictors associated with acute SFGR infection.38,39 CBC results including
thrombocytopenia, leukopenia, and lymphopenia have been shown to be
significantly different between acute SFGR and non-SFGR febrile illness. In
our model, platelet count, hemoglobin, and hematocrit were important CBC
predictors; however, lymphopenia and leukopenia were not. Our model also
found that respiration rate and oxygen saturation contribute to
discrimination between acute SFGR and non-SFGR febrile illness. While
our dataset did not have complete information on classic SFGR symptoms
including headache, myalgias, and rash,11 these symptoms have been
found to occur in similar proportions among acute SFGR and non-SFGR
febrile illness.9,10,40
A major limitation of our study is lack of external validation. The lack of
external validation, coupled with the fact that we constructed our model
using data from a single endemic Region in northern Tanzania, potentially
hinders the model’s generalizability to a broader population. Given the
intricate interplay between vector and host, the climate and environmental
indices that affect SFGR may vary between regions. Second, our model
was constructed using a relatively small dataset, resulting in wide
confidence intervals for the calculated cross-validated AUC.
Finally, our model lacks other laboratory values that have been shown to be
correlated with SFGR infection (e.g., sodium,38,41 transaminases,38,41-43 lactic
dehydrogenase,41,43 and fibrinogen44); inclusion of these laboratory parameters
may have improved model performance.
Using an existing dataset of acute SFGR and non-SFGR febrile illness in Tanzania, we demonstrated as a proof of concept that
inclusion of climate and environmental variables along with clinical variables
improved clinical prediction models for identifying SFGR. Further research
should expand upon this analysis by incorporating data from additional
febrile cohorts, exploring the inclusion of clinical biomarkers, and assessing
the performance of this model in diverse settings endemic for SFGR to
ensure its generalizability.
Acknowledgements:
This research was supported by the International Studies on AIDS
Associated Co-infections, United States National Institutes of Health (U01
AI062563 to J.A.C. and V.P.M, K24 AI166087 to D.T.L., and R38
HL143605 to R.J.W. through Utah Stimulating Access to Research in
Residency (StARR)). We acknowledge the Hubert-Yeargan Center for
Global Health at Duke University for critical infrastructure support for the
Kilimanjaro Christian Medical Centre-Duke University
Collaboration.
Disclaimers: The findings and conclusions in this report are
those of the authors and do not necessarily represent the official position
of the US Centers for Disease Control and Prevention. Use of trade
names and commercial sources is for identification only and does not
imply endorsement by the US Department of Health and Human Services
or the US Centers for Disease Control and Prevention.
References
1. Crump JA, Morrissey AB, Nicholson WL, et al. Etiology of severe
non-malaria febrile illness in Northern Tanzania: a prospective cohort
study. PLoS Negl Trop Dis. 2013;7(7):e2324.
2. Ndip LM, Fokam EB, Bouyer DH, et al. Detection of Rickettsia
africae in patients and ticks along the coastal region of Cameroon.
Am J Trop Med Hyg. Sep 2004;71(3):363-366.
3. Maina AN, Farris CM, Odhiambo A, et al. Q Fever, Scrub Typhus,
and Rickettsial Diseases in Children, Kenya, 2011-2012. Emerg Infect
Dis. May 2016;22(5):883-886.
4. Dalton MJ, Clarke MJ, Holman RC, et al. National surveillance for
Rocky Mountain spotted fever, 1981-1992: epidemiologic summary and
evaluation of risk factors for fatal outcome. Am J Trop Med Hyg.
May 1995;52(5):405-413.
5. Kirkland KB, Wilkinson WE, Sexton DJ. Therapeutic delay and
mortality in cases of Rocky Mountain spotted fever. Clin Infect Dis.
May 1995;20(5):1118-1121.
6. Regan JJ, Traeger MS, Humpherys D, et al. Risk factors for fatal
outcome from rocky mountain spotted Fever in a highly endemic
area-Arizona, 2002-2011. Clin Infect Dis. Jun 1 2015;60(11):1659-1666.
7. Biggs HM, Behravesh CB, Bradley KK, et al. Diagnosis and
Management of Tickborne Rickettsial Diseases: Rocky Mountain
Spotted Fever and Other Spotted Fever Group Rickettsioses,
Ehrlichioses, and Anaplasmosis - United States. MMWR Recomm
Rep. May 13 2016;65(2):1-44.
8. Fournier PE, Jensenius M, Laferl H, Vene S, Raoult D. Kinetics of
antibody responses in Rickettsia africae and Rickettsia conorii
infections. Clin Diagn Lab Immunol. Mar 2002;9(2):324-328.
9. Traeger MS, Regan JJ, Humpherys D, et al. Rocky mountain
spotted fever characterization and comparison to similar illnesses in
a highly endemic area-Arizona, 2002-2011. Clin Infect Dis. Jun 1
2015;60(11):1650-1658.
10. Prabhu M, Nicholson WL, Roche AJ, et al. Q fever, spotted fever
group, and typhus group rickettsioses among hospitalized febrile
patients in northern Tanzania. Clin Infect Dis. Aug 2011;53(4):e8-15.
11. Cohen R, Finn T, Babushkin F, et al. Spotted Fever Group
Rickettsioses in Israel, 2010-2019. Emerg Infect Dis. Aug
2021;27(8):2117-2126.
12. Delisle J, Mendell NL, Stull-Lane A, Bloch KC, Bouyer DH, Moncayo
AC. Human infections by multiple spotted fever group rickettsiae in
Tennessee. Am J Trop Med Hyg. 2016;94(6):1212.
13. Cross R, Ling C, Day NP, McGready R, Paris DH. Revisiting
doxycycline in pregnancy and early childhood--time to rebuild its
reputation? Expert Opin Drug Saf. 2016;15(3):367-382.
14. Wormser GP, Wormser RP, Strle F, Myers R, Cunha BA. How safe
is doxycycline for young children or for pregnant or breastfeeding
women? Diagn Microbiol Infect Dis. Mar 2019;93(3):238-242.
15. Bright TJ, Wong A, Dhurjati R, et al. Effect of Clinical
Decision-Support Systems. Ann Intern Med. 2012;157(1):29-43.
16. Bilal S, Nelson E, Meisner L, et al. Evaluation of Standard and
Mobile Health-Supported Clinical Diagnostic Tools for Assessing
Dehydration in Patients with Diarrhea in Rural Bangladesh.
Am J Trop Med Hyg. 2018;99(1):171-179.
17. Tuon FF, Gasparetto J, Wollmann LC, Moraes TP. Mobile health
application to assist doctors in antibiotic prescription - an approach
for antibiotic stewardship. Braz J Infect Dis. Nov-Dec 2017;21(6):660-664.
18. Garbern SC, Nelson EJ, Nasrin S, et al. External validation of a
mobile clinical decision support system for diarrhea etiology prediction
in children: A multicenter study in Bangladesh and Mali. Elife. Feb
9 2022;11.
19. Kerins JL, Dorevitch S, Dworkin MS. Spotted Fever Group
Rickettsioses (SFGR): weather and incidence in Illinois. Epidemiol
Infect. Sep 2017;145(12):2466-2472.
20. Zhang WJ, Cai XY, Yang C, et al. Cervical necrotizing fasciitis due
to methicillin-resistant Staphylococcus aureus: a case report. Int J
Oral Maxillofac Surg. Aug 2010;39(8):830-834.
21. Fine AM, Brownstein JS, Nigrovic LE, et al. Integrating Spatial
Epidemiology Into a Decision Model for Evaluation of Facial Palsy in
Children. Arch Pediatr Adolesc Med. 2011;165(1):61-67.
22. Nelson EJ, Khan AI, Keita AM, et al. Improving Antibiotic
Stewardship for Diarrheal Disease With Probability-Based Electronic
Clinical Decision Support: A Randomized Crossover Trial. JAMA
Pediatr. Oct 1 2022;176(10):973-979.
23. Pisharody S, Rubach MP, Carugati M, et al. Incidence Estimates of
Acute Q Fever and Spotted Fever Group Rickettsioses, Kilimanjaro,
Tanzania, from 2007 to 2008 and from 2012 to 2014. Am J Trop
Med Hyg. Dec 20 2021;106(2):494-503.
24. Crump JA, Ramadhani HO, Morrissey AB, et al. Invasive bacterial
and fungal infections among hospitalized HIV-infected and
HIV-uninfected adults and adolescents in northern Tanzania. Clin Infect
Dis. Feb 1 2011;52(3):341-348.
25. Crump JA, Ramadhani HO, Morrissey AB, et al. Invasive bacterial
and fungal infections among hospitalized HIV-infected and
HIV-uninfected children and infants in northern Tanzania. Trop Med Int
Health. Jul 2011;16(7):830-837.
26. Biggs HM, Hertz JT, Munishi OM, et al. Estimating leptospirosis
incidence using hospital-based surveillance and a population-based
health care utilization survey in Tanzania. PLoS Negl Trop Dis.
2013;7(12):e2589.
27. National Bureau of Statistics of the United Republic of Tanzania.
Tanzania Census 2002: Analytical Report. 2006.
28. Chen D, Huang J, Jackson TJ. Vegetation water content estimation
for corn and soybeans using spectral indices derived from MODIS
near- and short-wave infrared bands. Remote Sensing of
Environment. 2005;98(2):225-236.
29. Jackson TJ, Chen D, Cosh M, et al. Vegetation water content
mapping using Landsat data derived normalized difference water
index for corn and soybeans. Remote Sensing of Environment.
2004;92(4):475-482.
30. Peng SY, Chuang YC, Kang TW, Tseng KH. Random forest can
predict 30-day mortality of spontaneous intracerebral hemorrhage with
remarkable discrimination. Eur J Neurol. Jul 2010;17(7):945-950.
31. Sarica A, Cerasa A, Quattrone A. Random Forest Algorithm for the
Classification of Neuroimaging Data in Alzheimer's Disease: A
Systematic Review. Front Aging Neurosci. 2017;9:329.
32. Lopez DM, de Mello FL, Giordano Dias CM, et al. Evaluating the
Surveillance System for Spotted Fever in Brazil Using
Machine-Learning Techniques. Front Public Health. 2017;5:323.
33. Randolph SE. Ticks and tick-borne disease systems in space and
from space. Adv Parasitol. 2000;47:217-243.
34. Estrada-Peña A. Geostatistics and remote sensing using
NOAA-AVHRR satellite imagery as predictive tools in tick distribution and
habitat suitability estimations for Boophilus microplus (Acari: Ixodidae)
in South America. National Oceanographic and Atmosphere
Administration-Advanced Very High Resolution Radiometer. Vet
Parasitol. Feb 1 1999;81(1):73-82.
35. Aburas MM, Abdullah SH, Ramli MF, Ash’aari ZH. Measuring land
cover change in Seremban, Malaysia using NDVI index. Procedia
Environmental Sciences. 2015;30:238-243.
36. Lunetta RS, Knight JF, Ediriwickrema J, Lyon JG, Worthy LD.
Land-cover change detection using multi-temporal MODIS NDVI data.
Geospatial Information Handbook for Water Resources and Watershed
Management, Volume II: CRC Press; 2022:65-88.
37. Wang G, Peng W, Zhang L, Zhang J, Xiang J. Vegetation EVI
Changes and Response to Natural Factors and Human Activities
Based on Geographically and Temporally Weighted Regression.
Global Ecology and Conservation. 2023:e02531.
38. Buckingham SC, Marshall GS, Schutze GE, et al. Clinical and
laboratory features, hospital course, and outcome of Rocky Mountain
spotted fever in children. J Pediatr. Feb 2007;150(2):180-184,
184.e181.
39. Silva-Ramos CR, Hidalgo M, Faccini-Martínez ÁA. Clinical,
epidemiological, and laboratory features of Rickettsia parkeri
rickettsiosis: A systematic review. Ticks Tick Borne Dis. Jul
2021;12(4):101734.
40. Faruque LI, Zaman RU, Gurley ES, et al. Prevalence and clinical
presentation of Rickettsia, Coxiella, Leptospira, Bartonella and
chikungunya virus infections among hospital-based febrile patients
from December 2008 to November 2009 in Bangladesh. BMC
Infect Dis. 2017;17(1):1-12.
41. Antón E, Font B, Muñoz T, Sanfeliu I, Segura F. Clinical and
laboratory characteristics of 144 patients with mediterranean spotted
fever. Eur J Clin Microbiol Infect Dis. Feb 2003;22(2):126-128.
42. Mahara F. Japanese spotted fever: report of 31 cases and review
of the literature. Emerg Infect Dis. Apr-Jun 1997;3(2):105-111.
43. Miyashima Y, Iwamuro M, Shibata M, et al. Prediction of
Disseminated Intravascular Coagulation by Liver Function Tests in
Patients with Japanese Spotted Fever. Intern Med. Jan 15
2018;57(2):197-202.
44. Rao AK, Schapira M, Clements ML, et al. A prospective study of
platelets and plasma proteolytic systems during the early stages of
Rocky Mountain spotted fever. N Engl J Med. Apr 21
1988;318(16):1021-1028.
Table 1. Clinical characteristics and complete blood count results for febrile participants with and without spotted-fever
group rickettsioses, northern Tanzania, 2007-2008.
Characteristic | Overall, N=449 (1) | Non-SFGR febrile illness, N=378 (1) | Acute SFGR, N=71 (1) | p-value (2)
Age, years | 9 (2, 32) | 8 (1, 31) | 24 (4, 39) | 0.003
Rural | 211 (47%) | 168 (45%) | 42 (59%) | 0.03
Admission temperature, °C | 38.40 (38.00, 39.10) | 38.40 (38.00, 39.00) | 38.60 (38.00, 39.20) | 0.4
Heart rate, beats per minute | 112 (97, 131) | 112 (98, 132) | 108 (96, 124) | 0.2
Respiration rate, breaths per minute | 32 (24, 44) | 32 (26, 45) | 28 (24, 35) | <0.001
Oxygen saturation, % | 96.0 (94.0, 98.0) | 96.0 (94.0, 98.0) | 97.0 (94.0, 98.0) | 0.5
Systolic blood pressure, mmHg | 108 (100, 120) | 108 (100, 120) | 108 (100, 117) | 0.4
Diastolic blood pressure, mmHg | 68 (60, 74) | 68 (60, 74) | 67 (60, 73) | 0.7
BMI | 17.2 (14.4, 21.6) | 17.0 (14.3, 21.0) | 19.2 (15.6, 22.5) | 0.06
Weight, kg | 22 (10, 55) | 20 (10, 55) | 49 (14, 61) | 0.01
Height, m | 1.25 (0.82, 1.62) | 1.19 (0.81, 1.62) | 1.54 (1.05, 1.63) | 0.01
Cough | 293 (65%) | 252 (67%) | 41 (58%) | 0.1
Oral Candida | 43 (9.6%) | 39 (10%) | 4 (5.6%) | 0.2
Crepitations | 196 (44%) | 165 (44%) | 31 (44%) | >0.9
Hepatomegaly | 34 (7.6%) | 27 (7.2%) | 7 (9.9%) | 0.4
Diarrhea | 90 (20%) | 77 (20%) | 13 (18%) | 0.7
Emesis | 132 (29%) | 109 (29%) | 23 (32%) | 0.5
Dyspnea | 152 (34%) | 132 (35%) | 20 (28%) | 0.3
Seizure | 41 (9.2%) | 37 (9.8%) | 4 (5.6%) | 0.3
Pallor | 32 (7.2%) | 27 (7.2%) | 5 (7.0%) | >0.9
Lymphadenopathy | 39 (8.7%) | 34 (9.0%) | 5 (7.0%) | 0.6
Rapid Malaria Diagnostic | 19 (4.2%) | 16 (4.2%) | 3 (4.2%) | >0.9
Rapid HIV Diagnostic | | | | 0.13
  Positive | 103 (23%) | 92 (24%) | 11 (15%) |
  Indeterminate | 8 (1.8%) | 8 (2.1%) | 0 (0%) |
White blood cell count, K/µL | 9 (6, 13) | 9 (6, 13) | 7 (5, 13) | 0.1
Red blood cell count, M/µL | 4.28 (3.66, 4.79) | 4.24 (3.59, 4.79) | 4.43 (3.97, 4.82) | 0.2
Hemoglobin, g/dL | 10.80 (9.00, 12.40) | 10.70 (9.00, 12.28) | 11.30 (9.95, 12.75) | 0.1
Hematocrit, % | 32 (28, 37) | 32 (27, 36) | 33 (30, 37) | 0.08
Platelet count, K/µL | 267 (161, 391) | 275 (164, 404) | 225 (130, 360) | 0.07
Neutrophil, % | 62 (42, 76) | 62 (42, 76) | 65 (45, 79) | 0.3
Lymphocyte, % | 25 (16, 43) | 26 (16, 44) | 22 (13, 42) | 0.2
Monocyte, % | 8.5 (5.9, 11.5) | 8.5 (5.9, 11.6) | 7.8 (5.3, 10.5) | 0.3
Eosinophil, % | 0.20 (0.00, 0.90) | 0.20 (0.00, 1.00) | 0.20 (0.00, 0.70) | 0.3
Basophil, % | 0.80 (0.40, 1.20) | 0.80 (0.40, 1.20) | 0.80 (0.45, 1.10) | >0.9
(1) n (%); Median (IQR)
(2) Pearson’s Chi-squared test; Wilcoxon rank sum test; Fisher’s exact test
Table 2. Best performing predictors by permutation of importance for
climate, environmental, and clinical predictors

Best Performing Climate and Environmental Predictors
AUC: 0.61 (95% CI: 0.47-0.77)
Predictors | Permutation of Importance
Minimum NDWI (2) | 8.0e-05
Daytime maximum temperature (2) | 3.6e-05
Maximum NDVI (2) | 2.9e-05
Daytime minimum temperature (1) | 2.6e-05
Minimum evapotranspiration (3) | 2.5e-05
Minimum EVI (2) | 1.9e-05
Daytime mean temperature (2) | 1.9e-05
Daytime maximum temperature (1) | 1.8e-05
Maximum evapotranspiration (1) | 1.8e-05
Maximum NDVI (1) | 1.8e-05

Best Performing Clinical Predictors
AUC: 0.64 (95% CI: 0.48-0.80)
Predictors | Permutation of Importance
Respiration rate | 0.00380
Admission temperature | 0.00330
Oxygen saturation | 0.00240
Red blood cell count | 0.00239
HIV | 0.00223
Hemoglobin | 0.00171
Platelet count | 0.00165
Cough | 0.00019
Hematocrit | 0.00001
Basophil count | -0.00112

Best Performing Combined Predictors
AUC: 0.72 (95% CI: 0.57-0.86)
Predictors | Permutation of Importance
Respiration rate | 0.00437
Platelet count | 0.00359
HIV | 0.00119
Hemoglobin | 0.00119
Red blood cell count | 0.00060
Oxygen saturation | 0.00046
Admission temperature | 0.00031
Minimum EVI (2) | 0.00027
Maximum evapotranspiration (1) | 0.00023
Minimum NDWI (2) | 0.00017
Maximum NDVI (2) | 0.00013
Daytime maximum temperature (2) | 0.00008
Daytime mean temperature (2) | 0.00008
Maximum NDVI (1) | 0.00001
Hematocrit | 0.00001
Minimum evapotranspiration (3) | 0.00000
Daytime maximum temperature (1) | -0.00003
Daytime minimum temperature (1) | -0.00010
Basophil count | -0.00020
Cough | -0.00031
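To illustrate what a permutation importance value expresses, the following is a minimal sketch, not the authors' implementation. It assumes importance is measured as the mean drop in AUC when one predictor is shuffled across patients; the importance measure and software used for Table 2 are not described in this section and may differ. The toy data, the `toy_score` stand-in model, and the column names `resp_rate` and `unused` are all hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the share of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def permutation_importance(predict, rows, labels, column, n_repeats=20, seed=0):
    """Mean drop in AUC after shuffling one predictor column across patients."""
    rng = random.Random(seed)
    baseline = auc(labels, [predict(r) for r in rows])
    drops = []
    for _ in range(n_repeats):
        values = [r[column] for r in rows]
        rng.shuffle(values)
        # Rebuild each row with only the chosen column replaced.
        permuted = [{**r, column: v} for r, v in zip(rows, values)]
        drops.append(baseline - auc(labels, [predict(r) for r in permuted]))
    return sum(drops) / n_repeats


# Hypothetical toy data: 1 = acute SFGR, 0 = non-SFGR febrile illness.
rows = [
    {"resp_rate": 24, "unused": 3}, {"resp_rate": 26, "unused": 1},
    {"resp_rate": 28, "unused": 4}, {"resp_rate": 27, "unused": 2},
    {"resp_rate": 34, "unused": 2}, {"resp_rate": 40, "unused": 4},
    {"resp_rate": 32, "unused": 1}, {"resp_rate": 45, "unused": 3},
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def toy_score(row):
    # Stand-in for a fitted model: lower respiration rate gives a higher score.
    return -row["resp_rate"]


imp_resp = permutation_importance(toy_score, rows, labels, "resp_rate")
imp_unused = permutation_importance(toy_score, rows, labels, "unused")
print(f"resp_rate: {imp_resp:.3f}, unused: {imp_unused:.3f}")
```

Under this definition, a predictor the model ignores has an importance of exactly zero, and values near zero or below zero suggest that shuffling the predictor did not degrade discrimination.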
7,507 admissions to KCMC and MRH
screened for eligibility from 17 September
2007 through 25 August 2008
6,177 (82.3%) did not meet eligibility
criteria
1,330 (17.7%) met eligibility criteria
460 (34.6%) were not enrolled
870 (65.4%) enrolled in study
421 (48.4%) not tested by spotted
fever rickettsioses IFA
449 (51.6%) tested by spotted fever group
rickettsioses IFA 4 to 6 weeks post-hospitalization
Figure 1. Study flow diagram. Screening and enrollment of patients
hospitalized at KCMC and MRH.
KCMC: Kilimanjaro Christian Medical
Centre; MRH: Mawenzi Regional Hospital; IFA: immunofluorescence assay
[Figure 2 plot: "AUC by Number of Predictors"; x-axis: Number of Predictors (5 to 20); y-axis: AUC (0.4 to 0.8)]
Figure 2. Average AUC (solid line) and 95% Confidence Intervals (dotted
lines) from cross-validation (100 iterations) for each model by number of
predictors included in the model. AUC: area under the receiver operating
characteristic curve.
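As a rough illustration of how an average AUC and a 95% interval over 100 iterations can be obtained, the sketch below makes several assumptions that are not confirmed by this section: repeated random 80/20 train/test splits, a percentile interval, and a toy class-mean-difference scorer standing in for the actual model. The data are synthetic and all function names are hypothetical.

```python
import random
import statistics


def auc(labels, scores):
    """AUC as the share of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def fit_mean_difference_scorer(X, y):
    """Toy stand-in model: weight each feature by the difference in class means."""
    weights = []
    for j in range(len(X[0])):
        cases = [row[j] for row, label in zip(X, y) if label == 1]
        controls = [row[j] for row, label in zip(X, y) if label == 0]
        weights.append(statistics.fmean(cases) - statistics.fmean(controls))
    return lambda row: sum(w * v for w, v in zip(weights, row))


def repeated_split_auc(X, y, n_iter=100, test_frac=0.2, seed=0):
    """Held-out AUC from repeated random train/test splits."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    n_test = max(2, int(len(y) * test_frac))
    aucs = []
    for _ in range(n_iter):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        y_test = [y[i] for i in test]
        y_train = [y[i] for i in train]
        if len(set(y_test)) < 2 or len(set(y_train)) < 2:
            continue  # AUC is undefined unless both classes are present
        scorer = fit_mean_difference_scorer([X[i] for i in train], y_train)
        aucs.append(auc(y_test, [scorer(X[i]) for i in test]))
    return aucs


def summarize(aucs):
    """Mean AUC with 2.5th and 97.5th percentiles as an approximate 95% interval."""
    cuts = statistics.quantiles(aucs, n=40, method="inclusive")
    return statistics.fmean(aucs), cuts[0], cuts[-1]


# Synthetic data: 24 cases and 96 controls, one informative and one noise feature.
rng = random.Random(1)
y = [1] * 24 + [0] * 96
X = [[rng.gauss(28 if label else 32, 6), rng.gauss(0, 1)] for label in y]

aucs = repeated_split_auc(X, y)
mean_auc, lower, upper = summarize(aucs)
print(f"mean AUC {mean_auc:.2f} (95% interval {lower:.2f}-{upper:.2f}) over {len(aucs)} splits")
```

With few outcome events in each held-out fold, the per-split AUC varies widely, which is one way a small dataset translates into wide intervals.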
Supplemental Table S1. TRIPOD checklist.
Section/Topic | Item | Checklist Item | Page
Title and abstract
Title | 1 | Identify the study as developing and/or validating a multivariable prediction model, the target population, and the outcome to be predicted. | 1
Abstract | 2 | Provide a summary of objectives, study design, setting, participants, sample size, predictors, outcome, statistical analysis, results, and conclusions. | 2
Introduction
Background and objectives | 3a | Explain the medical context (including whether diagnostic or prognostic) and rationale for developing or validating the multivariable prediction model, including references to existing models. | 2-3
Background and objectives | 3b | Specify the objectives, including whether the study describes the development or validation of the model or both. | 3
Methods
Source of data | 4a | Describe the study design or source of data (e.g., randomized trial, cohort, or registry data), separately for the development and validation data sets, if applicable. | 3-4
Source of data | 4b | Specify the key study dates, including start of accrual; end of accrual; and, if applicable, end of follow-up. | 3-4
Participants | 5a | Specify key elements of the study setting (e.g., primary care, secondary care, general population) including number and location of centres. | 3-4
Participants | 5b | Describe eligibility criteria for participants. | 3-4
Participants | 5c | Give details of treatments received, if relevant. | N/A
Outcome | 6a | Clearly define the outcome that is predicted by the prediction model, including how and when assessed. | 4
Outcome | 6b | Report any actions to blind assessment of the outcome to be predicted. | N/A
Predictors | 7a | Clearly define all predictors used in developing or validating the multivariable prediction model, including how and when they were measured. | 4
Predictors | 7b | Report any actions to blind assessment of predictors for the outcome and other predictors. | N/A
Sample size | 8 | Explain how the study size was arrived at. | 7
Missing data | 9 | Describe how missing data were handled (e.g., complete-case analysis, single imputation, multiple imputation) with details of any imputation method. | 5
Statistical analysis methods | 10a | Describe how predictors were handled in the analyses. | 5
Statistical analysis methods | 10b | Specify type of model, all model-building procedures (including any predictor selection), and method for internal validation. | 5
Statistical analysis methods | 10d | Specify all measures used to assess model performance and, if relevant, to compare multiple models. | 5
Risk groups | 11 | Provide details on how risk groups were created, if done. | N/A
Results
Participants | 13a | Describe the flow of participants through the study, including the number of participants with and without the outcome and, if applicable, a summary of the follow-up time. A diagram may be helpful. | 7
Participants | 13b | Describe the characteristics of the participants (basic demographics, clinical features, available predictors), including the number of participants with missing data for predictors and outcome. | 6-8
Model development | 14a | Specify the number of participants and outcome events in each analysis. | 6-8
Model development | 14b | If done, report the unadjusted association between each candidate predictor and outcome. | N/A
Model specification | 15a | Present the full prediction model to allow predictions for individuals (i.e., all regression coefficients, and model intercept or baseline survival at a given time point). | 9-10
Model specification | 15b | Explain how to use the prediction model. | 9-11
Model performance | 16 | Report performance measures (with CIs) for the prediction model. | 9-10
Discussion
Limitations | 18 | Discuss any limitations of the study (such as nonrepresentative sample, few events per predictor, missing data). | 11-12
Interpretation | 19b | Give an overall interpretation of the results, considering objectives, limitations, and results from similar studies, and other relevant evidence. | 11-12
Implications | 20 | Discuss the potential clinical use of the model and implications for future research. | 11-12
Other information
Supplementary information | 21 | Provide information about the availability of supplementary resources, such as study protocol, Web calculator, and data sets. | 12-14
Funding | 22 | Give the source of funding and the role of the funders for the present study. | 12
Supplemental Table S2. MODIS satellite-derived climate and environmental predictors for
acute SFGR and non-SFGR febrile illness, Kilimanjaro region, Tanzania 2007-2008
Satellite predictor (months lagged) | Overall, N=449 (1) | Non-SFGR febrile illness, N=378 (1) | Acute SFGR, N=71 (1) | p-value (2)
Mean EVI (1) | 0.28 (0.22, 0.30) | 0.28 (0.23, 0.30) | 0.28 (0.22, 0.29) | 0.2
Mean NDVI (1) | 0.47 (0.42, 0.52) | 0.48 (0.42, 0.53) | 0.47 (0.42, 0.50) | 0.5
Maximum EVI (1) | 0.71 (0.65, 0.73) | 0.71 (0.65, 0.73) | 0.71 (0.69, 0.72) | >0.9
Maximum NDVI (1) | 0.927 (0.923, 0.937) | 0.927 (0.923, 0.941) | 0.927 (0.923, 0.933) | 0.9
Minimum EVI (1) | -0.16 (-0.19, -0.12) | -0.16 (-0.19, -0.12) | -0.16 (-0.18, -0.12) | 0.073
Minimum NDVI (1) | -0.167 (-0.188, -0.158) | -0.167 (-0.188, -0.149) | -0.167 (-0.186, -0.158) | >0.9
Mean NDWI (1) | 0.22 (0.21, 0.30) | 0.23 (0.20, 0.30) | 0.22 (0.21, 0.24) | 0.4
Minimum NDWI (1) | 1.11 (1.01, 1.29) | 1.12 (1.03, 1.29) | 1.10 (0.93, 1.14) | 0.008
Maximum NDWI (1) | 0.25 (0.22, 0.29) | 0.25 (0.22, 0.29) | 0.24 (0.22, 0.26) | 0.087
Mean evapotranspiration (1) | 13.7 (11.1, 17.5) | 13.7 (11.1, 17.5) | 12.9 (12.0, 14.6) | 0.2
Minimum evapotranspiration (1) | 3.95 (3.25, 4.60) | 3.95 (3.25, 5.00) | 3.90 (3.25, 4.60) | 0.6
Maximum evapotranspiration (1) | 61 (56, 63) | 61 (58, 63) | 61 (55, 64) | 0.4
Nighttime Mean Temp. (1) | 288.04 K (286.27, 289.02) | 288.04 K (286.27, 288.93) | 288.86 K (287.42, 290.59) | 0.008
Nighttime Maximum Temp. (1) | 295.25 K (293.53, 295.86) | 294.95 K (293.53, 295.86) | 295.76 K (294.34, 297.77) | 0.005
Nighttime Minimum Temp. (1) | 266.7 K (265.0, 269.2) | 266.4 K (265.0, 269.2) | 266.8 K (264.9, 269.2) | 0.4
Daytime Mean Temp. (1) | 304.1 K (298.7, 306.3) | 303.9 K (298.7, 306.4) | 304.1 K (298.8, 306.3) | 0.14
Daytime Maximum Temp. (1) | 317.8 K (309.8, 321.1) | 317.7 K (309.8, 321.1) | 318.1 K (310.5, 321.0) | 0.3
Daytime Minimum Temp. (1) | 275.1 K (273.7, 278.6) | 275.1 K (273.7, 278.3) | 276.5 K (275.1, 279.5) | 0.023
Mean EVI (2) | 0.28 (0.22, 0.29) | 0.28 (0.22, 0.29) | 0.26 (0.23, 0.28) | 0.2
Mean NDVI (2) | 0.47 (0.43, 0.50) | 0.47 (0.43, 0.50) | 0.45 (0.43, 0.50) | 0.5
Maximum EVI (2) | 0.71 (0.69, 0.74) | 0.71 (0.69, 0.74) | 0.72 (0.69, 0.74) | 0.2
Maximum NDVI (2) | 0.927 (0.923, 0.936) | 0.927 (0.923, 0.936) | 0.928 (0.923, 0.933) | 0.7
Minimum EVI (2) | -0.16 (-0.18, -0.12) | -0.16 (-0.18, -0.12) | -0.15 (-0.17, -0.10) | 0.10
Minimum NDVI (2) | -0.167 (-0.188, -0.158) | -0.167 (-0.188, -0.158) | -0.164 (-0.186, -0.149) | 0.3
Mean NDWI (2) | 0.22 (0.18, 0.24) | 0.22 (0.15, 0.25) | 0.22 (0.21, 0.23) | >0.9
Minimum NDWI (2) | 1.12 (1.03, 1.22) | 1.12 (1.05, 1.29) | 1.12 (1.01, 1.14) | 0.011
Maximum NDWI (2) | 0.25 (0.22, 0.27) | 0.25 (0.22, 0.27) | 0.24 (0.22, 0.26) | 0.7
Mean evapotranspiration (2) | 13.22 (11.47, 15.43) | 13.71 (11.47, 15.43) | 13.22 (12.88, 14.55) | 0.4
Minimum evapotranspiration (2) | 3.90 (3.25, 4.55) | 3.90 (3.25, 4.55) | 3.70 (3.25, 4.15) | 0.038
Maximum evapotranspiration (2) | 61.1 (58.0, 63.1) | 61.1 (58.2, 63.1) | 60.0 (55.4, 63.1) | 0.3
Nighttime Mean Temp. (2) | 288.54 K (287.23, 288.93) | 288.04 K (287.23, 288.93) | 288.54 K (287.36, 289.05) | 0.5
Nighttime Maximum Temp. (2) | 295.25 K (293.53, 296.41) | 295.25 K (293.53, 295.86) | 295.53 K (293.57, 297.30) | 0.2
Nighttime Minimum Temp. (2) | 266.8 K (265.1, 270.0) | 266.8 K (265.1, 270.0) | 266.8 K (264.5, 270.0) | 0.6
Daytime Mean Temp. (2) | 304.1 K (299.8, 306.7) | 304.1 K (299.8, 306.7) | 304.9 K (303.6, 306.7) | 0.2
Daytime Maximum Temp. (2) | 318.1 K (310.5, 321.1) | 318.1 K (310.5, 321.1) | 320.0 K (317.6, 321.1) | 0.15
Daytime Minimum Temp. (2) | 275.6 K (273.7, 278.6) | 275.4 K (273.7, 278.6) | 275.6 K (273.6, 278.3) | >0.9
Mean EVI (3) | 0.26 (0.23, 0.29) | 0.26 (0.22, 0.29) | 0.25 (0.24, 0.28) | 0.5
Mean NDVI (3) | 0.45 (0.41, 0.49) | 0.45 (0.41, 0.49) | 0.45 (0.44, 0.52) | 0.11
Maximum EVI (3) | 0.71 (0.69, 0.73) | 0.71 (0.65, 0.73) | 0.71 (0.69, 0.74) | 0.5
Maximum NDVI (3) | 0.927 (0.923, 0.936) | 0.927 (0.923, 0.936) | 0.927 (0.918, 0.936) | 0.3
Minimum EVI (3) | -0.16 (-0.18, -0.12) | -0.16 (-0.18, -0.12) | -0.16 (-0.18, -0.12) | >0.9
Minimum NDVI (3) | -0.169 (-0.188, -0.158) | -0.169 (-0.188, -0.158) | -0.183 (-0.188, -0.158) | 0.5
Mean NDWI (3) | 0.22 (0.15, 0.23) | 0.22 (0.15, 0.23) | 0.22 (0.15, 0.23) | >0.9
Minimum NDWI (3) | 1.12 (1.05, 1.26) | 1.11 (1.05, 1.26) | 1.13 (1.06, 1.29) | 0.15
Maximum NDWI (3) | 0.24 (0.22, 0.27) | 0.24 (0.22, 0.27) | 0.24 (0.22, 0.28) | 0.2
Mean evapotranspiration (3) | 12.93 (11.42, 14.55) | 12.88 (11.36, 14.55) | 13.71 (11.44, 15.43) | 0.5
Minimum evapotranspiration (3) | 3.90 (3.25, 4.35) | 3.90 (3.25, 4.35) | 4.00 (3.25, 4.55) | 0.8
Maximum evapotranspiration (3) | 61.1 (57.2, 63.7) | 61.1 (56.4, 63.1) | 62.4 (58.0, 67.4) | 0.024
Nighttime Mean Temp. (3) | 288.72 K (287.23, 289.16) | 288.54 K (287.23, 289.16) | 288.72 K (287.35, 288.97) | 0.15
Nighttime Maximum Temp. (3) | 295.53 K (293.57, 296.41) | 295.53 K (293.57, 296.41) | 295.74 K (293.66, 296.41) | 0.4
Nighttime Minimum Temp. (3) | 266.8 K (265.4, 269.2) | 266.8 K (265.5, 269.2) | 266.7 K (265.1, 269.2) | 0.5
Daytime Mean Temp. (3) | 304.89 K (299.81, 306.66) | 304.59 K (299.81, 306.66) | 306.14 K (300.47, 307.02) | 0.10
Daytime Maximum Temp. (3) | 319.3 K (311.0, 321.4) | 318.7 K (311.0, 321.4) | 320.3 K (311.7, 321.4) | 0.14
Daytime Minimum Temp. (3) | 276.8 K (274.0, 278.7) | 276.8 K (273.7, 278.7) | 276.6 K (274.0, 278.7) | 0.8
(1) Median (IQR)
(2) Wilcoxon rank sum test