Abstract
Standardized lab tests are central to patient evaluation, differential diagnosis and treatment. Interpretation of these data nevertheless lacks quantitative and personalized metrics. Here we report on the modeling of 2.1 billion lab measurements of 92 different lab tests from 2.8 million adults over a span of 18 years. Following unsupervised filtering of 131 chronic conditions and 5,223 drug–test pairs, we performed a virtual survey of lab test distributions in healthy individuals. Age and sex alone explain less than 10% of the within-normal test variance in 89 out of 92 tests. Personalized models based on patients’ history explain 60% of the variance for 17 tests and over 36% for half of the tests. This allows for systematic stratification of the risk for future abnormal test levels and subsequent emerging disease. Multivariate modeling of within-normal lab tests can be readily implemented as a basis for quantitative patient evaluation.
Data availability
Due to privacy regulations, all data analysis was conducted on a secured de-identified dedicated server within the Clalit Healthcare environment. Requests for access to all or parts of the Clalit datasets should be addressed to Clalit Healthcare Services, via the Clalit Research Institute. Requests will be considered by the Clalit Data Access committee given the Clalit data-sharing policy. Summary statistics and lab ranges are available at https://tanaylab.weizmann.ac.il/labs/.
Code availability
Specific scripts and infrastructure software are tailored to the EHR data representation in the Clalit system and therefore cannot be provided as is. Requests for software modules as a basis for adaptation to other EHR environments, or for use of the software on Clalit data, should be addressed to the authors after data access is approved by the Clalit Data Access committee.
Acknowledgements
We thank the Division of Information Systems and Digital at Clalit and the Clalit Research Institute for help with data transition and de-identification within the Clalit–Weizmann Initiative. We thank members of the Tanay group for discussions. Research was supported in part by the D. Dan and Betty Kahn Foundation, Israel precision medicine program and the European Research Council.
Author information
Contributions
N.M.C., R.K., A.L., O.S., G.B. and A.T. conceived and designed the study; N.M.C., R.J., A.L., M.H. and A.T. developed the software and pipeline; R.B. provided access and initial context to the data; N.M.C. and O.S. analyzed data with help from A.L., R.J., L.I.S., G.B. and A.T.; N.M.C., O.S. and A.T. wrote the manuscript with input from all authors.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Medicine thanks Marina Sirota and the other, anonymous, reviewer(s) for their contribution to this work. Joao Monteiro was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Lab ascertainment bias and female age trend summary.
a) EHR lab ascertainment bias. Bars indicate the percentage of lab measurements (y-axis) by age (x-axis) according to inferred patient status (color) for females (left) and males (right). Status is determined by screening time intervals under the effect of diagnoses (dx) affecting overall survival, medications (med) that correlate with test value alteration, pregnancy, or hospitalization (hosp.). b) Global female age trends. Median age-controlled lab values were normalized against the matching distributions in healthy 20-year-old individuals. The heatmap depicts clusters of lab tests showing correlated age-linked trends. Data from males can be found in Fig. 1d.
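The normalization described in panel b can be illustrated with a minimal sketch. It assumes that "normalized" means expressing a value as its empirical quantile within a reference distribution; the function name, the mid-rank convention for ties and the use of a plain list as the reference are illustrative assumptions, not the authors' implementation.

```python
from bisect import bisect_left, bisect_right
from statistics import median


def empirical_quantile(value, reference):
    """Mid-rank empirical quantile of `value` within `reference` (hypothetical helper)."""
    ref = sorted(reference)
    if not ref:
        raise ValueError("reference distribution is empty")
    below = bisect_left(ref, value)          # values strictly smaller than `value`
    below_or_equal = bisect_right(ref, value)  # values smaller than or equal to `value`
    return (below + below_or_equal) / (2 * len(ref))


def normalized_median(values_at_age, healthy_reference):
    """Place the median of one age group on the quantile scale of the reference group."""
    return empirical_quantile(median(values_at_age), healthy_reference)
```

For example, `normalized_median(older_group, healthy_20yo)` returns a number in [0, 1] that states where the older group's median falls within the distribution of the healthy 20-year-old reference; both argument names are placeholders.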
Extended Data Fig. 2 Lab distribution age trend by BMI and socio-economic classification.
a) Shown are lab medians (y-axis) by age (x-axis) for females (left) and males (right) for low BMI patients (BMI < 25, blue) and high BMI patients (BMI > 25, red). b) Similar to a, lab medians by age stratified by socio-economic classification (Methods).
Extended Data Fig. 3 Hazards ratio by sex.
Log hazard ratios derived by a Cox proportional hazards regression model using normalized lab values. Hazard ratios were computed for all healthy patients according to age (columns) for males (top) and females (bottom), grouped by mean normalized lab levels reported in the year prior to the index date (Jan 1st 2011). For reference, we also report hazard ratios for healthy patients with out-of-range labs in the same period. Error bars represent standard errors.
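As a rough illustration of how a log hazard ratio and its standard error arise from a Cox model, the sketch below fits a single-covariate model by Newton-Raphson on the partial likelihood. It is a simplified stand-in under stated assumptions (one covariate, Breslow handling of ties, no stratification); the function name and input layout are hypothetical, and the authors' pipeline is not available here.

```python
from math import exp, sqrt


def cox_single_covariate(durations, events, x, max_iter=50, tol=1e-9):
    """Fit h(t | x) = h0(t) * exp(beta * x); return (beta, standard_error).

    `beta` is the log hazard ratio per unit of x. Ties use the Breslow approximation.
    """
    beta = 0.0
    info = 0.0
    for _ in range(max_iter):
        score = 0.0  # first derivative of the partial log-likelihood
        info = 0.0   # observed information (negative second derivative)
        for t_i, e_i, x_i in zip(durations, events, x):
            if not e_i:
                continue  # censored subjects contribute only through risk sets
            # Risk set: everyone still under observation at time t_i.
            risk = [x_j for t_j, x_j in zip(durations, x) if t_j >= t_i]
            weights = [exp(beta * x_j) for x_j in risk]
            s0 = sum(weights)
            s1 = sum(w * x_j for w, x_j in zip(weights, risk))
            s2 = sum(w * x_j * x_j for w, x_j in zip(weights, risk))
            mean = s1 / s0
            score += x_i - mean
            info += s2 / s0 - mean * mean
        if info <= 0:
            raise ValueError("covariate has no variation within the risk sets")
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta, 1.0 / sqrt(info)
```

Here `x` could be an indicator for membership in one normalized-lab-level group versus a reference group, so that `beta` is the log hazard ratio plotted and the second return value is the error bar. In practice an established survival library would be preferable to this hand-rolled fit.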
Extended Data Fig. 4 Hazards ratio by BMI and lab covariation.
a) Log hazard ratios derived by a Cox proportional hazards regression model using normalized lab values. Hazard ratios were computed for all healthy patients aged 60 to 65 for low BMI (top) and high BMI (bottom), grouped by mean normalized lab levels reported in the year prior to the index date (Jan 1st 2011). For reference, we also report hazard ratios for healthy patients with out-of-range labs in the same period. Error bars represent standard errors. b) Lab covariation. Shown are two-dimensional distributions of pairs of age/gender-normalized lab values for patients across all ages (20–90).
Extended Data Fig. 5 Personalization index in females.
The change in 5-year personalization at older age (70–75; y-axis) is shown against 5-year personalization in females aged 40–45 (x-axis). Labs are color coded by the age-40 personalization index.
Extended Data Fig. 6 Validation of heritability estimates.
a) Comparison of calculated h2 scores (y-axis) and heritability estimates in the external RIFTEHR dataset (x-axis). Pearson correlation 0.5. b) Similar to a, with external data obtained from UK Biobank (http://www.nealelab.is/uk-biobank). Pearson correlation 0.37. c) Scatterplot of heritability score (h2, y-axis) and personalization index in young males (ages 25 to 30; x-axis). d) Similar to c, with the personalization index computed for older males (ages 70 to 75).
Extended Data Fig. 7 Imputation models design schematics and validation.
a) Imputation process applied to all adult patients (age > 20) in the Clalit EHR and for each lab test. Imputation was applied at 6-month time resolution, only at times where no measured data were available in the prior 6-month period but lab data were either measured or imputed in the 2-year period prior to imputation. b) Multi-lab, multi-time-point model training schematic. c) Shown are the r2 values for the imputation model (single time point, single lab) computed using data up to 6 months prior to prediction time (blue), compared to two-year prediction of the multi-time, multi-lab regression model (green) applied to patients of different ages (x-axis, female/male panels).
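The eligibility rule in panel a can be sketched as follows. This is a minimal reading of the caption, not the authors' code: time is counted in whole months, the exact window boundaries (half-open intervals) are assumptions, and `imputation_schedule` is a hypothetical name. The sketch only decides when a value would be imputed; it does not compute the imputed value.

```python
def imputation_schedule(measured_months, start, end, step=6, lookback=24):
    """Return the grid times (in months) at which a lab value would be imputed.

    A grid time t qualifies when no measurement falls in (t - step, t] and at least
    one measured or previously imputed value falls in [t - lookback, t).
    """
    measured = sorted(set(measured_months))
    imputed = []
    for t in range(start, end + 1, step):
        # Skip grid times already covered by a real measurement in the last 6 months.
        if any(t - step < m <= t for m in measured):
            continue
        # Require supporting data (measured or imputed) within the prior 2 years.
        if any(t - lookback <= s < t for s in measured + imputed):
            imputed.append(t)
    return imputed
```

With a single measurement at month 30 and a grid from month 0 to month 48, this rule imputes at months 36, 42 and 48: earlier grid times have no supporting data, and month 30 itself is covered by the measurement. Because imputed values count as support, imputation can chain forward from one measurement under this reading.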
Extended Data Fig. 8 Effect of noise and data size on regression model performance.
a) Shown are the r2 values for regression models, with variable degrees of noise, predicting lab test values two years forward in time for patients of different ages (x-axis, female/male panels) using single-time-point, single-lab models (top) and multi-lab, single-time-point models (bottom). b) Similar to a, with variable data size used for model training, comparing all healthy patients, 50% downsampling of the data and 10% downsampling of the data.
Extended Data Fig. 9 Regression models feature contribution.
a) Feature contribution was assessed by the Shapley values framework, applied to each single-lab, single-time-point regression model. The heatmap depicts, for each lab regression model (x-axis), the log2 relative absolute Shapley value for each feature (y-axis). Features were clustered using hierarchical clustering. Top and right annotation colors reflect feature/lab regression type. b) Top regression model features. Shown are Shapley values for top features for selected regression models. Bar colors indicate type of feature.
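One plausible way to arrive at a "log2 relative absolute Shapley value" per feature is sketched below. The caption does not define the normalization, so this assumes "relative" means each feature's mean absolute Shapley value divided by the sum over all features of the same model; the function name and input layout (one row of per-feature Shapley values per patient) are hypothetical.

```python
from math import log2


def log2_relative_abs_shapley(shap_rows, feature_names):
    """Summarize per-sample Shapley values into one log2 relative score per feature.

    `shap_rows` holds one list of per-feature Shapley values per sample.
    """
    if not shap_rows:
        raise ValueError("no Shapley values supplied")
    n = len(shap_rows)
    # Mean absolute contribution of each feature across samples.
    mean_abs = [
        sum(abs(row[j]) for row in shap_rows) / n for j in range(len(feature_names))
    ]
    total = sum(mean_abs)
    if total == 0:
        raise ValueError("all Shapley values are zero")
    # Features with zero contribution map to -inf rather than raising in log2.
    return {
        name: (log2(m / total) if m > 0 else float("-inf"))
        for name, m in zip(feature_names, mean_abs)
    }
```

Under this convention a feature that carries half of the total absolute contribution scores -1, and a one-unit difference between two features means one contributes twice as much as the other.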
Extended Data Fig. 10 Diabetes Mellitus incidence rate and predictive models.
a) Incidence rate of DM in the Clalit population. Shown is the number of new T2D cases per 100,000 per year (y-axis) by age (x-axis) for males (light grey) and females (dark grey). Error bars indicate 95% confidence intervals. b) Kaplan–Meier estimates for six-year DM cumulative incidence stratified by fasting glucose (FG) lab test. Healthy patients aged 50–60 at the index date (Jan 1st 2011) were classified according to their most recent fasting glucose value in the past year: FG [90, 95) (n = 46970), FG [95, 100) (n = 28927), FG [100, 105) (n = 13921), FG [105, 110) (n = 5989), FG [110, 115) (n = 2911) and FG [115, 120) (n = 1504). Error bars indicate 95% confidence intervals. c) Cumulative incidence of new T2DM diagnosis computed by the Kaplan–Meier method, stratified by models trained on raw lab data. 95% confidence intervals are shown in lighter colors. Similar to Fig. 4e, but trained on raw lab values rather than quantile-normalized values. Note that raw imputation data were computed via quantile-normalized values. Model prediction in this case is robust to the use of raw values and provides performance similar to that obtained with quantile-normalized values.
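The stratification in panel b can be sketched in two steps: assign each patient to a half-open fasting glucose bin, then compute a Kaplan–Meier cumulative incidence curve per bin. The sketch below is illustrative only; `fg_bin` and `km_cumulative_incidence` are hypothetical names, confidence intervals are omitted, and the input layout (follow-up time plus an event flag per patient) is an assumption.

```python
from collections import Counter


def fg_bin(value, lo=90, hi=120, width=5):
    """Return the half-open bin (start, end) containing `value`, or None if out of range."""
    if not lo <= value < hi:
        return None
    start = lo + width * int((value - lo) // width)
    return (start, start + width)


def km_cumulative_incidence(durations, events):
    """Kaplan-Meier cumulative incidence, 1 - S(t), at each observed event time.

    `durations` are follow-up times; `events` flags diagnosis (1) versus censoring (0).
    """
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    exit_counts = Counter(durations)  # everyone leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, 1 - survival))
        at_risk -= exit_counts[t]
    return curve
```

Grouping patients by `fg_bin(most_recent_fg)` and calling `km_cumulative_incidence` on each group yields one curve per bin, which corresponds to the per-bin curves described in the caption.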
Cite this article
Cohen, N.M., Schwartzman, O., Jaschek, R. et al. Personalized lab test models to quantify disease potentials in healthy individuals. Nat Med 27, 1582–1591 (2021). https://doi.org/10.1038/s41591-021-01468-6
DOI: https://doi.org/10.1038/s41591-021-01468-6