Article
Open access
Published: 25 July 2024

Polygenic risk scores as a marker for epilepsy risk across lifetime and after unspecified seizure events

Nature Communications volume 15, Article number: 6277 (2024)


Abstract

A diagnosis of epilepsy has significant consequences for an individual but is often challenging in clinical practice. Novel biomarkers are thus greatly needed. Here, we investigated how common genetic factors (epilepsy polygenic risk scores, [PRSs]) influence epilepsy risk in detailed longitudinal electronic health records (EHRs) of > 700k Finns and Estonians. We found that a high genetic generalized epilepsy PRS (PRS_GGE) increased risk for genetic generalized epilepsy (GGE) (hazard ratio [HR] 1.73 per PRS_GGE standard deviation [SD]) across lifetime and within 10 years after an unspecified seizure event. The effect of PRS_GGE was significantly larger on idiopathic generalized epilepsies, in females and for earlier epilepsy onset. Analogously, we found significant but more modest focal epilepsy PRS burden associated with non-acquired focal epilepsy (NAFE). Here, we outline the potential of epilepsy specific PRSs to serve as biomarkers after a first seizure event.

Introduction

Epilepsy is a serious neurological disorder characterized by unprovoked seizures, which affects up to 1% of individuals worldwide (WHO, 2019), with children and the elderly being particularly affected. Although epilepsy can be caused by acquired conditions such as stroke, tumor or head injury, most cases (ca. 70–80%) are due to genetic influences¹, including rare and common genetic variants. Diagnosing epilepsy is often challenging^2,3,4 and many individuals are initially misdiagnosed⁵. An epilepsy diagnosis is potentially lifesaving, given the 3x elevated mortality risk in epilepsy (WHO, 2019). Epilepsy-related deaths can be prevented by antiseizure medication (ASM), which however often has adverse effects⁶. Thus, correct epilepsy diagnosis is crucial, but the most widely used diagnostic tool in epilepsy, the electroencephalogram (EEG), has quite variable sensitivity and specificity across clinical settings, ranging from about 17–58% and 70–98%, respectively^7,8, and only moderate inter-rater agreement⁹, illustrating a need for additional biomarkers³. Because of this importance and difficulty, specific ‘first seizure clinics’ are solely dedicated to investigating an epilepsy diagnosis after a new-onset seizure². Until 2014, epilepsy was defined by the International League Against Epilepsy as having two unprovoked seizures >24 h apart¹⁰. This definition was then extended to two additional scenarios: 2) a diagnosis of an epilepsy syndrome or 3) one unprovoked seizure and a probability of further seizures with a recurrence risk of at least 60% over the next 10 years¹⁰.

The common epilepsies can be broadly categorized into genetic generalized (GGE) and non-acquired focal epilepsies (NAFE), where the latter originate from a particular brain area¹⁰. First-degree relatives of patients with GGE had an 8.3-fold increased risk of developing GGE while first-degree relatives of patients with NAFE had a 2.5-fold increased risk of developing NAFE, compared to the general population, respectively¹¹. In agreement, the SNP-heritability (i.e., the variance of GGE attributed to common genetic variants) is approximately 30–40%^12,13,14 which is relatively high compared to other common diseases¹⁵. The same measure is more moderate for NAFE with SNP-heritability of about 9-16%^12,13. Previous genome-wide association studies have shown that common variants contribute more substantially to the more common forms of epilepsy¹². There is only a modest burden of ultra-rare genetic variants in GGE and NAFE; rare variants likely contribute only a small fraction towards their heritability¹⁶ and there are few Mendelian disease genes exclusively associated with them¹⁷.

Recent research has shown that common genetic variants with small effects on specific diseases can be combined into “polygenic” risk scores (PRSs), with high disease-specific PRSs conferring risk comparable to that of rare monogenic variants¹⁸. Thus, interest in PRSs as a potentially clinically important diagnostic tool is growing^19,20,21,22,23. It was recently shown that individuals with epilepsy had a significantly higher epilepsy PRS compared to unaffected controls^24,25. However, how epilepsy PRSs may predict epilepsy risk in specific clinical scenarios has not yet been investigated. Here, we thus investigate how epilepsy PRSs can stratify epilepsy risk across lifetime and after unspecified seizure events.

Results

Electronic health records accurately represent epilepsy diagnoses

We investigated epilepsy PRSs in detailed longitudinal electronic health records (EHR) from the FinnGen project^26,27 using FinnGen data freeze R12 (n = 520,105, 282,064 females) and the Estonian biobank²⁸ as a validation cohort (for further study sample characteristics, see Table 1). We further explored the BioMe cohort²⁹ with regard to non-European ancestries. Our phenotype data were derived from ICD codes and from ASM purchases and reimbursements in official state registries spanning up to 50 years. We defined non-acquired focal epilepsy (=NAFE) by having ≥2 NAFE-specific ICD codes and genetic generalized epilepsy (=GGE) by having ≥2 GGE-specific ICD codes, respectively. Additionally, we required ≥2 ASM purchases for a NAFE or GGE category (Supplementary Fig. 1). Further details on epilepsy case definitions can be found in Supplementary Tables 1 and 2 and Methods. Sample numbers are given in Table 1. We also investigated 226 individuals with ≥2 traditional idiopathic generalized epilepsy diagnoses (=IGE)³⁰, i.e. Childhood/Juvenile Absence Epilepsy (ICD 40.33/35), Juvenile Myoclonic Epilepsy (ICD 40.36) and Generalized Tonic–Clonic Seizures Alone (ICD 40.34). Individuals’ age at first epilepsy diagnosis was in line with the known age of onset of the respective IGE syndromes, supporting our EHR-derived diagnoses (Supplementary Fig. 2).

Table 1 Descriptive Statistics of main study and replication cohort

Epilepsy PRS is most elevated in GGE, specifically IGE, with the same effect sizes as in clinically curated epilepsy cohorts

We then calculated epilepsy PRSs to determine individuals’ genetic burden for epilepsy. Here, we used the 2023 summary statistics of the International League Against Epilepsy (ILAE) genome-wide association study (GWAS)¹³ as discovery data, i.e. to determine which genetic variants increase or decrease epilepsy risk. We then summed thousands of genetic risk/protective variants for epilepsy with individually small effects into a single epilepsy PRS per individual. Here, we constructed separate focal epilepsy PRS (PRS_NAFE) and generalized epilepsy PRS (PRS_GGE). We found a significant elevation of PRS_GGE in 924 individuals with GGE (Fig. 1A) which was particularly pronounced in IGE (see also next paragraph and Fig. 1B). We also found a significant elevation of PRS_NAFE in 5509 individuals with NAFE (Fig. 1C), but no significant elevation of PRS_GGE in individuals with unspecified seizures (Fig. 1D). Similarly, we found a high correlation (Pearson’s correlation coefficient = 0.91, p-value = 4 × 10⁻⁸) between PRS_GGE decile and GGE prevalence in our data (Fig. 2). Overall, we can thus confirm previous studies that used PRS as a marker for genetic liability of common epilepsy types. Importantly, we find very similar respective effect sizes of PRS_GGE and PRS_NAFE on GGE and NAFE as reported in previous cohorts from Epi25 or the Cleveland Clinic²⁴ when using the same GWAS¹² in an earlier version of the manuscript³¹ or the updated GWAS¹³ (see Table 2). We thus consider it likely that the epilepsy phenotypes in our biobank data are comparable to the phenotypes curated according to clinical criteria in these cohorts.
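The decile comparison described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline (which was written in R): the function names `decile_prevalence` and `pearson` are hypothetical, the data are synthetic, and the sketch assumes at least 10 individuals.

```python
import math

def decile_prevalence(scores, is_case):
    """Case prevalence within each PRS decile (lowest to highest decile)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    prevalence = []
    for d in range(10):
        members = order[d * n // 10:(d + 1) * n // 10]
        prevalence.append(sum(is_case[i] for i in members) / len(members))
    return prevalence

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Synthetic toy cohort (not study data): 100 scores, cases enriched at high scores.
toy_scores = list(range(100))
toy_cases = [1 if i % 10 < i // 10 else 0 for i in range(100)]
toy_prevalence = decile_prevalence(toy_scores, toy_cases)
toy_r = pearson(list(range(1, 11)), toy_prevalence)
```

With only ten decile points, the correlation summarizes the trend across deciles rather than individual-level discrimination.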

**Fig. 1: Epilepsy PRS of epilepsy cases (red) compared to population controls (gray) (n = 273,974) (density curves).**

**Fig. 2: Correlation of PRS_GGE decile and GGE prevalence.**

Table 2 Enrichment of epilepsy cases in individuals with increasing epilepsy PRS

High PRS is associated with epilepsy across lifetime and after unspecified seizure events

We next investigated the effect of epilepsy PRS on epilepsy rates across lifetime, separately for PRS_GGE and PRS_NAFE. We stratified our cohort into bins of epilepsy PRS standard deviations (SD) (Fig. 3) and compared the cumulative epilepsy incidence in each SD bin to the rest of the cohort (for increased power). Individuals with a PRS_GGE > 2 SD (ca. 2% of the cohort) had a more than 4-fold higher lifetime risk of developing GGE than the rest of the cohort (Hazard ratio [HR]: 4.2, Confidence Interval [CI]: 3.2-5.4, p-value: 4 × 10⁻²⁷, method: Cox proportional hazards model³² [coxph], Fig. 3, panel A). The epilepsy risk decreased proportionally with the decreasing PRS_GGE SD bin. Overall, the HR increased by 1.73 per increased SD of PRS_GGE (Table 3, 95%-CI 1.62–1.86, p-value = 8 × 10⁻⁵⁵). When restricting to IGE, the HR per PRS_GGE SD was 2.4 (95%-CI 2.1–2.7, p-value 2 × 10⁻³⁴). Individuals with a PRS_GGE > 1 SD had a HR of 12.1 for IGE compared to those with PRS_GGE < −1 SD (95%-CI 6-25, p-value 3 × 10⁻¹¹, IGE rate in PRS_GGE < −1 SD: 8/75,114, IGE rate in PRS_GGE > 1 SD: 88/75,505). PRS discriminated GGE cases versus controls with a concordance index (C-index) of 0.64 (95%-CI 0.61–0.68) adjusting for the same covariates (birth year, sex, batch and PCs). Overall, we thus showed PRS_GGE as a significant biomarker for lifetime epilepsy risk.
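As a rough plausibility check on the tail comparison above, the crude (unadjusted) ratio of IGE proportions can be computed directly from the counts quoted in the text. This sketch is an illustration only; the helper name `crude_rate_ratio` is hypothetical, and the crude ratio ignores follow-up time and covariates, so it is not expected to equal the reported Cox hazard ratio of 12.1.

```python
def crude_rate_ratio(cases_high, n_high, cases_low, n_low):
    """Ratio of case proportions between two groups, without any adjustment."""
    return (cases_high / n_high) / (cases_low / n_low)

# Counts quoted in the text: IGE cases among individuals with
# PRS_GGE > 1 SD (88/75,505) and PRS_GGE < -1 SD (8/75,114).
ratio = crude_rate_ratio(88, 75_505, 8, 75_114)
print(f"crude IGE rate ratio: {ratio:.1f}")
```

The crude ratio of roughly 11 is of the same order as the adjusted estimate and lies well within its reported confidence interval of 6 to 25.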

**Fig. 3: Epilepsy PRS as a marker for epilepsy risk across lifetime and after unspecified seizure events.**

Table 3 Effects of PRS on epilepsy risk in biobanks FinnGen and Estonian biobank

However, the absolute risk of developing epilepsy is small across lifetime (<1%, see Fig. 3A/C), even for individuals with high epilepsy PRS. Lifetime risk prediction is thus less clinically meaningful. When considering the subset of individuals who were diagnosed with an unspecified seizure corresponding to ICD code R56.8/7803A at an age <40 years, their absolute risk for GGE increased compared to baseline (Fig. 3B). Within 10 years after the unspecified seizure, the GGE rate reached 42% in > 2 SD PRS_GGE compared to 4% in <−2 SD PRS_GGE (or 27% in > 1 SD PRS_GGE versus 8% in < −1 SD PRS_GGE). PRS_GGE affected relative epilepsy risk similarly after an unspecified seizure (HR per PRS_GGE SD: 1.5, 95%-CI: 1.3–1.8, p-value = 1 × 10⁻⁹, C-index 0.60, 95%-CI 0.53-0.67) as across lifetime. Similarly, PRS_NAFE had a significant but more modest effect on NAFE cumulative lifetime incidence (HR per PRS_NAFE SD: 1.13, 95%-CI: 1.09-1.17, p-value = 3 × 10⁻¹⁰) and after unspecified seizure (HR per PRS_NAFE SD: 1.075, 95%-CI: 1.014–1.14, p-value = 0.02), in line with a lower heritability of focal epilepsy (Fig. 3C, D). In addition, we tested the effect of a PRS_all-epilepsy computed from a GWAS of all epilepsy phenotypes, including unclassified epilepsy, on lifetime epilepsy. However, we found only limited association with lifetime risk of GGE, NAFE or any epilepsy (Supplementary Table 3).

We replicated analyses of PRS_GGE effects across lifetime in the Estonian biobank²⁸ (Estonia, European ancestry, Supplementary Fig. 3), obtaining similar estimates (see Table 3) and thus validating our results. We further explored the effects of PRS_GGE in individuals with diverse ancestries in the BioMe biobank (Supplementary Fig. 4, Supplementary Note), a biobank that links genetic and EHR data for more than 30,000 individuals from diverse ancestral and cultural backgrounds recruited primarily in the Mount Sinai Health System in New York City. While the effect of PRS_GGE in BioMe followed similar trends, our analyses were underpowered and thus did not reach significance. Further analyses are needed to investigate the portability of epilepsy PRS effects to other ancestry groups.

Epilepsy PRS has sex-specific effects on epilepsy subtypes

Studies of diseases other than epilepsy have previously reported sex-specific PRS effects and larger effects of PRS on disease in earlier age groups³³. Thus, we sought to investigate the effect of age at onset and sex on PRS effects on epilepsy. We found a significant interaction of sex and PRS_GGE on GGE case (n = 924) status (cox model p-value 0.002, regression p-value 0.02). We therefore next investigated the effect of PRS on lifetime epilepsy separately for men and women. PRS_GGE had a larger influence on lifetime GGE in females (HR_female per PRS SD: 1.9, 95%-CI 1.7–2.0, p-value = 1 × 10⁻⁴⁷, n_GGE = 543) than in males (HR_male per PRS SD: 1.5, 95%-CI 1.3–1.7, p-value = 2 × 10⁻¹¹, n_GGE = 381, Supplementary Fig. 5). We further found a higher prevalence of epilepsy in females, specifically with onset in the teenage to young adult range (Supplementary Fig. 6). Exploring sex-specific effects on specific epilepsy types, we found neither a significant effect of sex (p = 0.4) nor a PRS*sex interaction (p = 0.7) in IGE (n = 226) but found a significant effect of sex (p-value 7 × 10⁻⁴) and PRS*sex interaction (p-value 1 × 10⁻³) in non-IGE GGE (n = 657). Similarly, the effect of PRS_GGE on non-IGE GGE was substantially higher in females (HR_female 1.78, 95%-CI 1.60–1.99, p-value 5×10⁻²⁶, HR_male 1.32, 95%-CI 1.16–1.50, p-value 2 × 10⁻⁵) while it was quite comparable for IGE (HR_female 2.35, 95%-CI 1.16–1.60, p-value 1 × 10⁻²⁵; HR_male 2.57, 95% CI 1.92–3.44, p-value 2 × 10⁻¹⁰). We also found a significant interaction of sex and PRS_NAFE on NAFE (p-value 0.008) with slightly higher PRS_NAFE effects on NAFE in females (HR_male: 1.10, 95%-CI 1.06–1.15, p-value = 2 × 10⁻⁶; n = 2706, HR_female: 1.18, 95%-CI 1.13–1.22, p-value = 2 × 10⁻¹⁷, n = 2806).

Epilepsy PRS has a larger effect when epilepsy onset is earlier

We further explored whether epilepsy PRS effects differed by age of epilepsy onset. We thus divided our cohort into quintiles of age at first epilepsy diagnosis and found significant effects of PRS_GGE on GGE and of PRS_NAFE on NAFE case status in all age-at-onset bins except GGE onset > 60 years and NAFE onset > 80 years (method: logistic regression, see Supplementary Table 4). PRS effects were largest when individuals had earlier ages at first diagnosis: the effect of PRS_GGE on GGE was largest at onset 0-20 years (OR 1.9, 95%-CI 1.7–2.1, p-value 9 × 10⁻³⁸) and that of PRS_NAFE on NAFE at onset 20-40 years (OR 1.21, 95%-CI 1.14–1.28, p-value 4 × 10⁻¹⁰). This is in line with other illnesses³³. We next investigated whether the large genetic influences on IGE described above could be explained by a higher proportion of individuals with younger age at epilepsy onset in the IGE group. Within the GGE group, we therefore compared the effect of PRS_GGE on IGE versus non-IGE and still found a higher effect of PRS_GGE on IGE even when accounting for age at first epilepsy diagnosis (OR 1.58, 95%-CI 1.29–1.95, p-value 2 × 10⁻⁵).

PRS_GGE is specifically associated with GGE while PRS_NAFE is more heterogeneous

We aimed to investigate the phenotypes associated with a genetic epilepsy liability that are not epilepsy, in a hypothesis-free approach to elucidate whether genetic factors influence GGE/NAFE in a disease-specific manner. We thus performed a phenome-wide association study (PheWAS) testing the effect of PRS_GGE and PRS_NAFE on 2139 distinct disease phenotypes (method: logistic regression, FinnGen data freeze: R6, GWAS: ILAE 2018¹², Fig. 4). GGE (labeled as ‘Generalized Epilepsy’) is the only phenotype that is significantly affected by PRS_GGE after Bonferroni correction. We thus argue that PRS_GGE is very specifically associated with GGE increasing its potential diagnostic utility. While PRS_NAFE is expectedly associated with NAFE, multiple other phenotype associations are unexpected. The most significant ones are related to back pain, but also include hypertension, cardiovascular disease and depression medications, with lower significance. We tested the genetic correlation of NAFE and the 19 traits that were significant in our PheWAS (method: LD score regression^34,35, Supplementary Fig. 7). After multiple testing correction none remained significant. However, phenotypes ‘other anxiety disorders’ (r_g = 0.54, p-value = 0.02), ‘all anxiety disorders’ (r_g = 0.44, p-value = 0.02) and ‘depression medications’ (r_g = 0.33, p-value = 0.04) were genetically correlated with nominal significance.
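The Bonferroni correction used in the PheWAS above amounts to dividing the significance level by the number of tested phenotypes. The following is a minimal sketch; the family-wise significance level of 0.05 is an assumption (the text does not state it), and the helper names and phenotype labels in the example are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold under Bonferroni correction."""
    return alpha / n_tests

def significant_phenotypes(pvalues, alpha=0.05):
    """Names of phenotypes whose p-value passes the Bonferroni threshold.

    pvalues: dict mapping phenotype name -> p-value (one entry per test).
    alpha:   assumed family-wise significance level (not given in the text).
    """
    threshold = bonferroni_threshold(alpha, len(pvalues))
    return sorted(name for name, p in pvalues.items() if p < threshold)

# 2139 phenotypes were tested in the PheWAS; alpha = 0.05 is assumed here.
threshold = bonferroni_threshold(0.05, 2139)
print(f"per-phenotype threshold: {threshold:.2e}")
```

Under this assumption, a phenotype would need a p-value below roughly 2.3 × 10⁻⁵ to count as significant.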

**Fig. 4: Phenome-wide association study testing the effect of epilepsy PRS on 2139 distinct disease phenotypes in FinnGen.**

Discussion

The diagnosis of epilepsy is an important yet challenging clinical task; thus the need for novel biomarkers remains high. Recent studies demonstrated a genetic burden in the form of an elevated PRS_GGE in epilepsy cases versus controls^24,25, which we replicate in our data. The effect of PRS_GGE has however not been studied outside the case-control setting. Here, we investigate the effect of PRS_GGE longitudinally: on lifetime epilepsy, on epilepsy after an unspecified seizure event and on thousands of disease endpoints in other clinical areas.

In this study, we demonstrated that common genetic variants, in the form of PRS_GGE, have a significant quantitative effect on GGE cumulative incidence across lifetime and after unspecified seizure events. We reproduced the lifetime effect in another biobank, with hazard ratios of 3-4 for the upper tails of the PRS_GGE distribution, in line with previous studies²⁴. Predictions are modest with C-indices of ca. 0.6, but comparable to the performance of models using clinical variables (C-indices in similar ranges of ca. 0.6 reported in the MESS trial⁴ or in EEG studies^36,37). Thus, we expect PRSs to have potential utility as a supportive but not standalone tool. In our data, the effect of PRS_NAFE on NAFE across lifetime was also significant but more modest than for PRS_GGE and GGE, in line with other studies.

We were surprised to find that the effect of PRS_GGE on GGE was substantially larger in females than in males, which has not previously been reported. Previous studies reported a higher incidence of GGE in women^38,39, which we also observed in our cohort. This could be caused by a different epilepsy susceptibility in males and females mediated by biological or environmental sex-specific factors. It is likely not caused by different pathomechanisms, as a recent study found a high correlation of genetic effects on epilepsy in males and females¹³. However, we find sex-specific PRS effects predominantly for non-IGE, suggesting sex-specific genetic factors may differentially influence risk for specific epilepsy subtypes. Thus, further research is needed to elucidate how genetic factors may differently influence epilepsy between sexes.

The effect of PRS_GGE on GGE was quite specific with no significant effects on other diseases. However, we did not test the effects on non-disease phenotypes. Previously, high PRS_GGE and high PRS_NAFE were both associated with low educational attainment and neuroticism-related personality traits⁴⁰ which could result from epilepsy or side effects of ASMs or may also be pleiotropic effects. Apart from NAFE, PRS_NAFE had effects on other diseases including back pain, which was not previously reported; and anxiety/depression-related traits. Here, nominal significant genetic correlations of NAFE with anxiety disorders and depression medications are in line with previous reports in the UK biobank that individuals with high PRS_NAFE but without a NAFE diagnosis had more likely experienced anxiety or depression⁴⁰ pointing to a potential pleiotropic effect. Co-morbidities of chronic pain and depression have been previously reported⁴¹.

We see the highest potential clinical utility of epilepsy PRS in patient groups with a high absolute risk of having epilepsy, such as after an unspecified seizure event. Current clinical guidelines require at least one unprovoked seizure and at least a 60% chance of a second seizure to diagnose epilepsy¹⁰. In a clinical setting, the diagnosis is often not as quantifiable as the definition suggests and is heavily dependent on clinical expertise. We find, as an example, that individuals with a PRS_GGE > 2 SD have a > 3x higher risk of being diagnosed with GGE than the rest of the population. This includes individuals with unspecified seizure events who are at elevated risk for a later epilepsy diagnosis. After the exclusion of reversible causes for their unspecified seizure, a high PRS_GGE could support stratifying groups at risk for a second seizure in conjunction with an EEG, while other biomarkers are currently sparse³. Other recent studies suggest that PRS_GGE adds value beyond family history information^42,43. Practically, genetic testing is regularly done in pediatric epilepsy, and generation of PRS could thus potentially be integrated into an existing workflow. Here, integrating PRS with rare variants could also improve disease prognosis, as genetic background has been shown to influence how severely carriers of genetic variants with large disease effects⁴⁴, such as those causing Dravet syndrome⁴⁵, are affected. Another advantage is high cost-effectiveness, as PRS can be generated from genotype data that can also be repurposed from other disease areas²³.

Our study has several limitations. We have conducted most of our analyses in cohorts with European ancestry. As has been previously described for other diseases, the predictive ability of polygenic risk scores is heavily dependent on genetic ancestry¹⁴. While the effect of PRS_GGE on epilepsy showed similar trends in the primarily non-European BioMe cohort, sample sizes remained prohibitive. Further studies in diverse populations are thus needed. Another limitation is that our phenotype data are derived from EHRs. We thus cannot verify how many epilepsy cases have been confirmed by epileptologists. However, we obtain similar PRS effect sizes as in clinical cohorts²⁴, which supports our case definitions that combine EHR diagnoses with ASM purchase and reimbursement data. The central registry of Finnish EHR data has the unique advantage that reimbursements for ASMs are always based on a certificate made by a neurologist. In addition, while we excluded individuals that were also part of the discovery GWAS, we did not have the option to directly compare individual-level data between the discovery GWAS and our validation cohorts. We could thus not control for any potential relatedness between the cohorts, which has the potential to inflate our results⁴⁶.

Our data thus suggest an interesting potential for epilepsy PRS, specifically for PRS_GGE, as a biomarker for epilepsy risk, where it could, combined with clinical markers such as the EEG, improve epilepsy risk prediction. Our data outline how this could be specifically useful in situations of elevated epilepsy risk such as an unspecified seizure event. Ultimately, this needs to be investigated in a clinical setting.

Methods

This study complies with all relevant ethical regulations; the Ethics Committee of the Hospital District of Helsinki and Uusimaa approved the study protocol for FinnGen (Nr HUS/990/2017), the Estonian Committee on Bioethics and Human Research for Estonian biobank (protocol 1.1-12/624) and the Icahn School of Medicine at Mount Sinai Institutional Review Board (IRB; approval STUDY-19-00951) for BioMe.

Data and definition of epilepsy cases and controls

Here, we define epilepsy case and control status from detailed longitudinal EHR of the FinnGen project²⁷ using data freeze R12 as a main cohort (n = 520,105) and Estonian biobank²⁸ as an additional validation cohort (n = 210,382). We use phenotype data derived from official state registries. These include 9,313 individuals with epilepsy ICD codes, 2,485,702 ASM purchases and 12,695 ASM reimbursements of ATC codes N03A*. We list an overview of case definitions and numbers in Supplementary Table 1. 94.7% of individuals with ≥ 2 generalized seizure ICD codes and 93.7% of individuals with ≥ 2 focal seizure ICD codes purchased ≥ 2 ASMs, while only 16.4% of individuals without epilepsy diagnoses purchased ≥ 2 ASMs (see Supplementary Fig. 1). This cross-validates our EHR data.

Reimbursement rights for epilepsy are derived from the Social Insurance Institution of Finland (KELA), Finland’s national authority. All persons with newly diagnosed epilepsy are eligible for ASM reimbursement, which is also routinely applied for, necessitating a detailed statement by a neurologist and investigations at a specialist clinic. The statement is checked and approved by specialist physicians at the reimbursement institution KELA before the right is granted. Epilepsy diagnoses in Finland are made according to national guidelines, which are updated according to ILAE epilepsy definitions.

We thus chose the following criteria to define GGE:

at least two ICD codes of G40.3 (“Generalized idiopathic epilepsy […]”) or corresponding ICD9 codes (Supplementary Table 2) and at least two purchases of ASMs (as defined by N03* ATC codes).

We chose the following criteria to define NAFE:

at least two ICD codes of G40.0, G40.1, G40.2 (“Localization-related (focal)(partial) […] epilepsy […]”) or corresponding ICD9 diagnoses (Supplementary Table 2) and at least two purchases of ASMs.
excluded possible structural etiology of focal seizures such as stroke, brain tumor, CNS infection and CNS injury (for ICD codes see Supplementary Table 2). Here, we only excluded individuals if they had their first seizure event within one year after the brain-related potential epileptogenic event.

For 1008 individuals with both focal and generalized epilepsy codes we applied the following additional criteria for a GGE diagnosis:

more generalized than focal epilepsy codes AND
most frequent ICD code is a generalized epilepsy code AND
no reimbursement category of focal epilepsy.

We used the same criteria vice versa to define NAFE among individuals with focal and generalized epilepsy codes.
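The case definitions above can be sketched as a small decision function. This is a simplified reading under stated assumptions and not the authors' implementation: the `Record` fields and the `classify` helper are hypothetical, "both focal and generalized codes" is taken to mean at least one code of each type, and the structural-etiology exclusion is assumed to be precomputed as a flag.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Record:
    n_generalized_codes: int               # count of generalized epilepsy ICD codes
    n_focal_codes: int                     # count of focal epilepsy ICD codes
    n_asm_purchases: int                   # count of ASM purchases
    most_frequent_code: Optional[str] = None  # "generalized" or "focal"
    reimbursement: Optional[str] = None       # "generalized", "focal" or None
    structural_etiology: bool = False         # assumed precomputed exclusion flag

def classify(r: Record) -> Optional[str]:
    """Return "GGE", "NAFE" or None (unclassified) for one individual."""
    if r.n_asm_purchases < 2:
        return None
    mixed = r.n_generalized_codes > 0 and r.n_focal_codes > 0
    if r.n_generalized_codes >= 2:
        # Extra criteria apply only when both code types are present.
        if not mixed or (r.n_generalized_codes > r.n_focal_codes
                         and r.most_frequent_code == "generalized"
                         and r.reimbursement != "focal"):
            return "GGE"
    if r.n_focal_codes >= 2 and not r.structural_etiology:
        if not mixed or (r.n_focal_codes > r.n_generalized_codes
                         and r.most_frequent_code == "focal"
                         and r.reimbursement != "generalized"):
            return "NAFE"
    return None
```

In this reading, individuals with mixed codes who fail any of the additional criteria remain unclassified and are assigned to neither case group.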

We defined idiopathic generalized epilepsy (IGE) according to ILAE^30,47 by at least two ICD codes of 40.33 (Childhood Absence Epilepsy), 40.34 (Generalized Tonic–Clonic Seizures Alone, here using the ICD Code of the formerly known term Generalized Tonic–Clonic Seizures on Awakening) 40.35 (Juvenile Absence Epilepsy), 40.36 (Juvenile Myoclonic Epilepsy). See Supplementary Fig. 2 for age at first diagnosis.

We used individuals without epilepsy-related diagnoses as controls. We excluded individuals who purchased ASMs from the control group.

For the analysis of GGE incidence following an unspecified seizure event, we used the same diagnosis of GGE as described above. We defined an unspecified seizure event with an ICD code of R56.8/7803A (‘unspecified convulsions’). From the group with a single unspecified seizure event we excluded individuals:

with any other epilepsy-related diagnoses (G40/G41 ICD codes) AND
who purchased or reimbursed ASMs within two years before up to 10 years after event AND
who were at any time diagnosed with alcohol-related ICD codes OR who had multiple unspecified seizure events (to exclude potential alcohol withdrawal seizures).

When individuals had 2 seizure diagnoses on the same day we counted them as one seizure event as they most likely represent two labels of the same event. When the 2 seizure diagnoses had discordant ICD labels we labeled them according to the most specific ICD code. (As an example, individuals with diagnoses of unspecified seizure and generalized epilepsy on initial presentation would be classified as diagnosed with ‘generalized epilepsy’ on initial presentation.)
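The same-day rule above can be sketched as follows. The label names and the specificity ranking are assumptions made for illustration (the text only states that the most specific ICD code wins), and `merge_same_day` is a hypothetical helper.

```python
from datetime import date

# Assumed ranking: a higher number means a more specific diagnosis label.
SPECIFICITY = {
    "unspecified_seizure": 0,
    "generalized_epilepsy": 1,
    "focal_epilepsy": 1,
}

def merge_same_day(diagnoses):
    """Collapse diagnoses of one person on one day into a single event.

    diagnoses: iterable of (person_id, day, label) tuples.
    Returns a sorted list with one tuple per (person_id, day), keeping the
    most specific label; ties keep the label that was seen first.
    """
    merged = {}
    for person, day, label in diagnoses:
        key = (person, day)
        if key not in merged or SPECIFICITY[label] > SPECIFICITY[merged[key]]:
            merged[key] = label
    return sorted((p, d, lab) for (p, d), lab in merged.items())

# Toy records (not study data).
events = merge_same_day([
    ("id1", date(2010, 5, 1), "unspecified_seizure"),
    ("id1", date(2010, 5, 1), "generalized_epilepsy"),
    ("id1", date(2012, 3, 9), "unspecified_seizure"),
])
```

Here the two records from 1 May 2010 collapse into one event labeled as generalized epilepsy, matching the example given in the text.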

We defined epilepsy cases similarly in the validation cohort Estonian biobank, with the only exception that instead of using the reimbursement data to differentiate between NAFE and GGE in individuals who had both focal and generalized epilepsy codes, we used prescription data. Specifically, we excluded individuals as GGE cases if they had any ASM prescriptions that listed NAFE as a reason for the prescription and vice versa. We performed the unspecified seizure event analysis only in FinnGen where we had a sufficient sample size.

Importantly, we find very similar respective effect sizes of PRS_GGE and PRS_NAFE on GGE and NAFE as reported in previous cohorts from Epi25 or the Cleveland Clinic²⁴ when using the same GWAS¹² in a previous version of the manuscript³¹ or the updated GWAS¹³ (see Table 2). We acknowledge that PRS effects in our cohort may not be directly comparable, as we are using a different PRS calculation method (using all 835 K weighted SNPs⁴⁸ instead of classic clumping and using SNPs below a p-value threshold of 0.5²⁴). However, differences in phenotype definitions have been reported to have larger effects than differences in PRS methods⁴⁹, specifically for epilepsy¹³. We thus consider it likely that the epilepsy phenotypes in our biobank data are comparable to the phenotypes curated according to clinical criteria in these cohorts.

Calculation of polygenic risk scores

We calculated epilepsy PRS with the method PRS-CS⁴⁸. Here, we used the summary statistics from the ILAE GWAS 2023¹³ and ILAE GWAS 2018¹² (only PheWAS analyses and analyses in the BioMe cohort) as discovery data, i.e. to determine which genetic variants increase or decrease epilepsy risk. We constructed separate focal PRS (PRS_NAFE) and generalized epilepsy PRS (PRS_GGE). The ILAE 2023 GWAS contained the FinRisk cohort (n > 40k controls, part of FinnGen). We therefore excluded 25,405 FinRisk samples from the controls of our study to avoid overlap with the GWAS discovery cohort. The Finnish GenEpa cohort, which was part of both GWAS, was not part of FinnGen. We applied the PRS-CS-auto algorithm to infer posterior effect sizes for the variants for PRS calculation. PRS-CS-auto learns the model’s global scaling parameter ϕ from the data. We used data from the 1000 Genomes⁵⁰ as a reference panel for linkage disequilibrium. We then weighted and summed all available genetic variants that confer either risk for or protection from epilepsy into a single epilepsy PRS per individual using the PLINK --score command⁵¹. The PRS-CS pipeline in FinnGen is described in more detail at https://github.com/FINNGEN/CS-PRS-pipeline. We provide the PRS weights file as Supplementary Data 1.
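The weighting and summing step can be illustrated with a minimal sketch. It does not reproduce PRS-CS or PLINK; it only shows a weighted allele-dosage sum followed by standardization to SD units, which is the scale on which the results are reported. The variant identifiers, weights and helper names are made up for illustration.

```python
import statistics

def raw_score(dosages, weights):
    """Weighted sum of allele dosages over variants that have a weight.

    dosages: dict variant_id -> effect-allele dosage (0 to 2)
    weights: dict variant_id -> per-allele effect size (posterior weight)
    """
    return sum(weights[v] * d for v, d in dosages.items() if v in weights)

def standardize(scores):
    """Convert raw scores to SD units (mean 0, sample SD 1) within a cohort."""
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)
    return [(s - mean) / sd for s in scores]

# Hypothetical weights and genotypes for three individuals (not study data).
weights = {"var1": 0.10, "var2": -0.20, "var3": 0.05}
cohort = [
    {"var1": 2, "var2": 0, "var3": 1},
    {"var1": 1, "var2": 1, "var3": 0},
    {"var1": 0, "var2": 2, "var3": 2},
]
prs_sd = standardize([raw_score(person, weights) for person in cohort])
```

Variants without a weight are simply skipped in this sketch; how missing genotypes are handled in practice depends on the scoring tool and its settings.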

In FinnGen and EstBB, we restricted our analysis to individuals with European ancestry, while we additionally included African and American continental ancestry groups in the BioMe cohort (Supplementary Note). We inferred population labels based on principal component analysis of the genotype data as described previously^27,28.

Statistical analyses

We used the R programming language for all statistical analyses. Pipelines for parallel computing were created using Cromwell-29 and 31 and Wdltool-0.14. Statistical analyses and figures were done using different versions of the R packages ggplot2⁵², data.table, plyr, survminer, survival, tidyr and Rutils.

In all analyses including PRSs namely logistic regression, survival analyses and concordance index calculations, we included the following covariates: the first 10 principal components of genetic markers (10 PCs) as a proxy for population substructure and ancestry, genotyping batch (only in the FinnGen cohort), sex, birth year, age at last follow up. For analyses that included only individuals with seizures, we included age at the first epilepsy diagnosis as a covariate instead of age at the last follow-up. We tested PRS_GGE, PRS_NAFE and PRS_all-epilepsy as indicated in the results of the manuscript.

All statistical tests were conducted as two-sided hypotheses without assuming a specific direction. No statistical method was used to predetermine sample size; instead, the maximum number of available samples from the respective studies was used.

In the PheWAS, we considered a disease category independent if no more than 40% of its affected individuals were also listed in any other disease category.
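
One possible reading of this rule is sketched below in Python. The interpretation (overlap measured as the share of a category's own cases that also appear in another category), the function name and the input layout are assumptions, not the authors' implementation.

```python
from typing import Dict, List, Set


def independent_categories(
    cases: Dict[str, Set[int]], max_overlap: float = 0.40
) -> List[str]:
    """Return categories whose cases overlap at most `max_overlap` with any other.

    cases: disease category -> set of affected individual IDs (hypothetical layout).
    Overlap for category A against B is |A and B| / |A|.
    """
    independent = []
    for name, ids in cases.items():
        if not ids:
            continue  # an empty category has no defined overlap fraction
        largest = max(
            (len(ids & other) / len(ids) for o, other in cases.items() if o != name),
            default=0.0,
        )
        if largest <= max_overlap:
            independent.append(name)
    return independent


if __name__ == "__main__":
    # Toy data: A shares half of its cases with B, C lies entirely inside A.
    toy = {
        "A": {1, 2, 3, 4},
        "B": {3, 4, 5, 6, 7, 8, 9, 10, 11, 12},
        "C": {1, 2, 3},
    }
    print(independent_categories(toy))
```

Note that the measure is asymmetric: in the toy data B is independent of A (20% of B's cases are in A) although A is not independent of B (50% of A's cases are in B).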

We performed survival analyses using the Cox Proportional-Hazards model (Cox-PH)³². Follow-up started at birth and ended at the age of first epilepsy diagnosis (for individuals with epilepsy), the age at the last record available in the EHR or death, whichever came first. We also performed survival analyses in individuals with an unspecified seizure. Here, follow-up started at the age of the unspecified seizure and ended at the age of first epilepsy diagnosis, the age at the last record available in the EHR, death or after 10 years, whichever came first. We tested for sex differences by including an interaction term of PRS x sex in the Cox-PH model. We used the first 10 PCs, genotyping batch, sex, birth year and age at last follow-up as covariates in all survival analyses. We did not exclude related individuals, as sensitivity analyses in a different project showed that this did not influence the PRS effect on disease⁵³. As an additional check, we repeated our survival analysis after excluding 320,226 related individuals (corresponding to kinship values > 0.04 and > 3rd degree relatedness using the software KING⁵⁴) from the FinnGen data. The effect sizes of PRS on lifetime GGE risk (Cox model; for IGE: HR of 2.39 per SD PRS, 95%-CI 2.1–2.7, p-value 2 × 10⁻³⁴; for GGE: HR of 1.74 per SD PRS, 95%-CI 1.6–1.9, p-value 1 × 10⁻⁵³) remained almost identical. This may be expected since we found few related individuals among GGE (18 out of 924) and IGE (6 out of 226) cases. We excluded sex as a covariate in sex-specific survival analyses. We included age at first unspecified seizure as a covariate in survival analyses of individuals with an unspecified seizure.
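
The two follow-up definitions can be written as a single rule, sketched here in Python. The function and argument names are hypothetical, ages are assumed to be in years, and counting a diagnosis that falls exactly on the censoring age as an event is an assumption of this sketch rather than something stated in the text.

```python
from typing import Optional, Tuple


def follow_up(
    start_age: float,
    last_record_age: float,
    epilepsy_age: Optional[float] = None,
    death_age: Optional[float] = None,
    max_years: Optional[float] = None,
) -> Tuple[float, bool]:
    """Return (follow-up duration in years, epilepsy event observed).

    start_age: 0 for lifetime analyses, or the age at the unspecified seizure.
    max_years: optional cap on follow-up (10 in the post-seizure analyses).
    """
    # Censoring happens at the earliest of last record, death and the cap.
    candidates = [last_record_age]
    if death_age is not None:
        candidates.append(death_age)
    if max_years is not None:
        candidates.append(start_age + max_years)
    censor_age = min(candidates)

    # An epilepsy diagnosis counts only if it occurs within follow-up.
    if epilepsy_age is not None and epilepsy_age <= censor_age:
        return epilepsy_age - start_age, True
    return censor_age - start_age, False


if __name__ == "__main__":
    # Lifetime analysis: diagnosed at 25, records until 60.
    print(follow_up(0, 60, epilepsy_age=25))
    # Post-seizure analysis: seizure at 30, diagnosis at 45 lies beyond the 10-year window.
    print(follow_up(30, 70, epilepsy_age=45, max_years=10))
```

The resulting duration and event indicator are the two inputs a Cox model needs for each individual, alongside the PRS and covariates.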

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All results described in this manuscript can be found in the (Supplementary) Tables. A full list of FinnGen endpoints for release 12 is available at www.finngen.fi/en/researchers/clinical-endpoints. Individual level data in this study are not publicly available due to legal and privacy limitations, but they can be accessed through individual participating biobanks. The FinnGen data may be accessed through Finnish Biobanks’ FinBB portal (www.finbb.fi; email: info.fingenious@finbb.fi). Researchers interested in Estonian Biobank can request access at https://www.geenivaramu.ee/en/access-biobank. For access to data from BioMe biobank, please see https://icahn.mssm.edu/research/ipm/programs/biome-biobank. For questions, please reach out to biomebiobank@mssm.edu. Source data in the form of summary statistics are provided with this paper.

Code availability

Code for FinnGen core analyses is available at https://github.com/FINNGEN/.

References

1. Hildebrand, M. S. et al. Recent advances in the molecular genetics of epilepsy. J. Med. Genet. 50, 271–279 (2013).
2. McIntosh, A. M. et al. Newly diagnosed seizures assessed at two established first seizure clinics: clinic characteristics, investigations, and findings over 11 years. Epilepsia Open 6, 171–180 (2021).
3. Hegde, M. & Lowenstein, D. H. The search for circulating epilepsy biomarkers. Biomark. Med. 8, 413–427 (2014).
4. Kim, L. G., Johnson, T. L., Marson, A. G., Chadwick, D. W. & group, M. M. S. Prediction of risk of seizure recurrence after a single seizure and early epilepsy: further results from the MESS trial. Lancet Neurol. 5, 317–322 (2006).
5. Smith, D., Defalla, B. A. & Chadwick, D. W. The misdiagnosis of epilepsy and the management of refractory epilepsy in a specialist clinic. QJM 92, 15–23 (1999).
6. Grabowski, D. C., Fishman, J., Wild, I. & Lavin, B. Changing the neurology policy landscape in the United States: Misconceptions and facts about epilepsy. Health Policy 122, 797–802 (2018).
7. Bouma, H. K., Labos, C., Gore, G. C., Wolfson, C. & Keezer, M. R. The diagnostic accuracy of routine electroencephalography after a first unprovoked seizure. Eur. J. Neurol. 23, 455–463 (2016).
8. Smith, S. J. EEG in the diagnosis, classification, and management of patients with epilepsy. J. Neurol. Neurosurg. Psychiatry 76, ii2–7 (2005).
9. Jing, J. et al. Interrater reliability of experts in identifying interictal epileptiform discharges in electroencephalograms. JAMA Neurol. 77, 49–57 (2020).
10. Fisher, R. S. et al. ILAE official report: a practical clinical definition of epilepsy. Epilepsia 55, 475–482 (2014).
11. Peljto, A. L. et al. Familial risk of epilepsy: a population-based study. Brain 137, 795–805 (2014).
12. International League Against Epilepsy Consortium on Complex Epilepsies. Genome-wide mega-analysis identifies 16 loci and highlights diverse biological mechanisms in the common epilepsies. Nat. Commun. 9, 5269 (2018).
13. International League Against Epilepsy Consortium on Complex Epilepsies. GWAS meta-analysis of over 29,000 people with epilepsy identifies 26 risk loci and subtype-specific genetic architecture. Nat. Genet. 55, 1471–1482 (2023).
14. Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).
15. Brainstorm Consortium et al. Analysis of shared heritability in common disorders of the brain. Science 360 https://doi.org/10.1126/science.aap8757 (2018).
16. Epi25 Collaborative. Ultra-rare genetic variation in the epilepsies: a whole-exome sequencing study of 17,606 individuals. Am. J. Hum. Genet. 105, 267–282 (2019).
17. Oliver, K. L. et al. Genes4Epilepsy: an epilepsy gene resource. Epilepsia 64, 1368–1375 (2023).
18. Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).
19. Widen, E. et al. How communicating polygenic and clinical risk for atherosclerotic cardiovascular disease impacts health behavior: an observational follow-up study. Circ. Genom. Precis. Med. 15, e003459 (2022).
20. Mars, N. et al. The role of polygenic risk and susceptibility genes in breast cancer over the course of life. Nat. Commun. 11, 6383 (2020).
21. Shah, P. D. Polygenic risk scores for breast cancer-can they deliver on the promise of precision medicine? JAMA Netw. Open 4, e2119333 (2021).
22. Kullo, I. J. et al. Polygenic scores in biomedical research. Nat. Rev. Genet. https://doi.org/10.1038/s41576-022-00470-z (2022).
23. Lewis, C. M. & Vassos, E. Polygenic risk scores: from research tools to clinical instruments. Genome Med. 12, 44 (2020).
24. Leu, C. et al. Polygenic burden in focal and generalized epilepsies. Brain 142, 3473–3481 (2019).
25. Moreau, C. et al. Polygenic risk scores of several subtypes of epilepsies in a founder population. Neurol. Genet. 6, e416 (2020).
26. Heyne, H. O. et al. Mono- and biallelic variant effects on disease at biobank scale. Nature 613, 519–525 (2023).
27. Kurki, M. I. et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 613, 508–518 (2023).
28. Leitsalu, L. et al. Cohort Profile: Estonian Biobank of the Estonian Genome Center, University of Tartu. Int. J. Epidemiol. 44, 1137–1147 (2015).
29. Belbin, G. M. et al. Toward a fine-scale population health monitoring system. Cell 184, 2068–2083 e2011 (2021).
30. Hirsch, E. et al. ILAE definition of the idiopathic generalized epilepsy syndromes: position statement by the ILAE Task Force on nosology and definitions. Epilepsia 63, 1475–1499 (2022).
31. Heyne, H. O. et al. Polygenic risk scores as a marker for epilepsy risk across lifetime and after unspecified seizure events. medRxiv https://doi.org/10.1101/2023.11.27.23297542 (2023).
32. Cox, D. R. Regression models and life-tables. J. R. Stat. Soc. Ser. B (Methodol.) 34, 187–220 (1972).
33. Jiang, X., Holmes, C. & McVean, G. The impact of age on genetic risk for common diseases. PLoS Genet. 17, e1009723 (2021).
34. Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
35. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
36. Bonnett, L. J. et al. Risk of seizure recurrence in people with single seizures and early epilepsy—model development and external validation. Seizure 94, 26–32 (2022).
37. Lemoine, E. et al. Machine-learning for the prediction of one-year seizure recurrence based on routine electroencephalography. Sci. Rep. 13, 12650 (2023).
38. Christensen, J., Kjeldsen, M. J., Andersen, H., Friis, M. L. & Sidenius, P. Gender differences in epilepsy. Epilepsia 46, 956–960 (2005).
39. Videira, G. et al. Female preponderance in genetic generalized epilepsies. Seizure 91, 167–171 (2021).
40. Leu, C. et al. Pleiotropy of polygenic factors associated with focal and generalized epilepsy in the general population. PLoS ONE 15, e0232292 (2020).
41. Sheng, J., Liu, S., Wang, Y., Cui, R. & Zhang, X. The link between depression and chronic pain: neural mechanisms in the brain. Neural Plast. 2017, 9724371 (2017).
42. Oliver, K. L. et al. Common risk variants for epilepsy are enriched in families previously targeted for rare monogenic variant discovery. EBioMedicine 81, 104079 (2022).
43. Mars, N. et al. Systematic comparison of family history and polygenic risk across 24 common diseases. Am. J. Hum. Genet. 109, 2152–2162 (2022).
44. Campbell, C. et al. The role of common genetic variation in presumed monogenic epilepsies. EBioMedicine 81, 104098 (2022).
45. Martins Custodio, H. et al. Widespread genomic influences on phenotype in Dravet syndrome, a ‘monogenic’ condition. Brain https://doi.org/10.1093/brain/awad111 (2023).
46. Wray, N. R. et al. Pitfalls of predicting complex traits from SNPs. Nat. Rev. Genet. 14, 507–515 (2013).
47. International League Against Epilepsy Consortium on Complex Epilepsies. Genetic determinants of common epilepsies: a meta-analysis of genome-wide association studies. Lancet Neurol. 13, 893–903 (2014).
48. Ge, T., Chen, C. Y., Ni, Y., Feng, Y. A. & Smoller, J. W. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat. Commun. 10, 1776 (2019).
49. Evaluation of polygenic scoring methods in five biobanks reveals greater variability between biobanks than between methods and highlights benefits of ensemble learning. medRxiv https://doi.org/10.1101/2023.11.20.23298215 (2023).
50. 1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
51. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
52. Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Use R!) (Springer International Publishing, Cham, 2016).
53. Jermy, B. et al. A unified framework for estimating country-specific cumulative incidence for 18 diseases stratified by polygenic risk. medRxiv https://doi.org/10.1101/2023.06.12.23291186 (2023).
54. Manichaikul, A. et al. Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867–2873 (2010).

Acknowledgements

FinnGen: We want to acknowledge the participants and investigators of the FinnGen study. We thank Pietro Della Briotta Parolo for calculating polygenic risk scores in FinnGen. Individuals in FinnGen provided informed consent for biobank research, based on the Finnish Biobank Act. Alternatively, separate research cohorts, collected before the Finnish Biobank Act came into effect (in September 2013) and the start of FinnGen (August 2017), were collected based on study-specific consents and later transferred to the Finnish biobanks after approval by Fimea (Finnish Medicines Agency), the National Supervisory Authority for Welfare and Health. Recruitment protocols followed the biobank protocols approved by Fimea. The Coordinating Ethics Committee of the Hospital District of Helsinki and Uusimaa (HUS) statement number for the FinnGen study is Nr HUS/990/2017. The FinnGen study is approved by Finnish Institute for Health and Welfare (permit numbers: THL/2031/6.02.00/2017, THL/1101/5.05.00/2017, THL/341/6.02.00/2018, THL/2222/6.02.00/2018, THL/283/6.02.00/2019, THL/1721/5.05.00/2019 and THL/1524/5.05.00/2020), Digital and population data service agency (permit numbers: VRK43431/2017-3, VRK/6909/2018-3, VRK/4415/2019-3), the Social Insurance Institution (permit numbers: KELA 58/522/2017, KELA 131/522/2018, KELA 70/522/2019, KELA 98/522/2019, KELA 134/522/2019, KELA 138/522/2019, KELA 2/522/2020, KELA 16/522/2020), Findata permit numbers THL/2364/14.02/2020, THL/4055/14.06.00/2020, THL/3433/14.06.00/2020, THL/4432/14.06/2020, THL/5189/14.06/2020, THL/5894/14.06.00/2020, THL/6619/14.06.00/2020, THL/209/14.06.00/2021, THL/688/14.06.00/2021, THL/1284/14.06.00/2021, THL/1965/14.06.00/2021, THL/5546/14.02.00/2020, THL/2658/14.06.00/2021, THL/4235/14.06.00/2021, Statistics Finland (permit numbers: TK-53-1041-17 and TK/143/07.03.00/2020 (earlier TK-53-90-20) TK/1735/07.03.00/2021, TK/3112/07.03.00/2021) and Finnish Registry for Kidney Diseases permission/extract from the meeting minutes on 4th July 2019. The Biobank Access Decisions for FinnGen samples and data utilized in FinnGen Data Freeze 11 include: THL Biobank BB2017_55, BB2017_111, BB2018_19, BB_2018_34, BB_2018_67, BB2018_71, BB2019_7, BB2019_8, BB2019_26, BB2020_1, BB2021_65, Finnish Red Cross Blood Service Biobank 7.12.2017, Helsinki Biobank HUS/359/2017, HUS/248/2020, HUS/430/2021 §28, §29, HUS/150/2022 §12, §13, §14, §15, §16, §17, §18, §23, §58, §59, HUS/128/2023 §18, Auria Biobank AB17-5154 and amendment #1 (August 17 2020) and amendments BB_2021-0140, BB_2021-0156 (August 26 2021, Feb 2 2022), BB_2021-0169, BB_2021-0179, BB_2021-0161, AB20-5926 and amendment #1 (April 23 2020) and its modifications (Sep 22 2021), BB_2022-0262, BB_2022-0256, Biobank Borealis of Northern Finland_2017_1013, 2021_5010, 2021_5010 Amendment, 2021_5018, 2021_5018 Amendment, 2021_5015, 2021_5015 Amendment, 2021_5015 Amendment_2, 2021_5023, 2021_5023 Amendment, 2021_5023 Amendment_2, 2021_5017, 2021_5017 Amendment, 2022_6001, 2022_6001 Amendment, 2022_6006 Amendment, 2022_6006 Amendment, 2022_6006 Amendment_2, BB22-0067, 2022_0262, 2022_0262 Amendment, Biobank of Eastern Finland 1186/2018 and amendment 22§/2020, 53§/2021, 13§/2022, 14§/2022, 15§/2022, 27§/2022, 28§/2022, 29§/2022, 33§/2022, 35§/2022, 36§/2022, 37§/2022, 39§/2022, 7§/2023, 32§/2023, 33§/2023, 34§/2023, 35§/2023, 36§/2023, 37§/2023, 38§/2023, 39§/2023, 40§/2023, 41§/2023, Finnish Clinical Biobank Tampere MH0004 and amendments (21.02.2020 & 06.10.2020), BB2021-0140 8§/2021, 9§/2021, §9/2022, §10/2022, §12/2022, 13§/2022, §20/2022, §21/2022, §22/2022, §23/2022, 28§/2022, 29§/2022, 30§/2022, 31§/2022, 32§/2022, 38§/2022, 40§/2022, 42§/2022, 1§/2023, Central Finland Biobank 1-2017, BB_2021-0161, BB_2021-0169, BB_2021-0179, BB_2021-0170, BB_2022-0256, BB_2022-0262, BB22-0067, Decision allowing to continue data processing until 31st Aug 2024 for projects: BB_2021-0179, BB22-0067, BB_2022-0262, BB_2021-0170, BB_2021-0164, BB_2021-0161, and BB_2021-0169, and Terveystalo Biobank STB 2018001 and amendment 25th Aug 2020, Finnish Hematological Registry and Clinical Biobank decision 18th June 2021, Arctic biobank P0844: ARC_2021_1001. The following biobanks are acknowledged for delivering biobank samples to FinnGen: Auria Biobank (www.auria.fi/biopankki), THL Biobank (www.thl.fi/biobank), Helsinki Biobank (www.helsinginbiopankki.fi), Biobank Borealis of Northern Finland (https://www.ppshp.fi/Tutkimus-ja-opetus/Biopankki/Pages/Biobank-Borealis-briefly-in-English.aspx), Finnish Clinical Biobank Tampere (www.tays.fi/en-US/Research_and_development/Finnish_Clinical_Biobank_Tampere), Biobank of Eastern Finland (www.ita-suomenbiopankki.fi/en), Central Finland Biobank (www.ksshp.fi/fi-FI/Potilaalle/Biopankki), Finnish Red Cross Blood Service Biobank (www.veripalvelu.fi/verenluovutus/biopankkitoiminta), Terveystalo Biobank (www.terveystalo.com/fi/Yritystietoa/Terveystalo-Biopankki/Biopankki/) and Arctic Biobank (https://www.oulu.fi/en/university/faculties-and-units/faculty-medicine/northern-finland-birth-cohorts-and-arctic-biobank). All Finnish Biobanks are members of BBMRI.fi infrastructure (www.bbmri.fi). Finnish Biobank Cooperative -FINBB (https://finbb.fi/) is the coordinator of BBMRI-ERIC operations in Finland.
Estonian biobank: We want to acknowledge the participants and investigators of the Estonian biobank (EstBB). The EstBB is a population-based biobank managed by the Institute of Genomics at the University of Tartu. It currently contains genotype data and health information for more than 200,000 participants, representing almost 20% of Estonia’s adult population. All participants have provided broad written consent that covers the provision of samples for future research use along with the acquisition of electronic health records from national registries and databases.
The activities of the EstBB are regulated by the Human Genes Research Act, which was adopted in 2000 specifically for the operations of the EstBB. Individual level data analysis in the EstBB was carried out under ethical approval 1.1-12/624 from the Estonian Committee on Bioethics and Human Research (Estonian Ministry of Social Affairs), using data according to release application 6-7/GI/11577 from the Estonian Biobank. Data analysis was carried out in part in the High-Performance Computing Center of University of Tartu. BioMe biobank: We want to acknowledge the participants and investigators of the BioMe cohort. Founded in September 2007, BioMe is a biobank that links genetic and EMR data for more than 50,000 individuals from diverse ancestral and cultural backgrounds recruited primarily in ambulatory care settings in the Mount Sinai Health System (MSHS) in New York City. The current study was approved by the Icahn School of Medicine at Mount Sinai Institutional Review Board (IRB; approval STUDY-19-00951). All study participants provided written informed consent. This work was supported in part through the computational and data resources and staff expertise provided by Scientific Computing and Data at the Icahn School of Medicine at Mount Sinai and supported by the Clinical and Translational Science Awards (CTSA) grant UL1TR004419 from the National Center for Advancing Translational Sciences. The FinnGen project is funded by two grants from Business Finland (HUS 4685/31/2016 and UH 4386/31/2016) and by thirteen industry partners (AbbVie Inc, AstraZeneca UK Ltd, Biogen MA Inc, Celgene Corporation, Celgene International II Sàrl, Genentech Inc, Merck Sharp & Dohme Corp, Pfizer Inc., GlaxoSmithKline Intellectual Property Development Ltd., Sanofi US Services Inc., Maze Therapeutics Inc., Janssen Biotech Inc, Novartis AG and Boehringer Ingelheim International GmbH). 
The EstBB project was funded by the European Union through the European Regional Development Fund Project No. 2014-2020.4.01.15-0012 GENTRANSMED. Work with the BioMe biobank was supported in part through the computational and data resources and staff expertise provided by Scientific Computing and Data at the Icahn School of Medicine at Mount Sinai, the Clinical and Translational Science Awards (CTSA) grant UL1TR004419 from the National Center for Advancing Translational Sciences and Office of Research Infrastructure of the National Institutes of Health under award number S10OD026880. This work was funded by the Hasso Plattner Foundation (HPF). This work was supported by the Academy of Finland Center of Excellence in Complex Disease Genetics (grant number 312075 to M.D and 312074 to A.P), the National Institutes of Health (grant number NIH/1R01NS106104-01A1 to A.P.), the Estonian Research Council (R.M. and F.-D.P. were supported by grant PRG1911 and TK214) and the German Research Foundation (DFG, grant number 516649954 to H.O.H.).

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

These authors contributed equally: Reetta Kälviainen, Mark J. Daly.

Authors and Affiliations

Hasso Plattner Institute for Digital Engineering, University of Potsdam, Potsdam, Germany
Henrike O. Heyne, Julian Wanner & Jennifer I. Daniel Onwuchekwa
Hasso Plattner Institute, Mount Sinai School of Medicine, New York, NY, USA
Henrike O. Heyne
Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
Henrike O. Heyne, Julian Wanner, Aarno Palotie & Mark J. Daly
Program for Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Henrike O. Heyne, Aarno Palotie & Mark J. Daly
Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
Fanny-Dhelia Pajuste & Reedik Mägi
Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
Fanny-Dhelia Pajuste & Reedik Mägi
Faculty of Life Sciences, University of Siegen, Siegen, Germany
Jennifer I. Daniel Onwuchekwa
Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Aarno Palotie & Mark J. Daly
Kuopio Epilepsy Center, Neurocenter, Kuopio University Hospital, Member of ERN EpiCARE, Kuopio, Finland
Reetta Kälviainen
Institute of Clinical Medicine, School of Medicine, Faculty of Health Sciences, University of Eastern Finland, Kuopio, Finland
Reetta Kälviainen
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
Mark J. Daly

Consortia

FinnGen

Henrike O. Heyne, Julian Wanner, Aarno Palotie, Reetta Kälviainen & Mark J. Daly

Estonian Biobank research team

Fanny-Dhelia Pajuste & Reedik Mägi

Contributions

H.O.H. conceptualized the study, curated and analyzed FinnGen data and wrote the paper. F.D.P. and J.I.D.O. performed replication analyses in the Estonian biobank and BioMe biobank, respectively. J.W. performed analyses with FinnGen data. F.D.P. and J.W. contributed equally. D.K.L., R.M., R.K. and A.P. provided resources. R.K. provided and oversaw clinical interpretation of the data and results. M.J.D. supervised analytical aspects of the study. All authors read and approved the manuscript.

Corresponding author

Correspondence to Henrike O. Heyne.

Ethics declarations

Competing interests

M.J.D. is a founder of Maze Therapeutics. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Melanie Bahlo and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Description of Additional Supplementary Files

Supplementary Data 1

Reporting Summary

Source data

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Heyne, H.O., Pajuste, FD., Wanner, J. et al. Polygenic risk scores as a marker for epilepsy risk across lifetime and after unspecified seizure events. Nat Commun 15, 6277 (2024). https://doi.org/10.1038/s41467-024-50295-z

Received: 14 September 2023
Accepted: 04 July 2024
Published: 25 July 2024
DOI: https://doi.org/10.1038/s41467-024-50295-z
