Introduction

Migraine is a debilitating and complex genetic disorder with 1 billion affected individuals worldwide and a life-time prevalence of 15–20% [1]. Given the high prevalence, migraine has a considerable social and economic impact, with annual expenses estimated at €27 billion in Europe alone [2]. Furthermore, the Global Burden of Disease Study (2016) ranks migraine the second highest cause of disability worldwide [3]. Migraine is characterized by episodes of moderate to severe throbbing, unilateral headache, which intensifies with an increase in physical activity, and is accompanied by nausea and increased sensitivity to light and sound [4]. Rare variants with large effect sizes may be implicated with a high migraine risk [5]. The most recent meta-analysis on migraine genome-wide association studies (GWAS) identified 44 independent SNPs defining 38 migraine risk loci [6]. However, a GWAS typically assesses common SNPs derived from imputation and genotyping, and thus effect sizes are small. Using next generation sequencing, it is possible to assess rare variants. Such rare variants are expected to have a larger effect size, particularly if the variant segregates with disease within families [7]. Common complex disorders, such as migraine, are thought to be influenced by alterations in gene regulation [8, 9]. As migraine symptoms occur episodically and can be affected by regulatory factors, such as hormones [10, 11], it is very likely that gene dysregulation plays an important role in migraine etiology. SNPs that affect gene regulation or are present in regulatory regions have been linked with migraine [12, 13], which further supports the impact of regulatory dysfunction on the disorder. However, most of these SNPs are common with small effect sizes and fail to explain a high migraine risk.

We assessed whether rare variants in regulatory regions in known migraine risk loci have an increased risk burden for migraine. Additionally, we investigated whether the rare regulatory variants had effects that are independent of the variants in the migraine risk loci. We employed a cohort of extended families with familial clustering of migraine and performed whole-genome sequencing to identify rare variants. The findings were replicated in an independent case-control cohort of sporadic migraineurs with no familial history of migraine and controls.

Materials and methods

Familial cohort

A total of 155 families with familial clustering of migraine were recruited. The families consisted of 1040 subjects (mean number of individuals per family = 6.7) of which 746 individuals were diagnosed with migraine (mean number of migraineurs per family = 4.8) and 294 had no history of migraine. The smallest families consisted of at least two individuals and at least one migraineur. A proband from each family was initially recruited at the Danish Headache Center, Rigshospitalet-Glostrup, Denmark, and the remaining family members were subsequently recruited, as described elsewhere [14]. All subjects were assessed by a neurologist or a senior medical student trained in headache diagnostics using a validated semi-structured interview [15, 16] based on the International Classification of Headache Disorders [4].

Replication cohort

The replication cohort consisted of 2027 sporadic migraineurs (no first-degree relatives with migraine) and 1650 controls. The sporadic migraineurs were recruited and interviewed at the Danish Headache Center using the same procedures as described for the probands of the familial cohort. The controls were recruited for the Danish study of Non-Invasive testing in Coronary Artery Disease (Dan-NICAD) according to procedures described by Nissen et al. [17].

Sequencing

Genomic DNA was extracted from whole blood. All samples from both the familial cohort and the replication cohort were subjected to the same whole-genome sequencing procedures using an Illumina NovaSeq 6000 sequencing platform and S4 flow cells and subsequently subjected to quality control by deCODE genetics, as described elsewhere [18].

Defining genomic regions for analysis

All regulatory regions surrounding the index SNP defining the known, autosomal migraine risk loci [6] were analyzed. We defined the genomic search regions as 1 Mb pairs upstream and downstream of the index SNP (see supplementary Table 1 for regions). In the 2 Mb pair window, the genomic positions (according to the Genome Reference Consortium Human Build 38 GRCh38.p12 (hg38)) of all regulatory regions were annotated using BED files downloaded from the UCSC Table browser [19]. These BED files included insulators [20, 21], polycomb group response elements (PREs) [20, 21], enhancers [22], transcription factor binding sites (TFBS) [21], CpG islands [23], and promoters [24]. All BED files were downloaded on 02/21/2019 except for the BED file of the PREs (downloaded on 02/28/2019). Any genomic coordinates from the hg19 assembly were subsequently converted to the hg38 assembly. In total, 1079 insulators; 518 PREs; 1651 enhancers; 16,190 TFBSs; 828 CpG islands; and 135 promoters were mapped and subjected to familial association analysis. Within the regulatory regions, only genetic variants with a minor allele frequency (MAF) < 5% were assessed.

Familial association analysis

VCF files of each participant in the study were merged, multiallelic sites were split into a biallelic representation, the positions of the regulatory regions were annotated, and rare genetic variants (MAF < 5%) were isolated. The rare-variant association analysis was carried out with the software of Family Sequence Kernel Association Test (F-SKAT) [25]. Age and gender were used as covariates. The age was defined as the age at the time of the interview (the time of diagnosis). The analysis was performed with all default options in effect, and the resulting p values were controlled for multiple testing using the Bonferroni method. A Bonferroni-corrected p value < 0.05 was considered significant. To assess whether the findings were dependent on the original index SNP of the migraine GWAS, the number of alleles of the index SNP (0, 1, or 2) were applied as additional covariates in a separate rare-variant association analysis with F-SKAT.

Replication

The findings from the familial association analysis were assessed in a case-control design using an independent cohort of sporadic migraineurs with no familial history of migraine and controls. Aggregated data were available on rare variants (MAF < 5%) in the regulatory regions for the controls. Thus, we tested whether the frequency of rare variants was increased in sporadic migraineurs compared to the controls under the null hypothesis of no association with migraine and under a dominant model of penetrance, using a χ2 test [26]. The resulting p values were controlled for multiple testing using the Bonferroni method. A Bonferroni-corrected p value < 0.05 was considered significant.

eQTL and gene expression analysis

Regulatory regions with an increased burden of rare variants were subsequently assessed using gene expression data and single-tissue cis-eQTL data from the Genotype-Tissue Expression (GTEx) version 8.0 dataset (dbGaP Accession phs000424.v8.p2, 2019-08-26 release). All SNP identifiers were found in the Database of Single Nucleotide Polymorphisms (dbSNP) [27] build 152.

Results

Increased burden of rare variants found in four regulatory regions

We analyzed 155 migraine families (n = 1040) and 37 autosomal migraine risk loci (supplementary Table 1) and found a significant increased burden of rare variants segregating with migraine in 39 regulatory regions (see supplementary Table 2 for regulatory regions and p values). As expected, when analyzing rare variants in regions, that are known to harbor polygenic effects in the known risk loci [6, 28, 29], we found that the distribution of the test statistics deviated from the null, with a genomic inflation factor of λ = 1.89 (supplementary Fig. 1).

In an independent case-control cohort using sporadic migraineurs with no familial history of migraine (ncases = 2027) and controls (ncontrols = 1650), assuming a dominant model of penetrance, we verified that four of the regulatory regions had an increased burden of rare variants (Table 1). The four regulatory regions include a CpG island located near the PHACTR1 migraine risk locus and three PREs located near the KCNK5, ASTN2, and RNF213 migraine risk locus. The CpG island located near the PHACTR1 migraine risk locus associated with the highest migraine risk, having an odds ratio (OR) of 1.54 (Fig. 1). Extensive information of the rare variants identified for the four regulatory regions is available in supplementary Table 3.

Table 1 Results from replication using an independent case-control cohort of sporadic migraineurs and controls. The table presents the name and MAF of the index SNP of the migraine risk loci, as given by Gormley et al. (2016) [6], the genomic positions of the regulatory regions, the type of regulatory regions, the Bonferroni-corrected p values, and the ORs with the 95% confidence intervals (CI)
Fig. 1
figure 1

Effect of regulatory regions on migraine risk. The figure presents the effect of the significant regulatory regions on migraine risk. X-axis: OR (dot) with the 95% confidence intervals (lines). Y-axis: The genomic positions (given as chromosome:start:end) and type of regulatory regions and the migraine risk loci in brackets

Association of migraine and rare regulatory variants independent of known risk loci

Subsequently, we tested whether the four regulatory regions with increased burden were correlated with the index SNP of the migraine risk loci. Using the risk allele count of the index SNP as covariate, we found a nonsignificant effect on the results defining each migraine risk locus as additional covariates (Table 2).

Table 2 Results obtained with familial association analysis using the presence of the index SNP as additional covariates. The table gives the name of the migraine risk loci, the type of regulatory regions, and the Bonferroni-corrected p values with and without applying the presence of the index SNP as additional covariates

Possible regulatory targets identified for three regulatory regions

To shed light on the biological mechanisms regulated by the regulatory regions, the genic regulatory targets were sought to be identified. The CpG island near the PHACTR1 migraine risk locus is overlapping the promoter region of various splice variants of the GFOD1 gene and the non-coding GFOD1-AS1 gene (Fig. 2a). We also found two cis-eQTLs within the CpG island that associated with GFOD1 transcript levels (supplementary Table 4). Analysis of the gene expression in the GTEx database indicates that GFOD1 is expressed across different kinds of tissue and is highly expressed in brain and vascular tissue (Fig. 3a).

Fig. 2
figure 2

The genomic regions in which the four regulatory regions are located. In the top of each figure is 1 Mb pairs upstream and downstream of the index SNP defining the migraine risk loci displayed. Within this 2 Mb pair window, regulatory regions were annotated and rare variants within these were assessed. The index SNP defining the migraine risk loci is located in the middle, and below are the regulatory regions (the CpG island and the PREs) in which rare variants associated with an increased migraine risk and genic transcript information. In the bottom of each figure is a close-up of the regulatory regions. Here, the rare variants, that had an increased burden in the familial association analysis, are indicated. The genomic regions surrounding a the PHACTR1 locus and the CpG island, b the KCNK5 locus and the PRE, c the RNF213 locus and the PRE, and d the ASTN2 locus and the PRE

Fig. 3
figure 3

Expression of the possible genic regulatory targets across different tissues. The figure displays expression data from the GTEx database where the tissue types have been divided according to tissue group (brain, vascular, and other remaining tissue). Y-axis: the expression in transcripts per million (TPM). Expression of aGFOD1, bKCNK17, cKIF6, dCBX2, eLINC02078, fENPP7, gTBC1D16, hCARD14

PREs can be situated far away from their target genes, in introns, or in the 3′ untranslated regions (UTRs) [30]. Thus, the relations of the PREs to genes are based on cis-eQTLs in the GTEx database. Significant cis-eQTLs in the PRE near the KCNK5 locus mapped to the KCNK17 and KIF6 gene (supplementary Table 4). The PRE is situated across two introns and one exon of the KCNK17 gene (Fig. 2b). According to GTEx, KCNK17 is primarily expressed in vascular tissue (Fig. 3b) while KIF6 is mainly expressed in the brain (Fig. 3c).

The PRE near the RNF213 locus harbored cis-eQTLs that associated with the transcript levels of five different genes. These include the nearby genes of CBX2, LINC0207, ENPP7, TBC1D16, and CARD14 (Fig. 2c). We do not note a higher expression of these genes in brain or vascular tissue compared to other tissue types (Fig. 3d–h).

The PRE situated near the ASTN2 locus is located in an intron of the PAPPA gene (Fig. 2d). No significant cis-eQTLs could be found within this PRE.

No increased burden of rare variants in regulatory targets

We assessed whether the possible gene targets of the four regulatory regions had an increased burden of rare variants segregating with migraine. Using a familial association analysis, we did not find a significantly increased burden of rare variants segregating with migraine after Bonferroni correction (Table 3). The analysis was performed on the protein-coding genes only.

Table 3 Results from familial association analysis on rare variants segregating with migraine in the possible genic regulatory targets. The name of the migraine risk loci, the regulatory regions, and the corresponding possible genic regulatory target are presented. Additionally, the nominal p value and the Bonferroni-corrected p value of the familial association analysis are displayed

Discussion

Migraine is most often an episodic disorder and can be affected by regulatory factors, such as hormones. It is therefore very likely that gene dysregulation plays a causal role in migraine etiology. Here, we have assessed the hypothesis that an increased burden of rare regulatory variants in known migraine risk loci can be associated with a risk of migraine. We also addressed whether the association is independent of the common risk variants in the loci. We found four regulatory regions with an OR > 1, in which an increased burden of rare variants is independently associated with migraine. We report eight possible regulatory target genes. These genes are different than those in which the index SNPs of the known migraine risk loci resides in.

The CpG island near the PHACTR1 locus is overlapping the promoter region of various splice variants of the GFOD1 gene and the non-coding GFOD1-AS1 gene. As about 70% of all promoters are located in a CpG island-rich area [31], it is highly likely that this CpG island is involved in the regulation of the GFOD1 and the GFOD1-AS1 gene. The regulation of the GFOD1 gene is supported by the existence of two cis-eQTLs within the CpG island, that are associated with GFOD1 transcript levels. Interestingly, the research group of Lasky-Su et al. [32] has reported a SNP in an intron of the GFOD1 gene that associated with ADHD, which is a comorbid disorder of migraine [33, 34]. Thus, the findings of this study could help elucidate the association between the disorders. Analysis of the gene expression in the GTEx database indicates that GFOD1 is expressed predominantly in brain and vascular tissue, which are tissue types that have been connected with migraine [35, 36]. Transcription of antisense non-coding RNAs has been shown to participate in both cis and trans gene regulation [37], and it is thus possible that GFOD1-AS1 plays an important regulatory role on GFOD1.

Variations within CpG islands have been shown to influence DNA methylation patterns. Depending on the increase or elimination of CpG dinucleotides, the mutations may lead to hyper- or hypomethylation and, therefore, an altered transcriptional activity [38]. If CpG islands are hypermethylated, they may cause a stable silencing of the gene. In contrast, a hypomethylation can result in an overexpression [39]. Therefore, such changes of the CpG sites with the CpG island could affect the expression of GFOD1.

Significant cis-eQTLs in the PRE near the KCNK5 locus drive a change in the gene expression of KCNK17 and KIF6. The KCNK17 gene encodes a membrane protein and belongs to the family of two-pore domain potassium (K2P) channels. Of the same K2P channel subfamily is the protein product of the KCNK5 gene [40], in which the index SNP defining the migraine risk locus resides. From a different subfamily is the KCNK18 gene where a frameshift mutation has been found to segregate perfectly in families with migraine with aura [41]. Additionally, previously identified migraine-associated genes include genes encoding potassium channels [41, 42]. Cortical spreading depression has been associated with migraine attacks [4], and because cortical spreading depression is characterized by ion influxes and neuronal depolarization, genetic alterations affecting potassium channels could play a role in this. The findings of our study could support that migraine may be a channelopathy and that potassium ion channels influence the disorder.

The PRE near the RNF213 locus harbored cis-eQTLs that act upon five different genes. These include the nearby genes of CBX2, LINC0207, ENPP7, TBC1D16, and CARD14. The protein encoded by the CARD14 gene mediates the activation of transcription factor NF-κB [43]. NF-κB plays a part in induction of nitric oxide production [44], which has been implied as having a key causative role in migraine [45] and can induce headache in healthy subjects [46]. Therefore, inhibition of NF-κB has been proposed as a possible therapeutic approach to treat migraine [47].

PREs are a class of silencers that have a suppressive effect on gene expression, when polycomb group (PcG) proteins are bound to them. Mutations within a PRE could result in reduced silencer activity and therefore lead to an increased gene expression. If mutations within the PRE results in increased expression of CARD14, it could therefore cause an increased activation of NF-κB and thus influence migraine.

The CBX2 gene encodes a subunit of the PRC1-like complex, which is a PcG protein. Thus, this PcG protein participates in suppression of gene expression through binding to PREs. Thus, if the expression of the CBX2 gene is altered, it could result in altered gene regulation.

Aside from finding a connection between a PcG protein and migraine, we have also found associations between an increased burden of rare variants in three PREs and migraine. It is interesting, that we find multiple connections between an increased migraine risk and PREs and proteins that bind to them. In humans, PREs are known to be influential on human embryonic development and epigenetic memory [48] but the function and regulatory mechanism of action of mammalian PREs is still largely unknown [49]. It is therefore speculative, what effect PREs have on migraine. As more knowledge on PREs is gained in the future, we may be able to understand how their altered regulatory mechanism may affect migraine. PcG proteins have been suggested to act synergistically in order to silence gene expression [50]. Consequently, any mutated PcG binding site within a PRE may cause an altered phenotype. Future research could include determining if the PREs we find associated with migraine harbor PcG bindings sites that have been altered by the rare mutations.

Lastly, we addressed whether only the regulatory regions had a higher burden of rare variants segregating with migraine or if the genes possibly targeted by the regulatory regions had a higher burden of rare variants. We found that none of the protein-coding possible genic regulatory targets had a significantly increased burden of rare variants segregating with migraine. This suggests that a higher burden of rare variants in the regulatory regions, and not their respective regulatory targets, is implicated in the pathophysiology of migraine. This supports that migraine is influenced by gene dysregulation and that regulatory dysfunction has a crucial impact on the disorder.

Conclusion

We report that families with familial clustering of migraine have an increased burden of rare variants segregating with migraine in regulatory regions. The regulatory regions are located near known migraine risk loci but display effects independent of the common variants defining the loci. The possible regulatory targets suggest different genes than those originally tagged by the index SNPs of the migraine loci. The findings support that gene dysregulation plays a crucial, if not causal, role in migraine etiology.