Activation of proto-oncogenes by disruption of chromosome neighborhoods
Abstract
Oncogenes are activated through well-known chromosomal alterations, including gene fusion, translocation and focal amplification. Recent evidence that the control of key genes depends on chromosome structures called insulated neighborhoods led us to investigate whether proto-oncogenes occur within these structures and if oncogene activation can occur via disruption of insulated neighborhood boundaries in cancer cells. We mapped insulated neighborhoods in T-cell acute lymphoblastic leukemia (T-ALL), and found that tumor cell genomes contain recurrent microdeletions that eliminate the boundary sites of insulated neighborhoods containing prominent T-ALL proto-oncogenes. Perturbation of such boundaries in non-malignant cells was sufficient to activate proto-oncogenes. Mutations affecting chromosome neighborhood boundaries were found in many types of cancer. Thus, oncogene activation can occur via genetic alterations that disrupt insulated neighborhoods in malignant cells.
One Sentence Summary
Proto-oncogenes can be activated by genetic alterations that disrupt 3D chromosome structure.
Tumor cell gene expression programs are typically driven by somatic mutations that alter the coding sequence or expression of proto-oncogenes (1) (Fig. 1A), and identifying such mutations in patient genomes is a major goal of cancer genomics (2, 3). Dysregulation of proto-oncogenes frequently involves mutations that bring transcriptional enhancers into proximity of these genes (4). Transcriptional enhancers normally interact with their target genes through the formation of DNA loops (5-7), which typically are constrained within larger CTCF-cohesin mediated loops called insulated neighborhoods (8-10), which in turn can form clusters that contribute to topologically associating domains (TADs) (11, 12) (Fig. S1A). This recent understanding of chromosome structure led us to hypothesize that silent proto-oncogenes located within insulated neighborhoods might be activated in cancer cells via loss of an insulated neighborhood boundary, with consequent aberrant activation by enhancers that are normally located outside the neighborhood (Fig. 1A, lowest panel).
To test this hypothesis, we first mapped neighborhoods and other cis-regulatory interactions in a cancer cell genome using Chromatin Interaction Analysis by Paired-End Tag Sequencing (ChIA-PET) (Fig. 1B, Table S1). A T-cell acute lymphoblastic leukemia (T-ALL) cell line (Jurkat) was selected for these studies because key T-ALL oncogenes and genetic alterations are well-known (13, 14). The ChIA-PET technique generates a high-resolution (~5 kb) chromatin interaction map of sites in the genome bound by a specific protein factor (8, 15, 16). Cohesin was selected as the target protein because it is involved in both CTCF-CTCF interactions and enhancer-promoter interactions (5-7), and has proven useful for identifying insulated neighborhoods (8, 10) (Fig. S1A-B). The cohesin ChIA-PET data were processed using multiple analytical approaches (Fig. S1-4, Table S2), which identified 9,757 high-confidence interactions, including 9,038 CTCF-CTCF interactions and 379 enhancer-promoter interactions (Fig. S4C). The CTCF-CTCF loops had a median length of 270 kb, contained on average 2-3 genes and covered ~52% of the genome (Table S2). Such CTCF-CTCF loops have been called insulated neighborhoods because disruption of either CTCF boundary causes dysregulation of local genes due to inappropriate enhancer-promoter interactions (8, 10). Consistent with this, the Jurkat chromosome structure data showed that the majority of cohesin-associated enhancer-promoter interactions had endpoints that occurred within the CTCF-CTCF loops (Fig. 1C, S2H). These results provide an initial map of the 3D regulatory landscape of a tumor cell genome.
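The containment statement above (enhancer-promoter interactions with both endpoints inside a CTCF-CTCF loop) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Loop` type, the function names and the toy coordinates are hypothetical, and a simple same-chromosome coordinate comparison is assumed.

```python
from typing import NamedTuple


class Loop(NamedTuple):
    """A chromatin interaction described by its two anchor positions (hypothetical type)."""
    chrom: str
    start: int  # position of the left anchor
    end: int    # position of the right anchor


def is_within(inner: Loop, outer: Loop) -> bool:
    """True if both endpoints of `inner` lie inside `outer` on the same chromosome."""
    return (
        inner.chrom == outer.chrom
        and outer.start <= inner.start
        and inner.end <= outer.end
    )


def fraction_insulated(ep_loops, ctcf_loops) -> float:
    """Fraction of enhancer-promoter loops contained in at least one CTCF-CTCF loop."""
    if not ep_loops:
        return 0.0
    contained = sum(
        any(is_within(ep, neighborhood) for neighborhood in ctcf_loops)
        for ep in ep_loops
    )
    return contained / len(ep_loops)


# Toy coordinates, invented for illustration only.
neighborhoods = [Loop("chr1", 100_000, 370_000), Loop("chr2", 50_000, 200_000)]
ep_interactions = [
    Loop("chr1", 120_000, 180_000),  # inside the chr1 neighborhood
    Loop("chr1", 350_000, 400_000),  # crosses the chr1 boundary
    Loop("chr2", 60_000, 90_000),    # inside the chr2 neighborhood
]
fraction = fraction_insulated(ep_interactions, neighborhoods)
```

In this toy input two of the three enhancer-promoter interactions are contained in a neighborhood. The nested scan is quadratic, which is adequate for illustration; a genome-scale analysis would more plausibly use an interval index.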
We next investigated the relationship between genes that have been implicated in T-ALL pathogenesis and the insulated neighborhoods. The majority of genes (40/55) implicated in T-ALL pathogenesis (curated from the Cancer Gene Census and individual studies) (Table S3) were located within the insulated neighborhoods identified in Jurkat cells (Fig. 2A, S5); 27 of these genes were transcriptionally active and 13 were silent based on RNA-Seq data (Fig. 2A, Table S4). Active oncogenes are often associated with super-enhancers (17, 18), and we found that 13 of the 27 active T-ALL Pathogenesis Genes were associated with super-enhancers (Fig. 2A-B, S5A). Silent genes have also been shown to be protected by insulated neighborhoods from active enhancers located outside the neighborhood, and we found multiple instances of silent proto-oncogenes located within CTCF-CTCF loop structures in the Jurkat genome (Fig. 2A, 2C, Fig. S5B). Thus, both active oncogenes and silent proto-oncogenes are located within insulated neighborhoods in these T-ALL cells.
If some insulated neighborhoods function to prevent proto-oncogene activation, some T-ALL tumor cells may have genetic alterations that perturb the CTCF boundaries of neighborhoods containing T-ALL oncogenes. To investigate this possibility, we identified recurrent deletions in T-ALL genomes that span insulated neighborhood boundaries using data from multiple studies (Table S5A) and filtered for relatively short deletions (<500 kb) in order to minimize collection of deletions that affect multiple genes (Fig. S6A). Among the 438 recurrent deletions identified with this approach, 113 overlapped at least one boundary of insulated neighborhoods identified in T-ALL, and 6 of these affected neighborhoods containing T-ALL Pathogenesis Genes (Fig. S6B, Table S5B). Examples of two such genes, TAL1 and LMO2, are shown in Fig. 3A and Fig. 3G.
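The deletion screen described above combines a length filter (<500 kb) with an overlap test against neighborhood boundaries. The Python sketch below shows that logic on invented intervals. The `Interval` type, the function names, the half-open coordinate convention and the toy data are assumptions for illustration and are not taken from the study.

```python
from typing import NamedTuple

MAX_DELETION_BP = 500_000  # length cutoff stated in the text (<500 kb)


class Interval(NamedTuple):
    """Half-open genomic interval [start, end); a hypothetical helper type."""
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def boundary_deletions(deletions, boundaries, max_len=MAX_DELETION_BP):
    """Keep deletions shorter than `max_len` that overlap at least one boundary site."""
    short = [d for d in deletions if d.end - d.start < max_len]
    return [d for d in short if any(overlaps(d, b) for b in boundaries)]


# Toy data, invented for illustration only.
boundaries = [Interval("chr1", 1_000, 1_020), Interval("chr1", 300_000, 300_020)]
deletions = [
    Interval("chr1", 900, 5_000),      # short, removes the first boundary
    Interval("chr1", 10_000, 20_000),  # short, no boundary affected
    Interval("chr1", 0, 600_000),      # spans both boundaries but exceeds the cutoff
    Interval("chr2", 900, 5_000),      # same coordinates, different chromosome
]
hits = boundary_deletions(deletions, boundaries)
```

Only the first toy deletion passes both criteria. Restricting the returned list further to neighborhoods that contain a gene of interest would be a second containment check of the same kind.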
If deletions overlapping neighborhood boundaries can cause activation of proto-oncogenes within the loops, then site-specific deletion of a loop boundary CTCF site at the TAL1 locus should be sufficient to activate this proto-oncogene in non-malignant cells. TAL1 encodes a transcription factor that is overexpressed in ~50% of T-ALL cases and is a key oncogenic driver of this cancer (19, 20). TAL1 can be activated by deletions that fuse a promoter-less TAL1 gene to the promoter of STIL (19) and this was observed in many patient deletions (Fig. 3A). Several patient deletions, however, retained the TAL1 promoter (endpoint >5 kb from promoter) but overlapped the CTCF boundary site of the TAL1 neighborhood (Fig. 3A), and TAL1 was active in the samples harboring these deletions (Fig. S7A-B). This suggests that these deletions disrupt the insulated neighborhood, allowing activation of TAL1 by regulatory elements outside of the loop. We tested this idea by CRISPR/Cas9-mediated deletion of the TAL1 neighborhood boundary in human embryonic kidney cells (HEK-293T) (Fig. 3B). In these cells, the TAL1 proto-oncogene is silent as evidenced by low H3K27Ac occupancy and RNA-Seq (Fig. 3B). However, at least one active regulatory element occurs ~60 kb upstream of TAL1, adjacent to the CMPK1 promoter, as evidenced by high levels of H3K27Ac and p300/CBP (Fig. 3B) and enhancer reporter assays (Fig. S8A-B). Deletion of a ~400 bp segment encompassing the boundary CTCF site, which abolished CTCF binding (Fig. S8A), caused a 2.3-fold induction of the TAL1 transcript (Fig. 3C), suggesting that the integrity of the neighborhood contributes to the silent state of TAL1 (Fig. 3D). Supporting this model, contacts between DNA regions that are normally within and outside of the neighborhood were increased (Fig. 3E-F, S10). Furthermore, deletion of the CTCF site in primary human T-cells also caused a small but detectable activation of TAL1 (Fig. S8C-G).
These results are consistent with the model that the silent state of the TAL1 proto-oncogene is dependent on the integrity of the insulated neighborhood (Fig. 3D).
We further tested the model that site-specific perturbation of a loop boundary is sufficient to activate a proto-oncogene at the LMO2 locus. The LMO2 gene encodes a transcription factor that is overexpressed and oncogenic in some forms of T-ALL (14, 20). The region upstream of the LMO2 promoter is recurrently deleted in T-ALL and these deletions are linked to LMO2 activation (Fig. 3G); a previous study proposed that deletion of cryptic repressors located in the deleted region enables activation of LMO2 (21). Analysis of a T-ALL patient cohort (22) revealed deletions that overlap the CTCF boundary site of the LMO2 neighborhood, and patient cells harboring these deletions had generally high levels of LMO2 expression (Fig. S9A-B). CRISPR/Cas9-mediated deletion in HEK-293T cells of a ~25 kb segment encompassing the insulated neighborhood boundary CTCF site and two additional CTCF sites that could act as boundary elements caused a 2-fold increase in the LMO2 transcript (Fig. 3H-J) and a large-scale rearrangement of interactions around LMO2, as evidenced by 5C analysis (Fig. 3K-L, S10). These results indicate that the deleted CTCF sites contribute to the silent state of the LMO2 proto-oncogene (Fig. 3J).
The boundaries of chromosome neighborhoods may be disrupted in other cancers. A recent study noted that mutations in CTCF binding sites occur frequently in cancers (23), but it is unclear whether mutations in boundaries are common, because only a subset of CTCF sites form insulated neighborhoods (8, 10, 24). CTCF-cohesin bound loops are largely preserved across cell types (8, 9, 24), and a set of ~10,000 constitutive CTCF-CTCF loops shared by GM12878 lymphoblastoid cells, Jurkat cells and K562 CML cells (24) was identified for comparison (Fig. 4A, S11, Table S8). The boundaries of these neighborhoods were examined for somatic point mutations found in cancer genomes using the ICGC database containing data for ~50 cancer types, ~2300 WGS samples, and ~13 million unique somatic mutations (Table S9). We found a striking enrichment of mutations at the CTCF boundaries of constitutive neighborhoods (Fig. 4B, S12A, Table S10) compared to regions flanking the boundary CTCF sites (+/−1 kb of the CTCF binding motif; P < 10⁻⁴, permutation test) (Fig. S12B), and in many instances these mutations created a significant change in the consensus CTCF binding motif (Fig. S12C). Non-boundary CTCF sites did not show such enrichment (Fig. 4B, S12D, S14). The genomes of esophageal and liver carcinoma samples were particularly enriched for boundary CTCF site mutations (Fig. 4C-D, S12D-E, S13, Table S10), and there was no similar enrichment of mutations at the binding sites of other transcription factors (Fig. S15). In these cancers, a considerable fraction of the mutated neighborhood boundary CTCF sites were affected by multiple mutations (≥3 mutations per site) [280/1826 (15%) in esophageal carcinoma, and 54/1030 (5%) in liver carcinoma] (Table S10), and recurrent mutations occurred more frequently in neighborhood boundary CTCF sites compared to non-boundary CTCF sites (Fig. S16A-C).
The genes located within the most frequently mutated neighborhoods included known cellular proto-oncogenes annotated in the Cancer Gene Census and other genes that have not been associated with these cancers (Fig. 4E-F, Table S11-S12). Two examples of proto-oncogene-containing neighborhoods where the activation of the gene located in the neighborhood has been observed in the respective cancer type are shown in Fig. 4G-H. These results suggest that somatic mutations of insulated neighborhood boundaries occur in the genomes of many different cancers.
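The enrichment comparison above (mutations at the boundary CTCF motif versus the +/−1 kb flanking region, assessed by a permutation test) can be illustrated with a small Python sketch. The text does not specify the exact permutation scheme, so the version below is an assumption: for a single site, a motif-sized window is placed at random positions within the flanking region, and the empirical P value is the share of placements that capture at least as many mutations as the motif itself. The function names, the 19 bp motif width and the toy mutation positions are all hypothetical.

```python
import random


def count_in_window(positions, start, width):
    """Count mutation positions falling in the half-open window [start, start + width)."""
    return sum(start <= p < start + width for p in positions)


def window_permutation_p(positions, motif_start, motif_width,
                         flank=1000, n_perm=2000, seed=0):
    """Empirical P value for mutation enrichment at a motif versus its flanks.

    Null model (an assumption, not the published procedure): a window of the
    same width as the motif is placed uniformly at random anywhere within
    `flank` bp on either side of the motif.
    """
    rng = random.Random(seed)
    observed = count_in_window(positions, motif_start, motif_width)
    first_start = motif_start - flank  # leftmost allowed window start
    last_start = motif_start + flank   # rightmost start; window ends at motif end + flank
    at_least_as_extreme = 0
    for _ in range(n_perm):
        start = rng.randint(first_start, last_start)  # inclusive on both ends
        if count_in_window(positions, start, motif_width) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)


# Toy positions, invented for illustration: 8 mutations in the motif, 4 in the flanks.
mutations = [1001, 1003, 1005, 1007, 1009, 1011, 1013, 1015, 200, 650, 1500, 1900]
p_value = window_permutation_p(mutations, motif_start=1000, motif_width=19)
```

With the add-one correction, the smallest reportable P value is 1/(n_perm + 1), so resolving a threshold such as 10⁻⁴ would need more than 10,000 permutations. A genome-wide analysis would also pool counts across many boundary sites rather than test one site at a time.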
In summary, disruption of insulated neighborhood boundaries can cause oncogene activation in cancer cells. With maps of 3D chromosome structure such as those described here, cancer genome analysis can consider how recurrent perturbations of boundary elements may impact expression of genes with roles in tumor biology. Our understanding of 3D chromosome structure and its control is rapidly advancing and should be considered for potential diagnostic and therapeutic purposes. Because control of 3D chromosome structure involves binding of specific sites by CTCF and cohesin, which is affected by protein cofactors, DNA methylation and local RNA synthesis (25), future advances in our understanding of these regulatory processes may provide new approaches to therapeutics that impact aberrant chromosome structures.
Supplementary Material
Supplemental
Tables S1 to S13
Acknowledgments
Supported by NIH grants HG002668 (R.A.Y.), CA109901 (R.A.Y.), HG003143 (J.D.), NS088538 (R.J.), MH104610 (R.J.) and AI120766 (M.H.P.); an Erwin Schrödinger Fellowship (J3490) from the Austrian Science Fund (FWF)(D.H.), Ludwig Graduate Fellowship funds (A.S.W.), the Laurie Kraus Lacob Faculty Scholar Award in Pediatric Translational Research (M.H.P.), Hyundai Hope on Wheels (M.H.P.), an Individual Postdoctoral grant (DFF–1333-00106B)(R.O.B.) and a Sapere Aude Research Talent grant (DFF–1331-00735B)(R.O.B.) from the Danish Council for Independent Research, Medical Sciences. We thank Rebecca Fitzgerald, Sean Grimmond and the ICGC Genome Projects ESAD-UK and OV-AU for permission to use genome sequence data. Datasets generated in this study have been deposited in the Gene Expression Omnibus under the Accession number GSE68978. The Whitehead Institute filed a patent application based on this paper. R.A.Y. is a founder of Syros Pharmaceuticals and R.J. is a founder of Fate Therapeutics.
Full text links
Read article at publisher's site: https://doi.org/10.1126/science.aad9024
Read article for free, from open access legal sources, via Unpaywall: https://science.sciencemag.org/content/sci/351/6280/1454.full.pdf
Funding
Funders who supported this work.
Austrian Science Fund FWF
Role of Super-enhancers in Stem Cell Differentiation
Dr Denes Hnisz, Massachusetts Institute of Technology
Grant ID: J3490
Danish Council for Independent Research, Medical Sciences
Howard Hughes Medical Institute
Hyundai Hope on Wheels
Laurie Kraus Lacob Faculty Scholar Award in Pediatric Translational Research
Ludwig Graduate Fellowship
NCI NIH HHS
Grant ID: P30 CA014051
Grant ID: P01 CA109901
Grant ID: U54 CA193419
NHGRI NIH HHS
Grant ID: R01 HG002668
Grant ID: R01 HG003143
Grant ID: U01 HG007910
Grant ID: U54 HG007010
Grant ID: R25 HG007631
NIAID NIH HHS
Grant ID: R01 AI120766
Grant ID: U01 R01 AI 117839
NICHD NIH HHS
Grant ID: R37 HD045022
NIDA NIH HHS
Grant ID: U01 DA 040588
NIDDK NIH HHS
Grant ID: U54 DK107980
NIGMS NIH HHS
Grant ID: T32 GM087237
Grant ID: T32 GM007287
Grant ID: R01 GM 112720
NIMH NIH HHS
Grant ID: R01 MH104610
NINDS NIH HHS
Grant ID: R01 NS088538
Sapere Aude Research Talent
Grant ID: DFF-1331-00735B