Structural and Functional Diversity of Topologically Associating Domains
Abstract
Recent studies have shown that chromosomes in a range of organisms are compartmentalized in different types of chromatin domains. In mammals, chromosomes form compartments that are composed of smaller Topologically Associating Domains (TADs). TADs are thought to represent functional domains of gene regulation but much is still unknown about the mechanisms of their formation and how they exert their regulatory effect on embedded genes. Further, similar domains have been detected in other organisms, including flies, worms, fungi and bacteria. Although in all these cases these domains appear similar as detected by 3C-based methods, their biology appears to be quite distinct with differences in the protein complexes involved in their formation and differences in their internal organization. Here we outline our current understanding of such domains in different organisms and their roles in gene regulation.
Introduction
The organization of chromosomes inside the cell nucleus is closely related to regulation of gene expression [1-5]. At the nuclear level this is apparent in the well-known spatial separation of active and inactive chromatin, where heterochromatic loci tend to be near the periphery and actively transcribed genes are localized more internally [1, 6]. This separation is also observed within chromosome territories: chromosomes are divided into large multi-Mb compartments that contain either active and open (A-compartments) or inactive and closed chromatin (B-compartments) [7, 8]. A-compartments cluster with other A-compartments, as do B-compartments with B-compartments. Given that different cell types express different gene sets driven by distinct groups of regulatory elements, the positions of A- and B-compartments change accordingly. Thus, global nuclear organization reflects a high level of compartmentalization that is directly correlated with the cell type-specific gene expression and chromatin status of the genome. However, the exact nature of chromosome organization at the sub-megabase scale, which is the level at which most gene regulatory landscapes and long-range interactions are thought to occur [9-12], had remained something of a black box.
Recently, chromosome conformation capture (3C, [13]) experiments have uncovered the presence of an additional level of compartmentalization at this scale. Throughout the genomes of a wide range of species from bacteria to human, chromosomes are organized as a string of domains. These domains are characterized by preferential chromatin interactions within them, and spatial separation of loci located in different domains. In mammalian genomes these domains are several hundred Kb in size, up to 1-2 Mb [14, 15], whereas they are smaller in flies (~60Kb) [16, 17], and bacteria (~170Kb) [18]. In eukaryotes these domains are referred to as Topological Domains [15] or Topologically Associating Domains ([14]; here referred to as TADs), or as Chromatin Interaction Domains (CIDs) in bacteria [18]. TADs are distinct from A- and B- compartments as they are smaller and largely cell type invariant (see below).
Whether and how any of these chromosomal domains directly contribute to regulation of the genome, e.g. gene expression, is less clear. One reason is that control of gene expression is usually thought to occur at a much smaller scale. For instance, genes can be regulated by distal regulatory elements such as enhancers. Enhancers are thought to act over tens of kilobases, up to hundreds of kilobases at most, regulating nearby genes but not necessarily, or exclusively, the closest gene [2, 12, 19]. Enhancers may regulate target genes by direct looping interactions with their promoters (e.g. [2, 12, 20, 21]). There are now many examples of such interactions, but the molecular mechanisms by which these loops are formed, their dynamics and how these interactions activate expression remain poorly understood. Another unresolved question is what determines the specificity of long-range promoter-enhancer interactions. Given that enhancers can apparently loop to reach genes hundreds of kilobases away, it is not known how bona fide target genes are identified and/or inappropriate interactions are prevented. It has been proposed that TADs play roles in regulating gene expression by either facilitating or preventing looping interactions [3, 4, 22], which would point to mechanistic links between chromosome compartmentalization, chromatin folding and regulation of gene expression (see below).
Here we will outline our current understanding of chromatin domains, focusing on TADs and CIDs. After describing their structural features and commonalities and differences between species we will present evidence that these domains form key structures involved in gene regulation by defining target regions of regulatory elements. We will then outline outstanding questions and propose future approaches to delineate and dissect the mechanisms of chromatin domain formation, enhancer action and transcription.
Self-interacting chromosomal domains are present in a wide range of organisms
Here we focus on chromosomal domains that are defined by the increased contact probability of loci located within them, which is readily detected in chromosome conformation capture experiments. In such experiments, e.g. 3C, 5C and Hi-C, comprehensive chromatin interaction datasets are obtained that can be represented as two-dimensional interaction heatmaps, where the genomic coordinates of the interacting pairs of loci are displayed along the two axes (Figure 1). Chromatin interaction maps typically display a very prominent diagonal that reflects the very frequent contacts between loci located close to each other in the linear genome [23]. Analysis of local and genome-wide chromatin interaction maps for mouse and human genomes, as well as for the fruit fly Drosophila, led to the first observation of self-interacting chromatin domains, which are apparent in chromatin interaction maps as a series of squares of relatively high interaction frequency along the diagonal [14-16]. These squares represent contiguous regions where loci interact with each other relatively frequently. Self-interacting domains are separated by sharp boundaries that appear to structurally insulate adjacent domains from each other, as indicated by the fact that loci located in neighboring domains display a much lower contact frequency [14, 15, 24].
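The picture of squares along the diagonal separated by insulating boundaries can be made concrete with a small numerical sketch. The Python code below is an illustration only and is not the analysis pipeline of any study cited here: the function names, the window size and the toy matrix are assumptions made for this example. It scores each bin boundary by the mean contact frequency between the bins just upstream and just downstream of it, and reports local minima of that profile as candidate domain boundaries. The toy matrix deliberately omits the distance-dependent decay seen in real maps.

```python
def insulation_scores(matrix, window):
    """Mean contact frequency between the `window` bins upstream and the
    `window` bins downstream of each bin boundary. Low values indicate
    that the two sides rarely contact each other (strong insulation)."""
    n = len(matrix)
    scores = {}
    for b in range(window, n - window + 1):
        # b is the boundary between bin b-1 and bin b
        total = 0.0
        for i in range(b - window, b):
            for j in range(b, b + window):
                total += matrix[i][j]
        scores[b] = total / (window * window)
    return scores


def call_boundaries(scores):
    """Return positions that are strict local minima of the score profile."""
    positions = sorted(scores)
    minima = []
    for k in range(1, len(positions) - 1):
        here = scores[positions[k]]
        if here < scores[positions[k - 1]] and here < scores[positions[k + 1]]:
            minima.append(positions[k])
    return minima


def toy_matrix(domain_sizes, within=10.0, between=1.0):
    """Hypothetical contact map: high frequency inside a domain, low between
    domains. Values are arbitrary and chosen only for illustration."""
    labels = [d for d, size in enumerate(domain_sizes) for _ in range(size)]
    return [[within if a == b else between for b in labels] for a in labels]


matrix = toy_matrix([6, 6, 6])          # three domains of six bins each
scores = insulation_scores(matrix, window=3)
boundaries = call_boundaries(scores)
print(boundaries)
```

On this toy input the two candidate boundaries fall at bins 6 and 12, where one square ends and the next begins. On real data the choice of window size and the handling of noise would need far more care.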
Self-interacting chromosomal domains have now been detected in bacteria, fungi, flies, nematodes, and in mammals. As detected by chromosome conformation capture experiments these domains all stand out as regions of increased contact frequency, but there are fundamental differences related to their structure and size, the processes and proteins that determine their formation, and possibly the mechanisms by which they affect chromatin state and gene expression. Below we describe the structural features of these domains in different organisms.
Topologically Associating Domains in Mammals
The first evidence for TADs in mammals came from a 5C analysis of the X chromosome inactivation center (Xic) in mouse ESCs and differentiated cells [14] (Figure 1), and a genome-wide Hi-C study in mouse and human cells [15]. Microscopy studies had previously hinted at globular chromosome structures in the megabase size range, possibly built up of approximately 100kb domains [25, 26]. Super-resolution DNA FISH across the Xic region revealed that TADs might indeed represent such physical entities of preferentially associating chromatin at the single cell level, as FISH probes were found to intermingle more frequently within TADs than between them [14, 27]. It should be noted that data from FISH and 3C-based techniques are not always concordant, however, and that chromatin interactions, and/or FISH detection at some genomic regions may be subject to specific influences [28].
TADs in mammalian genomes range from tens of kb up to 1 or 2 Mb, with an average of around 800 kb. Two remarkable features of mammalian TADs are their relative invariance during differentiation [14, 15] and their general conservation in relative position (though not necessarily in size) between man and mouse [15, 29]. Whether this conservation is at the level of boundaries between TADs, or the interacting regions within TADs, remains an open question and indeed the nature of the underlying sequences that are conserved may vary from one genomic region to another, as will be discussed below. Although TADs are generally present and invariant, there are specific cases where TADs are not present. First, the inactive X chromosome appears to be depleted of TADs [14, 30, 31] and shows rather random interactions along its length [32]. Second, TADs are not observed along mitotic chromosomes [33]. Thus, even in mammalian cells, TADs can be absent in some cases, e.g. during chromosome-wide transcriptional silencing and chromosome condensation, although these processes may be driven by distinct molecular mechanisms.
Although TADs seem to be relatively invariant, the long-range sequence interactions within them can vary significantly between cell types and during differentiation, with specific long-range interactions appearing, while others are lost [14]. Many of these dynamic changes can be linked to the regulatory enhancer-promoter interaction events that orchestrate transcription during development (e.g. [34]), for which many classic examples, such as the β-globin locus [35], have already been described. The emerging picture is that TADs encompass the regulatory landscapes of genes, and that the meeting of enhancers with their target promoters happens usually, if not always, in the context of TADs [36-38].
An important question thus concerns what underlies TAD formation and how sequence interactions are restricted to occur within, but not between domains. An obvious mechanism would be that boundaries between TADs have specific insulating properties. An alternative but not mutually exclusive mechanism could be that sequence interactions within TADs are sufficient to ensure spatial segregation, although asymmetry in interactions (i.e. preferential interactions within the domain as compared to interactions between domains) must be provided somehow. Deletion of a boundary at the Xic locus resulted in aberrant interactions between previously separate TADs and misregulation of genes, presumably due to de novo enhancer-promoter interactions. Importantly, this deletion resulted in only partial fusion of the two adjacent TADs, with the appearance of a new boundary within one TAD implying that in the context of de novo interactions, a novel boundary can actually form [14]. More recent studies of structural variations that disrupt TAD boundaries in the context of malformation syndromes reveal that ectopic interactions leading to aberrant gene expression can be caused by disruption of boundary elements rather than merely by distance effects, highlighting the importance of TADs and their boundaries in genomic compartmentalization and normal gene regulation [37] at least in some regions.
Mammalian TAD boundaries are reported to be enriched in active transcription, housekeeping genes, tRNA genes and short interspersed nuclear elements (SINEs), as well as binding sites for the architectural proteins CTCF and cohesin [15]. However, such binding sites also exist within TADs, and CTCF and cohesin depletion reduces the intensity of intra-TAD interactions without affecting overall TAD location or organization [39-41]. This is consistent with their putative role in mediating enhancer-promoter contacts within TADs but leaves open the question of their role at boundaries between TADs.
How CTCF and cohesin organize chromatin in such a way as to prevent interactions between particular TADs and isolate gene expression states from one another still remains unclear. Such insulation can occur through a local activity of protein complexes bound to an individual TAD boundary, or through formation of boundary-boundary interactions (e.g. through CTCF and/or cohesin bound to each [24, 42, 43]) leading to a “looped configuration”. Chromatin looping leads to physical insulation of loci located within the loop from loci outside the loop [44]. Consistent with this model, new insights into the finer details of TAD organization revealed intriguing orientation-specific looping interactions between CTCF sites at domain boundaries [42]. This study identified smaller contact domains within TADs, in the order of 100-200 kb, containing multiple specific loops that occur between CTCF sites in a predominantly (>90%) convergent orientation, with asymmetric motifs “facing” one another. Thus (0.2-2 Mb) TADs appear to represent just one level of folding in a more intricate hierarchy, something already hinted at in previous Hi-C and 5C maps where smaller domains in the order of tens of kilobases, as well as much larger domains spanning a few megabases, are visible [14, 15]. The precise relationship between TAD boundaries and these CTCF-anchored loops and contact domains is not yet known in detail.
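To make the notion of motif orientation at loop anchors explicit, the short Python sketch below classifies a loop by the strands of the CTCF motifs at its two anchors. It is a hedged illustration, not the procedure used in [42]: the function name and the toy loop list are invented for this example, and it assumes the convention that a forward-strand ("+") motif at the upstream anchor paired with a reverse-strand ("-") motif at the downstream anchor means the two motifs face one another.

```python
from collections import Counter


def classify_loop(upstream_strand, downstream_strand):
    """Classify a loop by CTCF motif strands at its two anchors.
    Assumed convention: '+' upstream with '-' downstream = motifs face
    each other (convergent); the reverse = divergent; same strand = tandem."""
    for strand in (upstream_strand, downstream_strand):
        if strand not in ("+", "-"):
            raise ValueError("strand must be '+' or '-'")
    pair = (upstream_strand, downstream_strand)
    if pair == ("+", "-"):
        return "convergent"
    if pair == ("-", "+"):
        return "divergent"
    return "tandem"


# Hypothetical loop anchors (upstream strand, downstream strand); not real data.
toy_loops = [("+", "-"), ("+", "-"), ("+", "-"), ("-", "+"), ("+", "+")]
counts = Counter(classify_loop(up, down) for up, down in toy_loops)
convergent_fraction = counts["convergent"] / len(toy_loops)
print(counts, convergent_fraction)
```

Applied to a real loop catalogue with annotated motif strands, the same tally would give the fraction of convergent loops that the text reports as exceeding 90%.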
Clearly facultative enhancer-promoter interactions cannot underlie the apparent stability of TADs during development. Indeed, some long-range interactions within mammalian TADs appear to be invariant [14], raising the interesting possibility that architectural elements, distinct from regulatory elements such as enhancers and promoters, may exist. Consistent with this, approximately one third of the long-range interactions mediated by cohesin and CTCF do not involve enhancer and promoter sequences in mammalian cells [43]. Recent physical modeling of 5C data at the Xic points to the existence of such structural elements required for TAD formation [27]. A recent study applying Hi-C to four different mammals revealed that the modular organization of chromosomes is robustly conserved in syntenic regions and that this is compatible with conservation of the CTCF binding landscape [29]. The most highly conserved CTCF sites were found to co-localize with cohesin and to be enriched at strong TAD boundaries. Furthermore, CTCF DNA motif orientations defined the directionality of the long-range interactions [42]. On the other hand divergent CTCF binding between species correlated with divergence of internal domain structure. Furthermore, the authors found that TADs are reorganised as intact modules during evolution, providing further support that TADs represent functional domains of long-range gene regulation.
An understanding of the stability and dynamics of the long-range interactions within TADs will be critical to assess how this level of chromatin folding impacts on TAD structure and gene expression. Indeed, a predictive physical model of the chromatin fibre suggests that some TADs represent domains of probabilistic interactions between the sequences lying within them, rather than stable looping structures [27].
Hi-C data previously demonstrated that the genome is partitioned into distinct compartments [8]. The relationships between compartments and TADs are still being explored. Compartments represent large (up to several Mb) chromosomal domains defined by their preferential interactions with other compartments, whereas TADs are defined by the preferential interactions within them. Compartments tend to interact with other compartments that share their chromatin and/or transcriptional state: chromosomal regions enriched in active (A-compartment) or inactive (B-compartment) chromatin preferentially interact. Recent high-resolution Hi-C maps suggest that A- and B-compartments can be further split into several sub-types [42]. Compartments can encompass several directly adjacent TADs that share chromatin state and that display similar genome-wide interactions with other sets of TADs. This has led to a model where interphase chromosome organization is a hierarchy of chromatin domains with TADs as the universal building blocks [4]: tissue-invariant TADs come together in 3D space to form larger cell type-specific compartments. Compartment differences between cell types are due to relocation of entire TADs from one compartment type to another. This model accommodates the observation that TAD boundary positions are mostly invariant across tissues, while the chromatin states within TADs can change dramatically in different cell types and conditions, reflecting changes in gene activity and leading to altered compartment associations. A further prediction of this model is that the principles underlying compartment formation may be rather different from those underlying TADs. Indeed, the former are likely to depend on chromatin-associated factors such as trithorax or polycomb, whereas TADs and their boundaries are not dependent on such factors, and instead rely on architectural proteins such as CTCF.
Topologically Associating Domains in Flies
The partitioning of the Drosophila genome into approximately 1,000 physical domains, each in the range of tens to 100 kb (average 60 kb; Figure 1), that may be equivalent to TADs, was first described using a genome-wide 3C analysis (3C-seq) on early embryos by the Cavalli lab [16]. Similar domains were identified by the Corces lab by Hi-C analysis of Kc167 cells [17]. These domains were found to correlate strongly with epigenomic features, including histone modifications, active gene density, association with the nuclear lamina, replication timing, and nucleotide and repetitive element composition. Many of the physical domains identified by Hi-C could thus be classified into previous, statistically defined epigenomic groups [45], e.g. active domains (domains showing active transcription), repressive domains (at the nuclear periphery), Polycomb and HP1 domains, bound by Polycomb group complexes and HP1 respectively, and null domains, lacking specific epigenetic marks. Although the precise association between physical domains and their epigenomic status still remains unclear, recent studies focusing on Polycomb-repressed domains suggest that they correspond to cooperative interactions among low-affinity sequences, DNA-binding factors such as PHO, and the Polycomb machinery, with PHO recruitment to sites within Polycomb domains being stabilized by PRC1. On the other hand, chromosomal domains categorised as active show rather distinct folding patterns, with more rapid decay in contact frequency as a function of genomic distance than other domains. The local structure of active domains may thus be rather different from that of repressive domains [46]. However, determining whether physical domains or TADs provide a basic chromosome architecture onto which epigenomic domains are laid down, or whether epigenomic demarcation is involved in the formation or maintenance of a TAD, will require genetic disruption of the enzymes involved and Hi-C assessment.
In support of the former model we note that in mammals, it would appear that disruption of large domains of H3K27me3 or H3K9me2 at the Xic locus did not impact on TAD segmentation [14] although an impact on local compaction could not be ruled out entirely.
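The difference in contact decay between active and repressive domains mentioned above can be quantified by averaging contact frequency at each genomic separation within a domain and estimating the slope on a log-log scale. The Python sketch below shows one way to do this under simplifying assumptions: the function names are invented for this example, the two toy matrices follow exact power laws, and the exponents (-0.7 and -1.2) are arbitrary illustrative values, not measurements from any study.

```python
import math


def contact_decay(matrix, start, end):
    """Mean contact frequency at each separation s (in bins) among the
    bins start..end-1 of a single domain."""
    decay = {}
    for s in range(1, end - start):
        values = [matrix[i][i + s] for i in range(start, end - s)]
        decay[s] = sum(values) / len(values)
    return decay


def loglog_slope(decay):
    """Least-squares slope of log(frequency) against log(separation).
    A more negative slope means contacts fall off faster with distance."""
    points = [(math.log(s), math.log(f)) for s, f in decay.items() if f > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den


def power_law_matrix(n, exponent):
    """Hypothetical domain whose contacts follow |i - j| ** exponent."""
    return [[1.0 if i == j else abs(i - j) ** exponent for j in range(n)]
            for i in range(n)]


slow = loglog_slope(contact_decay(power_law_matrix(20, -0.7), 0, 20))
fast = loglog_slope(contact_decay(power_law_matrix(20, -1.2), 0, 20))
print(round(slow, 3), round(fast, 3))
```

Because the toy matrices are exact power laws, the fitted slopes recover the chosen exponents. With experimental maps the fit would be noisier and sensitive to the range of separations included.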
As in mammals, the specification of TAD boundaries in Drosophila probably relies, at least in part, on architectural proteins [16, 17]. Unlike in mammals, however, numerous DNA-binding architectural proteins, including CTCF, have been identified in Drosophila, each recognizing a unique DNA motif [47]. There are also multiple accessory proteins, in addition to Rad21 (cohesin), that can associate with these DNA-binding proteins. The specific combinations of architectural and accessory proteins at different genomic regions, as well as the number and orientation of their binding sites, can easily be imagined to produce a diversity of 3D organization states that can vary in a cell type-specific fashion. Very little is currently known about the differences in chromosome folding states between tissues or developmental stages. One study found extensive looping between functional elements that was stable across development [48]. On the other hand, a recent study investigated the changes induced during heat shock [49]. Temperature stress induced a dramatic rearrangement in 3D chromosome organization, with the relocalization of architectural proteins from TAD boundaries to sites within TADs, leading to an increase in long-distance inter-TAD interactions, with increased contacts among enhancers and promoters of silenced genes. These results reinforce the notion that architectural protein complexes play a critical role in TAD boundary formation.
Topologically Associating Domains in C. elegans
Recently the first genome-wide chromatin interaction map for C. elegans embryos was obtained by combining conventional 3C with deep sequencing [24] (Figure 1). This map revealed known features of C. elegans nuclear organization, e.g. the tethering of large multi-Mb domains near the ends of the chromosomes to the nuclear lamina [50]. As a result of the peripheral localization of these domains, they interact with each other as well, both in cis and in trans, leading to the formation of higher order nuclear “compartments”, comparable to those observed in mammalian cells.
Perhaps surprisingly, no strong TADs were observed along the five autosomes, although some weak TAD boundaries could be detected. This is in contrast to Drosophila, which has a genome of comparable size and complexity (e.g. gene number), and where chromosomal domains are prominently present along all chromosomes [16, 17]. Thus, TADs are clearly not a universal feature of metazoan chromosomes. Indeed, no TADs have been observed in Arabidopsis [51-53]. The global lack of TADs in C. elegans and Arabidopsis may be related to the fact that long-range enhancers do not appear to be required for developmental gene regulation in these two organisms. Another difference between Drosophila and C. elegans is that TADs in C. elegans are considerably larger: 1-2 Mb compared to ~60 kb.
TADs are present along the two X-chromosomes of C. elegans in hermaphrodites. This is interesting because in hermaphrodites gene expression along the X chromosomes is repressed by a factor of two to make the expression similar to that in males that carry only a single X chromosome. This chromosome-wide process of dosage compensation is specific to the X chromosome and is mediated by the condensin-like Dosage Compensation Complex (DCC). In mutants that cannot recruit the DCC to the X-chromosomes, leading to loss of dosage compensation, most of the TAD boundaries were no longer detected, or strongly reduced in strength. This observation points to a direct role of the DCC and the process of dosage compensation in formation of many, but not all, TADs along X in C. elegans.
The DCC is recruited to the X chromosomes through binding to rex (recruitment on X) sites [54]. Interestingly, the strongest TAD boundaries on X contain strong rex sites. These are also the TAD boundaries that are most affected in DCC mutants. Further, deletion of rex sites from a TAD boundary is sufficient to eliminate the boundary. Thus, DCC binding to rex sites is critical for TAD boundary formation.
The molecular mechanism by which the DCC induces TAD formation is not known. One intriguing finding is that rex sites at TAD boundaries engage in DCC-dependent long-range looping interactions, especially with the adjacent rex-containing neighboring TAD boundary over 1 Mb away [24]. These results show that the DCC induces and reinforces TADs through binding high-affinity rex sites and mediating long-range looping interactions between them. This is reminiscent of TADs in humans and flies, where at least a subset of them display chromatin loops between CTCF-bound sites located within their boundaries [17, 42]. The role of TAD formation in dosage compensation along the X chromosome is still an open question.
Chromatin Globules in S. pombe
Self-interacting chromatin domains have also been detected in S. pombe, where they are referred to as “globules” [55] (Figure 1). These globules are 50-100 kb in size and are found all along the genome. In this case globule boundaries are enriched for 3' ends of convergent genes. Such convergent sites are bound by the cohesin complex. Interestingly, in a partial loss-of-function cohesin mutant globule boundaries are lost, pointing to important roles of the cohesin complex in domain formation in this organism. S. pombe cultures are mainly composed of G2 cells, and thus globules could be related to the prominent sister chromatid cohesion in G2. Importantly however, globules were also observed in G1 cells, and these were cohesin dependent, indicating that globule formation depends on an activity of the cohesin complex that is separate from its role in sister chromatid cohesion.
The functional consequences of globule formation are not known in detail. In partial loss-of-function cohesin mutants, when globule formation is affected, widespread aberrant transcriptional read-through is observed. However, whether this is due to loss of globules per se or due to other functions of the cohesin complex is not known.
Chromatin Interaction Domains in Bacteria
Hi-C and 5C analyses of the spatial organization of the circular Caulobacter crescentus genome showed that the chromosome adopts an elongated structure where the origin of replication is anchored at one pole of the cell with the two chromosome arms running in parallel along the length of the cell [18, 56]. The higher resolution Hi-C data further revealed the presence of self-interacting chromatin domains, referred to as Chromatin Interaction Domains (CIDs), that are on average 170 kb (30-420 kb) in size [18] (Figure 1). Several lines of evidence indicate that active transcription plays a major role in CID formation. First, almost all boundaries contain highly transcribed genes. Second, genetically relocating an active gene also relocates the associated CID boundary. Third, blocking transcription by addition of rifampicin disrupted CIDs and weakened their boundaries. Thus, in C. crescentus transcription, especially at CID boundaries, is a major driver of domain formation, possibly by forming locally unwound and plectoneme-free stretches of DNA at boundaries. This is in contrast to TADs in metazoans, which are independent of transcription.
A fundamental difference between prokaryotic chromatin and eukaryotic chromatin is the fact that in bacteria DNA is nucleosome-free and supercoiled leading to the formation of plectonemes. Modeling of the C. crescentus genome as a circular chromosome composed of a series of plectonemes, on average 13 Kb in size, with plectoneme-free areas at CID boundaries produced predicted Hi-C maps that are very similar to the experimentally observed data. This suggests that each CID contains around 10-15 such plectonemes that can migrate throughout the CID but cannot pass the plectoneme-free regions at the actively transcribed genes at their boundaries. Consistent with supercoiling playing a major role, treatment of cells with novobiocin that inhibits gyrase and negative supercoiling, reduced the sharpness of CID boundaries as well as their locations.
The data from C. crescentus provide a striking example of the fact that self-interacting chromatin domains can be observed in many genomes, but that the molecular mechanisms of their formation and the folding of the DNA within them can be very different in different species.
TADs are functional domains
There is now growing and strong evidence that TADs are critical chromosome structural units of long-range gene regulation. The first data relating TADs to gene expression came from analysis of expression patterns of genes located within the same TAD across ES cell differentiation [14]. It was found that genes embedded in the same TAD show similar dynamics of expression during differentiation, whereas genes located in different TADs were less correlated. Further, some TADs correspond to Lamin Associated Domains (LADs), or domains covered by certain histone modifications such as H3K9Me2 and H3K27Me3, which all mark repressed chromatin states. This data indicates that at least some TADs are units of chromatin state and histone modification that correlates with regions of gene repression. Importantly, deletion of genes encoding enzymes that deposit such histone modifications, results in loss of the modifications, while TADs are maintained [14]. This shows that TADS are not the result of formation of domains of histone modifications, but that instead histone-modifying complexes act on pre-existing TADs to regulate chromatin state, and possibly gene expression, at the level of the entire domain. Further evidence that TADs can be regulated as units is provided by experiments where gene expression in T47D cells was induced by addition of nuclear hormones such as progestin [46]. It was found that up to 20% of the TADs behaved as discrete regulatory units where the majority of the genes embedded within them are either activated or repressed.
As mentioned above mammalian TAD positions are to a significant extent conserved between different cell types, and even between mouse and human. Despite this universal and cell-type invariant architecture, TADs are believed to be involved in highly tissue-specific gene regulation. First, using correlation analysis across large panels of cell types, target genes of cell-type specific enhancer-like elements have been predicted [11, 57]. These predictions identified enhancer-promoter pairs that are significantly enriched for pairs located within the same TAD [11]. Further, we have found that long-range looping interactions between promoters and distal regulatory elements detected by 5C are highly cell-type specific but occur within generally invariant TADs (Smith, Lajoie Jain and Dekker, unpublished results). These observations may explain another feature of TADs that is readily observed in 5C and high-resolution Hi-C datasets: whereas the boundaries of TADs are highly conserved between cell types, the internal folding and interaction patterns of TADs are highly cell type-specific [14, 15, 34, 42] and may represent intra-TAD loops between genes and regulatory elements.
Perhaps the strongest evidence that TADs correspond to functional domains is provided by a completely independent approach. Symmons and co-workers employed an enhancer trap-like strategy and generated mice with a reporter sensor construct inserted at different positions along the chromosome [38]. Analysis of expression patterns of this panel of reporters identified functional chromosomal domains: wherever the reporter is inserted within such domain, the expression pattern is the same. These results suggest that enhancers exert their activities throughout such regulatory domains to control cell type-specific expression of any receptive promoter within the region. Importantly, these regulatory domains identified based solely on gene expression patterns show a remarkable correlation with TADs. Therefore, TADs are structural as well as functional units of gene regulation.
Recently two studies have shown that genetic rearrangements that affect TAD organization alter gene expression by changing patterns of long-range enhancer – promoter interactions. An inversion on chromosome 3 is implicated in AML. It was found that this inversion disrupts two TADs at the breakpoints and results in formation of new hybrid TADs containing parts of each flanking genomic region [58]. As a result, an enhancer that normally regulates the GATA2 gene is repositioned and now located within the TAD that contains the EVI oncogene. This enhancer activates the oncogene and contributes to tumor formation. At the same time the GATA2 gene is no longer located within the TAD containing the enhancer and this leads to GATA2 haploinsufficiency which is implicated in sporadic familial AML/MDS and MonoMac/Emberger syndromes.
A second example is provided by naturally occurring genomic rearrangements involved in human limb malformations. Reconstruction of such alterations in mice showed that they affect TAD boundaries and as a result lead to ectopic enhancer-promoter connections that normally do not occur because these elements are located in different TADs [37]. Specifically, different rearrangements place a set of Epha4 enhancers within the same TAD as WNT6, IHH, or PAX3, depending on the precise nature of the TAD reorganization, leading to inappropriate interactions between the enhancers and the promoters of these genes.
Finally, RNAi-mediated knock down, or cell-type specific knockout of cohesin subunits and CTCF has confirmed that these protein complexes play roles in TAD formation [40, 41, 59]. These studies found that removal of these complexes leads to weakening of TAD boundaries with concomitant gain of ectopic interactions between genes and regulatory elements located in different TADs. This also results in some altered gene expression. It should be noted though, that the effects on chromatin organization and transcription are rather modest, which could indicate that other factors are also involved, or that the knock downs are not complete and remaining levels of proteins are sufficient for maintaining significant chromatin structure. Still, overall these examples of functional effects of genetic perturbations of TAD organization point to deep mechanistic relationships between chromatin domain formation and gene regulation.
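The notion of a weakened boundary can be made concrete with a toy calculation. The sketch below is illustrative only and is not the analysis used in the cited studies: it assumes a small symmetric matrix of binned contact frequencies, and the function name `inter_intra_ratio` and its parameters are hypothetical. It divides the mean contact frequency across a boundary by the mean frequency within the two flanking windows, so a higher value corresponds to a weaker boundary. A real analysis would also need to correct for the decay of contact frequency with genomic distance, which this sketch ignores.

```python
def inter_intra_ratio(contacts, boundary, window):
    """Ratio of mean contacts across a boundary to mean contacts within its flanks.

    contacts: square symmetric matrix (list of lists) of contact frequencies.
    boundary: index of the first bin to the right of the boundary.
    window:   number of bins considered on each side (at least 2).
    """
    if window < 2:
        raise ValueError("window must cover at least 2 bins per side")
    left = range(boundary - window, boundary)
    right = range(boundary, boundary + window)
    if left.start < 0 or right.stop > len(contacts):
        raise ValueError("window extends past the matrix")

    # Contacts between a bin on the left and a bin on the right of the boundary.
    across = [contacts[i][j] for i in left for j in right]
    # Contacts between distinct bins on the same side of the boundary.
    within = [contacts[i][j]
              for side in (left, right)
              for i in side for j in side if i < j]
    return (sum(across) / len(across)) / (sum(within) / len(within))


# Toy data (made up): two 2-bin domains separated by a boundary at bin 2.
control = [[0, 10, 1, 1],
           [10, 0, 1, 1],
           [1, 1, 0, 10],
           [1, 1, 10, 0]]
depleted = [[0, 10, 4, 4],
            [10, 0, 4, 4],
            [4, 4, 0, 10],
            [4, 4, 10, 0]]

ratio_control = inter_intra_ratio(control, boundary=2, window=2)
ratio_depleted = inter_intra_ratio(depleted, boundary=2, window=2)
```

In this made-up example the ratio rises in the "depleted" matrix, which is the pattern expected when inter-TAD interactions are gained while intra-TAD contacts are unchanged.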
We note that information on TAD organization might also prove powerful in interpreting genome-wide association studies. Such studies typically identify non-coding regions linked to disease that likely contain gene regulatory elements. Current data outlined above suggest that TAD organization will help predict target genes for these regulatory elements: target genes should be located within the same TAD. Thus, insights into the domainal organization of chromosomes and its relation to long-range gene regulation can contribute to uncovering the molecular mechanisms of the genetic basis of disease.
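The prediction rule stated above (target genes should lie within the same TAD as the regulatory element) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not an established pipeline: TADs are given as sorted, non-overlapping half-open intervals on a single chromosome, genes are represented only by a transcription start site, and the names `tad_index`, `candidate_targets`, and the example coordinates and gene names are all hypothetical.

```python
from bisect import bisect_right


def tad_index(starts, ends, pos):
    """Return the index of the TAD containing pos, or None if pos lies in no TAD."""
    i = bisect_right(starts, pos) - 1
    if i >= 0 and pos < ends[i]:
        return i
    return None


def candidate_targets(variant_pos, tads, genes):
    """List genes whose TSS falls in the same TAD as the variant.

    tads:  sorted, non-overlapping (start, end) half-open intervals.
    genes: dict mapping gene name to TSS position on the same chromosome.
    """
    starts = [s for s, _ in tads]
    ends = [e for _, e in tads]
    v = tad_index(starts, ends, variant_pos)
    if v is None:
        return []
    return sorted(name for name, tss in genes.items()
                  if tad_index(starts, ends, tss) == v)


# Hypothetical coordinates, for illustration only.
tads = [(0, 1_000_000), (1_000_000, 2_500_000)]
genes = {"geneA": 200_000, "geneB": 1_200_000,
         "geneC": 2_400_000, "geneD": 3_000_000}

targets = candidate_targets(1_900_000, tads, genes)
```

Such a list is only a set of candidates: a shared TAD narrows the search but does not by itself show that a given gene is regulated by the element.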
Other roles of TADs, globules and CIDs
Besides roles in gene regulation, TADs have also been linked to patterns of DNA replication [60]. Replication timing fluctuates along chromosomes in units of several hundred kilobases. Intriguingly, almost all TAD borders were found to be located at borders of replication domains in at least some cell types. In a given cell type, a series of adjacent TADs can all replicate early, but the transition to a late replication domain occurs at a TAD boundary. At which TAD boundary this transition in replication timing happens can depend on the cell type. Consistently, TADs typically replicated as a whole, either early or late, and switched replication timing as units during differentiation in accordance with changes in their transcriptional activity and chromatin state. The mechanisms by which replication timing is regulated at the TAD level are not known in detail.
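One simple way to quantify this correspondence is to ask what fraction of replication timing switches fall at TAD boundaries. The sketch below assumes per-bin timing calls ("E" for early, "L" for late) and TAD boundaries given as bin indices; the function names, the `tolerance` parameter, and the data are hypothetical and are not taken from the cited study.

```python
def timing_switches(timing):
    """Return bin indices i at which timing[i] differs from timing[i - 1]."""
    return [i for i in range(1, len(timing)) if timing[i] != timing[i - 1]]


def fraction_at_boundaries(timing, boundaries, tolerance=0):
    """Fraction of timing switches within `tolerance` bins of a TAD boundary.

    Returns None when the timing profile contains no switch.
    """
    switches = timing_switches(timing)
    if not switches:
        return None
    hits = sum(any(abs(s - b) <= tolerance for b in boundaries)
               for s in switches)
    return hits / len(switches)


# Made-up profile: early, then late, then early again.
timing = ["E", "E", "E", "L", "L", "E", "E"]
boundaries = [3, 6]  # bin indices at which a new TAD starts

exact = fraction_at_boundaries(timing, boundaries)
relaxed = fraction_at_boundaries(timing, boundaries, tolerance=1)
```

The tolerance is included because boundary calls and timing calls are both made on binned, noisy data, so an exact bin match is a stricter criterion than the biology requires.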
Most functional studies have focused on TADs in mammalian cells. Whether insights obtained from these studies can be extrapolated to chromatin domains in other organisms remains an open question. For instance, it is not known whether globules in S. pombe play similar roles in transcriptional control and DNA replication as TADs in mammals. Fungi such as S. pombe are not thought to regulate genes through long-range interactions between promoters and distal enhancers. Thus, it is not clear whether globules play similar roles as TADs in constraining such looping interactions. As mentioned above, cohesin mutants disrupt globule formation and also affect 3' end processing of transcripts, pointing to different roles of these structures in gene regulation.
CIDs appear to be a fundamentally different type of chromatin domain than TADs and globules, despite a similar appearance in Hi-C interaction maps. As mentioned above, CIDs have been proposed to be composed of a series of migrating plectonemes that are blocked at CID boundaries. Supercoiling of DNA is important for gene expression in bacteria, and CIDs may be important for regulation of gene expression by constraining supercoiling and preventing local dissipation of plectonemes. On the other hand, CIDs may form simply as a result of local DNA unwinding at boundaries, and may not play an active role in transcriptional control. Much more work is required to elucidate the mechanistic relationships between transcription, supercoiling and CID formation.
Finally, TADs in C. elegans present another example where they may play roles other than constraining looping interactions between genes and enhancers. In this case, the formation of TADs along the X chromosome somehow impacts gene expression uniformly all along the chromosome. How this is accomplished is not known. It is interesting to note that these TADs differ from mammalian TADs in that they depend on a condensin-related complex. On the other hand, C. elegans TADs also share features with mammalian TADs, including the presence of looping interactions between TAD boundaries, and their overall large (Mb) size.
Clearly, while TAD-like chromatin domains are observed across organisms, the mechanisms of their formation and the protein complexes involved, their internal organization and their functional roles in regulating genomic processes such as transcription and replication differ greatly.
Outstanding questions and future studies
The presence of TAD-like chromatin domains in a range of organisms is now well established, but molecular insights into the mechanisms of their formation and their roles in regulating a range of genomic activities are still largely lacking. Major questions include: 1) When and how (during development, during the cell cycle) are TADs established and how are they maintained? 2) How can adjacent TAD-like domains be prevented from mixing? Is chromatin looping between TAD boundaries sufficient for such spatial separation? 3) What is the internal organization of TADs? Is supercoiling involved in eukaryotic TADs, as it is in CID formation in bacteria? Does looping within TADs play a role in TAD formation and/or stabilization? 4) What are the internal dynamics of chromatin folding within TADs in real time in single cells? Are enhancer-promoter looping interactions within TADs stable, or dynamic? All of these questions need to be addressed in the various model organisms described above, as it is likely that different mechanisms are at work.
Some of the outstanding questions can now be experimentally addressed by genetic perturbation approaches using genome editing tools such as those based on the CRISPR/Cas9 system. There are already examples where targeted deletions of TAD boundaries [37] or of binding sites of candidate protein complexes [24] were introduced, followed by analysis of chromatin conformation by 3C-based methods and of effects on local gene expression.
To gain insights into the dynamics of TAD-like domains, live cell imaging will be essential. Again, CRISPR/Cas9-based tools could be employed to visualize targeted elements within TADs, follow their interactions over time, and possibly relate such interactions to local transcription.
Further, the molecular machines that fold chromatin at the level of TADs need to be identified and their mechanism of action elucidated. Whether the proteins that drive chromatin folding are different from those that maintain it (through the cell cycle, during DNA repair etc.) must also be explored. Several protein complexes are already known, including cohesin, condensin, and CTCF. These, and their associated molecules, provide fruitful starting points but there are likely other complexes involved as well.
Given the important roles of TADs in gene control and other processes, deeper insights into their biology promise to lead to a better understanding of how cells regulate their genome and how genetic variants can lead to inappropriate gene expression over hundreds of kilobases, leading to disease.
Acknowledgements
We would like to thank members of the Heard and Dekker labs for discussions and ideas. Supported by grants from the National Human Genome Research Institute (R01 HG003143) to J.D., and by an ERC Advanced Investigator award, EU FP7 grants SYBOSS and MODHEP, La Ligue, and Labex DEEP (ANR-11-LBX-0044), part of the IDEX PSL (ANR-10-IDEX-0001-02 PSL), to E.H.
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
References
Full text links
Read article at publisher's site: https://doi.org/10.1016/j.febslet.2015.08.044
Read article for free, from open access legal sources, via Unpaywall: https://febs.onlinelibrary.wiley.com/doi/pdfdirect/10.1016/j.febslet.2015.08.044
Funding
Funders who supported this work.
NHGRI NIH HHS (1)
Grant ID: R01 HG003143