Nucleosome organization in the Drosophila genome
Abstract
Comparative genomics of nucleosome positions provides a powerful means for understanding how the organization of chromatin and the transcription machinery co-evolve. Here we produce a high-resolution reference map of H2A.Z and bulk nucleosome locations across the genome of the fly D. melanogaster, and compare it to that from the yeast S. cerevisiae. As in Saccharomyces, Drosophila nucleosomes are organized around active transcription start sites in a canonical −1, NFR (nucleosome-free region), +1 arrangement. However, Drosophila does not incorporate H2A.Z into the −1 nucleosome and does not bury its transcriptional start site in the +1 nucleosome. At thousands of genes, RNA polymerase II engages the +1 nucleosome and pauses. How the transcription initiation machinery contends with the +1 nucleosome appears to be fundamentally different between lower and higher eukaryotes.
Knowledge of the precise location of nucleosomes in a genome is essential in order to understand the context in which chromosomal processes such as transcription and DNA replication operate. A common theme to emerge from recent genome-wide maps of nucleosome locations is a general deficiency of nucleosomes in promoter regions and an enrichment of certain histone modifications towards the 5′ end of genes1–7. A high resolution genomic map of nucleosome locations in the budding yeast S. cerevisiae has further revealed the nucleosomal context of cis-regulatory elements and transcriptional start sites1–7. However such context has not been established in multicellular eukaryotes, and so fundamental questions remain: 1) Is there a common theme by which genes of multicellular eukaryotes position their nucleosomes with respect to functional chromosomal elements? 2) Are such themes and their underlying rules evolutionarily conserved across eukaryotes? 3) What are the functional implications for those themes that differ across the major eukaryotic lines? To address these questions we have produced a genome-wide high resolution map of H2A.Z/H2Av and bulk nucleosome locations in the embryo of the fruit fly D. melanogaster. H2A.Z is widely distributed in Drosophila8, but some evidence points to specialized roles9,10. In Saccharomyces, H2A.Z replaces H2A at the 5′ end of active genes11–14, and thus provides a focused representation of promoter chromatin architecture.
Drosophila embryos are composed of a wide variety of cell types in which subsets of genes may elicit distinct gene expression programs15,16. Global gene expression profiles during all stages of Drosophila development from 8–12 hrs post fertilization to a young adult fly are correlated (Fig. S1), which possibly reflects the broad expression pattern of the large repertoire of house-keeping genes in most cell types during development15,16. This general spatial and temporal independence of gene expression provides impetus to use whole embryos to develop a reference nucleosome map. Indeed, our map reveals that nucleosomes are generally well organized, despite cell type heterogeneity.
Open and closed chromatin structures are linked to transcription, H2A.Z, and core promoter elements
Embryos were treated with formaldehyde, and H2A.Z nucleosome core particles were immunopurified (Figs. S2–S3). 652,738 H2A.Z-containing nucleosomes were sequenced (Fig. S4), and mapped to 207,025 consensus locations in the Drosophila r5.2 reference genome (Figs. 1a and S2b, see browser at http://atlas.bx.psu.edu/), thereby providing >3-fold depth of coverage (Fig. S5). Correction for micrococcal nuclease (MNase) digestion bias was imposed (Fig. S6). Those 112,750 nucleosomes detected three or more times were further analyzed, although patterns were identical when all nucleosomes were analyzed. The internal median error of the data was 4 bp (Fig. S7).
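The depth of coverage and the read-support filter described above can be checked with a few lines of Python. This is a minimal sketch, not the authors' pipeline: the `Nucleosome` record, the function names, and the toy loci are hypothetical, and only the two totals (652,738 and 207,025) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nucleosome:
    """A consensus nucleosome location (hypothetical record layout)."""
    chrom: str
    midpoint: int    # consensus midpoint coordinate, in bp
    read_count: int  # number of sequenced nucleosomes supporting this location


def depth_of_coverage(total_reads: int, consensus_locations: int) -> float:
    """Average number of sequenced nucleosomes per consensus location."""
    return total_reads / consensus_locations


def filter_by_support(nucleosomes, min_reads=3):
    """Keep consensus locations detected at least `min_reads` times."""
    return [n for n in nucleosomes if n.read_count >= min_reads]


# Totals reported in the text.
depth = depth_of_coverage(652_738, 207_025)
print(f"{depth:.2f}-fold")  # about 3.15, consistent with ">3-fold"

# Made-up loci, only to show the filter.
toy = [
    Nucleosome("2L", 1000, 1),
    Nucleosome("2L", 1175, 3),
    Nucleosome("2L", 1350, 7),
]
kept = filter_by_support(toy)
```

The threshold of three mirrors the "detected three or more times" criterion; the text notes that patterns were identical without it.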
Fig. 1b displays the predominant embryonic distribution of H2A.Z nucleosomes relative to the transcription start site (TSS) of all coding genes, and is compared to the pattern previously derived from Saccharomyces1. Patterns around noncoding genes are shown in Fig. S8. 11,994 of the 14,143 Drosophila coding genes (85%) contained at least one H2A.Z nucleosome (detected three or more times) within 1 kb of the TSS. H2A.Z levels correlated with gene expression (Figs. 1c and S9), as has been seen on individual genes and in Saccharomyces12,13,17.
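The 85% figure is the fraction of coding genes with at least one filtered H2A.Z nucleosome within 1 kb of the TSS. One way such a count could be made is sketched below, assuming a single chromosome and nucleosome midpoints as coordinates. The function names and toy coordinates are illustrative assumptions; the source does not describe how the count was implemented.

```python
from bisect import bisect_left


def has_nucleosome_near(tss, sorted_midpoints, window=1000):
    """True if any nucleosome midpoint lies within `window` bp of the TSS."""
    # First midpoint at or beyond the left edge of the window.
    i = bisect_left(sorted_midpoints, tss - window)
    return i < len(sorted_midpoints) and sorted_midpoints[i] <= tss + window


def fraction_with_nucleosome(tss_list, midpoints, window=1000):
    """Fraction of TSSs with at least one nucleosome within `window` bp."""
    mids = sorted(midpoints)
    hits = sum(has_nucleosome_near(t, mids, window) for t in tss_list)
    return hits / len(tss_list)


# Gene counts reported in the text.
reported = 11_994 / 14_143
print(f"{reported:.1%}")

# Made-up coordinates: the first and third TSS have a nearby nucleosome.
toy_fraction = fraction_with_nucleosome(
    [5000, 20000, 40000], [4200, 21500, 39990]
)
```

A real analysis would repeat this per chromosome.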
H2A.Z nucleosomes were predominantly distributed at 175 bp intervals from the TSS (compared to 165 bp in Saccharomyces1, Fig. 1b), demonstrating that a predominant organizational pattern exists for H2A.Z nucleosomes in Drosophila embryos that transcends a spatial and temporal context. The H2A.Z pattern was compared to the distribution of bulk nucleosomes (i.e., those containing any combination of H2A.Z and H2A), determined using high density tiling arrays (36 bp probe spacing). Within genic regions the same organizational pattern was found (Fig. S10). For both datasets, a nucleosome-depleted region was evident immediately upstream of the +1 nucleosome, which likely reflects a nucleosome-free core promoter region (NFR), as first detected in Saccharomyces7. As in Saccharomyces, a −1 nucleosome was detected ~180 bp upstream of the TSS. In contrast to Saccharomyces, however, it lacked H2A.Z.
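A composite distribution of this kind can be built by expressing each nucleosome midpoint as a strand-aware distance from its gene's TSS and binning the distances. The sketch below is one simple way to do so and is not the authors' code. The data layout, function names, and 10 bp bin size are assumptions, and the toy midpoints are placed at +135, +310, and +485, that is, 175 bp apart.

```python
from collections import Counter


def relative_position(midpoint, tss, strand):
    """Signed distance from the TSS; positive means downstream in the
    direction of transcription."""
    return midpoint - tss if strand == "+" else tss - midpoint


def composite_profile(genes, midpoints_by_gene, bin_size=10, window=1000):
    """Count nucleosome midpoints per distance bin across all genes.

    genes: {gene_id: (tss, strand)}
    midpoints_by_gene: {gene_id: [midpoint, ...]}
    Returns a Counter keyed by the left edge of each bin.
    """
    counts = Counter()
    for gene_id, (tss, strand) in genes.items():
        for mid in midpoints_by_gene.get(gene_id, []):
            rel = relative_position(mid, tss, strand)
            if -window <= rel <= window:
                counts[(rel // bin_size) * bin_size] += 1
    return counts


# Made-up genes on opposite strands.
genes = {"geneA": (1000, "+"), "geneB": (5000, "-")}
midpoints = {"geneA": [1135, 1310, 1485], "geneB": [4865, 4690]}
profile = composite_profile(genes, midpoints)
```

Peaks in such a profile mark the preferred nucleosome positions, and the distance between adjacent peaks gives the spacing.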
Surprisingly, the genic array of Drosophila nucleosomes started ~75 bp further downstream than the equivalent position in Saccharomyces, placing the +1 nucleosome at +135 (Figs. 1b and S10). This shift has important implications for how the TSS is presented to RNA polymerase II (Pol II). In Saccharomyces, the TSS resides within the nucleosome border, potentially allowing the nucleosome to regulate start site selection and efficiency1. In Drosophila, the predominant arrangement of nucleosomes might allow unimpeded access to the TSS with potential blockage occurring downstream after initiation.
Drosophila has well-defined core promoter elements such as TATA, Initiator (Inr), DPE, and MTE, which bind the general transcription machinery18–22, although these elements are not found in most genes. For genes lacking these core promoter elements or having a DPE, the canonical nucleosome organization was observed (black pattern in Fig. S11), which was more robust when only H2A.Z-containing nucleosomes were examined (blue pattern). In contrast, genes containing TATA, Inr, or MTE had a diminished canonical nucleosome organization and a diminished NFR, indicating that these classes of genes may have a more compact and gene-specific chromatin architecture, including a positioned nucleosome over the TSS. Consequently, they might be more dependent upon chromatin remodelling for expression. When genes become transcriptionally competent, resident nucleosomes could adopt a more open and canonical organization, which includes replacing H2A with H2A.Z. Three observations support this hypothesis. First, H2A.Z and bulk nucleosomes at highly expressed genes were more uniformly organized than those at lowly expressed genes (Fig. S9). Second, bulk nucleosomes for genes that contained H2A.Z at their 5′ end displayed the canonical pattern, while those lacking H2A.Z did not (Fig. S10, black trace vs red trace). Third, within any class of genes except those having an Initiator, H2A.Z nucleosomes adopted a more canonical organization than the bulk set of nucleosomes (Fig. S11). These results suggest that transcription and the presence of H2A.Z are linked to an open and uniform chromatin architecture at promoter regions.
Conserved DNA motifs and H2A.Z nucleosomes are organized around each other
Recent genome sequencing of 12 Drosophila species of differing evolutionary distance has provided an unprecedented opportunity to identify conserved DNA sequence motifs23. In comparing the distribution of motifs around the TSS23, we found four recurring patterns: 27 motifs were classified as “nucleosomal”, 57 as “anti-nucleosomal”, 12 as “fixed”, and 98 as “random” (left panels in Fig. 2a and Fig. S12). “Nucleosomal” and “anti-nucleosomal” patterns matched the general distribution of where nucleosomes were relatively enriched or depleted, respectively, relative to the TSS (see Fig. 1b). “Fixed” elements were at a defined distance from the TSS, and “random” elements lacked patterning. The “nucleosomal” and “anti-nucleosomal” patterns suggest that certain motifs are organized to be downstream of the TSS in the midst of nucleosomal arrays, while others are organized to be upstream of the TSS, where nucleosomes are relatively depleted.
We examined the organizational relationship of these DNA motifs to individual H2A.Z nucleosomes genome-wide (right panels of Fig. 2a and Fig. S12, and all motifs in Fig. 2b). Strikingly, “nucleosomal” motifs were consistently enriched on the H2A.Z nucleosome surface, whereas “anti-nucleosomal” motifs were consistently depleted. Individual “fixed” motifs were mostly depleted of H2A.Z nucleosomes. These findings along with several controls (Fig. S13) suggest that motifs and nucleosomes adopt a preferred organization around each other, regardless of their genomic location. This organization could be linked to co-evolution of base sequence composition bias in and around nucleosomes. The functional importance of such context remains to be determined.
Drosophila uses a CC/GG patterning rather than AA/TT for demarcating nucleosome positions
We examined whether the positions of Drosophila H2A.Z nucleosomes are at least partly defined by the underlying DNA sequence pattern, and whether such a pattern might be evolutionarily conserved. We determined the frequency of dinucleotides across Drosophila H2A.Z nucleosomal DNA because 10 bp periodic patterns of certain dinucleotides enhance the wrapping and positioning of DNA around the histone core (Figs. 3a and S14). As seen in Saccharomyces, 10 bp periodic patterns of A/T dinucleotides running counter-phase to G/C dinucleotides were observed. The modest amplitudes of the pattern suggest that such periodicities are infrequent, and thus used selectively (i.e., most nucleosomes lack underlying positioning signals).
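A positional dinucleotide frequency profile, and a simple test for 10 bp periodicity in it, can be sketched as follows. This illustrates the general idea only: the source does not state how its profiles were computed. The function names, the choice of autocorrelation as the periodicity measure, and the synthetic sequences (AA every 10 bp) are all assumptions made for this example.

```python
def dinucleotide_frequency(seqs, dinucs):
    """Fraction of aligned sequences carrying one of `dinucs` at each position."""
    length = min(len(s) for s in seqs)
    freq = []
    for i in range(length - 1):
        hits = sum(1 for s in seqs if s[i:i + 2].upper() in dinucs)
        freq.append(hits / len(seqs))
    return freq


def autocorrelation(values, lag):
    """Normalized autocorrelation of a profile at the given lag."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values)
    if var == 0:
        return 0.0
    cov = sum(
        (values[i] - mean) * (values[i + lag] - mean)
        for i in range(len(values) - lag)
    )
    return cov / var


AT_DINUCS = {"AA", "TT", "AT", "TA"}

# Synthetic 60 bp sequences with an AA step every 10 bp (not real data).
toy_seqs = ["AAGCGCGCGC" * 6] * 4
freq = dinucleotide_frequency(toy_seqs, AT_DINUCS)
ac10 = autocorrelation(freq, 10)  # in phase with the 10 bp repeat
ac5 = autocorrelation(freq, 5)    # half a period out of phase
```

With real nucleosomal DNA the sequences would first be aligned on the nucleosome midpoint, and the amplitudes would be far more modest than in this idealized case.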
We further investigated the rules of nucleosome positioning by scanning promoter regions for correlations to nucleosome positioning sequences previously identified for a relatively small number of yeast or human nucleosomes24, in which AA/TT (yeast25 and worms26) or CC/GG (human)27 dinucleotides occur in a biased and/or periodic arrangement across nucleosomal DNA. Unlike in yeast, the AA/TT positioning pattern failed to identify nucleosome locations (Fig. 3b, black trace). However, the CC/GG pattern (Fig. S15) reproduced the exact position of the +1 nucleosomes (Fig. 3b, red trace), indicating that the Drosophila +1 nucleosome may be positioned in part by CC/GG-based positioning sequences that are utilized preferentially in metazoans. Consistent with this, +1 nucleosomes are highly positioned around the 5′ end of genes (Fig. S16).
Nucleosome-free regions reside at the end of active genes
Despite H2A.Z being enriched at the 5′ end of genes, substantial levels were detected throughout the genome, which allowed us to examine nucleosome organization at the 3′ end of genes (Figs. 4a and S17a). Strikingly, H2A.Z nucleosome levels spiked near the ORF end points then dropped precipitously further downstream into the intergenic regions, where transcripts terminate. The spike occurred ~30 bp upstream from the stop codon and ~160 bp upstream of the transcript polyA site. A similar nucleosome drop-off was seen when bulk nucleosomes were examined (Fig. S17b), but was not evident at genes that lacked H2A.Z. Thus, like the 5′ end, the presence of H2A.Z may be linked to a more open chromatin architecture at the 3′ end of genes. The change in nucleosome density coincided with alterations in nucleosome positioning sequences (Fig. 4b). Thus, such “3′-NFRs” might be defined in part by the underlying DNA sequence. Conceivably, 3′ NFRs might function in transcription termination.
RNA polymerase II contacts the +1 nucleosome and pauses
The location of the +1 nucleosome at the 5′ end of genes is striking because its upstream border resides at approximately +62 (relative to the TSS), which is near where Pol II pauses during the transcription cycle3,28–32. To examine the potential linkage between Pol II pausing and nucleosome positions, we first determined the genome-wide location of Pol II in embryos at 1,956 putatively paused genes (Fig. 5a). Pol II was concentrated in a ~300 bp region that peaked around +90, which overlaps the region bound by the +1 nucleosome, and is consistent with other recent placements30–32. Indeed, the distribution of paused Pol II, as directly measured by permanganate reactivity of thymines on a statistically robust subset of ~50 genes (yellow trace in Figs. 5a and S18a), indicates that pausing occurs between +20 and +50 with the center at +35 (ref. 30). These high-resolution permanganate footprinting data, which represent the most definitive means of assessing Pol II pausing, place the front edge of Pol II (~16 bp downstream of the bubble33) within ~10 bp of the +1 nucleosome border.
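The geometry behind the "within ~10 bp" statement follows from the coordinates given in the text, as the short worked calculation below shows. All four input numbers come from the text (the +1 midpoint at +135, the upstream border at about +62, the pause center at +35, and the ~16 bp front-edge offset); the variable names are illustrative only.

```python
# All coordinates are in bp relative to the TSS, taken from the text.
PLUS1_MIDPOINT = 135        # position of the +1 nucleosome
PLUS1_UPSTREAM_BORDER = 62  # approximate upstream border of the +1 nucleosome
PAUSE_CENTER = 35           # center of the permanganate-reactive bubble
FRONT_EDGE_OFFSET = 16      # Pol II front edge, bp downstream of the bubble

# Distance from the nucleosome midpoint to its upstream border.
half_width = PLUS1_MIDPOINT - PLUS1_UPSTREAM_BORDER

# Approximate position of the leading edge of paused Pol II.
front_edge = PAUSE_CENTER + FRONT_EDGE_OFFSET

# Remaining gap between Pol II and the +1 nucleosome border.
gap = PLUS1_UPSTREAM_BORDER - front_edge

print(f"half-width {half_width} bp, front edge +{front_edge}, gap {gap} bp")
```

The 73 bp half-width corresponds to a footprint of roughly 147 bp, the size of a nucleosome core particle, and the resulting 11 bp gap matches the "~10 bp" quoted above.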
The location of the +1 H2A.Z nucleosome was similar (but not identical) whether or not paused Pol II was present (Fig. 5b), indicating that Pol II was not likely to be the cause of the nucleosome shift compared to Saccharomyces. Rather, the positioned +1 nucleosome might be contributing to Pol II pausing, which is consistent with other studies34–37. Other factors including NELF are likely to make significant contributions to pausing as well30,38,39.
Intriguingly, genes that contained a paused Pol II showed a ~10 bp downstream shift of H2A.Z nucleosomes (P-value = 10⁻⁹; Fig. 5b). The same shift was observed if H2A.Z sequencing reads (rather than nucleosomes) or bulk nucleosomes were plotted (Fig. S19a,b). The shift suggests that as part of the pausing process, Pol II collides with the +1 nucleosome, possibly displacing it downstream by one turn of the DNA helix. If the downstream nucleosomes are positioned in large part by the principles of statistical positioning40,41, rather than the underlying DNA sequence, then a shift of the +1 nucleosome is expected to have a ripple effect on downstream nucleosomes.
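A shift of this kind can be quantified by comparing +1 nucleosome positions between paused and non-paused genes. The text does not say which statistical test produced the reported P-value, so the sketch below substitutes a generic permutation test on the difference of medians, purely as an illustration. The function names and the toy positions are assumptions, not data from the study.

```python
import random
from statistics import median


def median_shift(paused, not_paused):
    """Difference in median +1 nucleosome position (paused minus non-paused)."""
    return median(paused) - median(not_paused)


def permutation_p_value(paused, not_paused, n_perm=2000, seed=0):
    """Two-sided permutation P-value for the difference of medians."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(median_shift(paused, not_paused))
    pooled = list(paused) + list(not_paused)
    k = len(paused)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(median(pooled[:k]) - median(pooled[k:])) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (extreme + 1) / (n_perm + 1)


# Made-up +1 positions: the paused group sits 10 bp further downstream.
offsets = range(-5, 6)
not_paused = [135 + d for d in offsets]
paused = [145 + d for d in offsets]

shift = median_shift(paused, not_paused)
p_value = permutation_p_value(paused, not_paused)
```

With `n_perm` permutations the smallest attainable P-value is 1/(n_perm + 1), so a value as small as the one reported would require a parametric or rank-based test, or far more permutations.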
To test the prediction that Pol II is engaging the +1 nucleosome, bulk mononucleosomes were prepared from formaldehyde crosslinked embryos and immunoprecipitated with antibodies directed against Pol II. DNA corresponding to mononucleosomes (~150 bp) was gel-purified and mapped to the entire Drosophila genome with high resolution tiling arrays. Fig. 5c (black trace) shows that the distribution of nucleosome-Pol II crosslinking at Pol II-paused genes peaked at the +1 nucleosome. This was not seen at genes lacking a paused Pol II or H2A.Z. The selective enrichment at +1 demonstrates that Pol II is predominantly engaged with the +1 nucleosome, and thus the +1 nucleosome may be instrumental in establishing the paused state.
Conclusions
The high resolution map of Drosophila nucleosomes reveals evolutionarily conserved and divergent principles of nucleosome organization. Genes that possess H2A.Z nucleosomes are likely to have experienced a transcription event. They tend to have nucleosome-free promoter and termination regions and intervening arrays of uniformly positioned nucleosomes that become less uniform towards the 3′ end of the gene. H2A.Z nucleosomes in general might not block assembly of the transcription machinery at transcriptionally “experienced” promoters. However, repressed promoters or those containing Initiator elements do appear to have an H2A nucleosome over the TSS.
Conserved DNA sequence motifs (and thus any proteins that bind to them) tend to have an organizational relationship with nucleosomes. “Anti-nucleosomal” motifs including those for proteins such as engrailed, even skipped, fushi tarazu, giant, hunchback, and knirps tend to be located upstream of the TSS and might contribute to the exclusion of nucleosomes over the core promoter. Indeed some have anti-nucleosomal activity42,43. “Nucleosomal” motifs include sites for achaete, antennapedia, dorsal, tramtrack, and others. Their preference for locations downstream of the TSS where nucleosomes are well organized raises the possibility that they contribute to nucleosome organization.
In Saccharomyces, the location of the TSS just inside the +1 nucleosome border allows the nucleosome to potentially exert control over initiation, whereas in Drosophila, the general case may be to position the +1 nucleosome to interact with a transcriptionally engaged paused polymerase. Whether the +1 nucleosome is causative or just participatory in the pausing is not known. It is now becoming clear that metazoans regulate transcription in large part through Pol II pausing rather than solely through transcription complex assembly3,31,32,44. The nucleosome map and its context to DNA regulatory elements, presented here, provide a framework for designing experiments and analyzing existing data to understand how metazoans regulate transcription.
METHODS SUMMARY
D. melanogaster embryos (0–12 hr) were collected and crosslinked with formaldehyde. H2A.Z was immunoprecipitated from chromatin digested with MNase. Mononucleosomal DNA was gel-purified and sequenced using Roche GS20/FLX pyrosequencing technology1,45. Chromatin from crosslinked embryos was also solubilized by sonication and/or MNase digestion, where indicated, and Pol II immunoprecipitated. Bulk nucleosomes were not immunoprecipitated. MNase-treated samples were gel-purified in the 75–200 bp range. DNA samples were then hybridized to Affymetrix Drosophila tiling microarrays (36 bp average probe spacing).
Supplementary Material
Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.
Supplementary Information is linked to the online version of the paper at www.nature.com/nature.
Acknowledgments
This work was supported by grants HG004160 (BFP), and GM47477 (DSG). We thank M. Biggin for early access to the Pol II ChIP-chip data, Ruopeng Fan for supplying the rpb3 antibody, and Chanhyo Lee for help in identifying paused Pol II.
Footnotes
Author Information Sequence data deposition is through NCBI Trace Archives TI SRA000283, Sequencing Center = “CCGB”, and microarray deposition through ArrayExpress, Accession numbers E-MEXP-1515 and -1519. Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interest.
Author Contributions T.M. prepared and purified the nucleosomes including Pol II-bound nucleosomes; C.J. analyzed the nucleosome mapping data and its relationship to other genomic features; I.P.I. performed computational analyses related to nucleosome positioning sequences; X.L. conducted ChIP-chip on Pol II; B.J.V. conducted ChIP-chip on nucleosome-Pol II interactions; S.J.Z. provided bioinformatics support; L.T. constructed libraries and sequenced nucleosomal DNA; J.Q. mapped sequencing reads to the yeast genome; R.G. provided H2A.Z antibodies; S.C.S. directed the DNA sequencing phase; D.S.G. directed embryo preparations and helped interpret the data; I.A. developed computational approaches to derive nucleosome maps from the read locations and developed the associated browser; B.F.P. directed the project, interpreted the data, and wrote the paper.
Read article at publisher's site: https://doi.org/10.1038/nature06929
Funding
Funders who supported this work.
NHGRI NIH HHS (4)
Grant ID: R01 HG004160-01A1
Grant ID: R56 HG004160
Grant ID: HG004160
Grant ID: R01 HG004160
NIGMS NIH HHS (2)
Grant ID: R01 GM047477
Grant ID: GM47477