High-resolution genome-wide mapping of the primary structure of chromatin.

Zhang Z¹, Pugh BF

Affiliations

1. Center for Comparative Genomics and Bioinformatics, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA.

ORCIDs linked to this article

Zhang Z | 0000-0002-4310-0525

Cell, 01 Jan 2011, 144(2):175-186
https://doi.org/10.1016/j.cell.2011.01.003 PMID: 21241889 PMCID: PMC3061432


Abstract

The genomic organization of chromatin is increasingly recognized as a key regulator of cell behavior, but deciphering its regulatory mechanisms requires detailed knowledge of chromatin's primary structure: the assembly of nucleosomes throughout the genome. This Primer explains the principles for mapping and analyzing the primary organization of chromatin on a genomic scale. After introducing chromatin organization and its impact on gene regulation and human health, we describe methods that detect nucleosome positioning and occupancy levels using chromatin immunoprecipitation in combination with deep sequencing (ChIP-Seq), a strategy that is now straightforward and cost-efficient. We then explore current strategies for converting the sequence information into knowledge about chromatin, an exciting challenge for biologists and bioinformaticians.


Cell. Author manuscript; available in PMC 2012 Jan 21.

Published in final edited form as:

Cell. 2011 Jan 21; 144(2): 175–186.


NIHMSID: NIHMS264968



Zhenhai Zhang and B. Franklin Pugh*


The publisher's final edited version of this article is available at Cell



Keywords: Nucleosomes, histones, genomics, methods, deep sequencing, chromatin, genome-wide

Chromatin regulates remarkably diverse processes in eukaryotic organisms, from development and disease progression to cognition and aging. Not surprisingly then, deciphering how chromatin directs gene expression continues to be a major research priority. Chromatin is genomic DNA compacted into chromosomes by histone proteins, but RNA and other proteins are also important constituents. The fundamental repeating unit and building block of chromatin is the nucleosome; each nucleosome contains ~147 base pairs of DNA wrapped approximately twice around a protein core that consists of two copies each of the four histones H2A, H2B, H3, and H4.

Nucleosomes may contain histone variants, such as H2A.Z and H3.3, which are found at active genes (Malik and Henikoff, 2003). In addition, nucleosomes are often decorated with posttranslational modifications at specific amino acids of their histones. These modifications include acetylation, methylation, phosphorylation, ubiquitination, citrullination, SUMOylation, and ADP ribosylation (Kouzarides, 2007). The histone variants and modifications impart functionality to nucleosomes, such as the regulation of gene expression and the compaction of chromatin into higher ordered structures.

Indeed, a histone code may exist in which specific combinations of histone variants and modifications provide landmarks for gene regulatory proteins. These landmarks designate not only the start and end of genes but also the transcriptional status of a gene (Jenuwein and Allis, 2001). For example, trimethylation at lysine 4 on histone H3 (H3K4me3) marks the 5’ region of active genes, whereas trimethylation of lysine 36 (H3K36me3) marks the middle-3’ region of these genes. These histone modifications and variants may provide a “global positioning system” for assembly of proteins that regulate chromatin and the transcription machinery.

The primary structure of chromatin consists of nucleosomes organized in and around genes (Figure 1) with an array of uniformly-spaced nucleosomes beginning at a fixed distance immediately downstream of most transcriptional start sites (Jiang and Pugh, 2009; Yuan et al., 2005). However, chromatin is more than this simple “beads-on-a-string” model. Nucleosomes dynamically interconvert to more compacted units, which further fold into higher-order structures. Histone modifications, in particular phosphorylation, likely direct chromatin folding and compaction, leaving genomic regions to reside in specific domains of the nucleus.


Figure 1

Chromatin architecture

The primary structure of chromatin can be thought of as “beads-on-a-string” with uniformly-spaced arrays of nucleosomes at a fixed distance downstream of transcriptional start sites. With the exception of specific regulatory situations, intact nucleosomes generally avoid the core promoter region, where the transcription machinery assembles. These nucleosome-free regions provide an opportunity to regulate gene expression at steps beyond simple promoter access, for example through elongation control of RNA polymerase II (Core and Lis, 2008). The protein core of nucleosomes is composed of histones, which often contain posttranslational modifications on specific amino acids and can be replaced by transcription-linked histone variants (dark blue and purple). As depicted, chromatin also folds into more compact structures aided by certain histone modifications.

In this Primer, we introduce experimental and analytical strategies for genome-wide characterization of chromatin at its primary organizational level: nucleosome positioning, occupancy, variant composition, and modifications. (For genome-wide methods that map long-range interactions in higher-order chromatin, we refer readers to Lieberman-Aiden et al. (2009).) First, we describe examples of chromatin research that have benefited or may benefit from nucleosome mapping studies, such as stem cell reprogramming and cancer therapeutics. Then we describe experimental considerations for mapping nucleosomes, and we conclude with a discussion of the computational strategies used to analyze large nucleosomal datasets.

CHROMATIN MAPPING IMPACTS DIVERSE RESEARCH AREAS

Studies dedicated to characterizing the primary structure of chromatin on a genomic scale aim to understand: 1) how nucleosomes become organized across a genome; 2) how this organization influences evolution; and, 3) how nucleosome organization regulates genes and other chromosomal elements, ultimately in relation to their impact on human health.

Basic Organization of Nucleosomes

We know that general features of DNA sequences either favor or disfavor nucleosome formation. For example, sequences with high GC content or with AA or TT dinucleotides in periodic 10 base-pair intervals favor nucleosome formation, whereas sequences with tracts of deoxyadenosine nucleotides (poly(dA:dT)) disfavor nucleosome deposition (Hughes and Rando, 2009; Jiang and Pugh, 2009; Segal and Widom, 2009). However, beyond these general principles the details for how DNA sequence and cellular factors influence the positioning, occupancy, and other properties of nucleosomes are still unknown (Figure 2A).
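The sequence features named above can be checked on any stretch of DNA with a few lines of code. The following is a minimal Python sketch, not a published nucleosome-prediction model: the function names and the toy sequence are hypothetical, and counting dinucleotide pairs at a fixed spacing is only one simple way to look for ~10 base-pair periodicity.

```python
import re

def dinucleotide_positions(seq, dinucs=("AA", "TT")):
    """Return 0-based start positions of the given dinucleotides in seq."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] in dinucs]

def pairs_at_distance(positions, distance):
    """Count position pairs separated by exactly `distance` base pairs."""
    occupied = set(positions)
    return sum(1 for p in positions if p + distance in occupied)

def longest_homopolymer_tract(seq):
    """Length of the longest uninterrupted run of A or of T (a poly(dA:dT) tract)."""
    runs = re.findall(r"A+|T+", seq.upper())
    return max((len(run) for run in runs), default=0)

# Toy sequence with an AA dinucleotide every 10 bp (illustrative, not real data).
toy = ("AA" + "GCGCGCGC") * 5
aa_tt = dinucleotide_positions(toy)
periodic_pairs = pairs_at_distance(aa_tt, 10)   # pairs in phase with the helix
off_phase_pairs = pairs_at_distance(aa_tt, 5)   # pairs half a turn out of phase
tract = longest_homopolymer_tract("GCAAAAAAAGC")
```

An excess of in-phase pairs over off-phase pairs would be consistent with a nucleosome-favoring sequence, whereas a long homopolymeric tract would argue against nucleosome deposition; real analyses would compare such counts against a genomic background.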


Figure 2

Nucleosomal properties measured by high-resolution mapping studies

A. Numerous properties of individual nucleosomes can be extracted from histone mapping studies, including the spacing between nucleosomes, the presence of histone variants and posttranslational modifications, nucleosome fuzziness, occupancy, and position relative to a genomic feature. Fuzziness is the degree to which a nucleosome deviates from its consensus position in a population measurement. Occupancy is a measure of nucleosome density. B. Even a small change in a nucleosome’s location can alter a sequence’s accessibility to regulatory proteins. This schematic illustrates the rotational accessibility and inaccessibility of the DNA major groove on the surface of the core histone complex.

Many sequence-specific DNA-binding proteins determine their genomic locations by making precise molecular interactions with as few as 6–8 base pairs of DNA. An equivalent level of specificity applied to nucleosomal DNA but dispersed over its ~147 base pairs would be difficult to discern. Moreover, unlike sequence-specific DNA binding proteins, nucleosomes do not adopt a single position. In a population of molecules, such positions can be quite variable or “fuzzy” (Figure 2A). As non-overlapping “beads-on-a-string,” the position of one nucleosome restricts the possible positions of adjacent nucleosomes. Consequently, the determinants of a nucleosome position may have distant origins, being propagated through adjacent nucleosomes.
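To make the notions of consensus position, occupancy, and fuzziness concrete, the sketch below summarizes a set of mapped nucleosome midpoints that have already been assigned to one nucleosome. It is a minimal Python illustration under stated assumptions: the function name is hypothetical, occupancy is taken as the raw number of midpoints, and fuzziness is taken as the population standard deviation of those midpoints, which is one reasonable operational definition rather than the only one.

```python
from statistics import mean, pstdev

def nucleosome_summary(midpoints):
    """Summarize one nucleosome from the midpoints of its mapped fragments.

    position  : consensus midpoint (rounded mean coordinate)
    occupancy : number of midpoints supporting this nucleosome (assumed measure)
    fuzziness : population standard deviation of the midpoints (assumed measure)
    """
    if not midpoints:
        raise ValueError("at least one midpoint is required")
    return {
        "position": round(mean(midpoints)),
        "occupancy": len(midpoints),
        "fuzziness": pstdev(midpoints),
    }

# Two toy nucleosomes: one well positioned, one fuzzy (illustrative values).
well_positioned = nucleosome_summary([100, 100, 100, 100])
fuzzy = nucleosome_summary([90, 110])
```

Both toy nucleosomes share the same consensus position, but the second has a larger spread around it, which is what the term "fuzzy" is meant to capture.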

Furthermore, chromatin remodeling complexes, such as SWR1, ISW2, and SWI/SNF, directly regulate nucleosome composition, positions, and occupancy levels, respectively. We know that these protein machines use the energy of ATP hydrolysis to drive nucleosomes to override intrinsic preferences for DNA sequence, but we do not know how they contribute to the organization that predominates in and around genes. Experiments aimed at understanding this question often delete or deplete remodeling complexes in vivo and then assess how the absence of these factors affects nucleosome organization.

Nucleosome Organization Influences Evolution

Nucleosome organization in vivo is not random, and it is now clear that DNA accessibility imparted by nucleosome positions alters DNA susceptibility to mutations, insertions, and deletions. This differential susceptibility shapes the genomic landscape, with mutations tending to be on nucleosomal DNA and insertions and deletions tending to be in linker regions (Sasaki et al., 2009). Nucleosome organization might impact human diversity, whereby single nucleotide polymorphisms tend to be enriched in nucleosome-free promoter regions near nucleosomal edges (Schuster et al., 2010). Further research in this area may be directed at understanding how nucleosome organization shapes the evolution of promoter and enhancer elements, including sequence conservation of cis-regulatory elements.

Nucleosome Organization Regulates Gene Expression

Repositioning a nucleosome by as little as a few base pairs may be sufficient to change the accessibility of a DNA regulatory element. If the element is located on linker DNA between nucleosomes, then it may be accessible (Figure 2B). If the element resides on nucleosomal DNA, then it may be inaccessible, particularly if the helical twist of the DNA turns the site inward towards the histone core (Jiang and Pugh, 2009; Segal and Widom, 2009).

An alternative means for enhancing the accessibility of regulatory elements and coding sequences is through a complete or partial dismantling of nucleosomes. Such remodeling has been characterized by perturbing the cellular environment and mapping the resulting nucleosome reorganization (Schones et al., 2008). For example, genes that are induced by heat shock tend to lose nucleosomes in their promoter region (Shivaswamy et al., 2008). Experiments aimed at understanding the underlying mechanism of remodeling involve depleting or deleting remodeling factors and then examining how nucleosome positions and occupancy levels change. One general rule emerging from these experiments is that robust transcriptional activity involves nucleosome depletion whereas transcriptional regulation may involve nucleosome repositioning.

Histone modifications and variants have emerged as a fascinating yet still enigmatic means by which nucleosomes regulate gene expression. Histone variants, such as H2A.Z and H3.3, as well as certain acetylation sites, such as H3K9, 14 and H4K5, 8, 12, 16, create nucleosomes that may facilitate the eviction and/or repositioning of nucleosomes during transcription. Modifications, such as H3K4me3 and H3K36me3, bind proteins involved in the transcription cycle whereas other marks, such as H3K9me3 and H3K27me3, bind proteins that create inaccessible repressive chromatin (Kouzarides, 2007). Furthermore, other modifications mark the locations of transcriptional enhancers (Heintzman et al., 2009). With more than fifty different histone modifications available, the potential for combinatorial control is bewildering, and deciphering this code will certainly be a focus of future research for many years.

Nucleosomes at different positions in the genome serve unique functions. The first nucleosome downstream of a transcriptional start site may regulate the accessibility of the start site and/or the ability of RNA polymerase II to progress into a productive elongation state (although the evidence for such linkages has been correlative rather than causative). In contrast, the first nucleosome upstream of the transcriptional start site may regulate the accessibility of cis regulatory elements that bind sequence-specific transcription factors. Nucleosomes in the middle of genes may prevent spurious transcription initiation that might otherwise generate truncated gene products. Histone variants and modifications are selective to specific nucleosome positions (Figure 1), and thus are likely to endow nucleosomes with position-relevant functions. Because we do not currently know all the mechanistic steps of a transcription cycle (or the order of the steps), we have yet to learn how the combinatorial configuration of histone variants and modifications participates in transcriptional regulation.

The primary organization of nucleosomes across a genome may be largely invariant from cell type to cell type. However, highly targeted changes in positioning, occupancy, histone composition, and modifications probably help define cell types. Therefore, current emphasis is being placed on generating genome-wide maps of histone modification states as cells or organisms undergo developmental programs (Shi, 2007). In this regard, the ENCODE (ENCyclopedia Of DNA Elements) project has provided a major boost for the field, as it reports on a large number of modification states across model cell lines (Birney et al., 2007). From this and other publicly generated data, a multitude of questions can be addressed. What modification states are associated with tissue differentiation, cell identity, and epigenetic inheritance? How are these modifications “read” to elicit such programs? For example, one study reports that cells already committed to a lineage have more promoter regions with repressive histone modification marks than embryonic stem cells (Hawkins et al., 2010). Are these marks reducing options for pluripotency by locking down promoters?

Nucleosome Organization’s Influence on Human Health

We are only beginning to understand the critical roles that chromatin plays in human health and disease. For example, induced pluripotent stem (iPS) cells have the potential to regenerate damaged tissue, but it is becoming clear that such cells, which originate from adult tissue, are not entirely equivalent to embryonic stem (ES) cells. Differences in these cells’ chromatin appear to be paramount. For example, at least in one case, regional states of repressive chromatin differ between ES and iPS cells, but reactivation of such regions allows the iPS cells to behave as ES cells when they are used to regenerate mice (Stadtfeld et al., 2010). A key aspect of keeping ES cells pluripotent is the maintenance of “open” chromatin states, which are generally depleted of repressive chromatin marks and rendered more dynamic by ATP-dependent chromatin remodelers, such as the chromodomain-helicase-DNA-binding protein 1 (CHD1) (Gaspar-Maia et al., 2009). Open regions may be thought of as providing ES cells with many transcriptional options for differentiation that would otherwise be eliminated by a closed chromatin state. Numerous studies are currently devoted to defining the architecture of these open states, including the identification of nucleosome positions, depletion levels, and modification states.

Analogous to maintaining stem cells, chromatin of cancerous cells is also reprogrammed, and many studies are focused on mapping where chromatin regions change between “open” and “closed” states in cancer cells. Such maps may better define distinct cancer subtypes to facilitate clinical treatments. For example, lower levels of certain histone modification states, such as H3K4me2, H3K9me2, and H3K18ac, have been strong prognostic indicators of the treatment outcomes for patients with pancreatic cancer (Manuyakorn et al., 2010). Cancer cells can become resistant to chemotherapies, but interestingly, such drugs become more effective when used in combination with inhibitors of chromatin modifiers, such as histone deacetylases (Sharma et al., 2010).

In addition to its role in cancer, histone acetylation is also a key component of memory and behavior. For example, H4K12ac in the hippocampus has been associated with memory formation (Peleg et al., 2010). Loss of this mark correlates with cognitive decline in mice, which can be restored with inhibitors of histone deacetylation. Similarly, an inability to methylate H3K9 has been linked to impaired learning (Schaefer et al., 2009). Histone deacetylase inhibitors have also been used to treat numerous neurological disorders, including anxiety and depression (Gundersen and Blendy, 2009). Future experiments will likely map histone modification states in relevant portions of the brain when mice are subjected to memory and behavioral tests.

From yeast to mammals, life span has been linked to specific states of histone acetylation and methylation. Sirtuins, a class of histone deacetylases, promote gene silencing and longevity (Guarente, 2000). More recently, loss of H3K4 methylation and maintenance of H4K16 acetylation have been linked to increased life span (Dang et al., 2009; Greer et al., 2010). Genome-wide mapping of histone modification states and nucleosome organization in old versus young cells, and in cells with altered abilities to add or remove relevant acetylation and methylation marks, may help to locate key genomic changes that alter life span.

EXPERIMENTAL CONSIDERATIONS: FROM SAMPLE TO SEQUENCE TAGS

Embarking on a genome-wide mapping project of chromatin is easier than one might expect, but finishing the mapping may be harder than anticipated. A basic molecular biology laboratory with access to either commercial or in-house whole genome sequencers can prepare nucleosomes for mapping and determine their locations in a genome at single nucleosome accuracy. The difficult but most rewarding part is turning these maps into knowledge about chromatin. Here we discuss strategies for conducting genome-wide mapping of nucleosomes, with a focus on using micrococcal nuclease (MNase) to generate mononucleosomes and then deep sequencing to identify their locations. We believe that this MNase ChIP-Seq is probably the most effective means of mapping the primary structure of chromatin. However, first we briefly describe a few other genome-wide mapping strategies.

Overview of Available Strategies

Data from MNase ChIP-Seq provide population averages from a large number of cells. To map nucleosome configurations on individual DNA molecules, DNA methyltransferases may be ectopically expressed in vivo or added to nuclei. Nucleosomal DNA is identified because its sequence is eventually altered, whereas that of linker DNA is not. For example, the M.CviPI methyltransferase methylates cytosine in 5’-GC-3’ dinucleotides when it is present in linker DNA (Pardo et al., 2010). This methyl-cytosine, unlike cytosine, is protected against bisulfite conversion to uracil (and ultimately thymine) in vitro. After traditional Sanger sequencing, the configuration of a nucleosomal array on the original DNA molecule is inferred from the sequence. Retained GC dinucleotides are inferred to be nucleosome-free, whereas ~150 base-pair spans of GTs that are GCs in the reference (i.e., untreated) genome are interpreted as nucleosomal.

The value of mapping an array of nucleosomes on a single DNA molecule is that adjacent nucleosome positions may appear to overlap in a population average, but are actually mutually exclusive when examined on a single molecule basis. Currently, this strategy has not been applied on a genomic scale, as it optimally requires high-throughput long-read (>1000 nucleotides) sequencing.
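The inference described for methyltransferase footprinting can be sketched in code. The Python example below is a minimal illustration, assuming a single bisulfite-converted read that is already aligned base-for-base to the reference with no insertions or deletions. The function names, the toy sequences, and the `min_span` threshold are hypothetical choices for illustration, and the sketch ignores endogenous cytosine methylation and sequencing errors.

```python
def gc_site_states(reference, read):
    """Classify each reference GC site from an aligned bisulfite-converted read.

    Returns (position_of_C, accessible) tuples. A retained C means the site was
    methylated (accessible, linker-like); a T means it was converted
    (unmethylated, presumably occluded by a nucleosome).
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must be aligned to the same length")
    reference, read = reference.upper(), read.upper()
    states = []
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "GC":
            base = read[i + 1]
            if base == "C":
                states.append((i + 1, True))
            elif base == "T":
                states.append((i + 1, False))
            # Any other base is treated as uninformative and skipped.
    return states

def occluded_spans(states, min_span=100):
    """Return (start, end) spans of consecutive converted GC sites at least min_span bp long."""
    spans = []
    start = last = None
    for pos, accessible in states:
        if accessible:
            if start is not None and last - start + 1 >= min_span:
                spans.append((start, last))
            start = last = None
        else:
            if start is None:
                start = pos
            last = pos
    if start is not None and last - start + 1 >= min_span:
        spans.append((start, last))
    return spans

# Toy example: 10 GC sites, the middle five converted to GT (illustrative only).
reference = "GC" * 10
read = "GC" * 3 + "GT" * 5 + "GC" * 2
states = gc_site_states(reference, read)
toy_spans = occluded_spans(states, min_span=5)
```

With real reads, `min_span` would be set close to the nucleosomal footprint (on the order of the ~150 base pairs mentioned above), and the resolution of each span boundary is limited by the local density of GC dinucleotides.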

Regions depleted of nucleosomes are candidates for regulatory regions. Therefore, if the primary purpose of chromatin mapping is to screen for nucleosome-depleted regions in many cell types or under various conditions, then FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) may be the appropriate method (Giresi and Lieb, 2009). FAIRE simply depends upon the differential partitioning of nucleosomal and nucleosome-free DNA in phenol-chloroform and aqueous phases. Thus, the major advantages of FAIRE are its simplicity and cost-efficiency. However, its resolution is low compared to mapping the location of individual nucleosomes.

DNase I hypersensitivity has been a classical means of mapping regions of accessible chromatin. Like FAIRE, it is a strategy used on a genomic scale in the ENCODE project (Hesselberth et al., 2009). However, due to frequent cleavages within nucleosomal DNA, nucleosome positions may be more difficult to discern when DNase I is used instead of MNase. Moreover, DNase I involves many sample handling steps and is complicated by technical variation in DNA digestion.

Initial Preparation of Mononucleosomes

From an operational perspective, there are two types of starting material for mapping nucleosomes: tissue excised from a multi-cellular eukaryotic organism, and minimally aggregated cells, including those drawn from blood, grown in tissue culture, or cultured as free-living microorganisms. Excised tissue may contain a heterogeneous mixture of cells, which may obscure chromatin patterns specific to a cell-type. Cellular heterogeneity may be minimized by highly selective and precise tissue excision, which may necessitate acquisition of less material. Although the minimum amount of excised material required to generate nucleosome maps is not known, a lower limit of ~10,000 cells drawn from blood may provide a guide (Adli et al., 2010). More commonly, large numbers of cells are easily collected from tissue culture or microorganisms, such as yeast, for which 10⁷–10⁸ cells are used. More sophisticated methods may be used to isolate tissue-specific nuclei (Deal and Henikoff, 2010).

The production of genome-wide nucleosome maps has variously used or avoided formaldehyde crosslinking (Figure 3, step 1). Formaldehyde essentially “freezes” existing protein-protein and protein-nucleic acid interactions in place, thereby preserving the in vivo status of interactions, without adverse effects on nucleosomes (Fragoso and Hager, 1997). Without crosslinking, nucleosomes may re-organize during cell harvesting, chromatin preparation, and chromatin fragmentation. However, for most genes in yeast, we and other laboratories have found that nucleosome organization is largely the same in the presence or absence of formaldehyde when MNase is used for chromatin fragmentation (Kaplan et al., 2009). Nevertheless, we have also found genomic regions where nucleosome organization varies in the absence of formaldehyde, and thus, we recommend a simple formaldehyde crosslinking step.


Figure 3

Flow chart for nucleosome preparation, mapping, and analysis

Images exemplify or illustrate the type of material or data at each stage. Steps 5 and 6 may be performed in either order.

Yeast and plants have cell walls, which require disruption through mechanical breakage (e.g., vigorous vortexing with glass beads) or enzymatic digestion (Albert et al., 2007; Rando, 2010) (Figure 3, step 2). Tissue or small whole animals, such as worms, may be disrupted by grinding of frozen material (Kolasinska-Zwierz et al., 2009). Tissue culture cells in sufficient quantities may be disrupted by douncing cells in a hypotonic buffer. If the amount of material is low, then it may be more practical to lyse with an ionic detergent, such as SDS, in combination with a freeze-thaw cycle (Adli et al., 2010). However, because SDS disruption is not compatible with subsequent MNase digestion, the chromatin must be fragmented by low resolution sonication.

Chromatin Fragmentation

The method of chromatin fragmentation is critical to producing nucleosome maps of a desired resolution (Figure 3, step 3). Sonication produces DNA fragments ranging from ~200 to ~700 base pairs. The heterogeneity of fragment size and cleavage sites makes sonication suitable for characterizing chromatin states over wide regions encompassing many nucleosomes, but it is not optimal for mapping individual nucleosomes. MNase digestion, on the other hand, produces DNA fragments with ends that correspond to the ends of nucleosomes and thus, produces maps with very high resolution.

One potential limitation of MNase digestion is its bias towards cleaving at A or T more frequently than at G or C. However, extensive MNase digestion that predominantly produces mononucleosomes largely, but not entirely, overcomes this bias because even unfavorable cleavage sites become cleaved. Furthermore, residual bias can be computationally compensated (Albert et al., 2007). A limitation of extensive MNase digestion is the production of subnucleosomal-sized DNA fragments, particularly at highly transcribed genes where the DNA on the surface of remodeled or partially disassembled nucleosomes may be more exposed (Weiner et al., 2010). The lack of nucleosome-sized DNA fragments in such regions may be interpreted as being entirely nucleosome free, as opposed to the presence of remodeled or partial nucleosomes that escape detection.

Different chromatin samples and preparations of MNase (i.e., commercial lots) may yield different degrees of MNase digestion. Therefore, it is prudent to titrate the MNase to achieve ~80% of the DNA as mononucleosomal, which is detected by electrophoresis as a band at ~150 base pairs (Figure 3, step 3) (Rando, 2010). In addition, pooling chromatin that has been fragmented to various extents from an MNase titration may help avoid biased isolation of mononucleosome subpopulations that differ in accessibility.

Fragmentation by sonication releases insoluble chromatin fragments from the pellet to the supernatant. MNase treatment solubilizes mononucleosomes in yeast but is often less efficient in fly and mammalian systems. Therefore, a brief sonication in these latter two systems improves solubilization, without creating additional fragmentation. Alternatively, salt extractions of increasing strength can be used to selectively solubilize “active” chromatin (Henikoff et al., 2009). Gel analysis of histones and DNA released to the supernatant versus that retained in the pellet can be conducted to confirm full extraction.

Chromatin Immunoprecipitation (ChIP)

Perhaps the most frequent use of nucleosome mapping is to characterize the distribution of histone modification states or histone variants. In these cases, immobilized antibodies against the particular modification or variant are necessary to immunoprecipitate (or “ChIP”) chromatin fragments possessing the specific modification or variant (Figure 3, step 4) (Liu et al., 2005). Because only a small percentage of DNA becomes crosslinked to histones by formaldehyde, immunoprecipitation should be conducted in the presence of detergent (e.g., 0.05% SDS) to eliminate uncrosslinked DNA. Many antibodies, such as those against H3K4me3 and H2A.Z, are commercially available, providing a level of standardization and quality-control of antibody specificity. However, one limitation of any antibody targeted against a modification is its potential to cross-react with the same or a similar modification located at other sites. Alternatively, an antibody may not recognize its epitope if a nearby amino acid is also modified, and such interfering modification might be present in only a subpopulation of the nucleosomes. Synthetic peptides harboring the modification or potentially confounding secondary modifications can be used to verify antibody specificity.

Detection

Historically, genome-wide detection of chromatin began with the use of low-resolution DNA microarrays in yeast. PCR probes of each intergenic and genic region were arrayed onto glass slides upon which fluorescently-labeled ChIP material was hybridized (reviewed in Jiang and Pugh, 2009). Higher resolution was achieved with microarrays containing overlapping 50-nucleotide probes tiled every 20 base pairs across a small region of the yeast genome. Next, high-density microarrays that spanned entire genomes were developed. These arrays, which remain in use today but probably only for a limited time, can generate maps of individual nucleosomes but with lower resolution compared to deep sequencing. Deep sequencing has the additional advantages of less background, better coverage, and a larger dynamic range compared to microarrays. That said, the fuzziness of nucleosome positions over a population (Figure 2A) precludes full realization of deep sequencing’s intrinsic high resolution.

Regardless of the fragmentation method or whether ChIP is used, the resulting DNA should be gel purified in the 120–170 base-pair range to remove nonspecific, subnucleosomal, and polynucleosomal DNA fragments (Figure 3, step 5). Currently, deep sequencing of nucleosomal DNA requires library preparation, which essentially involves ligating DNA adapters to the ends of gel-purified mononucleosomal DNA (Figure 3, step 6). This allows for PCR amplification of the sample and creates a template by which sequencing initiates. By this stage, users typically have given their samples to a sequencing facility, which will construct the libraries for sequencing using kits provided by the manufacturers of the sequencing instrument (Figure 3, step 7). Research laboratories that produce large numbers of libraries may develop their own library preparation protocols, which enhance cost efficiency. Adaptor sequences are available at company web sites, and their ligation involves standard molecular biology manipulations. In this case, greater DNA yields may be obtained by gel purifying after library preparation and PCR amplification.

Currently, the Illumina Genome Analyzer and the Applied Biosystems SOLiD sequencers are the most widely used deep sequencers for this type of work. Although a variety of deep sequencers will likely be available in the near future, the key instrument parameter for nucleosome mapping (and for ChIP-Seq in general) is not the read length but rather the tag count, which is the number of different DNA molecules that can be sequenced and mapped to the reference genome. In general, a technology platform should meet these minimum specifications: minimal steps for library construction, a sequencing read or tag length of ~35 nucleotides, read accuracy of >99%, turnaround time of less than a few days, and a cost of under US$10,000 per run.

In principle, biases during ligation, post-construction PCR amplification, gel purification, and sequencing can result in biased tag production, which may influence the apparent occupancy level and position of nucleosomes (Stein et al., 2010). Nevertheless, such biases may be compensated computationally because they manifest as anomalously high tag counts at specific genomic coordinates. For example, setting an upper limit on the normalized tag counts at a particular coordinate may correct such statistical outliers (Kaplan et al., 2009). In practice, sequencing bias may be rather innocuous because data are often aggregated in a way that eliminates outliers.
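As an illustration of the upper-limit correction mentioned above, the following minimal sketch clips per-coordinate tag counts at a ceiling. The function name and the cap value are hypothetical; an appropriate ceiling would have to be chosen for each dataset, and this is not presented as the procedure used in the cited study.

```python
def cap_tag_counts(counts, cap):
    """Clip per-coordinate tag counts at an upper limit.

    counts: dict mapping genomic coordinate -> (normalized) tag count.
    cap:    hypothetical ceiling; choosing it is dataset dependent.
    """
    return {coord: min(n, cap) for coord, n in counts.items()}


# Coordinate 1001 carries an anomalously high count (e.g., a PCR artifact).
counts = {1000: 3, 1001: 250, 1002: 4}
capped = cap_tag_counts(counts, cap=20)
```

Coordinates below the ceiling are left unchanged, so only statistical outliers are affected.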

Sequencing Tags

Sample processing that includes MNase-digestion, immunoprecipitation, and gel purification of mononucleosomes eliminates nonspecific background contamination of genomic DNA, which would otherwise degrade the quality of the maps. As such, each sequencing tag represents a measured nucleosome position, generally without the need for background correction.

A general rule of thumb is that the number of sequencing tags needed to uniquely identify >90% of all nucleosomes is minimally ten times the number of estimated nucleosomes. An estimated number of total nucleosomes is the genome size divided by 200 (i.e., the average base pair distance covered by a nucleosome core particle plus linker). Thus, complete yeast nucleosome maps require at least 600,000 tags whereas human nucleosome maps require at least 150 million tags.
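The rule of thumb above can be written out as a short calculation. In the sketch below, the genome sizes (about 12 Mb for yeast and about 3 Gb for human) are assumed round figures chosen to be consistent with the tag counts quoted above, and the function name is hypothetical.

```python
def min_tags_required(genome_size_bp, bp_per_nucleosome=200, fold=10):
    """Estimate the minimal number of tags for a complete nucleosome map.

    The estimated nucleosome count is genome size / 200 bp (core plus
    linker); the rule of thumb asks for at least ten tags per nucleosome.
    """
    estimated_nucleosomes = genome_size_bp // bp_per_nucleosome
    return estimated_nucleosomes * fold


yeast_tags = min_tags_required(12_000_000)      # assumed ~12 Mb genome
human_tags = min_tags_required(3_000_000_000)   # assumed ~3 Gb genome
```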

However, more or fewer tags may be needed depending upon the goal of the experiment. If the goal is to measure occupancy levels, then ~3–5 times more tags may be required to provide robust quantitative numbers of tags per nucleosome position or per genomic coordinate. If data are to be aggregated, for example by averaging the distribution of tags around a collection of genes, then substantially fewer tags may be sufficient. Indeed, not every nucleosome would need to be detected. Such minimal coverage is cost efficient when many experiments are conducted simultaneously, such as screening samples or titrating conditions. Because only a small portion of the library is sequenced, more coverage can be achieved by sequencing more of the library as needed. Similarly, histone modification states typically occur at only a fraction of all nucleosomes, and thus, in principle, require fewer tags.

The number of needed sequencing tags for each sample must dovetail with the minimal sequencing “bandwidth” (Figure 4). Each channel of the sequencer flow cell (the current Illumina sequencer has 8 channels and the current SOLiD sequencer has 1–8 channels) represents the minimal bandwidth of the sequencer. If a sequencer delivers, for example, 40 million mappable tags as its minimal bandwidth per channel, and the user requires ~10 million tags per sample, then 4 multiplexed samples can be placed into each channel. Sample multiplexing, which is also called indexing or barcoding, is achieved by using commercially-designed adapters. These adapters contain a unique pre-defined 5–10 nucleotide DNA sequence used to identify the sample.

Figure 4

How multiplexing can influence tag production

Left: The standard practice in ChIP-Seq is to index or “barcode” each genomic sample of nucleosomal DNA with a unique DNA sequence of 6–10 nucleotides. The barcode is ultimately sequenced and used to associate each tag with the sample from which it came, when many different samples are pooled together. One option is to PCR amplify each DNA sample, then pool equal mass proportions to generate an equal number of tags for each nucleosomal sample. The problem with this approach is that a sample that might be expected to have a very low tag count, such as a negative control lacking an antibody or epitope, will yield approximately the same number of tags as real test samples. This could give the erroneous impression of high background in the test samples.

Right: An alternative strategy is to pool samples prior to PCR amplification, based upon mixing equivalent numbers of cells (or some other metric of equivalency between samples). Then, the proportionality of tags between samples will, to a first approximation, remain constant. The problem with this approach is that any loss of sample or excess DNA contamination in a sample at any stage prior to pooling (including cell harvesting, chromatin-immunoprecipitation, or library construction) will carry through to the end. As a result, tags from different test samples, which might be expected to have similar tag counts, could vary widely. Thus, the risk is that some samples may not yield enough tag counts to conduct the appropriate analysis.

Current commercial systems allow up to 96 barcodes. Once indexed, samples can then be pooled in any desired ratio to achieve the requisite number of tags deliverable by the channel (but subject to the caveats illustrated in Figure 4). The combination of sample pooling and judicious apportioning of tags could drive sequencing costs below US$50 per sample.
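The apportioning of a channel's bandwidth among barcoded samples amounts to simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 40 million and 10 million figures are the example values used earlier in this section.

```python
def samples_per_channel(tags_per_channel, tags_per_sample):
    """Number of equally sized samples that fit into one channel."""
    return tags_per_channel // tags_per_sample


def pooling_fractions(tags_per_channel, tags_needed):
    """Fraction of the pool each sample should contribute.

    tags_needed: dict mapping sample name -> desired number of tags.
    Raises ValueError if the channel cannot deliver the requested total.
    """
    total = sum(tags_needed.values())
    if total > tags_per_channel:
        raise ValueError("requested tags exceed the channel bandwidth")
    return {name: n / total for name, n in tags_needed.items()}


n_samples = samples_per_channel(40_000_000, 10_000_000)
fractions = pooling_fractions(
    40_000_000, {"test": 30_000_000, "control": 10_000_000}
)
```

Note that the fractions describe the intended share of tags; as Figure 4 cautions, the delivered proportions depend on whether samples are pooled before or after PCR amplification.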

Technical improvements in deep sequencing will continue to increase the number and length of sequenced tags. For standard mapping, sequencing beyond the minimal length that is required to uniquely identify a tag in the genome offers little advantage, and in fact has a number of disadvantages. The main drawback is cost. Once a tag is uniquely mapped in the genome, additional sequencing cycles add cost without adding more tags to the dataset. Other disadvantages of longer reads include slower instrument turnaround and greater data storage needs. Sequencing error rates, in general, are not a significant issue because tags need only to be uniquely identified in the reference genome, and thus can have multiple errors without impacting their uniqueness. This contrasts with detection of single nucleotide polymorphisms or de novo sequencing of genomes, in which accuracy is critical.

The point of diminishing returns on sequence length is approximately 25–27 nucleotides. However, sequencing kits typically produce 35 nucleotide tags, which represent a good compromise between the need for unique identification and the drawbacks of longer reads.

One type of sequencing run, called “fragment” by Applied Biosystems and “single read” by Illumina, identifies only one end of the nucleosomal DNA molecule for each tag. In contrast, “paired-end” sequencing simultaneously identifies both ends in each tag. In principle, paired-end sequencing provides more accurate maps. However, the added accuracy may not be worth the roughly two-fold increase in sequencing costs if it is not needed to address the questions at hand. For nucleosome mapping, both ends of a consensus nucleosome are already measured separately in a population of molecules detected by a fragment or single-read run. Moreover, consensus nucleosomes over a population are not at fixed positions but are rather “fuzzy” (Figure 2A), and thus the added accuracy of paired-end sequencing may be moot. Paired-end and longer-read sequencing are advantageous when mapping nucleosomes in repetitive or low complexity genomic regions, where the additional information provides a greater probability of uniquely identifying a tag's location.

BIOINFORMATICS: TURNING SEQUENCES INTO NUCLEOSOMES

Mapping DNA Sequence Tags

Current Applied Biosystems SOLiD and Illumina GA/HiSeq platforms produce raw photographic image files of fluorescence intensities. The fluorescence resides on a two-dimensional surface that represents a detected nucleotide (Illumina) or dinucleotide (Applied Biosystems) incorporated into a group or cluster of identical clonally-amplified DNA molecules. An entire flow cell houses hundreds of millions of clusters that undergo 20–150 cycles of sequencing (although 35 is the target), ultimately culminating in terabytes of image data. For many sequencing operations, long-term storage of these image files is cost-prohibitive, and thus they are kept only short-term. In general, image files become obsolete once they are converted to FASTQ (Illumina) or CSFASTA (Applied Biosystems) files, which delineate the nucleotide sequence (i.e., “base calls”). Physical DNA libraries may be kept indefinitely, should re-sequencing be necessary.

Both sequencing platforms have on-board software to split off barcodes and map raw sequence data to a user-selected reference genome. ELAND software, packaged with the Illumina GA, aligns a sequence to a reference genome by first seeding an alignment with the first 32 nucleotides of the tag, then extends the alignment to the total tag length. Applied Biosystems provides the SOLiD System Analysis Pipeline Tool (Corona Lite). Other aligners are also available, including Maq, RMAP, Cloudburst, SOAP, SHRiMP, Bowtie, and BWA (Li and Homer, 2010). SHRiMP and Bowtie are particularly popular.

From Mapped Tags to Nucleosomes

The relevant part of a mapped tag is the genomic coordinate of its 5’ end (lowest coordinate on the forward strand, highest coordinate on the reverse strand). It corresponds to a single measured nucleosome border, if the sequenced library originates from mononucleosomal DNA. The sequence specificity bias that is inherent to MNase digestion and the incomplete protection of borders by histones combine to create imprecision in a mapped border. In practice this imprecision may be largely moot, because nucleosome positions are intrinsically imprecise.

Nonetheless, the resulting tag can be used to represent a nucleosome in multiple ways (Figure 5A): 1) As a strand-specific nucleosome border (unshifted tag) (Barski et al., 2007); 2) As an entire nucleosome by extending the tag in the 3’ direction to a length of 147 base pairs (extended tag) (Kaplan et al., 2009); or, 3) As a single coordinate representing a presumed nucleosome midpoint by shifting the 5’ end of the tag 73 nucleotides towards the 3’ direction (shifted tag) (Albert et al., 2007). For paired-end sequencing, the midpoint between the highest and lowest coordinates of the two tags defines the nucleosome midpoint (Figure 5B). The midpoint can also be extended 73 nucleotides in both directions to define the presumed nucleosome length. When sonication is used to fragment chromatin, the resulting libraries largely lack single nucleosome precision. Nevertheless, their sequenced 5’ ends can be shifted in the 3’ direction by the average fragmentation size to estimate the midpoint of the nucleosome.
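The three single-read representations and the paired-end midpoint can be sketched as small coordinate transformations. The function names are hypothetical, and the sketch assumes inclusive integer coordinates with "+" denoting the forward strand and "-" the reverse strand.

```python
NUCLEOSOME_BP = 147   # length of nucleosomal DNA used for extended tags
HALF_BP = 73          # shift from a border to the presumed midpoint


def shifted_midpoint(five_prime, strand, shift=HALF_BP):
    """Shift the 5' end toward the 3' direction to estimate the midpoint."""
    return five_prime + shift if strand == "+" else five_prime - shift


def extended_interval(five_prime, strand, length=NUCLEOSOME_BP):
    """Extend the tag in the 3' direction; returns an inclusive (low, high)."""
    if strand == "+":
        return (five_prime, five_prime + length - 1)
    return (five_prime - length + 1, five_prime)


def paired_end_midpoint(lowest, highest):
    """Midpoint between the outermost coordinates of a read pair."""
    return (lowest + highest) // 2


# A forward tag at 1000 and a reverse tag at 1146 bound the same nucleosome.
mid_fwd = shifted_midpoint(1000, "+")
mid_rev = shifted_midpoint(1146, "-")
```

For sonicated libraries, the same `shifted_midpoint` function could be called with `shift` set to half the average fragment size rather than 73, which is one reading of the estimate described above.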


Figure 5

Turning tags into nucleosomes

Solid red and blue lines represent nucleosomal DNA ready for sequencing. The nucleosomal DNA is produced as MNase-resistant DNA fragments. In a population of molecules, the DNA fragments will have heterogeneous ends due to biases in digestion efficiency at different sequences, as well as the nucleosome not residing at a single position (i.e. “fuzzy” positioning). The asterisk, representing a unique sequence, provides a frame of reference. A. In “fragment” or “single-read sequencing,” the DNA library is sequenced from only one of the adapters (green) (except in reading the barcode) and in the direction indicated by the blue or red arrows. Consequently, each nucleosome border is measured independently as a population. Tags can then be extended to 147 nucleotides or their 5’ ends shifted by 73 nucleotides, as indicated to the right. Either way, the resulting frequency distributions, while looking different, have exactly the same uncertainty. B. Paired-end sequencing allows both ends of the same DNA molecule to be sequenced. The midpoint of the pair defines the consensus nucleosome midpoint, which can be extended 73 nucleotides in both directions (right side).

Clusters of tags can be aggregated to define a consensus nucleosome position, which then represents a population average (Figure 3, step 8). For this, our laboratory uses GeneTrack software, which was the first peak calling software developed for mapping nucleosomes or any other data by ChIP-Seq (Albert et al., 2007). GeneTrack converts tag counts at each coordinate into a smoothed Gaussian distribution across multiple coordinates, implementing a user-defined standard deviation. GeneTrack then sums all instances of the distribution to create a smoothed continuous landscape across the genome. Local peaks are then identified, starting with the highest peak. A user-defined exclusion zone (e.g., 147 nucleotides) is centered over the peak to represent the steric exclusion of a nucleosome and prevent the calling of secondary peaks within the exclusion zone.
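The general approach described above (Gaussian smoothing, summation, and peak calling with an exclusion zone) can be sketched as follows. This is a simplified illustration written for this purpose, not GeneTrack's actual code; the function names, the default standard deviation, and the truncation of each Gaussian at three standard deviations are assumptions. The exclusion zone is interpreted here as a window centered on each called peak.

```python
import math


def smooth_tags(tag_counts, length, sigma=20.0):
    """Sum one Gaussian per tagged coordinate into a continuous landscape.

    tag_counts: dict mapping coordinate -> number of tags at that coordinate.
    Each Gaussian is truncated at 3 * sigma (an assumption for speed).
    """
    width = int(3 * sigma)
    landscape = [0.0] * length
    for pos, count in tag_counts.items():
        for x in range(max(0, pos - width), min(length, pos + width + 1)):
            landscape[x] += count * math.exp(-0.5 * ((x - pos) / sigma) ** 2)
    return landscape


def call_peaks(landscape, exclusion=147):
    """Call local maxima from highest to lowest, masking an exclusion zone."""
    n = len(landscape)
    half = exclusion // 2
    # Candidate peaks are positive local maxima of the smoothed landscape.
    candidates = [
        i for i in range(n)
        if landscape[i] > 0
        and (i == 0 or landscape[i] >= landscape[i - 1])
        and (i == n - 1 or landscape[i] >= landscape[i + 1])
    ]
    candidates.sort(key=lambda i: landscape[i], reverse=True)
    blocked = [False] * n
    peaks = []
    for i in candidates:
        if blocked[i]:
            continue
        peaks.append(i)
        # Prevent secondary peaks within the exclusion zone around this peak.
        for j in range(max(0, i - half), min(n, i + half + 1)):
            blocked[j] = True
    return sorted(peaks)


tags = {200: 5, 205: 8, 210: 4, 400: 6}
peaks = call_peaks(smooth_tags(tags, length=600))
```

With these toy tags, the cluster near coordinate 205 collapses into a single called peak, and the isolated tags at 400 yield a second peak.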

Peak calling can be performed with unshifted tags on each DNA strand separately. The consensus midpoint for the nucleosome is then the midpoint distance between a peak on one strand and the next downstream (3’) peak but located on the opposite strand. Alternatively, shifted tags from each strand can be first combined, and then applied to GeneTrack. The former may be more accurate as it involves position-specific correction factors. The latter involves only a single correction for an entire dataset and may be more appropriate for raw data display in a browser.
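The strand-specific option can be illustrated by pairing each forward-strand border peak with the next downstream peak on the reverse strand. The sketch below is a minimal version of that idea; the function name and the `max_separation` cutoff (used to avoid pairing distant, unrelated peaks) are hypothetical additions.

```python
from bisect import bisect_right


def pair_border_peaks(forward_peaks, reverse_peaks, max_separation=200):
    """Pair forward-strand peaks with the next downstream reverse-strand peak.

    Returns the consensus midpoint of each accepted pair. Forward peaks
    without a reverse partner within max_separation are skipped.
    """
    reverse_sorted = sorted(reverse_peaks)
    midpoints = []
    for f in sorted(forward_peaks):
        i = bisect_right(reverse_sorted, f)   # first reverse peak beyond f
        if i == len(reverse_sorted):
            continue
        r = reverse_sorted[i]
        if r - f <= max_separation:
            midpoints.append((f + r) // 2)
    return midpoints


midpoints = pair_border_peaks([1000, 1200, 5000], [1146, 1350, 9000])
```

Because each pair is shifted by its own measured border-to-border distance, this corresponds to the position-specific correction mentioned above.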

A number of other “peak-calling” algorithms have been used to define consensus nucleosome positions. Hidden Markov modeling has been applied to microarray data (Yuan et al., 2005) and can infer which DNA segments are occupied by nucleosomes after the algorithm trains on a data set. On the other hand, template filtering aims to classify peak patterns (Weiner et al., 2010), then shifts peak pairs on opposite strands individually by whatever distances maximize area overlap between the two peaks. In contrast, Model-Based Analysis of ChIP-Seq (MACS) uses empirical modeling of the length of protein-DNA interaction sites in combination with local biases in the genome based on a Poisson distribution (Zhang et al., 2008). Other peak-calling software, which may be applicable to nucleosome mapping, includes PeakFinder, FindPeaks, SISSRs, QuEST, CisGenome, PeakSeq, and Hpeak. Indeed, Laajala et al. (2009) compare these programs for use with ChIP-Seq data.

Data Analysis Pipeline

Data analysis can be divided into three stages (Figure 3, steps 8–10): primary, secondary, and tertiary analyses. Primary analysis starts with raw base calls, maps the tags to a reference genome, and then identifies peaks (i.e., “peak calls”). This processing removes unmappable tags, aggregates the data, and provides some quality assessment of the dataset. Primary analysis, in principle, requires no biological knowledge and can be handled by trained computational staff.

During secondary analysis, nucleosome parameters are extracted from the dataset. These parameters include inter-nucleosomal spacing, the “fuzziness” or variation of individual locations of nucleosomes, distances from a particular sequence element, and the extent to which nucleosomes occupy a region of genomic DNA (Figure 2A). Here knowledge of bioinformatics and genomics is required, particularly an understanding of genome annotation (e.g., how strands, coordinates, and features are defined), organization (e.g., how features, such as genes and regulatory elements are placed), and structure (i.e., how bendability varies across DNA sequences to how chromatin folds into higher-order structures).

During tertiary analysis, a dataset is compared to many other experimental datasets (Figure 3, step 10). Examples of these analyses include the extent to which two histone modification states, such as H3K27me3 and H3K4me3, co-localize throughout the genome and the distribution of nucleosomes around measured genomic features, such as transcriptional start sites. Tertiary analysis requires extensive biological knowledge to focus on key questions and avoid a seemingly endless number of less informative comparisons. Analysis pipelines can be developed in-house or assisted by online applications such as Galaxy (Goecks et al., 2010).

ANALYSIS STRATEGIES: FROM COORDINATES TO NUCLEOSOME ORGANIZATION

Multiple metrics of nucleosome organization provide insights into chromatin regulation, including nucleosome occupancy levels, nucleosome “fuzziness,” spacing distance between adjacent nucleosomes, and their distribution around genomic features (Figure 2A). The simplest display of nucleosome organization consists of a browser shot displaying the distribution of sequencing tags for a representative genomic region (Figure 3). A browser shot can be generated with the University of California, Santa Cruz (UCSC) genome browser (Rosenbloom et al., 2010) or with GeneTrack (Albert et al., 2008). The value of a browser shot is that it gives the most intuitive and unfiltered assessment of the data; however, it represents only an anecdotal and potentially “cherry-picked” example.

Nucleosome Occupancy

Nucleosome occupancy can be assessed for a consensus nucleosome location or on a per base pair basis (Figure 5). The former measures the number of shifted tags residing within ±73 base pairs of a consensus nucleosome midpoint (Mavrich et al., 2008). By this measure, a single occupancy value is attributed to a consensus nucleosome position defined by a single coordinate. Preferential MNase digestion sites and fuzziness of the nucleosome may influence the consensus position but will have little effect on its occupancy level.

In contrast, occupancy measured on a per base-pair basis counts the number of extended tags that cover each genomic coordinate without defining any consensus position (Kaplan et al., 2009). By this measure, occupancy levels at coordinates will be influenced by preferential MNase digestion sites and nucleosome fuzziness.
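The two occupancy measures can be contrasted in a short sketch. The function names are hypothetical, and coordinates are assumed to be inclusive integers; the first function counts shifted tags around a consensus midpoint, and the second counts extended tags covering each base pair of a region.

```python
def occupancy_at_consensus(shifted_midpoints, consensus, half_width=73):
    """Count shifted tags within +/- half_width of a consensus midpoint."""
    return sum(1 for m in shifted_midpoints if abs(m - consensus) <= half_width)


def per_bp_occupancy(extended_tags, region_start, region_end):
    """Count extended tags covering each coordinate of an inclusive region.

    extended_tags: iterable of inclusive (low, high) intervals.
    Uses a difference array so each tag is processed in constant time.
    """
    length = region_end - region_start + 1
    diff = [0] * (length + 1)
    for low, high in extended_tags:
        lo = max(low, region_start)
        hi = min(high, region_end)
        if lo > hi:
            continue   # tag lies entirely outside the region
        diff[lo - region_start] += 1
        diff[hi - region_start + 1] -= 1
    coverage, running = [], 0
    for d in diff[:length]:
        running += d
        coverage.append(running)
    return coverage


n_tags = occupancy_at_consensus([1000, 1073, 1146, 1147, 900], 1073)
coverage = per_bp_occupancy([(0, 4), (2, 6)], 0, 7)
```

The first measure returns a single value per consensus nucleosome, whereas the second returns a profile whose shape depends on exactly where each tag begins, which is why it is more sensitive to MNase preferences and fuzziness.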

Comparing occupancy levels across datasets requires normalization because the number of tags delivered by a sequencer is, to a first approximation, defined by the user rather than reflective of any biological property. Normalization simply involves setting the total number of uniquely mappable tags equal between samples. This assumes that the number of nucleosomes between the compared samples is indeed equal in the biological setting. Because different extents of MNase digestion may influence occupancy levels measured on a per base-pair basis (Weiner et al., 2010), digestion uniformity (i.e., ~80% mononucleosomal) is a critical parameter.

Although the amount of histones present across samples may be approximately constant, a particular histone modification may vary across samples. In such situations, if the total level of a histone modification across the whole genome can be measured independently, for example by immunoblotting, then total sample tag counts can be scaled to reflect this measured level (Figure 6). Measured levels of a particular histone modification (i.e. tag counts) at specific genomic locations have two contributing biological factors: nucleosome occupancy level and the amount of modification per nucleosome. The latter is the desired metric and can be derived by dividing the measured modification level for a consensus nucleosome or genomic interval by the corresponding level of nucleosome occupancy (Figure 6).
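The normalization steps described in this section can be sketched as three small operations: equalizing total tag counts, scaling by an independently measured bulk modification level, and dividing modification signal by nucleosome occupancy. All function names are hypothetical, and the reference level of 100 follows the convention of Figure 6.

```python
def normalize_to_total(counts, target_total):
    """Scale counts so that their sum equals target_total."""
    factor = target_total / sum(counts.values())
    return {key: value * factor for key, value in counts.items()}


def scale_by_bulk_level(counts, relative_level, reference_level=100.0):
    """Scale counts by a bulk modification level (e.g., from immunoblotting)."""
    factor = relative_level / reference_level
    return {key: value * factor for key, value in counts.items()}


def modification_density(modification, occupancy):
    """Modification per nucleosome: modification signal / occupancy.

    Positions with zero or missing occupancy are omitted to avoid
    division by zero.
    """
    return {
        key: modification[key] / occupancy[key]
        for key in modification
        if occupancy.get(key, 0) > 0
    }


normalized = normalize_to_total({"nuc1": 2, "nuc2": 6}, target_total=4)
density = modification_density(
    {"nuc1": 5.0, "nuc2": 3.0}, {"nuc1": 10.0, "nuc2": 0.0}
)
```

Omitting positions with no measured occupancy is a design choice of this sketch; a real analysis might instead flag them or apply a minimum-occupancy threshold.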


Figure 6

Data normalization

This schematic depicts an immunoblot measuring the bulk levels of chromatin-associated histone H3 and a particular histone modification. The corresponding immunoblot signals in a reference sample are set to 100. Assuming total levels of H3 are constant between samples, the relative level of each modification can be assessed. Nucleosomal tags generated from MNase ChIP-Seq can then be proportionally adjusted to reflect the relative modification state. This does not preclude further normalization on a locus-by-locus basis in which modification densities are calculated for a given amount of core histone (typically, H3) present in the sample.

Normalized levels of nucleosome occupancy or normalized modification densities can be compared across datasets on a per nucleosome basis or over defined intervals, such as every 500 base pairs. Datasets in which a large proportion of the occupancy levels are distributed over a rather narrow range of values will yield poor correlations between datasets. By analogy, any portion of a calm ocean will look like any other portion of a calm ocean. Data can be filtered in order to compare only nucleosomes or intervals that have particularly high or low occupancy levels (Kaplan et al., 2009). However, one caveat to this filtering is that any resulting correlation is applicable only to those regions.

Nucleosome Positioning

Most nucleosomes are not randomly placed in the genome, but they also do not associate with a particular DNA sequence in the same manner as sequence-specific transcription factors. Nevertheless, many nucleosomes, but not all, reside at preferred locations in the genome. The degree to which a cluster of nucleosomal tags deviates from its consensus position is a measure of its positioning or “fuzziness” (Albert et al., 2007) (Figure 2A). The fuzzier a position is, the less meaningful is its assigned consensus position. Multiple methods of measuring nucleosome fuzziness have been used (Albert et al., 2007; Kaplan et al., 2009; Weiner et al., 2010; Zhang et al., 2009).
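One simple way to quantify fuzziness is the standard deviation of shifted tag positions around a consensus midpoint. The sketch below uses that measure purely as an illustration; it is an assumed choice and is not claimed to be the exact metric of any of the studies cited above. The function name and the ±73 base-pair window are likewise hypothetical.

```python
import statistics


def fuzziness(shifted_midpoints, consensus, half_width=73):
    """Standard deviation of tag midpoints near a consensus position.

    Returns None when fewer than two tags fall within the window,
    because a spread cannot be estimated from a single tag.
    """
    nearby = [m for m in shifted_midpoints if abs(m - consensus) <= half_width]
    if len(nearby) < 2:
        return None
    return statistics.pstdev(nearby)


well_positioned = fuzziness([498, 500, 500, 502], consensus=500)
fuzzy = fuzziness([460, 480, 520, 540], consensus=500)
```

A lower value indicates that tags cluster tightly around the consensus, so `well_positioned` is smaller than `fuzzy` in this toy example.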

Nucleosome positions relative to a set of genomic sequences or features can be evaluated by plotting a frequency distribution of tag distances from those features (Figure 3, composite plot and cluster plot). Genomic features that are commonly examined include transcriptional start sites, specific cis-regulatory elements, and bound locations of specific proteins. Frequency distribution plots can be displayed as line graphs, which represent the collective tag distributions around a given set or subset of features, such as the transcriptional start sites for the most highly transcribed genes. These “composite” or “averaged” plots provide a simple and intuitive quantitative assessment of nucleosome occupancy, spacing, and positions in a single graph, but these plots do not reflect the variance in the system. For example, two dissimilar patterns may be, to a large extent, self-canceling when presented as a single composite plot.

Plotting frequency distribution data as “cluster” plots makes pattern variation more evident, but these plots are visually less quantitative. Cluster plots are essentially like line graphs for each region of interest, where the graph is collapsed to a one-dimensional row running along the x-axis (representing distance from a feature), and frequency bin counts (y-axis) are represented by a color scale. Indeed, thousands of these rows can be aligned in the second dimension. These rows can then be sorted or organized, typically by K-means or hierarchical clustering.
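The relationship between cluster plots and composite plots can be made concrete with a small sketch: each feature contributes one row of binned tag counts (a row of a cluster plot), and the composite is the column-wise average of those rows. The function names, window size, and bin size are hypothetical, and sorting or clustering of rows is left out.

```python
def distance_rows(tag_midpoints, features, window=500, bin_size=10):
    """Build one row of binned tag counts per feature.

    features: list of (position, strand) tuples, e.g., start sites.
    Distances are oriented so that positive values lie downstream
    of the feature on its own strand.
    """
    n_bins = (2 * window) // bin_size
    rows = []
    for position, strand in features:
        row = [0] * n_bins
        for m in tag_midpoints:
            d = m - position if strand == "+" else position - m
            if -window <= d < window:
                row[(d + window) // bin_size] += 1
        rows.append(row)
    return rows


def composite(rows):
    """Column-wise mean of the rows, i.e., the averaged line graph."""
    return [sum(column) / len(rows) for column in zip(*rows)]


features = [(1000, "+"), (5000, "-")]
rows = distance_rows([1050, 4950], features)
profile = composite(rows)
```

In this toy example both tags lie 50 base pairs downstream of their feature, so both rows and the composite peak in the same bin; dissimilar rows would instead average out, which is the self-canceling effect noted above.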

PERSPECTIVE

Historically, progress in the biological sciences was predicated on each individual experiment being relatively inexpensive, with the sum total of all experiments producing a body of knowledge. However, with the advent of genome-wide mapping by sequencing, each experiment has been comparatively expensive but has produced enough data to generate a substantial body of knowledge. Therefore, experiments have been carefully chosen to maximize the benefit to cost ratio. However, as the “low hanging fruit” of genome sequencing shrinks, researchers must penetrate deeper into the chromatin problem.

We are only beginning to investigate the complexity of chromatin and its interplay with gene regulatory proteins on a genome-wide scale. Nevertheless, research on genomic chromatin organization is expanding rapidly and encompassing a broader spectrum of biology, from the fundamental biophysical properties of chromosomes to human behavior. We still do not understand how DNA sequences work together with chromatin remodeling proteins, such as RSC, CHD1, and SWI/SNF, and other chromatin proteins to define the highly organized state of nucleosomes. We do not understand how histone modification states promote neural memory or cellular identity.

Characterizing chromatin architecture and deciphering its code returns us to the original state of biology, in which many experiments sum together to comprise a body of research. Sequencing costs have dropped to the point where a large number of mapping experiments can now be completed within the scope of a typical grant. Thus, genome-wide experiments, such as high-resolution nucleosome mapping, are now accessible to a large number of researchers. As more investigators enter the field, its pace of discovery and impact across biology will only increase during the next decade.

Acknowledgments

We thank Mike Kladde, Bryan J. Venters, Shinichiro Wachi, Kuangyu Yen, Liye Zhang, Sujana Ghosh, Megha Wal, Kiran Batta, Jing Hu, and Christine H. Walsh for helpful discussions. This work was supported by NIH grant HG004160.

References

Adli M, Zhu J, Bernstein BE. Genome-wide chromatin maps derived from limited numbers of hematopoietic progenitors. Nat Methods. 2010;7:615–618.
Albert I, Mavrich TN, Tomsho LP, Qi J, Zanton SJ, Schuster SC, Pugh BF. Translational and rotational settings of H2A.Z nucleosomes across the Saccharomyces cerevisiae genome. Nature. 2007;446:572–576.
Albert I, Wachi S, Jiang C, Pugh BF. GeneTrack - a genomic data processing and visualization framework. Bioinformatics. 2008.
Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, Wei G, Chepelev I, Zhao K. High-resolution profiling of histone methylations in the human genome. Cell. 2007;129:823–837.
Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007;447:799–816.
Core LJ, Lis JT. Transcription regulation through promoter-proximal pausing of RNA polymerase II. Science. 2008;319:1791–1792.
Dang W, Steffen KK, Perry R, Dorsey JA, Johnson FB, Shilatifard A, Kaeberlein M, Kennedy BK, Berger SL. Histone H4 lysine 16 acetylation regulates cellular lifespan. Nature. 2009;459:802–807.
Deal RB, Henikoff S. A simple method for gene expression and chromatin profiling of individual cell types within a tissue. Dev Cell. 2010;18:1030–1040.
Fragoso G, Hager GL. Analysis of in vivo nucleosome positions by determination of nucleosome-linker boundaries in crosslinked chromatin. Methods. 1997;11:246–252.
Gaspar-Maia A, Alajem A, Polesso F, Sridharan R, Mason MJ, Heidersbach A, Ramalho-Santos J, McManus MT, Plath K, Meshorer E, et al. Chd1 regulates open chromatin and pluripotency of embryonic stem cells. Nature. 2009;460:863–868.
Giresi PG, Lieb JD. Isolation of active regulatory elements from eukaryotic chromatin using FAIRE (Formaldehyde Assisted Isolation of Regulatory Elements). Methods. 2009;48:233–239.
Goecks J, Nekrutenko A, Taylor J. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 2010;11:R86.
Greer EL, Maures TJ, Hauswirth AG, Green EM, Leeman DS, Maro GS, Han S, Banko MR, Gozani O, Brunet A. Members of the H3K4 trimethylation complex regulate lifespan in a germline-dependent manner in C. elegans. Nature. 2010;466:383–387.
Guarente L. Sir2 links chromatin silencing, metabolism, and aging. Genes Dev. 2000;14:1021–1026.
Gundersen BB, Blendy JA. Effects of the histone deacetylase inhibitor sodium butyrate in models of depression and anxiety. Neuropharmacology. 2009;57:67–74.
Hawkins RD, Hon GC, Lee LK, Ngo Q, Lister R, Pelizzola M, Edsall LE, Kuan S, Luu Y, Klugman S, et al. Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell. 2010;6:479–491.
Heintzman ND, Hon GC, Hawkins RD, Kheradpour P, Stark A, Harp LF, Ye Z, Lee LK, Stuart RK, Ching CW, et al. Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature. 2009;459:108–112.
Henikoff S, Henikoff JG, Sakai A, Loeb GB, Ahmad K. Genome-wide profiling of salt fractions maps physical properties of chromatin. Genome Res. 2009;19:460–469.
Hesselberth JR, Chen X, Zhang Z, Sabo PJ, Sandstrom R, Reynolds AP, Thurman RE, Neph S, Kuehn MS, Noble WS, et al. Global mapping of protein-DNA interactions in vivo by digital genomic footprinting. Nat Methods. 2009;6:283–289.
Hughes A, Rando OJ. Chromatin 'programming' by sequence - is there more to the nucleosome code than %GC? J Biol. 2009;8:96.
Jenuwein T, Allis CD. Translating the histone code. Science. 2001;293:1074–1080.
Jiang C, Pugh BF. Nucleosome positioning and gene regulation: advances through genomics. Nat Rev Genet. 2009;10:161–172.
Kaplan N, Moore IK, Fondufe-Mittendorf Y, Gossett AJ, Tillo D, Field Y, LeProust EM, Hughes TR, Lieb JD, Widom J, et al. The DNA-encoded nucleosome organization of a eukaryotic genome. Nature. 2009;458:362–366.
Kolasinska-Zwierz P, Down T, Latorre I, Liu T, Liu XS, Ahringer J. Differential chromatin marking of introns and expressed exons by H3K36me3. Nat Genet. 2009;41:376–381.
Kouzarides T. Chromatin modifications and their function. Cell. 2007;128:693–705.
Laajala TD, Raghav S, Tuomela S, Lahesmaa R, Aittokallio T, Elo LL. A practical comparison of methods for detecting transcription factor binding sites in ChIP-seq experiments. BMC Genomics. 2009;10:618.
Li H, Homer N. A survey of sequence alignment algorithms for next-generation sequencing. Brief Bioinform. 2010.
Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, Amit I, Lajoie BR, Sabo PJ, Dorschner MO, et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science. 2009;326:289–293.
Liu CL, Kaplan T, Kim M, Buratowski S, Schreiber SL, Friedman N, Rando OJ. Single-Nucleosome Mapping of Histone Modifications in S. cerevisiae. PLoS Biol. 2005;3:e328.
Malik HS, Henikoff S. Phylogenomics of the nucleosome. Nat Struct Biol. 2003;10:882–891.
Manuyakorn A, Paulus R, Farrell J, Dawson NA, Tze S, Cheung-Lau G, Hines OJ, Reber H, Seligson DB, Horvath S, et al. Cellular histone modification patterns predict prognosis and treatment response in resectable pancreatic adenocarcinoma: results from RTOG 9704. J Clin Oncol. 2010;28:1358–1365.
Mavrich TN, Ioshikhes IP, Venters BJ, Jiang C, Tomsho LP, Qi J, Schuster SC, Albert I, Pugh BF. A barrier nucleosome model for statistical positioning of nucleosomes throughout the yeast genome. Genome Res. 2008;18:1073–1083.
Pardo CE, Carr IM, Hoffman CJ, Darst RP, Markham AF, Bonthron DT, Kladde MP. MethylViewer: computational analysis and editing for bisulfite sequencing and methyltransferase accessibility protocol for individual templates (MAPit) projects. Nucleic Acids Res. 2010.
Peleg S, Sananbenesi F, Zovoilis A, Burkhardt S, Bahari-Javan S, Agis-Balboa RC, Cota P, Wittnam JL, Gogol-Doering A, Opitz L, et al. Altered histone acetylation is associated with age-dependent memory impairment in mice. Science. 2010;328:753–756.
Rando OJ. Genome-wide mapping of nucleosomes in yeast. Methods Enzymol. 2010;470:105–118.
Rosenbloom KR, Dreszer TR, Pheasant M, Barber GP, Meyer LR, Pohl A, Raney BJ, Wang T, Hinrichs AS, Zweig AS, et al. ENCODE whole-genome data in the UCSC Genome Browser. Nucleic Acids Res. 2010;38:D620–625.
Sasaki S, Mello CC, Shimada A, Nakatani Y, Hashimoto S, Ogawa M, Matsushima K, Gu SG, Kasahara M, Ahsan B, et al. Chromatin-associated periodicity in genetic variation downstream of transcriptional start sites. Science. 2009;323:401–404.
Schaefer A, Sampath SC, Intrator A, Min A, Gertler TS, Surmeier DJ, Tarakhovsky A, Greengard P. Control of cognition and adaptive behavior by the GLP/G9a epigenetic suppressor complex. Neuron. 2009;64:678–691.
Schones DE, Cui K, Cuddapah S, Roh TY, Barski A, Wang Z, Wei G, Zhao K. Dynamic regulation of nucleosome positioning in the human genome. Cell. 2008;132:887–898. [Abstract] [Google Scholar]
Schuster SC, Miller W, Ratan A, Tomsho LP, Giardine B, Kasson LR, Harris RS, Petersen DC, Zhao F, Qi J, et al. Complete Khoisan and Bantu genomes from southern Africa. Nature. 2010;463:943–947. [Europe PMC free article] [Abstract] [Google Scholar]
Segal E, Widom J. What controls nucleosome positions? Trends Genet. 2009;25:335–343. [Europe PMC free article] [Abstract] [Google Scholar]
Sharma SV, Lee DY, Li B, Quinlan MP, Takahashi F, Maheswaran S, McDermott U, Azizian N, Zou L, Fischbach MA, et al. A chromatin-mediated reversible drug-tolerant state in cancer cell subpopulations. Cell. 2010;141:69–80. [Europe PMC free article] [Abstract] [Google Scholar]
Shi Y. Histone lysine demethylases: emerging roles in development, physiology and disease. Nat Rev Genet. 2007;8:829–833. [Abstract] [Google Scholar]
Shivaswamy S, Bhinge A, Zhao Y, Jones S, Hirst M, Iyer VR. Dynamic remodeling of individual nucleosomes across a eukaryotic genome in response to transcriptional perturbation. PLoS Biol. 2008;6:e65. [Abstract] [Google Scholar]
Stadtfeld M, Apostolou E, Akutsu H, Fukuda A, Follett P, Natesan S, Kono T, Shioda T, Hochedlinger K. Aberrant silencing of imprinted genes on chromosome 12qF1 in mouse induced pluripotent stem cells. Nature. 2010;465:175–181. [Europe PMC free article] [Abstract] [Google Scholar]
Stein A, Takasuka TE, Collings CK. Are nucleosome positions in vivo primarily determined by histone-DNA sequence preferences? Nucleic Acids Res. 2010;38:709–719. [Europe PMC free article] [Abstract] [Google Scholar]
Weiner A, Hughes A, Yassour M, Rando OJ, Friedman N. High-resolution nucleosome mapping reveals transcription-dependent promoter packaging. Genome Res. 2010;20:90–100. [Europe PMC free article] [Abstract] [Google Scholar]
Yuan GC, Liu YJ, Dion MF, Slack MD, Wu LF, Altschuler SJ, Rando OJ. Genome-scale identification of nucleosome positions in S. cerevisiae. Science. 2005;309:626–630. [Abstract] [Google Scholar]
Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nusbaum C, Myers RM, Brown M, Li W, et al. Model-based analysis of ChIP-Seq (MACS) Genome Biol. 2008;9:R137. [Europe PMC free article] [Abstract] [Google Scholar]
Zhang Y, Moqtaderi Z, Rattner BP, Euskirchen G, Snyder M, Kadonaga JT, Liu XS, Struhl K. Intrinsic histone-DNA interactions are not the major determinant of nucleosome positions in vivo. Nat Struct Mol Biol. 2009;16:847–852. [Europe PMC free article] [Abstract] [Google Scholar]

Funding

Funders who supported this work.

NHGRI NIH HHS (5)

Grant ID: HG004160
Grant ID: R01 HG004160-05
Grant ID: R01 HG004160
Grant ID: R01 HG004160-04
Grant ID: R56 HG004160
