Nothing Special   »   [go: up one dir, main page]

Skip to main content
NIHPA Author Manuscripts logoLink to NIHPA Author Manuscripts
. Author manuscript; available in PMC: 2014 Jul 1.
Published in final edited form as: Nat Methods. 2013 Nov 10;11(1):63–65. doi: 10.1038/nmeth.2727

The Local Resolution of Cryo-EM Density Maps

Alp Kucukelbir 1, Fred J Sigworth 1,2, Hemant D Tagare 1,3
PMCID: PMC3903095  NIHMSID: NIHMS532456  PMID: 24213166

Abstract

Using local sinusoidal features in a standard statistical testing framework, we propose a definition of local resolution for 3D density maps. The resulting algorithm has no free parameters and may be extended to other imaging modalities. Evaluating the local resolution of single particle reconstructions and sub-tomogram averages shows variable resolution across a 4 to 40Å range.


Various resolution measures for electron cryo-microscopy (cryo-EM) have been proposed in the past three decades1. Unlike the classical “Rayleigh” resolution which characterizes instruments, these measures characterize features present in the data. A commonly used cryo-EM resolution measure is the Fourier shell correlation (FSC) procedure. It quantifies the strength, relative to noise, of sinusoidal features across the entire density map. FSC produces a single resolution for the entire density map. FSC cannot assess locally varying resolution, which may be caused by sample heterogeneity and image processing errors2. The goal of this paper is to overcome this limitation of FSC by presenting a definition of local resolution which can assess variable resolution.

As a resolution measure, FSC has other limitations, too. FSC resolution depends on the computational stage where the data are split3. Further, calculating the resolution from FSC requires a threshold, whose value and interpretation has been debated1. Alternative approaches4;5 address some of these shortcomings, but do not define local resolution.

Recent structural studies6;7 use windowed FSC for local resolution8. Windowed FSC masks the split-dataset density maps with a window and calculates FSC resolutions as the window moves through the map. This requires a window size parameter, whose value is often arbitrary. While this approach implicitly conducts multiple tests on the density map, it does not control the false discovery rate (FDR) in the thresholding of the FSC. FDR control is critical because local resolution tests are repeated at many points in the volume. Additionally, there is data dependency between neighboring points which windowed FSC does not account for.

We propose a mathematical theory and an efficient algorithm for measuring local resolution that address all of the above limitations. The theory is based on the following idea: A λ Å feature exists at a point in the volume if a 3D local-sinusoid of wavelength λ is statistically detectable above noise at that point. A likelihood-ratio hypothesis test of the local-sinusoid vs. noise can detect this feature at a given p-value (typically p = 0.05). We define the local resolution at a point as the smallest λ at which the local-sinusoid is detectable, accounting for multiple testing with an FDR procedure.

Our algorithm, named ResMap, implements this theory. ResMap begins by initializing a local-sinusoid model at λ = 2μ, where μ is the voxel spacing in Å. Likelihood ratio tests are conducted at all voxels in the volume, with explicit FDR control that accounts for data dependency. Voxels that pass the test are assigned resolution λ, while those that fail are tested at a larger λ. The algorithm produces a local resolution map with a number assigned to every voxel in the density map (Fig. 1a). There are no algorithm parameters to tune and we may unambiguously speak of finding local resolution at the given p-value.

Figure 1.

Figure 1

Local resolution. (a) The ResMap algorithm. Wavelength λ is initialized to twice the voxel spacing. Likelihood ratio tests decide whether the local-sinusoid model is detectable at each voxel. Voxels that pass the test are controlled for false discoveries. Voxels that fail the test are tested again after increasing λ (Online Methods). (b) Cosine- and sine-like H2 functions oriented along an axis. White and black indicate negative and positive parts, respectively (Supplementary Note 1). (c) Left, slice through noisy simulated density maps with voxel spacing 1Å. Right, radial plots. ResMap-H2 resolution estimates show a steady improvement as the simulated signal becomes more finely varying. Bottom, corresponding results for 1/f noise display robustness against non-white noise (Supplementary Fig. 1).

In ResMap, local-sinusoids of wavelength λ are approximated by a set of functions called H2. This set is derived from Gaussian windowed second-order Hermite polynomials9;10, with window size proportional to the wavelength λ (Fig. 1b and Online Methods). ResMap results with H2 are specifically denoted as ResMap-H2. H2 functions are steerable, so their linear combination can locally model any arbitrarily oriented local-sinusoid in 3D (Supplementary Note 1).

At a fixed wavelength λ, the standard likelihood-ratio test11 can detect whether a local-sinusoid is present in the steerable function approximation. The test requires an estimate of the noise variance, which we obtain from the region surrounding the particle. The likelihood-ratio test does not depend on how this variance is estimated. Other noise estimates, such as those obtained by analyzing split-dataset reconstructions, can also be used. The smallest λ at which the likelihood-ratio test passes at a given p-value defines the resolution. We control for false discoveries using a method that takes into account the dependencies between tests12 (Online Methods).

We first evaluated ResMap using a simulated density map of a radially symmetrical ‘chirp signal’ whose wavelength decreased with radius. We added white and non-white noise with two different variance levels (Supplementary Fig. 1). ResMap-H2 estimates show an intuitive relation to the underlying signal features (Fig. 1c). Further, increasing the noise worsens the resolution at every point. ResMap-H2 results for this simulation exhibit a ripple in the transitions between the peaks and valleys of the signal. This is because transitions have more energy in the higher frequencies and are thus detectable with local sinusoids of smaller scale.

We then tested ResMap with four different density maps ranging from near-atomic (~4Å) single particle reconstructions to typical sub-tomogram averages (~40Å). All results were obtained with a p-value of 0.05. We compare ResMap-H2 results to regular and gold-standard3 FSC plots, and windowed FSC maps.

First, we analyzed a single particle 80S ribosome reconstruction (EMD-2275)13. The original publication estimates a resolution of 4.5Å (gold-standard FSC at 0.143) and notes the blurring from the heterogeneity in the 40S subunit (Fig. 2a). ResMap-H2 resolution estimates fall between 4.5 and 5.5Å in the 60S subunit and between 4.5 and 9Å in the 40S subunit. Some parts of the 40S are just as resolved as the 60S, which ResMap-H2 results show in the portion of 40S adjacent to 60S. The median ResMap-H2 resolutions in the 40S and 60S subunits are 6.5 and 5Å, respectively, which agree with a map-vs-atomic-model FSC plot (Fig. 3J in the original publication13). ResMap-H2 results additionally point to a decrease in resolution near the edges of the particle. This may be due to image alignment errors or the interaction of the ribosome with the solvent.

Figure 2.

Figure 2

ResMap-H2 results using experimental density maps. Color bars apply to both volumes and slices. 3D visualizations are rendered using UCSF Chimera17. White dotted lines in color bars indicate FSC 0.143 and 0.5 thresholds from the original publications. (a) 80S ribosome (EMD-2275). ResMap-H2 results indicate a decreased resolution within the 40S subunit and near the edges of the particle. (b) Tulane virus (EMD-5529). ResMap estimates lower resolutions in the protruding domains while the shell appears well-resolved. (c) Subtomogram GroEL (uncropped version of EMD-2221). ResMap-H2 results show an α-helix with varying levels of resolution (Supplementary Vid. 1). (d) Subtomogram ATP synthase dimer (uncropped version of EMD-2161). ResMap delineates the central dimer as better resolved than the neighboring dimers and membrane.

Second, we analyzed a single particle Tulane virus reconstruction (EMD-5529)14. The original publication estimates a 6.3Å (gold-standard FSC at 0.143) resolution for the entire particle and highlights the significant flexibility of the protruding domains of the virus. ResMap corroborates these findings, estimating the resolution of the shell between 6 and 7Å and the resolution of the protruding domains between 7 and 9Å (Fig. 2b).

Third, we analyzed a sub-tomogram average of GroEL (EMD-2221)15. The original publication reports a 8.4Å (FSC at 0.5) resolution. ResMap-H2 estimates suggest that many α-helices are resolved up to 7.5Å (Fig. 2c). This is evident in the close-up rendering that displays the central part of a helix at 7.5Å but the end and adjoining loop at about 9.5Å. These results are corroborated in Supplementary Video 1 where the central part of the helix is shown to maintain its tubular structure under a range of surface threshold values.

Finally, we analyzed a sub-tomogram average of ATP synthase dimers (EMD-2161)16. The original publication estimates a 37Å (FSC at 0.5) resolution. ResMap-H2 resolution estimates are between 30 and 42Å in the central dimers, which are better resolved than the neighboring dimers and membrane (Fig. 2d). The edges of the central dimers appear to be at a higher resolution than their cores. This is likely due to the strong dark bands surrounding the particle, as is typical in particles reconstructed without contrast transfer function correction.

ResMap is consistent with windowed FSC, but differs in some important aspects. Windowed FSC can be slow, taking anywhere between 25 minutes and 4 hours to compute, depending on the window size. ResMap usually requires a few minutes. More importantly, windowed FSC results appear to be sensitive to the fixed size of the user-defined window (Supplementary Fig. 2). Too large a window size may include the solvent in the FSC computation and lead to underestimation of resolution (Supplementary Fig. 3). ResMap does not suffer from this effect because the spatial-frequency domain localization uncertainty is minimized by the adaptively varying widths of the H2 functions. The behavior of ResMap for typical p-values is shown in Supplementary Figure 4.

For all cases above, ResMap-H2 local resolutions within the reconstructed particle are almost always between the 0.5 and 0.143 threshold of the FSC in the original publications. This is consistent with the idea that the 0.5 threshold may be too conservative while the 0.143 threshold too optimistic 3;13. Moreover, ResMap-H2 results agree with published flexibility analyses and also visually match the level of detail in the density maps.

We have released ResMap as a cross-platform executable package with a simple graphical user interface (Supplementary Fig. 5). The software and test data are publicly accessible (Supplementary Software; http://resmap.sourceforge.net). ResMap can also be applied to other fields by choosing features other then local sinusoids. For instance, a 2D Gaussian feature may be appropriate for optical nanoscopy, while rotated 2D arcs may be of use in radio telescopy.

The anticipated2 recent increase in heterogeneity studies6;7;13 highlights the pressing challenge of evaluating the local resolution of cryo-EM density maps. A local resolution method that is both statistically rigorous and practical is therefore a critical step in enabling researchers to assess the quality of cryo-EM density maps.

Methods

Methods and any associated references are available in the online version of the paper.

Online Methods

This section presents mathematical details of the theory and algorithm. The mathematical formulation is in 3D, but actual computations are performed on column vectors, where the elements in 3D are inserted in order into the vector. To highlight this difference, 3D variables are displayed in ordinary type (A) and their vectorized counterparts in bold type (A).

Modeling the signal locally

We first describe how the density map can be approximated locally by any basis. Then, we introduce the 3D sinusoid-like feature basis used in our algorithm.

The 3D density map S is in a V × V × V voxel array. The voxels are indexed by discrete-valued coordinates x, y, z. We refer to a voxel in the array as v = (vx, vy, vz) where vx, vy, vz are its coordinates. The vectorized density map of S is a V3 × 1 column vector S.

Suppose Wv, α is a spherically symmetric Gaussian function centered at voxel v with scaling parameter α,

Wv,α(x,y,z)=exp(α2[(xvx)2+(yvy)2+(zvz)2]), (1)

and that φkv, α, k = 1, … , K are basis functions centered at v. We then have

Wv,αDS=Wv,αDΦv,αβ+η, (2)

which locally approximates the density map S by basis functions φkv, α. Here √WDv, α is a diagonal matrix with √Wv, α along the principal diagonal, Φv, α is a matrix whose columns are the basis functions φkv, α, β is a column vector of the coefficients of the basis functions, and η is zero-mean Gaussian noise with variance σ2. Note that the weighting function Wv, α determines the spatial extent of the local model.

To fit the local model to S we minimize the weighted residual sum of squares (WRSSv, α)

WRSSv,α=(SΦv,αβ)TWv,αD(SΦv,αβ)=Wv,αD(SΦv,αβ)2, (3)

with respect to β. The minimizing coefficient vector is denoted as β^.

3D sinusoid-like features

A natural basis for density maps is one containing rotated 3D sinusoids with wavelength λ Å. Unfortunately, describing all orientations in 3D requires an infinite number of basis functions. A computationally tractable alternative is to use steerable filters9;10, which we refer to as “steerable functions”. Steerable functions are a finite set of functions with the property that every 3D rotation of any of the functions is produced by linear combinations of the functions9 (Supplementary Note 1).

The steerable functions we use are the second-order Hermite polynomial and its approximate quadrature, multiplied by a Gaussian function. We call this set H2. The elements of H2 match cosine and sine functions up to their second order Taylor expansion terms. They can also be scaled such that their spectral peak occurs at any desired wavelength. The H2 steerable functions are constructed from a pair of functions

Gv,α(x,y,z)=PGv,α(x,y,z)Wv,α(x,y,z)=[4(α(xvx))22]Wv,α(x,y,z),Hv,α(x,y,z)=PHv,α(x,y,z)Wv,α(x,y,z)=[(α(xvx))32.254(α(xvx))]Wv,α(x,y,z), (4)

where Gv, α is the cosine-like function, Hv, α is the sine-like function. The scalar α controls the peak frequency and the width of the Gaussian function. Setting α = 2π/λ √2/√5 gives a spectral peak for Gv, α and Hv, α at wavelength λ (Supplementary Note 2).

The functions in Equation (4) are each composed of the Hermite polynomial (PGv,α) or its quadrature (PHv,α) multiplied by Wv, α, the spherically symmetric Gaussian weighting function. Their spectral peak occurs on the x-axis of the frequency domain because the functions are oriented along the the spatial x-axis. Rotating Gv, α and Hv, α so that their spectral peaks occur along the vertices and faces of an icosahedron respectively gives 6 + 10 3D steerable functions. The linear combination of these 16 functions produces all possible rotations of Gv, α and Hv, α in 3D10, thereby covering a shell in the 3D Fourier domain (Supplementary Note 1).

We denote the rotated Hermite polynomials as PGv,αi, i = 1, ⋯, 6 and PHv,αi, j = 1, ⋯, 10. These polynomials and the constant function 1 are our local-sinusoid signal model. Because the weighting function Wv, α appears outside of the basis matrix Φv, α in Equation (2), the Φv, α matrix need only contain the vectorized polynomials,

Φv,α=[1PGv,α1PGv,α6PHv,α1PHv,α10], (5)

where the bold face denotes vectorization.

Likelihood ratio testing

Testing whether the data in the neighborhood of voxel v support the local-sinusoid model is equivalent to testing the two hypotheses,

0:Theconstanttermβ0isunconstrained,butβ1==β16=0,1:Allofβ0,β1,,β16areunconstrained, (6)

where the null hypothesis ℋ0 states that the data do not support the local-sinusoid model (the coefficients of all local-sinusoid terms are zero). The alternate hypothesis ℋ1 allows the coefficients to take on any finite value.

The likelihood ratio test11 is a standard procedure for comparing such hypotheses. The test works by calculating the β’s that maximize the likelihood (probability) under each hypothesis and then taking the logarithm of the ratio of the maximized likelihoods. The β’s that maximize the likelihood are found by minimizing the WRSS from Equation (3) under ℋ0 and ℋ1, respectively. Many common statistical tests such as the Pearson χ2 test and the F-test are derived from the likelihood ratio test (Supplementary Note 3).

A simple calculation shows that the negative logarithm of the log-likelihood ratio, called the likelihood ratio statistic (LRS), is given by

LRS(S;v,α)=1σ2(Wv,αD(S1β0^)2Wv,αD(SΦv,αβ^)2),=ST(Γ0Γ)Sσ2,whereΓ0=Wv,αDWv,αD1(1TWv,αD1)11TWv,αD,Γ=Wv,αDWv,αDΦv,α(Φv,αTWv,αDΦv,α)1Φv,αTWv,αD. (7)

The LRS is a difference of weighted residuals between the null model fit and the local-sinusoid model fit. It takes large values when the local-sinusoid model is a better fit than the null model.

The likelihood ratio test is applied by comparing the LRS to a number c, defined by

Pr[ST(Γ0Γ)Sσ2<c]=1p, (8)

for some p-value p, usually 0.05. If LRS < c, then the data do not support the model and we accept the null hypothesis. Otherwise, we accept the hypothesis that the local-sinusoidal model fits the data.

Calculating the threshold c requires the statistical distribution of the LRS. Unfortunately, because of the weighting function Wv, α, the LRS does not have a closed form statistical distribution. However, ST0 − Γ) S asymptotically tends to a weighted sum of χ2 random variables Σr γrχ2 where γr are the eigenvalues18 of Γ0 − Γ. Fast and accurate numerical methods are available to compute such distributions19.

The LRS computation requires the value of the noise variance σ2. We estimate this variance accurately by taking non-overlapping cubes of voxels from the region of the density map surrounding the particle. We use the following variance estimator recommended for local signal modeling18

σ^2=1Bb=1BCbT(Γ0Γ)Cbtrace(Γ0Γ), (9)

where Cb is a cube of voxels from the background and B is the number of non-overlapping cubes that are available in the background. This estimator is robust to non-white noise as it only requires the noise spectrum to be relatively flat within the shell in 3D Fourier space that the local-sinusoid model, implicitly indicated by Γ, is approximating.

The noise variance may also be estimated from the difference map between split-dataset reconstructions. In this case, the estimator from Equation (9) is adjusted to accept cubes of voxels from the region inside the particle. The noise statistics inside and outside the particle are nearly identical (Supplementary Note 4). Both noise variance estimators are implemented in the accompanying software package (Supplementary Software; http://resmap.sourceforge.net).

Multiple testing correction

The likelihood ratio test chooses between two hypotheses at each voxel. Since this test has to be repeated for many voxels in S, some sort of false discovery rate (FDR) control is necessary. The tests in neighboring windows are not independent from each other, therefore we use the Benjamini-Yekutieli FDR procedure12 that accounts for dependencies between tests.

Summary

ResMap works by applying a hypothesis test at every voxel. The null hypothesis is that the data in the neighborhood of a voxel do not support a local sinusoid. The alternative hypothesis is that the data describe a local sinusoid. These features are modeled by 3D steerable functions. The likelihood ratio statistic is used to decide between the the hypotheses at a given p-value. Noise variance is estimated from the area surrounding the particle and multiple testing correction is applied to carry out the test at many voxels.

Supplementary Material

1
2

Acknowledgments

We are grateful to S. Scheres, G. Lander, A. Bartesaghi, and K. Davies for stimulating discussions and sharing their density maps for this study. This work was supported by NSERC award PGS-D3 (A.K.), and NIH grants R01LM010142 (H.D.T.), R01GM095658 (A.K., H.D.T.), and R01NS021501 (F.J.S.).

Footnotes

Author Contributions

A.K. and H.D.T. conceived the theory. A.K. developed the algorithm and performed experiments. A.K., F.J.S., and H.D.T. designed the experiments and wrote the manuscript.

Competing Financial Interests

The authors declare no competing financial interests.

References

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

1
2

RESOURCES