Receptor Recognition Mechanisms of Coronaviruses: a Decade of Structural Studies
Abstract
Receptor recognition by viruses is the first and essential step of viral infections of host cells. It is an important determinant of viral host range and cross-species infection and a primary target for antiviral intervention. Coronaviruses recognize a variety of host receptors, infect many hosts, and are health threats to humans and animals. The receptor-binding S1 subunit of coronavirus spike proteins contains two distinctive domains, the N-terminal domain (S1-NTD) and the C-terminal domain (S1-CTD), both of which can function as receptor-binding domains (RBDs). S1-NTDs and S1-CTDs from three major coronavirus genera recognize at least four protein receptors and three sugar receptors and demonstrate a complex receptor recognition pattern. For example, highly similar coronavirus S1-CTDs within the same genus can recognize different receptors, whereas very different coronavirus S1-CTDs from different genera can recognize the same receptor. Moreover, coronavirus S1-NTDs can recognize either protein or sugar receptors. Structural studies in the past decade have elucidated many of the puzzles associated with coronavirus-receptor interactions. This article reviews the latest knowledge on the receptor recognition mechanisms of coronaviruses and discusses how coronaviruses have evolved their complex receptor recognition pattern. It also summarizes important principles that govern receptor recognition by viruses in general.
Coronaviruses (CoV) are a group of common, ancient, and diverse viruses. They infect many mammalian and avian species and cause respiratory, gastrointestinal, and central nervous system diseases (1, 2). Coronavirus virions contain an envelope, a helical capsid, and a single-stranded and positive-sense RNA genome. The length of their genomes, which are the largest among all RNA viruses, typically ranges between 27 and 32 kb. They were named “coronaviruses” because of the protruding spike proteins on their envelope that give the virions a crown-like shape (“corona” in Latin means crown). Coronaviruses belong to the Coronaviridae family in the order of Nidovirales. They can be classified into at least three major genera, α, β, and γ (formerly group 1, 2, and 3, respectively) (3). Prototypic α-genus coronaviruses include human coronavirus NL63 (HCoV-NL63), porcine transmissible gastroenteritis coronavirus (TGEV), and porcine respiratory coronavirus (PRCV). Prototypic β-genus coronaviruses include severe acute respiratory syndrome coronavirus (SARS-CoV), Middle East respiratory syndrome coronavirus (MERS-CoV), mouse hepatitis coronavirus (MHV), and bovine coronavirus (BCoV). Prototypic γ-genus coronaviruses include avian infectious bronchitis virus (IBV). These three major coronavirus genera and their prototypic coronaviruses are the focus of this review article (Fig. 1).
Coronaviruses pose health threats to humans and animals. Two β-coronaviruses, SARS-CoV and MERS-CoV, are highly pathogenic in humans. SARS-CoV caused the SARS epidemic in 2002 to 2003, with over 8,000 infections and a fatality rate of ~10% (4–7). MERS-CoV emerged from the Middle East in 2012. As of 16 October 2014, MERS-CoV had caused 877 infections with a fatality rate of ~36% (http://www.who.int/csr/don/16-october-2014-mers/en/) (8, 9). HCoV-NL63 from the α-genus is a prevalent human respiratory pathogen that is often associated with common colds in healthy adults and acute respiratory diseases in young children (10, 11). Among the animal coronaviruses, TGEV from the α-genus and MHV from the β-genus cause close to 100% fatality in young pigs and young mice, respectively (12–15); BCoV from the β-genus and IBV from the γ-genus also cause significant health damage in cattle and chickens, respectively (16–19). Therefore, research on coronaviruses has strong health and economic implications.
Receptor recognition by viruses is the first and essential step of viral infections of host cells (20). An envelope-anchored spike protein mediates coronavirus entry into host cells by first binding to a receptor on the host cell surface and then fusing viral and host membranes (21, 22). A member of the class I viral membrane fusion proteins (23–26), the coronavirus spike consists of three segments: an ectodomain, a single-pass transmembrane anchor, and a short intracellular tail (27, 28). The ectodomain can be divided into a receptor-binding S1 subunit and a membrane-fusion S2 subunit. The amino acid sequences of S1 diverge across different genera but are relatively conserved within each genus (29). S1 contains two independent domains, an N-terminal domain (S1-NTD) and a C-terminal domain (S1-CTD, also called S1 C-domain) (Fig. 1) (29). Either or both of these S1 domains can function as a receptor-binding domain (RBD). The binding interaction between coronavirus RBD and its receptor is one of the most important determinants of the coronavirus host range and cross-species infection (2, 30). In addition, coronavirus RBDs contain major neutralization epitopes, induce most of the host immune responses, and may serve as subunit vaccines against coronavirus infections (31–36). Knowledge about the receptor recognition mechanisms of coronaviruses is critical for understanding coronavirus pathogenesis and epidemics and for human intervention in coronavirus infections.
Coronaviruses recognize a variety of host receptors (Fig. 1). Although HCoV-NL63 and SARS-CoV belong to the α-genus and β-genus, respectively, their S1-CTDs recognize the same receptor, angiotensin-converting enzyme 2 (ACE2) (37–43). Although HCoV-NL63, TGEV, and PRCV all belong to the α-genus, their S1-CTDs recognize different receptors: HCoV-NL63 S1-CTD recognizes ACE2, whereas TGEV and PRCV S1-CTDs both recognize aminopeptidase N (APN) (44, 45). Similarly, although SARS-CoV and MERS-CoV both belong to the β-genus, their S1-CTDs recognize different receptors: SARS-CoV S1-CTD recognizes ACE2, whereas MERS-CoV S1-CTD recognizes dipeptidyl peptidase 4 (DPP4) (46–48). Although MHV and BCoV both belong to the β-genus, their S1-NTDs recognize carcinoembryonic antigen-related cell adhesion molecule 1 (CEACAM1) and sugar, respectively (49–53). In addition, the S1-NTDs of α-genus TGEV and γ-genus IBV also recognize sugar (52, 54–58). Overall, coronaviruses have evolved a complex receptor recognition pattern: (i) coronaviruses use one or both S1 domains as RBDs; (ii) highly similar coronavirus S1-CTDs within the same genus can recognize different protein receptors, whereas very different coronavirus S1-CTDs from different genera can recognize the same protein receptor; and (iii) coronavirus S1-NTDs can recognize either protein or sugar receptors. Understanding the receptor recognition mechanisms of coronaviruses can provide critical insight into the origin, evolution, and receptor selection of coronaviruses.
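The recognition pattern summarized above can be checked mechanically once the virus/receptor pairs are written down as a small table. The following Python sketch is only an illustration: the table rows are transcribed from the paragraph above, while the function names and the generic "sugar" label are assumptions introduced here (the paragraph does not name the individual sugars).

```python
from collections import defaultdict

# Each row: (virus, genus, S1 domain, receptor, receptor class).
# Rows are transcribed from the paragraph above. "sugar" is a generic
# placeholder label, not a specific sugar.
RECOGNITION = [
    ("HCoV-NL63", "alpha", "S1-CTD", "ACE2", "protein"),
    ("TGEV", "alpha", "S1-CTD", "APN", "protein"),
    ("PRCV", "alpha", "S1-CTD", "APN", "protein"),
    ("SARS-CoV", "beta", "S1-CTD", "ACE2", "protein"),
    ("MERS-CoV", "beta", "S1-CTD", "DPP4", "protein"),
    ("MHV", "beta", "S1-NTD", "CEACAM1", "protein"),
    ("BCoV", "beta", "S1-NTD", "sugar", "sugar"),
    ("TGEV", "alpha", "S1-NTD", "sugar", "sugar"),
    ("IBV", "gamma", "S1-NTD", "sugar", "sugar"),
]


def receptors_by_genus(domain):
    """Map genus -> set of receptors recognized through the given S1 domain."""
    out = defaultdict(set)
    for _virus, genus, dom, receptor, _cls in RECOGNITION:
        if dom == domain:
            out[genus].add(receptor)
    return dict(out)


def genera_by_receptor(domain):
    """Map receptor -> set of genera that recognize it through the given S1 domain."""
    out = defaultdict(set)
    for _virus, genus, dom, receptor, _cls in RECOGNITION:
        if dom == domain:
            out[receptor].add(genus)
    return dict(out)


def viruses_using_both_domains():
    """Viruses listed with both an S1-NTD and an S1-CTD receptor (point i)."""
    domains = defaultdict(set)
    for virus, _genus, dom, _receptor, _cls in RECOGNITION:
        domains[virus].add(dom)
    return sorted(v for v, d in domains.items() if d == {"S1-NTD", "S1-CTD"})


def receptor_classes(domain):
    """Set of receptor classes (protein, sugar) seen for the given S1 domain."""
    return {cls for _v, _g, dom, _r, cls in RECOGNITION if dom == domain}


if __name__ == "__main__":
    print(receptors_by_genus("S1-CTD"))    # point ii, within-genus divergence
    print(genera_by_receptor("S1-CTD"))    # point ii, cross-genus sharing
    print(viruses_using_both_domains())    # point i
    print(receptor_classes("S1-NTD"))      # point iii
```

Grouping by genus shows more than one S1-CTD receptor within both the α- and β-genus, and grouping by receptor shows ACE2 shared across the two genera, which is the pattern stated in point (ii).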
In addition to their viral receptor functions, the receptors for coronaviruses have their own physiological functions. ACE2 is a zinc-dependent carboxypeptidase that cleaves one residue from the C terminus of angiotensin peptides and functions in blood pressure regulation (59–62). ACE2 also protects against severe acute lung failure, and SARS-CoV-induced downregulation of ACE2 promotes lung injury (63, 64). APN is a zinc-dependent aminopeptidase that cleaves one residue from the N terminus of many physiological peptides and plays multifunctional roles such as in pain regulation, blood pressure regulation, and tumor cell angiogenesis (65, 66). DPP4 is a serine exoprotease that cleaves two residues from the N terminus of many physiological peptides and functions in immune regulation, signal transduction, and apoptosis (67–70). CEACAM1 is a cell adhesion molecule and functions in cell-cell adhesion (71–73). Sugars decorate many proteins and fats on cell surfaces and function in many biological processes such as immunity and cell-cell communication (57, 74, 75). How these cell-surface molecules are selected by viruses as their entry receptors has been a major puzzle in virology.
Analyses of crystal structures of coronavirus S1 domains and their complexes with their respective receptors have elucidated many puzzles associated with coronavirus-receptor interactions. Since the SARS epidemic, the crystal structures of five coronavirus S1 domains complexed with their respective receptors have been determined. These are the β-genus SARS-CoV S1-CTD complexed with human ACE2 (76), β-genus MERS-CoV S1-CTD complexed with human DPP4 (77, 78), α-genus HCoV-NL63 S1-CTD complexed with human ACE2 (79), α-genus PRCV S1-CTD complexed with porcine APN (80), and β-genus MHV S1-NTD complexed with murine CEACAM1 (81). In addition, the crystal structure of β-genus BCoV S1-NTD by itself has been determined, with its sugar-binding site identified through mutagenesis (53). These six representative structures not only reveal how coronaviruses recognize their receptors in atomic detail but also shed light on how coronaviruses do so using complicated evolutionary strategies. Other than these six representative structures, several variant forms of these structures have also been determined, including S1-CTDs of different SARS-CoV strains complexed with ACE2 from animals and S1-CTD of a MERS-CoV-related bat coronavirus HKU4 complexed with human DPP4 (82–84). This article reviews these structural studies and their implications for the receptor recognition and evolution of coronaviruses.
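As a bookkeeping aid, the six representative structures listed above can be held in a small catalog and queried for coverage. This Python sketch is illustrative only: the entries are transcribed from the paragraph above, and the `Structure` class and helper names are hypothetical, introduced here for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Structure:
    genus: str
    virus: str
    domain: str
    receptor: Optional[str]          # None: domain crystallized by itself
    receptor_species: Optional[str]  # None when no receptor is present


# The six representative structures named in the paragraph above.
STRUCTURES = [
    Structure("beta", "SARS-CoV", "S1-CTD", "ACE2", "human"),
    Structure("beta", "MERS-CoV", "S1-CTD", "DPP4", "human"),
    Structure("alpha", "HCoV-NL63", "S1-CTD", "ACE2", "human"),
    Structure("alpha", "PRCV", "S1-CTD", "APN", "porcine"),
    Structure("beta", "MHV", "S1-NTD", "CEACAM1", "murine"),
    Structure("beta", "BCoV", "S1-NTD", None, None),
]


def complexes():
    """Structures solved as an S1 domain/receptor complex."""
    return [s for s in STRUCTURES if s.receptor is not None]


def missing_combinations():
    """(genus, domain) pairs with no representative structure in the catalog."""
    covered = {(s.genus, s.domain) for s in STRUCTURES}
    wanted = {(g, d) for g in ("alpha", "beta", "gamma")
              for d in ("S1-NTD", "S1-CTD")}
    return sorted(wanted - covered)


if __name__ == "__main__":
    print(len(complexes()), "complexes of", len(STRUCTURES), "structures")
    print("No structure yet for:", missing_combinations())
```

The pairs returned by `missing_combinations()` are α-genus S1-NTD, γ-genus S1-NTD, and γ-genus S1-CTD, the same domains whose structures the article later lists as open questions.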
β-genus SARS-CoV S1-CTD complexed with human ACE2 was the first crystal structure determined for a coronavirus S1 domain and S1 domain/receptor complex (Fig. 2A) (76, 85). SARS-CoV S1-CTD contains two subdomains: a core structure and an extended loop. The core structure consists of a five-stranded antiparallel β-sheet and several short connecting α-helices. The extended loop lies on one edge of the core structure and forms a gently concave surface with two ridges on both sides and a two-stranded antiparallel β-sheet sitting in the middle (Fig. 3A and B). Because this extended loop makes all the contacts with ACE2, it has been termed receptor-binding motif (RBM). On the other hand, the peptidase domain of ACE2 has a claw-like structure with two lobes. The enzymatic active site of ACE2 is buried in a cavity surrounded by the two lobes. SARS-CoV S1-CTD binds to the outer surface of the N-terminal lobe, away from the peptidase active site. Consequently, SARS-CoV binding has no effect on the enzymatic activity of ACE2 and vice versa. The SARS-CoV-binding region on the ACE2 surface has been termed virus-binding motif (VBM). The RBM and VBM complement each other in shape and chemical details. The structure of SARS-CoV S1-CTD/ACE2 complex provided the first view of coronavirus S1 and S1/receptor complex and laid the foundation for future structural and evolutionary comparisons with other coronavirus S1 and S1/receptor complexes.
Comparative studies of the interactions between the S1-CTD from different SARS-CoV strains and ACE2 from different host species have elucidated the molecular and structural mechanisms by which SARS-CoV transmitted from animals to humans and caused the SARS epidemic (30, 83, 84, 86–89). Two virus-binding hot spots have been identified in the VBM of ACE2, one centering on ACE2 residue Lys31 and the other centering on ACE2 residue Lys353 (Fig. 3C and D). Both of these virus-binding hot spots consist of a salt bridge that is buried in a hydrophobic environment. Structure-guided functional studies revealed that both virus-binding hot spots provide significant energy to the virus-receptor binding interactions (90). Indeed, all of the naturally selected viral mutations in SARS-CoV RBM surround the two hot spots, with significant impact on the structures of the hot spots, the ACE2 binding affinity, and the host immune responses (84, 91). One of these viral mutations, K479N, facilitated transmission of SARS-CoV from palm civets to humans. Another viral mutation, S487T, facilitated transmission of SARS-CoV from human to human. These two mutations contributed significantly to the SARS epidemic in 2002 to 2003. The S1-CTD of a SARS-CoV-related Rs3367 bat coronavirus contains two asparagines at these two positions (corresponding to positions 479 and 487 in human SARS-CoV strains) (92). The first asparagine is favorable for human ACE2 binding, and the second one is less favorable. Thus, Rs3367 recognizes human ACE2 but probably less well than the human SARS-CoV strains do. For more details about how the structural analysis of SARS-CoV RBD/ACE2 interactions has provided insight into the SARS epidemic, please refer to another recent review article on this topic (30). These structural studies of SARS-CoV S1-CTD/ACE2 interactions demonstrate that it is critical to understand viral evolution, cross-species transmission, and epidemics within a detailed structural framework.
The crystal structures of β-genus MERS-CoV S1-CTD by itself and in complex with human DPP4 provided another view of coronavirus S1 and S1/receptor complex (Fig. 2B) (77, 78, 93). Like SARS-CoV S1-CTD, MERS-CoV S1-CTD also contains a core structure and an RBM. The core structures of MERS-CoV and SARS-CoV S1-CTDs are highly similar to each other, but their RBMs are markedly different, leading to different receptor specificities. The RBM of MERS-CoV S1-CTD mainly consists of a four-stranded β-sheet, in contrast to the loop-dominated RBM in SARS-CoV S1-CTD. Like the VBM for SARS-CoV on ACE2, the VBM for MERS-CoV is also located on the outer surface of DPP4, away from the peptidase active site. Whereas the conserved core structures of SARS-CoV and MERS-CoV S1-CTDs suggest a common evolutionary origin, the different RBMs of the two S1-CTDs indicate a divergent evolutionary pathway that has led to their recognition of different host receptors. The S1-CTDs of MERS-CoV and a highly related bat coronavirus HKU4 recognize DPP4 in very similar ways, suggesting a close evolutionary relationship between the two viruses (82, 94). In addition to enhancing the understanding of coronavirus evolution, the structure of MERS-CoV S1-CTD/DPP4 complex has important implications for understanding the host range and cross-species transmission of MERS-CoV (82, 94–97).
α-Genus HCoV-NL63 S1-CTD complexed with human ACE2 was the first crystal structure determined for an α-coronavirus S1 domain (Fig. 2C) (79). This structure, along with the structure of β-genus SARS-CoV S1-CTD complexed with ACE2, provided the first view of how two different viruses recognize their common host receptor. The finding was intriguing. At first glance, HCoV-NL63 and SARS-CoV S1-CTDs are very different. The core structure of HCoV-NL63 S1-CTD is a β-sandwich consisting of two β-sheet layers stacked together through hydrophobic interactions, which is in contrast to the single β-sheet layer in the core structure of SARS-CoV S1-CTD. Their RBMs are also different. The RBMs of HCoV-NL63 S1-CTD are three short and discontinuous loops, whereas the RBM of SARS-CoV S1-CTD is a single long and continuous subdomain. Indeed, the protein-folding Dali server failed to detect any structural similarity between HCoV-NL63 and SARS-CoV S1-CTDs (98). However, structural topology analysis revealed that the secondary structural elements in HCoV-NL63 S1-CTD are connected in the same way as those in SARS-CoV S1-CTD, although two β-strands in the former (strands β-1 and β-4) become α-helices in the latter (helices α-1 and α-4) and another β-strand (strand β-1) in the former is missing altogether in the latter (Fig. 2E and F) (29). These results suggest that HCoV-NL63 and SARS-CoV S1-CTDs share an evolutionary origin and that the structural differences between the two S1-CTDs result from extensive divergent evolution.
Despite their different tertiary structures, HCoV-NL63 and SARS-CoV S1-CTDs bind to a common region on ACE2 (79, 90). The VBMs for the two viruses on ACE2 overlap, and a number of ACE2 residues interact with both S1-CTDs (Fig. 3E and F). Surprisingly, one of the two virus-binding hot spots on ACE2 for SARS-CoV binding, which centers on ACE2 residue Lys353, plays a similarly critical role in the binding of HCoV-NL63 (Fig. 3G). Disturbance of the hot spot structure via mutagenesis decreased or abolished the binding of both viruses. Hence, Lys353 and the nearby residues on ACE2 form a common virus-binding hot spot that is critical for the attachment of two different coronaviruses. On the other hand, among the three RBMs in HCoV-NL63 S1-CTD, only RBM1 and RBM2, but not RBM3, are involved in binding the common virus-binding hot spot on ACE2, despite the fact that RBM3 is topologically equivalent to the RBM in SARS-CoV S1-CTD (Fig. 2A, C, E, and F). The different molecular mechanisms used by the two S1-CTDs to recognize ACE2 suggest a convergent evolutionary relationship between the two S1-CTDs (i.e., the two S1-CTDs evolved independently to recognize the same virus-binding hot spot on ACE2), although a divergent evolutionary relationship cannot be completely ruled out (i.e., the two S1-CTDs both evolved from a common ancestral protein that bound ACE2). Therefore, after HCoV-NL63 and SARS-CoV S1-CTDs underwent divergent evolution to attain different structures, they might have further converged to recognize the same region on the same receptor. The common virus-binding hot spot on ACE2 might be the driving force for this later convergent evolution.
The crystal structure of α-genus PRCV S1-CTD complexed with porcine APN illustrated how another similar α-coronavirus S1-CTD recognizes a different host receptor (Fig. 2D) (80). Similarly to the structural relationship between SARS-CoV and MERS-CoV S1-CTDs, PRCV and HCoV-NL63 S1-CTDs also have highly similar core structures. However, their three RBMs are divergent, leading to different receptor specificities. Similarly to the VBMs on ACE2 and DPP4, the VBMs for PRCV on APN are also located on the outer surface of APN, away from the peptidase active site. Overall, these results suggest that PRCV and HCoV-NL63 S1-CTDs share an evolutionary origin but have diverged in their RBM loops to recognize different host receptors.
We propose the following evolutionary scenario for coronavirus S1-CTDs (Fig. 4). All coronavirus S1-CTDs likely shared one evolutionary origin, as evidenced by their related structural topologies across different genera (Fig. 2E and F). Through divergent evolution, coronavirus S1-CTDs attained β-sandwich core structures in the α-genus and β-sheet core structures in the β-genus. Although the structures of γ-coronavirus S1-CTDs are not known, their core structures may also have a topology related to those of α- and β-coronavirus S1-CTDs. Furthermore, α-coronavirus S1-CTDs diverged in the three RBM loops to acquire different receptor specificities: ACE2 specificity for HCoV-NL63 and APN specificity for PRCV. β-Coronavirus S1-CTDs also diverged in the RBM subdomain to acquire different receptor specificities: ACE2 specificity for SARS-CoV and DPP4 specificity for MERS-CoV. The S1-CTDs of α-genus HCoV-NL63 and β-genus SARS-CoV first diverged into different tertiary structures but later converged to recognize the same receptor ACE2. In sum, coronavirus S1-CTDs have undergone convoluted structural evolution, leading to their complex receptor recognition pattern.
β-Genus MHV S1-NTD complexed with mouse CEACAM1 was the first structure available for a coronavirus S1-NTD and S1-NTD/receptor complex (Fig. 5A) (81). Surprisingly, MHV S1-NTD contains a core structure that has the same structural fold as human galectins (galactose-binding lectins) (Fig. 5C) (99). The core structure of MHV S1-NTD is a thirteen-stranded β-sandwich consisting of two β-sheet layers of six and seven strands, respectively. The structural topologies of MHV S1-NTD and human galectins are identical, except that MHV S1-NTD contains two additional β-strands in one of the β-sheet layers (Fig. 5D and E). Compared with human galectins, MHV S1-NTD contains additional structural motifs on top of the core that form a ceiling-like structure. The outer surface of this ceiling-like structure functions as RBM by binding to the VBM on the N-terminal Ig-like domain of CEACAM1. Despite its galectin fold, MHV S1-NTD does not bind sugars, as revealed by sugar-binding assays. Moreover, neither the RBM on MHV S1-NTD nor the VBM on CEACAM1 contains any sugar at the binding interface. Instead, MHV S1-NTD binds to CEACAM1 through exclusive protein-protein interactions. A hydrophobic patch in the VBM of CEACAM1 functions as a virus-binding hot spot; mutations in this region significantly decreased the binding of MHV S1-NTD (81, 100–102). Taken together, these results suggest that MHV S1-NTD and host galectins share the same evolutionary origin; they also indicate that although MHV S1-NTD binds only a CEACAM1 protein receptor, other coronavirus S1-NTDs may bind sugar receptors and function as viral lectins.
Analysis of the crystal structure of β-genus BCoV S1-NTD provided the first view of a functional lectin domain in a coronavirus spike (Fig. 5B) (53). The overall structure of BCoV S1-NTD is highly similar to that of MHV S1-NTD, also containing a galectin-like core and a ceiling-like structure on top of the core. In contrast to MHV S1-NTD, which binds CEACAM1 but not sugars, BCoV S1-NTD binds a sugar receptor but not CEACAM1. Glycan screen arrays identified Neu5,9Ac2 (5-N-acetyl-9-O-acetylneuraminic acid) as the sugar receptor for BCoV S1-NTD. Although the structure of a sugar-bound BCoV S1-NTD is not available, structure-guided mutagenesis has revealed that the sugar-binding site is located in a pocket surrounded by the core and the ceiling-like structure on top of the core. The sugar-binding sites in BCoV S1-NTD and human galectins overlap, although human galectins recognize a different sugar receptor, galactose. Structural comparison between MHV and BCoV S1-NTDs revealed that subtle structural changes between the two S1-NTDs, mainly involving different conformations of RBM loops, explain why BCoV S1-NTD does not bind CEACAM1 and why MHV S1-NTD does not bind sugars. These results suggest that MHV and BCoV S1-NTDs are both evolutionarily related to human galectins but that they have diverged from human galectins with specificities for a novel protein receptor and a different sugar receptor, respectively.
We propose the following evolutionary scenario for coronavirus S1-NTDs (Fig. 6). Ancestral coronaviruses stole a host galectin gene and inserted it into the 5′ end of their spike gene, which became coronavirus S1-NTD. Since then, coronavirus S1-NTDs have undergone divergent evolution in three genera. β-Genus BCoV S1-NTD has kept the lectin activity but evolved specificity for a different sugar receptor, Neu5,9Ac2. Although the crystal structures of α- and γ-coronavirus S1-NTDs are not available, they may also have the galectin fold for the following reasons. First, the conserved structural topology of S1-CTDs across different coronavirus genera strongly suggests a similarly conserved structural topology of S1-NTDs across different coronavirus genera. Second, the S1-NTDs of both α-genus TGEV and γ-genus IBV function as lectins, although the former recognizes both N-glycolylneuraminic acid (Neu5Gc) and N-acetylneuraminic acid (Neu5Ac) and the latter recognizes Neu5Gc. Hence, sugar-binding S1-NTDs across different coronavirus genera may share the same galectin fold but have diverged to recognize different sugar receptors. On the other hand, β-genus MHV S1-NTD has evolved specificity for a novel protein receptor, CEACAM1. Subsequently, MHV S1-NTD lost its lectin activity because proteins in general have advantages over sugars as viral receptors by providing higher affinity and specificity for viral attachment.
Are coronaviruses the only viruses that stole a host lectin and integrated it into their spike? A survey of viral lectins with known tertiary structures revealed that galectin-like domains are present in a variety of viral spikes, including influenza virus hemagglutinin, whose galectin-like fold was previously unknown (24, 103). Moreover, these viral lectins display diverse sugar-binding modes, but they share a feature—their sugar-binding sites are all located in cavities and are not easily accessible to host antibodies and immune cells. As a comparison, the sugar-binding sites in host galectins are open and easily accessible (Fig. 5C). It was thus hypothesized that these viral lectins all originated from host galectins but have evolved to use hidden sugar-binding sites to evade host immune surveillance (104). The above analysis may explain why coronavirus S1-NTDs have evolved the ceiling-like structure on top of the core, which is used to protect the sugar-binding site in coronavirus S1-NTDs from the host immune system. Subsequently, MHV S1-NTD took advantage of the ceiling-like structure and evolved CEACAM1-binding RBM on the outer surface of this ceiling-like structure. In this sense, the evolution of CEACAM1-binding RBM in MHV S1-NTD might be an indirect outcome of the efforts of coronaviruses to battle the host immune attacks.
So far, we have reviewed the receptor recognition and evolution of coronavirus S1-NTDs and S1-CTDs separately. How do S1-NTDs and S1-CTDs work together in the receptor recognition and evolution of coronavirus spikes? Electron microscopic studies of the SARS-CoV spike revealed that it is a clove-shaped trimer, with three individual S1 heads and a trimeric S2 stalk (Fig. 7) (27, 28). ACE2 binds to the tip of the SARS-CoV spike trimer, where S1-CTD is located. Because the membrane-distal tips of the trimeric spike are the most exposed and protruding region on the whole spike, S1-CTD is directly exposed to the host immune system, evolves at an increased pace to evade the host immune surveillance, and becomes hypervariable in primary, secondary, and tertiary structures. The RBM of S1-CTD is located on the very tip of the trimeric spikes and evolves at the fastest pace. On the other hand, S1-NTD is likely located underneath S1-CTD, is less exposed to the host immune system, and evolves at a slower pace than S1-CTD. Therefore, between the two S1 domains, the more conserved S1-NTDs may function as the more reliable RBDs that recognize sugar receptors, allowing coronaviruses to search for additional and high-affinity protein receptors using their fast-evolving S1-CTDs. Such dual-RBD structures in coronavirus spikes may give coronaviruses an evolutionary advantage in finding new receptors and expanding their host ranges.
Why were specific host cell surface molecules selected as coronavirus receptors? Among the known coronavirus receptors, sugars are probably the primordial and fallback receptors for coronaviruses. Sugars are abundant on host cell surfaces and are easy targets for viruses to grab. To use sugars as their receptors, a variety of viruses might have stolen a host galectin and used it as a viral lectin. On the other hand, using protein receptors may enhance the affinity and specificity of viral attachment, increase the efficiency of viral entry, and help viruses expand their host ranges and alter their tropisms (105). Host cell surface proteins have some common features as viral receptors. First, they frequently undergo endocytosis, which facilitates viral entry. Second, they contain VBMs on their surfaces for high-affinity virus binding. In the VBMs of both ACE2 and CEACAM1, virus-binding hot spots have been identified and contribute significant energy to virus/receptor binding interactions (79, 81, 90). Therefore, host cell surface molecules are not randomly selected by viruses as their receptors. In fact, there are structural and evolutionary reasons behind these selections by viruses.
The structural studies of coronavirus-receptor interactions described above have established the following virology principles. First, drastic structural changes in viral RBDs can still lead to recognition of a virus-binding hot spot on the same receptor protein. Supporting this principle is the finding that SARS-CoV and HCoV-NL63 recognize a common virus-binding hot spot on ACE2 using structurally divergent S1-CTDs. Second, subtle structural changes in viral RBDs can lead to a complete receptor switch. For example, HCoV-NL63 and PRCV recognize two different protein receptors using structurally conserved S1-CTDs with divergent RBMs, and so do SARS-CoV and MERS-CoV. Moreover, MHV and BCoV S1-NTDs recognize a protein receptor and a sugar receptor, respectively, through subtle conformational changes in receptor-binding loops. Third, it is a successful viral strategy to steal a host protein and evolve it into viral RBDs with novel protein receptor specificities or altered sugar receptor specificities. For example, MHV and BCoV S1-NTDs have the same structural fold as human galectins, but they recognize a novel protein receptor and a different sugar receptor, respectively. Fourth, a few residue changes at the receptor binding interface can lead to efficient cross-species infection and human-to-human transmission of a virus. For example, SARS-CoV needed only one or two mutations in its RBD to transmit from palm civets to humans. These virology principles may be extended from the Coronaviridae family to other virus families.
What are the remaining important questions regarding receptor recognition mechanisms of coronaviruses? First, what are the crystal structures of α-coronavirus S1-NTDs, γ-coronavirus S1-NTDs, and γ-coronavirus S1-CTDs? We have hypothesized that α-coronavirus and γ-coronavirus S1-NTDs have a galectin fold and that γ-coronavirus S1-CTDs have either a β-sandwich fold or a β-sheet fold. These hypotheses need to be tested using experimentally determined crystal structures of these S1 domains. Second, what are the detailed sugar-binding mechanisms for coronavirus S1-NTDs? The crystal structures of coronavirus S1-NTDs complexed with sugar receptors will reveal how sugar receptor specificities are achieved in these viral lectins across different coronavirus genera. Third, why do coronaviruses rely on peptidases as their receptors? Three of the four known protein receptors for coronaviruses are peptidases: ACE2, APN, and DPP4. They are all recognized by S1-CTDs of different coronaviruses. It is highly unlikely that the use of peptidases as coronavirus receptors is simply a coincidence. On the other hand, these receptors' peptidase activities have no effects on coronavirus entry, indicating that their common physiological function in degrading peptides was not the reason why they were selected as coronavirus receptors. To fully understand why peptidases became chosen receptors for coronaviruses, it will be important in the future to comprehensively examine the physiological functions of these peptidase receptors. Last, what was the evolutionary origin of coronavirus S1-CTDs? So far, coronavirus S1-CTDs appear to have a novel fold not related to any other proteins in the protein structure database. However, our previous structural studies of coronavirus spikes repeatedly showed that tertiary structures of viral proteins can deceive the currently available tertiary structural analysis software (98). 
Instead, our structural topology analysis is a powerful tool to identify structural homology among viral proteins (29, 103). This approach may help identify the evolutionary origin of coronavirus S1-CTDs. To sum up, structural studies in the past decade have elucidated many puzzles surrounding receptor recognition, evolution, and cross-species transmission of coronaviruses. Future structural studies will continue to solve the remaining puzzles as well as new puzzles that may emerge regarding the receptor recognition mechanisms of coronaviruses.
This work was supported by NIH grant R01AI089728.
Fang Li is an Associate Professor of Pharmacology at the University of Minnesota. He received his Ph.D. in Structural Biology from Yale University and postdoctoral training in Structural Virology from Harvard Medical School. He started to work on structural biology of coronaviruses in 2003, motivated by the SARS outbreak that swept the world that year. Since his publication of the crystal structure of SARS-CoV receptor-binding protein complexed with its human receptor in 2005, he has determined a number of crystal structures of other coronavirus receptor-binding proteins complexed with their respective receptor. In addition to solving structures, he has identified the host receptor for bat coronavirus HKU4 and revealed the cell entry mechanism of MERS coronavirus. His research interests cover how viruses explore different host receptors and other host factors to expand their host ranges and how they transmit from animals to humans to cause epidemics.
Articles from Journal of Virology are provided here courtesy of American Society for Microbiology (ASM)
Read article at publisher's site: https://doi.org/10.1128/jvi.02615-14