ncRNAimprint: A comprehensive database of mammalian imprinted noncoding RNAs
Abstract
Imprinted noncoding RNAs (ncRNAs) are expressed mono-allelically in a parent-of-origin-dependent manner, which is mainly evident in mammals. Lying at a crossroad between imprinted genes and ncRNAs, imprinted ncRNAs show distinct features. They are likely to function in nontraditional ways compared to non-imprinted ncRNAs, and are much more responsible for the mechanism of genomic imprinting compared to imprinted protein-coding genes. An increasing number of human diseases have been shown to be related to abnormalities in imprinted ncRNAs. Due to their functional importance, many studies focusing on imprinted ncRNAs have been published in recent years; however, there is no systematic collection or description of imprinted ncRNAs and the rapidly growing knowledge is scattered in various places. Here, we describe a new database, ncRNAimprint, which serves as a comprehensive resource center for mammalian imprinted ncRNAs. A catalog of imprinted ncRNAs, including snoRNAs, microRNAs, piRNAs, siRNAs, antisense ncRNAs, and mRNA-like ncRNAs, was annotated in detail using information extracted from relevant literature and databases. Comprehensive collections of imprinted ncRNA-related diseases, imprinting control regions (ICRs), and imprinted regions were manually compiled to provide resources for current research focusing on imprinted ncRNAs. Small RNA deep sequencing reads that fully matched within imprinted regions were also included to offer useful evidence in detecting novel imprinted ncRNAs and to aid in analyzing expression patterns of known imprinted ncRNAs. A search page including four effective search forms and two graphical browsers was created for rapid retrieval and visualization of these data. The imprinted ncRNA database is freely accessible at http://rnaqueen.sysu.edu.cn/ncRNAimprint.
INTRODUCTION
Genomic imprinting is an epigenetic gene regulatory mechanism that results in mono-allelic gene expression in a parent-of-origin-dependent manner, which is mainly evident in placental mammals and flowering plants (Wood and Oakey 2006; Feil and Berger 2007). In mammals, imprinted genes include both protein-coding genes and noncoding RNAs (ncRNAs) (Morison et al. 2005; Royo and Cavaille 2008). At present, all known imprinted ncRNAs belong to the traditional categories of ncRNAs, including small nucleolar RNAs (snoRNAs), microRNAs (miRNAs), small interfering RNAs (siRNAs), Piwi-interacting RNAs (piRNAs), antisense ncRNAs, and mRNA-like ncRNAs (Seitz et al. 2004b; Peters and Robson 2008; Royo and Cavaille 2008; Koerner et al. 2009). Imprinted ncRNAs lie at a crossroad between imprinted genes and ncRNA genes, with common and distinct features characteristic of both.
Functional characterizations reveal that imprinted ncRNAs are likely to function in nontraditional ways compared to their non-imprinted counterparts. For example, in vitro studies have shown that the snoRNA H/MBII-52 negatively regulates editing and alternative splicing of the serotonin 2C receptor (5htr2c) pre-RNA (Flomen et al. 2004; Vitali et al. 2005; Kishore and Stamm 2006; Doe et al. 2009), a function that differs completely from the traditional roles of snoRNAs in modifying ribosomal RNAs (rRNAs) or small nuclear RNAs (snRNAs) (Kiss 2002). In fact, other imprinted snoRNAs, such as H/MBII-85 and RBII-36, have unknown targets and are referred to as “orphan” snoRNAs (Seitz et al. 2004b). Similarly, imprinted miRNAs engage in unexpected activities. In mammals, miRNAs function mostly by silencing their targets rather than by cleaving them, with no allele preference. Yet studies have shown that antiPeg11-hosted imprinted miRNAs display an allele-specific regulation of Peg11/Rtl1 in trans and guide cleavage of the Peg11/Rtl1 mRNA due to the perfect complementary base-pairing between these miRNAs and their target (Davis et al. 2005).
In addition, compared to imprinted protein-coding genes, imprinted ncRNAs show different imprinting features and contribute more to the mechanism of genomic imprinting. It is imprinted ncRNAs, rather than protein-coding genes, that coexist with large imprinted regions and may contribute to the evolution and regulation of genomic imprinting. Evolutionary studies have indicated that most imprinted protein-coding genes are well conserved between species that use imprinting and species that do not use imprinting; however, examined imprinted ncRNAs are present only in organisms that use imprinting (Hore et al. 2007; Edwards et al. 2008; Nahkuri et al. 2008; Smits et al. 2008; Zhang and Qu 2009). A large number of small imprinted ncRNAs, which cluster in imprinted regions and vary among mammals, were predicted to be associated with the acquisition and expansion of genomic imprinting (Glazov et al. 2008; Zhang and Qu 2009). Moreover, several imprinted antisense ncRNAs have been shown to be essential for mono-allelic silencing of flanking imprinted genes, and either their transcription or the transcripts themselves may function as cis-restricted gene silencers in the regulation of imprinting (Pauler et al. 2007; Koerner et al. 2009).
Aside from ncRNAs, other important factors in the regulation of imprinting are imprinting control regions (ICRs), also referred to as ICEs (imprinting control elements), which display parent-specific DNA methylation established during gametogenesis and generally control the imprinting patterns of an entire imprinted region (Reinhart and Chaillet 2005; Edwards and Ferguson-Smith 2007). Certain ICRs function by collaborating with imprinted ncRNAs to establish and maintain the normal expression pattern of flanking imprinted genes (Pauler et al. 2007; Koerner et al. 2009); however, abnormalities in imprinting patterns could cause serious diseases and disorders in species that use imprinting. Studies on the pathology of imprinting-related human diseases have demonstrated that imprinted ncRNAs play an important role. A growing number of human diseases, such as Prader-Willi syndrome (PWS), Beckwith-Wiedemann syndrome (BWS), Silver-Russell syndrome (SRS), transient neonatal diabetes mellitus (TNDM), and various tumors, have been shown to be related to certain imprinted ncRNAs (Bliek et al. 2001; Temple and Shield 2002; Zhang et al. 2003; Bliek et al. 2006; Sahoo et al. 2008).
Due to their functional significance, studies of imprinted ncRNAs are currently very active in both fields of genomic imprinting and ncRNA, and there has been a rapid accumulation of relevant knowledge in the past few years. Yet these data are scattered either throughout the literature or in various online databases. Although certain ncRNA records are included in databases of imprinted genes such as Geneimprint (http://www.geneimprint.com/), Catalog of Imprinted Genes (Morison et al. 2001), and WAMIDEX (Schulz et al. 2008), many relevant annotations are insufficient and no discrimination is made between imprinted ncRNAs and protein-coding genes. As for databases that are focused on ncRNAs, such as snoRNABase (Lestrade and Weber 2006), miRBase (Griffiths-Jones et al. 2006), and NONCODE (Liu et al. 2005), information about imprinting features, including expressed allele, imprinted tissues and stages, ICRs and imprinted region, is rarely annotated. The condition of these databases makes it challenging for researchers who are focused on imprinted ncRNAs to quickly retrieve information that is relevant to their studies.
Another daunting challenge is to identify novel imprinted ncRNAs. At present, detection of imprinted ncRNAs, which is limited to several major imprinted loci and developmental stages in a few organisms, is far from sufficient. Functional characterizations are restricted to several particular examples and are not well generalized among mammalian imprinted ncRNAs, which is largely due to a lack of adequate supporting data. Therefore, genome-wide identification of imprinted ncRNAs in more species is required; however, this work is experimentally challenging because imprinted ncRNAs display tissue-, developmental stage-, and even species-specific expression patterns (Cavaille et al. 2000; Seitz et al. 2004b; Royo et al. 2007). More recently, deep sequencing technologies have been widely used to identify novel ncRNAs, which has led to the production of increasingly available sequencing data of small RNAs from diverse tissues and cell lines (Mardis 2008a; Mardis 2008b; Lister et al. 2009). An increasing number of bioinformatics tools have also been developed to explore these data (Friedlander et al. 2008; Langmead et al. 2009; Yang et al. 2010). Using these tools to analyze deep sequencing data would provide valuable evidence to detect novel imprinted ncRNAs.
Here we describe a database, ncRNAimprint, which integrates comprehensive information about mammalian imprinted ncRNAs. Three major sections of data were included: (1) a catalog of imprinted ncRNAs, with detailed annotation concerning imprinting features; (2) three comprehensive collections of imprinted ncRNA-related diseases, ICRs, and imprinted regions; and (3) deep sequencing reads that have been fully matched within known imprinted regions. To provide a simplified database structure, we classified all ncRNAs (including both experimentally validated and predicted data) into six categories according to their corresponding traditional families: snoRNAs, miRNAs, siRNAs, piRNAs, antisense ncRNAs, and mRNA-like ncRNAs. Each category was assigned detailed annotation fields to record information extracted from relevant literature and databases. In addition, three comprehensive collections were included to facilitate current research foci on (1) pathological roles of imprinted ncRNAs in related diseases; (2) functions of imprinted ncRNAs in the imprinting mechanism; and (3) evolutionary relationships between imprinted ncRNAs and the expansion of genomic imprinting. The currently available deep sequencing data of small ncRNAs were also mapped to known imprinted regions, and fully matched reads were retained to provide helpful evidence in the identification of novel imprinted ncRNAs and to facilitate the examination of imprinted ncRNA expression patterns. Although the quantity of validated imprinted ncRNAs and relevant knowledge is presently limited, we believe abundant novel imprinted ncRNAs will be discovered, especially when the rapidly increasing deep-sequencing data and the ongoing completion of additional mammalian whole-genome sequences are considered. Moreover, by providing a centralized resource center, ncRNAimprint would facilitate wider detection and functional analysis of imprinted ncRNAs, and even help inspire novel ideas from researchers. 
The ncRNAimprint database will be constantly updated to keep pace with new studies and published data. The database is freely accessible at http://rnaqueen.sysu.edu.cn/ncRNAimprint.
RESULTS
A comprehensively annotated catalog of mammalian imprinted ncRNAs
ncRNAimprint has been designed to focus primarily on imprinted ncRNAs, which possess distinctive characteristics from both imprinted protein-coding genes and non-imprinted ncRNAs. Imprinted ncRNAs were collected and classified into six categories according to their previously defined families (snoRNAs, miRNAs, piRNAs, siRNAs, antisense ncRNAs, and mRNA-like ncRNAs). In addition, detailed annotations involving imprinting features were organized together in one web page per ncRNA entry. All known sequences can be batch downloaded using the “Download” page on the ncRNAimprint website.
Both validated and predicted data on imprinted ncRNAs were collected and imported into our database. Presently, ncRNAimprint contains data on imprinted ncRNAs from nine species, including human, rhesus, mouse, rat, cow, sheep, pig, wallaby, and opossum. Detailed statistics of each type of ncRNA in each species are listed in Table 1. Version 1.0 of the ncRNAimprint database contains 7253 ncRNAs, with predicted piRNAs making up the largest proportion of imprinted ncRNAs. Sequences of imprinted piRNAs and siRNAs were obtained by mapping predicted sequencing data to known imprinted regions (see Materials and Methods) without further experimental validation. Thus, current collections of all piRNAs and siRNAs are presented as predicted data. The other four categories (snoRNAs, miRNAs, antisense ncRNAs, and mRNA-like ncRNAs) account for a total of 570 ncRNAs. The annotations of these families are, for the most part, validated because they were manually compiled and extracted from relevant literature and databases. The most widely available data are from humans and mice.
TABLE 1.
Imprinted ncRNAs can be browsed by using the “Browse” page (Fig. 1), where entries are displayed by ncRNA families. Each entry contains a dynamic link to its annotation page. A screenshot of one representative ncRNA annotation page is presented in Figure 2. Detailed annotations including imprinting features were compiled for each ncRNA category (Supplemental Table S1). In addition, category-specific annotation items, such as “overlapped sense gene” for antisense ncRNAs and “mature/miR* sequence” for miRNAs, were also assigned (Supplemental Table S1).
On the “Search” page, an effective search interface has been created and provided to search for imprinted ncRNAs. Detailed search fields are available for users to retrieve information using a variety of characteristics, including species, chromosome, name, ncRNA category, imprinting status, expressed allele, imprinted tissue, and other descriptive keywords. For example, researchers can obtain all maternally expressed imprinted miRNAs on mouse chromosome 12 by launching such a search, as shown in Figure 3.
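As a rough illustration of how such a multi-field search behaves, the following Python sketch filters an in-memory list of records by exact field values. The record fields, the `search` function, and the toy entries are hypothetical; they are not taken from the ncRNAimprint implementation, which stores its data in MySQL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NcRnaRecord:
    """Hypothetical, simplified record; the real database has more fields."""
    name: str
    species: str
    chromosome: str
    category: str
    expressed_allele: str


def search(records, **criteria):
    """Return the records whose fields equal every criterion (case-insensitive)."""
    def matches(rec):
        return all(
            str(getattr(rec, field)).lower() == str(value).lower()
            for field, value in criteria.items()
        )
    return [rec for rec in records if matches(rec)]


# Toy placeholder entries for illustration only; not actual database content.
RECORDS = [
    NcRnaRecord("example-miR-A", "mouse", "12", "miRNA", "maternal"),
    NcRnaRecord("example-miR-B", "mouse", "12", "miRNA", "paternal"),
    NcRnaRecord("example-sno-C", "mouse", "7", "snoRNA", "paternal"),
    NcRnaRecord("example-miR-D", "human", "14", "miRNA", "maternal"),
]

# Analogue of "all maternally expressed imprinted miRNAs on mouse chromosome 12".
hits = search(
    RECORDS,
    species="mouse",
    chromosome="12",
    category="miRNA",
    expressed_allele="maternal",
)
```

Combining criteria with a logical AND mirrors the way the search form narrows results as more fields are filled in; an empty query returns every record.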
A user-friendly graphical browser, MapView, which zooms and scrolls over chromosomes, was created to display the ncRNAs within imprinted regions of a genome (Fig. 2). The tracks of imprinted ncRNAs and regions contain dynamic links to relevant annotation pages, where links to “MapView” are also available (Fig. 2). Organisms and regions can be selected from the forms as queries, or standardized chromosomal positions can be entered. With this tool, researchers can conveniently examine imprinted ncRNAs that are clustered, overlapping, or located in the same imprinted region.
Comprehensive collections: Related diseases, ICRs, and imprinted regions
Studies regarding imprinted ncRNAs mostly concern (1) pathology of imprinted ncRNA-related diseases; (2) imprinting mechanisms; (3) evolution of imprinted ncRNAs and their relationship to the origin and expansion of genomic imprinting; and (4) identification of novel imprinted ncRNAs. In order to provide comprehensive information to facilitate these studies, four categories were organized and imported into ncRNAimprint: imprinted ncRNA-related diseases, ICRs, imprinted regions, and deep sequencing data. In this section, collections of diseases, ICRs, and imprinted regions will be introduced. Statistics summarizing these data are listed in Table 2.
TABLE 2.
A single imprinted ncRNA can be responsible for several disorders; conversely, a single disease can be associated with multiple ncRNAs. A detailed organization of these complex relationships is necessary for further studies on pathological roles of imprinted ncRNAs. By extracting information from publications and relevant databases, we established a collection of disease–ncRNA relationships in ncRNAimprint. Currently, 127 records are included, and each record represents a single relationship between one ncRNA and one disease. A special search interface for imprinted ncRNA-related diseases was created and is presented on the “Search” page. By launching a query, users can easily retrieve all disorders related to one imprinted ncRNA (Fig. 4), or all ncRNAs relevant to a certain disease. A full list of disease–ncRNA relationships can be downloaded from the “Download” page and can be browsed by using the “Browse” function on the website (Fig. 5).
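The many-to-many structure described above, with one record per ncRNA and disease pair, can be sketched in Python as a pair of lookup tables built from the same list of relationships. The function name and the placeholder identifiers are assumptions made for illustration; no real disease associations are encoded here.

```python
from collections import defaultdict


def build_indexes(pairs):
    """Index (ncRNA, disease) pairs in both directions.

    Each input pair stands for one relationship record, so one ncRNA may map
    to several diseases and one disease to several ncRNAs.
    """
    by_ncrna = defaultdict(set)
    by_disease = defaultdict(set)
    for ncrna, disease in pairs:
        by_ncrna[ncrna].add(disease)
        by_disease[disease].add(ncrna)
    return by_ncrna, by_disease


# Placeholder relationship records (hypothetical names).
PAIRS = [
    ("ncRNA-1", "disease-X"),
    ("ncRNA-1", "disease-Y"),
    ("ncRNA-2", "disease-X"),
]

by_ncrna, by_disease = build_indexes(PAIRS)

# Query in either direction, as the disease search form allows.
diseases_for_ncrna_1 = sorted(by_ncrna["ncRNA-1"])
ncrnas_for_disease_x = sorted(by_disease["disease-X"])
```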
ICRs are important controllers of imprinting patterns over entire imprinted domains (Reinhart and Chaillet 2005; Edwards and Ferguson-Smith 2007), which influence the expression of imprinted ncRNAs. In addition, studies have demonstrated that imprinted ncRNAs participate in imprinting regulation, and some ncRNAs function by interacting with ICRs (Pauler et al. 2007; Koerner et al. 2009). As a result, a collection of ICRs is also crucial and helpful for studying regulatory roles of imprinted ncRNAs. We extracted the scattered information on ICRs from research papers, and 15 records were obtained. These data can be retrieved from the “Search,” “Browse” (Fig. 5), or “Download” pages of ncRNAimprint.
Evolutionary studies of imprinted regions indicate that imprinted ncRNAs are more responsible for the origin and evolution of genomic imprinting than imprinted protein-coding genes (see Introduction); however, current evolutionary analyses of both imprinted ncRNAs and imprinted regions are limited to several well known examples (Paulsen et al. 2005; Edwards et al. 2008; Glazov et al. 2008; Nahkuri et al. 2008; Smits et al. 2008; Zhang and Qu 2009). To undertake broader studies of more imprinted regions, a full list of known imprinted regions becomes necessary. Therefore, ncRNAimprint includes a catalog of imprinted regions. Unfortunately, due to the small amount of available data from other species, only human and mouse data were compiled. In total, 72 entries were obtained. These data can be obtained using the “Search”, “Browse” (Fig. 5), or “Download” pages of ncRNAimprint. In addition, all included ncRNAs in each imprinted region can be viewed in the graphical browser “MapView” (Fig. 2).
Involvement of deep sequencing data
The identification of novel imprinted ncRNAs is an important part of present studies and further progress in this field. Both the generalization of functional characterizations of imprinted ncRNAs and comprehensive evolutionary analyses rely greatly on a sufficient number of experimentally validated imprinted ncRNAs. Yet, these data are far from adequate, even in known imprinted regions. A global survey of genomic imprinting in mice using deep sequencing reveals a large fraction of novel ncRNAs might reside within known imprinted loci (Babak et al. 2008).
We downloaded deep sequencing reads of small ncRNAs from Gene Expression Omnibus (GEO) in the National Center for Biotechnology Information (NCBI) database (Barrett et al. 2009) and mapped them to imprinted regions in the human and mouse genomes to obtain fully matched reads (see Materials and Methods). These results were imported into our database and are displayed graphically on the website using the “DeepMap” browser (Fig. 6). This graphical interface is a helpful tool to examine expression patterns of known imprinted ncRNAs and novel ones. For example, as shown in Figure 6, the abundance track of reads that overlap with mmu-mir-379 reveals the expression features of this miRNA. The upper red block indicates the abundance of the mature miRNA, miR-379, while the lower block represents the degraded strand of the pre-miRNA. This observation applies to most miRNAs, which could be used to predict novel miRNAs in areas where similar patterns appear in the genome. The software for identifying miRNAs from deep sequencing data, miRDeep, has also integrated this concept in its design (Friedlander et al. 2008).
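To make the idea of an abundance track concrete, the sketch below computes a per-base read abundance over a genomic window from a list of mapped reads. It is an illustrative Python approximation, not the DeepMap implementation; the function name, the 1-based inclusive coordinate convention, and the toy reads are assumptions.

```python
def abundance_track(reads, region_start, region_end):
    """Per-base read abundance over [region_start, region_end].

    `reads` is an iterable of (start, end, count) tuples in 1-based inclusive
    coordinates, where `count` is the number of times that read was sequenced.
    Reads that only partly overlap the window are clipped to it.
    """
    track = [0] * (region_end - region_start + 1)
    for start, end, count in reads:
        lo = max(start, region_start)
        hi = min(end, region_end)
        for pos in range(lo, hi + 1):
            track[pos - region_start] += count
    return track


# Toy reads (hypothetical coordinates and counts).
READS = [
    (3, 5, 10),   # abundant read, analogous to a mature miRNA
    (4, 8, 2),    # rarer overlapping read
    (20, 22, 1),  # lies outside the window and contributes nothing
]

track = abundance_track(READS, 1, 10)
```

A sharp, high block next to a much lower one in such a track corresponds to the mature-strand versus degraded-strand pattern described for mmu-mir-379.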
DISCUSSION
A comprehensive resource center for imprinted ncRNA
We developed a database, ncRNAimprint, which houses a manually curated catalog of mammalian imprinted ncRNAs, as well as comprehensive collections of imprinted ncRNA-related human diseases, ICRs, and imprinted regions. We classified all ncRNAs according to their traditional families and annotated them in detail, which would provide a clearer picture of the characteristics of imprinted ncRNAs and useful data resources for researchers. For example, the collection of imprinted miRNAs could serve as helpful data to identify novel imprinted miRNAs in other organisms, or to analyze the origin and evolution of imprinted miRNAs. By searching the miRBase database (Griffiths-Jones et al. 2006) for paralogous miRNAs of imprinted miRNAs, we have obtained a large number of novel imprinted miRNA candidates for further analysis (data not shown). Moreover, the three comprehensive collections of diseases, ICRs, and imprinted regions would facilitate current research foci. For instance, using the ncRNAimprint database, researchers can analyze the evolution of genomic imprinting on an unprecedentedly broad scale, including all mammals and even other species that have fully sequenced genomes. Using both imprinted regions and ncRNAs as queries to retrieve sequences of syntenic loci and paralogs from these genomes would provide valuable raw data for further evolutionary studies. Novel candidates of imprinted regions and ncRNAs might be predicted by exploring these data.
In addition, by an exhaustive retrieval and organization of data, we have made two important discoveries that might have been previously overlooked by researchers of imprinted ncRNAs. First, by downloading miRNA tracks located within imprinted regions from the UCSC Table Browser (http://genome.ucsc.edu/; Rhead et al. 2010), we found a large cluster of miRNAs (Fig. 7) present in one intron of the mouse imprinted gene, Scm-like with four mbt domains 2 (Sfmbt2) (Kuzmin et al. 2008). In total, 65 miRNA copies (40 different miRNAs) were included in this cluster, which, to date, would be the largest known miRNA cluster located in an imprinted region; however, no research paper or review has noted this Sfmbt2-hosted miRNA cluster because its miRNA members were detected by several different studies. Basic genomic features of the cluster are shown in Figure 7. Similar to another large imprinted miRNA cluster, the mir-379/mir-656 cluster (Seitz et al. 2004a; Glazov et al. 2008), certain location patterns of Sfmbt2-hosted miRNAs suggest an evolutionary process by tandem duplication of an ancient precursor sequence or, more likely, of several sequences as a group (Fig. 7A). Members of this cluster belong to two miRNA families according to miRBase (Griffiths-Jones et al. 2006), mmu-mir-466 and mmu-mir-467. The latter one could be divided into four subfamilies: mmu-mir-297, mmu-mir-467, mmu-mir-466, and mmu-mir-669. T-Coffee (version 8.69; Notredame et al. 2000) alignment results reveal that the members of each family are well conserved (Fig. 7B), but members from different families/subfamilies show high diversity (data not shown). By analyzing genomic sequence and deep sequencing data, we found no miRNA candidate in the syntenic human region, suggesting this cluster is rodent- and even mouse-specific. Whether possible counterparts of this cluster are present in rat is under study.
This species-specific feature is distinct from the mir-379/mir-656 cluster, which is relatively well conserved among human, mouse, and rat (Glazov et al. 2008). Several aspects of this cluster should be examined further, such as whether this cluster is present in other species, whether this cluster is located in a larger imprinted region that includes genes in addition to Sfmbt2, whether this cluster coexists with the imprinting expression of Sfmbt2, and the function, expression, and imprinting patterns of the miRNAs within this cluster.
Second, we found that one mouse box H/ACA snoRNA, ACA54, is hosted by an imprinted gene, the nucleosome assembly protein 1-like 4 (Nap1l4) (Engemann et al. 2000; Umlauf et al. 2004). The current prevailing viewpoint is that imprinted snoRNAs include only C/D RNAs because no H/ACA box RNA has been found to be imprinted (Seitz et al. 2004b; Royo and Cavaille 2008). Apparently, this potentially imprinted H/ACA snoRNA has been ignored. Unlike other imprinted snoRNAs that are devoid of traditional targets, this snoRNA is predicted to guide the pseudouridylation of 28S rRNA U3801 (Schattner et al. 2006). In addition, according to the alignment result presented by snoRNABase (Lestrade and Weber 2006), this snoRNA is well conserved among vertebrates.
As exemplified by the two cases above, ncRNAimprint provides a centralized comprehensive resource center for studies concerning imprinted ncRNAs. In addition, the database will be regularly updated according to newly published data in order to present up-to-date information.
Detection of novel imprinted ncRNAs from deep sequencing data
As discussed above (see Introduction), mapping deep sequencing data to known imprinted regions could provide valuable information, not only for analyzing expression patterns of known imprinted ncRNAs, but also for detecting novel noncoding transcripts in these loci. Although most of these data have been explored on a whole-genome scale that already includes imprinted regions, relevant studies paid little attention to examining imprinted regions. As a result, valuable data about imprinted ncRNAs might be ignored or filtered out. For example, the miRNA members of the Sfmbt2-hosted miRNA cluster (see Discussion, above) were actually detected in several different studies that were not focused on whether the newly identified miRNAs are imprinted or not. Therefore, this cluster was easily overlooked. Furthermore, by analyzing deep sequencing data we have obtained a set of miRNA candidates present within imprinted regions (data not shown). The graphical browser, DeepMap, which was created to show the tracks of fully matched deep sequencing reads in human and mouse imprinted regions, can provide helpful clues to identify novel noncoding transcripts in certain imprinted regions. In addition, researchers can also download the sequences of imprinted regions from ncRNAimprint, and map deep sequencing reads from their libraries of interest to these regions. To account for the rapid accumulation of available data, we will regularly update our database to ensure the inclusion of newly published deep sequencing data and novel imprinted regions.
MATERIALS AND METHODS
Data collection and detailed annotations
An exhaustive search for reviews of imprinted ncRNAs was performed using queries from a table of keywords, including, but not limited to, “ncRNA,” “genomic imprinting,” “noncoding,” and “imprinted.” A catalog including ncRNA name and organism was compiled based on analyses of these reviews and their cited references. For each ncRNA, we extracted annotations, references, and sequences by performing comprehensive searches in relevant databases (Supplemental Table S2). Imprinting features, such as imprinted region, imprinted tissues and stages, ICRs and expressed allele, which could not be found in existing databases, were obtained by analyzing related publications. The compiled annotations were then formalized and automatically imported into a MySQL database with tables for four ncRNA categories: snoRNAs, miRNAs, antisense ncRNAs, and mRNA-like ncRNAs. Each table was assigned general annotation fields, as well as specific annotation fields for its corresponding category (Supplemental Table S1).
Annotations for imprinted ncRNA-related diseases, ICRs, and imprinted regions were also obtained by analyzing literature and extracting data from relevant databases (Supplemental Table S2). Most information about imprinted ncRNA-related diseases was collected from the Online Mendelian Inheritance in Man (OMIM) (http://www.ncbi.nlm.nih.gov/omim/) and miR2Disease (only for miRNAs) (Jiang et al. 2009) databases, using ncRNA names or aliases as queries. A collection of ICRs was mainly extracted from research papers, and a summary table of mammalian imprinted genes was downloaded from the Catalog of Imprinted Genes (Morison et al. 2001). From this table, the beginning and end of each imprinted region were defined as the first and last gene, respectively, that are known to be imprinted at that location. Only data from the human and mouse regions were collected. Sequences were downloaded from the human (Mar. 2006; NCBI36/hg18) and mouse (July 2007; NCBI37/mm9) assemblies using the UCSC Genome Browser (Rhead et al. 2010). Annotations of diseases, ICRs, and imprinted regions were manually compiled and imported into three separate tables in the MySQL database.
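The rule for delimiting an imprinted region, taking the first and last imprinted gene at a location as its boundaries, can be sketched as follows. This Python example assumes that genes have already been assigned to a region label and that coordinates are given as start and end positions on one chromosome; the function name, labels, and coordinates are hypothetical.

```python
def region_bounds(genes):
    """Derive region boundaries from the imprinted genes assigned to each region.

    `genes` is an iterable of (region_label, gene_start, gene_end) tuples.
    A region begins at the start of its first imprinted gene and ends at the
    end of its last imprinted gene.
    """
    bounds = {}
    for region, start, end in genes:
        if region in bounds:
            lo, hi = bounds[region]
            bounds[region] = (min(lo, start), max(hi, end))
        else:
            bounds[region] = (start, end)
    return bounds


# Toy gene coordinates (hypothetical values, unordered on purpose).
GENES = [
    ("region-A", 500, 900),
    ("region-A", 100, 300),
    ("region-A", 1200, 1500),
    ("region-B", 50, 80),
]

bounds = region_bounds(GENES)
```

Because the boundaries depend only on the outermost imprinted genes, adding a newly reported imprinted gene outside the current span would extend the region.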
To add newly detected miRNAs and snoRNAs that may not be included in any review about imprinted ncRNAs, tracks of snoRNAs and miRNAs in all imprinted regions were downloaded from the UCSC Table Browser (http://genome.ucsc.edu/; Rhead et al. 2010). Human sno/miRNA tracks were downloaded from the human (Mar. 2006; NCBI36/hg18) assembly. Mouse miRNA tracks were downloaded from the mouse (July 2007; NCBI37/mm9) assembly.
Deep sequencing data processing
A total of 69 small-RNA libraries of diverse tissues and cell lines from human and mouse were compiled from nine related studies (Supplemental Table S2) and downloaded from the NCBI GEO website (Barrett et al. 2009). Three predicted piRNA data sets (accession numbers: DQ569913–DQ601958; DQ539889–DQ569912; AB349185–AB353040) and one oocyte clustered small RNA data set (accession numbers: AB334800–AB349184) were downloaded from the NCBI GenBank database (Benson et al. 2008). Reads with 3' adapters or barcodes were truncated to remove these sequences using our in-house Pascal scripts. The adapter-free reads in each library were then mapped to human and mouse imprinted regions using Bowtie (version 0.9.9.3; Langmead et al. 2009) with the options -f -k 200 -v 0. Only reads that matched over their entire length were retained for import into our MySQL database.
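The two processing steps above, 3' adapter removal and retention of reads that match a region with no mismatches, can be illustrated with a small pure-Python sketch. This is not the in-house Pascal script or Bowtie itself; it is a naive stand-in under stated assumptions: the adapter is searched as an exact substring, reads lacking the adapter are returned unchanged, and exact string search on both strands plays the role of the zero-mismatch setting. All function names and sequences are hypothetical.

```python
def trim_3prime_adapter(read, adapter):
    """Cut the read at the first exact occurrence of the adapter.

    Assumption: a read without the adapter is returned unchanged.
    """
    idx = read.find(adapter)
    return read[:idx] if idx != -1 else read


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def exact_hits(read, region_seq):
    """List (0-based start, strand) for every full-length, zero-mismatch match."""
    hits = []
    for strand, query in (("+", read), ("-", reverse_complement(read))):
        pos = region_seq.find(query)
        while pos != -1:
            hits.append((pos, strand))
            pos = region_seq.find(query, pos + 1)
    return hits


# Toy sequences (hypothetical).
REGION = "TTACGGATCCAAT"
ADAPTER = "TCGTATGC"

trimmed = trim_3prime_adapter("ACGGATTCGTATGC", ADAPTER)
plus_hits = exact_hits(trimmed, REGION)
minus_hits = exact_hits("TTGGAT", REGION)
no_hits = exact_hits("ACGTAT", REGION)
```

In the Bowtie options quoted above, -f declares FASTA input, -k 200 reports up to 200 valid alignments per read, and -v 0 allows no mismatches; a read with an empty hit list in this sketch corresponds to one that would be discarded.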
ACKNOWLEDGMENTS
We thank Xiao-Hong Chen, Qiao-Juan Huang, and Yi-Ling Chen for their technical assistance. We are also grateful to Yi-Jun Zhang, Zhi-Wei Dong, and Chong-Jian Chen for their useful suggestions. This research was supported by the National Natural Science Foundation of China (No. 30830066, 30771151, 30870530, and 30900820) and the National Basic Research Program (No. 2005CB724600) from the Ministry of Science and Technology of China.
Footnotes
Article published online ahead of print. Article and publication date are at http://www.rnajournal.org/cgi/doi/10.1261/rna.2226910.
REFERENCES
- Babak T, Deveale B, Armour C, Raymond C, Cleary MA, van der Kooy D, Johnson JM, Lim LP 2008. Global survey of genomic imprinting by transcriptome sequencing. Curr Biol 18: 1735–1741
- Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Marshall KA, et al. 2009. NCBI GEO: Archive for high-throughput functional genomic data. Nucleic Acids Res 37: D885–D890
- Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL 2008. GenBank. Nucleic Acids Res 36: D25–D30
- Bliek J, Maas SM, Ruijter JM, Hennekam RC, Alders M, Westerveld A, Mannens MM 2001. Increased tumour risk for BWS patients correlates with aberrant H19 and not KCNQ1OT1 methylation: Occurrence of KCNQ1OT1 hypomethylation in familial cases of BWS. Hum Mol Genet 10: 467–476
- Bliek J, Terhal P, van den Bogaard MJ, Maas S, Hamel B, Salieb-Beugelaar G, Simon M, Letteboer T, van der Smagt J, Kroes H, et al. 2006. Hypomethylation of the H19 gene causes not only Silver-Russell syndrome (SRS) but also isolated asymmetry or an SRS-like phenotype. Am J Hum Genet 78: 604–614
- Cavaille J, Buiting K, Kiefmann M, Lalande M, Brannan CI, Horsthemke B, Bachellerie JP, Brosius J, Huttenhofer A 2000. Identification of brain-specific and imprinted small nucleolar RNA genes exhibiting an unusual genomic organization. Proc Natl Acad Sci 97: 14311–14316
- Davis E, Caiment F, Tordoir X, Cavaille J, Ferguson-Smith A, Cockett N, Georges M, Charlier C 2005. RNAi-mediated allelic trans-interaction at the imprinted Rtl1/Peg11 locus. Curr Biol 15: 743–749
- Doe CM, Relkovic D, Garfield AS, Dalley JW, Theobald DE, Humby T, Wilkinson LS, Isles AR 2009. Loss of the imprinted snoRNA mbii-52 leads to increased 5htr2c pre-RNA editing and altered 5HT2CR-mediated behaviour. Hum Mol Genet 18: 2140–2148
- Edwards CA, Ferguson-Smith AC 2007. Mechanisms regulating imprinted genes in clusters. Curr Opin Cell Biol 19: 281–289
- Edwards CA, Mungall AJ, Matthews L, Ryder E, Gray DJ, Pask AJ, Shaw G, Graves JA, Rogers J, Dunham I, et al. 2008. The evolution of the DLK1-DIO3 imprinted domain in mammals. PLoS Biol 6: e135 10.1371/journal.pbio.0060135
- Engemann S, Strodicke M, Paulsen M, Franck O, Reinhardt R, Lane N, Reik W, Walter J 2000. Sequence and functional comparison in the Beckwith-Wiedemann region: Implications for a novel imprinting centre and extended imprinting. Hum Mol Genet 9: 2691–2706
- Feil R, Berger F 2007. Convergent evolution of genomic imprinting in plants and mammals. Trends Genet 23: 192–199
- Flomen R, Knight J, Sham P, Kerwin R, Makoff A 2004. Evidence that RNA editing modulates splice site selection in the 5-HT2C receptor gene. Nucleic Acids Res 32: 2113–2122
- Friedlander MR, Chen W, Adamidi C, Maaskola J, Einspanier R, Knespel S, Rajewsky N 2008. Discovering microRNAs from deep sequencing data using miRDeep. Nat Biotechnol 26: 407–415
- Glazov EA, McWilliam S, Barris WC, Dalrymple BP 2008. Origin, evolution, and biological role of miRNA cluster in DLK-DIO3 genomic region in placental mammals. Mol Biol Evol 25: 939–948
- Griffiths-Jones S, Grocock RJ, van Dongen S, Bateman A, Enright AJ 2006. miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res 34: D140–D144
- Hore TA, Rapkins RW, Graves JA 2007. Construction and evolution of imprinted loci in mammals. Trends Genet 23: 440–448
- Jiang Q, Wang Y, Hao Y, Juan L, Teng M, Zhang X, Li M, Wang G, Liu Y 2009. miR2Disease: A manually curated database for microRNA deregulation in human disease. Nucleic Acids Res 37: D98–D104
- Kishore S, Stamm S 2006. The snoRNA HBII-52 regulates alternative splicing of the serotonin receptor 2C. Science 311: 230–232
- Kiss T 2002. Small nucleolar RNAs: An abundant group of noncoding RNAs with diverse cellular functions. Cell 109: 145–148
- Koerner MV, Pauler FM, Huang R, Barlow DP 2009. The function of non-coding RNAs in genomic imprinting. Development 136: 1771–1783
- Kuzmin A, Han Z, Golding MC, Mann MR, Latham KE, Varmuza S 2008. The PcG gene Sfmbt2 is paternally expressed in extraembryonic tissues. Gene Expr Patterns 8: 107–116
- Langmead B, Trapnell C, Pop M, Salzberg SL 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10: R25 10.1186/gb-2009-10-3-r25
- Lestrade L, Weber MJ 2006. snoRNA-LBME-db, a comprehensive database of human H/ACA and C/D box snoRNAs. Nucleic Acids Res 34: D158–D162
- Lister R, Gregory BD, Ecker JR 2009. Next is now: New technologies for sequencing of genomes, transcriptomes, and beyond. Curr Opin Plant Biol 12: 107–118
- Liu C, Bai B, Skogerbo G, Cai L, Deng W, Zhang Y, Bu D, Zhao Y, Chen R 2005. NONCODE: An integrated knowledge database of non-coding RNAs. Nucleic Acids Res 33: D112–D115
- Mardis ER 2008a. Next-generation DNA sequencing methods. Annu Rev Genomics Hum Genet 9: 387–402
- Mardis ER 2008b. The impact of next-generation sequencing technology on genetics. Trends Genet 24: 133–141
- Morison IM, Paton CJ, Cleverley SD 2001. The imprinted gene and parent-of-origin effect database. Nucleic Acids Res 29: 275–276
- Morison IM, Ramsay JP, Spencer HG 2005. A census of mammalian imprinting. Trends Genet 21: 457–465
- Nahkuri S, Taft RJ, Korbie DJ, Mattick JS 2008. Molecular evolution of the HBII-52 snoRNA cluster. J Mol Biol 381: 810–815
- Notredame C, Higgins DG, Heringa J 2000. T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol 302: 205–217
- Pauler FM, Koerner MV, Barlow DP 2007. Silencing by imprinted noncoding RNAs: Is transcription the answer? Trends Genet 23: 284–292
- Paulsen M, Khare T, Burgard C, Tierling S, Walter J 2005. Evolution of the Beckwith-Wiedemann syndrome region in vertebrates. Genome Res 15: 146–153
- Peters J, Robson JE 2008. Imprinted noncoding RNAs. Mamm Genome 19: 493–502
- Reinhart B, Chaillet JR 2005. Genomic imprinting: Cis-acting sequences and regional control. Int Rev Cytol 243: 173–213
- Rhead B, Karolchik D, Kuhn RM, Hinrichs AS, Zweig AS, Fujita PA, Diekhans M, Smith KE, Rosenbloom KR, Raney BJ, et al. 2010. The UCSC Genome Browser database: Update 2010. Nucleic Acids Res 38: D613–D619
- Royo H, Cavaille J 2008. Non-coding RNAs in imprinted gene clusters. Biol Cell 100: 149–166
- Royo H, Basyuk E, Marty V, Marques M, Bertrand E, Cavaille J 2007. Bsr, a nuclear-retained RNA with monoallelic expression. Mol Biol Cell 18: 2817–2827
- Sahoo T, del Gaudio D, German JR, Shinawi M, Peters SU, Person RE, Garnica A, Cheung SW, Beaudet AL 2008. Prader-Willi phenotype caused by paternal deficiency for the HBII-85 C/D box small nucleolar RNA cluster. Nat Genet 40: 719–721
- Schattner P, Barberan-Soler S, Lowe TM 2006. A computational screen for mammalian pseudouridylation guide H/ACA RNAs. RNA 12: 15–25
- Schulz R, Woodfine K, Menheniott TR, Bourc'his D, Bestor T, Oakey RJ 2008. WAMIDEX: A web atlas of murine genomic imprinting and differential expression. Epigenetics 3: 89–96
- Seitz H, Royo H, Bortolin ML, Lin SP, Ferguson-Smith AC, Cavaille J 2004a. A large imprinted microRNA gene cluster at the mouse Dlk1-Gtl2 domain. Genome Res 14: 1741–1748
- Seitz H, Royo H, Lin SP, Youngson N, Ferguson-Smith AC, Cavaille J 2004b. Imprinted small RNA genes. Biol Chem 385: 905–911
- Smits G, Mungall AJ, Griffiths-Jones S, Smith P, Beury D, Matthews L, Rogers J, Pask AJ, Shaw G, VandeBerg JL, et al. 2008. Conservation of the H19 noncoding RNA and H19-IGF2 imprinting mechanism in therians. Nat Genet 40: 971–976
- Temple IK, Shield JP 2002. Transient neonatal diabetes, a disorder of imprinting. J Med Genet 39: 872–875
- Umlauf D, Goto Y, Cao R, Cerqueira F, Wagschal A, Zhang Y, Feil R 2004. Imprinting along the Kcnq1 domain on mouse chromosome 7 involves repressive histone methylation and recruitment of Polycomb group complexes. Nat Genet 36: 1296–1300
- Vitali P, Basyuk E, Le Meur E, Bertrand E, Muscatelli F, Cavaille J, Huttenhofer A 2005. ADAR2-mediated editing of RNA substrates in the nucleolus is inhibited by C/D small nucleolar RNAs. J Cell Biol 169: 745–753
- Wood AJ, Oakey RJ 2006. Genomic imprinting in mammals: Emerging themes and established theories. PLoS Genet 2: e147 10.1371/journal.pgen.0020147
- Yang JH, Shao P, Zhou H, Chen YQ, Qu LH 2010. deepBase: A database for deeply annotating and mining deep sequencing data. Nucleic Acids Res 38: D123–D130
- Zhang Y, Qu L 2009. Non-coding RNAs and the acquisition of genomic imprinting in mammals. Sci China C Life Sci 52: 195–204
- Zhang X, Zhou Y, Mehta KR, Danila DC, Scolavino S, Johnson SR, Klibanski A 2003. A pituitary-derived MEG3 isoform functions as a growth suppressor in tumor cells. J Clin Endocrinol Metab 88: 5119–5126
Articles from RNA are provided here courtesy of The RNA Society