research-article

STAR

Published: 01 January 2013

Abstract

Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and still unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitations and mapping biases.
Results: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80–90% success rate, corroborating the high precision of the STAR mapping strategy.
Availability and implementation: STAR is implemented as standalone C++ code. It is free, open-source software distributed under the GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.
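
The seed search named in the Results can be illustrated with a toy example. The following is a minimal sketch, not STAR's code: it assumes a naively built, uncompressed suffix array over a made-up 30-base "genome", and all function names (`build_suffix_array`, `maximal_mappable_prefix`, `sequential_seeds`) are hypothetical. It covers only the sequential search for the longest exactly matching prefix of the unmapped part of a read; the clustering and stitching stage is not shown, and the handling of unmappable bases is a simplification chosen for this sketch.

```python
def build_suffix_array(genome):
    # Naive construction by sorting suffix start positions; adequate for a toy genome only.
    return sorted(range(len(genome)), key=lambda i: genome[i:])


def common_prefix_len(a, b):
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def maximal_mappable_prefix(genome, sa, query):
    """Return (length, positions) of the longest prefix of `query` found in `genome`."""
    # Binary search for the position where `query` would be inserted among the sorted suffixes.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if genome[sa[mid]:] < query:
            lo = mid + 1
        else:
            hi = mid
    # The suffix sharing the longest prefix with `query` is adjacent to the insertion point.
    best = 0
    for idx in (lo - 1, lo):
        if 0 <= idx < len(sa):
            best = max(best, common_prefix_len(genome[sa[idx]:], query))
    if best == 0:
        return 0, []
    prefix = query[:best]
    # All suffixes starting with `prefix` form one contiguous block around `lo`.
    hits = []
    i = lo - 1
    while i >= 0 and genome.startswith(prefix, sa[i]):
        hits.append(sa[i])
        i -= 1
    i = lo
    while i < len(sa) and genome.startswith(prefix, sa[i]):
        hits.append(sa[i])
        i += 1
    return best, sorted(hits)


def sequential_seeds(genome, sa, read):
    """Repeat the prefix search on the unmapped remainder of the read."""
    seeds = []
    start = 0
    while start < len(read):
        length, positions = maximal_mappable_prefix(genome, sa, read[start:])
        if length == 0:
            start += 1  # simplification: skip a base that occurs nowhere in the genome
            continue
        seeds.append((start, length, positions))
        start += length
    return seeds


# Toy genome: exon 1, an intron with GT...AG ends, exon 2 (all sequences invented).
exon1, intron, exon2 = "ACGTTGCA", "GTAAGTCCCTTTAG", "TTCGGATC"
genome = exon1 + intron + exon2
sa = build_suffix_array(genome)

# A read spanning the splice junction: last 5 bases of exon 1 + first 5 bases of exon 2.
read = exon1[-5:] + exon2[:5]
seeds = sequential_seeds(genome, sa, read)
print(seeds)  # -> [(0, 5, [3]), (5, 5, [22])]
```

In this toy case the first seed ends at the last base of exon 1 and the second seed starts at the first base of exon 2, so the gap between them in the genome (positions 8 to 21) corresponds to the invented intron. A real aligner would also need to handle both strands, mismatches and multi-mapping seeds, none of which this sketch attempts.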

Published In

Bioinformatics  Volume 29, Issue 1
January 2013
147 pages

Publisher

Oxford University Press, Inc.

United States
