Nothing Special   »   [go: up one dir, main page]

Heo, 2015 - Google Patents

Improving quality of high-throughput sequencing reads

Heo, 2015

View PDF
Document ID
3939140136031624093
Author
Heo Y
Publication year

External Links

Snippet

Rapid advances in high-throughput sequencing (HTS) technologies have led to an exponential increase in the amount of sequencing data. HTS sequencing reads, however, contain far more errors than does data collected through traditional sequencing methods …
Continue reading at www.ideals.illinois.edu (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G06F19/22Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for sequence comparison involving nucleotides or amino acids, e.g. homology search, motif or SNP [Single-Nucleotide Polymorphism] discovery or sequence alignment
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G06F19/18Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for functional genomics or proteomics, e.g. genotype-phenotype associations, linkage disequilibrium, population genetics, binding site identification, mutagenesis, genotyping or genome annotation, protein-protein interactions or protein-nucleic acid interactions
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G06F19/24Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for machine learning, data mining or biostatistics, e.g. pattern finding, knowledge discovery, rule extraction, correlation, clustering or classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G06F19/20Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for hybridisation or gene expression, e.g. microarrays, sequencing by hybridisation, normalisation, profiling, noise correction models, expression ratio estimation, probe design or probe optimisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G06F19/14Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for phylogeny or evolution, e.g. evolutionarily conserved regions determination or phylogenetic tree construction
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES OR MICRO-ORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or micro-organisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or micro-organisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869Methods for sequencing
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G06F19/28Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for programming tools or database systems, e.g. ontologies, heterogeneous data integration, data warehousing or computing architectures
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30289Database design, administration or maintenance
    • G06F17/30303Improving data quality; Data cleansing
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386Retrieval requests
    • G06F17/30424Query processing
    • G06F17/30533Other types of queries
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES OR MICRO-ORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or micro-organisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or micro-organisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844Nucleic acid amplification reactions
    • C12Q1/6846Common amplification features
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES OR MICRO-ORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or micro-organisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or micro-organisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876Hybridisation probes

Similar Documents

Publication Publication Date Title
Amarasinghe et al. Opportunities and challenges in long-read sequencing data analysis
Weisenfeld et al. Direct determination of diploid genome sequences
Liao et al. Current challenges and solutions of de novo assembly
Liu et al. Interrogating the “unsequenceable” genomic trinucleotide repeat disorders by long-read sequencing
Pfeifer From next-generation resequencing reads to a high-quality variant data set
Xu et al. Comparison of somatic mutation calling methods in amplicon and whole exome sequence data
Li Toward better understanding of artifacts in variant calling from high-coverage samples
Abnizova et al. Computational errors and biases in short read next generation sequencing
US10777301B2 (en) Hierarchical genome assembly method using single long insert library
US20210174901A1 (en) METHOD FOR SIMULTANEOUS DETECTION OF GENOME-WIDE COPY NUMBER CHANGES, cnLOH, INDELS, AND GENE MUTATIONS
US9165109B2 (en) Sequence assembly and consensus sequence determination
US20130110407A1 (en) Determining variants in genome of a heterogeneous sample
Alic et al. Objective review of de novo stand‐alone error correction methods for NGS data
Li et al. Single nucleotide polymorphism (SNP) detection and genotype calling from massively parallel sequencing (MPS) data
Wong et al. DNA sequencing technologies: sequencing data protocols and bioinformatics tools
Ahsan et al. A survey of algorithms for the detection of genomic structural variants from long-read sequencing data
Chen et al. Recent advances in sequence assembly: principles and applications
Heo Improving quality of high-throughput sequencing reads
Te Boekhorst et al. Computational problems of analysis of short next generation sequencing reads
US10424395B2 (en) Computation pipeline of single-pass multiple variant calls
Mikheenko et al. TandemMapper and TandemQUAST: mapping long reads and assessing/improving assembly quality in extra-long tandem repeats
WO2019132010A1 (en) Method, apparatus and program for estimating base type in base sequence
Heo et al. Comprehensive evaluation of error-correction methodologies for genome sequencing data
Heo et al. Comprehensive assessment of error correction methods for high-throughput sequencing data
Niehus et al. PopDel identifies medium-size deletions jointly in tens of thousands of genomes