Yi et al., 2019 - Google Patents

Learning from data-rich problems: a case study on genetic variant calling

Document ID
12080240882496184377
Author
Yi R
Chang P
Baid G
Carroll A
Publication year
2019
Publication venue
arXiv preprint arXiv:1911.05151

Snippet

Next Generation Sequencing can sample the whole genome (WGS) or only the 1-2% of the genome that codes for proteins, called the whole exome (WES). Machine learning approaches to variant calling achieve high accuracy in WGS data, but the reduced number …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00 Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10 Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G06F19/22 Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for sequence comparison involving nucleotides or amino acids, e.g. homology search, motif or SNP [Single-Nucleotide Polymorphism] discovery or sequence alignment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00 Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10 Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G06F19/24 Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for machine learning, data mining or biostatistics, e.g. pattern finding, knowledge discovery, rule extraction, correlation, clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00 Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10 Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G06F19/28 Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for programming tools or database systems, e.g. ontologies, heterogeneous data integration, data warehousing or computing architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00 Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10 Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G06F19/18 Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for functional genomics or proteomics, e.g. genotype-phenotype associations, linkage disequilibrium, population genetics, binding site identification, mutagenesis, genotyping or genome annotation, protein-protein interactions or protein-nucleic acid interactions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/12 Computer systems based on biological models using genetic models
    • G06N3/126 Genetic algorithms, i.e. information processing using digital simulations of the genetic system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions

Similar Documents

Publication Publication Date Title
Marlétaz et al. A new spiralian phylogeny places the enigmatic arrow worms among gnathiferans
JP6814981B2 (en) Learning device, identification device, learning identification system, and program
Barker et al. EvoPipes.net: bioinformatic tools for ecological and evolutionary genomics
Major et al. HLA typing from 1000 genomes whole genome and whole exome illumina data
Mieth et al. DeepCOMBI: explainable artificial intelligence for the analysis and discovery in genome-wide association studies
Gusareva et al. Practical aspects of genome-wide association interaction analysis
Notin et al. TranceptEVE: Combining family-specific and family-agnostic models of protein sequences for improved fitness prediction
Kaminski et al. pLM-BLAST: distant homology detection based on direct comparison of sequence representations from protein language models
Yi et al. Learning from data-rich problems: a case study on genetic variant calling
Pagnuco et al. HMMER Cut-off Threshold Tool (HMMERCTTER): Supervised classification of superfamily protein sequences with a reliable cut-off threshold
Mock et al. BERTax: taxonomic classification of DNA sequences with Deep Neural Networks
Wang et al. WaveNano: a signal‐level nanopore base‐caller via simultaneous prediction of nucleotide labels and move labels through bi‐directional WaveNets
US20230298692A1 (en) Method, System and Computer Program Product for Determining Presentation Likelihoods of Neoantigens
Soni et al. A new test suggests hundreds of amino acid polymorphisms in humans are subject to balancing selection
Dediu Tone and genes: New cross-linguistic data and methods support the weak negative effect of the “derived” allele of ASPM on tone, but not of Microcephalin
Abbas et al. Role of Genetics in Diagnosis and Management of Hypertrophic Cardiomyopathy: A Glimpse into the Future
Venkata Subramaniya et al. Protein contact map denoising using generative adversarial networks
Robinson et al. Approximate Bayesian estimation of extinction rate in the Finnish Daphnia magna metapopulation
Aljouie et al. Cross-validation and cross-study validation of kidney cancer with machine learning and whole exome sequences from the National Cancer Institute
Jia et al. NLPEI: A novel self-interacting protein prediction model based on natural language processing and evolutionary information
Ramachandran et al. Deep learning for better variant calling for cancer diagnosis and treatment
Jagodnik et al. HetIG-PreDiG: A heterogeneous integrated graph model for predicting human disease genes based on gene expression
Strzoda et al. A mapping-free NLP-based technique for sequence search in Nanopore long-reads
Cai et al. Identification of protein complexes from tandem affinity purification/mass spectrometry data via biased random walk
O’Fallon et al. Jovian enables direct inference of germline haplotypes from short reads via sequence-to-sequence modeling