Yi et al., 2019
Learning from data-rich problems: a case study on genetic variant calling
- Document ID
- 12080240882496184377
- Author
- Yi R
- Chang P
- Baid G
- Carroll A
- Publication year
- 2019
- Publication venue
- arXiv preprint arXiv:1911.05151
Snippet
Next Generation Sequencing can sample the whole genome (WGS) or the 1-2% of the genome that codes for proteins called the whole exome (WES). Machine learning approaches to variant calling achieve high accuracy in WGS data, but the reduced number …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/22—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for sequence comparison involving nucleotides or amino acids, e.g. homology search, motif or SNP [Single-Nucleotide Polymorphism] discovery or sequence alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/24—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for machine learning, data mining or biostatistics, e.g. pattern finding, knowledge discovery, rule extraction, correlation, clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/28—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for programming tools or database systems, e.g. ontologies, heterogeneous data integration, data warehousing or computing architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/18—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for functional genomics or proteomics, e.g. genotype-phenotype associations, linkage disequilibrium, population genetics, binding site identification, mutagenesis, genotyping or genome annotation, protein-protein interactions or protein-nucleic acid interactions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Marlétaz et al. | A new spiralian phylogeny places the enigmatic arrow worms among gnathiferans | |
JP6814981B2 (en) | Learning device, identification device, learning identification system, and program | |
Barker et al. | EvoPipes.net: bioinformatic tools for ecological and evolutionary genomics | |
Major et al. | HLA typing from 1000 Genomes whole genome and whole exome Illumina data | |
Mieth et al. | DeepCOMBI: explainable artificial intelligence for the analysis and discovery in genome-wide association studies | |
Gusareva et al. | Practical aspects of genome-wide association interaction analysis | |
Notin et al. | TranceptEVE: Combining family-specific and family-agnostic models of protein sequences for improved fitness prediction | |
Kaminski et al. | pLM-BLAST: distant homology detection based on direct comparison of sequence representations from protein language models | |
Yi et al. | Learning from data-rich problems: a case study on genetic variant calling | |
Pagnuco et al. | HMMER Cut-off Threshold Tool (HMMERCTTER): Supervised classification of superfamily protein sequences with a reliable cut-off threshold | |
Mock et al. | BERTax: taxonomic classification of DNA sequences with Deep Neural Networks | |
Wang et al. | WaveNano: a signal‐level nanopore base‐caller via simultaneous prediction of nucleotide labels and move labels through bi‐directional WaveNets | |
US20230298692A1 (en) | Method, System and Computer Program Product for Determining Presentation Likelihoods of Neoantigens | |
Soni et al. | A new test suggests hundreds of amino acid polymorphisms in humans are subject to balancing selection | |
Dediu | Tone and genes: New cross-linguistic data and methods support the weak negative effect of the “derived” allele of ASPM on tone, but not of Microcephalin | |
Abbas et al. | Role of Genetics in Diagnosis and Management of Hypertrophic Cardiomyopathy: A Glimpse into the Future | |
Venkata Subramaniya et al. | Protein contact map denoising using generative adversarial networks | |
Robinson et al. | Approximate Bayesian estimation of extinction rate in the Finnish Daphnia magna metapopulation | |
Aljouie et al. | Cross-validation and cross-study validation of kidney cancer with machine learning and whole exome sequences from the National Cancer Institute | |
Jia et al. | NLPEI: A novel self-interacting protein prediction model based on natural language processing and evolutionary information | |
Ramachandran et al. | Deep learning for better variant calling for cancer diagnosis and treatment | |
Jagodnik et al. | HetIG-PreDiG: A heterogeneous integrated graph model for predicting human disease genes based on gene expression | |
Strzoda et al. | A mapping-free NLP-based technique for sequence search in Nanopore long-reads | |
Cai et al. | Identification of protein complexes from tandem affinity purification/mass spectrometry data via biased random walk | |
O’Fallon et al. | Jovian enables direct inference of germline haplotypes from short reads via sequence-to-sequence modeling |