The trade-offs of domain adaptation for neural language models
Grangier et al., 2021
- Document ID
- 6394897304953600891
- Author
- Grangier D
- Iter D
- Publication year
- 2021
- Publication venue
- arXiv preprint arXiv:2109.10274
Snippet
This work connects language model adaptation with concepts of machine learning theory. We consider a training setup with a large out-of-domain set and a small in-domain set. We derive how the benefit of training a model on either set depends on the size of the sets and …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding or deleting nodes or connections, pruning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G06K9/627—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches based on distances between the pattern to be recognised and training or reference patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6296—Graphical models, e.g. Bayesian networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Grangier et al. | The trade-offs of domain adaptation for neural language models | |
Alibrahim et al. | Hyperparameter optimization: Comparing genetic algorithm against grid search and bayesian optimization | |
Nam et al. | Spread spurious attribute: Improving worst-group accuracy with spurious attribute estimation | |
Malinin et al. | Reverse kl-divergence training of prior networks: Improved uncertainty and adversarial robustness | |
Xu et al. | Neural response generation via gan with an approximate embedding layer | |
Ravanelli et al. | Improving speech recognition by revising gated recurrent units | |
Lu et al. | Reinforcement learning-powered semantic communication via semantic similarity | |
CN112396129B (en) | Challenge sample detection method and universal challenge attack defense system | |
Hong et al. | Sentiment analysis with deeply learned distributed representations of variable length texts | |
CN110275939B (en) | Method and device for determining conversation generation model, storage medium and electronic equipment | |
Chen et al. | End-to-end learning of LDA by mirror-descent back propagation over a deep architecture | |
Carroll et al. | Uni [mask]: Unified inference in sequential decision problems | |
Zheng et al. | Efficient neural architecture search for end-to-end speech recognition via straight-through gradients | |
Ng et al. | De’hubert: Disentangling noise in a self-supervised model for robust speech recognition | |
WO2015011521A1 (en) | An incremental learner via an adaptive mixture of weak learners distributed on a non-rigid binary tree | |
Sun et al. | Neural semi-supervised learning for text classification under large-scale pretraining | |
Naveen et al. | Deep learning for threat actor attribution from threat reports | |
Ling et al. | Semi-supervised few-shot learning via multi-factor clustering | |
Mostafa et al. | GOF at Arabic hate speech 2022: breaking the loss function convention for data-imbalanced Arabic offensive text detection | |
Hanawal et al. | Unsupervised early exit in dnns with multiple exits | |
Tong et al. | Graph convolutional network based semi-supervised learning on multi-speaker meeting data | |
Pal et al. | Self supervised BERT for legal text classification | |
Zhang et al. | Learning to search efficient densenet with layer-wise pruning | |
Yu et al. | ANEDL: adaptive negative evidential deep learning for open-set semi-supervised learning | |
Chien et al. | Stochastic adversarial learning for domain adaptation |