Nathani et al., 2021 - Google Patents
Part of speech tagging for a resource poor language: Sindhi in Devanagari script using HMM and CRFNathani et al., 2021
View PDF- Document ID
- 14281827314905554271
- Author
- Nathani B
- Joshi N
- Publication year
- Publication venue
- Proceedings of the 18th International Conference on Natural Language Processing (ICON)
External Links
Snippet
Part of speech tagging is a pre-processing step of various NLP applications. Mainly it is used in Machine Translation. This research proposes two POS taggers, ie, an HMM-based and CRF based tagger. To develop this tagger, the corpus of manually annotated 30,000 …
- 201000000522 chronic kidney disease 0 title description 33
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
- G06F17/30684—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/2775—Phrasal analysis, e.g. finite state techniques, chunking
- G06F17/278—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/271—Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/2715—Statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
- G06F17/2881—Natural language generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2785—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2795—Thesaurus; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/274—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
- G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
- G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Tiedemann | Recycling translations: Extraction of lexical data from parallel corpora and their application in natural language processing | |
Antony et al. | Parts of speech tagging for Indian languages: a literature survey | |
Patil et al. | Survey of named entity recognition systems with respect to Indian and foreign languages | |
CN110532328A (en) | A kind of text concept figure building method | |
Dien et al. | POS-tagger for English-Vietnamese bilingual corpus | |
Abdurakhmonova et al. | Linguistic functionality of Uzbek Electron Corpus: uzbekcorpus. uz | |
Mishra et al. | Context based NLP framework of textual tagging for low resource language | |
Nathani et al. | Part of speech tagging for a resource poor language: Sindhi in Devanagari script using HMM and CRF | |
CN109815497B (en) | Character attribute extraction method based on syntactic dependency | |
Belay et al. | Impacts of homophone normalization on semantic models for amharic | |
Amri et al. | Amazigh POS tagging using TreeTagger: a language independant model | |
Dibitso et al. | Part of speech tagging for Setswana African language | |
Abdelghany et al. | Doc2Vec: An approach to identify Hadith Similarities | |
Jafar Tafreshi et al. | A novel approach to conditional random field-based named entity recognition using Persian specific features | |
Mbouopda et al. | A Word Representation to Improve Named Entity Recognition in Low-resource Languages | |
Seresangtakul et al. | Thai-Isarn dialect parallel corpus construction for machine translation | |
Tedla | Tigrinya morphological segmentation with bidirectional long short-term memory neural networks and its effect on English-Tigrinya machine translation | |
Lingam et al. | English to Telugu Rule based Machine Translation System: A Hybrid Approach | |
Yohan et al. | Automatic named entity identification and classification using heuristic based approach for telugu | |
Patheja et al. | Analysis of part of speech tagging | |
Samir et al. | Training and evaluation of TreeTagger on Amazigh corpus | |
Yuyun et al. | Enhancing Neural Machine Translation Model for Low-Resource Languages: A Case Study of Indonesian to Mamuju | |
SAMIR et al. | AMAZIGH NAMED ENTITY RECOGNITION: A NOVEL APPROACH. | |
Jamwal | Modeling translation of code mixed English-Dogri language | |
Deshmukh | Comparison of generative and discriminative models of part of speech taggers for marathi language |