Part of speech tagging for a resource poor language: Sindhi in Devanagari script using HMM and CRF

Nathani et al., 2021

Document ID: 14281827314905554271
Author: Nathani B; Joshi N
Publication year: 2021
Publication venue: Proceedings of the 18th International Conference on Natural Language Processing (ICON)

Snippet

Part-of-speech (POS) tagging is a pre-processing step in various NLP applications and is used mainly in machine translation. This research proposes two POS taggers: an HMM-based tagger and a CRF-based tagger. To develop these taggers, the corpus of manually annotated 30,000 …

Classifications

- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
          - G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
            - G06F17/30634—Querying
              - G06F17/30657—Query processing
                - G06F17/30675—Query execution
                  - G06F17/30684—Query execution using natural language analysis
        - G06F17/20—Handling natural language data
          - G06F17/27—Automatic analysis, e.g. parsing
            - G06F17/2765—Recognition
              - G06F17/2775—Phrasal analysis, e.g. finite state techniques, chunking
                - G06F17/278—Named entity recognition
              - G06F17/277—Lexical analysis, e.g. tokenisation, collocates
            - G06F17/2705—Parsing
              - G06F17/271—Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
              - G06F17/2715—Statistical methods
            - G06F17/2785—Semantic analysis
            - G06F17/2795—Thesaurus; Synonyms
            - G06F17/274—Grammatical analysis; Style critique
          - G06F17/28—Processing or translating of natural language
            - G06F17/2809—Data driven translation
              - G06F17/2827—Example based machine translation; Alignment
            - G06F17/2872—Rule based translation
              - G06F17/2881—Natural language generation
            - G06F17/2863—Processing of non-latin text
            - G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
          - G06F17/21—Text processing
            - G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
    - G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
      - G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
        - G06K9/62—Methods or arrangements for recognition using electronic means
          - G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
            - G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
              - G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L15/00—Speech recognition
        - G10L15/08—Speech classification or search
          - G10L15/18—Speech classification or search using natural language modelling

Similar Documents

Publication	Publication Date	Title
Tiedemann	2003	Recycling translations: Extraction of lexical data from parallel corpora and their application in natural language processing
Antony et al.	2011	Parts of speech tagging for Indian languages: a literature survey
Patil et al.	2016	Survey of named entity recognition systems with respect to Indian and foreign languages
CN110532328A (en)	2019-12-03	A kind of text concept figure building method
Dien et al.	2003	POS-tagger for English-Vietnamese bilingual corpus
Abdurakhmonova et al.	2021	Linguistic functionality of Uzbek Electron Corpus: uzbekcorpus.uz
Mishra et al.	2022	Context based NLP framework of textual tagging for low resource language
Nathani et al.	2021	Part of speech tagging for a resource poor language: Sindhi in Devanagari script using HMM and CRF
CN109815497B (en)	2023-04-18	Character attribute extraction method based on syntactic dependency
Belay et al.	2021	Impacts of homophone normalization on semantic models for amharic
Amri et al.	2019	Amazigh POS tagging using TreeTagger: a language independant model
Dibitso et al.	2019	Part of speech tagging for Setswana African language
Abdelghany et al.	2020	Doc2Vec: An approach to identify Hadith Similarities
Jafar Tafreshi et al.	2020	A novel approach to conditional random field-based named entity recognition using Persian specific features
Mbouopda et al.	2019	A Word Representation to Improve Named Entity Recognition in Low-resource Languages
Seresangtakul et al.	2019	Thai-Isarn dialect parallel corpus construction for machine translation
Tedla	2018	Tigrinya morphological segmentation with bidirectional long short-term memory neural networks and its effect on English-Tigrinya machine translation
Lingam et al.	2014	English to Telugu Rule based Machine Translation System: A Hybrid Approach
Yohan et al.	2014	Automatic named entity identification and classification using heuristic based approach for telugu
Patheja et al.	2012	Analysis of part of speech tagging
Samir et al.	2019	Training and evaluation of TreeTagger on Amazigh corpus
Yuyun et al.	2024	Enhancing Neural Machine Translation Model for Low-Resource Languages: A Case Study of Indonesian to Mamuju
Samir et al.	2018	Amazigh named entity recognition: a novel approach
Jamwal	2021	Modeling translation of code mixed English-Dogri language
Deshmukh	2018	Comparison of generative and discriminative models of part of speech taggers for marathi language