Banasiak et al., 2018 - Google Patents

Extended N-Gram Model for Analysis of Polish Texts

Banasiak et al., 2018

Document ID: 5800075568816895911
Author: Banasiak D; Mierzwa J; Sterna A
Publication year: 2018
Publication venue: Man-Machine Interactions 5: 5th International Conference on Man-Machine Interactions, ICMMI 2017 Held at Kraków, Poland, October 3-6, 2017

External Links

Cited by

Snippet

The paper presents extended N-gram model designed for analysis of texts in Polish language. One of possible applications of the model is automatic detection and correction of errors that occur during computerized text edition. N-grams belong to the group of statistical …

Continue reading at link.springer.com (other versions)

238000004458 analytical method 0 title abstract description 30

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2217—Character encodings
- G06F17/2223—Handling non-latin characters, e.g. kana-to-kanji conversion
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/2775—Phrasal analysis, e.g. finite state techniques, chunking
- G06F17/278—Named entity recognition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
- G06F17/30684—Query execution using natural language analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2785—Semantic analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/24—Editing, e.g. insert/delete
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2795—Thesaurus; Synonyms
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/211—Formatting, i.e. changing of presentation of document
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling

Similar Documents

Publication	Publication Date	Title
Amjad et al.	2020	“Bend the truth”: Benchmark dataset for fake news detection in Urdu language and its evaluation
Şeker et al.	2012	Initial explorations on using CRFs for Turkish named entity recognition
Sawalha et al.	2013	SALMA: standard Arabic language morphological analysis
Kruczek et al.	2020	Are n-gram categories helpful in text classification?
Jabbar et al.	2018	An improved Urdu stemming algorithm for text mining based on multi-step hybrid approach
Cing et al.	2020	Improving accuracy of part-of-speech (POS) tagging using hidden markov model and morphological analysis for Myanmar Language
Parupalli et al.	2018	Bcsat: A benchmark corpus for sentiment analysis in telugu using word-level annotations
Zhang et al.	2019	PKU paraphrase bank: A sentence-level paraphrase corpus for Chinese
Wong et al.	2014	iSentenizer‐μ: Multilingual Sentence Boundary Detection Model
Chizhikova et al.	2022	Multilingual case-insensitive named entity recognition
Onyenwe et al.	2019	Toward an effective igbo part-of-speech tagger
Alian et al.	2023	Syntactic-semantic similarity based on dependency tree kernel
Mokanarangan et al.	2016	Tamil morphological analyzer using support vector machines
Rezai et al.	2017	FarsiTag: A part-of-speech tagging system for Persian
Tukeyev et al.	2021	Universal programs for stemming, segmentation, morphological analysis of Turkic words
Kapočiūtė-Dzikienė et al.	2017	A comparison of Lithuanian morphological analyzers
Al_Janabi et al.	2019	Pragmatic text mining method to find the topics of citation network
Nguyen-Son et al.	2019	Detecting machine-translated paragraphs by matching similar words
Aejas et al.	2021	Named entity recognition for cultural heritage preservation
Myint et al.	2019	Disambiguation using joint entropy in part of speech of written Myanmar text
Banasiak et al.	2018	Extended N-Gram Model for Analysis of Polish Texts
Reentovich et al.	2016	The first one-million corpus for the Belarusian NooJ module
Almujaiwel	2019	Grammatical construction of function words between old and modern written Arabic: A corpus-based analysis
Turki Khemakhem et al.	2016	POS tagging without a tagger: using aligned corpora for transferring knowledge to under-resourced languages
Worku et al.	2022	Amharic fake news detection on social media using feature fusion