Common sense beyond English: Evaluating and improving multilingual language models for commonsense reasoning

Document ID: 13410391615186549353
Author: Lin B; Lee S; Qiao X; Ren X
Publication year: 2021
Publication venue: arXiv preprint arXiv:2106.06937

Snippet

Commonsense reasoning research has so far been limited to English. We aim to evaluate and improve popular multilingual language models (ML-LMs) to help advance commonsense reasoning (CSR) beyond English. We collect the Mickey Corpus, consisting …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
- G06F17/30684—Query execution using natural language analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/3066—Query translation
- G06F17/30669—Translation of the query language, e.g. Chinese to English
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/271—Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
- G06F17/2881—Natural language generation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2785—Semantic analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30427—Query translation
- G06F17/3043—Translation of natural language queries to structured queries
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass

Similar Documents

Publication	Publication Date	Title
Lin et al.	2021	Common sense beyond English: Evaluating and improving multilingual language models for commonsense reasoning
Ross et al.	2021	Tailor: Generating and perturbing text with semantic controls
Liu et al.	2019	Unsupervised paraphrasing by simulated annealing
US10496756B2 (en)	2019-12-03	Sentence creation system
Van Der Goot et al.	2021	MultiLexNorm: A shared task on multilingual lexical normalization
Zhao et al.	2014	A bootstrapping based refinement framework for mining opinion words and targets
Chen et al.	2022	Transformers go for the LOLs: Generating (humourous) titles from scientific abstracts end-to-end
Maruf et al.	2019	A survey on document-level machine translation: Methods and evaluation
Li et al.	2015	Extracting biomedical event with dual decomposition integrating word embeddings
Conia et al.	2023	Increasing coverage and precision of textual information in multilingual knowledge graphs
Mitra et al.	2021	Zero-shot Multi-lingual Interrogative Question Generation for "People Also Ask" at Bing
Howcroft et al.	2022	Most NLG is low-resource: here’s what we can do about it
Gerlach	2015	Improving statistical machine translation of informal language: a rule-based pre-editing approach for French forums
Badaro et al.	2020	A link prediction approach for accurately mapping a large-scale Arabic lexical resource to English WordNet
KR100725723B1 (en)	2007-06-08	Method and Device for Restoring Omitted Component of Korean Subject Using Constraints
Demir	2022	Turkish data-to-text generation using sequence-to-sequence neural networks
Desai et al.	2021	Taxonomic survey of Hindi Language NLP systems
Han et al.	2014	Unsupervised quality estimation model for English to German translation and its application in extensive supervised evaluation
Petrasova et al.	2018	Building the semantic similarity model for social network data streams
Van Thin et al.	2023	A systematic literature review on Vietnamese aspect-based sentiment analysis
Zhang et al.	2023	Crocosum: A benchmark dataset for cross-lingual code-switched summarization
Bhattacharya et al.	2021	Enhancing aspect extraction for Hindi
Anke et al.	2015	TALN-UPF: Taxonomy learning exploiting CRF-based hypernym extraction on encyclopedic definitions
Calleja et al.	2024	Benchmark for automatic keyword extraction in Spanish: Datasets and methods
Abdellatif et al.	2024	A transformer-based approach for augmenting software engineering chatbots datasets