Multilingual multitask joint neural information extraction
Lin, Ying
Description
- Title
- Multilingual multitask joint neural information extraction
- Author(s)
- Lin, Ying
- Issue Date
- 2020-12-02
- Director of Research (if dissertation) or Advisor (if thesis)
- Ji, Heng
- Doctoral Committee Chair(s)
- Ji, Heng
- Committee Member(s)
- Han, Jiawei
- Zhai, ChengXiang
- Roth, Dan
- Stoyanov, Veselin
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Joint Information Extraction
- Multitask Learning
- Abstract
- In the age of information overload, the ability to automatically extract useful structured information from text is urgently needed by a wide range of applications, such as information retrieval and question answering. Over the past decades, researchers have proposed various Information Extraction (IE) techniques to discover important knowledge elements (e.g., entities, relations, events) in unstructured documents. However, because these approaches typically rely on hand-crafted rules or manually annotated data, adapting them to new settings, such as new languages, domains, scenarios, or genres, is usually expensive. The goal of this thesis is therefore to develop more robust and portable models for IE tasks and to build a joint neural architecture that performs multiple IE tasks within a single model. We first focus on the generality of IE models. Because most existing neural models use word embeddings as input features, they are sensitive to the quality of word representations. We investigate the factors that cause performance degradation when a name tagger is applied to new data and tackle this issue from two aspects: (1) Robustness: the reliability and the amount of information carried by each feature vary across words, so we incorporate reliability signals and dynamic feature composition to enable the model to select reliable and effective features; (2) Generality: overfitting is a major cause of the large performance gap between seen and unseen names, so we encourage the model to rely on contextual features, which generalize better. Next, we explore the portability of models for sequence labeling, the underlying problem of many Natural Language Processing (NLP) tasks such as name tagging. Current models cannot be applied to very dissimilar settings (e.g., other languages), and annotating new data for every possible setting is infeasible. Hence, we propose to transfer knowledge across models through multitask learning to reduce the need for data annotation; to maximize the knowledge being transferred, we design a unified and extendable architecture that integrates multiple transfer approaches. We then extend this framework to more IE tasks and propose OneIE, a joint neural architecture that performs multilingual entity, relation, and event extraction simultaneously. Beyond multitask learning, we further incorporate global features to capture the cross-subtask and cross-instance interactions among knowledge elements (illustrative sketches of the feature-gating and global-feature scoring ideas appear after this record). Finally, we show that OneIE can also perform joint inference without using the additional global features.
- Graduation Semester
- 2020-12
- Type of Resource
- Thesis
- Permalink
- http://hdl.handle.net/2142/109521
- Copyright and License Information
- Copyright 2020 Ying Lin
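The abstract compresses two concrete mechanisms that are easy to misread in prose. The first is reliability-aware dynamic feature composition for name tagging: each word's representation is a gated mixture of feature vectors (e.g., a word embedding and a character-level vector), with the gates conditioned on reliability signals such as word frequency. The sketch below is a minimal illustration of this gating idea, not the dissertation's code; all module, variable, and dimension names are assumptions.

```python
import torch
import torch.nn as nn

class DynamicFeatureComposition(nn.Module):
    """Gated mixture of per-word features, conditioned on reliability signals."""
    def __init__(self, feature_dims, reliability_dim, hidden_dim):
        super().__init__()
        # Project each feature (word embedding, character-level vector, ...)
        # into a shared space so they can be mixed element-wise.
        self.projections = nn.ModuleList(
            [nn.Linear(d, hidden_dim) for d in feature_dims]
        )
        # One gate per feature, conditioned on the feature itself plus
        # reliability signals (e.g., log word/character frequency).
        self.gates = nn.ModuleList(
            [nn.Linear(d + reliability_dim, hidden_dim) for d in feature_dims]
        )

    def forward(self, features, reliability):
        # features: list of (batch, seq_len, feature_dims[i]) tensors
        # reliability: (batch, seq_len, reliability_dim)
        mixed = 0
        for proj, gate, feat in zip(self.projections, self.gates, features):
            g = torch.sigmoid(gate(torch.cat([feat, reliability], dim=-1)))
            mixed = mixed + g * proj(feat)  # gated element-wise sum
        return mixed  # (batch, seq_len, hidden_dim)

# Usage: mix a 100-d word embedding with a 50-d character vector,
# using a 2-d reliability signal per token.
model = DynamicFeatureComposition([100, 50], reliability_dim=2, hidden_dim=128)
word_emb, char_vec = torch.randn(4, 20, 100), torch.randn(4, 20, 50)
rel = torch.randn(4, 20, 2)
print(model([word_emb, char_vec], rel).shape)  # torch.Size([4, 20, 128])
```

The second mechanism is OneIE's global-feature scoring: a candidate information graph is scored as the sum of its local (per-node and per-edge) scores plus a learned weight vector applied to a vector of global feature counts that capture cross-subtask and cross-instance patterns. The snippet below illustrates the arithmetic only; the two example features are hypothetical, not features from the thesis.

```python
import torch

# Hypothetical global feature counts for one candidate graph, e.g.:
#   f[0]: entities serving as arguments of multiple events
#   f[1]: events with two arguments playing the same role
f = torch.tensor([2.0, 1.0])
u = torch.tensor([0.5, -1.3])      # learned global-feature weights
local_score = torch.tensor(7.4)    # sum of local classification scores
graph_score = local_score + u @ f  # graph score = local + u . f
print(graph_score.item())          # ~7.1: the unlikely pattern is penalized
```

During decoding, candidate graphs are kept in a beam and ranked by this combined score, so graphs exhibiting unlikely cross-subtask patterns are pushed down even when their local scores are high.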
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Dept. of Computer Science