
TWP Presentation


1. Introduction:
- NLP is a branch of AI devoted to making computers understand statements or words written in human languages. Its function is to ease users' work and to satisfy the wish to communicate with computers in natural language, and it can be classified into two parts: Natural Language Understanding and Natural Language Generation. The main objectives of NLP include the interpretation, analysis, and manipulation of natural language data for the intended purpose, using various algorithms, tools, and methods.
2. Components of NLP: There are two main components
- NLU: enables machines to understand natural language and analyze it by extracting concepts, entities, emotions, keywords, etc. It is used in customer care applications to understand the problems reported by customers, and it involves the meaning of the language, the language context, and the various forms of the language.
- NLG: the process of producing meaningful phrases, sentences, and paragraphs from an internal representation. It happens in several phases: identifying the goals, planning how the goals may be achieved by evaluating the situation, the context, and the available communicative resources, and realizing the plan as text.

3. Applications of NLP:
- Machine Translation: translating phrases from one language to
another, and keeping the meaning of the sentences intact with
grammar and tenses.
- Text Categorization: for example, categorizing trouble tickets or complaint requests
- Information Extraction: extracting entities such as names, places,
prices, and reviews -> a powerful way to summarize the information
relevant to users’ needs and build databases or classify items
- Text Summarization: with the explosion of information, summarizing the data while keeping the meaning intact is required -> recognize the important information in a large set of data and understand its deeper meaning
- Medicine: extract and summarize information on any signs or
symptoms and response data with the aim of identifying the side
effects of any medicine and labeling it.
4. Dataset: corpus: a collection of linguistic data, either compiled from written texts or transcribed from recorded speech, in different languages
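
As a small, hedged illustration of working with a corpus, the snippet below loads the Brown corpus through the NLTK library; both the library and the particular corpus are assumptions made for this example, not something prescribed by the notes above.

```python
import nltk
from nltk.corpus import brown

# Download one classic English corpus and inspect it as plain linguistic data.
nltk.download("brown", quiet=True)
print(len(brown.words()))   # total number of word tokens in the corpus
print(brown.sents()[0])     # the first sentence, already tokenized
```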
5. SOTA models in NLP: The term NLP dates back to the 1940s with the definition of machine translation. By the end of the last decade, the field exploded with the release of general-purpose sentence processors and speech recognition, followed by neural language modeling (based on the occurrence of words). After that, word embeddings and neural networks in NLP helped models understand the meaning and the context of words, tackling problems such as Sentiment Analysis, Text Summarization, and Machine Translation. Recently, pre-trained models such as BERT or GPT, which use the Transformer architecture, can learn longer-term dependencies and can be fine-tuned during training, and they have pushed this field forward.

a. Word2Vec:

Methodology: This section serves as the backbone of the report, dissecting the core methodologies of Word2Vec: the Skip-gram and Continuous Bag of Words (CBOW) models. It explains the training process, showing how the model estimates word representations by capturing context and co-occurrence patterns. The discussion includes the mathematical formulations and algorithmic details, offering readers a comprehensive understanding of the model's inner workings.
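
As a hedged, minimal sketch of how such embeddings can be trained in practice, the example below uses the gensim library (an assumption of this example; the report itself does not name a toolkit). The tiny corpus and all parameter values are illustrative only.

```python
from gensim.models import Word2Vec

# Minimal Skip-gram training sketch with gensim (assumed library); the toy
# corpus below stands in for a real tokenized dataset.
sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["dogs", "and", "cats", "are", "animals"],
]
model = Word2Vec(
    sentences,
    vector_size=50,  # dimensionality of the word vectors
    window=2,        # context window on each side of the target word
    min_count=1,     # keep even rare words in this toy corpus
    sg=1,            # sg=1 selects Skip-gram; sg=0 would select CBOW
    epochs=50,
)
print(model.wv.most_similar("king", topn=3))  # nearest neighbors by cosine similarity
```

Swapping sg=1 for sg=0 trains the CBOW variant on the same data.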

Key Concepts: Central to Word2Vec are several key concepts, such as the
nuanced understanding of word similarity and the pivotal role of context in shaping
meaningful word embeddings. This section delves into these concepts, providing
illustrative examples and connecting them to the broader theoretical framework of
distributed representations.
Experimental Results: The empirical validation of Word2Vec is explored in this
section, starting with a detailed account of the experimental setup. It dissects the
datasets employed, elucidates the metrics used for evaluation, and meticulously
presents the outcomes. The discussion spans various tasks, from word similarity to
analogy completion, showcasing the model's versatility and effectiveness in
capturing semantic nuances. The impact of Word2Vec on downstream applications
is thoroughly examined, providing a holistic view of its real-world applicability.

b. Glove
GloVe is based on the idea that words that appear together in text are related to
each other. The basic idea behind the GloVe word embedding is to derive the
relationship between the words from statistics.
Unlike a plain occurrence matrix, the co-occurrence matrix tells you how often a particular word pair occurs together. Each value in the co-occurrence matrix counts a pair of words occurring together. The vectors are learned by minimizing a loss function that measures the difference between the actual co-occurrence counts of words in the training corpus and the co-occurrence counts predicted from the word vectors.
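
A hedged sketch of that objective, written with NumPy over a toy co-occurrence matrix: the weighting function, cutoff x_max = 100, and exponent alpha = 0.75 follow the commonly cited GloVe formulation, while all array names and sizes here are illustrative assumptions.

```python
import numpy as np

def glove_loss(X, W, W_ctx, b, b_ctx, x_max=100, alpha=0.75):
    """Weighted least-squares GloVe objective over nonzero co-occurrence counts."""
    loss = 0.0
    for i, j in zip(*np.nonzero(X)):
        weight = min(1.0, (X[i, j] / x_max) ** alpha)    # down-weights rare pairs
        pred = W[i] @ W_ctx[j] + b[i] + b_ctx[j]         # predicted log co-occurrence
        loss += weight * (pred - np.log(X[i, j])) ** 2   # squared error vs. actual count
    return loss

# Toy statistics: 5 words, random co-occurrence counts, random 8-dimensional vectors.
rng = np.random.default_rng(0)
X = rng.integers(0, 10, size=(5, 5)).astype(float)
W, W_ctx = rng.normal(size=(5, 8)), rng.normal(size=(5, 8))
b, b_ctx = np.zeros(5), np.zeros(5)
print(glove_loss(X, W, W_ctx, b, b_ctx))
```

Training amounts to adjusting W, W_ctx, b, and b_ctx by gradient descent to drive this loss down.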

GloVe has several advantages over other word embedding methods, such as
Word2Vec. It is more efficient to train, and it can handle larger corpora of text.
It is also less sensitive to outliers, such as words that appear only a few times
in the training corpus.

c. Seq2Seq

In the paper "Sequence to Sequence Learning with Neural Networks", Sutskever, Vinyals, and Le introduce a new machine learning model for sequence-to-sequence learning tasks, where both the input and output are sequences. This model is called a sequence-to-sequence neural network (Seq2Seq).

A Seq2Seq model consists of two neural networks: an encoder and a decoder. The
encoder transforms the input into an internal representation, and the decoder uses
this internal representation to generate the output. The encoder of a Seq2Seq model
is a recurrent neural network (RNN) that takes an input sequence and produces a
hidden state vector. The decoder is another RNN that takes the hidden state vector
as input and produces an output sequence.
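
As a hedged sketch of that encoder-decoder pairing (assuming PyTorch; the layer sizes and single-layer LSTMs are illustrative, not the configuration used in the paper):

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len); the final LSTM state summarizes the whole input
        _, state = self.rnn(self.embed(src))
        return state

class Decoder(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tgt, state):
        # tgt: (batch, tgt_len); state: the encoder's final (h, c) pair
        output, state = self.rnn(self.embed(tgt), state)
        return self.out(output), state

# Toy usage: vocabulary of 100 tokens, a batch of 2 sequences.
enc, dec = Encoder(100, 32, 64), Decoder(100, 32, 64)
src = torch.randint(0, 100, (2, 7))
tgt = torch.randint(0, 100, (2, 5))
logits, _ = dec(tgt, enc(src))
print(logits.shape)  # torch.Size([2, 5, 100]): a score per vocabulary word at each step
```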

The paper describes how Seq2Seq models are trained end to end with supervised learning, maximizing the probability of the correct output sequence given its input. Seq2Seq models can achieve high performance on a variety of sequence-to-sequence tasks, including:

● Machine translation
● Summarization
● Question answering
● Chatbots

d. ULMFiT: Transfer learning has been successfully applied to computer vision tasks, but existing approaches in NLP still require task-specific modifications and training from scratch. In this paper, the authors propose Universal Language Model Fine-tuning (ULMFiT), an effective transfer learning method that can be applied to any task in NLP.

ULMFiT consists of three steps:

1. Pre-train a language model (LM) on a large, general-domain text corpus: this captures general language patterns and knowledge.

2. Fine-tune the LM on the target task's dataset: adapt it to the specific language characteristics of the task.

3. Fine-tune the classifier on the target task: train a classifier on top of the fine-tuned LM to make predictions for the specific task.
ULMFiT achieves state-of-the-art results on text classification tasks with significantly
less training data. It also improves performance on tasks with limited training data.
Additionally, ULMFiT is applicable to a wide range of NLP tasks. It provides a
general framework for transfer learning in NLP. ULMFiT has influenced subsequent
research on transfer learning for NLP and paved the way for more advanced
approaches like ELMo and BERT.
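
A minimal sketch of the three stages, assuming a toy PyTorch LSTM in place of the AWD-LSTM that ULMFiT actually uses (and omitting its gradual unfreezing and discriminative learning rates), could look like this:

```python
import torch
import torch.nn as nn

# Toy stand-in for the ULMFiT backbone; names and sizes are illustrative only.
class ToyLanguageModel(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.lm_head = nn.Linear(hidden_dim, vocab_size)  # next-token prediction

    def forward(self, tokens):
        hidden, _ = self.rnn(self.embed(tokens))
        return self.lm_head(hidden), hidden

lm = ToyLanguageModel()

# Stage 1: pre-train the LM on a large general-domain corpus (next-token loss).
# Stage 2: keep training the same LM objective on the target task's own text.
# (Both stages would loop over batches and minimize cross-entropy; omitted here.)

# Stage 3: add a classifier head on top of the fine-tuned LM and train it on labels.
classifier_head = nn.Linear(128, 2)         # e.g. two sentiment classes
tokens = torch.randint(0, 1000, (4, 20))    # toy batch of token ids
_, hidden = lm(tokens)
logits = classifier_head(hidden[:, -1, :])  # last hidden state as document features
print(logits.shape)                         # torch.Size([4, 2])
```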

e. Neural Networks
f. Transformer Architecture

Transformer is a deep learning (DL) model, based on a self-attention mechanism that weights the importance of each part of the input data differently. It is mainly used in computer vision (CV) and natural language processing (NLP).

Similar to recurrent neural networks (RNNs), transformers are designed to process sequential input data like natural language, and perform tasks like text summarization and translation. However, unlike RNNs, transformers process the entire input at once. The attention mechanism allows the model to focus on the most relevant parts of the input for each output.
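
A minimal sketch of the scaled dot-product self-attention at the heart of that mechanism (assuming PyTorch; the multi-head projections, masking, and dropout used in the full model are omitted):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5   # similarity of every query to every key
    weights = F.softmax(scores, dim=-1)             # attention weights sum to 1 per position
    return weights @ v, weights                     # weighted mix of the values

# Toy self-attention: batch of 1, a 4-token sequence, model dimension 8; q = k = v = x.
x = torch.randn(1, 4, 8)
out, attn = scaled_dot_product_attention(x, x, x)
print(out.shape, attn.shape)  # torch.Size([1, 4, 8]) torch.Size([1, 4, 4])
```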

The Transformer architecture follows an encoder-decoder structure but does not rely
on recurrence and convolutions in order to generate an output.

In a nutshell, the task of the encoder, on the left half of the Transformer architecture,
is to map an input sequence to a sequence of continuous representations, which is
then fed into a decoder.

The decoder, on the right half of the architecture, receives the output of the encoder
together with the decoder output at the previous time step to generate an output
sequence.
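
A hedged sketch of that flow, passing toy tensors through PyTorch's built-in nn.Transformer module (a real system would add token embeddings, positional encodings, attention masks, and an output projection, all omitted here):

```python
import torch
import torch.nn as nn

model = nn.Transformer(d_model=32, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       batch_first=True)
src = torch.randn(2, 10, 32)   # encoder input: 2 sequences of 10 positions
tgt = torch.randn(2, 7, 32)    # decoder input: the output generated so far
out = model(src, tgt)          # encoder memory + decoder input -> next representations
print(out.shape)               # torch.Size([2, 7, 32])
```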

g. BERT
Unlike traditional language models that process text in a sequential manner, BERT
uses a bidirectional approach. It considers both the left and right context of each
word in a sentence, allowing it to capture more comprehensive contextual
information.
BERT is pre-trained on a large amount of unlabeled text data, such as Wikipedia
articles, using two main tasks: masked language modeling and next sentence
prediction. Masked language modeling involves randomly masking some words in a
sentence and training the model to predict the masked words based on the
surrounding context. Next sentence prediction involves training the model to
determine whether two sentences appear consecutively in a document.
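
As a hedged illustration of the masked-language-modeling task, the snippet below uses the Hugging Face transformers library and the public bert-base-uncased checkpoint; both are assumptions of this example rather than part of the text above.

```python
from transformers import pipeline

# Ask BERT to fill in a masked word from its bidirectional context.
fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("The capital of France is [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))
```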
During fine-tuning, BERT is trained on labeled data specific to the target task,
allowing it to adapt and perform well on various natural language processing tasks.

h. T5

The paper's goal is to measure general language-learning abilities through a unified "text-to-text" format across various tasks. T5 follows the Transformer architecture and is trained in a "text-to-text" manner. This means that it is trained to perform various tasks by converting the input and output of each task into a text format. It is trained on a large corpus of diverse data, including web documents, books, and Wikipedia articles. This format involves presenting the model with input text and instructing it to generate the corresponding output text, creating a consistent training objective for pre-training and fine-tuning. Tasks include translation, summarization, question answering, and classification. The authors detail how specific tasks are framed within this framework, such as text classification or translation, and mention the exceptions and adaptations made for certain tasks like regression or pronoun resolution. The paper emphasizes the simplicity and effectiveness of the text-to-text framework, aligning with prior work in the field.

T5 introduced the "Text-to-Text" framework, in which every NLP task (Translation, Classification, etc.) has the same underlying structure: text is fed as input to the model and text is produced as output.
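
A hedged example of this text-to-text usage, assuming the Hugging Face transformers library and the public t5-small checkpoint (the task is named directly in the input string):

```python
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# The task prefix turns translation into plain text-in, text-out generation.
inputs = tokenizer("translate English to German: The house is wonderful.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```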

6. Challenges: Some of the common challenges are: contextual words and phrases, where the same words and phrases can have different meanings in a sentence; these are easy for humans to understand but form a challenging task for machines. Similar challenges arise when dealing with synonyms, because humans use many different words to express the same idea; words of different levels of intensity, such as large, huge, and big, may be used by different people, which makes it challenging to process the language and design algorithms that handle all these issues. Further, homonyms, words that are pronounced the same but have different definitions, are also problematic for question answering and speech-to-text applications, because the distinct words cannot be told apart in their spoken form.
