AI Unit 5
Unit 5
NLP stands for Natural Language Processing, a field at the intersection of computer
science, human language (linguistics), and artificial intelligence. It is the technology
that machines use to understand, analyse, manipulate, and interpret human languages.
It helps developers organize knowledge for tasks such as translation, automatic
summarization, Named Entity Recognition (NER), speech recognition, relationship
extraction, and topic segmentation.
Until the 1980s, most natural language processing systems were based on complex sets of
hand-written rules. From the late 1980s onward, machine learning algorithms were
introduced for language processing. Modern NLP now covers various applications, such as
speech recognition, machine translation, and machine reading of text. Combined, these
applications allow an artificial intelligence system to gain knowledge of the world.
Advantages of NLP
NLP helps users ask questions about any subject and get a direct response within
seconds.
NLP can give direct answers to a question, without unnecessary or unwanted
information.
NLP helps computers communicate with humans in their own languages.
It is very time efficient.
Many companies use NLP to improve the efficiency and accuracy of their
documentation processes and to identify information in large databases.
Disadvantages of NLP
Applications of NLP
1. Question Answering
Question Answering focuses on building systems that automatically answer the questions
asked by humans in a natural language.
2. Spam Detection
Spam detection is used to identify unwanted e-mail before it reaches a user's inbox.
3. Sentiment Analysis
Sentiment analysis is also known as opinion mining. It is used on the web to analyse the
attitude, behaviour, and emotional state of the sender. This application combines NLP
(Natural Language Processing) with statistics: it assigns a value to the text (positive,
negative, or neutral) and identifies the mood of the context (happy, sad, angry, etc.).
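The idea of assigning a positive, negative, or neutral value to a text can be sketched with a tiny word-list scorer. This is only an illustration: the word lists and the function names sentiment_score and sentiment_label are assumptions made for this example, and practical systems usually rely on much larger lexicons or trained statistical models.

```python
# Toy lexicon-based sentiment scoring.
# The word lists below are illustrative assumptions, not a real lexicon.
POSITIVE_WORDS = {"good", "great", "happy", "love", "excellent"}
NEGATIVE_WORDS = {"bad", "sad", "angry", "hate", "terrible"}


def sentiment_score(text: str) -> int:
    """Return (number of positive words) minus (number of negative words)."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    positives = sum(w in POSITIVE_WORDS for w in words)
    negatives = sum(w in NEGATIVE_WORDS for w in words)
    return positives - negatives


def sentiment_label(text: str) -> str:
    """Map the numeric score to one of the three values named in the text."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sentence in (
        "I love this great phone!",
        "The service was terrible.",
        "The parcel arrived on Monday.",
    ):
        print(sentence, "->", sentiment_label(sentence))
```

Counting words ignores negation ("not good") and context, which is one reason real sentiment analysis combines NLP preprocessing with statistical models.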
4. Machine Translation
Machine translation is used to translate text or speech from one natural language to
another natural language.
Example: Google Translate
5. Spelling correction
Software such as Microsoft Word and PowerPoint uses NLP to detect and correct spelling
mistakes.
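One common way to suggest a correction is to pick the known word with the smallest edit distance to the misspelled word. The sketch below illustrates that idea only; it is not a description of how any particular product works, and the VOCABULARY list and the suggest function are assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


# Hypothetical dictionary of known words, for illustration only.
VOCABULARY = ["language", "machine", "translation", "grammar", "parser"]


def suggest(word: str, vocabulary=VOCABULARY) -> str:
    """Return the vocabulary word closest to the given word."""
    return min(vocabulary, key=lambda known: edit_distance(word.lower(), known))


if __name__ == "__main__":
    for typo in ("langauge", "gramar", "parsr"):
        print(typo, "->", suggest(typo))
```

Real spelling correctors typically also weigh word frequency and the surrounding context, which this sketch leaves out.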
6. Speech Recognition
Speech recognition converts spoken words into text. It is used in applications such as
mobile devices, home automation, video recovery, dictation to Microsoft Word, voice
biometrics, and voice user interfaces.
7. Chatbot
Chatbots are one of the important applications of NLP. Many companies use them to
provide customer chat services.
8. Information extraction
Information extraction is one of the most important applications of NLP. It is used for
extracting structured information from unstructured or semi-structured machine-
readable documents.
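As a small illustration of turning unstructured text into structured fields, the sketch below pulls e-mail addresses and ISO-style dates out of a sentence with regular expressions. The patterns and the extract_record function are assumptions made for this example; they are deliberately simple and would miss many real-world formats.

```python
import re

# Simplified patterns, assumed for illustration only.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")  # YYYY-MM-DD only


def extract_record(text: str) -> dict:
    """Return the e-mail addresses and dates found in free text."""
    return {
        "emails": EMAIL_PATTERN.findall(text),
        "dates": DATE_PATTERN.findall(text),
    }


if __name__ == "__main__":
    note = ("Contact alice@example.com before 2024-03-15 "
            "or bob@example.org after 2024-04-01 for details")
    print(extract_record(note))
```

Pattern matching like this covers only well-formatted fields; extracting entities and relations from general text usually needs techniques such as Named Entity Recognition.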
Components of NLP
Natural Language Understanding (NLU) helps the machine to understand and analyse
human language by extracting the metadata from content such as concepts, entities,
keywords, emotion, relations, and semantic roles.
NLU is mainly used in business applications to understand a customer's problem in both
spoken and written language.
Natural Language Generation (NLG) acts as a translator that converts computerized
data into a natural language representation. It mainly involves text planning, sentence
planning, and text realization.
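The three NLG stages can be sketched as three small functions applied in sequence. This is a simplified, template-based illustration; the weather record, its field names, and the function names plan_text, plan_sentences, and realize are all assumptions made for this example.

```python
def plan_text(record: dict) -> dict:
    """Text planning: decide which pieces of the data are worth mentioning."""
    wanted = ("condition", "temperature_c")
    return {key: record[key] for key in wanted if record.get(key) is not None}


def plan_sentences(city: str, facts: dict) -> dict:
    """Sentence planning: decide how the chosen facts are grouped and ordered."""
    clauses = []
    if "condition" in facts:
        clauses.append(f"it is {facts['condition']}")
    if "temperature_c" in facts:
        clauses.append(f"the temperature is {facts['temperature_c']} degrees Celsius")
    return {"subject": city, "clauses": clauses}


def realize(plan: dict) -> str:
    """Text realization: turn the plan into a grammatical sentence."""
    if not plan["clauses"]:
        return f"No weather data is available for {plan['subject']}."
    return f"In {plan['subject']}, " + " and ".join(plan["clauses"]) + "."


def generate(record: dict) -> str:
    """Run the three stages in order on one data record."""
    return realize(plan_sentences(record["city"], plan_text(record)))


if __name__ == "__main__":
    print(generate({"city": "Pune", "condition": "sunny", "temperature_c": 31}))
```

Keeping the stages separate means the choice of content can change without touching the wording, and the wording can change without touching the data selection.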
Parsing
The word ‘parsing’ comes from the Latin word ‘pars’ (which means ‘part’). In NLP, parsing
is used to draw the exact meaning, or dictionary meaning, from the text. It is also called
syntactic analysis or syntax analysis. Syntax analysis checks the text for meaningfulness
by comparing it against the rules of a formal grammar. A sentence like “Give me hot
ice-cream”, for example, would be rejected by the parser or syntactic analyzer.
In this sense, parsing (syntactic analysis, or syntax analysis) may be defined as the
process of analyzing strings of symbols in natural language for conformance to the
rules of a formal grammar.
We can understand the relevance of parsing in NLP with the help of following points −
Deep parsing vs. shallow parsing
Deep parsing: In deep parsing, the search strategy gives a complete syntactic structure
to a sentence. It is suitable for complex NLP applications.
Shallow parsing: It is the task of parsing a limited part of the syntactic information
from the given task. It can be used for less complex NLP applications.
Recursive descent parsing is one of the most straightforward forms of parsing. Following
are some important points about recursive descent parser −
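A minimal sketch of a recursive descent parser is shown here, with one method per grammar rule, working top-down and reading the input from left to right. The toy grammar (S → NP VP, NP → ProperNoun | Det Noun, VP → Verb NP | Verb), the lexicon, and all class and function names are assumptions made for this example. The sketch uses one token of lookahead instead of backtracking, which is enough for this small grammar.

```python
# Hypothetical lexicon for the toy grammar described above.
LEXICON = {
    "Det": {"a", "the"},
    "Noun": {"flight", "pilot"},
    "ProperNoun": {"alice"},
    "Verb": {"books", "sees"},
}


class ParseError(Exception):
    """Raised when the input does not conform to the grammar."""


class RecursiveDescentParser:
    def __init__(self, tokens):
        self.tokens = [t.lower() for t in tokens]
        self.pos = 0  # index of the next unread token

    def _peek_is(self, category):
        return self.pos < len(self.tokens) and self.tokens[self.pos] in LEXICON[category]

    def _match(self, category):
        """Consume one token of the given category or fail."""
        if self._peek_is(category):
            word = self.tokens[self.pos]
            self.pos += 1
            return (category, word)
        raise ParseError(f"expected {category} at position {self.pos}")

    def parse_sentence(self):  # S -> NP VP
        tree = ("S", self.parse_np(), self.parse_vp())
        if self.pos != len(self.tokens):
            raise ParseError(f"unexpected token at position {self.pos}")
        return tree

    def parse_np(self):  # NP -> ProperNoun | Det Noun
        if self._peek_is("ProperNoun"):
            return ("NP", self._match("ProperNoun"))
        return ("NP", self._match("Det"), self._match("Noun"))

    def parse_vp(self):  # VP -> Verb NP | Verb
        verb = self._match("Verb")
        if self.pos < len(self.tokens):
            return ("VP", verb, self.parse_np())
        return ("VP", verb)


def parse(sentence: str):
    """Parse a whitespace-separated sentence and return its tree."""
    return RecursiveDescentParser(sentence.split()).parse_sentence()


if __name__ == "__main__":
    print(parse("the pilot books a flight"))
```

Each parse_* method corresponds to one non-terminal, so the call stack mirrors the parse tree as it is built. An input such as "pilot the books" raises ParseError because no NP rule matches its first word.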
Shift-reduce parser
Chart parser
Grammar
Grammar is defined as the set of rules for forming well-structured sentences. Grammar
plays an essential role in describing the syntactic structure of well-formed programs,
just as it denotes the syntactic rules used for conversation in natural languages.
In the theory of formal languages, grammar is also applicable in computer science, mainly
in programming languages and data structures. Example: in the C programming
language, precise grammar rules state how functions are built from lists and
statements.
Context-free grammar consists of a set of rules expressing how symbols of the language
can be grouped and ordered together and a lexicon of words and symbols.
One example rule expresses that an NP (noun phrase) can be composed of either a
ProperNoun or a determiner (Det) followed by a Nominal. A Nominal, in turn, can consist
of one or more Nouns:
NP → Det Nominal
NP → ProperNoun
Nominal → Noun | Nominal Noun
Context-free rules can also be hierarchically embedded, so we can combine the previous
rules with others, like the following, that express facts about the lexicon:
Det → a
Det → the
Noun → flight
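The rules above can be written as a small data structure and used to list the noun phrases they allow. The sketch below does this by expanding the leftmost non-terminal repeatedly, up to a word limit. The extra lexicon entries "morning" and "Houston", and the function name generate_phrases, are assumptions added so the example produces more than two phrases.

```python
from collections import deque

# Each non-terminal maps to its alternative right-hand sides.
GRAMMAR = {
    "NP": [["Det", "Nominal"], ["ProperNoun"]],
    "Nominal": [["Noun"], ["Nominal", "Noun"]],
    "Det": [["a"], ["the"]],
    "Noun": [["flight"], ["morning"]],  # "morning" is an assumed entry
    "ProperNoun": [["Houston"]],        # assumed entry
}


def generate_phrases(start: str, max_words: int) -> list:
    """Return every phrase of at most max_words words derivable from start."""
    results = set()
    queue = deque([(start,)])
    seen = {(start,)}
    while queue:
        form = queue.popleft()
        # Find the leftmost symbol that can still be rewritten.
        idx = next((i for i, sym in enumerate(form) if sym in GRAMMAR), None)
        if idx is None:  # only words remain: a finished phrase
            results.add(" ".join(form))
            continue
        for rhs in GRAMMAR[form[idx]]:
            new_form = form[:idx] + tuple(rhs) + form[idx + 1:]
            # No rule produces an empty string, so a form never gets shorter
            # and over-long forms can be dropped safely.
            if len(new_form) <= max_words and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sorted(results)


if __name__ == "__main__":
    for phrase in generate_phrases("NP", 3):
        print(phrase)
```

The word limit is needed because the rule Nominal → Nominal Noun is recursive, so the grammar describes noun phrases of unbounded length.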
A context-free grammar consists of a set of rules, or productions, each expressing the
ways the symbols of the language can be grouped, together with a lexicon of words.
Context-free grammar (CFG) can also be seen as the list of rules that define the set of all
well-formed sentences in a language. Each rule has a left-hand side that identifies a
syntactic category and a right-hand side that defines its alternative component parts,
reading from left to right.
Example: The rule s --> np vp means that "a sentence is defined as a noun phrase
followed by a verb phrase."
Formalism in rules for context-free grammar: A sentence in the language defined by a CFG
is a series of words that can be derived by systematically applying the rules, beginning
with a rule that has s on its left-hand side.
Use of parse tree in context-free grammar: A convenient way to describe a parse is to show
its parse tree, simply a graphical display of the parse.
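A parse tree can be held in code as nested tuples and shown as indented text. The sketch below is one simple way to do this; the example sentence, the category names det, noun, and verb, and the helper names format_tree and leaves are assumptions made for illustration.

```python
def format_tree(tree, indent=0):
    """Return the tree as a list of indented text lines.

    A tree is a tuple (label, child, ...). A leaf is (category, word).
    """
    label, *children = tree
    pad = "  " * indent
    if len(children) == 1 and isinstance(children[0], str):
        return [f"{pad}{label}: {children[0]}"]
    lines = [f"{pad}{label}"]
    for child in children:
        lines.extend(format_tree(child, indent + 1))
    return lines


def leaves(tree):
    """Read the words back off the tree from left to right."""
    _label, *children = tree
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(leaves(child))
    return words


# Parse of "the flight departs" under the rule s --> np vp.
EXAMPLE_TREE = (
    "s",
    ("np", ("det", "the"), ("noun", "flight")),
    ("vp", ("verb", "departs")),
)

if __name__ == "__main__":
    print("\n".join(format_tree(EXAMPLE_TREE)))
    print(" ".join(leaves(EXAMPLE_TREE)))
```

Each level of indentation corresponds to one application of a grammar rule, and reading the leaves from left to right gives back the original sentence.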
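Formally, a context-free grammar is written as G = (V, T, P, S)
Where,
V is a finite set of non-terminal symbols (variables),
T is a finite set of terminal symbols,
P is a finite set of production rules, and
S is the start symbol.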
In a CFG, the start symbol is used to derive the string. You derive the string by
repeatedly replacing a non-terminal with the right-hand side of one of its productions,
until all non-terminals have been replaced by terminal symbols.
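The derivation process can be traced step by step in code. The sketch below always rewrites the leftmost non-terminal and records each intermediate form. The small grammar, its words, and the function name leftmost_derivation are assumptions made for this example.

```python
# Toy productions: each non-terminal maps to its alternative right-hand sides.
PRODUCTIONS = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb", "NP"]],
    "Det": [["the"], ["a"]],
    "Noun": [["pilot"], ["flight"]],
    "Verb": [["books"]],
}


def leftmost_derivation(start, choices):
    """Derive a string from the start symbol.

    choices[i] is the index of the alternative used at step i for the
    leftmost non-terminal. Returns every intermediate form as text.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        idx = next((i for i, sym in enumerate(form) if sym in PRODUCTIONS), None)
        if idx is None:
            raise ValueError("no non-terminal left to replace")
        form = form[:idx] + PRODUCTIONS[form[idx]][choice] + form[idx + 1:]
        steps.append(" ".join(form))
    return steps


if __name__ == "__main__":
    # Alternative 0 everywhere, except "a" and "flight" in the second NP.
    for line in leftmost_derivation("S", [0, 0, 0, 0, 0, 0, 0, 1, 1]):
        print(line)
```

The trace starts at S and ends at "the pilot books a flight", at which point only terminal symbols remain and the derivation is complete.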