
History of NLP


NATURAL LANGUAGE PROCESSING.

MODULE -01

HISTORY OF NLP: (ppt)

LEVELS OF NATURAL LANGUAGE PROCESSING.

1. Phonology: Based on the pronunciation of words.

Involves phonetic analysis to understand pronunciation.


• Ambiguity: Example 1: Throwaway or Throw away or Throw a way.
• Example 2: See or Sea / Sun or Son.
• Question: Work on a speech corpus starts at the phonological level.
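
The "Throwaway / Throw away / Throw a way" example can be reproduced with a short sketch that lists every way to cut an unbroken string into known words. This is only an illustration: it works on letters rather than on phonemes, and `VOCAB` is a hypothetical toy vocabulary, not a real pronunciation lexicon.

```python
def segmentations(text, vocabulary):
    """Return every way to split `text` into words drawn from `vocabulary`."""
    if not text:
        return [[]]  # exactly one way to segment the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in vocabulary:
            # Keep this word and segment whatever is left over.
            for rest in segmentations(text[end:], vocabulary):
                results.append([head] + rest)
    return results


# Hypothetical toy vocabulary (an assumption made for this example).
VOCAB = {"throw", "away", "a", "way", "throwaway"}

for words in segmentations("throwaway", VOCAB):
    print(" ".join(words))
```

With this vocabulary the sketch finds three segmentations, which is the ambiguity a speech system has to resolve from context.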

2. Morphology: Word forms are split into morphemes by removing suffixes and prefixes.
◦ Question: Splitting the word reusable into re-use-able is called morphological
analysis.
◦ Example: unlockable (un-lockable or unlock-able).
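
A minimal sketch of affix stripping for the two words above. The prefix, suffix, and root lists are hypothetical and cover only these examples; listing "unlock" as a root is an assumption made so that both readings of "unlockable" show up.

```python
PREFIXES = ["", "un", "re"]         # "" stands for "no prefix"
SUFFIXES = ["", "able"]             # "" stands for "no suffix"
ROOTS = {"lock", "unlock", "use"}   # hypothetical root lexicon


def analyses(word):
    """Return all (prefix, root, suffix) splits licensed by the toy lexicon."""
    found = []
    for prefix in PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in SUFFIXES:
            if suffix and not rest.endswith(suffix):
                continue
            stem = rest[:len(rest) - len(suffix)]
            # Try the stem as-is, then with a restored final "e"
            # (use + able is spelled "usable").
            for root in (stem, stem + "e"):
                if root in ROOTS:
                    found.append((prefix, root, suffix))
                    break
    return found


print(analyses("reusable"))
print(analyses("unlockable"))
```

"reusable" gets a single analysis, while "unlockable" gets two, matching the un-lockable versus unlock-able ambiguity.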

3. Lexical: Deals with meanings of words

Lexical ambiguity is a writing error that occurs when a sentence contains a word that
has more than one meaning.
◦ Example: Tank (dress material, army vehicle, water storage equipment).
◦ Example: "You know, somebody actually complimented me on my driving today.
They left a little note on the windscreen; it said, 'Parking Fine.' So that was
nice." (English comedian Tim Vine)
Here "Parking Fine" refers to the penalty and not a compliment.
◦ She is looking for a match (game or marriage match).
◦ The fisherman went to the bank (money bank or river bank).
Link: https://www.thoughtco.com/what-is-lexical-ambiguity-1691226
4. Syntactic: In English grammar, syntactic ambiguity (also called structural ambiguity
or grammatical ambiguity) is the presence of two or more possible meanings within a
single sentence or sequence of words, as opposed to lexical ambiguity, which is
the presence of two or more possible meanings within a single word.

◦ Example 1: The professor said on Monday he would give an exam. This sentence
means either that it was on Monday that the professor told the class about the exam
or that the exam would be given on Monday.
◦ The chicken is ready to eat. This sentence either means the chicken is cooked and
can be eaten now or the chicken is ready to be fed.
◦ The burglar threatened the student with the knife. This sentence either means that
a knife-wielding burglar threatened a student or the student a burglar threatened
was holding a knife.
◦ "One morning, I shot an elephant in my pajamas. How he got in my pajamas I don't
know." (Groucho Marx)
The ambiguity here is who was in the pajamas, Groucho or the elephant? Groucho,
answering the question in the opposite way of expectation, gets his laugh.
◦ The government asks us to save soap and waste paper. This sentence either means
we should save both soap and waste paper, or save soap but waste paper.

Link: https://www.thoughtco.com/syntactic-ambiguity-grammar-1692179
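
Structural ambiguity can be made concrete by counting parse trees. The sketch below is a CYK-style chart parser over a hypothetical toy grammar in Chomsky normal form, written only for the Groucho Marx sentence; the grammar rules and lexicon are assumptions, not a general English grammar.

```python
from collections import defaultdict

# Hypothetical binary rules: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # the PP modifies the verb phrase (the shooter wore pajamas)
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # the PP modifies the noun phrase (the elephant wore pajamas)
    ("PP", "P", "NP"),
]

# Hypothetical lexicon: word -> possible categories.
LEXICON = {
    "i": ["NP"],
    "shot": ["V"],
    "an": ["Det"],
    "my": ["Det"],
    "elephant": ["N"],
    "pajamas": ["N"],
    "in": ["P"],
}


def count_parses(words):
    """Count the distinct parse trees rooted in S for a list of words."""
    n = len(words)
    chart = defaultdict(int)  # (start, end, symbol) -> number of trees
    for i, word in enumerate(words):
        for symbol in LEXICON.get(word.lower(), []):
            chart[(i, i + 1, symbol)] += 1
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    chart[(start, end, parent)] += (
                        chart[(start, mid, left)] * chart[(mid, end, right)]
                    )
    return chart[(0, n, "S")]


print(count_parses("I shot an elephant in my pajamas".split()))
```

Under this grammar the sentence has two trees, one for each attachment of "in my pajamas", while "I shot an elephant" has only one.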

5. Semantic: Lexical ambiguity is a writing error that occurs when a sentence contains a
word that has more than one meaning. This problem, which is also called semantic
ambiguity, obscures the writer's intent and confuses the reader.
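
One common way to attack this kind of ambiguity is to compare the words around the ambiguous word with a short definition (gloss) of each sense and pick the sense with the largest overlap, an idea usually called simplified Lesk. The sketch below assumes a hypothetical two-sense inventory for "bank"; the glosses and the stopword list were written for this example and do not come from a real dictionary.

```python
STOPWORDS = {"the", "a", "an", "of", "to", "in", "on", "and", "or",
             "is", "that", "he", "she"}

# Hypothetical sense inventory (an assumption made for this example).
SENSES = {
    "bank": {
        "financial institution":
            "an institution that accepts deposits and lends money to customers",
        "river bank":
            "the sloping land beside a river or lake where people fish or walk",
    }
}


def tokens(text):
    """Lower-case the words, strip punctuation, and drop stopwords."""
    return {w.strip(".,;:!?\"'").lower() for w in text.split()} - STOPWORDS


def simplified_lesk(word, sentence):
    """Return (best_sense_or_None, scores); None signals a tie."""
    context = tokens(sentence)
    scores = {sense: len(context & tokens(gloss))
              for sense, gloss in SENSES[word].items()}
    top = max(scores.values())
    winners = [sense for sense, score in scores.items() if score == top]
    best = winners[0] if len(winners) == 1 else None
    return best, scores


print(simplified_lesk("bank", "He sat on the bank of the river to fish"))
print(simplified_lesk("bank", "She went to the bank to deposit money"))
print(simplified_lesk("bank", "The fisherman went to the bank."))
```

The last sentence produces a tie with these glosses: nothing in its context overlaps with either definition, so the sketch reports no winner rather than guessing.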

6. Discourse: Evidence from the paper linked below suggests that discourse interpretation
is a process that involves reasoning with underspecified representations and may involve
generating more than one hypothesis in parallel.

• Hook up the cable to the serial port. It is on the back of the computer.

Commonsense reasoning may also tell us that some extensions are equivalent
for the purposes at hand, thus can be merged. An example from the TRAINS
corpus is the sentence Hook up the engine to the boxcar, and move it to Avon.:
even though it can refer either to the engine or to the boxcar, and therefore two
extensions could be obtained by discourse interpretation, the difference
between these two extensions would be immaterial as far as the plan is
concerned, because moving one object would necessarily entail moving the
other; the two hypotheses can therefore be merged.

• Sentence 1: John was coming dejected from the school. (Who is John? Most
likely a student.)
• Sentence 2: He could not control the class. (Who is John now? Most likely the teacher.)
• Sentence 3: The teacher should not have made him responsible. (Who is John now?
Most likely a student again, albeit a special student: the monitor.)

IN SHORT: PRONOUNS WHOSE REFERENT IS UNCLEAR CAUSE AMBIGUITY.

Link: https://www.researchgate.net/profile/Massimo_Poesio/publication/2791407_Ambiguity_Underspecification_and_Discourse_Interpretation/links/00b495204f2ef93fa1000000/Ambiguity-Underspecification-and-Discourse-Interpretation.pdf
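
The first step in resolving a pronoun is to list the nouns it could refer to. The sketch below does only that step for the two "Hook up ..." examples above; the `NOUNS` set is a hypothetical stand-in for a part-of-speech tagger, and choosing among the candidates still needs the kind of reasoning described in this section.

```python
# Hypothetical noun list standing in for a POS tagger (an assumption).
NOUNS = {"cable", "port", "computer", "engine", "boxcar"}
PRONOUNS = {"it"}


def antecedent_candidates(text):
    """For each pronoun, list the nouns seen before it, nearest first."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    seen = []
    result = []
    for word in words:
        if word in PRONOUNS:
            result.append((word, list(reversed(seen))))
        elif word in NOUNS:
            seen.append(word)
    return result


print(antecedent_candidates(
    "Hook up the cable to the serial port. It is on the back of the computer."))
print(antecedent_candidates(
    "Hook up the engine to the boxcar, and move it to Avon."))
```

Both sentences yield two candidates for "it", so a system must either pick one or, as in the TRAINS example, notice that the choice does not matter for the plan.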

7. Pragmatic: Missing the real intention of the speaker.

◦ It is a source of humour.
◦ Happens due to a lack of common-sense knowledge rather than language
proficiency.

In a Nutshell:

1. Pragmatic - Using common sense to identify user intention, opinion and sentiment
2. Discourse - Resolving coreference among entities represented using pronouns
3. Semantic - Word sense disambiguation
4. Syntactic - Building grammar-based parse trees
5. Lexical - Locating the word, its POS tags, morphemes and semantic tags using a lexicon
6. Morphology - Word forms split into morphemes by removing suffixes and prefixes
7. Phonology - The pronunciation of the word

Stemming VS Lemmatization:

Both are text normalization techniques within the field of Natural Language Processing that
are used to prepare text, words, and documents for further processing.

Stemming: Stemming is the process of reducing the morphological variants of a word to a
common root/base form (the stem), typically by stripping suffixes. The stem need not be a
dictionary word. Stemming programs are commonly referred to as stemming algorithms or
stemmers.

Lemmatization: In contrast to stemming, lemmatization looks beyond word reduction and
considers a language’s full vocabulary to apply a morphological analysis to words. The lemma
of ‘was’ is ‘be’ and the lemma of ‘mice’ is ‘mouse’.

Hence they are not the same.
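
The difference shows up even in a toy comparison. The sketch below is not a real stemmer such as Porter's: `toy_stem` just strips one suffix from a short hypothetical list, and `toy_lemmatize` looks words up in a hypothetical mini-dictionary that stands in for a full vocabulary.

```python
SUFFIXES = ["ing", "ed", "es", "s"]  # checked in this order

# Hypothetical mini-dictionary (an assumption); a real lemmatizer uses a full lexicon.
LEMMAS = {"was": "be", "were": "be", "mice": "mouse",
          "studies": "study", "running": "run"}


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def toy_lemmatize(word):
    """Return the dictionary form if known, otherwise the word itself."""
    return LEMMAS.get(word, word)


for w in ["studies", "running", "mice", "was"]:
    print(w, "->", toy_stem(w), "|", toy_lemmatize(w))
```

The stemmer produces non-words such as "studi" and cannot handle irregular forms like "mice" or "was", whereas the dictionary lookup returns "study", "mouse" and "be".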

Applications of NLP:

A few major ones are:

1. Information Retrieval: This is what happens when we google something. It considers
the keywords and searches for relevant data in all the documents, finding
relevant information for the user's queries.

Information retrieval works on different scales. In web search, the system has to search
billions of documents stored on millions of computers, provide answers to
incomplete, ambiguous questions asked by users, and avoid being fooled by site providers
who manipulate site content in an attempt to boost their search engine rankings.
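
Keyword-based retrieval can be sketched with TF-IDF scoring: a document scores higher when it contains query words often, and rare words count more than common ones. This is a minimal sketch under simple assumptions (whitespace tokenization, a smoothed IDF formula chosen for this example); the documents in the usage lines are made up.

```python
import math
from collections import Counter


def tokenize(text):
    """Lower-case words with surrounding punctuation removed."""
    cleaned = (w.strip(".,;:!?()").lower() for w in text.split())
    return [w for w in cleaned if w]


def rank(query, documents):
    """Return (score, index) pairs, best-matching document first."""
    doc_tokens = [tokenize(d) for d in documents]
    n_docs = len(documents)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for toks in doc_tokens for term in set(toks))
    query_terms = set(tokenize(query))
    scored = []
    for index, toks in enumerate(doc_tokens):
        tf = Counter(toks)
        score = sum(
            tf[term] * math.log((1 + n_docs) / (1 + df[term]))
            for term in query_terms
        )
        scored.append((score, index))
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Made-up documents for illustration only.
DOCS = [
    "Online classes replaced campus lectures during the COVID lockdown",
    "The football season was postponed",
    "College education moved online because of COVID",
]

for score, index in rank("impact of COVID on college education", DOCS):
    print(round(score, 3), DOCS[index])
```

The third document ranks first because it shares the most query words, and the unrelated football document scores zero.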
Question: Assume you are doing a review on the impact of COVID on college education. As a
first step you may be downloading articles and papers relevant to the topic from Google.
What is this task in NLP?
a. Machine Translation
b. Information Retrieval
c. Information Extraction
d. All of them
2. Information Extraction: Information extraction (IE) is the task of automatically
extracting structured information from unstructured and/or semi-structured
machine-readable documents. In most cases this activity concerns processing human
language texts by means of natural language processing (NLP).
1. Named Entity Extraction: Named entity recognition (NER) is probably the first step
towards information extraction. It seeks to locate and classify named entities in
text into pre-defined categories such as the names of persons, organizations,
locations, expressions of time, quantities, monetary values, percentages, etc. NER
is used in many fields of Natural Language Processing (NLP), and it can help
answer many real-world questions.
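
Some of the categories listed above, such as monetary values, percentages and years, follow regular formats and can be picked out with patterns. The sketch below covers only those; it is not a full NER system, since names of persons, organizations and locations generally need a trained model or a gazetteer. The labels and patterns are assumptions made for this example.

```python
import re

# Hypothetical labels and patterns for regularly formatted entities only.
PATTERNS = {
    "MONEY": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?"),
    "PERCENT": re.compile(r"\d+(?:\.\d+)?%"),
    "YEAR": re.compile(r"\b(?:19|20)\d{2}\b"),
}


def extract_entities(text):
    """Return (label, matched_text) pairs in the order they appear in the text."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), label, match.group()))
    return [(label, value) for _, label, value in sorted(hits)]


print(extract_entities("In 2019 she earned $85,000, a 12.5% raise."))
```

Turning free text into labelled fields like these is what lets a tool filter documents on structured criteria.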
Question: You work as a recruiter in a leading software company. You are supposed to design
a tool which will browse through the resumes of thousands of job applicants submitted in
PDF format and filter them based on your expectations on age, prior experience, education,
previous income etc. Which of the following core NLP tasks/applications will you implement?
a. Summarization
b. Machine Translation
c. Named Entity Extraction
d. Spell and Grammar Checking

2. Role Filling (the instructor removed this topic; it was included earlier).

3. Text Categorization: Assigning one (or more) pre-defined categories to a text.


4. Summarization: Generating a short summary from one or more documents, sometimes
based on a given query.
5. Spelling and Grammar Check.
6. Question Answering.
7. Machine Translation.
8. Sentiment Analysis (reviews on various websites).
9. Speech Recognition (when we say “Ok Google” and give commands, or “Hey Siri” and
verbally instruct).
10. Spoken Dialogue System.
11. Speech Synthesis.

Levels of difficulty:

Easy (mostly solved)

–Spell and grammar checking
–Some text categorization tasks
–Some named-entity recognition tasks

Intermediate (good progress)

–Information retrieval
–Sentiment analysis
–Machine translation
–Information extraction

Difficult (still hard)

–Question answering
–Summarization
–Dialog systems
