Information Extraction for Conversational Systems in Indian Languages - Arnekt IECSIL

Authors:

Anand Kumar M

FIRE '18: Proceedings of the 10th Annual Meeting of the Forum for Information Retrieval Evaluation

Pages 18 - 20

https://doi.org/10.1145/3293339.3293344

Published: 06 December 2018

Abstract

With data as the new source of wealth, mining intelligence from every possible unit of it has become a salient activity in many fields. Text data is not limited to one language, which has enabled applications across many languages. Resources and applications for Indian languages are steadily improving. Information Extraction for Conversational Systems in Indian Languages - Arnekt IECSIL has taken a step toward creating its own resources in Indian languages (Hindi, Kannada, Malayalam, Tamil and Telugu) for Named Entity Recognition (NER) and Information Extraction (IE) tasks. This overview paper details existing Indian language corpora development and the steps taken to build our own corpus, along with its statistics.

Information & Contributors

Information

Published In

FIRE '18: Proceedings of the 10th Annual Meeting of the Forum for Information Retrieval Evaluation

December 2018

68 pages

ISBN: 9781450362085

DOI: 10.1145/3293339

In-Cooperation

ISI: Information Sciences Institute

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

FIRE'18

FIRE'18: Forum for Information Retrieval Evaluation

December 6 - 9, 2018

Gandhinagar, India

Acceptance Rates

Overall Acceptance Rate 19 of 64 submissions, 30%
