Non-Local Dictionary Based Japanese Dish Names Recognition Using Multi-Feature CRF from Online Reviews

Authors:

Katsuhiko Kaji,

Nobuo Kawaguchi,

Kei Hiroi

WIMS '14: Proceedings of the 4th International Conference on Web Intelligence, Mining and Semantics (WIMS14)

Article No.: 14, Pages 1 - 9

https://doi.org/10.1145/2611040.2611063

Published: 02 June 2014

Abstract

In cuisine recommender services, online user reviews are an important data source for avoiding the cold-start problem. Cuisine-domain named entity recognition (NER) can serve as an entry point for understanding the semantic information in reviews. This paper describes a supervised approach to recognizing Japanese dish name entities (DNEs) in online reviews from a Japanese cuisine website. In the first stage, we use tweets as the data source to construct a dictionary of dish name elements through semantic rules, and use the Bayesian posterior to remove noise. In the second stage, we map the first-stage dictionary as a non-local feature into a Conditional Random Field (CRF) to recognize dish names. The method automatically adds new dish name elements to the non-local dictionary by iterating during recognition. Under 10-fold cross-validation, experimental results show that our method reaches an F1 score of 84.38% and outperforms two baselines that use the dictionary alone or a CRF with term features alone.
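The second stage described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual feature set, label scheme, or update rule, so everything named here is an assumption made for illustration: the feature keys (`term`, `prev_term`, `next_term`, `in_dict`), the `B-DISH`/`I-DISH`/`O` labels, the helper names `token_features` and `grow_dictionary`, and the romanized example tokens. The sketch only shows how a dictionary lookup might be exposed to a CRF as one feature among several, and how predicted dish name tokens might be fed back into the dictionary between iterations.

```python
from typing import Dict, List, Set


def token_features(tokens: List[str], dictionary: Set[str]) -> List[Dict[str, object]]:
    """Build one feature dict per token, as a CRF toolkit would typically consume.

    `in_dict` is the (hypothetical) dictionary feature: it is the same for a
    given token wherever it appears, independent of the local context window.
    """
    features = []
    for i, tok in enumerate(tokens):
        features.append({
            "term": tok,                                           # local term feature
            "prev_term": tokens[i - 1] if i > 0 else "<BOS>",      # left context
            "next_term": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",  # right context
            "in_dict": tok in dictionary,                          # dictionary lookup
        })
    return features


def grow_dictionary(tokens: List[str], labels: List[str], dictionary: Set[str]) -> Set[str]:
    """Return a new dictionary that also contains tokens labeled as dish name parts.

    Assumed update rule: any token tagged B-DISH or I-DISH is added.
    """
    dish_tokens = {t for t, lab in zip(tokens, labels) if lab in ("B-DISH", "I-DISH")}
    return dictionary | dish_tokens


# Hypothetical example data (not taken from the paper).
tokens = ["tonkotsu", "ramen", "was", "great"]
dictionary = {"ramen"}
feats = token_features(tokens, dictionary)

# Stand-in for labels that a trained CRF would predict on this sentence.
predicted = ["B-DISH", "I-DISH", "O", "O"]
updated = grow_dictionary(tokens, predicted, dictionary)
```

In this sketch the original dictionary is left unchanged and a new set is returned, so each iteration can be compared with the previous one; whether the paper's procedure filters candidates before adding them is not stated in the abstract.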

Cited By

C. Liao, K. Kaji, K. Hiroi, and N. Kawaguchi. Design and Implementation of Event Information Summarization System. In Proceedings of the 2014 IEEE 38th International Computer Software and Applications Conference Workshops, pages 572-577, 2014. doi:10.1109/COMPSACW.2014.96. Online publication date: 21-Jul-2014.
https://dl.acm.org/doi/10.1109/COMPSACW.2014.96

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
2. Information systems
  1. Information retrieval

Published In

WIMS '14: Proceedings of the 4th International Conference on Web Intelligence, Mining and Semantics (WIMS14)

June 2014

506 pages

ISBN:9781450325387

DOI:10.1145/2611040

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

In-Cooperation

Aristotle University of Thessaloniki

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

WIMS '14

WIMS '14: 4th International Conference on Web Intelligence, Mining and Semantics

June 2 - 4, 2014

Thessaloniki, Greece

Acceptance Rates

WIMS '14 Paper Acceptance Rate 41 of 90 submissions, 46%;

Overall Acceptance Rate 140 of 278 submissions, 50%
