DOI: 10.1145/2611040.2611063

Non-Local Dictionary Based Japanese Dish Names Recognition Using Multi-Feature CRF from Online Reviews

Published: 02 June 2014

Abstract

In cuisine recommender services, online user reviews are an important data source for avoiding the cold-start problem. Cuisine-domain named entity recognition (NER) can serve as an entry point for understanding the semantic information in reviews. This paper describes a supervised approach to recognizing Japanese dish name entities (DNEs) in online reviews from a Japanese cuisine website. In the first stage, tweets are used as the data source to construct a dictionary of dish name elements through semantic rules, and a Bayesian posterior is used to remove noise. In the second stage, the first-stage dictionary is mapped into a Conditional Random Field (CRF) as a non-local feature to recognize dish names. The method can automatically add new dish name elements to the non-local dictionary by iterating during recognition. Under 10-fold cross-validation, experimental results show that the method reaches an F1 score of 84.38% and outperforms two baselines that use the dictionary alone or a CRF with term features alone.
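
The abstract does not give implementation details, so the following is only a minimal sketch of how the two stages might fit together, not the authors' method. It assumes that the Bayesian posterior is estimated from raw counts, that dish names are tagged with hypothetical BIO labels `B-DNE` / `I-DNE` / `O`, and that the dictionary enters the CRF as a per-token membership flag. All function names, the threshold, and the example counts are illustrative; the CRF training step itself is omitted.

```python
def posterior_filter(dish_counts, total_counts, threshold=0.5):
    """Keep candidate terms whose estimated P(dish | term) meets the threshold.

    dish_counts[term]:  times the term occurred in a dish-name context (assumed counts).
    total_counts[term]: times the term occurred anywhere in the corpus.
    With every probability estimated from the same corpus, Bayes' rule
    P(dish | term) = P(term | dish) * P(dish) / P(term)
    reduces to dish_counts[term] / total_counts[term].
    """
    kept = set()
    for term, total in total_counts.items():
        if total > 0 and dish_counts.get(term, 0) / total >= threshold:
            kept.add(term)
    return kept


def token_features(tokens, i, dictionary):
    """Feature dict for token i: local term features plus a dictionary flag."""
    return {
        "term": tokens[i],
        "prev_term": tokens[i - 1] if i > 0 else "<BOS>",
        "next_term": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
        # Dictionary-derived feature: is this token a known dish name element?
        "in_dict": tokens[i] in dictionary,
    }


def expand_dictionary(dictionary, tokens, labels):
    """Return the dictionary extended with tokens tagged as part of a dish name."""
    found = {tok for tok, lab in zip(tokens, labels) if lab in ("B-DNE", "I-DNE")}
    return dictionary | found


# Illustrative usage with made-up counts and a hand-written label sequence.
dictionary = posterior_filter({"ラーメン": 8, "が": 1}, {"ラーメン": 10, "が": 100})
tokens = ["味噌", "ラーメン", "が", "美味しい"]
features = [token_features(tokens, i, dictionary) for i in range(len(tokens))]
dictionary = expand_dictionary(dictionary, tokens, ["B-DNE", "I-DNE", "O", "O"])
```

In an iterative setup of this kind, the expanded dictionary would be used to recompute the `in_dict` feature before the next recognition pass, which is one way to read the abstract's statement that new elements are added during recognition.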


Cited By

  • (2014) Design and Implementation of Event Information Summarization System. Proceedings of the 2014 IEEE 38th International Computer Software and Applications Conference Workshops, pages 572-577. DOI: 10.1109/COMPSACW.2014.96. Online publication date: 21-Jul-2014

      Published In

      WIMS '14: Proceedings of the 4th International Conference on Web Intelligence, Mining and Semantics (WIMS14)
      June 2014
      506 pages
      ISBN:9781450325387
      DOI:10.1145/2611040

      In-Cooperation

      • Aristotle University of Thessaloniki

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Conditional Random Field
      2. Non-Local Dictionary
      3. Text Extraction
      4. Twitter

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      WIMS '14

      Acceptance Rates

      WIMS '14 Paper Acceptance Rate 41 of 90 submissions, 46%;
      Overall Acceptance Rate 140 of 278 submissions, 50%
