DOI: 10.1145/3219819.3219839

OpenTag: Open Attribute Value Extraction from Product Profiles

Published: 19 July 2018

Abstract

Extracting missing attribute values means finding values that describe an attribute of interest in free-text input. Most prior work on this task either makes a closed-world assumption, in which the set of possible values is known beforehand, or relies on dictionaries of values and hand-crafted features. How can we discover new attribute values that we have never seen before? Can we do this with limited human annotation or supervision? We study this problem in the context of product catalogs, which often have missing values for many attributes of interest.
In this work, we leverage product profile information such as titles and descriptions to discover missing values of product attributes. We develop a novel deep tagging model, OpenTag, for this extraction problem with the following contributions: (1) we formalize the problem as a sequence tagging task and propose a joint model that uses recurrent neural networks (specifically, a bidirectional LSTM) to capture context and semantics, and Conditional Random Fields (CRF) to enforce tagging consistency; (2) we develop a novel attention mechanism to provide interpretable explanations for our model's decisions; (3) we propose a novel sampling strategy for active learning to reduce the burden of human annotation. Unlike prior work, OpenTag does not use any dictionary or hand-crafted features. Extensive experiments on real-life datasets in different domains show that OpenTag with our active learning strategy discovers new attribute values from as few as 150 annotated samples (a 3.3x reduction in annotation effort) with a high F-score of 83%, outperforming state-of-the-art models.
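
To make the sequence tagging formulation concrete, the sketch below shows how per-token tags could be turned back into attribute values. It is only an illustration: the abstract does not state which tagging scheme is used, so the B/I/O scheme, the function name `decode_values`, and the example title are all assumptions. The tags here are written by hand; in the model they would be predicted by the BiLSTM-CRF tagger.

```python
from typing import List


def decode_values(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect contiguous B/I-tagged tokens into attribute-value strings.

    Assumed scheme (not confirmed by the abstract): "B" begins a value,
    "I" continues it, "O" marks tokens outside any value.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    values: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":                 # a new value starts here
            if current:
                values.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:   # continue the currently open value
            current.append(token)
        else:                          # "O" (or a stray "I") closes any open value
            if current:
                values.append(" ".join(current))
            current = []
    if current:                        # flush a value that runs to the end
        values.append(" ".join(current))
    return values


# Hypothetical product title, hand-tagged for the attribute "flavor".
tokens = ["beef", "meal", "and", "ranch", "raised", "lamb", "dog", "food"]
tags = ["B", "O", "O", "B", "I", "I", "O", "O"]
print(decode_values(tokens, tags))  # ['beef', 'ranch raised lamb']
```

Because values are read off token spans rather than looked up in a fixed vocabulary, a tagger of this kind can in principle surface a value it never saw during training, which is the open-world setting the abstract describes.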

Supplementary Material

MP4 File (mukherjee_opentag_extraction.mp4)

Information & Contributors

Information

Published In

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2018
2925 pages
ISBN:9781450355520
DOI:10.1145/3219819
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. active learning
  2. attention mechanism
  3. deep sequence tagging
  4. imputation
  5. neural networks
  6. open extraction

Qualifiers

  • Research-article

Conference

KDD '18
Acceptance Rates

KDD '18 Paper Acceptance Rate: 107 of 983 submissions, 11%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 411
  • Downloads (Last 6 weeks): 80
Reflects downloads up to 12 Nov 2024

Cited By

  • (2024) Enhancing Human Activity Recognition in Smart Homes with Self-Supervised Learning and Self-Attention. Sensors 24:3 (884). DOI: 10.3390/s24030884. Online publication date: 29-Jan-2024
  • (2024) Building Knowledge Graph for Products at Web Scale. Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD) (498-500). DOI: 10.1145/3632410.3633292. Online publication date: 4-Jan-2024
  • (2024) Building Natural Language Interface for Product Search. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (4768-4776). DOI: 10.1145/3627673.3680070. Online publication date: 21-Oct-2024
  • (2024) LLM-Ensemble: Optimal Large Language Model Ensemble Method for E-commerce Product Attribute Value Extraction. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (2910-2914). DOI: 10.1145/3626772.3661357. Online publication date: 10-Jul-2024
  • (2024) Multi-Label Zero-Shot Product Attribute-Value Extraction. Proceedings of the ACM Web Conference 2024 (2259-2270). DOI: 10.1145/3589334.3645649. Online publication date: 13-May-2024
  • (2024) MuJo-SF: Multimodal Joint Slot Filling for Attribute Value Prediction of E-Commerce Commodities. IEEE Transactions on Multimedia 26 (10354-10366). DOI: 10.1109/TMM.2024.3407667. Online publication date: 2024
  • (2024) A Deep Learning model for Question Analysis in Low-resource Languages: A Dataset and Case Study for Persian. 2024 14th International Conference on Pattern Recognition Systems (ICPRS) (1-7). DOI: 10.1109/ICPRS62101.2024.10677830. Online publication date: 15-Jul-2024
  • (2024) Exploring generative frameworks for product attribute value extraction. Expert Systems with Applications 243 (122850). DOI: 10.1016/j.eswa.2023.122850. Online publication date: Jun-2024
  • (2024) A self-attention-based CNN-Bi-LSTM model for accurate state-of-charge estimation of lithium-ion batteries. Journal of Energy Storage 88 (111524). DOI: 10.1016/j.est.2024.111524. Online publication date: May-2024
  • (2024) Using LLMs for the Extraction and Normalization of Product Attribute Values. Advances in Databases and Information Systems (217-230). DOI: 10.1007/978-3-031-70626-4_15. Online publication date: 1-Sep-2024
