DOI: 10.1145/3394486.3403047
research-article
Open access

Learning to Extract Attribute Value from Product via Question Answering: A Multi-task Approach

Published: 20 August 2020

Abstract

Attribute value extraction refers to the task of identifying values of an attribute of interest from product information. It is an important research topic that has been widely studied in e-Commerce and relation learning. Existing attribute value extraction methods have two main limitations: scalability and generalizability. Most existing methods treat each attribute independently and build a separate model for each one, which is not suitable for the large-scale attribute systems found in real-world applications. Moreover, very limited research has focused on generalizing extraction to new attributes.
In this work, we propose a novel approach for Attribute Value Extraction via Question Answering (AVEQA) using a multi-task framework. In particular, we build a question answering model which treats each attribute as a question and identifies the answer span corresponding to the attribute value in the product context. A single BERT contextual encoder is adopted and shared across all attributes to encode both the context and the question, which makes the model scalable. A distilled masked language model with knowledge distillation loss is introduced to improve the model's generalization ability. In addition, we employ a no-answer classifier to explicitly handle the cases where there are no values for a given attribute in the product context. The question answering, distilled masked language model, and no-answer classification tasks are then combined into a unified multi-task framework. We conduct extensive experiments on a public dataset. The results demonstrate that the proposed approach outperforms several state-of-the-art methods by a large margin.
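The question-answering formulation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`build_input`, `decode_value`, `multitask_loss`), the 0.5 no-answer threshold, the span-length cap, and the equal task weights are all assumptions made for illustration. The start and end scores would come from the shared BERT encoder in the actual model; here they are supplied by hand.

```python
import math
from typing import List, Optional, Sequence, Tuple


def build_input(attribute: str, context: str) -> List[str]:
    """Pack the attribute (the "question") and the product text (the context)
    into one BERT-style sequence. Whitespace tokenization is a simplification."""
    return ["[CLS]"] + attribute.split() + ["[SEP]"] + context.split() + ["[SEP]"]


def decode_value(
    tokens: Sequence[str],
    start_scores: Sequence[float],
    end_scores: Sequence[float],
    no_answer_prob: float,
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
    max_len: int = 4,        # assumed cap on span length
) -> Optional[str]:
    """Return the best-scoring context span, or None when the no-answer
    classifier says the attribute has no value in this context."""
    if no_answer_prob >= threshold:
        return None
    first = tokens.index("[SEP]") + 1  # first context token
    last = len(tokens) - 2             # last context token (before final [SEP])
    best: Optional[Tuple[int, int]] = None
    best_score = -math.inf
    for i in range(first, last + 1):
        for j in range(i, min(i + max_len - 1, last) + 1):
            score = start_scores[i] + end_scores[j]
            if score > best_score:
                best, best_score = (i, j), score
    if best is None:
        return None
    return " ".join(tokens[best[0]:best[1] + 1])


def multitask_loss(
    qa_loss: float,
    dmlm_loss: float,
    no_answer_loss: float,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),  # assumed weights
) -> float:
    """Combine the three task losses as a weighted sum (an assumed scheme)."""
    return (weights[0] * qa_loss
            + weights[1] * dmlm_loss
            + weights[2] * no_answer_loss)


# Toy usage with hand-made scores that peak on the token "acme".
tokens = build_input("brand", "acme running shoes size 10")
start_scores = [0.0] * len(tokens)
end_scores = [0.0] * len(tokens)
start_scores[3] = 5.0
end_scores[3] = 5.0
value = decode_value(tokens, start_scores, end_scores, no_answer_prob=0.1)
missing = decode_value(tokens, start_scores, end_scores, no_answer_prob=0.9)
```

Because the attribute is passed in as part of the input rather than baked into the model, the same decoder serves every attribute, which is the scalability argument the abstract makes. The no-answer check runs before span search so that a confident "no value" prediction never yields a spurious span.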




    Published In

    KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
    August 2020
    3664 pages
    ISBN: 9781450379984
    DOI: 10.1145/3394486
    This work is licensed under a Creative Commons Attribution-NonCommercial International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. attribute value extraction
    2. generalization
    3. question answering

    Qualifiers

    • Research-article

    Conference

    KDD '20

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Cited By

    • (2024) Attentive Review Semantics-Aware Recommendation Model for Rating Prediction. Electronics 13:14 (2815). DOI: 10.3390/electronics13142815. Online publication date: 17-Jul-2024
    • (2024) Chaining Text-to-Image and Large Language Model: A Novel Approach for Generating Personalized e-commerce Banners. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (5825-5835). DOI: 10.1145/3637528.3671636. Online publication date: 25-Aug-2024
    • (2024) LLM-Ensemble: Optimal Large Language Model Ensemble Method for E-commerce Product Attribute Value Extraction. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (2910-2914). DOI: 10.1145/3626772.3661357. Online publication date: 10-Jul-2024
    • (2024) Multi-Label Zero-Shot Product Attribute-Value Extraction. Proceedings of the ACM Web Conference 2024 (2259-2270). DOI: 10.1145/3589334.3645649. Online publication date: 13-May-2024
    • (2024) MuJo-SF: Multimodal Joint Slot Filling for Attribute Value Prediction of E-Commerce Commodities. IEEE Transactions on Multimedia 26 (10354-10366). DOI: 10.1109/TMM.2024.3407667. Online publication date: 1-Jan-2024
    • (2024) KGLink: A Column Type Annotation Method that Combines Knowledge Graph and Pre-Trained Language Model. 2024 IEEE 40th International Conference on Data Engineering (ICDE) (1023-1035). DOI: 10.1109/ICDE60146.2024.00083. Online publication date: 13-May-2024
    • (2024) Evolving to multi-modal knowledge graphs for engineering design: state-of-the-art and future challenges. Journal of Engineering Design (1-40). DOI: 10.1080/09544828.2023.2301230. Online publication date: 6-Jan-2024
    • (2024) Exploring generative frameworks for product attribute value extraction. Expert Systems with Applications 243 (122850). DOI: 10.1016/j.eswa.2023.122850. Online publication date: Jun-2024
    • (2024) Using LLMs for the Extraction and Normalization of Product Attribute Values. Advances in Databases and Information Systems (217-230). DOI: 10.1007/978-3-031-70626-4_15. Online publication date: 1-Sep-2024
    • (2024) QPAVE: A Multi-task Question Answering Approach for Fine-Grained Product Attribute Value Extraction. Big Data Analytics and Knowledge Discovery (331-345). DOI: 10.1007/978-3-031-68323-7_28. Online publication date: 18-Aug-2024
