DOI: 10.1145/3539597.3570489
research-article
Open access

Generating Explainable Product Comparisons for Online Shopping

Published: 27 February 2023

Abstract

Comparing and contrasting products on their key differentiating features is an essential part of making purchase decisions, but doing this manually can be overwhelming. Prior methods offer limited product comparison capabilities, e.g., via pre-defined common attributes that may be difficult to understand or irrelevant to a particular product or user. Automatically generating informative, natural-sounding, and factually consistent comparative text for multiple product and attribute types is a challenging research problem. We describe HCPC (Human Centered Product Comparison), which tackles two kinds of comparisons for online shopping: (i) product-specific comparisons, which describe and compare products based on their key attributes; and (ii) attribute-specific comparisons, which compare similar products on a specific attribute. To ensure that the comparison text is faithful to the input product data, we introduce a novel multi-decoder, multi-task generative language model. One decoder generates product comparison text, and a second generates supportive, explanatory text in the form of product attribute names and values. The second task imitates a copy mechanism, which improves the comparison generator. Its output is also used to justify the factual accuracy of the generated comparison text: we train a factual consistency model to detect and correct errors in the generated comparative text. We release a new dataset (https://registry.opendata.aws/) of ~15K human-generated sentences that compare products on one or more attributes (the first such data we know of for product comparison). We demonstrate on this data that HCPC significantly outperforms strong baselines, by ~10% using automatic metrics and ~5% using human evaluation.
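
To make the role of the explanatory output concrete, the following is a minimal sketch of how attribute name and value pairs could be checked against the input product data. It is not the HCPC method: the abstract describes a trained factual consistency model, whereas this toy uses exact string matching as a stand-in. The function name `unsupported_pairs`, the triple format, and the laptop data are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: attribute name -> value for each product.
Product = Dict[str, str]


def unsupported_pairs(
    products: Dict[str, Product],
    explanation: List[Tuple[str, str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (product, attribute, value) triples not backed by the input data."""
    flagged = []
    for product, attribute, value in explanation:
        actual = products.get(product, {}).get(attribute)
        # Case-insensitive exact match only; a trained model (as in the
        # abstract) would tolerate paraphrases and unit variations.
        if actual is None or actual.strip().lower() != value.strip().lower():
            flagged.append((product, attribute, value))
    return flagged


# Made-up product data for illustration.
products = {
    "Laptop A": {"weight": "1.2 kg", "battery life": "10 hours"},
    "Laptop B": {"weight": "1.8 kg", "battery life": "14 hours"},
}

# Stand-in for the second decoder's output; the last triple is deliberately wrong.
explanation = [
    ("Laptop A", "weight", "1.2 kg"),
    ("Laptop B", "weight", "1.8 kg"),
    ("Laptop B", "battery life", "10 hours"),
]

flagged = unsupported_pairs(products, explanation)
print(flagged)
```

In this sketch, any flagged triple marks a claim that the comparison text should not make. The correction step mentioned in the abstract is outside its scope.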

Supplementary Material

MP4 File (wsdm23-fp1696_final.mp4)
Presentation video

Cited By

  • (2024) Explainable Generative AI (GenXAI): a survey, conceptualization, and research agenda. Artificial Intelligence Review 57:11. DOI: 10.1007/s10462-024-10916-x. Online publication date: 15-Sep-2024
  • (2023) Disentangling User Conversations with Voice Assistants for Online Shopping. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (1939-1943). DOI: 10.1145/3539618.3591974. Online publication date: 19-Jul-2023
  • (2023) Comparative relation mining of customer reviews based on a hybrid CSR method. Connection Science 35:1. DOI: 10.1080/09540091.2023.2251717. Online publication date: 6-Oct-2023

Published In

WSDM '23: Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining
February 2023
1345 pages
ISBN: 9781450394079
DOI: 10.1145/3539597
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. NLG
  2. explainability
  3. fact checking
  4. product comparison

Qualifiers

  • Research-article

Conference

WSDM '23

Acceptance Rates

Overall Acceptance Rate 498 of 2,863 submissions, 17%

Article Metrics

  • Downloads (Last 12 months)6,150
  • Downloads (Last 6 weeks)102
Reflects downloads up to 22 Nov 2024
