DOI: 10.1145/3397271.3401067
research-article

Towards Linking Camouflaged Descriptions to Implicit Products in E-commerce

Published: 25 July 2020 Publication History

Abstract

With the emergence of E-commerce services, billions of products are sold online every day. Detecting illegal products among large-scale online listings has become an important and practical research problem. To evade detection, malicious sellers usually use camouflaged text to describe their illegal products implicitly. This poses great challenges to current detection systems, since newly camouflaged text can hardly be learned from historical data and the distribution of illegal and normal products is extremely unbalanced. Rather than treating the problem as a classification task, as most previous efforts do, we reformulate it from the perspective of implicit entity linking, which aims to link a camouflaged description to a known product. In this paper, we introduce three types of context that can help infer the implicit entity from camouflaged descriptions, and we propose an end-to-end contextual representation model to capture the effect of the different contexts. Furthermore, we introduce a symmetric metric to model the matching score between the input title and the product by learning the mutual effect among the contexts. Experimental results on datasets collected from a real-world E-commerce site demonstrate the advantage of the proposed model over state-of-the-art methods.
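
To make the linking formulation concrete, the following is a minimal sketch of ranking known products by a symmetric matching score. It is not the model proposed in the paper: the bag-of-words `embed` function stands in for the contextual representation model, negative Euclidean distance stands in for the learned symmetric metric, and the vocabulary, product names, and texts are hypothetical placeholders.

```python
import math


def embed(text, vocab):
    """Toy bag-of-words vector: how often each vocabulary word occurs in text.

    Assumption: a stand-in for a learned contextual encoder.
    """
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in vocab]


def symmetric_score(u, v):
    """Negative Euclidean distance, so score(u, v) == score(v, u).

    Assumption: a stand-in for a learned symmetric metric.
    """
    return -math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def link(description, products, vocab):
    """Rank known products by matching score to the description, best first."""
    d = embed(description, vocab)
    scored = [
        (name, symmetric_score(d, embed(text, vocab)))
        for name, text in products.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data, for illustration only.
vocab = ["herbal", "tea", "relax", "leaf", "phone", "case"]
products = {
    "product_a": "herbal leaf relax",
    "product_b": "phone case",
}
ranking = link("relax tea leaf", products, vocab)
best_match = ranking[0][0]
```

Framing the task as ranking against known products, rather than as binary classification, means a previously unseen camouflaged description can still be scored against every known product without retraining a classifier on it.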

Cited By

  • (2023) Detecting malicious reviews and users affecting social reviewing systems. Computers and Security 133:C. DOI: 10.1016/j.cose.2023.103407. Online publication date: 1-Oct-2023
  • (2022) A systemic functional linguistics approach to implicit entity recognition in tweets. Information Processing and Management: an International Journal 59:4. DOI: 10.1016/j.ipm.2022.102957. Online publication date: 1-Jul-2022
  • (2021) Learning to rank implicit entities on Twitter. Information Processing & Management 58:3 (102503). DOI: 10.1016/j.ipm.2021.102503. Online publication date: May-2021

Information

Published In

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2020
2548 pages
ISBN:9781450380164
DOI:10.1145/3397271
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. implicit entity linking
  2. knowledge graph
  3. neural networks

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China

Conference

SIGIR '20
Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 22
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 13 Nov 2024
