Research article. DOI: 10.1145/3340531.3417445

Large Scale Long-tailed Product Recognition System at Alibaba

Published: 19 October 2020

Abstract

A practical large-scale product recognition system suffers from long-tailed, imbalanced training data in the e-commerce setting at Alibaba. Beyond the product images themselves, abundant related side information (e.g., titles and tags) carries rich semantic information about the images. Prior work mainly addresses the long-tail problem from the visual perspective alone and does not consider leveraging this side information. In this paper, we present a novel side-information-based large-scale visual recognition co-training (SICoT) system that deals with the long-tail problem by leveraging image-related side information. In the proposed co-training system, we first introduce a bilinear word attention module that constructs a semantic embedding from the noisy side information. A visual feature and semantic embedding co-training scheme is then designed to transfer knowledge from classes with abundant training data (head classes) to classes with little training data (tail classes) in an end-to-end fashion. Extensive experiments on four challenging large-scale datasets, whose numbers of classes range from one thousand to one million, demonstrate that the proposed SICoT system scales effectively in alleviating the long-tail problem.
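The abstract does not give the formulation of the bilinear word attention module, so the following is only a generic sketch of bilinear attention, not the authors' module. It assumes that each word of the side information has an embedding, that an image feature acts as the query, and that each word is scored as `visual^T W word` before a softmax-weighted sum. The function name, the inputs, and the matrix `W` are hypothetical and chosen purely for illustration.

```python
import math

def bilinear_word_attention(visual, words, W):
    """Pool word embeddings into one semantic embedding (illustrative only).

    visual: image feature of length d_v (hypothetical input)
    words:  list of word embeddings, each of length d_w
    W:      d_v x d_w bilinear weight matrix (learned in practice; fixed here)
    """
    d_w = len(W[0])
    # Project the visual feature through W: q = visual^T W, length d_w.
    q = [sum(visual[i] * W[i][j] for i in range(len(visual))) for j in range(d_w)]
    # Bilinear score per word: s_k = visual^T W word_k = q . word_k.
    scores = [sum(qj * wj for qj, wj in zip(q, w)) for w in words]
    # Numerically stable softmax over the words.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted sum of the word embeddings.
    semantic = [sum(a * w[j] for a, w in zip(weights, words)) for j in range(d_w)]
    return weights, semantic

# Toy inputs: three "words", the first aligned with the image feature.
visual = [1.0, 0.0]
words = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
W = [[2.0, 0.0], [0.0, 2.0]]
weights, semantic = bilinear_word_attention(visual, words, W)
```

In this toy setup the word aligned with the image feature receives the largest weight, which is one plausible way such a module could down-weight noisy words in titles and tags.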

Supplementary Material

MP4 File (3340531.3417445.mp4)
[Large Scale Long-tailed Product Recognition System at Alibaba] We present a large-scale SKU-level product recognition system for the e-commerce setting at Alibaba. The system is driven by the proposed side-information-based co-training system (SICoT), which deals with the long-tail problem using image-related side information in an end-to-end fashion. It can recognize about 37 million products in the Alibaba e-market in less than 200 milliseconds.

Published In

CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management
October 2020
3619 pages
ISBN: 9781450368599
DOI: 10.1145/3340531
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. attention
  2. co-training
  3. long-tailed
  4. product recognition
  5. side information

Qualifiers

  • Research-article

Conference

CIKM '20
Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics & Citations

Article Metrics

  • Downloads (last 12 months): 13
  • Downloads (last 6 weeks): 4
Reflects downloads up to 13 Nov 2024

Cited By

  • (2024) Collaborative Tag-Aware Graph Neural Network for Long-Tail Service Recommendation. IEEE Transactions on Services Computing 17:5 (2124-2138). DOI: 10.1109/TSC.2024.3349853. Online publication date: Sep-2024
  • (2024) Multimodal fine-grained grocery product recognition using image and OCR text. Machine Vision and Applications 35:4. DOI: 10.1007/s00138-024-01549-9. Online publication date: 7-Jun-2024
  • (2022) Utilizing Contrastive Learning To Address Long Tail Issue in Product Categorization. Proceedings of the 31st ACM International Conference on Information & Knowledge Management (5081-5082). DOI: 10.1145/3511808.3557522. Online publication date: 17-Oct-2022
  • (2022) Billion-Scale Pretraining with Vision Transformers for Multi-Task Visual Representations. 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (1431-1440). DOI: 10.1109/WACV51458.2022.00150. Online publication date: Jan-2022
