An End-to-End Multi-modal-based Framework for Visual Identity Inspection System

Authors:

Yexing Zhang

BDIOT '24: Proceedings of the 2024 8th International Conference on Big Data and Internet of Things

Pages 40 - 46

https://doi.org/10.1145/3697355.3697362

Published: 12 December 2024

Abstract

Visual identity (VI) comprises the symbols and text elements that express the essence of a corporation. Misuse of the visual identity can significantly harm the corporate image and reputation, so it is important to detect misuse early and take appropriate measures. However, manual inspection is challenging, especially for large corporations. In this paper, we introduce an end-to-end framework for a deep-learning-based inspection system that automatically detects misuse of corporate VI in various forms of media. More precisely, based on the characteristics of corporate logos in VI, we propose a novel method to systematically generate a synthetic dataset, which is used to train the logo detection model. Furthermore, using general visual features of logos together with an OCR algorithm, we design a robust rule-based engine that automatically identifies VI misuse in an input image. Overall, the case study shows a precision of 90% and a recall of 85%, and the total processing time per image can be less than 0.5 seconds.
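The abstract does not describe how the synthetic dataset is generated, so the following is only a minimal sketch of one common approach, copy-paste compositing: a logo is pasted onto a background at a random position and its bounding box is recorded as the detection label. The function name `paste_logo`, the toy list-of-rows image format, and the box convention are all assumptions made for illustration and are not taken from the paper.

```python
import random


def paste_logo(background, logo, rng):
    """Paste `logo` onto a copy of `background` at a random position.

    Both images are lists of rows of pixel values (a hypothetical toy
    format, not the paper's). Returns the composite image and the
    bounding box (x_min, y_min, x_max, y_max), where x_max and y_max
    are exclusive.
    """
    bg_h, bg_w = len(background), len(background[0])
    lg_h, lg_w = len(logo), len(logo[0])
    if lg_h > bg_h or lg_w > bg_w:
        raise ValueError("logo must fit inside the background")

    # Choose a top-left corner so the whole logo stays inside the image.
    x = rng.randint(0, bg_w - lg_w)
    y = rng.randint(0, bg_h - lg_h)

    # Copy the background rows so the input image is left unchanged.
    composite = [row[:] for row in background]
    for dy in range(lg_h):
        composite[y + dy][x:x + lg_w] = logo[dy]

    return composite, (x, y, x + lg_w, y + lg_h)


# Toy example: an 8x6 blank background and a 3x2 "logo" of ones.
rng = random.Random(0)
background = [[0] * 8 for _ in range(6)]
logo = [[1, 1, 1], [1, 1, 1]]
image, box = paste_logo(background, logo, rng)
```

Each call yields one labeled training sample. A realistic generator would presumably also vary scale, rotation, color, and blending, but those choices are outside what the abstract states.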


Index Terms

Computer systems organization → Architectures → Other architectures → Special purpose systems

Published In

BDIOT '24: Proceedings of the 2024 8th International Conference on Big Data and Internet of Things

September 2024

412 pages

ISBN:9798400717529

DOI:10.1145/3697355

Copyright © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

BDIOT 2024

BDIOT 2024: 2024 8th International Conference on Big Data and Internet of Things

September 14 - 16, 2024

Macau, China

Acceptance Rates

Overall Acceptance Rate 75 of 136 submissions, 55%
