DOI:10.1145/3551349.3556966

Accelerating OCR-Based Widget Localization for Test Automation of GUI Applications

Published: 05 January 2023

Abstract

Optical character recognition (OCR) algorithms are often slow. They may take several seconds to recognize the text on a GUI screen, which makes OCR-based widget localization inconvenient for test automation, especially on computers without a GPU. This paper first identifies a common type of widget text that GUI tests need to locate: label text, the short text found in widgets such as buttons, menu items, and window titles. We then investigate the characteristics of text on a GUI screen and introduce a fast, GPU-independent Label Text Screening (LTS) technique to accelerate the OCR process for label text localization. The technique opens the black box of OCR engines and combines several simple methods to avoid excessive text analysis of the screen wherever possible. Experiments show that, on the subject datasets, LTS substantially reduces the average OCR-based label text localization time. On 4K-resolution GUI screens, it keeps the localization time below 0.5 seconds in more than about 60% of cases without GPU support on an ordinary laptop. In contrast, existing CPU-based approaches built on the popular OCR engines Tesseract, PaddleOCR, and EasyOCR usually need over 2 seconds to achieve the same goal on the same platform. Even with GPU acceleration, they can hardly keep the analysis time within 1 second. We believe the proposed approach will be helpful for implementing OCR-based test automation tools.
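
To make the task concrete, the following is a minimal Python sketch of the final step of OCR-based label text localization: given text boxes already produced by some OCR engine, pick the box whose text best matches the target label and return its center as the click point. It is an illustration only and is not the LTS technique described in the abstract. The `OcrBox` type, the `locate_label` function, and the 0.7 threshold are hypothetical names and values chosen for this example, and the standard-library `difflib` similarity stands in for whatever string-matching measure a real tool would use.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class OcrBox:
    """One recognized text region (hypothetical type): string plus pixel box."""
    text: str
    left: int
    top: int
    width: int
    height: int

    def center(self) -> tuple[int, int]:
        # Point a test script would click to activate the widget.
        return (self.left + self.width // 2, self.top + self.height // 2)


def similarity(a: str, b: str) -> float:
    # Case-insensitive similarity in [0, 1]; tolerates small OCR misreads.
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def locate_label(
    boxes: list[OcrBox], label: str, threshold: float = 0.7
) -> Optional[OcrBox]:
    """Return the box best matching `label`, or None if nothing is close enough.

    The threshold is an assumed value for illustration, not a tuned one.
    """
    best = max(boxes, key=lambda b: similarity(b.text, label), default=None)
    if best is None or similarity(best.text, label) < threshold:
        return None
    return best


# Example input: boxes as an OCR engine might report them for a menu bar,
# including a misread of "Edit" as "Edlt".
screen = [
    OcrBox("File", 10, 5, 40, 20),
    OcrBox("Edlt", 60, 5, 40, 20),
    OcrBox("Save As...", 200, 300, 100, 30),
]
target = locate_label(screen, "Edit")
```

Fuzzy matching is used because OCR output on small label text often contains single-character errors. The costly part in practice is producing `screen` in the first place, which is the step the paper aims to accelerate.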



Published In

ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering
October 2022
2006 pages
ISBN:9781450394758
DOI:10.1145/3551349
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. GUI testing
  2. OCR
  3. computer vision
  4. test automation

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ASE '22

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%
