DOI: 10.1145/3167132.3167236
research-article

Text extraction and retrieval from smartphone screenshots: building a repository for life in media

Published: 09 April 2018

Abstract

Daily engagement in life experiences is increasingly interwoven with mobile device use. Screen capture at the scale of seconds is being used in behavioral studies and to implement "just-in-time" health interventions. The increasing psychological breadth of digital information will continue to make the actual screens that people view a preferred, if not required, source of data about life experiences. Effective and efficient Information Extraction and Retrieval from digital screenshots is a crucial prerequisite to successful use of screen data. In this paper, we present the experimental workflow we used to: (i) pre-process a unique collection of screen captures, (ii) extract unstructured text embedded in the images, (iii) organize image text and metadata based on a structured schema, (iv) index the resulting document collection, and (v) allow for Image Retrieval through a dedicated vertical search engine application. The adopted procedure integrates different open source libraries for traditional image processing, Optical Character Recognition (OCR), and Image Retrieval. Our aim is to assess whether and how state-of-the-art methodologies can be applied to this novel data set. We show how combining OpenCV-based pre-processing modules with a Long Short-Term Memory (LSTM)-based release of Tesseract OCR, without ad hoc training, led to a 74% character-level accuracy of the extracted text. Further, we used the processed repository as a baseline for a dedicated Image Retrieval system, for immediate use by behavioral and prevention scientists. We discuss issues of Text Information Extraction and Retrieval that are particular to the screenshot image case and suggest important future work.
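
The abstract reports a character-level accuracy figure without stating how it was computed. One common way to define such a score is one minus the edit distance between the OCR output and a ground-truth transcription, divided by the ground-truth length. The sketch below illustrates that definition only. The function names `levenshtein` and `character_accuracy` are hypothetical, and the exact metric and tooling used in the paper are not given here, so this should be read as an assumption rather than the authors' procedure.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to limit row size.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_accuracy(ground_truth: str, ocr_output: str) -> float:
    """Assumed definition: 1 - edit_distance / len(ground_truth),
    clipped at 0 so heavily corrupted output never scores below zero."""
    if not ground_truth:
        return 1.0 if not ocr_output else 0.0
    errors = levenshtein(ground_truth, ocr_output)
    return max(0.0, 1.0 - errors / len(ground_truth))
```

For example, `character_accuracy("abcd", "abxd")` counts one substitution over four ground-truth characters. Averaging this score over a set of manually transcribed screenshots would give a collection-level figure comparable in spirit to the one quoted in the abstract.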

Published In

SAC '18: Proceedings of the 33rd Annual ACM Symposium on Applied Computing
April 2018
2327 pages
ISBN:9781450351911
DOI:10.1145/3167132
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. image processing
  2. image retrieval
  3. text extraction

Qualifiers

  • Research-article

Conference

SAC 2018: Symposium on Applied Computing
April 9 - 13, 2018
Pau, France

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

Article Metrics

  • Downloads (last 12 months): 58
  • Downloads (last 6 weeks): 3
Reflects downloads up to 13 Feb 2025

Cited By

  • (2024) A Tool for Capturing Smartphone Screen Text. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-24. DOI: 10.1145/3613904.3642347. Online publication date: 11-May-2024
  • (2024) Robotic Process Automation Efficiency for Mobile App Testing: An Empirical Investigation. International Journal of Software Engineering and Knowledge Engineering 34:07, 1025-1046. DOI: 10.1142/S0218194024500116. Online publication date: 14-May-2024
  • (2024) Mobile and platform users’ mediatized rituals in response to terrorist attacks: a discourse analysis of continuously collected screenshots. Journal of Communication. DOI: 10.1093/joc/jqae052. Online publication date: 31-Dec-2024
  • (2023) How Digitally Extractable Attributes of YouTube Video Thumbnails and Titles Affect Video Views. 2023 IEEE 7th International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE), 82-87. DOI: 10.1109/ICITISEE58992.2023.10405243. Online publication date: 29-Nov-2023
  • (2023) Digital Trace Data Collection for Social Media Effects Research: APIs, Data Donation, and (Screen) Tracking. Communication Methods and Measures 18:2, 124-141. DOI: 10.1080/19312458.2023.2181319. Online publication date: 27-Feb-2023
  • (2022) An automatic approach to detect problems in Android builds through screenshot analysis. Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, 926-932. DOI: 10.1145/3477314.3507273. Online publication date: 25-Apr-2022
  • (2020) Mobile data donations: Assessing self-report accuracy and sample biases with the iOS Screen Time function. Mobile Media & Communication 9:2, 293-313. DOI: 10.1177/2050157920959106. Online publication date: 30-Sep-2020
  • (2020) BELT. Proceedings of the Seventh ACM Conference on Learning @ Scale, 277-280. DOI: 10.1145/3386527.3406727. Online publication date: 12-Aug-2020
  • (2020) A Heuristic Baseline Method for Metadata Extraction from Scanned Electronic Theses and Dissertations. Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, 515-516. DOI: 10.1145/3383583.3398590. Online publication date: 1-Aug-2020
  • (2019) IoT Enabled Prescription Reading Smart Medicine Dispenser Implementing Maximally Stable Extremal Regions and OCR. 2019 Third International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), 134-138. DOI: 10.1109/I-SMAC47947.2019.9032709. Online publication date: Dec-2019
