DOI: 10.1145/1459359.1459413
research-article

HOTPAPER: multimedia interaction with paper using mobile phones

Published: 26 October 2008

Abstract

The popularity of camera phones enables many exciting multimedia applications. In this paper, we present a novel technology and several applications that allow users to interact with paper documents, books, and magazines. The interaction takes the form of reading and writing electronic information, such as images, web URLs, video, and audio, to the paper medium by pointing a camera phone at a patch of text on a document. Our application does not require any special markings, barcodes, or watermarks on the paper document. Instead, we propose a document recognition algorithm that, given a small document image, automatically determines the location of that patch of text within a large collection of document images. This is very challenging because the majority of phone cameras lack autofocus and macro capabilities and produce low-quality images and video. We developed a novel algorithm, Brick Wall Coding (BWC), that performs image-based document recognition using mobile phone video frames. Given a document patch image, BWC uses the layout, i.e., the relative locations, of word boxes to determine the original file, the page, and the location on the page. BWC runs in real time (4 frames per second) on a Treo 700w smartphone with a 312 MHz processor and 64 MB of RAM. Our method can recognize blurry document patch frames that contain as few as 4-5 lines of text at video resolutions as low as 176x144. We performed experiments by indexing 4397 document pages and querying this database with 533 document patches. In addition to the basic algorithm, this paper describes several applications enabled by mobile phone-paper interaction, such as inserting electronic annotations into paper documents, using paper as a tangible interface to collect and communicate multimedia data, and collaborative homework.
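The abstract does not give the details of BWC, so the following is not the authors' algorithm. It is only a minimal sketch of the general idea of recognizing a page from the layout of word boxes, under assumptions of our own: word boxes are already extracted and grouped into text lines, each word is coded by its width relative to the line height (so the code survives uniform rescaling), and pages are retrieved by voting over short runs of these codes. The names `line_features`, `build_index`, and `query` are hypothetical, and the sketch returns only a page identifier, not a location on the page.

```python
from collections import Counter, defaultdict

# A word box is (x, y, width, height) in pixels.
# A text line is a left-to-right list of word boxes.


def line_features(line, n=3):
    """Return n-grams of quantized word widths for one text line.

    Each width is divided by the mean box height of the line, so the
    codes do not change when the whole image is uniformly rescaled.
    """
    if not line:
        return []
    mean_height = sum(h for _, _, _, h in line) / len(line)
    codes = [round(w / mean_height) for _, _, w, _ in line]
    return [tuple(codes[i:i + n]) for i in range(len(codes) - n + 1)]


def build_index(pages, n=3):
    """Map every feature to the set of page ids on which it occurs.

    `pages` is a dict: page id -> list of text lines.
    """
    index = defaultdict(set)
    for page_id, lines in pages.items():
        for line in lines:
            for feature in line_features(line, n):
                index[feature].add(page_id)
    return index


def query(index, patch_lines, n=3):
    """Return the page id with the most matching features, or None."""
    votes = Counter()
    for line in patch_lines:
        for feature in line_features(line, n):
            for page_id in index.get(feature, ()):
                votes[page_id] += 1
    return votes.most_common(1)[0][0] if votes else None
```

A patch photographed at a different scale yields the same codes as the indexed page, which is why the lookup can succeed without any markers on the paper. A real system would also have to tolerate blur, perspective distortion, and word-segmentation errors, none of which this sketch addresses.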




      Published In

      MM '08: Proceedings of the 16th ACM international conference on Multimedia
      October 2008
      1206 pages
      ISBN:9781605583037
      DOI:10.1145/1459359

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. linking paper to electronic data
      2. markerless linking
      3. mobile imaging
      4. mobile interaction

      Qualifiers

      • Research-article

      Conference

      MM08: ACM Multimedia Conference 2008
      October 26 - 31, 2008
      Vancouver, British Columbia, Canada

      Acceptance Rates

      Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

      Cited By

      • (2024) A Market-Ready Ecosystem for Publishing and Reading Augmented Books. Human-Centered Design, Operation and Evaluation of Mobile Communications, pp. 58-75. DOI: 10.1007/978-3-031-60487-4_5. Online publication date: 1-Jun-2024
      • (2020) Robust Combined Binarization Method of Non-Uniformly Illuminated Document Images for Alphanumerical Character Recognition. Sensors 20:10 (2914). DOI: 10.3390/s20102914. Online publication date: 21-May-2020
      • (2020) A Lightweight Mobile Outdoor Augmented Reality Method Using Deep Learning and Knowledge Modeling for Scene Perception to Improve Learning Experience. International Journal of Human–Computer Interaction 37:9 (884-901). DOI: 10.1080/10447318.2020.1848163. Online publication date: 25-Nov-2020
      • (2019) Improvement of Image Binarization Methods Using Image Preprocessing with Local Entropy Filtering for Alphanumerical Character Recognition Purposes. Entropy 21:6 (562). DOI: 10.3390/e21060562. Online publication date: 4-Jun-2019
      • (2019) Mobile Visual Search Compression With Grassmann Manifold Embedding. IEEE Transactions on Circuits and Systems for Video Technology 29:11 (3356-3366). DOI: 10.1109/TCSVT.2018.2881177. Online publication date: 1-Nov-2019
      • (2018) Pulp Nonfiction. Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 1-11. DOI: 10.1145/3173574.3173691. Online publication date: 21-Apr-2018
      • (2017) The Fusion of an Ultrasonic and Spatially Aware System in a Mobile-Interaction Device. Symmetry 9:8 (137). DOI: 10.3390/sym9080137. Online publication date: 30-Jul-2017
      • (2017) IllumiPaper. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, pp. 5605-5618. DOI: 10.1145/3025453.3025525. Online publication date: 2-May-2017
      • (2016) Freehand interaction with a paper-based input interface. International Journal of Computer Applications in Technology 54:2 (106-120). DOI: 10.1504/IJCAT.2016.078759. Online publication date: 1-Jan-2016
      • (2016) Exploring Low-Cost, Internet-Free Information Access for Resource-Constrained Communities. ACM Transactions on Computer-Human Interaction 23:6 (1-34). DOI: 10.1145/2990498. Online publication date: 3-Dec-2016
