Nothing Special   »   [go: up one dir, main page]

skip to main content
article
Free access

The handwritten trie: indexing electronic ink

Published: 22 May 1995 Publication History

Abstract

The emergence of the pen as the main interface device for personal digital assistants and pen-computers has made handwritten text, and more generally ink, a first-class object. As for any other type of data, the need of retrieval is a prevailing one. Retrieval of handwritten text is more difficult than that of conventional data since it is necessary to identify a handwritten word given slightly different variations in its shape. The current way of addressing this is by using handwriting recognition, which is prone to errors and limits the expressiveness of ink. Alternatively, one can retrieve from the database handwritten words that are similar to a query handwritten word using techniques borrowed from pattern and speech recognition. In particular, Hidden Markov Models (HMM) can be used as representatives of the handwritten words in the database. However, using HMM techniques to match the input against every item in the database (sequential searching) is unacceptably slow and does not scale up for large ink databases. In this paper, an indexing technique based on HMMs is proposed. The new index is a variation of the trie data structure that uses HMMs and a new search algorithm to provide approximate matching. Each node in the tree contains handwritten letters, where each letter is represented by an HMM. Branching in the trie is based on the ranking of matches given by the HMMs. The new search algorithm is parametrized so that it provides means for controlling the matching quality of the search process via a time-based budget. The index dramatically improves the search time in a database of handwritten words. Due to the variety of platforms for which this work is aimed, ranging from personal digital assistants to desktop computers, we implemented both main-memory and disk-based systems. The implementations are reported in this paper, along with performance results that show the practicality of the technique under a variety of conditions.

References

[1]
W. Aref, D. Barbar~, and P. Vallabhaneni. The Handwritten Trie: Indexing Electronic Ink. Technical report, M.I.T.L, October 1994.
[2]
Walid Aref and Daniel Barbar~. The Hidden Markov Model Tree Index: A Practical Approach to Fast Recognition of Handwritten Documents in Large Databases. Technical Report MITL-TR-84- 93, MITL, January 1994.
[3]
Walid G. Aref, Padmavathi Vallabhaneni, and Daniel Barbar~. Towards a realization of handwritten databases: Training and recognition. Technical Report MITL-TR-98-94, Matsushita information Technology Laboratory, Princeton, NJ, March 1994.
[4]
R. Bakis. Continuous speech word recognition via centisecond acoustic states. In Proc. ASA Meeting, Washington, DC, April 1976.
[5]
Daniel Barbar~. Method to index electronic handwritten documents. Technical Report MITL-TR- 77-93, Matsushita Information Technology Laboratory, Princeton, N3, November 1993.
[6]
R. Carr and D. Shafcr. The Power of PenPoint. Addison-Wesley, 1991.
[7]
Rene de la Briandais. File searching using variable length keys. in Proceedings of the Western Joint Computer Conference, pages 295-298, 1959.
[8]
E. Fredkin. Trie memory. Communications of the A CM, 3:490-500~ 1960.
[9]
A. Kaltenmeier, T. Caesar, J. M. Gloger, and E. Mandler. Sophisticated topology of hidden markov models for cursive script recognition. In Proceeding8 of the Sad. International Confe,ence on Document Analysis and Recognition, pages 139- 142, Tsukuba Science City, October 1993.
[10]
D. E. Knuth. The Art of Computer Programming, Vol. 3: Sorting and Searching. Addison-Wesley, Reading, MA, 1978.
[11]
K. Landau, S. Major, and C. Wiederhold. The Role of PDA in the Office. Notes of the Seminar at PC-Expo, June 1994.
[12]
S. E. Levinson, L. R. Rabiner, and M. M. Sondhi. An introduction to the application of the theory of probabilistic functions of a markov proces to automatic speech recognition. Bell System Technical Journal, 62(4):1035-1074, April 1983.
[13]
D. P. Lopresti and A. Tomkins. Approximate Matching of Hand-Drawn Pictograms. In Proceedings of the Third International Workshop or~ Frontiers in Handwriting Recognition, May 1993.
[14]
D. P. Lopresti and A. Tomkins. Pictographic Naming. In Proceedings of INTERCHI93, 1993.
[15]
I-ice-Seen Park and Seong-Whan Lee. Large-set handwritten character recognition with multiple stochastic models. In Proceedings of ~he Znd. International Conference on Document Analysis and Recognition, pages 143-146, Tsukuba Science City, October 1993.
[16]
L. R. Rabiner. A tutorial on hidden markov models and selected applications in speech recognition. Proceeding of the IEEE, 77(2):257-285, February 1989.
[17]
Lawrence Rabiner and Biing-Hwang Juang. Fundamentals of Speech Recognition. Prentice Hall, Englewood Cliffs, N.J., 1993.
[18]
Dean H. Rubine. The Automatic Recognition of Gestures. PhD thesis, Carnegie Mellon University, Pittsburgh~ PA, December 1991. Also technical report number CMU-CS-91-202.
[19]
Bong Kee Sin and Jim Hyuung Kim. A statistical approach with HMMs for on-line cursive I-iangui (Korean script) recognition. In Proceedings of the ~nd. International Conference on Document Analysis and Recognition, pages 147-150, Tsukuba Science City, October 1993.
[20]
W. Smith. The Newton Application Architecture. In Proceedings of the IEEE Computer Conference, San Francisco, 1994.

Cited By

View all
  • (2022)The “AI + R” - tree: An Instance-optimized R - tree2022 23rd IEEE International Conference on Mobile Data Management (MDM)10.1109/MDM55031.2022.00023(9-18)Online publication date: Jun-2022
  • (2020)A Tutorial on Learned Multi-dimensional IndexesProceedings of the 28th International Conference on Advances in Geographic Information Systems10.1145/3397536.3426358(1-4)Online publication date: 3-Nov-2020
  • (2018)Electronic Ink IndexingEncyclopedia of Database Systems10.1007/978-1-4614-8265-9_143(1279-1285)Online publication date: 7-Dec-2018
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGMOD Record
ACM SIGMOD Record  Volume 24, Issue 2
May 1995
490 pages
ISSN:0163-5808
DOI:10.1145/568271
Issue’s Table of Contents
  • cover image ACM Conferences
    SIGMOD '95: Proceedings of the 1995 ACM SIGMOD international conference on Management of data
    June 1995
    508 pages
    ISBN:0897917316
    DOI:10.1145/223784
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 May 1995
Published in SIGMOD Volume 24, Issue 2

Check for updates

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)62
  • Downloads (Last 6 weeks)13
Reflects downloads up to 22 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2022)The “AI + R” - tree: An Instance-optimized R - tree2022 23rd IEEE International Conference on Mobile Data Management (MDM)10.1109/MDM55031.2022.00023(9-18)Online publication date: Jun-2022
  • (2020)A Tutorial on Learned Multi-dimensional IndexesProceedings of the 28th International Conference on Advances in Geographic Information Systems10.1145/3397536.3426358(1-4)Online publication date: 3-Nov-2020
  • (2018)Electronic Ink IndexingEncyclopedia of Database Systems10.1007/978-1-4614-8265-9_143(1279-1285)Online publication date: 7-Dec-2018
  • (2017)Electronic Ink IndexingEncyclopedia of Database Systems10.1007/978-1-4899-7993-3_143-2(1-7)Online publication date: 6-Sep-2017
  • (2009)Electronic Ink IndexingEncyclopedia of Database Systems10.1007/978-0-387-39940-9_143(972-978)Online publication date: 2009
  • (2007)Duplicate Elimination in Space-partitioning Tree IndexesProceedings of the 19th International Conference on Scientific and Statistical Database Management10.1109/SSDBM.2007.10Online publication date: 9-Jul-2007
  • (2007)Redundant Bit Vectors for Robust Indexing and Retrieval of Electronic InkNinth International Conference on Document Analysis and Recognition (ICDAR 2007)10.1109/ICDAR.2007.4378737(387-391)Online publication date: Sep-2007
  • (2006)Space-Partitioning Trees in PostgreSQLProceedings of the 22nd International Conference on Data Engineering10.1109/ICDE.2006.146Online publication date: 3-Apr-2006
  • (2005)Matching algorithm for hangul recognition based on PDAProceedings of the 8th international conference on Artificial Neural Networks: computational Intelligence and Bioinspired Systems10.1007/11494669_109(889-898)Online publication date: 8-Jun-2005
  • (2002)Handwritten document retrievalProceedings Eighth International Workshop on Frontiers in Handwriting Recognition10.1109/IWFHR.2002.1030915(233-238)Online publication date: 2002
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media