DOI: 10.1145/1284420.1284427

A system for understanding imaged infographics and its applications

Published: 28 August 2007

Abstract

Information graphics, or infographics, are visual representations of information, data, or knowledge. Understanding infographics in documents is a relatively new research problem, and it becomes more challenging when the infographics appear as raster images. This paper describes the technical details and practical applications of a system we built for recognizing and understanding imaged infographics located in document pages. To recognize infographics in raster form, both graphical symbol extraction and text recognition must be performed. The two kinds of information are then automatically associated to capture and store the semantic information carried by the infographics. Two practical applications of the system are introduced: supplementing a traditional optical character recognition (OCR) system and providing enriched information for question answering (QA). To test the performance of the system, we conducted experiments on a collection of downloaded and scanned infographic images. A second set of scanned document pages, from the University of Washington document image database, was used to demonstrate how the system output can be used by other applications. The results obtained confirm the practical value of the system.
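
The association step mentioned in the abstract can be pictured with a small sketch. The abstract does not say how text and graphical symbols are associated, so the nearest-center rule below is only an assumption for illustration, and the names `Box`, `associate`, `bars`, and `labels` are hypothetical rather than taken from the paper. It pairs each extracted symbol (here, the bars of a bar chart) with the closest recognized text string.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    @property
    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


def associate(symbols, texts):
    """Pair each graphical symbol with its nearest text label.

    symbols: dict mapping a symbol id to its Box
    texts:   dict mapping a recognized string to its Box
    Returns a dict mapping each symbol id to the nearest recognized string.
    Assumption: distance between box centers is a good enough proxy.
    """
    pairs = {}
    for sym_id, sym_box in symbols.items():
        sx, sy = sym_box.center
        # Pick the text whose box center is closest to this symbol's center.
        pairs[sym_id] = min(
            texts,
            key=lambda t: hypot(texts[t].center[0] - sx, texts[t].center[1] - sy),
        )
    return pairs


# Toy bar chart: two bars, each with a category label printed below it.
bars = {"bar_0": Box(10, 40, 20, 60), "bar_1": Box(50, 20, 20, 80)}
labels = {"2005": Box(10, 105, 20, 10), "2006": Box(50, 105, 20, 10)}
result = associate(bars, labels)
```

A real system would likely need more than center distance, for example axis alignment or the chart type, but the sketch shows the kind of structured output (symbol paired with text) that downstream OCR or QA components could consume.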

Published In

DocEng '07: Proceedings of the 2007 ACM symposium on Document engineering
August 2007
236 pages
ISBN: 9781595937766
DOI: 10.1145/1284420

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. applications
  2. association of text and graphics
  3. document image understanding
  4. infographics

Qualifiers

  • Article

Conference

DocEng07: ACM Symposium on Document Engineering
August 28-31, 2007
Winnipeg, Manitoba, Canada

Acceptance Rates

Overall Acceptance Rate 194 of 564 submissions, 34%
