
Investigating and reflecting on the integration of automatic data analysis and visualization in knowledge discovery

Published: 27 May 2010

Abstract

The aim of this work is to survey and reflect on the various ways visualization and data mining can be integrated to achieve effective knowledge discovery by involving the best of human and machine capabilities. Following a bottom-up bibliographic research approach, the article categorizes the observed techniques into classes, highlighting current trends, gaps, and potential future directions for research. In particular, it looks at the strengths and weaknesses of information visualization (infovis) and data mining, at the purposes for which infovis researchers use data mining techniques, and, conversely, at how data mining researchers employ infovis techniques. The article then proposes, on the basis of the extracted patterns, a series of potential extensions not found in the literature. Finally, we use this information to analyze the discovery process by comparing the analysis steps from the perspectives of information visualization and data mining. The comparison brings to light new perspectives on how mining and visualization can best employ human and machine strengths. This activity leads to a series of reflections and research questions that can help to further advance the science of visual analytics.


      Published In

      ACM SIGKDD Explorations Newsletter  Volume 11, Issue 2
      December 2009
      128 pages
      ISSN:1931-0145
      EISSN:1931-0153
      DOI:10.1145/1809400

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. data mining
      2. knowledge discovery
      3. visual analytics
      4. visual data mining
      5. visualization

      Qualifiers

      • Research-article
