Visual Representations for Data Analytics: User Study

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1997)

Included in the following conference series:

International Conference on Computer-Human Interaction Research and Applications

Abstract

One characteristic of big data is its internal complexity and variety, manifested in the many types of datasets that must be managed, searched, or analyzed. In their natural form, some data entities are unstructured, such as texts or multimedia objects, while others are structured but too complex. In this paper, we investigated how visualizations of various complex datasets perform as universal data representations for both human users and deep learning models. In a user study, we evaluated several visualizations of complex relational data, some of which proved superior in terms of the precision and speed of classification by human users. Moreover, the same visualizations also led to effective classification performance when used with deep learning models.

Notes

1. To ensure comparability of attributes, their values were transformed with respect to the empirical cumulative distribution function.
2. The study is accessible from http://hmon.ms.mff.cuni.cz:5002/visualrepresentation/join?guid=JvmOL0_d8aldQyy7wRTzPLJeStoHv5Cc.
3. In all cases, we used Euclidean distance. Each feature was converted to a numeric value and normalized linearly between 0 and 1.
4. Note that in the case of the Nutrients dataset, we also selected a fixed sub-sample of displayed classes: the correct one plus three additional ones.
5. Overall, the typical time to complete the whole study was 15–25 min.
6. Note that because four classes were shown to each user in the case of the Nutrients dataset, the accuracy of random guessing would be 0.25.
7. Pearson’s correlation of −0.57 and −0.94 if the results of the Tabular baseline are removed.
8. Note that we only evaluated visualizations that were readily available in PNG format, so we discarded the parallel coordinates and raw tabular data.
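
The preprocessing mentioned in notes 1 and 3 can be illustrated with a minimal sketch. This is not the authors' code: the function names, the tie handling in the empirical CDF, the treatment of constant features, and the toy values are all assumptions made for illustration, and the notes do not state at which stage of the study each transformation was applied.

```python
from bisect import bisect_right
import math


def ecdf_transform(values):
    """Map each value to the fraction of observations less than or equal to it.

    Assumption: ties share the same (upper) ECDF value.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in values]


def min_max_normalize(values):
    """Linearly rescale one feature to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information, map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def euclidean(a, b):
    """Euclidean distance between two equally long feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy data: three objects described by two numeric features.
rows = [[10.0, 1.0], [20.0, 3.0], [40.0, 2.0]]

# Normalize feature by feature (column-wise), then rebuild the rows.
columns = [list(col) for col in zip(*rows)]
normalized_columns = [min_max_normalize(col) for col in columns]
normalized_rows = [list(r) for r in zip(*normalized_columns)]

# Distance between the first two objects in the normalized space.
distance = euclidean(normalized_rows[0], normalized_rows[1])

# ECDF transform of the first feature, as in note 1.
ecdf_first_feature = ecdf_transform(columns[0])
```

Both transformations put attributes on a common scale. The ECDF variant depends only on the rank of each value, so it is less sensitive to outliers than linear normalization.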

Acknowledgements

This paper has been supported by the Czech Science Foundation (GAČR) project 22-21696S and by Charles University grant SVV-260698/2023.

Author information

Authors and Affiliations

Faculty of Mathematics and Physics, Charles University, Prague, Czechia
Ladislav Peska, Ivana Sixtova, David Hoksza, David Bernhauer & Tomas Skopal

Corresponding author

Correspondence to Ladislav Peska.

Editor information

Editors and Affiliations

Instituto Superior Técnico, IT - Institute of Telecommunications, Lisbon, Portugal
Hugo Plácido da Silva
University of Turin, Turin, Italy
Pietro Cipresso

About this paper

Cite this paper

Peska, L., Sixtova, I., Hoksza, D., Bernhauer, D., Skopal, T. (2023). Visual Representations for Data Analytics: User Study. In: da Silva, H.P., Cipresso, P. (eds) Computer-Human Interaction Research and Applications. CHIRA 2023. Communications in Computer and Information Science, vol 1997. Springer, Cham. https://doi.org/10.1007/978-3-031-49368-3_14

DOI: https://doi.org/10.1007/978-3-031-49368-3_14
Published: 23 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-49367-6
Online ISBN: 978-3-031-49368-3
eBook Packages: Computer Science, Computer Science (R0)