DOI: 10.5555/1413370.1413422
research-article

High performance multivariate visual data exploration for extremely large data

Published: 15 November 2008

Abstract

One of the central challenges in modern science is the need to quickly derive knowledge and understanding from large, complex collections of data. We present a new approach that addresses this challenge by combining and extending techniques from high performance visual data analysis and scientific data management. We demonstrate the approach in the context of gaining insight from complex, time-varying datasets produced by a laser wakefield accelerator simulation. Our approach uses histogram-based parallel coordinates both for visual information display and as a vehicle for guiding a data mining operation. Data extraction and subsetting are implemented with state-of-the-art index/query technology. Although applied here to accelerator science, the approach is generally applicable to a broad set of science applications and is implemented in a production-quality visual data analysis infrastructure. We conduct a detailed performance analysis and demonstrate good scalability on a distributed-memory Cray XT4 system.
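The idea of histogram-based parallel coordinates combined with query-driven subsetting can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`range_query`, `bin_index`, `axis_pair_histograms`), the variable names `x`, `px`, `py`, and the random data are all hypothetical, and a plain linear scan stands in for the index/query technology mentioned in the abstract. The sketch builds one 2D histogram per pair of adjacent axes, so the amount of data to draw depends on the bin count rather than on the number of records, then rebuilds the same histograms for a range-selected subset.

```python
import random


def range_query(records, conditions):
    """Return the records that satisfy every (lo, hi) range condition.

    Assumption: a linear scan stands in for an index lookup.
    """
    return [
        r for r in records
        if all(lo <= r[name] <= hi for name, (lo, hi) in conditions.items())
    ]


def bin_index(value, lo, hi, nbins):
    """Map a value in [lo, hi] to a bin number in [0, nbins - 1]."""
    if hi == lo:
        return 0
    i = int((value - lo) / (hi - lo) * nbins)
    return min(max(i, 0), nbins - 1)


def axis_pair_histograms(records, axes, ranges, nbins):
    """Build one nbins x nbins count histogram per pair of adjacent axes."""
    hists = []
    for a, b in zip(axes, axes[1:]):
        h = [[0] * nbins for _ in range(nbins)]
        for r in records:
            i = bin_index(r[a], *ranges[a], nbins)
            j = bin_index(r[b], *ranges[b], nbins)
            h[i][j] += 1
        hists.append(h)
    return hists


# Hypothetical data: 10,000 records with three made-up variables.
random.seed(0)
axes = ["x", "px", "py"]
records = [
    {"x": random.random(), "px": random.gauss(0, 1), "py": random.gauss(0, 1)}
    for _ in range(10_000)
]
ranges = {a: (min(r[a] for r in records), max(r[a] for r in records)) for a in axes}

# Context view: histograms over all records.
context = axis_pair_histograms(records, axes, ranges, nbins=32)

# Focus view: histograms over the subset selected by a range condition.
selected = range_query(records, {"px": (1.0, ranges["px"][1])})
focus = axis_pair_histograms(selected, axes, ranges, nbins=32)
```

Each entry of `context` or `focus` could then be rendered as a set of bands between two neighboring axes, with brightness or opacity driven by the bin count. Using the same `ranges` for both views keeps the focus bins aligned with the context bins.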


Published In

SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing
November 2008
739 pages
ISBN: 9781424428359

Publisher

IEEE Press


Acceptance Rates

SC '08 Paper Acceptance Rate 59 of 277 submissions, 21%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%


Cited By

  • (2018) Towards Efficient Big Data. Proceedings of the 2nd International Conference on Smart Digital Environment, pp. 42-47. DOI: 10.1145/3289100.3289108. Online publication date: 18-Oct-2018
  • (2014) GeoLens. Proceedings of the 2014 IEEE/ACM International Symposium on Big Data Computing, pp. 35-44. DOI: 10.1109/BDC.2014.12. Online publication date: 8-Dec-2014
  • (2014) DIRAQ. Cluster Computing 17:4, pp. 1101-1119. DOI: 10.1007/s10586-014-0358-z. Online publication date: 1-Dec-2014
  • (2013) A classification of scientific visualization algorithms for massive threading. Proceedings of the 8th International Workshop on Ultrascale Visualization, pp. 1-10. DOI: 10.1145/2535571.2535591. Online publication date: 17-Nov-2013
  • (2013) An analytical framework for particle and volume data of large-scale combustion simulations. Proceedings of the 8th International Workshop on Ultrascale Visualization, pp. 1-8. DOI: 10.1145/2535571.2535590. Online publication date: 17-Nov-2013
  • (2013) GoldRush. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pp. 1-12. DOI: 10.1145/2503210.2503279. Online publication date: 17-Nov-2013
  • (2013) Scalable in situ scientific data encoding for analytical query processing. Proceedings of the 22nd international symposium on High-performance parallel and distributed computing, pp. 1-12. DOI: 10.1145/2493123.2465527. Online publication date: 17-Jun-2013
  • (2013) Scalable in situ scientific data encoding for analytical query processing. Proceedings of the 22nd international symposium on High-performance parallel and distributed computing, pp. 1-12. DOI: 10.1145/2462902.2465527. Online publication date: 17-Jun-2013
  • (2013) imMens. Proceedings of the 15th Eurographics Conference on Visualization, pp. 421-430. DOI: 10.1111/cgf.12129. Online publication date: 17-Jun-2013
  • (2012) Parallel I/O, analysis, and visualization of a trillion particle simulation. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pp. 1-12. DOI: 10.5555/2388996.2389077. Online publication date: 10-Nov-2012