Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3365109.3368789acmconferencesArticle/Chapter ViewAbstractPublication PagesbdcatConference Proceedingsconference-collections
research-article

GPU Acceleration of Range Queries over Large Data Sets

Published: 02 December 2019 Publication History

Abstract

Data management systems commonly use bitmap indices to increase the efficiency of querying scientific data. Bitmaps are usually highly compressible and can be queried directly using fast hardware-supported bitwise logical operations. The processing of bitmap queries is inherently parallel in structure, which suggests they could benefit from concurrent computer systems. In particular, bitmap-range queries offer a highly parallel computational problem, and the hardware features of graphics processing units (GPUs) offer an alluring platform for accelerating their execution. In this paper, we present three GPU algorithms and one CPU based algorithm for the parallel execution of bitmap-range queries. We show that in 95% of our tests, using real and synthetic data, the GPU algorithms greatly outperform the parallel CPU algorithm. For these tests, the GPU algorithms provide up to 87.7× speedup and an average speedup of 30.22× over the parallel CPU algorithm. In addition to enhancing performance, augmenting traditional bitmap query systems with GPUs to offload bitmap query processing allows the CPU to process other requests.

References

[1]
Witold Andrzejewski and Robert Wrembel. 2010. GPU-WAH: Applying GPUs to compressing bitmap indexes with word aligned hybrid. In International Conference on Database and Expert Systems Applications . Springer, Berlin,Heidelberg, 315--329.
[2]
Witold Andrzejewski and Robert Wrembel. 2011. GPU-PLWAH: GPU-based implementation of the PLWAH algorithm for compressing bitmaps. Control and cybernetics, Vol. 40 (2011), 627--650.
[3]
Gennady Antoshenkov. 1995. Byte-aligned bitmap compression. In Proceedings DCC'95 Data Compression Conference . IEEE, 476.
[4]
Peter Bailey, Joseph Myre, Stuart DC Walsh, David J Lilja, and Martin O Saar. 2009. Accelerating lattice Boltzmann fluid flow simulations using graphics processors. In 2009 International Conference on Parallel Processing. IEEE, 550--557.
[5]
Peter Bakkum and Kevin Skadron. 2010. Accelerating SQL Database Operations on a GPU with CUDA. In Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units (GPGPU-3). ACM, New York, NY, USA, 94--103. https://doi.org/10.1145/1735688.1735706
[6]
Nathan Bell and Jared Hoberock. 2012. Thrust: A productivity-oriented library for CUDA . In GPU computing gems Jade edition . Elsevier, 359--371.
[7]
Samy Chambi, Daniel Lemire, Owen Kaser, and Robert Godin. 2016. Better Bitmap Performance with Roaring Bitmaps. Softw. Pract. Exper., Vol. 46, 5 (May 2016), 709--719.
[8]
Jerry Chou, Mark Howison, Brian Austin, Kesheng Wu, Ji Qiang, E. Wes Bethel, Arie Shoshani, Oliver Rübel, Prabhat, and Rob D. Ryne. 2011. Parallel Index and Query for Large Scale Data Analysis. In International Conference for High Performance Computing, Networking, Storage and Analysis (SC '11). 30:1--30:11.
[9]
Alessandro Colantonio and Roberto Di Pietro. 2010. Concise: Compressed 'n' Composable Integer Set. Inform. Process. Lett., Vol. 110, 16 (2010), 644--650.
[10]
Fabian Corrales, David Chiu, and Jason Sawin. 2011. Variable Length Compression for Bitmap Indices. In Database and Expert Systems Applications, Abdelkader Hameurlain, Stephen W. Liddle, Klaus-Dieter Schewe, and Xiaofang Zhou (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 381--395.
[11]
C CUDA. 2019. Best practice guide. https://docs.nvidia.com/cuda/cuda-c-best-practices-guide
[12]
Leonardo Dagum and Ramesh Menon. 1998. OpenMP: An industry-standard API for shared-memory programming. Computing in Science & Engineering, Vol. 5, 1 (1998), 46--55.
[13]
Franccois Deliège and Torben Bach Pedersen. 2010. Position List Word Aligned Hybrid: Optimizing Space and Performance for Compressed Bitmaps. In International Conference on Extending Database Technology (EDBT '10). 228--239.
[14]
Bin Dong, Surendra Byna, and Kesheng Wu. 2014. Parallel query evaluation as a Scientific Data Service. 2014 IEEE International Conference on Cluster Computing (CLUSTER) (2014), 194--202.
[15]
Jack Dongarra, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, and Ichitaro Yamazaki. 2014. Accelerating Numerical Dense Linear Algebra Calculations with GPUs . Numerical Computations with GPUs (2014), 1--26.
[16]
Francesco Fusco, Marc Ph. Stoecklin, and Michail Vlachos. 2010. Net-Fli: On-the-fly Compression, Archiving and Indexing of Streaming Network Traffic. VLDB, Vol. 3, 2 (2010), 1382--1393.
[17]
Francesco Fusco, Michail Vlachos, Xenofontas Dimitropoulos, and Luca Deri. 2013. Indexing Million of Packets Per Second Using GPUs. In Proceedings of the 2013 Conference on Internet Measurement Conference (IMC '13). 327--332.
[18]
Luke J. Gosink, Kesheng Wu, E. Wes Bethel, John D. Owens, and Kenneth I. Joy. 2009. Data Parallel Bin-Based Indexing for Answering Queries on Multi-core Architectures. In Scientific and Statistical Database Management, Marianne Winslett (Ed.). 110--129.
[19]
Gheorghi Guzun, Guadalupe Canahuate, David Chiu, and Jason Sawin. 2014. A tunable compression framework for bitmap indices. In 2014 IEEE 30th International Conference on Data Engineering. IEEE, 484--495.
[20]
S. Haas, T. Karnagel, O. Arnold, E. Laux, B. Schlegel, G. Fettweis, and W. Lehner. 2016. HW/SW-database-codesign for compressed bitmap index processing. In 2016 IEEE 27th International Conference on Application-specific Systems, Architectures and Processors (ASAP). 50--57.
[21]
Max Heimel and Volker Markl. 2012. In Proceedings of VLDB . 33--44.
[22]
Jinwoong Kim, Sul-Gi Kim, and Beomseok Nam. 2013. Parallel Multi-dimensional Range Query Processing with R-trees on GPU. J. Parallel Distrib. Comput., Vol. 73, 8 (2013), 1195--1207.
[23]
Victor W Lee, Changkyu Kim, Jatin Chhugani, Michael Deisher, Daehyun Kim, Anthony D Nguyen, Nadathur Satish, Mikhail Smelyanskiy, Srinivas Chennupaty, Per Hammarlund, et almbox. 2010. Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU . ACM SIGARCH computer architecture news, Vol. 38, 3 (2010), 451--460.
[24]
M. Lichman. 2013. UCI Machine Learning Repository. http://archive.ics.uci.edu/ml
[25]
Stuart Lloyd. 1982. Least squares quantization in PCM. IEEE transactions on information theory, Vol. 28, 2 (1982), 129--137.
[26]
Duane Merrill. 2016. Cub: Cuda unbound. URL: http://nvlabs. github. io/cub (2016).
[27]
Joseph Myre, Stuart DC Walsh, D Lilja, and Martin O Saar. 2011. Performance analysis of single-phase, multiphase, and multicomponent lattice-Boltzmann fluid flow simulations on GPU clusters. Concurrency and Computation: Practice and Experience, Vol. 23, 4 (2011), 332--350.
[28]
X. Nguyen, T. Hoang, H. Nguyen, K. Inoue, and C. Pham. 2018. An FPGA-Based Hardware Accelerator for Energy-Efficient Bitmap Index Creation. IEEE Access, Vol. 6 (2018), 16046--16059.
[29]
Ray P. Norris. 2010. Data Challenges for Next-generation Radio Telescopes. In Proceedings of the 2010 Sixth IEEE International Conference on e-Science Workshops (E-SCIENCEW '10). IEEE, 21--24.
[30]
Murat Sariyar, Andreas Borg, and Klaus Pommerening. 2011. Controlling false match rates in record linkage using extreme value theory. Journal of Biomedical Informatics, Vol. 44, 4 (2011), 648--654.
[31]
Y. Su, G. Agrawal, and J. Woodring. 2012. Indexing and Parallel Query Processing Support for Visualizing Climate Datasets. In 2012 41st International Conference on Parallel Processing. IEEE, 249--258.
[32]
Stanimire Tomov, Jack Dongarra, and Marc Baboulin. 2010a. Towards dense linear algebra for hybrid GPU accelerated manycore systems. Parallel Comput., Vol. 36, 5--6 (2010), 232--240.
[33]
Stanimire Tomov, Rajib Nath, Hatem Ltaief, and Jack Dongarra. 2010b. Dense linear algebra solvers for multicore with GPU accelerators. In 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW). IEEE, 1--8.
[34]
Nhat-Phuong Tran, Myungho Lee, and Dong Hoon Choi. 2015. Memory-efficient parallelization of 3D lattice Boltzmann flow solver on a GPU. In 2015 IEEE 22nd International Conference on High Performance Computing (HiPC). IEEE, 315--324.
[35]
Sebastiaan J. van Schaik and Oege de Moor. 2011. A Memory Efficient Reachability Data Structure Through Bit Vector Compression. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data (SIGMOD '11). 913--924.
[36]
Stuart DC Walsh, Martin O Saar, Peter Bailey, and David J Lilja. 2009. Accelerating geoscience and engineering system simulations on graphics hardware. Computers & Geosciences, Vol. 35, 12 (2009), 2353--2364.
[37]
Nicolas Weber and Michael Goesele. 2017. MATOG: array layout auto-tuning for CUDA . ACM Transactions on Architecture and Code Optimization (TACO), Vol. 14, 3 (2017), 28.
[38]
Kesheng Wu, Ekow J Otoo, and Arie Shoshani. 2002. Compressing bitmap indexes for faster search operations. In Proceedings 14th International Conference on Scientific and Statistical Database Management. IEEE, 99--108.
[39]
Kesheng Wu, Ekow J. Otoo, and Arie Shoshani. 2006. Optimizing bitmap indices with efficient compression. ACM Trans. Database Syst., Vol. 31, 1 (2006), 1--38.
[40]
K. Wu, E. J. Otoo, A. Shoshani, and H. Nordberg. 2001. Notes on design and implementation of compressed bit vectors . Technical Report LBNL/PUB-3161. Lawrence Berkeley National Laboratory.
[41]
Beytullah Yildiz, Kesheng Wu, Suren Byna, and Arie Shoshani. 2019. Parallel membership queries on very large scientific data sets using bitmap indexes. Concurrency and Computation: Practice and Experience (2019), e5157.

Cited By

View all
  • (2023)Workload-Aware Cache Management of Bitmap IndicesProceedings of the IEEE/ACM 10th International Conference on Big Data Computing, Applications and Technologies10.1145/3632366.3632386(1-10)Online publication date: 4-Dec-2023
  • (2023)Dependency Prediction of Long-Time Resource Uses in HPC EnvironmentIEEE Access10.1109/ACCESS.2023.334104611(141871-141888)Online publication date: 2023
  • (2022)A Statistical Performance Analysis of GPU WAH Range QueryingDisruptive Technologies for Big Data and Cloud Applications10.1007/978-981-19-2177-3_1(1-9)Online publication date: 2-Aug-2022
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
BDCAT '19: Proceedings of the 6th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies
December 2019
174 pages
ISBN:9781450370165
DOI:10.1145/3365109
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 December 2019

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. bitmap indices
  2. gpu
  3. range queries
  4. wah compression

Qualifiers

  • Research-article

Conference

UCC '19
Sponsor:

Acceptance Rates

Overall Acceptance Rate 27 of 93 submissions, 29%

Upcoming Conference

BDCAT '24

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)22
  • Downloads (Last 6 weeks)2
Reflects downloads up to 21 Sep 2024

Other Metrics

Citations

Cited By

View all
  • (2023)Workload-Aware Cache Management of Bitmap IndicesProceedings of the IEEE/ACM 10th International Conference on Big Data Computing, Applications and Technologies10.1145/3632366.3632386(1-10)Online publication date: 4-Dec-2023
  • (2023)Dependency Prediction of Long-Time Resource Uses in HPC EnvironmentIEEE Access10.1109/ACCESS.2023.334104611(141871-141888)Online publication date: 2023
  • (2022)A Statistical Performance Analysis of GPU WAH Range QueryingDisruptive Technologies for Big Data and Cloud Applications10.1007/978-981-19-2177-3_1(1-9)Online publication date: 2-Aug-2022
  • (2021)Caching Support for Range Query Processing on Bitmap IndicesProceedings of the 33rd International Conference on Scientific and Statistical Database Management10.1145/3468791.3468800(49-60)Online publication date: 6-Jul-2021
  • (2020)Parallel acceleration of CPU and GPU range queries over large data setsJournal of Cloud Computing10.1186/s13677-020-00191-w9:1Online publication date: 5-Aug-2020
  • (2020)Exploring Means to Enhance the Efficiency of GPU Bitmap Index Query ProcessingData Science and Engineering10.1007/s41019-020-00148-86:2(209-228)Online publication date: 30-Nov-2020
  • (2020)Increasing the Efficiency of GPU Bitmap Index Query ProcessingDatabase Systems for Advanced Applications10.1007/978-3-030-59419-0_21(339-355)Online publication date: 22-Sep-2020

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media