Approaches of enhancing interoperations among high performance computing and big data analytics via augmentation

Published: 01 June 2020

Abstract

The dawn of exascale computing and its convergence with big data analytics have greatly spurred research interest. The reasons are straightforward. Traditionally, high performance computing (HPC) systems have been used for scientific applications dominated by compute-intensive tasks. At the same time, the proliferation of big data has driven the design of data-intensive processing paradigms such as the Apache big data stack. Big data generated at a high pace demands faster processing mechanisms that can deliver insights in real time, and HPC systems may serve as a panacea for these big data problems. Although HPC systems are capable of producing promising results for big data workloads, integrating them directly with existing data-intensive frameworks such as the Apache big data stack is not straightforward, owing to the challenges associated with them. This has triggered research on seamlessly integrating the two paradigms through interoperable frameworks, programming models, and system architectures. The aim of this paper is to assess the progress made in the HPC world towards augmenting it with big data analytics support. As an outcome, a taxonomy of the factors to be considered when augmenting HPC systems with big data support is put forth. The paper sheds light on how big data frameworks can be ported to HPC platforms as a preliminary step towards the convergence of the big data and exascale computing ecosystems. The focus is on research issues related to augmenting HPC paradigms with big data frameworks and on the corresponding approaches to address those issues. The paper also discusses data-intensive as well as compute-intensive processing paradigms, benchmark suites and workloads, and future directions in the domain of integrating HPC with big data analytics.
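To make the porting idea concrete, the sketch below expresses a MapReduce-style word count using MPI collectives via mpi4py. It is a minimal illustration under stated assumptions, not code from the paper or from any framework it surveys: the sample input, the script name wordcount_mpi.py, and the launch line are invented for the example, and mpi4py must be installed. It shows how the map/combine/reduce pattern of data-intensive stacks can be expressed directly on top of an HPC communication library.

# Minimal sketch (assumptions: mpi4py is installed; launch with
# `mpiexec -n 4 python wordcount_mpi.py`, a hypothetical file name).
from collections import Counter

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    # Root splits the input into one chunk per rank ("input splits").
    text = "big data meets hpc big compute meets big data".split()
    chunks = [text[i::size] for i in range(size)]
else:
    chunks = None

# Scatter plays the role of distributing input splits to workers.
words = comm.scatter(chunks, root=0)

# Map + local combine: each rank counts the words it holds.
local_counts = Counter(words)

# Reduce: mpi4py applies Python '+' for MPI.SUM on pickled objects,
# and Counter addition merges the partial counts into a global result.
totals = comm.reduce(local_counts, op=MPI.SUM, root=0)

if rank == 0:
    print(dict(totals))

The integration efforts surveyed in the paper tackle the same mapping at far larger scale, adding schedulers, storage layers, and fault tolerance on top; the sketch only captures the core communication pattern shared by the two paradigms.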


Cited By

• (2020) Security assurance of MongoDB in singularity LXCs: an elastic and convenient testbed using Linux containers to explore vulnerabilities. Cluster Computing 23(3), 1955–1971. https://doi.org/10.1007/s10586-020-03154-7. Online publication date: 25 July 2020.


          Published In

Cluster Computing, Volume 23, Issue 2
June 2020, 1079 pages

          Publisher

Kluwer Academic Publishers, United States

          Publication History

          Published: 01 June 2020
          Accepted: 17 July 2019
          Revision received: 23 April 2019
          Received: 22 October 2018

          Author Tags

          1. Big data
          2. High performance data analytics
          3. High performance computing

          Qualifiers

          • Research-article
