Big Data and HPC Convergence: The Cutting Edge and Outlook

  • Conference paper
Smart Societies, Infrastructure, Technologies and Applications (SCITA 2017)

Abstract

Data volumes have grown on a massive scale over the last couple of decades, and the challenges associated with big data have grown with them. The issues raised by this avalanche of data are immense and span a variety of challenges that need careful consideration. The use of High Performance Data Analytics (HPDA) is increasing at a brisk pace in many industries, which has expanded the HPC market into these new territories. HPC and big data are different systems, not only at the technical level; they also have different ecosystems. Workloads are diverse enough, and performance sensitivity high enough, that we cannot settle for solutions that are globally optimal but locally highly sub-optimal across all the issues related to the convergence of big data and HPC. As we head towards exascale systems, the necessary integration of big data and HPC is a hot topic of current research, but it is still in its infancy. The two systems have different architectures, and their integration brings many challenges. The main aim of this paper is to identify the driving forces, challenges, and current and future trends associated with the integration of HPC and big data. We also propose an architecture for big data and HPC convergence using design patterns.

Acknowledgments

The authors gratefully acknowledge the technical and financial support of the Deanship of Scientific Research (DSR) at King Abdul-Aziz University (KAU), Jeddah, Saudi Arabia, under grant number G-661-611-38. The work carried out in this paper is supported by the HPC Center at King Abdul-Aziz University.

Author information

Corresponding author

Correspondence to Sardar Usman.

Copyright information

© 2018 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Usman, S., Mehmood, R., Katib, I. (2018). Big Data and HPC Convergence: The Cutting Edge and Outlook. In: Mehmood, R., Bhaduri, B., Katib, I., Chlamtac, I. (eds) Smart Societies, Infrastructure, Technologies and Applications. SCITA 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 224. Springer, Cham. https://doi.org/10.1007/978-3-319-94180-6_4

  • DOI: https://doi.org/10.1007/978-3-319-94180-6_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-94179-0

  • Online ISBN: 978-3-319-94180-6

  • eBook Packages: Computer Science (R0)
