Apache Spark with HDFS cluster within Kubernetes
-
Updated
Jul 11, 2023 - Python
Apache Spark with HDFS cluster within Kubernetes
Jupyter notebook server prepared for running Spark with Scala kernels on a remote Spark master
Hadoop in docker cluster, created by docker-compose. Create Hadoop cluster in less than 5mins.
Simplified Hadoop Setup and Configuration Automation
News Sentiment Analysis using ETL pipeline
Hdfs Block Storage System
In this project we have used comments from reddit to play around with multiple functionalities of Apache Spark, HDFS and Docker.
Your go-to-cheatsheet to learn apache-Hadoop.
Modern Big Data Analysis: recommend which pair of United States airports should be connected with a high-speed passenger rail tunnel.
This project sets up a Hadoop (v3.2.3) cluster on a virtual machine (Multipass) on macOS. It includes instructions for configuring HDFS, YARN, and uploading files via command-line and web interface.
An example of installation Apache Spark on AWS
Add a description, image, and links to the hdfs-cluster topic page so that developers can more easily learn about it.
To associate your repository with the hdfs-cluster topic, visit your repo's landing page and select "manage topics."