Administering Spark
Patrick Wendell
Databricks
Outline
Spark components
Cluster managers
Hardware & configuration
Linking with Spark
Monitoring and measuring
Spark application
Driver program: a Java program that creates a SparkContext; decides when to launch tasks on which executor
Executors: worker processes that execute tasks and store data
Cluster manager: grants executors to a Spark application
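A minimal driver sketch tying these pieces together (Spark 0.8-era Scala API; the master URL, app name, and paths are placeholders, not from the talk):

import org.apache.spark.SparkContext

object MyApp {
  def main(args: Array[String]) {
    // The driver creates the SparkContext, which asks the
    // cluster manager for executors
    val sc = new SparkContext(
      "spark://master-host:7077",  // cluster manager URL (placeholder)
      "MyApp",                     // application name
      "/path/to/spark",            // Spark home on worker nodes
      Seq("target/my-app.jar"))    // JARs shipped to executors
    // Each action (here, count) launches tasks on the executors
    println(sc.textFile("hdfs://namenode/data.txt").count())
    sc.stop()
  }
}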
Types of Applications
Long lived/shared applications
Shark, Spark Streaming, Job Server (Ooyala)
May do multi-user scheduling within their allocation from the cluster manager
Short lived applications
Standalone apps, shell sessions
Outline
Spark components
Cluster managers
Hardware & configuration
Linking with Spark
Monitoring and measuring
Cluster Managers
Several ways to deploy Spark
1. Standalone mode (on-site)
2. Standalone mode (EC2)
3. YARN
4. Mesos
5. SIMR [not covered in this talk]
Standalone Mode
Bundled with Spark
Standalone Mode
1. (Optional) describe amount of
resources in conf/spark-env.sh
- SPARK_WORKER_CORES
- SPARK_WORKER_MEMORY
2. List slaves in conf/slaves
3. Copy configuration to slaves
4. Start/stop using
./bin/start-all.sh and ./bin/stop-all.sh
Standalone Mode
Some support for inter-application
scheduling
Set spark.cores.max to limit # of
cores each application can use
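For example (a sketch in 0.8's system-property style, set before the SparkContext is created; the host and value are placeholders):

import org.apache.spark.SparkContext

// Cap this application at 4 cores across the standalone cluster
System.setProperty("spark.cores.max", "4")
val sc = new SparkContext("spark://master-host:7077", "CappedApp")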
EC2 Deployment
Launcher bundled with Spark
Create cluster in 5 minutes
Sizes cluster for any EC2
instance type and # of nodes
Used widely by Spark team for
internal testing
EC2 Deployment
./spark-ec2
-t [instance-type]
-k [key-name]
-i [path-to-key-file]
-s [num-slaves]
-r [ec2-region]
--spot-price=[spot-price]
launch [cluster-name]
EC2 Deployment
Creates:
Spark Standalone cluster at
<ec2-master>:8080
HDFS cluster at
<ec2-master>:50070
MapReduce cluster at
<ec2-master>:50030
Apache Mesos
General-purpose cluster manager that
can run Spark, Hadoop MR, MPI, etc
Simply pass mesos://<master-url> to
SparkContext
Optional: set spark.executor.uri to a
pre-built Spark package in HDFS,
created by make-distribution.sh
Coarse-grained mode:
Apps get static CPU and memory allocations
Better predictability and latency, possibly at the cost of utilization
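A sketch of the Mesos setup described above (URLs and paths are placeholders):

import org.apache.spark.SparkContext

// Executors fetch this pre-built package (output of make-distribution.sh)
System.setProperty("spark.executor.uri", "hdfs://namenode/spark-0.8.0.tar.gz")
// Opt in to coarse-grained mode (fine-grained is the default)
System.setProperty("spark.mesos.coarse", "true")
val sc = new SparkContext("mesos://mesos-master:5050", "MesosApp")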
Hadoop YARN
In Spark 0.8.0:
Runs standalone apps only, launching driver
inside YARN cluster
YARN 0.23 to 2.0.x
Coming in 0.8.1:
Interactive shell
YARN 2.2.x support
Support for hosting Spark JAR in HDFS
YARN Steps
See the YARN launch instructions in the docs linked below
More Info
http://spark.incubator.apache.org/docs/latest/cluster-overview.html
Detailed docs for each of standalone mode, Mesos, YARN, and EC2
Outline
Spark components
Cluster managers
Hardware & configuration
Linking with Spark
Monitoring and measuring
Local Disks
Spark uses disk for writing
shuffle data and paging out
RDDs
Ideally have several disks per
node in JBOD configuration
Set spark.local.dir with comma-separated disk locations
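For example (a sketch; the mount points are placeholders, and on a cluster this property is typically set per worker, e.g. via SPARK_JAVA_OPTS in conf/spark-env.sh):

// Spread shuffle files and paged-out RDD blocks across three JBOD disks
System.setProperty("spark.local.dir",
  "/mnt/disk1/spark,/mnt/disk2/spark,/mnt/disk3/spark")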
Memory
Recommend 8GB heap and up
Generally, more is better
For massive (>200GB) heaps you
may want to increase # of
executors per node (see
SPARK_WORKER_INSTANCES)
Network/CPU
For in-memory workloads,
network and CPU are often the
bottleneck
Ideally use 10Gb Ethernet
Spark works well on machines with many cores, since tasks run in parallel
Environment-related configs
spark.executor.memory
How much memory to request from the cluster manager
spark.local.dir
Where Spark stores shuffle files
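For example (a sketch in the 0.8 property style; the value is illustrative and must be set before the SparkContext is created):

// Ask the cluster manager for 4 GB of heap per executor
System.setProperty("spark.executor.memory", "4g")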
Outline
Cluster components
Deployment options
Hardware & configuration
Linking with Spark
Monitoring and measuring
Created using Maven or sbt assembly
Hadoop Versions
Distribution  Release  Hadoop version (for linking)
CDH           4.X.X    2.0.0-mr1-cdh4.X.X (MRv1) or 2.0.0-cdh4.X.X (YARN)
CDH           3uX      0.20.2-cdh3uX
HDP           1.3      1.2.0
HDP           1.2      1.1.2
HDP           1.1      1.0.3
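An sbt sketch of linking an application against Spark (the spark-core coordinates are for the 0.8.0 release; the hadoop-client version shown is just an example, match it to the table above):

// build.sbt sketch (versions are examples)
libraryDependencies += "org.apache.spark" %% "spark-core" % "0.8.0-incubating"
// Match hadoop-client to your cluster, per the table above
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.0.0-mr1-cdh4.2.0"
// CDH artifacts are served from the Cloudera repository
resolvers += "Cloudera Repository" at
  "https://repository.cloudera.com/artifactory/cloudera-repos/"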
Outline
Spark components
Cluster managers
Hardware & configuration
Linking with Spark
Monitoring and measuring
Monitoring
Cluster Manager UI
Executor Logs
Spark Driver Logs
Application Web UI
Spark Metrics
Cluster Manager UI
Standalone mode: <master>:8080
Mesos, YARN have their own UIs
Executor Logs
Stored by cluster manager on each
worker
Default location in standalone mode:
/path/to/spark/work
Application Web UI
http://spark-application-host:4040
(or use spark.ui.port to configure the port)
For executor / task / stage / memory
status, etc
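For example (a sketch; the port value is arbitrary):

// Serve the application UI on 4041 instead of the default 4040
System.setProperty("spark.ui.port", "4041")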
UI screenshots: Executors page, Environment page, stage information, task breakdown
App UI Features
Stages show where each operation
originated in code
All tables sortable by task length,
locations, etc
Metrics
Configurable metrics based on Coda Hale's Metrics library
Many Spark components can report metrics (driver, executor, application)
Outputs: CSV, Ganglia, JMX, and a REST/JSON servlet
Metrics
More details:
http://spark.incubator.apache.org/docs/latest/monitoring.html
More Information
Official docs:
http://spark.incubator.apache.org/docs/latest