High Availability and Fault Tolerance

Module 11

© 2011 VMware Inc. All rights reserved


You Are Here

Course Introduction
Introduction to Virtualization
Virtual Machines
VMware vCenter Server
Configure and Manage Virtual Networks
Configure and Manage Virtual Storage
Managing Virtual Machines
Data Protection
Access & Authentication Control
Resource Management and Monitoring
High Availability
Scalability
Patch Management
Installing vSphere Components

VMware vSphere: Install, Configure, Manage – Revision A 11-2



Importance

Most organizations rely on computer-based services like email,
databases, and Web-based applications. The failure of any of these
services can mean lost productivity and revenue. Configuring highly
available, computer-based services is extremely important for an
organization to remain competitive in contemporary business
environments.

With VMware vSphere® 5, a new high availability architecture has
been released.



Module Lessons

Lesson 1: Introduction to vSphere High Availability
Lesson 2: Configuring vSphere High Availability
Lesson 3: vSphere High Availability Architecture
Lesson 4: Introduction to Fault Tolerance



Lesson 1:
Introduction to vSphere High Availability



Learner Objectives

After this lesson, you should be able to do the following:


 Describe the various options that you can configure to ensure high
availability in a vSphere 5 environment.
 Discuss the response of vSphere High Availability when a
VMware® ESXi™ host, a virtual machine, or an application fails.



VMware Offers Protection at Every Level

 Protection against hardware failures
 Planned maintenance with zero downtime
 Protection against unplanned downtime and disasters

Level        Protection
Component    NIC teaming, storage multipathing
Server       VMware vSphere® vMotion®, DRS; High Availability & Fault Tolerance
Storage      vSphere Storage vMotion
Data         3rd-party backup solutions, VMware Data Recovery
Site         Site Recovery Manager



vCenter Server Availability - Recommendations

Make VMware vCenter Server™ and the components it relies on
highly available.
vCenter Server relies on:
 vCenter Server database:
• Cluster the database. Refer to the specific database documentation.
 Active Directory structure:
• Set up with multiple redundant servers.
Methods for making vCenter Server available:
 Use vSphere High Availability to protect the vCenter Server virtual
machine.
 Use VMware vCenter Server Heartbeat™.



High Availability

A highly available system is one that is continuously operational for
a desirably long length of time.

What level of virtual machine availability is important to you?

Level of availability    Downtime per year
99%                      87.6 hours (about 3.65 days)
99.9%                    8.76 hours
99.99%                   About 52 minutes
99.999%                  About 5 minutes
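
The downtime figures in the table follow from simple arithmetic on a 365-day year. The sketch below shows the calculation; the function name and the constant are chosen here for illustration only.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the yearly downtime, in hours, implied by an availability level."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for level in (99.0, 99.9, 99.99, 99.999):
        hours = downtime_hours_per_year(level)
        print(f"{level}% -> {hours:.2f} hours ({hours * 60:.1f} minutes)")
```

Each additional "nine" divides the allowed downtime by ten, which is why the step from 99.9% to 99.99% moves the budget from hours to minutes.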



vSphere High Availability

vSphere HA

Level of availability:              High availability
Amount of downtime:                 Minimal
Guest operating systems supported:  Works with all supported guest operating systems
VMware ESXi hardware supported:     Works with all supported ESXi hardware
Uses:                               Use to provide high availability for the virtual
                                    machines that require that level of protection.



vSphere HA Failure Scenarios

 ESXi host failure


 Guest OS failure
 Application failure



High Availability Failure Scenario - Host

When a host fails, vSphere HA restarts the affected virtual machines
on other hosts.

[Figure: a vSphere HA cluster of three ESXi hosts managed by vCenter
Server, with storage on LUN 1, LUN 2, and LUN 3. The hosts run virtual
machines A through F; virtual machines A and B, from the failed host,
are shown restarted on the other hosts.]



High Availability Failure Scenario – Guest Operating System

When a virtual machine stops sending heartbeats or the virtual
machine process (vmx) crashes, vSphere HA resets the virtual machine.

[Figure: a vSphere HA cluster of three ESXi hosts managed by vCenter
Server, with storage on LUN 1, LUN 2, and LUN 3. Virtual machines A
through F each run VMware Tools.]



HA Failure Scenario - Application

When an application fails, vSphere HA restarts the affected virtual
machine on the same host. Requires VMware Tools to be installed.

[Figure: a vSphere HA cluster of three ESXi hosts managed by vCenter
Server, with storage on LUN 1, LUN 2, and LUN 3. Virtual machines A
through F each run an application.]



Review of Learner Objectives

You should be able to do the following:


 Describe the various options that you can configure to ensure high
availability in a vSphere 5 environment.
 Discuss the response of vSphere High Availability when an ESXi
host, a virtual machine, or an application fails.



Lesson 2:
Configuring vSphere High Availability



Learner Objectives

After this lesson, you should be able to do the following:


 Configure a vSphere HA cluster.



Enabling vSphere HA

Enable vSphere HA by creating a cluster or modifying a vSphere
Distributed Resource Scheduler (DRS) cluster.



Configuring vSphere HA Settings

Disable Host Monitoring when performing maintenance on any cluster or
host. Enabled is the default setting.

Admission Control refers to the amount of available resources that
can be used to start virtual machines on a specific ESXi host. The
default setting is to disallow power-on and other operations that
would violate the set Admission Control Policy.

Admission control helps ensure sufficient resources to provide high
availability. The default setting is Host failures the cluster
tolerates.

VMware recommended setting



Admission Control Policy Choices

Policy:           Host failures cluster tolerates
Description:      Reserves enough resources to tolerate specified number
                  of host failures
Recommended use:  When virtual machines have similar CPU/memory
                  reservations and similar memory overheads

Policy:           Percentage of cluster resources reserved as failover
                  spare capacity
Description:      Reserves specified percentage of total capacity
Recommended use:  When virtual machines have highly variable CPU and
                  memory reservations

Policy:           Specify a failover host
Description:      Dedicates a host exclusively for failover service
Recommended use:  To accommodate organizational policies that dictate
                  the use of a passive failover host
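
The percentage-based policy can be pictured as a check made before each power-on. The sketch below is a simplified model, not the product's actual accounting: it assumes that only CPU and memory reservations count, and the function names and dictionary keys are hypothetical.

```python
def failover_capacity_ok(total: int, reserved: int, reserved_percent: int) -> bool:
    """True if the unreserved share of `total` is at least `reserved_percent`.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    return (total - reserved) * 100 >= reserved_percent * total


def can_power_on(cluster: dict, vm: dict, reserved_percent: int) -> bool:
    """Would powering on `vm` keep the failover spare capacity intact?

    `cluster` and `vm` use hypothetical keys chosen for this sketch.
    """
    cpu_ok = failover_capacity_ok(
        cluster["cpu_mhz"],
        cluster["cpu_reserved_mhz"] + vm["cpu_mhz"],
        reserved_percent,
    )
    mem_ok = failover_capacity_ok(
        cluster["mem_mb"],
        cluster["mem_reserved_mb"] + vm["mem_mb"],
        reserved_percent,
    )
    # Both resources must keep the configured percentage free.
    return cpu_ok and mem_ok


if __name__ == "__main__":
    cluster = {"cpu_mhz": 10000, "cpu_reserved_mhz": 6000,
               "mem_mb": 32768, "mem_reserved_mb": 16384}
    print(can_power_on(cluster, {"cpu_mhz": 1000, "mem_mb": 4096}, 25))
    print(can_power_on(cluster, {"cpu_mhz": 2000, "mem_mb": 4096}, 25))
```

In this model a virtual machine with a large reservation is refused as soon as either resource would drop below the reserved percentage, which is why the policy suits clusters whose virtual machines have highly variable reservations.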



Configuring Virtual Machine Options

Configure options at the cluster level or per virtual machine.

VM restart priority determines the relative order in which virtual
machines are restarted after a host failure.

Host Isolation response determines what happens to virtual machines
when a host loses the management network but continues running.
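
The interplay between a cluster-level default and per-virtual-machine overrides can be sketched as follows. The priority names (high, medium, low, disabled) and the data layout are assumptions made for this illustration; the sketch only orders the restarts and does not model how vSphere HA performs them.

```python
# Assumed priority levels for this sketch; lower number restarts first.
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def restart_order(vms: dict, cluster_default: str = "medium") -> list:
    """Return VM names in the order they would be restarted.

    `vms` maps a VM name to its per-VM priority, or to None when the VM
    inherits the cluster-level default. VMs whose effective priority is
    "disabled" are left out entirely.
    """
    effective = {name: (prio or cluster_default) for name, prio in vms.items()}
    candidates = [name for name, prio in effective.items() if prio != "disabled"]
    # Sort by priority first, then by name so the result is deterministic.
    return sorted(candidates, key=lambda name: (PRIORITY_ORDER[effective[name]], name))


if __name__ == "__main__":
    vms = {"web": None, "db": "high", "test": "disabled", "batch": "low"}
    print(restart_order(vms))
```

Here "web" inherits the cluster default, "db" is restarted first because of its per-VM override, and "test" is never restarted.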



Configuring Virtual Machine Monitoring

Reset a virtual machine if its VMware Tools heartbeat or VMware Tools
application heartbeats are not received.

Determine how quickly failures are detected.

Set monitoring sensitivity for individual virtual machines.
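
Monitoring sensitivity can be thought of as a timeout on missing heartbeats plus a cap on repeated resets. The sketch below illustrates that idea only: the class, its fields, and the values used (30 seconds, 3 resets, a one-hour window) are assumptions for illustration, not settings taken from this module.

```python
from dataclasses import dataclass, field


@dataclass
class VmMonitor:
    failure_interval_s: float  # assumed: silence longer than this counts as a failure
    max_resets: int            # assumed: resets allowed inside one reset window
    reset_window_s: float      # assumed: length of the reset window, in seconds
    last_heartbeat: float = 0.0
    reset_times: list = field(default_factory=list)

    def heartbeat(self, now: float) -> None:
        """Record a heartbeat received at time `now` (seconds)."""
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Return True if the virtual machine should be reset at time `now`."""
        if now - self.last_heartbeat <= self.failure_interval_s:
            return False  # heartbeats are still arriving in time
        recent = [t for t in self.reset_times if now - t < self.reset_window_s]
        if len(recent) >= self.max_resets:
            return False  # reset budget for this window is used up
        self.reset_times = recent + [now]
        self.last_heartbeat = now  # restart the timer after the reset
        return True


if __name__ == "__main__":
    monitor = VmMonitor(failure_interval_s=30, max_resets=3, reset_window_s=3600)
    monitor.heartbeat(0)
    print(monitor.check(10), monitor.check(31))
```

A shorter failure interval detects failures sooner but raises the risk of resetting a virtual machine that is merely slow to respond.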



Importance of Redundant Heartbeat Networks

In a vSphere HA cluster, heartbeats are:


 Sent between the master and the slave hosts
 Used to determine if a master or slave host has failed
 Sent over a heartbeat network
The heartbeat network is:
 Implemented using a VMkernel port marked for management
Redundant heartbeat networks:
 Allow for the reliable detection of failures



Redundancy Using NIC Teaming

You can use NIC teaming to create a redundant heartbeat network
on ESXi hosts.
Both port groups must be VMkernel ports.

[Figure: NIC teaming on an ESXi host]



Redundancy Using Additional Networks

You can also create redundancy by configuring more heartbeat networks:
 On ESXi hosts, add one or more VMkernel networks marked for
management traffic.
Configure port group with these settings:
 Set Load Balancing to originating port ID.
 Do not enable Failback.
 Configure port group with active/standby failover.



Network Configuration and Maintenance

Before changing the networking configuration on the ESXi hosts
(adding port groups, removing vSwitches):
 Deselect Enable Host Monitoring.
 Place the host in maintenance mode.
These steps prevent unwanted attempts to fail over virtual
machines.



Cluster Resource Allocation Tab

How much of its CPU and memory resources is the cluster using now?
How much reserved capacity remains?



Monitoring Cluster Status
[Figure: the cluster's Summary tab]

The vSphere HA Cluster Status window displays details about host
operational status, virtual machine protection, and heartbeat
datastores.

The Configuration Issues window displays the current vSphere HA
operational status, including the specific status and errors for each
master and slave host in the cluster.



Lab 18

In this lab, you will modify slot sizes and admission control.
1. Create a cluster enabled for vSphere HA.
2. Add your ESXi host to a cluster.
3. Test vSphere HA functionality.
4. Prepare for the next lab.



Review of Learner Objectives

You should be able to do the following:


 Configure a vSphere HA cluster.



Lesson 3:
vSphere High Availability Architecture



Learner Objectives

After this lesson, you should be able to do the following:


 Describe heartbeat mechanisms used by vSphere HA.
 Identify and discuss additional failure scenarios.



vSphere HA Architecture: Agent Communication

[Figure: three ESXi hosts (two slaves and one master), each attached
to a datastore and each running the FDM, vpxa, and hostd agents.
vCenter Server runs vpxd. The hosts and vCenter Server are connected
by the management network.]


vSphere HA Architecture: Network Heartbeats

[Figure: three ESXi hosts (two slaves and one master) running virtual
machines A through F, with NAS/NFS, VMFS, and local storage. The hosts
and vCenter Server are connected by Management network 1 and
Management network 2.]



vSphere HA Architecture: Datastore Heartbeats

[Figure: three ESXi hosts (slave, master, slave) running virtual
machines A through F, with NAS/NFS, VMFS, and local storage. The hosts
and vCenter Server are connected by Management network 1 and
Management network 2. A screenshot shows the Cluster Edit Settings
window.]



Additional HA Failure Scenarios

 Slave host failure


 Master host failure
 Host isolation
 Management network failures
• Network partition
• Network isolation



Failed Slave Host

[Figure: three ESXi hosts (slave, master, slave) running virtual
machines A through F. The state of the first slave host is unknown
("?"). The NAS/NFS datastore holds a lock file and the VMFS datastore
holds a heartbeat region; both carry file locks. The hosts and vCenter
Server are connected by a primary heartbeat network and an alternate
heartbeat network.]



Failed Master Host

[Figure: three ESXi hosts running virtual machines A through F, with
these roles and managed object IDs: slave (MOID 98), master (MOID 99),
and slave (MOID 100). The state of the master host is unknown ("?").
The NAS/NFS datastore holds a lock file and the VMFS datastore holds a
heartbeat region; both carry file locks. The hosts and vCenter Server
are connected by a primary heartbeat network and an alternate
heartbeat network. The default gateway serves as the isolation
address.]

MOID = managed object ID



Isolated Host

If a host is not observing any election traffic on the management
network and cannot ping its isolation address(es), the host is
isolated.

[Figure: three ESXi hosts running virtual machines A through F, with
the default gateway serving as the isolation address.]
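
The isolation test described above combines two conditions. The sketch below expresses them as a single function; the name and parameters are hypothetical, and the reading that one reachable isolation address is enough to rule out isolation is an assumption of this sketch.

```python
from typing import Iterable


def is_isolated(sees_election_traffic: bool, isolation_ping_ok: Iterable[bool]) -> bool:
    """Apply the two isolation conditions from the host's own point of view.

    `isolation_ping_ok` holds one ping result per configured isolation
    address (True means the address answered).
    """
    if sees_election_traffic:
        return False  # other vSphere HA agents are still visible on the network
    # Isolated only when no isolation address answers at all.
    return not any(isolation_ping_ok)


if __name__ == "__main__":
    print(is_isolated(False, [False, False]))
    print(is_isolated(False, [False, True]))
```

The second call shows why redundant isolation addresses help: a single unreachable gateway no longer causes the host to declare itself isolated.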



Design Considerations

Host isolation events can be minimized through good design:
 Implement redundant heartbeat networks.
 Implement redundant isolation addresses.
If host isolation events do occur, good design enables vSphere HA
to determine whether the isolated host is still alive:
 Implement datastores so that they are separated from the
management network using one or both of the following approaches:
• Fibre Channel over fibre optic
• Physically separating your IP storage network from the management
network



Network Partition

[Figure: four ESXi hosts running virtual machines A through H, with
vCenter Server and the default gateway (isolation address). The hosts
are labeled MASTER, SLAVE, SLAVE, and SLAVE, and a second MASTER label
appears below them.]



Review of Learner Objectives

You should be able to do the following:


 Describe heartbeat mechanisms used by vSphere HA
 Identify and discuss additional failure scenarios



Lesson 4:
Introduction to Fault Tolerance



Learner Objectives

After this lesson, you should be able to do the following:


 List Fault Tolerance requirements and limitations.
 Describe Fault Tolerance operation.



What Is Fault Tolerance (FT)?

FT:
 A fault-tolerant system is designed so that, in the event of an
unplanned outage, a backup virtual machine can immediately take
over with no loss of service. (The backup virtual machine is called a
secondary virtual machine.)
• Provides a higher level of business continuity than vSphere HA
• Provides zero downtime and zero data loss for applications
FT can be used for any application that needs to be available at all
times.
FT can be used with DRS:
 Fault-tolerant virtual machines benefit from better initial placement
and are included in the cluster’s load-balancing calculations.



VMware Fault Tolerance

Fault Tolerance

Level of availability:              Fault tolerance
Amount of downtime:                 Zero
Guest operating systems supported:  Works with all supported guest operating systems
ESXi hardware supported:            Widely compatible
Uses:                               Use to provide fault tolerance to your critical
                                    virtual machines.



Fault Tolerance in Action

FT provides zero-downtime, zero-data-loss protection to
virtual machines in a vSphere HA cluster.

[Figure: a primary VM and a secondary VM linked by vLockstep
technology, followed by a new primary VM and a new secondary VM, also
linked by vLockstep technology.]



Fault Tolerance Guidelines

Check the requirements and limitations of FT.
Ensure enough ESXi hosts for fault-tolerant virtual machines:
 No more than four fault-tolerant virtual machines (primaries or
secondaries) on any single host
Store ISOs on shared storage for continuous access:
 Especially if used for important operations
Disable BIOS-based power management:
 Prevents the secondary virtual machine from having insufficient CPU
resources
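
The per-host guideline above is easy to verify against a planned placement. The sketch below counts fault-tolerant virtual machines per host; the function name and the placement mapping are hypothetical and do not query a real vSphere environment.

```python
from collections import Counter
from typing import List, Mapping

MAX_FT_VMS_PER_HOST = 4  # from the guideline above


def hosts_over_ft_limit(placements: Mapping[str, str]) -> List[str]:
    """Return the hosts that carry more FT virtual machines than the guideline.

    `placements` maps each FT virtual machine (primary or secondary) to
    the name of the host it runs on.
    """
    counts = Counter(placements.values())
    return sorted(host for host, n in counts.items() if n > MAX_FT_VMS_PER_HOST)


if __name__ == "__main__":
    planned = {
        "app1-primary": "esxi01", "app2-primary": "esxi01",
        "app3-primary": "esxi01", "app4-primary": "esxi01",
        "app5-primary": "esxi01", "app1-secondary": "esxi02",
    }
    print(hosts_over_ft_limit(planned))
```

Because primaries and secondaries both count toward the limit, the number of hosts needed grows with every protected virtual machine.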



Enabling Fault Tolerance on a Virtual Machine



Review of Learner Objectives

You should be able to do the following:


 List Fault Tolerance requirements and limitations.
 Describe Fault Tolerance operation.



Key Points

 vSphere HA restarts virtual machines on the remaining hosts in the
cluster.
 Hosts in vSphere HA clusters have a master/slave relationship.
 Implement redundant heartbeat networks either with NIC teaming or
by creating additional heartbeat networks.
 FT provides zero downtime for applications that need to be available
at all times.
