IBM PowerHA
SystemMirror for AIX
Cookbook
Explore the most recent
enterprise-ready features
Dino Quintero
Shawn Bodily
Daniel J Martin-Corben
Reshma Prathap
Kulwinder Singh
Ashraf Ali Thajudeen
William Nespoli Zanatta
ibm.com/redbooks
International Technical Support Organization
October 2014
SG24-7739-01
Note: Before using this information and the product it supports, read the information in Notices on
page xi.
This edition applies to IBM AIX 7.1.3 TL1 and IBM PowerHA SystemMirror 7.1.3 SP1.
Copyright International Business Machines Corporation 2009, 2014. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule
Contract with IBM Corp.
Contents
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Now you can become a published author, too! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Stay connected to IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Part 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Chapter 3. Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.1 High availability planning. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.2 Planning for PowerHA 7.1.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.2.1 Planning strategy and example. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.2.2 Planning tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.2.3 Getting started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.2.4 Current environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.2.5 Addressing single points of failure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.2.6 Initial cluster design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3.2.7 Completing the cluster overview planning worksheet . . . . . . . . . . . . . . . . . . . . . . 79
3.3 Planning cluster hardware. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.3.1 Overview of cluster hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.3.2 Completing the cluster hardware planning worksheet . . . . . . . . . . . . . . . . . . . . . 81
3.4 Planning cluster software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.4.1 AIX and RSCT levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.4.2 Virtual Ethernet and vSCSI support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.4.3 Required AIX file sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.4.4 PowerHA 7.1.3 file sets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.4.5 AIX files altered by PowerHA 7.1.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3.4.6 Application software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.4.7 Licensing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.4.8 Completing the software planning worksheet . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.5 Operating system considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.6 Planning security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.6.1 Cluster security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.2.10 Create service IP labels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
4.2.11 Create resource group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
4.2.12 Add resources into resource group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
4.2.13 Verify and synchronize cluster configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . 148
4.3 Installing PowerHA SystemMirror for IBM Systems Director plug-in. . . . . . . . . . . . . . 149
8.5 Federated security for cluster-wide security management . . . . . . . . . . . . . . . . . . . . . 331
8.5.1 Federated security components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
8.5.2 Federated security configuration requirement. . . . . . . . . . . . . . . . . . . . . . . . . . . 333
8.5.3 Federated security configuration details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
NFS-exported file system or directory worksheet. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494
Application worksheet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
Application server worksheet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
Application monitor worksheet (custom) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
Resource group worksheet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
Cluster events worksheet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 498
Cluster file collections worksheet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not grant you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of
express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any
manner serve as an endorsement of those websites. The materials at those websites are not part of the
materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring
any obligation to you.
Any performance data contained herein was determined in a controlled environment. Therefore, the results
obtained in other operating environments may vary significantly. Some measurements may have been made
on development-level systems and there is no guarantee that these measurements will be the same on
generally available systems. Furthermore, some measurements may have been estimated through
extrapolation. Actual results may vary. Users of this document should verify the applicable data for their
specific environment.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore,
cannot guarantee or imply reliability, serviceability, or function of these programs.
The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:
AIX, AIX 5L, DB2, Domino, DS4000, DS8000, DYNIX/ptx, FileNet, Global Technology Services, GPFS, HACMP, HyperSwap, IBM, Lotus, POWER, Power Systems, POWER6, POWER7, POWER8, PowerHA, PowerVM, Redbooks, Redbooks (logo), Redpaper, RS/6000, Storwize, System i, System p, SystemMirror, Tivoli, WebSphere, XIV
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States,
other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Other company, product, or service names may be trademarks or service marks of others.
This IBM Redbooks publication can help you install, tailor, and configure the new IBM
PowerHA Version 7.1.3, and understand new and improved features such as migrations,
cluster administration, and advanced topics like configuring in a virtualized environment
including workload partitions (WPARs).
With this book, you can gain a broad understanding of the IBM PowerHA SystemMirror
architecture. If you plan to install, migrate, or administer a high availability cluster, this book is
right for you.
This book can help IBM AIX professionals who seek a comprehensive and task-oriented
guide for developing the knowledge and skills required for PowerHA cluster design,
implementation, and daily system administration. It provides a combination of theory and
practical experience.
This book is targeted toward technical professionals (consultants, technical support staff, IT
architects, and IT specialists) who are responsible for providing high availability solutions and
support with the IBM PowerHA SystemMirror Standard on IBM POWER systems.
Authors
This book was produced by a team of specialists from around the world working at the
International Technical Support Organization (ITSO), Poughkeepsie Center.
Dino Quintero is a complex solutions Project Leader and an IBM Senior Certified IT
Specialist with the ITSO in Poughkeepsie, NY. His areas of knowledge include enterprise
continuous availability, enterprise systems management, system virtualization, technical
computing, and clustering solutions. He is an Open Group Distinguished IT Specialist. Dino
holds a Master of Computing Information Systems degree and a Bachelor of Science degree
in Computer Science from Marist College.
Shawn Bodily is a Senior IT Consultant for Clear Technologies in Dallas, Texas. He has 20
years of AIX experience, with the last 17 years specializing in high availability and disaster
recovery, primarily focused on PowerHA. He is a double AIX Advanced Technical Expert,
and is certified in IBM Power Systems and IBM Storage. He has written and presented
extensively about high availability and storage at technical conferences, webinars, and onsite
to customers. He is an IBM Redbooks platinum author who has co-authored seven IBM
Redbooks publications and two IBM Redpaper publications.
Daniel J Martin-Corben is a Technical Solutions Designer for IBM UK and has been working
with UNIX since he was 18 years old. He has held various roles in the sector but has finally
returned to IBM. In the early days, he worked on IBM Sequent DYNIX/ptx as a Database
Administrator (DBA). Upon joining IBM, he had his first introduction to IBM AIX and HACMP
(PowerHA) and the pSeries hardware, which has dominated his prolific career. IBM
POWER8 is his current focus, but he has extensive experience with various types of
storage, including IBM V7000, XIV, and SAN Volume Controller. He has strong skills and
knowledge with all IBM systems, and also with Solaris, Symantec, HP-UX, VMware, and
Windows.
Ashraf Ali Thajudeen is an Infrastructure Architect with IBM Singapore Global Technology
Services Delivery, with more than eight years of experience in high availability and disaster
recovery architectures in UNIX environments. As an IBM Master Certified IT Specialist in
Infrastructure and Systems Management and TOGAF 9 Certified in Enterprise Architecture,
he has wide experience in designing, planning, and deploying PowerHA-based solutions
across ASEAN Strategic Outsourcing accounts. His areas of expertise include designing and
implementing PowerHA and Tivoli automation solutions.
William Nespoli Zanatta is an IT Specialist from IBM Global Technology Services in Brazil.
He has been with IBM for four years, supporting enterprise environments that run AIX and
Linux systems on POWER and System x. He has background experience with other UNIX
varieties and software development. His current areas of expertise include IBM PowerVM,
PowerHA, and GPFS.
Thanks to the following people for their contributions to this project:
Ella Buslovic
International Technical Support Organization, Poughkeepsie Center
David Bennin, Dan Braden, Noel Carroll, Mike Coffey, Richard Conway, Rick Cotter,
Gary Domrow, Steven Finnes, PI Ganesh, Steve Lang, Gary Lowther, Paul Moyer,
Steve Pittman, Ravi Shankar, Scot Stansell, Timothy Thornal, Tom Weaver, Wayne Wilcox
IBM USA
Thanks to the authors of the previous edition of this book, PowerHA for AIX Cookbook:
Scott Vetter, Shawn Bodily, Rosemary Killeen, Liviu Rosca
Find out more about the residency program, browse the residency index, and apply online at:
ibm.com/redbooks/residencies.html
We want our books to be as helpful as possible. Send us your comments about this book or
other IBM Redbooks publications in one of the following ways:
Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
Send your comments in an email to:
redbooks@us.ibm.com
Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
Summary of changes
This section describes the technical changes made in this edition of the book and in previous
editions. This edition might also include minor corrections and editorial changes that are not
identified.
Summary of Changes
for SG24-7739-01
for IBM PowerHA SystemMirror for AIX Cookbook
as created or updated on April 13, 2015.
New information
Includes information about the recent IBM PowerHA SystemMirror 7.1.3 for AIX
Includes information about Cluster Aware AIX (CAA)
Changed information
Several chapters were removed and updates were made to this new publication to
incorporate the latest improvements to PowerHA.
Part 1 Introduction
In Part 1, we provide an overview of PowerHA and describe the PowerHA components as part
of a successful implementation.
We also introduce the basic PowerHA management concepts, with suggestions and
considerations to ease the system administrator's job.
High availability solutions should eliminate single points of failure through appropriate design,
planning, selection of hardware, configuration of software, control of applications, a carefully
controlled environment, and change management discipline.
In short, we can define high availability as the process of ensuring, through the use of
duplicated or shared hardware resources, managed by a specialized software component,
that an application is available for use.
A short definition for cluster multiprocessing might be multiple applications running over
several nodes with shared or concurrent access to the data.
PowerHA is only one of several high availability technologies. It builds on increasingly
reliable operating systems, hot-swappable hardware, and increasingly resilient applications
by offering monitoring and automated response.
PowerHA version 7.1 and later rely heavily on the CAA infrastructure that is available in AIX
6.1 TL6 and AIX 7.1. CAA provides communication interfaces and monitoring for PowerHA,
and cluster-wide command execution through the CAA clcmd command.
PowerHA also provides disaster recovery functionality such as cross-site mirroring, IBM
HyperSwap, and Geographical Logical Volume Mirroring. These cross-site clustering
methods support PowerHA functionality between two geographic sites. Various methods exist
for replicating the data to remote sites. For more information, see IBM PowerHA SystemMirror
7.1.2 Enterprise Edition for AIX, SG24-8106.
Solution                     Downtime        Data availability        Cost
Stand-alone                  Days            From last backup         Basic hardware and software costs
Enhanced stand-alone         Hours           Until last transaction   Double the basic hardware cost
High availability clusters   Seconds         Until last transaction   Double hardware and additional services; more costs
Fault-tolerant computing     Zero downtime   No loss of data          Specialized hardware and software; very expensive
The highly available solution for IBM POWER systems offers distinct benefits:
Proven solution (more than 20 years of product development)
Use of off-the-shelf hardware components
Proven commitment for supporting our customers
IP version 6 (IPv6) support for both internal and external cluster communication
Smart Assist technology enabling high availability support for all prominent applications
Flexibility (virtually any application running on a stand-alone AIX system can be protected
with PowerHA)
1.2.1 Downtime
Downtime is the period when an application is not available to serve its clients. Downtime can
be classified in two categories, planned and unplanned:
Planned:
Hardware upgrades
Hardware/Software repair/replacement
Software updates/upgrades
Backups (offline backups)
Testing (periodic testing is required for cluster validation)
Development
Unplanned:
Administrator errors
Application failures
Hardware failures
Operating system errors
Environmental disasters
The role of PowerHA is to maintain application availability through unplanned outages and
normal day-to-day administrative requirements. PowerHA provides monitoring and automatic
recovery of the resources on which your application depends.
Good design can remove single points of failure in the cluster: nodes, storage, and networks.
PowerHA manages these, and also the resources required by the application (including the
application start/stop scripts).
As previously mentioned, a good design is able to avoid single points of failure, and PowerHA
can manage the availability of the application through downtimes. Table 1-2 lists each cluster
object, which, if it fails, can result in loss of availability of the application. Each cluster object
can be a physical or logical component.
Cluster object       Eliminated as a single point of failure by
Power supply         Multiple circuits, power supplies, or uninterruptible power supply (UPS)
TCP/IP subsystem     Use of non-IP networks to connect each node to its neighbor in a ring
Resource groups      Use of resource groups to control all resources required by an application
In addition, other management tasks, such as modifying storage and managing users, can be
performed on the running cluster by using the Cluster Single Point of Control (C-SPOC) without
interrupting user access to the application running on the cluster nodes. C-SPOC also
ensures that changes made on one node are replicated across the cluster in a consistent
manner.
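For example, most C-SPOC operations are reached through a SMIT fast path. A minimal sketch, assuming a standard PowerHA installation:

# Open the System Management (C-SPOC) SMIT menu, from which storage,
# user, and LVM changes can be made cluster-wide
smitty cl_admin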
Originally designed as a stand-alone product (known as HACMP classic), HACMP adopted
the IBM high availability infrastructure known as Reliable Scalable Cluster Technology
(RSCT) when it became available, and became HACMP Enhanced Scalability (HACMP/ES),
because RSCT provides performance and functional advantages over the classic version.
Starting with HACMP v5.1, there are no more classic versions. The HACMP name was later
replaced with PowerHA in v5.5, and then with PowerHA SystemMirror in v6.1.
Starting with PowerHA 7.1, the Cluster Aware AIX (CAA) feature of the operating system
is used to configure, verify, and monitor the cluster services. This major change improved the
reliability of PowerHA because the cluster service functions now run in kernel space
rather than user space. CAA was introduced in AIX 6.1 TL6. At the time of writing this book,
the current release is PowerHA 7.1.3 SP1.
Note: More information about the new features in PowerHA 7.1.3 is in Guide to IBM
PowerHA SystemMirror for AIX Version 7.1.3, SG24-8167.
1.4.1 Terminology
The terminology used to describe PowerHA configuration and operation continues to evolve.
The following terms are used throughout this book:
Cluster Loosely-coupled collection of independent systems (nodes) or logical
partitions (LPARs) organized into a network for the purpose of sharing
resources and communicating with each other.
PowerHA defines relationships among cooperating systems where
peer cluster nodes provide the services offered by a cluster node if
that node is unable to do so. These individual nodes are together
responsible for maintaining the functionality of one or more
applications in case of a failure of any cluster component.
Node An IBM Power (System p, System i, or BladeCenter) system (or
LPAR) running AIX and PowerHA that is defined as part of a cluster.
Each node has a collection of resources (disks, file systems, IP
addresses, and applications) that can be transferred to another node
in the cluster in case the node or a component fails.
Clients A client is a system that can access the application running on the
cluster nodes over a local area network (LAN). Clients run a client
application that connects to the server (node) where the application
runs.
All components, such as CPUs, memory, and disks, have a special design and provide
continuous service, even if one sub-component fails. Only special software solutions can run
on fault-tolerant hardware.
Such systems are expensive and extremely specialized. Implementing a fault tolerant solution
requires a lot of effort and a high degree of customization for all system components.
In such systems, the software involved detects problems in the environment, and manages
application survivability by restarting it on the same or on another available machine (taking
over the identity of the original machine: node).
Therefore, eliminating all single points of failure (SPOF) in the environment is important. For
example, if the machine has only one network interface (connection), provide a second
network interface (connection) in the same node to take over in case the primary interface
providing the service fails.
Another important issue is to protect the data by mirroring and placing it on shared disk areas,
accessible from any machine in the cluster.
Remember, PowerHA is not a fault tolerant solution and should never be implemented as
such.
Table 1-3 shows the required PowerHA and AIX levels at the time this book was written.
AIX level       RSCT level
6100-02-01      2.5.4.0
7100-00         3.1.0.0
7100-01-02      3.1.2.0
7100-02-01      3.1.4.0
7100-03-01      3.1.5.1
a. authorized program analysis report (APAR)
The following AIX base operating system (BOS) components are prerequisites for PowerHA:
bos.adt.lib
bos.adt.libm
bos.adt.syscalls
bos.ahafs
bos.cluster
bos.clvm.enh
bos.data
bos.net.tcp.client
bos.net.tcp.server
bos.rte.SRC
bos.rte.libc
bos.rte.libcfg
bos.rte.libcur
bos.rte.libpthreads
bos.rte.lvm
bos.rte.odm
cas.agent (optional, but required only for IBM Systems Director plug-in)
devices.common.IBM.storfwork.rte (optional, but required for sancomm)
To determine if the appropriate file sets are installed and what their levels are, issue the
following commands:
/usr/bin/lslpp -l rsct.compat.basic.hacmp
/usr/bin/lslpp -l rsct.compat.clients.hacmp
/usr/bin/lslpp -l rsct.basic.rte
/usr/bin/lslpp -l rsct.core.rmc
If the file sets are not present, install the appropriate version of RSCT.
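If RSCT must be installed or updated, the following is a minimal sketch using installp, assuming the file sets are on media in /dev/cd0 (adjust the device or directory for your environment):

# Apply the RSCT file sets, pulling in prerequisites (-g) and
# expanding file systems as needed (-X)
installp -agXd /dev/cd0 rsct.basic.rte rsct.core.rmc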
1.6.2 Licensing
Most software vendors require that you have a unique license for each application for each
physical machine and also on a per core basis. Usually, the license activation code is entered
at installation time.
The application might also require a unique node-bound license (a separate license file on
each node).
Some applications also have restrictions with the number of floating licenses available within
the cluster for that application. To avoid this problem, be sure that you have enough licenses
for each cluster node so the application can run simultaneously on multiple nodes (especially
for concurrent applications).
For current information about PowerHA licensing, see the list of frequently asked questions:
http://www-03.ibm.com/systems/power/software/availability/aix/faq/index.html
For example, if all the data for a critical application resides on a single disk, and that specific
disk fails, then that disk is a single point of failure for the entire cluster and is not protected by
PowerHA. AIX Logical Volume Manager (LVM) or storage subsystem protection must be used
in this case. PowerHA only provides takeover of the disk on the backup node, to make the
data available for use.
This is why PowerHA planning is so important, because your major goal throughout the
planning process is to eliminate single points of failure. A single point of failure exists when a
critical cluster function is provided by a single component. If that component fails, the cluster
has no other way of providing that function, and the application or service dependent on that
component becomes unavailable.
Also keep in mind that a well-planned cluster is easy to install, provides higher application
availability, performs as expected, and requires less maintenance than a poorly planned
cluster.
If you choose NIM, you must copy all the PowerHA file sets onto the NIM server and define an
lpp_source resource before proceeding with the installation.
More details about installing and configuring are in Chapter 4, Installation and configuration
on page 135.
To install the PowerHA software on a server node, complete the following steps:
1. If you are installing directly from the installation media, such as a CD/DVD or from a local
repository, enter the smitty install_all fast path command. The System Management
Interface Tool (SMIT) displays the Install and Update from ALL Available Software panel.
2. Enter the device name of the installation medium or installation directory in the INPUT
device/directory for software field and press Enter.
3. Enter the corresponding field values.
To select the software to install, press F4 for a software listing, or enter all to install all
server and client images. Select the packages you want to install according to your cluster
configuration. Some of the packages might require prerequisites that are not available in
your environment.
The following file sets are required and must be installed on all servers:
cluster.es.server
cluster.es.client
cluster.cspoc
Read the license agreement and select Yes in the Accept new license agreements field.
You must choose Yes for this item to proceed with installation. If you choose No, the
installation might stop, and issue a warning that one or more file sets require the software
license agreements. You accept the license agreement only once for each node.
4. Press Enter to start the installation process.
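As an alternative to the SMIT panels, the same installation can be done directly with installp. The following sketch assumes that the file sets were copied to the hypothetical directory /mnt/powerha:

# Install the required PowerHA file sets, accepting license agreements (-Y)
# and automatically installing prerequisites (-g)
installp -agXYd /mnt/powerha cluster.es.server cluster.es.client cluster.cspoc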
Post-installation steps
To complete the installation, complete the following steps:
1. Verify the software installation by using the AIX lppchk command, and check the installed
directories to see if the expected files are present.
2. Run the lppchk -v and lppchk -c cluster* commands. Both commands run clean if the
installation is good; if not, use the proper problem determination techniques to fix any
problems.
3. A reboot might be required if RSCT prerequisites have been installed since the last time
the system was rebooted.
More information
For more information about upgrading PowerHA, see Chapter 5, Migration on page 153.
When the cluster is configured, the cluster topology and resource information is entered on
one node, a verification process is then run, and the data synchronized out to the other nodes
that are defined in the cluster. PowerHA keeps this data in its own Object Data Manager
(ODM) classes on each node in the cluster.
Although PowerHA can be configured or modified from any node in the cluster, a good
practice is to perform administrative operations from one node to ensure that PowerHA
definitions are kept consistent across the cluster. This way prevents a cluster configuration
update from multiple nodes that might result in inconsistent data.
Installation changes
The following AIX configuration changes are made:
These files are modified:
/etc/inittab
/etc/rc.net
/etc/services
/etc/snmpd.conf
/etc/snmpd.peers
/etc/syslog.conf
/etc/trcfmt
/var/spool/cron/crontabs/root
The hacmp group is added.
Also, during cluster configuration and verification, the /etc/hosts file can be changed by
adding or modifying entries.
The following network options are set to 1 (one) on startup:
nonlocsrcroute
ipsrcrouterecv
ipsrcroutesend
ipsrcrouteforward
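You can confirm these settings after installation with the no command; for example:

# Display the source-route related network options set by PowerHA
no -a | grep srcroute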
Figure 2-1 on page 23 shows a typical cluster topology with these components:
Two nodes
Two IP networks (PowerHA logical networks) with redundant interfaces on each node
Shared storage
Repository disk
PowerHA cluster
A name is assigned to the cluster. The name can be up to 64 characters: [a-z], [A-Z], [0-9],
hyphen (-), or underscore (_). In previous versions of PowerHA, the cluster name could not
start with a number or contain hyphens, but this is no longer the case with PowerHA v7.1.3. A
cluster ID (number) is also associated with the cluster; PowerHA automatically generates a
unique ID for the cluster. All heartbeat packets contain this ID, so two clusters on the same
network should never have the same ID.
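After the cluster is created, the cluster name and ID can be displayed with the CAA lscluster command; for example:

# Show the CAA cluster configuration, including the cluster name and ID
lscluster -c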
Cluster nodes
Nodes form the core of a PowerHA cluster. A node is a system running an image of the AIX
operating system (stand-alone or a partition), the PowerHA code, and application software.
The maximum number of supported nodes is 16 in PowerHA version 7. However, a CAA
cluster now supports 32 nodes.
When defining the cluster node, a unique name must be assigned and a communication path
to that node must be supplied (IP address or a resolvable IP label associated with one of the
interfaces on that node). The node name can be the host name (short), a fully qualified name
(hostname.domain.name), or any name up to 64 characters: [a-z], [A-Z], [0-9], hyphen (-), or
underscore (_). The name can start with either an alphabetic or numeric character.
The communication path is first used to confirm that the node can be reached, then used to
populate the ODM on each node in the cluster after secure communications are established
between the nodes. However, after the cluster topology and CAA cluster are configured, any
interface can be used to attempt to communicate between nodes in the cluster.
Sites
The use of sites is optional. They are designed for use in cross-site LVM mirroring,
PowerHA/EE configurations, or both. A site consists of one or more nodes that are grouped
together at a location. PowerHA supports a cluster that is divided into two sites. Site
relationships also can exist as part of a resource group's definition, but should be set to
ignore if sites are not used.
Although using sites outside PowerHA/EE and cross-site LVM mirroring is possible,
appropriate methods or customization must be provided to handle site operations. If sites
are defined, site events are run during node_up and node_down events.
Networks
In PowerHA, the term network is used to define a logical entity that groups the communication
interfaces used for IP communication between the nodes in the cluster, and for client access.
The networks in PowerHA can be defined with an attribute of either public (the
default) or private. Marking a network as private indicates to CAA that it must not be used for
heartbeating or cluster communication.
Each interface is capable of hosting several IP addresses. When configuring a cluster, you
define the IP addresses that PowerHA monitors by using CAA and the IP addresses that
PowerHA itself keeps highly available (the service IP addresses and persistent aliases).
Important: A good practice is to have all those IP addresses defined in the /etc/hosts file
on all nodes in the cluster. There is no requirement to use fully qualified names.
While PowerHA is processing network changes, the NSORDER variable is set to local (that
is, pointing to /etc/hosts). However, another good practice is to also set this in the
/etc/netsvc.conf file.
In previous versions, this term was related to single point-to-point networks, such as disk
heartbeat and RS232. However, those device and network types no longer exist.
Network definitions can be added using the SMIT panels; however during the initial cluster
configuration, a discovery process is run, which automatically defines the networks and
assigns the interfaces to them.
The discovery process harvests information from the /etc/hosts file, defined interfaces,
defined adapters, and existing enhanced concurrent mode disks. The process then creates
the following files in the /usr/es/sbin/cluster/etc/config directory:
clip_config Contains details of the discovered interfaces; used in the F4 SMIT lists.
clvg_config Contains details of each physical volume (PVID, volume group name,
status, major number, and so on) and a list of free major numbers.
Running discovery can also reveal any inconsistency in the network at your site.
PowerHA SystemMirror 7.1 and later uses CAA services to configure, verify, and monitor the
cluster topology. This is a major reliability improvement because core functions of the cluster
services, such as topology-related services, now run in kernel space. This makes them much
less susceptible to being affected by the workload generated in user space.
Communication paths
Cluster communication is achieved by communicating over multiple redundant paths. The
following redundant paths provide a robust clustering foundation that is less prone to cluster
partitioning:
TCP/IP
PowerHA SystemMirror and Cluster Aware AIX, either through multicast or unicast, use all
network interfaces that are available for cluster communication. All of these interfaces are
discovered by default and used for health management and other cluster communication.
You can use the PowerHA SystemMirror management interfaces to remove any interface
that you do not want to be used by specifying these interfaces in a private network.
If all interfaces on that PowerHA network are unavailable on that node, PowerHA transfers
all resource groups containing IP labels on that network to another node with available
interfaces. This is a default behavior associated with a feature called selective fallover on
network loss. It can be disabled if you want.
SAN-based (sfwcomm)
A redundant high-speed path of communication is established between the hosts by using
the storage area network (SAN) fabric that exists in any data center between the hosts.
Discovery-based configuration reduces the burden for you to configure these links.
Repository disk
Health and other cluster communication is also achieved through the central repository
disk.
Repository disk
Cluster Aware AIX maintains cluster-related configuration information, such as the node list,
various cluster tunables, and so on. Note that all of this configuration information is also
maintained in memory by Cluster Aware AIX (CAA) while it is operating. Hence, in a live
cluster, CAA can re-create the configuration information when a new replacement disk for the
repository disk is provided.
Configuration management
CAA identifies the repository disk by a unique 128-bit UUID. The UUID is generated in the
AIX storage device drivers by using the characteristics of the disk concerned. CAA stores the
repository disk identity information in the AIX ODM CuAt class as part of the cluster
information. The following is sample output from a PowerHA 7.1.3 cluster:
CuAt:
	name = "cluster0"
	attribute = "clvdisk"
	value = "2fb6d8b9-1147-45f9-185b-4e8e67716d4d"
	type = "R"
	generic = "DU"
	rep = "s"
	nls_index = 2
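This stanza can be retrieved on a running node with the odmget command. A minimal sketch (the UUID value differs in every cluster):

# Query the CuAt ODM class for the repository disk attribute
odmget -q "name=cluster0 and attribute=clvdisk" CuAt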
When an additional node tries to join a cluster during AIX boot time, CAA uses the ODM
information to locate the repository disk. The repository disk must be reachable to retrieve the
information that is necessary to join and synchronize with all other nodes in the cluster. If CAA
is not able to reach the repository disk, CAA does not proceed with starting the cluster services,
and logs an error about the repository disk in the AIX error log. In this case, the
administrator must fix the repository disk issues and then start CAA manually.
If a node fails to join a cluster because this ODM entry is missing, the entry can be
repopulated, and the node forced to join the cluster, by using an undocumented option of
clusterconf. This assumes that the administrator knows the hdisk name of the repository
disk:
clusterconf -r hdisk#
Health management
The repository disk plays a key role in bringing up and maintaining the health of the cluster.
It is used for heartbeats, cluster messages, and node-to-node synchronization.
There are two key ways the repository disk is used for health management across the cluster:
1. Continuous health monitoring
2. Distress time cluster communication
For continuous health monitoring, CAA and the disk device drivers maintain health counters per
node. These health counters are updated and read at least once every two seconds by the
storage framework device driver. The health counters of the other nodes are compared every
six seconds to determine whether the other nodes are still functional. This interval might
change if necessary in the future.
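The node and storage state that CAA derives from these counters can be inspected with the lscluster command; for example:

# List the cluster nodes and their state as seen by CAA
lscluster -m
# Show the cluster storage interfaces, including the repository disk
lscluster -d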
When all the network interfaces for a node have failed, the node is in a distress
condition. In this distress environment, CAA and the storage framework use the repository
disk to do all the necessary communication between the distressed node and the other nodes.
Note that this type of communication requires a certain area of the disk to be set aside per node
for writing the messages that are meant to be delivered to other nodes. This disk space is
automatically allocated at cluster creation time; no action from the customer is needed. When
operating in this mode, each node has to scan the message areas of all other nodes several
times per second to receive any messages meant for it.
Note that because this second method of communication requires more polling of the disk, it
is not the most efficient form of communication, and it is expected to be used only while the
node remains in distress.
Failures of any of these writes and reads result in repository-failure events being sent to CAA
and PowerHA. The administrator must then provide a new disk to be used as a replacement
disk for the original failed repository disk.
Therefore, PowerHA SystemMirror 7.1.2 Enterprise Edition does not allow a node to operate
if it no longer has access to the repository disk and also registers an abnormal node down
event. This allows a double failure scenario to be tolerated.
Split handling policy
	None: This is the default setting. Select this for the partitions to operate
	independently of each other after the split occurs.
	Tie breaker: Select this to use the disk that is specified in the Select tie
	breaker field after a split occurs. When the split occurs, one site wins the
	SCSI reservation on the tie breaker disk. The site that loses the SCSI
	reservation uses the recovery action that is specified in the policy setting.
	Note: If you select Tie breaker in the Merge handling policy field, you
	must select Tie breaker for this field.
	Manual: Select this to wait for manual intervention when a split occurs.
	PowerHA SystemMirror does not perform any actions on the cluster until
	you specify how to recover from the split.
	Note: If you select Manual in the Merge handling policy field, you must
	select Manual for this field.
Merge handling policy
	Majority: Select this to choose the partition with the highest number of
	nodes as the primary partition.
	Tie breaker: Select this to use the disk that is specified in the Select tie
	breaker field after a merge occurs.
	Note: If you select Tie breaker in the Split handling policy field, you
	must select Tie breaker for this field.
	Manual: Select this to wait for manual intervention when a merge occurs.
	PowerHA SystemMirror does not perform any actions on the cluster until
	you specify how to handle the merge.
Split and merge action plan
	Reboot: Reboots all nodes in the site that does not win the tie breaker, or that
	is not responded to manually when the manual option is used. This is not
	an editable option.
Select tie breaker
	Select an iSCSI disk or a SCSI disk that you want to use as the tie breaker disk.
	It must support either SCSI-2 or SCSI-3 reserves.
The tie breaker is an optional feature you can use to prevent a partitioned cluster, also known
as split brain. If specified as an arbitrator in the split and merge policy of a cluster, the tie
breaker decides which partition of the cluster survives. The one containing a node that
succeeds in placing a SCSI-3 persistent reserve or SCSI-2 reserve on the tie breaker disk
wins, and hence survives. The loser is rebooted.
Similar behavior happens while merging the partitioned cluster. The nodes belonging to the
partition that is unable to place the SCSI-3 persistent reserve or SCSI-2 reserve belong to the
losing side and will be rebooted.
PowerHA SystemMirror Enterprise Edition v7.1.3 introduces new manual split and merge
policies. It can and should be applied globally across the cluster. However, there is an option
to specify whether it should apply to storage replication recovery.
Important: In PowerHA 7.1.0 and later, the RSCT topology services subsystem is
deactivated and all its functions are performed by CAA topology services.
IPAT through aliasing also obsoletes the concept of standby interfaces (all network interfaces
are labeled as boot interfaces).
As IP addresses are added to the interface through aliasing, more than one service IP label
can coexist on one interface. By removing the need for one interface per service IP address
that the node can host, IPAT through aliasing is the more flexible option and in some cases
can require less hardware. IPAT through aliasing also reduces fallover time, because adding
an alias to an interface is faster than removing the base IP address and then applying the
service IP address.
Although IPAT through aliasing can support multiple service IP labels and addresses, we still
suggest that you configure multiple interfaces per node per network. Swapping interfaces is
far less disruptive than moving the resource group over to another node.
IPAT through aliasing is supported only on networks that support the gratuitous ARP function
of AIX. Gratuitous ARP is when a host sends out an ARP packet before using an IP address
and the ARP packet contains a request for this IP address. In addition to confirming that no
other host is configured with this address, it ensures that the ARP cache on each machine on
the subnet is updated with this new address.
If multiple service IP alias labels or addresses are active on one node, PowerHA by default
equally distributes them among available interfaces on the logical network. This placement
can be influenced by configuring a service IP label distribution preference.
For IPAT through aliasing, each boot interface on a node must be on a different subnet,
though interfaces on different nodes can obviously be on the same subnet. The service IP
labels can be on the same subnet as the boot adapter only if it is a single adapter
configuration. Otherwise, they must be on separate subnets also.
Important: For IPAT through aliasing networks, PowerHA briefly has the service IP
addresses active on both the failed interface and the takeover interface so it can preserve
routing. This might cause a DUPLICATE IP ADDRESS error log entry, which you can ignore.
Assigning a persistent node IP label for a network on a node allows you to have a highly
available node-bound address on a cluster network. This address can be used for
administrative purposes because it always points to a specific node regardless of whether
PowerHA is running.
Note: Configuring only one persistent node IP label per network per node is possible. For
example, if you have a node that is connected to two networks that are defined in
PowerHA, that node can be identified through two persistent IP labels (addresses), one for
each network.
The persistent IP labels are defined in the PowerHA configuration, and they become available
when the cluster definition is synchronized. A persistent IP label remains available on the
interface on which it was configured, even if PowerHA is stopped on the node, or the node is
rebooted. If the interface on which the persistent IP label is assigned fails while PowerHA is
running, the persistent IP label is moved to another interface in the same logical network on
the same node.
If the node fails or all interfaces on the logical network on the node fail, then the persistent IP
label will no longer be available.
The persistent IP alias must be on a different subnet from each of the boot interface subnets
(again, unless it is a single-adapter network configuration, in which case a persistent IP might
not be needed), and can be either in the same subnet or in a different subnet as the service IP
address.
When cluster sites are used, specifically linked sites, two more parameters are available:
Link failure detection timeout: This is the time (in seconds) that the health management layer
waits before declaring that the inter-site link failed. A link failure detection can cause the
cluster to switch to another link and continue the communication. If all the links fail, a site
failure is declared. The default is 30 seconds.
Site heartbeat cycle: This is a factor (1 - 10) that controls the heartbeat rate between the
sites.
These settings, unlike in previous versions, apply to all networks. Also, like most changes, a
cluster synchronization is required for the changes to take effect. However, the change is
dynamic, so a cluster restart is not required.
For more details see CAA failure detection tunables on page 102.
For more details about cluster security, see Chapter 8, Cluster security on page 321.
The Cluster Communications daemon is started from /etc/inittab, with the entry being created
by the installation of PowerHA. The daemon is controlled by the System Resource Controller
(SRC), so startsrc, stopsrc, and refresh work. In particular, refresh is used to reread
/etc/cluster/rhosts and move the log files.
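For example, using the standard SRC commands (the subsystem name is clcomd on current AIX levels):

# Check the status of, refresh, and restart the Cluster Communications daemon
lssrc -s clcomd
refresh -s clcomd
stopsrc -s clcomd
startsrc -s clcomd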
The real use of the /etc/cluster/rhosts file is before the cluster is first synchronized in an
insecure environment. After the CAA cluster is created, the only time that the file is needed
again is when new nodes are added to the cluster.
The Cluster Communications daemon provides the transport medium for PowerHA cluster
verification, global ODM changes, and remote command execution. The following commands
use clcomd (they cannot be run by a standard user):
clrexec Run specific and potentially dangerous commands.
cl_rcp Copy AIX configuration files.
cl_rsh Used by the cluster to run commands in a remote shell.
clcmd Takes an AIX command and distributes it to a set of nodes that are members of
a cluster.
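For example, clcmd runs an AIX command on every node and prefixes each node's output with the node name; a typical usage sketch:

# Run a command on all cluster nodes
clcmd date
# Check the cluster manager state on all nodes
clcmd lssrc -ls clstrmgrES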
2.4.1 Definitions
PowerHA uses the underlying topology to ensure that the applications under its control and
the resources that they require are kept highly available. The following components are
considered resources:
Service IP labels or addresses
Physical disks
Volume groups
Logical volumes
File systems
NFS
Application controller scripts
Application monitors
Tape resources
WLM integration
Workload partitions (WPARs)
For more details about implementing WPARs with PowerHA, see Chapter 13, WPARs and
PowerHA scenario on page 443.
The applications and the resources required are configured into resource groups. The
resource groups are controlled by PowerHA as single entities whose behavior can be tuned
to meet the requirements of clients and users.
Figure 2-4 on page 36 shows resources that PowerHA makes highly available, superimposed
on the underlying cluster topology.
(Figure 2-4 shows three cluster nodes, node1, node2, and node3, each with network interfaces en0 through en3 and tty devices, hosting resource groups rg_01, rg_02, and rg_03 with their service IP labels, all attached to the shared storage volume group share_vg.)
2.4.2 Resources
The items in this section are considered resources in a PowerHA cluster.
The service IP addresses become available when PowerHA brings the associated resource
group into an ONLINE status.
The IP label distribution preference can also be changed dynamically, but is only used in
subsequent cluster events. This is to avoid any extra interruptions in service. The cltopinfo
-w command displays the policy that is used, but only if it differs from the default value of
anti-collocation.
For more information about using this feature see 12.4, Site-specific service IP labels on
page 430.
If storage is to be shared by some or all of the nodes in the cluster, then all components must
be on external storage and configured in such a way that failure of one node does not affect
the access by the other nodes.
For a list of supported devices by PowerHA, see the following web page:
http://bit.ly/1E3xoXE
Important: Be aware that just because a third-party storage is not listed in the matrix does
not mean that storage is not supported. If VIOS supports third-party storage, and PowerHA
supports virtual devices through VIOS, then the storage should also be supported by
PowerHA. However, always verify support with the storage vendor.
For data protection, you can use either Redundant Array of Independent Disks (RAID)
technology (at the storage or adapter level) or AIX LVM mirroring (RAID 1).
Disk arrays are groups of disk drives that work together to achieve data transfer rates higher
than those provided by single (independent) drives. Arrays can also provide data redundancy
so that no data is lost if one drive (physical disk) in the array fails. Depending on the RAID
level, data is either mirrored, striped, or both. For the characteristics of some widely used
RAID levels, see Table 2-2 on page 55.
Some newer storage subsystems have more specialized RAID methods that do not fit exactly
into any of these categories, for example, IBM XIV.
Important: Although all RAID levels (other than RAID 0) have data redundancy, data must
be regularly backed up. This is the only way to recover data if a file or directory is
accidentally corrupted or deleted.
Leaving quorum on (the default) causes a resource group fallover if quorum is lost, and the
volume group is forced to vary on, on the other node, if forced varyon of volume groups
is enabled. When forced varyon of volume groups is enabled, PowerHA checks to determine
the following conditions:
That at least one copy of each mirrored set is in the volume group
That each disk is readable
That at least one accessible copy of each logical partition is in every logical volume
If these conditions are fulfilled, then PowerHA forces the volume group varyon.
When a node is integrated into the cluster, PowerHA builds a list of all enhanced concurrent
volume groups that are a resource in any resource group containing the node. These volume
groups are then activated in passive mode.
When the resource group comes online on the node, the enhanced concurrent volume groups
are then varied on in active mode. When the resource group goes offline on the node, the
volume group is varied off to passive mode.
Important: When you use enhanced concurrent volume groups, be sure that multiple
networks exist for heartbeating. Historically, because there was no SCSI locking, a
partitioned cluster could quickly vary on a volume group on all nodes, and then potentially
corrupt data. However, enhancements in AIX 6.1 TL7 and AIX 7.1 TL1 introduced JFS2
mount guard. This option prevents a file system from being mounted on more than one
system at a time. PowerHA v7.1.1 and later automatically enables this feature if it is not
already enabled.
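If mount guard must be enabled manually on an earlier level, the JFS2 attribute can be set per file system. A minimal sketch, using the hypothetical file system /sharedfs:

# Enable mount guard so the file system cannot be mounted on two systems at once
chfs -a mountguard=yes /sharedfs
# Verify the setting
lsfs -q /sharedfs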
Although this is not an issue related to PowerHA, be aware that some applications using raw
logical volumes can start writing from the beginning of the device, therefore overwriting the
logical volume control block (LVCB).
Custom methods are provided for Veritas Volume Manager (VxVM) starting with the Veritas
Foundation Suite v4.0. For a newer version, you might need to create a custom user-defined
resource to handle the storage appropriately. More information about this option is in 2.4.9,
User defined resources on page 46.
The fsck utility performs a verification of the consistency of the file system, checking the
inodes, directory structure, and files. Although this is more likely to recover damaged file
systems, it does take longer.
Important: Restoring the file system to a consistent state does not guarantee that the data
is consistent; that is the responsibility of the application.
2.4.3 NFS
PowerHA works with the AIX network file system (NFS) to provide a highly available NFS
server, which allows the backup NFS server to recover the current NFS activity if the primary
NFS server fails. This feature is available only for two-node clusters when using
NFSv2/NFSv3, and more than two nodes when using NFSv4, because PowerHA preserves
locks for the NFS file systems and handles the duplicate request cache correctly. The
attached clients experience the same hang if the NFS resource group is acquired by another
node as they would if the NFS server reboots.
When configuring NFS through PowerHA, you can control these items:
The network that PowerHA will use for NFS mounting.
NFS exports and mounts at the directory level.
Export options for NFS exported directories and file systems. This information is kept in
/usr/es/sbin/cluster/etc/exports, which has the same format as the AIX exports file
(/etc/exports).
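A hedged sample of this exports file follows; the directory and client names are hypothetical, and the syntax is the same as in /etc/exports:

# Contents of /usr/es/sbin/cluster/etc/exports
/sharedfs/export1 -sec=sys,rw,root=clientA:clientB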
NFS cross-mounts
NFS cross-mounts work as follows:
The node that is hosting the resource group mounts the file systems locally, NFS exports
them, and NFS mounts them, thus becoming both an NFS server and an NFS client.
All other participating nodes of the resource group simply NFS-mount the file systems,
thus becoming NFS clients.
If the resource group is acquired by another node, that node mounts the file system locally
and NFS exports them, thus becoming the new NFS server.
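The following sketch shows what an entry in the PowerHA exports file might look like for such a cross-mounted file system; the file system path and node names are examples only:
# /usr/es/sbin/cluster/etc/exports (same format as /etc/exports)
/sharedfs -root=node01:node02,access=node01:node02
The resource group then lists the file system to export and the NFS mount point to use for the cross-mount.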
Applications are defined to PowerHA as application controllers with the following attributes:
Start script This script must be able to start the application from both a clean and
an unexpected shutdown. Output from the script is logged in the
hacmp.out log file if set -x is defined within the script. The exit code
from the script is monitored by PowerHA.
Stop script This script must be able to successfully stop the application. Output is
also logged and the exit code monitored.
Application monitors
To keep applications highly available, PowerHA can also monitor the
application itself, not just the required resources.
Application startup mode
Introduced in PowerHA v7.1.1, this mode specifies how the application
controller startup script is called. Select background (the default
value) if you want the start script to be called as a background
process, with event processing continuing even if the start script has
not completed. Select foreground if you want the event to suspend
processing until the start script exits.
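As a minimal sketch, a start script that satisfies these requirements might look like the following; the application name, user, paths, and lock file are placeholders only:
#!/bin/ksh
set -x                                    # trace output is captured in hacmp.out
# recover from an unexpected shutdown: remove a stale lock file if present
[ -f /app1/run/app1.lock ] && rm -f /app1/run/app1.lock
# start the application under its own user
su - app1adm -c "/app1/bin/start_app1"
rc=$?
exit $rc                                  # a non-zero exit code is treated as a failure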
As the exit codes from the application scripts are monitored, PowerHA assumes that a
non-zero return code from the script means that the script failed and therefore starting or
stopping the application was not successful. If this is the case, the resource group will go into
error state and a config_too_long message is recorded.
Consider the following factors when you configure the application for PowerHA:
The application is compatible with the AIX version.
The storage environment is compatible with a highly available cluster.
The application and platform interdependencies must be well understood. The location of
the application code, data, temporary files, sockets, pipes, and other components of the
system such as printers must be replicated across all nodes that will host the application.
As previously described, the application must be able to be started and stopped without
any operator intervention, particularly after a node unexpectedly halts. The application
start and stop scripts must be thoroughly tested before implementation and with every
change in the environment.
The resource group that contains the application must contain all the resources required
by the application, or be the child of one that does.
Application licensing must be accounted for. Many applications have licenses that depend
on the CPU ID; careful planning must be done to ensure that the application can start on
any node in the resource group node list. Also be careful with the numbers of CPUs and
other items on each node because some licensing is sensitive to these amounts.
Application availability
PowerHA also offers an application availability analysis tool, which is useful for auditing the
overall application availability, and for assessing the cluster environment. For more details,
see 7.7.9, Measuring application availability on page 319.
WLM, using the PowerHA configuration, starts either when a node joins the cluster or as the
result of a DARE operation involving WLM (and only on the nodes that are part of resource
groups containing WLM classes). PowerHA works with WLM in two ways:
If WLM is running, PowerHA will save the running configuration, stop WLM, and then
restart with the PowerHA configuration files. When PowerHA stops on a node, the
previous WLM configuration will be activated.
If WLM is not running, it will start with the PowerHA configuration, and stop when
PowerHA stops on the node.
Planning: PowerHA can perform only limited verification of the WLM configuration. Proper
planning must be done in advance.
The configuration that WLM uses on a node is specific to the node and the resource groups
that might be brought online on that node. Workload Manager classes can be assigned to
resource groups as either of these classes:
Primary class
Secondary class
When a node is integrated into the cluster, PowerHA checks each resource group whose
node list contains that node. The WLM classes that are used then depend on the startup
policy of each resource group, and the node's priority in the node list.
If the resource group has a startup policy of either online on all available nodes (concurrent)
or online using a node distribution policy, then the node will use the primary WLM class.
Various WPAR types exist; two examples are application WPARs and system WPARs. System
WPARs are autonomous virtual system environments with their own private file systems,
users and groups, login, network space, and administrative domain.
In AIX Version 7, administrators can create WPARs that can run AIX 5.2 in an AIX 7 operating
system instance. It is supported on the IBM POWER7 server platform.
For more details about implementing WPARs with PowerHA see Chapter 13, WPARs and
PowerHA scenario on page 443.
A user-defined resource type lets you define a customized resource that you can
add to a resource group. A user-defined resource type contains several attributes that
describe the properties of the instances of the resource type.
Ensure that the user-defined resource type management scripts exist on all nodes that
participate as possible owners of the resource group where the user-defined resource
resides.
PowerHA ensures that resource groups remain highly available by moving them from node to
node as conditions within the cluster change. The main states of the cluster and the
associated resource group actions are as follows:
Cluster startup The nodes in the cluster are up and then the resource groups are
distributed according to their startup policy.
Resource failure/recovery
When a particular resource that is part of a resource group becomes
unavailable, the resource group can be moved to another node.
Similarly, it can be moved back when the resource becomes available.
PowerHA shutdown There are several ways to stop PowerHA on a node. One method
causes the node's resource groups to fall over to other nodes. Another
method takes the resource groups offline. Under some circumstances,
stopping the cluster services on the node while leaving the resources
active is possible.
Before learning about the types of behavior and attributes that can be configured for resource
groups, you need to understand the following terms:
Node list The list of nodes that can host a particular resource group. Each
node must be able to access the resources that make up the resource
group.
Default node priority The order in which the nodes are defined in the resource group. A
resource group with default attributes will move from node to node in
this order as each node fails.
Home node The highest priority node in the default node list. By default this is the
node on which a resource group will initially be activated. This does
not specify the node that the resource group is currently active on.
Startup The process of bringing a resource group into an online state.
Fallover The process of moving a resource group that is online on one node to
another node in the cluster in response to an event.
Fallback The process of moving a resource group that is currently online on a
node that is not its home node, to a re-integrating node.
What is most important for PowerHA implementers and administrators is the behavior of
the resource groups at startup, fallover, and fallback. We describe the custom resource group
behavior options next.
Startup options (illustrated in the original figures with an example resource group, rg_01):
Online on home node only: The resource group is brought online only on its home (highest priority) node.
Online on first available node: The resource group is brought online on the first participating node that becomes available.
Online on all available nodes: The resource group is brought online on all of its nodes.
Online using distribution policy: The distribution policy is checked, and only one resource group of this type is brought online per node.
Fallover options:
Fallover to next priority node in list: The resource group moves to the next node in its default node priority list.
Fallover using dynamic node priority (DNP): The takeover node is chosen at fallover time, based on RSCT resource variables. This policy applies only to resource groups with three or more nodes.
Bring offline (on error node only): The resource group is brought offline on the failing node only.
Fallback options
These options control the behavior of an online resource group when a node joins the cluster:
Fallback to higher priority node in list (Figure 2-12): The resource group falls back to a higher priority node when that node joins the cluster.
Never fallback: The resource group remains on its current node even when a higher priority node joins the cluster.
The way these attributes affect resource group behavior is indicated in Table 2-2.
Table 2-2 Resource group attributes and how they affect RG behavior
This attribute ensures that a resource group does not start on an early-integrating node that is
low in its priority list and then keep falling over to higher priority nodes as they integrate.
Distribution policy
This node-based distribution policy ensures that on cluster startup, each node will acquire
only one resource group with this policy set. Also, a network-based distribution policy ensures
that only one resource group comes online on the same network and node, so nodes
with multiple networks can host multiple resource groups of this type.
This default behavior can be altered and serial processing can be specified for particular
resource groups by specifying a serial acquisition list. This order defines only the order of
processing on a particular node, not across nodes. If serial processing is specified, the
following processing occurs:
The specified resource groups are processed in order.
Resource groups containing only NFS mounts are processed in parallel.
The remaining resource groups are processed in order.
The reverse order is used on release.
Parent/child relationships between resource groups are designed for multitier applications,
where one or more resource groups cannot successfully start until a particular resource
group is already active. When a parent/child relationship is defined, the parent resource group
must be online before any of its children can be brought online on any node. If the parent
resource group is to be taken offline, the children must be taken offline first.
Up to three levels of dependency can be specified, that is a parent resource group can have
children that are also parents to other resource groups. However, circular dependencies are
not allowed.
Figure 2-14 shows an example where resource group 2 has two children, one of which also
has a child. Thus, resource group 2 must be online before resource groups 3 and 4 can be
brought online. Similarly resource group 4 must be online before resource group 5 can be
brought online. Resource group 3 has two parents (resource groups 1 and 2) that must be
online before it can be online.
Figure 2-14 Parent/child dependencies: resource groups 1 and 2 are parents of resource group 3; resource group 2 is also a parent of resource group 4, which is in turn the parent of resource group 5
As PowerHA starts applications in the background (so a hang of the script does not stop
PowerHA processing), having startup application monitors for the parents in any parent/child
resource group dependency is important. After the startup application monitor, or
monitors, have confirmed that the application started successfully, the processing of the
child resource groups can then commence.
Start after dependency establishes a processing order so that the target resource group is
always online before the source resource group. The target contains resources necessary for
the source to start, but is not required to continue running after the source is started.
Stop after dependency establishes a processing order so that the target is always brought
offline before the source. The target contains resources necessary for the source to stop, but
is not required to be running when the source is started.
In the example figure, Application 1 and Application 2 each have a parent/child dependency
on Database 1 and Database 2, respectively, and the two databases are located on different nodes.
To set or display the RG dependencies, use the clrgdependency command (Example 2-1).
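The example output is not reproduced here, but listing the configured parent/child dependencies can be sketched as follows:
/usr/es/sbin/cluster/utilities/clrgdependency -t PARENT_CHILD -sl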
If a node fails to bring a resource group online when it joins the cluster, the resource group will
be left in the ERROR state. If the resource group is not configured as online on all available
nodes, PowerHA will attempt to bring the resource group online on the other active nodes in
the resource group's node list.
Each node that joins the cluster automatically attempts to bring online any of the resource
groups that are in the ERROR state.
If a node fails to acquire a resource group during fallover, the resource group will be marked
as recoverable, and PowerHA will attempt to bring the resource group online on all the nodes
in the resource group's node list. If this fails for all nodes, the resource group will be left in the
ERROR state.
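To check the current state of the resource groups on each node, including any left in the ERROR state, you can use the clRGinfo utility; a minimal sketch:
/usr/es/sbin/cluster/utilities/clRGinfo        # resource group state per node
/usr/es/sbin/cluster/utilities/clRGinfo -v     # verbose output, including the configured policies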
If there is a failure of a network on a particular node, PowerHA will determine what resource
groups are affected (those that had service IP labels in the network) and then attempt to bring
them online on another node. If no other nodes have the required network resources, the
resource groups remain in the ERROR state. If any interfaces become available, PowerHA
determines which ERROR state resource groups can be brought online, and then attempts to do so.
Tip: If you want to override the automatic behavior of bringing a resource group in ERROR
state back online, specify that it must remain offline on a node.
2.6.1 Notifications
This section shows notification events.
Error notification
This uses the AIX error notification facility. This allows you to trap on any specific error logged
in the error report and to run a custom notify method that the user provides.
You can use the verification automatic monitoring cluster_notify event to configure a
PowerHA SystemMirror remote notification method to send a message in case of detected
errors in cluster configuration. The output of this event is logged in the hacmp.out file
throughout the cluster on each node that is running cluster services.
You can configure any number of notification methods, for different events and with different
text or numeric messages and telephone numbers to dial. The same notification method can
be used for several different events, as long as the associated text message conveys enough
information to respond to all of the possible events that trigger the notification. This includes
SMS message support.
After configuring the notification method, you can send a test message to be sure
configurations are correct and that the expected message will be sent for an event.
PowerHA 7.1 introduces a system event that is enabled by default. This event allows for
monitoring the loss of the rootvg volume group while the cluster node is running. Previous
versions of PowerHA (HACMP) were unable to monitor this type of loss. Also the cluster was
unable to perform a fallover action in the event of the loss of access to rootvg. An example is
if you lose a SAN disk that is hosting the rootvg for this cluster node.
The option is available under the SMIT menu path. Enter smitty sysmirror, and then select
Custom Cluster Configuration → Events → System Events. Figure 2-16 shows that the
rootvg system event is defined and enabled by default in PowerHA 7.1.
[Entry Fields]
* Event Name ROOTVG +
* Response Log event and reboot +
* Active Yes +
Figure 2-16 The rootvg system event
The default event properties instruct the system to log an event and restart when a loss of
rootvg occurs.
The extra processors and memory, while physically present, are not used until you decide that
the additional capacity that you need is worth the cost. This provides you with a fast and easy
upgrade in capacity to meet peak or unexpected loads.
PowerHA SystemMirror integrates with the DLPAR and CoD functions. You can configure
cluster resources in a way where the logical partition with minimally allocated resources serves
as a standby node, and the application resides on another LPAR node that has more
resources than the standby node.
When it is necessary to run the application on the standby node, PowerHA SystemMirror
ensures that the node has sufficient resources to successfully run the application and
allocates the necessary resources.
For more information about using this feature, see 9.3, DLPAR and application provisioning
on page 354.
By using the PowerHA SystemMirror file collection function, you can request that a list of files
be automatically kept in sync across the cluster. You no longer have to manually copy an
updated file to every cluster node, confirm that the file is properly copied, and confirm that
each node has the same version of it. With PowerHA SystemMirror file collections enabled,
PowerHA SystemMirror can detect and warn you if one or more files in a collection is deleted
or has zero length on one or more cluster nodes.
For more information about supported devices and required levels, consult the PowerHA
Enterprise Edition Cross Reference:
http://tinyurl.com/haEEcompat
PowerHA supports the following cluster limits:
Nodes: 16
Resource groups: 64
Networks: 48
Cluster resources: 128 is the maximum number that clinfo can handle, but more can be in the cluster
Sites: 2
GLVM devices: All disks that are supported by AIX; they can be different types of disks
Subnet requirements
The AIX kernel routing table supports multiple routes for the same destination. If multiple
matching routes have the same weight, each subnet route will be used alternately. The
problem that this poses for PowerHA is that if one node has multiple interfaces that share the
same route, PowerHA has no means to determine the health of each individual interface.
Therefore, we suggest that each interface on a node should belong to a unique subnet, so
that each interface can be monitored. Using heartbeat over alias is an alternative.
If you are a system administrator of a PowerHA cluster, you might be asked to do any of the
following LVM-related maintenance tasks:
Create a new shared volume group.
Extend, reduce, change, or remove an existing volume group.
Create a new shared logical volume.
Extend, reduce, change, or remove an existing logical volume.
Create a new shared file system.
Extend, change, or remove an existing file system.
Add and remove physical volumes.
When performing any of these maintenance tasks on shared LVM components, be aware
that ownership and permissions are reset when a volume group is exported and then
reimported. More details about performing these tasks are available in 7.4, Shared storage
management on page 260.
After exporting and importing, a volume group is owned by root and accessible by the system
group.
Note: Applications, such as some database servers, that use raw logical volumes might be
affected by this change if they change the ownership of the raw logical volume device. You
must restore the ownership and permissions back to what is needed after this sequence.
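A hedged sketch of the sequence and the repair follows; the volume group, disk, logical volume, user, and group names are placeholders:
varyoffvg app1vg
exportvg app1vg
importvg -y app1vg hdisk2       # ownership of the LV devices reverts to root:system
chown dbadm:dbgrp /dev/rapp1lv  # restore what the application expects on the raw device
chmod 660 /dev/rapp1lv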
There are also third-party (OEM) storage devices and subsystems that can be used, although
most of them are not directly certified by IBM for PowerHA usage. For these devices, check
the manufacturers' respective websites.
PowerHA also supports shared tape drives (SCSI or Fibre Channel). The shared tape (or
tapes) can be connected using SCSI or Fibre Channel (FC). Concurrent mode tape access is
not supported.
Storage configuration is one of the most important tasks you must perform before starting the
PowerHA cluster configuration. Storage configuration can be considered a part of PowerHA
configuration.
Most IBM storage subsystems are supported with PowerHA. To find more information about
storage server support, see the matrix at the following web page:
http://ibm.co/1EvK8cG
Note: PowerHA does not provide data storage protection. Storage protection is provided
by using these items:
AIX (LVM mirroring)
GLVM
Hardware RAID
In this section, we provide information about data protection methods at the storage level, and
also talk about the LVM shared disk access modes:
Non-concurrent
Enhanced concurrent mode (ECM)
Both access methods actually use enhanced concurrent volume groups (ecvgs). In a
non-concurrent access configuration, only one cluster node can access the shared data at a
time. If the resource group containing the shared disk space moves to another node, the new
node will activate the disks, and check the current state of the volume groups, logical
volumes, and file systems.
In a concurrent access configuration, data on the disks is available to all nodes concurrently.
This access mode does not support file systems (either JFS or JFS2).
LVM requirements
The LVM component of AIX manages the storage by coordinating data mapping between
physical and logical storage. Logical storage can be expanded and replicated, and can span
multiple physical disks and enclosures.
By forcing a volume group to vary on, you can bring and keep a volume group online (as part
of a resource group) while at least one valid copy of the data is available. Use a forced varyon
option only for volume groups that have mirrored logical volumes; be cautious when you use this
facility to avoid creating a partitioned cluster.
This option is useful in a takeover situation in case a volume group that is part of that
resource group loses one or more disks (VGDAs). If this option is not used, the resource
group will not be activated on the takeover node, thus rendering the application unavailable.
When you use a forced varyon of volume groups option in a takeover situation, PowerHA first
tries a normal varyonvg command. If this attempt fails because of lack of quorum, PowerHA
checks the integrity of the data to ensure that at least one available copy of all data is in the
volume group before trying to force the volume online. If there is, it runs the varyonvg -f
command; if not, the volume group remains offline and the resource group results in an error
state.
Note: The forced varyon feature is usually specific to cross-site LVM and GLVM
configurations.
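The manual equivalent of what PowerHA does can be sketched as follows; app1vg is an example volume group name:
varyonvg app1vg      # normal varyon, fails if quorum is lost
varyonvg -f app1vg   # forced varyon, used only when at least one good copy of all data exists
lsvg -l app1vg       # afterward, check for stale logical volumes
When the missing disks return, syncvg -v app1vg can be used to resynchronize stale partitions.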
PowerHA allows customization of predefined cluster events and also allows the creation of
new events. Before you create a new event, check whether a standard
event already exists that covers the action or situation you want.
All standard cluster events have their own meaning and functioning behavior. Some examples
of cluster events are listed in Table 2-5.
node_up (nodes joining or leaving the cluster): Starts when a node joins or rejoins the cluster.
node_down (nodes joining or leaving the cluster): Starts when the cluster stops receiving heartbeats from a node; the node is considered gone and a node_down event starts.
network_up (network-related events): Starts when the cluster detects that a network is available and ready for cluster usage (for example, for a service IP address activation).
swap_adapter (network-related events): Starts when the interface that hosts a service IP address experiences a failure. If other boot interfaces are available on the same node, the swap_adapter event moves the service IP address to another boot interface and refreshes the network routing table.
fail_interface (interface-related issues): Starts when any node interface experiences a failure. If the interface has no service IP defined, only the fail_interface event runs. If the failing interface hosts a service IP address and there is no other boot interface available to host it, an rg_move event is triggered.
rg_move (resource group changes): Starts when a resource group move operation from one node to another starts.
rg_up (resource group changes): Starts when a resource group is successfully brought online on a node.
rg_down (resource group changes): Starts when a resource group is brought offline.
Note: All events have detailed usage description in the script file. All standard events are in
the /usr/es/sbin/cluster/events directory.
Chapter 3. Planning
In this chapter, we discuss the planning aspects for a PowerHA 7.1.3 cluster. Adequate
planning and preparation are necessary to successfully install and maintain a PowerHA 7.1.3
cluster. Time spent properly planning your cluster configuration and preparing your
environment will result in a cluster that is easier to install and maintain and one that provides
higher application availability.
Before you begin planning the cluster, you must have a good understanding of your current
environment, your application, and your expectations for PowerHA 7.1.3. Building on this
information, you can develop an implementation plan that helps you to more easily integrate
PowerHA 7.1.3 into your environment, and more important, have PowerHA 7.1.3 manage
your application availability to your expectations.
PowerHA 7.1.3 can be configured to monitor server hardware, operating system, and
application components. In the event of a failure, PowerHA 7.1.3 can take corrective actions,
such as moving specified resources (service IP addresses, storage, and applications) to
surviving cluster components in order to restore application availability as quickly as possible.
Because PowerHA 7.1.3 is an extremely flexible product, designing a cluster to fit your
organization requires thorough planning. Knowing your application requirements and
behavior provides important input to your PowerHA 7.1.3 plan and will be primary factors in
determining the cluster design. Ask yourself the following questions while developing your
cluster design:
Which application services are required to be highly available?
What are the service level requirements for these application services (24/7, 8/5) and how
quickly must service be restored if a failure occurs?
What are the potential points of failure in the environment and how can they be
addressed?
Which points of failure can be automatically detected by PowerHA 7.1.3 and which require
custom code to be written to trigger an event?
What is the skill level within the group implementing and maintaining the cluster?
Although the AIX system administrators are typically responsible for the implementation of
PowerHA 7.1.3, usually they cannot do it alone. A team consisting of the following
representatives should be assembled to assist with the PowerHA 7.1.3 planning; each will
play a role in the success of the cluster:
Network administrator
AIX system administrator
Database administrator
Application programmer
Support personnel
Application users
Using the concepts described in Chapter 1, Introduction to PowerHA SystemMirror for AIX
on page 3, begin the PowerHA 7.1.3 implementation by developing a detailed PowerHA 7.1.3
cluster configuration and implementation plan.
For ease of explanation, we use the planning of a simple two-node mutual takeover cluster as
an example. Sample planning worksheets are included as we work through this chapter so
you can see how the cluster planning is developed.
Both the cluster diagram and the paper planning worksheets provide a manual method of
recording your cluster information. A set of planning worksheets is in Appendix A, Paper
planning worksheets on page 489.
The starting configuration shows this information:
Each application resides on a separate node (server).
Clients access each application over a dedicated Ethernet connection on each server.
Each node is roughly the same size in terms of CPU and memory, and each has additional
spare capacity.
Each node has redundant power supplies and mirrored internal disks.
The applications reside on external SAN disk.
The applications each have their own robust start and stop scripts.
There is a monitoring tool to verify the health of each application.
AIX 7.1 is already installed.
Important: Each application to be integrated into the cluster must run in stand-alone
mode. You also must be able to fully control the application (start, stop, and validation test).
The intention is to make use of the two nodes in a mutual takeover configuration where app1
normally resides on Node01, and app2 normally resides on Node02. In the event of a failure,
we want both applications to run on the surviving server. As you can see from the diagram,
we need to prepare the environment to allow each node to run both applications.
Note: Each application to be integrated into the cluster must be able to run in stand-alone
mode on any node that it might have to run on (under both normal and fallover situations).
3.2.6 Initial cluster design
Now that you understand the current environment, PowerHA 7.1.3 concepts, and your
expectations for the cluster, you can begin the cluster design.
This is a good time to create a diagram of the PowerHA 7.1.3 cluster. Start simply and
gradually increase the level of details as you go through the planning process. The diagram
can help identify single points of failure, application requirements, and guide you along the
planning process.
Also use the paper or online planning worksheets to record the configuration and cluster
details as you go.
Figure 3-2 illustrates the initial cluster diagram used in our example. At this point, the focus is
on high level cluster functionality. Cluster details are developed as we move through the
planning phase.
We begin to make design decisions for the cluster topology and behavior based on our
requirements. For example, based on our requirements, the initial cluster design for our
example includes the following considerations:
The cluster is a two-node mutual takeover cluster.
Although host names can be used as cluster node names, we choose to specify cluster
node names instead.
Each node contains one application but is capable of running both (consider network,
storage, memory, CPU, software).
Each node has one logical Ethernet interface that is protected using Shared Ethernet
Adapter (SEA) in a Virtual I/O Server (VIOS).
IP Address Takeover (IPAT) using aliasing.
Each node has a persistent IP address (an IP alias that is always available while the node
is up) and one service IP (aliased to one of the adapters under PowerHA 7.1.3 control).
The base Ethernet adapter addresses are on separate subnets.
Shared disks are virtual SCSI devices provided by a VIOS and reside on a SAN and are
available on both nodes.
All volume groups on the shared disks are created in Enhanced Concurrent Mode (ECM)
as required in PowerHA v7.1.3.
Each node has enough CPU and memory resources to run both applications.
Each node has redundant hardware and mirrored internal disks.
AIX 7.1 TL3 SP1 is installed.
PowerHA 7.1.3 SP1 is used.
This list captures the basic components of the cluster design. Each item will be investigated in
further detail as we progress through the planning stage.
These planning tables describe a simple two-node PowerHA 7.1.3 mutual takeover cluster
that uses IPAT through aliasing.
3.3 Planning cluster hardware
Cluster design starts by determining how many and what type of nodes are required. This
depends largely on a couple of factors:
The amount of resources required by each application
The fallover behavior of the cluster
PowerHA 7.1.3 supports virtually any AIX supported node, from desktop systems to high end
servers. When choosing a type of node, consider this information:
Ensure that sufficient CPU and memory resources are available on all nodes to allow the
system to behave as you want it to in a fallover situation. The CPU and memory resources
must be capable of sustaining the selected applications during fallover, otherwise clients
might experience performance problems. If you are using LPARs, you might want to use
the DLPAR capabilities to increase resources during fallover. If you are using stand-alone
servers, you do not have this option and so you might have to look at using a standby
server.
Make use of highly available hardware and redundant components where possible in each
server. For example, use redundant power supplies and connect them to separate power
sources.
Protect each node's rootvg (local operating system copy) through the use of mirroring or
RAID.
Allocate at least two Ethernet adapters per node and connect them to separate switches
to protect from a single adapter or switch failure. Commonly this is done using a single or
dual Virtual I/O Server.
Allocate two SAN adapters per node to protect from a single SAN adapter failure.
Commonly this is done using a single or dual Virtual I/O Server.
Although not mandatory, we suggest you use cluster nodes with similar hardware
configurations so that you can more easily distribute the resources and perform administrative
operations. That is, do not try to fallover from a high-end server to a desktop model and
expect everything to work properly; be thoughtful in your choice of nodes.
Tip: For a list of supported devices by PowerHA, find the Hardware Support Matrix in the
following web page:
http://ibm.co/1xaX3uX
The hardware worksheet also records two SAN switches (IBM 2005-B16, zoned for the
Virtual I/O Server), and the host bus adapters (HBAs) and storage.
3.4.2 Virtual Ethernet and vSCSI support
PowerHA supports the use of virtualization provided by PowerVM. For more information about
Virtual I/O Server support, see Chapter 9, PowerHA and PowerVM on page 347.
cluster.man.en_US.es.data
cluster.msg.Ja_JP.assist
cluster.msg.Ja_JP.es
cluster.msg.Ja_JP.es.client
cluster.msg.Ja_JP.es.server
cluster.msg.en_US.assist
cluster.msg.en_US.es
cluster.msg.en_US.es.client
cluster.msg.en_US.es.server
If using the installation media of PowerHA V7.1.3 Enterprise Edition, the following additional
file sets are available:
cluster.es.cgpprc
cluster.es.cgpprc.cmds
cluster.es.cgpprc.rte
cluster.es.genxd
cluster.es.genxd.cmds
cluster.es.genxd.rte
cluster.es.pprc
cluster.es.pprc.cmds
cluster.es.pprc.rte
cluster.es.spprc
cluster.es.spprc.cmds
cluster.es.spprc.rte
cluster.es.sr
cluster.es.sr.cmds
cluster.es.sr.rte
cluster.es.svcpprc
cluster.es.svcpprc.cmds
cluster.es.svcpprc.rte
cluster.es.tc
cluster.es.tc.cmds
cluster.es.tc.rte
cluster.msg.Ja_JP.cgpprc
cluster.msg.Ja_JP.genxd
cluster.msg.Ja_JP.glvm
cluster.msg.Ja_JP.pprc
cluster.msg.Ja_JP.sr
cluster.msg.Ja_JP.svcpprc
cluster.msg.Ja_JP.tc
cluster.msg.en_US.cgpprc
cluster.msg.en_US.genxd
cluster.msg.en_US.glvm
cluster.msg.en_US.pprc
cluster.msg.en_US.sr
cluster.msg.en_US.svcpprc
cluster.msg.en_US.tc
/etc/hosts
The cluster event scripts use the /etc/hosts file for name resolution. All cluster node IP
interfaces must be added to this file on each node. PowerHA 7.1.3 can modify this file to
ensure that all nodes have the necessary information in their /etc/hosts file, for proper
PowerHA 7.1.3 operations.
If you delete service IP labels from the cluster configuration by using SMIT, we suggest that
you also remove them from /etc/hosts.
/etc/inittab
The /etc/inittab file is modified in each of the following cases:
PowerHA 7.1.3 is installed:
The following lines are added when you initially install PowerHA 7.1.3. They start the
clcomd and clstrmgrES subsystems if they are not already running.
clcomd:23456789:once:/usr/bin/startsrc -s clcomd
hacmp:2:once:/usr/es/sbin/cluster/etc/rc.init >/dev/console 2>&1
Important: This PowerHA 7.1.3 entry is used to start the following daemons with the
startsrc command if they are not already running:
startsrc -s syslogd
startsrc -s snmpd
startsrc -s clstrmgrES
If PowerHA is set to start at system restart, add the following line to the /etc/inittab file:
hacmp6000:2:wait:/usr/es/sbin/cluster/etc/rc.cluster -boot -b -A # Bring up Cluster
Notes:
Although starting cluster services from the inittab file is possible, we suggest that
you do not use this option. The better approach is to manually control the starting of
PowerHA 7.1.3. For example, in the case of a node failure, investigate the cause of
the failure before restarting PowerHA 7.1.3 on the node.
ha_star is also found as an entry in the inittab file. It is delivered with the
bos.rte.control file set, not with PowerHA 7.1.3.
/etc/rc.net
The /etc/rc.net file is called by cfgmgr, which is the AIX utility that configures devices and
optionally installs device software into the system, to configure and start TCP/IP during the
boot process. It sets host name, default gateway, and static routes.
/etc/services
PowerHA 7.1.3 makes use of the following network ports for communication between cluster
nodes. These are all listed in the /etc/services file:
clinfo_deadman 6176/tcp
clinfo_client 6174/tcp
clsmuxpd 6270/tcp
clm_lkm 6150/tcp
clm_smux 6175/tcp
clcomd_caa 16191/tcp
emsvcs 6180/udp
cthags 12348/udp
Note: If you install PowerHA 7.1.3 Enterprise Edition for GLVM, the following entry for the
port number and connection protocol is automatically added to the /etc/services file in
each node on the local and remote sites on which you installed the software:
rpv 6192/tcp
/etc/snmpd.conf
The default version of the file for versions of AIX later than V5.1 is snmpdv3.conf.
The SNMP daemon reads the /etc/snmpd.conf configuration file when it starts and when a
refresh or kill -1 signal is issued. This file specifies the community names and associated
access privileges and views, hosts for trap notification, logging attributes, snmpd-specific
parameter configurations, and SNMP multiplexing (SMUX) configurations for the snmpd. The
PowerHA 7.1.3 installation process adds a clsmuxpd password to this file.
The following entry is added to the end of the file, to include the PowerHA 7.1.3 MIB,
supervised by the Cluster Manager:
smux 1.3.6.1.4.1.2.3.1.2.1.5 clsmuxpd_password # PowerHA SystemMirror clsmuxpd
/etc/snmpd.peers
The /etc/snmpd.peers file configures snmpd SMUX peers. During installation, PowerHA 7.1.3
adds the following entry to include the clsmuxpd password to this file:
clsmuxpd 1.3.6.1.4.1.2.3.1.2.1.5 "clsmuxpd_password" # PowerHA SystemMirror clsmuxpd
/etc/syslog.conf
The /etc/syslog.conf configuration file controls output of the syslogd daemon, which logs
system messages. During installation, PowerHA 7.1.3 adds entries to this file that direct the
output from problems related to PowerHA 7.1.3 to certain files.
CAA also adds a line as shown at the end of the following example:
# PowerHA SystemMirror Critical Messages
local0.crit /dev/console
# PowerHA SystemMirror Informational Messages
local0.info /var/hacmp/adm/cluster.log rotate size 1m files 8
# PowerHA SystemMirror Messages from Cluster Scripts
user.notice /var/hacmp/adm/cluster.log rotate size 1m files 8
# PowerHA SystemMirror Messages from Cluster Daemons
daemon.notice /var/hacmp/adm/cluster.log rotate size 1m files 8
/etc/trcfmt
The /etc/trcfmt file is the template file for the system trace logging and report utility, trcrpt.
The installation process adds PowerHA 7.1.3 tracing to the trace format file. PowerHA 7.1.3
tracing is performed for the clstrmgrES and clinfo daemons.
/var/spool/cron/crontabs/root
The PowerHA 7.1.3 installation process adds PowerHA 7.1.3 log file rotation to the
/var/spool/cron/crontabs/root file:
0 0 * * * /usr/es/sbin/cluster/utilities/clcycle 1>/dev/null 2>/dev/null # PowerHA SystemMirror Logfile rotation
Check with the application vendor to ensure that no issues (such as licensing) exist with the
use of PowerHA 7.1.3.
3.4.7 Licensing
The two aspects of licensing are as follows:
PowerHA 7.1.3 (features) licensing
Application licensing
PowerHA 7.1.3
PowerHA 7.1.3 licensing is based on the number of processors, where the number of
processors is the sum of the number of processors on which PowerHA 7.1.3 will be installed
or run. A PowerHA 7.1.3 license is required for each AIX instance. PowerHA 7.1.3 Enterprise
Edition licenses are handled in the same way.
Note: Micro-partition licensing for PowerHA 7.1.3 is not available. You must license by full
processors. For more information about licensing, see the following web page:
http://www-03.ibm.com/systems/power/software/availability/aix/faq_support.html
Applications
Some applications have specific licensing requirements, such as a unique license for each
processor that runs an application, which means that you must be sure that the application is
properly licensed to allow it to run on more than one system. To do this (license-protect an
application), vendors incorporate processor-specific information into the application when it is installed.
As a result, even though the PowerHA 7.1.3 software processes a node failure correctly, it
might be unable to restart the application on the fallover node because of a restriction on the
number of licenses for that application available within the cluster.
Important: To avoid this problem, be sure that you have a license for each system unit in
the cluster that might potentially run an application. Check with your application vendor for
any license issues for when you use PowerHA 7.1.3.
Another good practice is to allow approximately 100 MB of free space in /var and /tmp for
PowerHA 7.1.3 logs. The actual amount depends on the number of nodes in the cluster, which
affects the size of the messages stored in the various PowerHA 7.1.3 logs.
Time synchronization
Time synchronization is important between cluster nodes for both application and PowerHA
log issues. This is standard system administration practice, and we suggest that you use an
NTP server or another procedure to keep the cluster nodes' time in sync.
Note: Maintaining time synchronization between the nodes is especially useful for auditing
and debugging cluster problems.
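A minimal sketch of configuring the AIX NTP client follows; the server name is an example:
ntpdate ntp.example.com                        # one-time initial synchronization
echo "server ntp.example.com" >> /etc/ntp.conf
startsrc -s xntpd                              # start the NTP daemon
lssrc -s xntpd                                 # verify that it is active
To start xntpd automatically at boot, uncomment its entry in /etc/rc.tcpip.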
3.6.2 User administration
Most applications require user information to be consistent across the cluster nodes (user ID,
group membership, group ID) so that users can log in to surviving nodes without experiencing
problems.
This is particularly important in a fallover (takeover) situation. Application users must be able
to access the shared files from any required node in the cluster. This usually means that the
application-related user and group identifiers (UID and GID) must be the same on all nodes.
In preparation for a cluster configuration, be sure to consider and correct this, otherwise you
might experience service problems during a fallover.
After PowerHA 7.1.3 is installed, it contains facilities to let you manage AIX user and group
accounts across the cluster. It also provides a utility to authorize specified users to change
their own password across nodes in the cluster.
Attention: If you manage user accounts with a utility such as Network Information Service
(NIS), PSSP user management, or Distributed Computing Environment (DCE) Manager,
do not use PowerHA 7.1.3 user management. Using PowerHA 7.1.3 user management in
this environment might cause serious system inconsistencies in the user authentication
databases.
For more information about user administration, see 7.3.1, C-SPOC user and group
administration on page 247.
Before installation: If you prefer to control the GID of the hacmp group, we suggest that
you create the hacmp group before installing the PowerHA 7.1.3 file sets.
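For example, the group can be created as follows; the GID value is arbitrary, but it must be consistent across all cluster nodes:
mkgroup id=205 hacmp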
In addition to the ports identified in the /etc/services file, the following services also require
ports. However, these ports are selected randomly when the processes start. Currently, there
is no way to indicate specific ports, so be aware of their presence. Typical ports are shown for
illustration, but these ports can be altered if you need to do so:
#clstrmgr 870/udp
#clstrmgr 871/udp
#clinfo 32790/udp
Configuration_Files
The Configuration_Files collection is a container for the following essential system files:
/etc/hosts
/etc/services
/etc/snmpd.conf
/etc/snmpdv3.conf
/etc/rc.net
/etc/inetd.conf
/usr/es/sbin/cluster/netmon.cf
/usr/es/sbin/cluster/etc/clhosts
/usr/es/sbin/cluster/etc/rhosts
/usr/es/sbin/cluster/etc/clinfo.rc
You can alter the propagation options for this file collection, and you can add files to this file
collection and delete files from it.
HACMP_Files
The HACMP_Files collection is a container that typically holds user-configurable files of the
PowerHA 7.1.3 configuration such as application start and stop scripts, customized events,
and so on. This file collection cannot be removed or modified, and you cannot add files to or
delete files from it.
Example: For example, when you define an application server to PowerHA 7.1.3 (start,
stop and optionally monitoring scripts), PowerHA 7.1.3 will automatically include these files
into the HACMP_Files collection.
In a typical clustering environment, clients access the applications through a TCP/IP network
(usually Ethernet) using a service address. This service address is made highly available by
PowerHA 7.1.3 and moves between communication interfaces on the same network as
required. PowerHA 7.1.3 sends heartbeat packets between all communication interfaces
(adapters) in the network to determine the status of the adapters and nodes and takes
remedial actions as required.
To eliminate the TCP/IP protocol as a single point of failure and prevent cluster partitioning,
PowerHA 7.1.3 also uses non-IP networks for heartbeating. This assists PowerHA 7.1.3 with
identifying the failure boundary, such as a TCP/IP failure or a node failure.
Figure 3-3 provides an overview of the networks used in a cluster.
An Ethernet network is used for public access and has multiple adapters connected from
each node. This network will hold the base IP addresses, the persistent IP addresses, and
the service IP addresses. You can have more than one network; however, for simplicity, we
use only one.
The cluster repository is also shown. This provides another path of communications across
the disk. Multipath devices can be configured whenever there are multiple disk adapters in a
node, multiple storage adapters, or both.
PowerHA, through CAA, also can use the SAN HBAs for communications. This is often
referred to as SAN heartbeating, or sancomm. The device that enables it is sfwcomm.
All network connections are used by PowerHA 7.1.3 to monitor the status of the network,
adapters, and nodes in the cluster by default.
In our example, we plan for an Ethernet network and a repository disk, but not for sancomm.
Network connections
PowerHA 7.1.3 requires that each node in the cluster have at least one direct, non-routed
network connection with every other node. These network connections pass heartbeat
messages among the cluster nodes to determine the state of all cluster nodes, networks and
network interfaces.
PowerHA 7.1.3 also requires that all communication interfaces for a cluster network be
defined on the same physical network, be able to route packets to each other, and receive
responses from each other without interference by any network equipment.
Do not place intelligent switches, routers, or other network equipment that does not
transparently pass UDP broadcasts and other packets between cluster nodes.
Bridges, hubs, and other passive devices that do not modify the packet flow can be safely
placed between cluster nodes, and between nodes and clients.
Figure 3-4 illustrates a physical Ethernet configuration, showing dual Ethernet adapters on
each node connected across two switches but all configured in the same physical network
(VLAN). This is sometimes referred to as being in the same MAC collision domain.
EtherChannel
PowerHA 7.1.3 supports the use of EtherChannel (or Link Aggregation) for connection to an
Ethernet network. EtherChannel can be useful if you want to use several Ethernet adapters
for both extra network bandwidth and fallover, but also want to keep the PowerHA 7.1.3
configuration simple. With EtherChannel, you can specify the EtherChannel interface as the
communication interface. Any Ethernet failures, with the exception of the Ethernet network
itself, can be handled without PowerHA 7.1.3 being aware or involved.
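As a sketch, an EtherChannel with a backup adapter can be created from the command line as follows; the adapter names are examples, and the same configuration can also be done through smitty etherchannel:
mkdev -c adapter -s pseudo -t ibm_ech -a adapter_names=ent0 -a backup_adapter=ent1
The resulting pseudo adapter (ent2, for example) is then configured with the IP addresses and used as the PowerHA communication interface.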
Important:
The host name cannot be an alias in the /etc/hosts file.
The name resolution for the host name must work both ways (forward and reverse);
therefore, a limited set of characters can be used.
The IP address that belongs to the host name must be reachable on the server, even
when PowerHA is down.
The host name cannot be a service address.
The host name cannot be an address located on a network which is defined as private
in PowerHA.
The host name, the CAA node name, and the communication path to a node must be
the same.
By default, the PowerHA node name, the CAA node name, and the communication
path to a node are set to the same name.
The host name and the PowerHA node name can differ.
The rules leave the base addresses and the persistent address as candidates for the host
name. You can use the persistent address as the host name only if you set up the persistent
alias manually before you configure the cluster topology.
PowerHA 7.1.3, through CAA, now also offers the ability to change the cluster node host
name dynamically if or as needed. For more information about this capability, see Chapter 11
of the Guide to IBM PowerHA SystemMirror for AIX Version 7.1.3, SG24-8167.
/etc/hosts
An IP address and its associated label (name) must be present in the /etc/hosts file. We
suggest that you choose one of the cluster nodes to perform all changes to this file and then
use FTP or file collections to propagate the /etc/hosts file to the other nodes. However, in an
inactive cluster, the auto-corrective actions during cluster verification can at least keep the IPs
that are associated with the cluster in sync.
Note: Be sure that you test the direct and reverse name resolution on all nodes in the
cluster and the associated Hardware Management Consoles (HMCs). All these must
resolve names identically, otherwise you might run into security issues and other problems
related to name resolution.
IP aliases
An IP alias is an IP address configured onto a NIC in addition to the base IP address of the
NIC. The use of IP aliases is an AIX function that is supported by PowerHA 7.1.3. AIX
supports multiple IP aliases on a NIC, each on the same or different subnets.
Note: AIX allows IP aliases with different subnet masks to be configured for an interface.
However, PowerHA 7.1.3 uses the subnet mask of the base IP address for all IP aliases
configured on this network interface.
Persistent IP addresses (aliases)
A primary reason for using a persistent alias is to provide access to the node even when
PowerHA 7.1.3 services are down. This is a routable address and is available while the node is up. You
must configure this alias through PowerHA 7.1.3. When PowerHA 7.1.3 starts, it checks
whether the alias is available. If it is not, PowerHA 7.1.3 configures it on an available adapter
on the designated network. If the alias is available, PowerHA 7.1.3 leaves it alone.
Important: If the persistent IP address exists on the node, it must be an alias, not the base
address of an adapter.
Note: The persistent IP address will be assigned by PowerHA 7.1.3 to one communication
interface, which is part of a PowerHA 7.1.3 defined network.
Figure 3-5 illustrates the concept of the persistent address. Notice that this is
simply another IP address, configured on one of the base interfaces. The netstat command
shows it as an additional IP address on an adapter.
Figure 3-5 Persistent aliases
Subnetting
All the communication interfaces that are configured in the same PowerHA 7.1.3 network
must have the same subnet mask. Interfaces that belong to a different PowerHA 7.1.3
network can have either the same or different network mask.
Note: Unless you use a single adapter per network configuration, the base (or boot) IP
and the service IP cannot be on the same subnet.
If you link your default route to one of the base address subnets and that adapter fails, your
default route will be lost.
To prevent this situation, be sure to use a persistent address and link the default route to this
subnet. The persistent address will be active while the node is active and therefore so will the
default route.
If you choose not to do this, then you must create a post-event script to reestablish the default
route if this becomes an issue.
Note: Not all adapters must contain addresses that are routable outside the VLAN. Only
the service and persistent addresses must be routable. The base adapter addresses and
any aliases used for heartbeating do not need to be routed outside the VLAN because they
are not known to the client side.
Ensure that the switch provides a timely response to Address Resolution Protocol (ARP)
requests. For many brands of switches, this means turning off the following functions:
The spanning tree algorithm
portfast
uplinkfast
backbonefast
If having the spanning tree algorithm turned on is necessary, then the portfast function should
also be turned on.
Multicast
Although multicast is no longer mandatory in PowerHA v7.1.3, it still is a valid option that you
might want to use. To use multicast, see 12.1, Multicast considerations on page 422.
IPv6 address format
IPv6 increases the IP address size from 32 bits to 128 bits, thereby supporting more levels of
addressing hierarchy, a much greater number of addressable nodes, and simpler auto
configuration of addresses.
Figure 3-6 shows the basic format for global unicast IPv6 addresses: a 128-bit address
consisting of a subnet prefix followed by an interface ID.
For PowerHA, you can have your boot IPs configured to the link-local address if that is
suitable. However, for configurations involving sites, it is more suitable to configure
boot IPs with global unicast addresses that can communicate with each other. The benefit is
that you can have extra heartbeating paths, which helps prevent cluster partitions.
In general, automatic IPv6 addresses are suggested for unmanaged devices such as client
PCs and mobile devices. Manual IPv6 addresses are suggested for managed devices such
as servers.
For PowerHA, you are allowed to have either automatic or manual IPv6 addresses. However,
consider that automatic IPs are not guaranteed to persist. CAA requires the host
name to be labeled to a configured IP address, and also does not allow you to change the IPs when
the cluster services are active.
PowerHA allows you to mix different IP address families on the same adapter (for example,
IPv6 service label in the network with IPv4 boot, IPv4 persistent label in the network with IPv6
boot). However, the preferred practice is to use the same family as the underlying network for
simplifying planning and maintenance.
Figure 3-7 shows an example of this configuration.
To determine the IPv6 multicast address, a standard prefix of 0xFF05 is combined by using the
logical OR operator with the hexadecimal equivalent of the IPv4 address. For example, the
IPv4 multicast address is 228.8.16.129 or 0xE4081081. The transformation by the logical OR
operation with the standard prefix is 0xFF05:: | 0xE4081081. Thus, the resulting IPv6
multicast address is 0xFF05::E408:1081.
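This transformation can be reproduced with a small shell sketch, using the IPv4 multicast address from the example:
IPV4=228.8.16.129
printf 'FF05::%02X%02X:%02X%02X\n' $(echo $IPV4 | tr '.' ' ')   # prints FF05::E408:1081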
IPAT through aliasing is easy to implement and flexible. You can have multiple service
addresses on the same adapter at any one time, and some time can be saved during fallovers
because PowerHA 7.1.3 adds an alias rather than reconfiguring the base IP address of an
adapter.
When PowerHA 7.1.3 starts, it configures a service alias on top of existing base IP address of
an available adapter.
In a multiple interface per network configuration, using a persistent alias and including it in the
same subnet as your default route is common. This typically means that the persistent
address is included in the same subnet as the service addresses. The persistent alias can be
used to access the node when PowerHA 7.1.3 is down and also overcome the default route
issue.
You can configure a distribution preference for the placement of service IP labels that are
configured in PowerHA 7.1.3. The placement of the alias is configurable through SMIT menus
as follows:
Anti-collocation
This is the default, and PowerHA distributes the service IP labels across all boot IP
interfaces in the same PowerHA network on the node.
Collocation
PowerHA allocates all service IP addresses on the same boot IP interface.
Collocation with persistent label
PowerHA allocates all service IP addresses on the boot IP interface that is hosting the
persistent IP label. This can be useful in environments with VPN and firewall configuration,
where only one interface is granted external connectivity.
Collocation with source
Service labels are mapped using collocation preference. This choice will allow one service
label to be chosen as source for outgoing communication. The service label chosen in the
next field is the source address.
Anti-collocation with source
Service labels are mapped using the anti-collocation preference. If there are not enough
adapters, more than one service label can be placed on one adapter. This choice will allow
one label to be chosen as source address for outgoing communication.
For more information and examples of using service distribution policies, see 12.2,
Distribution preference for service IP aliases on page 425.
We suggest that you make the following entry in the /etc/netsvc.conf file to ensure that the
/etc/hosts file is read before a DNS lookup is attempted:
hosts = local, bind4
The round-trip time (rtt) value is shown in the output of the lscluster -i and lscluster -m
commands, together with the mean deviation in network rtt, which reflects the variability of
the round-trip time; both values are managed automatically by CAA.
Table 3-6 on page 103 and Table 3-7 on page 103 show the tunable values as they relate to
local nodes within a site, along with the additional values associated with a two-site linked
cluster. The RSCT column presents the tunables used in PowerHA before v7.1 for
comparison.
Node failure detection time: equals the time of the last network failure timeout (RSCT); Node
timeout + Delay (CAA).
For a two-node cluster, Table 3-6 and Table 3-7 translate as shown, first for RSCT and then
for CAA (the method PowerHA v7.1.3 uses):
To change the cluster heartbeat settings, modify the parameters for the PowerHA cluster from
the custom cluster configuration in the SMIT panel (Example 3-1). Enter smitty sysmirror,
and then select Custom Cluster Configuration → Cluster Nodes and Networks →
Manage the Cluster → Cluster heartbeat settings. Press Enter.
[Entry Fields]
* Node Failure Detection Timeout [0]
* Node Failure Detection Grace Period [0]
Note: Unlike previous versions, this setting is global across all networks.
When cluster sites are used, specifically linked sites, two more parameters are available:
Link failure detection timeout: This is the time (in seconds) that the health management layer
waits before declaring that the inter-site link failed. A link failure detection can cause the
cluster to switch to another link and continue the communication. If all the links fail, a site
failure is declared. The default is 30 seconds.
Site heartbeat cycle: This is a factor (1 - 10) that controls the heartbeat between the
sites.
Also, as with most changes, a cluster synchronization is required for the changes to take
effect. However, the change is dynamic, so a cluster restart is not required.
/usr/es/sbin/cluster/netmon.cf
If a virtualized network environment, such as one provided by VIOS, is being used for one or
more interfaces, PowerHA 7.1.3 can have difficulty accurately determining a particular
adapter failure. In these situations, using the netmon.cf file is important. For more information,
see 12.5, Understanding the netmon.cf file on page 439.
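For illustration, a minimal netmon.cf sketch for a virtualized interface follows; the interface
name and ping target are placeholders. The !REQD syntax names a target that must answer
for the interface to be considered up:
# /usr/es/sbin/cluster/netmon.cf (illustrative entry)
!REQD en0 192.168.100.1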
The first worksheet (Table 3-8) shows the specifications for the Ethernet network used in our
example.
The repository disk worksheet records hdisk2 on Node01 and hdisk2 on Node02.
After the networks are recorded, document the interfaces and IP addresses that are used by
PowerHA 7.1.3, as shown in Table 3-10.
COMMENTS: Each node contains two base adapters, each in its own subnet. Each node also
contains a persistent (node-bound) address and a service address. IPAT through aliases is
used.
When planning for a repository disk in a multi-site cluster solution, understand these
considerations:
Varied on: All shared disks must be zoned to any cluster nodes requiring access to the
specific volumes. That is, the shared disks must be able to be varied on and accessed by
any node that has to run a specific application.
Be sure to verify that shared volume groups can be manually varied on each node.
In a PowerHA 7.1.3 cluster, shared disks are connected to more than one cluster node. In a
non-concurrent configuration, only one node at a time owns the disks. If the owner node fails,
then to restore service to clients, another cluster node in the resource group node list
acquires ownership of the shared disks and restarts applications.
When working with a shared volume group, be sure to avoid these actions:
Do not include an internal disk in a shared volume group because it will not be accessible
by other nodes.
Do not activate (vary on) the shared volume groups in a PowerHA 7.1.3 cluster at system
boot. Ensure that the automatic varyon attribute in the AIX ODM is set to No for shared
volume groups that are part of resource groups. You can use the cluster verification utility to
change this attribute.
Important: If you define a volume group to PowerHA 7.1.3, do not manage it manually on
any node outside of PowerHA 7.1.3 while PowerHA 7.1.3 is running. This can lead to
unpredictable results. Always use C-SPOC to maintain the shared volume groups.
When the volume group is activated in enhanced concurrent mode, the LVM allows access to
the volume group on all nodes. However, it restricts the higher-level connections, such as JFS
mounts and NFS mounts, on all nodes, and allows them only on the node that currently owns
the volume group.
Note: Although you must define enhanced concurrent mode volume groups, this does not
necessarily mean that you will use them for concurrent access; for example, you can still
define and use these volume groups as normal shared file system access. However, you
must not define file systems on volume groups that are intended for concurrent access.
Most configurations are non-concurrent, enabling the fast disk takeover feature to occur.
The following operations are not allowed when a volume group is varied on in the passive
state:
Operations on file systems, such as mount
Any open or write operation on logical volumes
Synchronizing volume groups
Active mode is similar to a non-concurrent volume group being varied online with the
varyonvg command. It provides full read/write access to all logical volumes and file systems,
and it supports all LVM operations.
Passive mode is the LVM equivalent of disk fencing. Passive mode allows readability only of
the VGDA and the first 4 KB of each logical volume. It does not allow read/write access to file
systems or logical volumes. It also does not support LVM operations.
When a resource group that contains an enhanced concurrent volume group is brought
online, the volume group is first varied on in passive mode and then varied on in active mode.
The active mode state applies only to the node that currently owns the resource group.
When the owning/home node fails, the fallover node changes the volume group state from
passive mode to active mode through the LVM. This change takes approximately 10 seconds
and is at the volume group level. It can take longer with multiple volume groups with multiple
disks per volume group. However, the time impact is minimal compared to the previous
method of breaking SCSI reserves.
The active and passive mode flags of the varyonvg command are not documented because
they should not be used outside a PowerHA environment. However, you can find them in the
hacmp.out log.
Active mode varyon command:
varyonvg -n -c -A app2vg
Passive mode varyon command:
varyonvg -n -c -P app2vg
Important: Do not run these commands without cluster services running. Also, do not run
these commands unless directed to do so from IBM support.
To determine whether the volume group is online in active or passive mode, check the VG PERMISSION
field in the lsvg command output, as shown in Figure 3-8.
There are other distinguishing LVM status features that you will notice for volume groups that
are being used in a fast disk takeover configuration. For example, the volume group will show
online in concurrent mode on each active cluster member node by using the lspv command.
However, the lsvg -o command reports only the volume group online to the node that has it
varied on in active mode. An example of how passive mode status is reported is shown in
Figure 3-8.
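As a quick check with our example volume group, the VG PERMISSION field reports
read/write in active mode and passive-only in passive mode:
# Check the varyon mode of the volume group
lsvg app2vg | grep -i permission
VG PERMISSION:      passive-only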
When a non-concurrent style resource group is brought online, PowerHA checks one of the
volume group member disks to determine whether it is an enhanced concurrent volume
group. PowerHA determines this with the lqueryvg -p devicename -X command. A return
output of 0 (zero) indicates a regular non-concurrent volume group. A return output of 32
indicates an enhanced concurrent volume group.
In Figure 3-9, hdisk0 is a rootvg member disk that is non-concurrent. The hdisk6 instance is
an enhanced concurrent volume group member disk.
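Checking those two example disks directly would look similar to the following; the return
values follow the rule described above:
# 0 = regular non-concurrent VG member, 32 = enhanced concurrent VG member
lqueryvg -p hdisk0 -X
0
lqueryvg -p hdisk6 -X
32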
After you establish a highly available disk infrastructure, also consider the following items
when designing your shared volume groups:
All shared volume groups have unique logical volume and file system names. This
includes the jfs and jfs2 log files.
PowerHA 7.1.3 also supports JFS2 with INLINE logs.
Major numbers for each volume group are unique (especially if you plan to use NFS).
JFS2 encrypted file systems (EFS) are supported. For more information about using EFS
with PowerHA, see 8.5, Federated security for cluster-wide security management on
page 331.
Figure 3-10 shows the basic components in the external storage. Notice that all logical
volumes and file system names are unique, as is the major number for each volume group.
The data is made highly available through the use of SAN disk and redundant paths to the
devices.
The figure depicts enhanced concurrent volume groups app1vg (major number 90) and
app2vg (major number 91) on devices vpath0 and vpath1.
Document the shared volume groups and physical disks as shown in Table 3-11.
COMMENTS: All disks are seen by both nodes. app1vg normally resides on Node01; app2vg
normally resides on Node02.
Resource group C10RG1, volume group app1vg: major number = 90, log = app1vglog,
logical volume 1 = app1lv1, file system 1 = /app1 (20 GB).
Resource group C10RG2, volume group app2vg: major number = 91, log = app2vglog,
logical volume 1 = app2lv1, file system 1 = /app2 (20 GB).
COMMENTS: Create the shared volume groups by using C-SPOC after ensuring that PVIDs
exist on each node.
When planning for an application to be highly available, be sure that you understand the
resources required by the application and the location of these resources in the cluster. This
helps you provide a solution that allows them to be handled correctly by PowerHA 7.1.3 if a
node fails.
You must thoroughly understand how the application behaves in a single-node and multi-node
environment. Be sure that, as part of preparing the application for PowerHA 7.1.3, you test
the execution of the application manually on both nodes before turning it over to PowerHA
7.1.3 to manage. Do not make assumptions about the application's behavior under fallover
conditions.
Note: The key prerequisite to making an application highly available is that it first must run
correctly in stand-alone mode on each node on which it can reside.
Be sure that the application runs on all required nodes properly before configuring it to be
managed by PowerHA 7.1.3.
Tip: When you plan the application, remember that if the application requires any manual
intervention, it is not suitable for a PowerHA cluster.
Configure your application controller by creating a name to be used by PowerHA 7.1.3 and
associating a start script and a stop script.
After you create an application controller, associate it with a resource group (RG).
PowerHA 7.1.3 then uses this information to control the application.
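The same definition can also be made from the command line. A minimal clmgr sketch
follows; the controller name and script paths are illustrative only:
# Define an application controller with its start and stop scripts
clmgr add application_controller app1ctrl \
     STARTSCRIPT="/usr/local/ha/start_app1.sh" \
     STOPSCRIPT="/usr/local/ha/stop_app1.sh"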
When defining your custom monitoring method, consider the following points:
You can configure multiple application monitors, each with unique names, and associate
them with one or more application servers.
The monitor method must be an executable program, such as a shell script, that tests the
application and exits, returning an integer value that indicates the application's status. The
return value must be zero if the application is healthy, and must be a non-zero value if the
application failed.
PowerHA 7.1.3 does not pass arguments to the monitor method.
The monitoring method logs messages to the following monitor log file:
/var/hacmp/log/clappmond.application_name.resource_group_name.monitor.log
Also, by default, each time the application runs, the monitor log file is overwritten.
Do not overcomplicate the method. The monitor method is terminated if it does not return
within the specified polling interval.
Important: Because the monitoring process is time-sensitive, always test your monitor
method under different workloads to arrive at the best polling interval value.
For more information, see 7.7.9, Measuring application availability on page 319.
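For illustration, a minimal custom monitor method might look like the following ksh sketch;
the process name app1d is hypothetical:
#!/bin/ksh
# Custom application monitor: PowerHA passes no arguments and expects
# exit code 0 when the application is healthy, non-zero when it failed.
if ps -e | grep app1d > /dev/null 2>&1
then
    exit 0      # application process found: healthy
fi
exit 1          # application process missing: report failure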
Update the application worksheet to include all required information, as shown in Table 3-13.
APP1
VERIFICATION COMMANDS AND PROCEDURES: Run the following command and ensure
that APP1 is active. If not, send notification.
NODE REINTEGRATION: Must be reintegrated during a scheduled maintenance window to
minimize client disruption.
APP2
VERIFICATION COMMANDS AND PROCEDURES: Run the following command and ensure
that APP2 is active. If not, send notification.
NODE REINTEGRATION: Must be reintegrated during a scheduled maintenance window to
minimize client disruption.
Update the application monitoring worksheet to include all the information required for the
application monitoring tools (Table 3-14 on page 119).
APP1
Instance Count 1
Stabilization Interval 30
Restart Count 3
Restart Interval 95
APP2
Instance Count 1
Stabilization Interval 30
Restart Count 3
Restart Interval 95
After you decide what components to group into a resource group, plan the behavior of the
resource group.
Table 3-15 summarizes the basic startup, fallover, and fallback behaviors that you can
configure for resource groups in PowerHA 7.1.3.
Startup: Online on home node only (OHNO). Fallover: Fallover to next priority node in the list,
or Fallover using Dynamic Node Priority. Fallback: Never fall back, or Fall back to higher
priority node in the list.
Startup: Online using node distribution policy. Fallover: Fallover to next priority node in the
list, or Fallover using Dynamic Node Priority. Fallback: Never fall back.
Startup: Online on first available node (OFAN). Fallover: Fallover to next priority node in the
list, Fallover using Dynamic Node Priority, or Bring offline (on error node only). Fallback:
Never fall back, or Fall back to higher priority node in the list.
Startup: Online on all available nodes. Fallover: Bring offline (on error node only). Fallback:
Never fall back.
If the node that is starting is a home node for this resource group, the settling time period is
skipped and PowerHA 7.1.3 immediately attempts to acquire the resource group on this node.
Note: This is a cluster-wide setting and will be set for all OFAN resource groups.
If you decide to define dynamic node priority policies using RMC resource variables to
determine the fallover node for a resource group, consider the following points about dynamic
node priority policy:
It is most useful in a cluster where all nodes have equal processing power and memory.
It is irrelevant for clusters of fewer than three nodes.
It is irrelevant for concurrent resource groups.
Remember that selecting a takeover node also depends on conditions such as the availability
of a network interface on that node. For more information about configuring DNP with
PowerHA, see 10.3, Dynamic node priority (DNP) on page 388.
Although by default, all resource groups are processed in parallel, PowerHA 7.1.3 processes
dependent resource groups according to the order dictated by the dependency, and not
necessarily in parallel. Resource group dependencies are honored cluster-wide and override
any customization for serial order of processing of any resource groups included in the
dependency. Dependencies between resource groups offer a predictable and reliable way of
building clusters with multi-tiered applications.
For more information about resource group dependences, see 10.5, Resource group
dependencies on page 397.
Startup Policy: Online on Home Node Only (OHNO), for both resource groups.
Fallover Policy: Fallover to Next Priority Node in List (FONP), for both resource groups.
Fallback Policy: Fallback to Higher Priority Node (FBHP), for both resource groups.
Settling Time, Runtime Policies, Tape Resources, Miscellaneous Data: left blank in our
example.
Table 3-17 outlines a sample test plan that can be used to test our cluster.
The Cluster Test Tool uses the PowerHA 7.1.3 Cluster Communications daemon to
communicate between cluster nodes to protect the security of your PowerHA 7.1.3 cluster.
Full details about using the Cluster Test Tool, and details about the tests it can run, can be
found in 6.8, Cluster Test Tool on page 214.
If this is a new installation, allow time to configure and test the basic cluster. After the cluster
is configured and tested, you can integrate the required applications during a scheduled
maintenance window.
Referring back to Figure 3-1 on page 76, you can see that there is a preparation step before
installing PowerHA 7.1.3. This step is intended to ensure that the infrastructure is ready for
the cluster software.
The preparation step can take some time, depending on the complexity of your environment
and the number of resource groups and nodes to be used. Take your time preparing the
environment, because there is no purpose in trying to install PowerHA 7.1.3 in an environment
that is not ready; you will spend your time troubleshooting a poor installation. Remember that
a well-configured cluster is built upon a solid infrastructure.
After the cluster planning is complete and environment is prepared, the nodes are ready for
PowerHA 7.1.3 to be installed.
The installation of PowerHA 7.1.3 code is straightforward. If you use the installation CD, use
SMIT to install the required file sets. If you use a software repository, you can use NFS to
mount the directory and use SMIT to install from this directory. You can also install through
NIM.
Ensure you have licenses for any features you install, such as PowerHA 7.1.3 Enterprise
Edition.
After you install the required file sets on all cluster nodes, use the previously completed
planning worksheets to configure your cluster. Here you have a few tools available to use to
configure the cluster:
You can configure the PowerHA IBM Systems Director plug-in.
You can use an ASCII screen and SMIT to perform the configuration.
You can use the clmgr command line.
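As a rough sketch of the clmgr route, the initial topology steps might look like the following;
the cluster name, node names, and repository disk are placeholders, and the exact attribute
names can vary by service pack:
clmgr add cluster demo_cl NODES="node1,node2"     # define the cluster and its nodes
clmgr modify cluster demo_cl REPOSITORY=hdisk2    # assign the repository disk
clmgr sync cluster                                # verify and synchronize; creates the CAA cluster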
Note: When you configure the cluster, be sure to start by configuring the cluster topology.
This consists of the nodes, repository disk, and heartbeat type. After the cluster topology is
configured, verify and synchronize the cluster. This will create the CAA cluster.
After the topology is successfully verified and synchronized, start the cluster services and
verify that all is running as expected. This will allow you to identify any networking issues
before moving forward to configuring the cluster resources.
After you configure, verify, and synchronize the cluster, run the automated Cluster Test Tool to
validate cluster functionality. Review the results of the test tool and if it was successful, run
any custom tests you want to perform further verification.
After successful testing, take a mksysb of each node and a cluster snapshot from one of the
cluster nodes.
Standard change and problem management processes now apply to maintain application
availability.
The cluster snapshot does not save any user-customized scripts, applications, or other
non-PowerHA 7.1.3 configuration parameters. For example, the names of application servers and
the locations of their start and stop scripts are stored in the HACMPserver Configuration
Database object class. However, the scripts themselves and also any applications they might
call are not saved.
The cluster snapshot utility stores the data it saves in two separate files:
ODM data file (.odm):
This file contains all the data stored in the HACMP Configuration Database object classes
for the cluster. This file is given a user-defined base name with the .odm file extension.
Because the Configuration Database information is largely the same on every cluster
node, the cluster snapshot saves the values from only one node.
Cluster state information file (.info):
This file contains the output from standard AIX and PowerHA 7.1.3 commands. This file is
given the same user-defined base name with the .info file extension. By default, this file
no longer contains cluster log information. Note that you can specify in SMIT that
PowerHA 7.1.3 collect cluster logs in this file when the cluster snapshot is created.
For a complete backup, take a mksysb of each cluster node according to your standard
practices. Pick one node to perform a cluster snapshot and save the snapshot to a safe
location for disaster recovery purposes.
If you can, take the snapshot before taking the mksysb of the node so that it is included in the
system backup.
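As a hypothetical clmgr equivalent of the SMIT snapshot menu, where the snapshot name
and description are illustrative:
clmgr add snapshot pre_maintenance DESCRIPTION="Baseline after initial cluster testing"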
Important: You can take a snapshot from any node in the cluster, even if PowerHA 7.1.3 is
down. However, you can apply a snapshot to a cluster only if all nodes are running the
same version of PowerHA 7.1.3 and all are available.
We suggest that you maintain an accurate cluster diagram, which can be used for change and
problem management.
In addition, PowerHA 7.1.3 provides updates to the clmgr command to enable creating an
HTML-based report from the cluster.
The output can be generated for the whole cluster configuration or limited to special
configuration items such as these:
nodeinfo
rginfo
lvinfo
fsinfo
vginfo
dependencies
Figure 3-12 on page 130 shows the generated report. The report is far longer than depicted.
On a real report, you can scroll through the report page for further details.
Tip: For a full list of available options, use the clmgr built-in help:
clmgr view report -h
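For example, generating the full cluster report in HTML might look like the following; the
output path is illustrative:
clmgr view report cluster TYPE=html FILE=/tmp/cluster_report.html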
Effective change and problem management processes are imperative to maintaining cluster
availability. To be effective, you must have a current cluster configuration handy. You can use
the clmgr HTML tool to create an HTML version of the configuration and, as we also suggest,
a current cluster diagram.
Any changes to the cluster should be fully investigated as to their effect on the cluster
functionality. Even changes that do not directly affect PowerHA 7.1.3, such as the addition of
extra non-PowerHA 7.1.3 workload, can affect the cluster. The changes should be planned,
scheduled, documented, and then tested on a test cluster before ever implementing in
production.
To ease your implementation of changes to the cluster, PowerHA provides the Cluster Single
Point of Control (C-SPOC) SMIT menus. Whenever possible, use the C-SPOC menus to
make changes. With C-SPOC, you can make changes from one node and the change will be
propagated to the other cluster nodes.
Problems with the cluster should be quickly investigated and corrected. Because the primary
job of PowerHA 7.1.3 is to mask any errors from applications, it is quite possible that unless
you have monitoring tools in place, you might be unaware of a fallover. Ensure that you make
use of error notification to notify the appropriate staff of failures.
The cluster simulator provides a specific planning mode. In an initial step, planning mode
gathers all information related to host names, IP addresses, volume groups, file systems, and
services from real, running PowerHA nodes; in planning mode, the XML environment file
therefore contains real data.
To collect this real environment from the PowerHA node, the PowerHA Console must be
connected to the PowerHA nodes during this initial step. After the XML environment file
resulting from the collection is created, the PowerHA Console can work in a disconnected
fashion. The configuration that is displayed in the Console then reflects a real environment.
In this mode, as in offline mode, you use IBM Systems Director PowerHA Console to create,
display, change, or delete your PowerHA configuration, and save it into an XML configuration
file, with no possible risk to the production environments. In this mode, the XML configuration
file, which is the result of all actions you have done using the console, can be used in a real
PowerHA environment (for example, it can be deployed). This mode is useful to prepare or plan
a PowerHA configuration in a disconnected fashion. When your configuration is ready and
verified, the resulting XML configuration files (prepared with the console in planning mode)
can be deployed.
More details about the cluster simulator and options are in Chapter 6 of the Guide to IBM
PowerHA SystemMirror for AIX Version 7.1.3, SG24-8167.
We have found that tailoring these worksheets into a format that fits our environment is useful.
We also include planning sheets in Appendix A, Paper planning worksheets on page 489.
Note: Besides the cluster configuration, the planning phase should also provide a
cluster testing plan. Use this testing plan in the final implementation phase, and also
during periodic cluster validations.
Note: Although you can use the cluster automated test tool, we suggest that you also
perform a thorough manual testing of the cluster.
Before you decide which approach to use, make sure that you have done the necessary
planning, and that the documentation for your cluster is available for use. See Chapter 3,
Planning on page 73.
The clmgr command line utility and the IBM Systems Director PowerHA plug-in can also be
used for configuring a cluster. For more information about using clmgr, see Appendix B in IBM
PowerHA SystemMirror Standard Edition 7.1.1 for AIX Update, SG24-8030.
Regardless of which method is used, be sure that the /etc/cluster/rhosts file is populated
with the host name or IP address of each node in the cluster, and that clcomd is refreshed,
before beginning.
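For example, with our two example nodes, the file would contain one resolvable entry per
line, after which clcomd is refreshed:
cat /etc/cluster/rhosts
jessica
cassidy
# Refresh the cluster communications daemon to pick up the change
refresh -s clcomd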
We configure a typical two-node hot standby cluster using the standard method.
The following prerequisites, assumptions, and defaults apply for the Standard Configuration
Path:
PowerHA software must be installed on all nodes of the cluster.
All network interfaces must be completely configured at the operating system level. You
must be able to communicate from one node to each of the other nodes and vice versa.
The PowerHA discovery process runs on all cluster nodes, not just the local node.
When you use the standard configuration path and information that is required for
configuration resides on remote nodes, PowerHA automatically discovers the necessary
cluster information for you. Cluster discovery is run automatically while you use the
standard configuration path.
PowerHA assumes that all network interfaces on a physical network belong to the same
PowerHA network.
Host names are used as node names.
Using the options under the Custom Configuration menu, you can add the basic components
of a cluster to the PowerHA configuration database, and also other types of behaviors and
resources. Use the custom configuration path to customize cluster components, such as
policies and options, that are not included in the standard menu.
Use the custom configuration path if you plan to use any of the following options:
Custom Disk Methods
Custom Volume Group Methods
Custom File System Methods
Customize Resource Recovery
Customize Inter-Site Resource Group Recovery
Create User Defined Events
Modify Pre/Post Event commands
Remote Notification Warnings
Change Warning Notification time
Change System Events (rootvg)
Advanced method of Cluster Verification and Synchronization
When you use the standard configuration path, the node name and system host name are
expected to be the same. If you want them to differ, change them manually.
After you select the options and press Enter, the discovery process runs. This discovery
process automatically configures the networks so you do not have to do it manually.
[Entry Fields]
* Cluster Name [redbookdemocluster]
New Nodes (via selected communication paths) [cassidy]
+
Currently Configured Node(s) jessica
Figure 4-1 Add cluster and nodes
[Entry Fields]
* Cluster Name redbookdemocluster
* Heartbeat Mechanism Unicast
+
* Repository Disk [(00f6f5d015a4310b)] +
Cluster Multicast Address []
(Used only for multicast heartbeat)
Figure 4-2 Add repository and heartbeat method
In our example, we use unicast instead of multicast. For the repository disk, we highlight the
field and press F4 to see a list of all free shared disks between the two nodes. The disk size
must be at least 512 MB; however, PowerHA discovery does not verify that the size is
adequate.
We run smitty sysmirror, select Cluster Nodes and Networks → Verify and Synchronize
Cluster Configuration, and press Enter three times for synchronization to begin. This can
take several minutes while the CAA cluster is created.
NODE cassidy:
Network net_ether_01
cassidy_xd 192.168.150.52
Network net_ether_010
cassidy 192.168.100.52
NODE jessica:
Network net_ether_01
jessica_xd 192.168.150.51
Network net_ether_010
jessica 192.168.100.51
Same PVID: All cluster nodes must have the same PVID for each shared disk; otherwise,
you will not be able to create the LVM components.
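A quick sketch of this check, run on every node with our example disk name:
lspv | grep hdisk2            # the PVID (second column) must match on all nodes
chdev -l hdisk2 -a pv=yes     # assign a PVID first if the disk shows none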
We run smitty cspoc, select Storage → Volume Groups → Create a Volume Group,
choose both nodes, choose one or more disks from the pick list, and choose a volume group
type from the pick list. The final SMIT menu is displayed, as shown in Figure 4-3.
Notice the Resource Group Name field. This gives the option to automatically create the
resource group and put the volume group resource into the resource group.
Important: When you choose to create a new resource group from C-SPOC, the resource
group will be created with the following default policies:
Startup: Online on home node only
Fallover: Fallover to next priority node in the list
Fallback: Never fallback
Repeat this procedure for all volume groups that will be configured in the cluster.
The new logical volume, shawnlv, is created and the information is propagated to the other
cluster nodes. Repeat as needed for each logical volume.
Important: If logical volume type jfs2log is created, C-SPOC automatically runs the
logform command so that the type can be used.
To create a JFS2 file system on a previously defined logical volume, we use these steps:
1. Enter smitty cspoc and select Storage → File Systems.
2. Choose volume group from pop-up list (in our case demovg).
3. Choose the type of File System (Enhanced, Standard, Compressed, or Large File
Enabled).
4. Select the previously created logical volume, shawnlv, from the pick list.
5. Fill in the necessary fields, as shown in Example 4-5.
Example 4-5 C-SPOC creating jfs2 file system on an existing logical volume
Add an Enhanced Journaled File System on a Previously Defined Logical Volume
Important: File systems are not allowed on volume groups that are a resource in an
Online on All Available Nodes type resource group.
The /shawnfs file system is now created. The contents of /etc/filesystems on both nodes
are now updated with the correct jfs2log. If the resource group and volume group are online,
the file system is mounted automatically after creation.
Tip: With JFS2, you also have the option to use inline logs that can be configured from the
options in the previous example.
Make sure that the mount point name is unique across the cluster. Repeat this procedure as
needed for each file system.
To add an application controller, run smitty sysmirror, select Cluster Applications and
Resources → Resources → Configure User Applications (Scripts and Monitors) →
Application Controller Scripts → Add Application Controller Scripts, and then press
Enter.
[Entry Fields]
* Application Controller Name [bannerapp]
* Start Script [/usr/bin/banner start]
* Stop Script [/usr/bin/banner stop]
Application Monitor Name(s)
+
Application startup mode [background] +
Figure 4-4 Create application controller
In our case we do not have a real application so we use the banner command instead. Repeat
as needed for each application.
To create a service IP label, run smitty sysmirror, select Cluster Applications and
Resources → Resources → Configure Service IP Labels/Addresses → Add a
Service IP Label/Address, and then press Enter. Then choose the network from the list. The
final SMIT menu is displayed, as shown in Figure 4-5.
[Entry Fields]
* IP Label/Address dallasserv +
Netmask(IPv4)/Prefix Length(IPv6) []
* Network Name net_ether_01
Figure 4-5 Add a Service IP Label
For the IP Label/Address field, press F4; a pick list is generated from entries in the
/etc/hosts file that are not already defined to the cluster.
[Entry Fields]
* Resource Group Name [demoRG]
* Participating Nodes (Default Node Priority) [jessica cassidy] +
Complete the fields as shown in Figure 4-6. To complete the Participating Nodes field, enter
the information, separated by a space, or select from a pick list by first highlighting the field
and then pressing the F4 key.
For more information about policy options, see 2.4.10, Resource groups on page 46.
Run smitty sysmirror, select Custom Cluster Configuration → Verify and Synchronize
Cluster Configuration (Advanced), and then press Enter. This time, a menu of options is
displayed, as shown in Figure 4-8. Although most options are self-explanatory, one needs further
explanation: Automatically correct errors found during verification. This option is useful and
can be used only from this advanced path. It can correct certain problems automatically, or
you can have it run interactively so that it prompts for approval before correcting. For more
information about this option, see 7.6.5, Running automatically corrective actions during
verification on page 291.
[Entry Fields]
* Verify, Synchronize or Both [Both] +
* Include custom verification library checks [Yes] +
* Automatically correct errors found during [Interactively] +
verification?
After successful synchronization, you can start testing the cluster. For more information about
cluster testing, see 6.8, Cluster Test Tool on page 214.
Note: To run in Simulator Offline Mode, only the PowerHA SystemMirror Director Server
plug-in must be installed. Agent nodes are not needed.
Operating system: To run PowerHA plug-in for Systems Director, the minimum operating
system version is AIX 6.1 TL9 (or later) or AIX 7.1 TL3 (or later). For a managed server,
any operating system supported by Systems Director 6.3 is able to run the plug-in.
Note: To check all supported environments to run Systems Director 6.3, see Supported
IBM systems and products:
http://www.ibm.com/support/knowledgecenter/SSAV7B_633/com.ibm.director.plan.
helps.doc/fqm0_r_hardware_compatibility.html?cp=SSAV7B_633%2F2-3-0
Systems Director server: To support the cluster simulator feature, the minimum Director
server version is 6.3.2 (or later).
PowerHA SystemMirror: The minimum PowerHA version supported for the cluster
simulator feature is PowerHA SystemMirror 7.1.3.
After downloading and uncompressing the plug-in installation package, for AIX, Linux, or
Windows running Systems Director server, run the following binary that is in the package:
IBMSystemsDirector_PowerHA_sysmirror_Setup.bin
The installation proceeds as shown in Example 4-6 (on AIX 7.1 operating system).
Launching installer...
Graphical installers are not supported by the VM. The console mode will be used instead...
===============================================================================
Choose Locale...
----------------
1- Deutsch
->2- English
===============================================================================
Introduction
------------
It is strongly recommended that you quit all programs before continuing with
this installation.
Respond to each prompt to proceed to the next step in the installation. If you
want to change something on a previous step, type 'back'.
===============================================================================
===============================================================================
IBM Director Start
------------------
1- Yes
->2- No
ENTER THE NUMBER FOR YOUR CHOICE, OR PRESS <ENTER> TO ACCEPT THE DEFAULT:: 1
===============================================================================
Installing...
-------------
[==================|==================|==================|==================]
[------------------|------------------|------------------|------------------]
Thu Dec 19 10:07:50 CST 2013 PARMS: stop
Thu Dec 19 10:07:50 CST 2013 The lwi dir is: :/opt/ibm/director/lwi:
Thu Dec 19 10:07:50 CST 2013 localcp:
/opt/ibm/director/lwi/runtime/USMiData/eclipse/plugins/com.ibm.usmi.kernel.persistence_6.3.
3.jar:/opt/ibm/director/lwi/runtime/USMiMain/eclipse/plugins/com.ibm.director.core.kernel.n
l1_6.3.3.1.jar:/opt/ibm/director/lwi/runtime/USMiData/eclipse/plugins/com.ibm.usmi.kernel.p
ersistence.nl1_6.3.2.jar:/opt/ibm/director/bin/..//bin/pdata/pextensions.jar
Thu Dec 19 10:07:50 CST 2013 directorhome: /opt/ibm/director
Thu Dec 19 10:07:50 CST 2013 java_home: /opt/ibm/director/jre
Thu Dec 19 10:07:51 CST 2013 inscreenmessage STARTOFMESS --formatmessage- -shuttingdown-
-IBM Director- --
Thu Dec 19 10:07:51 CST 2013 startingvalue is shuttingdown
Thu Dec 19 10:07:51 CST 2013 shuttingdown IBM Director
Thu Dec 19 10:07:51 CST 2013 Calling lwistop
Thu Dec 19 10:08:22 CST 2013 lwistop complete
Thu Dec 19 10:08:22 CST 2013 starting wait for shutdown on lwipid:
Thu Dec 19 10:08:22 CST 2013 Running PID: ::
Thu Dec 19 10:16:27 CST 2013 PARMS: start
Thu Dec 19 10:16:27 CST 2013 The lwi dir is: :/opt/ibm/director/lwi:
Thu Dec 19 10:16:27 CST 2013 localcp:
/opt/ibm/director/lwi/runtime/USMiData/eclipse/plugins/com.ibm.usmi.kernel.persistence_6.3.
3.jar:/opt/ibm/director/lwi/runtime/USMiMain/eclipse/plugins/com.ibm.director.core.kernel.n
l1_6.3.3.1.jar:/opt/ibm/director/lwi/runtime/USMiData/eclipse/plugins/com.ibm.usmi.kernel.p
ersistence.nl1_6.3.2.jar:/opt/ibm/director/bin/..//bin/pdata/pextensions.jar
Thu Dec 19 10:16:27 CST 2013 directorhome: /opt/ibm/director
Thu Dec 19 10:16:27 CST 2013 java_home: /opt/ibm/director/jre
Thu Dec 19 10:16:32 CST 2013 inscreenmessage STARTOFMESS --formatmessage- -starting- -IBM
Director- --
Thu Dec 19 10:16:32 CST 2013 startingvalue is starting
Thu Dec 19 10:16:32 CST 2013 starting IBM Director
Thu Dec 19 10:16:32 CST 2013 inscreenmessage STARTOFMESS --formatmessage- -startingprocess-
- - --
Thu Dec 19 10:16:33 CST 2013 startingvalue is startingprocess
Thu Dec 19 10:16:33 CST 2013 startingprocess
Considering that the cluster nodes are only servers that are controlled by the managed
server, only two packages must be installed on them:
Systems Director Common Agent 6.3.3 (or newer)
PowerHA SystemMirror for Systems Director 7.1.3 (or newer)
Note: Remember that only the PowerHA plug-in starting from version 7.1.3 allows the
cluster simulator feature.
Both packages can be downloaded from the IBM Systems Director download page:
http://www-03.ibm.com/systems/director/downloads/plugins.html
Install the cluster.es.director.agent file set and common agent on each node in the
cluster.
Some subsystems are added as part of the installation: platform_agent and cimsys.
For more information about available options for using the IBM System Director PowerHA
plug-in, see Chapter 6 in the Guide to IBM PowerHA SystemMirror for AIX Version 7.1.3,
SG24-8167.
Chapter 5. Migration
This chapter covers the most common migration scenarios from PowerHA 6.1 and PowerHA
7.1.x to PowerHA 7.1.3.
Before the migration, always have a backout plan in case any problems are encountered.
Some general suggestions are as follows:
Create a backup of rootvg.
In most cases of upgrading PowerHA, updating or upgrading AIX is also required. So
always save your existing rootvg. Our preferred method is to create a clone by using
alt_disk_copy to another free disk on the system. That way, a simple change to the
bootlist and a reboot can easily return the system to the beginning state.
Other options are available, such as mksysb, alt_disk_install, and multibos.
Save the existing cluster configuration.
Create a cluster snapshot before the migration. By default it is stored in the following
directory; make a copy of it and also save a copy from the cluster nodes for additional
insurance.
/usr/es/sbin/cluster/snapshots
Save any user-provided scripts.
This most commonly refers to custom events, pre- and post-events, application controller,
and application monitoring scripts.
Verify, by using the lslpp -h cluster.* command, that the current version of PowerHA is in
the COMMIT state and not in the APPLY state. If not, run smit install_commit before you
install the most recent software version.
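For example, any fileset still in the APPLIED state shows an APPLY action in the fileset
history:
lslpp -h "cluster.*" | grep -i apply
# If any lines are returned, run smit install_commit first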
Software requirements
The software requirements are as follows:
AIX 6.1 TL9 SP1 (SP4 or later recommended)
AIX 7.1 TL3 SP1 (SP4 or later recommended)
Migrating from PowerHA SystemMirror 6.1 or earlier requires installing these AIX file sets:
bos.cluster.rte
bos.ahafs
bos.clvm.enh
devices.common.IBM.storfwork.rte
clic.rte (for secured encryption communication options of clcomd)
Optional: cas.agent (for Systems Director plug-in)
VIOS 2.2.0.1-FP24 SP01
PowerHA SystemMirror 7.1.3 (SP2 or later recommended)
http://www-01.ibm.com/support/knowledgecenter/SSPHQG_7.1.0/com.ibm.powerha.insg
d/ha_install_mig61_plan.htm
Hardware requirements
Use IBM systems that run IBM POWER5, POWER6, or POWER7 technology-based
processors.
In many cases, when migrating from a previous version of PowerHA, a disk heartbeat device
already exists. If that is the case, and it is not a data disk, and the size is at least 512 MB,
repurposing that disk as the cluster repository disk is possible.
Note: In our experience, even a three-node cluster does not use more than 500 MB; it
uses 448 MB of the repository disk. However, we suggest simply using a 1 GB disk for
clusters of up to four nodes, and adding 256 MB for each additional node.
Multicast or unicast
PowerHA v7.1.3 offers the choice of using either multicast or unicast for heartbeating. When
migrating from a version before v7 of PowerHA, unicast is used. You can continue to use
unicast. However, if you want to use multicast, ensure that your network both supports
multicasting and has it enabled. For more information, see 12.1, Multicast considerations on
page 422.
First four features: If your cluster is configured with any of the first four features in this list,
your environment cannot be migrated. You must either change or remove the features
before migrating, or simply remove the cluster and configure a new one with the new
version of PowerHA.
Although it is possible, you do not need to remove the non-IP networks from the configuration
before the migration. The migration process warns you and removes any identified non-IP
network.
Important: The Communication Path to a Node parameter for each PowerHA cluster node
must match the host name of that node. If not, you must reconfigure the cluster so that they
match. Also the cluster node host name cannot resolve to a persistent IP address. The
host name of the cluster nodes must resolve to a base IP.
With the introduction of PowerHA 7.1, you now use the features of CAA, introduced in AIX 6.1
TL6 and AIX 7.1. The migration process now has two main cluster components:
CAA
PowerHA
This process involves updating your existing PowerHA product and configuring the CAA
cluster component.
The clmigcheck command: The clmigcheck command automatically creates the CAA
cluster when it is run on the last node.
For a detailed explanation about the clmigcheck process, see 5.3, The clmigcheck
program on page 162.
3. Stage 3: Upgrading to PowerHA 7.1
After stage 2 is completed, you upgrade to PowerHA 7.1 on the node. Figure 5-1 on
page 158 shows the state of the cluster in the test environment after upgrading to
PowerHA 7.1 on one node.
Topology services are still active so that the newly migrated PowerHA 7.1 node can
communicate with the previous version, PowerHA 6.1. Although the CAA configuration is
completed, the CAA cluster is not yet created.
Figures 5-1 and 5-2 depict the software stack on each node at this stage: grpsvcs and
topsvcs still heartbeat between the nodes over the hardware, with CAA configured on the
migrated node and, after the next node is migrated, CAA heartbeating on both nodes.
At this stage, the clmigcheck process has run on the last node of the cluster. The CAA
cluster is now created and CAA has established communication with the other node.
However, PowerHA is still using the Topology Services (topsvcs) function because the
migration switchover to CAA is not yet completed.
Figure 5-3 Extract from the clstrmgr.debug file showing the migration protocol
The corresponding figure shows grpsvcs and topsvcs remaining on both nodes while the
heartbeating now runs through CAA on the hardware layer.
Figure 5-5 shows the services that are running after migration, including cthags.
Table 5-1 shows changes to the SRC subsystem before and after migration.
The clcomd daemon uses port 16191, and the clcomdES daemon uses port 6191. When
migration is complete, the clcomdES daemon is removed.
clcomd daemon: You can have two instances of the clcomd daemon in the cluster, but
never on a given node. After PowerHA 7.1 is installed on a node, the clcomd daemon is
run, and the clcomdES daemon does not exist. AIX 6.1.6.0 and later with a back-level
PowerHA version (before version 7.1) runs only the clcomdES daemon even though the
clcomd daemon exists.
clcomdES daemon: The clcomdES daemon is removed when the older PowerHA software
version is removed (snapshot migration) or overwritten by the new PowerHA 7.1 version
(rolling or offline migration).
After migration is complete, the following line is added to the /etc/syslog.conf file:
*.info /var/adm/ras/syslog.caa rotate size 1m files 10
Command profile: The clmigcheck command is not a PowerHA command, but the
command is part of bos.cluster and is in the /usr/sbin directory.
The clmigcheck program uses the mkcluster command and passes the cluster parameters
from the existing PowerHA cluster, along with the repository disk and multicast address (if
applicable). Figure 5-7 shows the mkcluster command being called.
A warning message is displayed for certain unsupported elements, such as disk heartbeat as
shown in Figure 5-9.
Non-IP networks can be dynamically removed during the migration process by using the
clmigcleanup command. However, other configurations, such as IPAT through replacement,
require manual steps to remove or change them to a supported configuration.
After the changes are made, run clmigcheck again to ensure that the error is resolved.
The second function of the clmigcheck program is to prepare the CAA cluster environment.
This function is performed when you select option 3 (Enter repository disk and IP addresses)
from the menu.
When you select this option, the clmigcheck program stores the information entered in the
/var/clmigcheck/clmigcheck.txt file. This file is also copied to the /var/clmigcheck
directory on all nodes in the cluster. This file contains the physical volume identifier (PVID) of
the repository disk and the chosen multicast address. If PowerHA is allowed to choose a
multicast address automatically, the NULL setting is specified in the file. Figure 5-10 shows
an example of the clmigcheck.txt file.
CLUSTER_TYPE:STANDARD
CLUSTER_REPOSITORY_DISK:000fe40120e16405
CLUSTER_MULTICAST:NULL
Figure 5-10 Contents of the clmigcheck.txt file
The clmigcheck command checks whether the clmigcheck.txt file exists. If the
clmigcheck.txt file exists and the node is not the last node in the cluster to be migrated, the
panel shown in Figure 5-11 is displayed. It contains a message indicating that you can now
upgrade to the later level of PowerHA.
clmigcheck: This is not the first node or last node clmigcheck was run on.
No further checking is required on this node. You can install the new
version of PowerHA SystemMirror.
-----------------------------------------------------------------------
Figure 5-11 The clmigcheck panel after it has been run once and before the PowerHA upgrade
The clmigcheck program checks the installed version of PowerHA to see if it was upgraded.
This step is important to determine which node is the last node to be upgraded in the cluster.
If it is the last node in the cluster, further configuration operations must be completed along
with creating and activating the CAA cluster.
Important: You must run the clmigcheck program before you upgrade PowerHA. Then
upgrade PowerHA one node at a time, and run the clmigcheck program on the next node
only after you complete the migration on the previous node.
ERROR: This program is intended for PowerHA configurations prior to version 7.1
The version currently installed appears to be: 7.1.2
Figure 5-12 clmigcheck panel after PowerHA has been installed on a node.
An extract from the /tmp/clmigcheck/clmigcheck.log file is shown, taken when
the clmigcheck command ran on the last node in a three-node cluster migration. This extract
shows the output of the clmigcheck program when checking whether this node is the last
node of the cluster.
ck_lastnode: oldnodes = 1
clmigcheck: This is the last node to run clmigcheck, create the CAA cluster
Extract from clmigcheck.log file showing the lslpp last node checking
From 6.1: update to SP14 first; then R, S, and O are all viable options to 7.1.x.
From 7.1.1: R, S, O, N(b), to either 7.1.2 or 7.1.3.
From 7.1.2: R, S, O, N(b), to 7.1.3.
a. R = Rolling, S = Snapshot, O = Offline, N = Nondisruptive.
b. This option is available only if the beginning AIX level is high enough to support the
newer version.
Important: Before migration, always start with the most recent service packs available for
PowerHA, AIX, and VIOS. After migration also apply the latest PowerHA service pack.
Example 5-1 shows that the cluster topology includes a disk heartbeat network. This type of
network is deprecated, and it is automatically removed when the last node starts cluster
services.
Demonstration: See the demonstration about a rolling migration from PowerHA v6.1 to
PowerHA v7.1.3:
https://www.youtube.com/watch?v=MaPxuK4poUw
Note: If the value of COMMUNICATION_PATH does not match the AIX host name
output, /usr/sbin/clmigcheck displays the following error message:
------------[ PowerHA System Mirror Migration Check ]-------------
This error means you must correct the environment before proceeding with the
migration.
HACMPnode:
name = "jessica"
object = "COMMUNICATION_PATH"
value = "jessica"
node_id = 1
node_handle = 1
version = 11
HACMPnode:
name = "jordan"
object = "COMMUNICATION_PATH"
value = "jordan"
node_id = 2
node_handle = 2
version = 11
#[jessica] hostname
jessica
#[jordan] hostname
jordan
Note: Most rolling migrations from PowerHA v6.1 will require an AIX update. If so,
perform that step now including any additional new prerequisite file sets as mentioned
in Software requirements on page 154.
<select 1>
1 = DEFAULT_MULTICAST
2 = USER_MULTICAST
3 = UNICAST
1 = 000262ca102db1a2(hdisk2)
2 = 000262ca34f7ecd9(hdisk5)
Note: The following warning message is always displayed when UNICAST is selected.
If a repository disk was assigned, you can ignore the message.
------------[ PowerHA System Mirror Migration Check ]-------------
Note - If you have not completed the input of repository disks and
multicast IP addresses, you will not be able to install PowerHA System
Mirror. Additional details for this session may be found in
/tmp/clmigcheck/clmigcheck.log.
6. Install all the PowerHA 7.1.3 file sets (use smitty update_all).
7. Start cluster services (smitty clstart).
The output of the lssrc -ls clstrmgrES command on node jessica is shown in
Example 5-5.
8. On jordan, stop cluster services with the Move Resource Groups option to jessica.
Note: If your environment requires updating AIX, perform that step now.
13.Verify that UNICAST is in place for CAA inter-node communications on jessica as shown
in Example 5-8.
----------------------------------------------------------------------------
Note: The lscluster -m output on the remote node shows the reverse unicast network
direction:
-----------------------------------------------------------------------
tcpsock->01 UP IPv4 none 9.19.51.194->9.19.51.193
-----------------------------------------------------------------------
14.Install all PowerHA 7.1.3 file sets on node jordan (use smitty update_all).
15.Start cluster services on node jordan (smitty clstart).
16.Verify that the cluster has completed the migration on both nodes as shown in
Example 5-9.
Note: Both nodes must show CLversion: 15, otherwise the migration did not complete
successfully. Call IBM support.
Notes:
The running AIX level for the following migration is AIX 7.1 TL3 SP0.
The running PowerHA Level is PowerHA 7.1.0 SP8.
Remember the requirements for PowerHA 7.1.3:
AIX 6.1 TL9 SP0
AIX 7.1 TL3 SP0
5. On jordan, stop cluster services with the Move Resource Groups option.
6. If your environment requires updating AIX, perform that step now.
7. Install all PowerHA 7.1.3 file sets (use smitty update_all).
8. Start cluster services on node jordan (smitty clstart).
9. Verify that the cluster has completed the migration on both nodes (Example 5-12).
Note: Both nodes must show CLversion: 15, otherwise, the migration did not complete
successfully. Call IBM support.
Demonstration: See the demonstration about snapshot migration from PowerHA v6.1.0 to
PowerHA v7.1.3 at the following web address:
https://www.youtube.com/watch?v=1pkaQVB8r88
Demonstration: See the demonstration about offline migration from PowerHA v6.1 to
PowerHA v7.1.3 at the following web address:
https://youtu.be/DWWjzTJwJpo
1. Stop cluster services on all nodes. Choose to bring resource groups offline.
2. Create a cluster snapshot if you have not previously created one and saved copies of it.
3. Upgrade AIX (if needed).
4. Install additional requisite file sets as listed in Software requirements on page 154.
5. Reboot.
6. Verify that clcomd is active:
lssrc -s clcomd
7. Update /etc/cluster/rhosts.
Enter either cluster node host names or IP addresses; only one per line.
Note: The AIX level for this migration is AIX 7.1 TL3 SP0. To use the nondisruptive option,
the AIX levels must already be at the supported levels that are required for the version of
PowerHA you are migrating to. PowerHA level is 7.1.2 SP3.
4. On node jordan (the second and last node to be migrated), stop cluster services (smitty
clstop) with the Unmanage Resource Groups option. After it completes, node jordan is
listed as Forced down, as shown in Example 5-16.
Note: Both nodes must show CLversion: 15, otherwise the migration has not
completed successfully. Call IBM support if necessary.
During our migration testing, we encountered this error (Example 5-19) on two separate
clusters. However, the cause in each case was not exactly the same.
The second time we encountered the error on our GLVM cluster, we discovered that neither
the node name nor the communication path resolved to the host name IP address, as shown
in Example 5-20.
HACMPnode:
name = "node2"
object = "COMMUNICATION_PATH"
value = "10.1.11.21"
node_id = 2
node_handle = 2
version = 11
Ultimately, to resolve this issue, the only required task is to make the node communication
path point to the host name IP address. However, in our scenario we did these steps:
1. Changed the node name to match the host name.
2. Changed the boot IP to be the host name IP.
Run smitty sysmirror and then select Cluster Nodes and Networks → Manage Nodes →
Change/Show a Node. The SMIT panel opens (Example 5-21).
[Entry Fields]
* Node Name node1
New Node Name [matt]
Communication Path to Node [10.1.10.21] +
Important: You must remove your resource group configuration before you can remove
the network.
2. Make a note of your resource group configuration first. Then, remove it by running smitty
sysmirror and then selecting Cluster Applications and Resources → Resource
Groups → Remove a Resource Group.
3. Remove the network. This also removes your network interface configuration. Run smitty
sysmirror and then select Cluster Nodes and Networks → Manage Networks and
Network Interfaces → Networks → Remove a Network.
4. Edit the /etc/hosts file on both nodes by changing the boot interface to match the node
name. Example 5-22 shows our before and after host table entries.
BEFORE:
#HACMP config
#boot addresses
10.1.10.21 mattbt
10.1.11.21 mattbt2
10.1.10.22 shawnbt
10.1.11.22 shawnbt2
#Service address
10.2.10.21 RG1svc
10.2.10.22 RG2svc
AFTER:
127.0.0.1 loopback localhost # loopback (lo0) name/address
9.175.210.77 mattpers
9.175.210.78 shawnpers
9.175.211.187 aix2.usceth.farn.uk.ibm.com
#PowerHA config
#boot addresses
10.1.10.21 matt
10.1.11.21 mattbt2
10.1.10.22 shawn
10.1.11.22 shawnbt2
#Service address
10.2.10.21 RG1svc
10.2.10.22 RG2svc
Note: Usually a complete stop, sync and verify, and restart of the cluster completes the
migration. If not, contact IBM support.
After the migration, the output of the cltopinfo command might continue to list the disk
heartbeat network as shown in Example 5-24.
NODE jessica:
Network shawn_dhb_01
jess_hdisk1 /dev/hdisk1
Network net_ether_01
ha_svc 192.168.100.100
jess_boot1 192.168.100.1
Network net_ether_02
jessica 9.19.51.193
NODE jordan:
Network shawn_dhb_01
jordan_hdisk1 /dev/hdisk1
Network net_ether_01
ha_svc 192.168.100.100
jordan_boot1 192.168.100.2
Network net_ether_02
jordan 9.19.51.194
COMMAND STATUS
Add a Network
Change/Show a Network
Remove a Network
+--------------------------------------------------------------------------+
| Select a Network to Remove |
| |
| Move cursor to desired item and press Enter. |
| |
| net_ether_01 (192.168.100.0/24) |
| shawn_dhb_01 |
| |
| F1=Help F2=Refresh F3=Cancel |
| F8=Image F10=Exit Enter=Do |
F1| /=Find n=Find Next |
9+--------------------------------------------------------------------------+
3. Synchronize your cluster by running smitty sysmirror and then selecting Custom
Cluster Configuration → Verify and Synchronize Cluster Configuration (Advanced).
4. Verify that the disk heartbeat was removed from cltopinfo output (Example 5-25).
NODE jessica:
Network net_ether_01
ha_svc 192.168.100.100
jess_boot1 192.168.100.1
Network net_ether_02
jessica 9.19.51.193
NODE jordan:
Network net_ether_01
ha_svc 192.168.100.100
jordan_boot1 192.168.100.2
Network net_ether_02
jordan 9.19.51.194
If you receive this error, copy the information from /usr/es/sbin/cluster/utilities from
another upgraded PowerHA 7.1.3 node and then run clmigcheck again.
In this chapter, we assume AIX best practices for troubleshooting, including monitoring the
error log. However, we do not cover how to determine what the problem is; we deal with
problems either after they are discovered or through preventive maintenance.
6.1.1 Scope
Change control is beyond the scope of the procedures that are documented in this book.
However, it encompasses several aspects and is not optional. Change control includes, but is
not limited to, these items:
Limit root access
Thoroughly documented and tested procedures
Proper planning and approval of all changes
Although many current PowerHA customers have a test cluster, or at least begin with one,
over time these cluster nodes are often put to use within the company in some form. Using
these systems then requires a scheduled maintenance window, much like the production
cluster. If that is the case, do not be fooled: it is no longer truly a test cluster.
A test cluster, ideally, is at least the same AIX, PowerHA, and application level as the
production cluster. The hardware should also be as similar as possible. In most cases, fully
mirroring the production environment is not practical, especially when there are multiple
production clusters. Several approaches exist to maximize a test cluster when multiple
clusters have varying levels of software.
Using logical partitioning (LPAR), Virtual I/O Servers (VIOS), and multiple rootvg images,
created by using alt_disk_install or multibos, is common practice. Virtualization allows a
test cluster to be created easily with few physical resources, even within the same
physical machine. With the multi-boot option, you can easily change cluster environments by
simply booting the partition from another image. This also allows testing of many software
procedures, such as these:
Applying AIX maintenance
Applying PowerHA fixes
Applying application maintenance
This type of test cluster requires at least one disk, per image, per LPAR. For example, if the
test cluster has two nodes and three different rootvg images, it requires a minimum of six hard
drives. This is still easier than having six separate nodes in three separate test clusters.
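For example, a minimal sketch of creating one such alternative rootvg image with
alt_disk_copy (hdisk9 is a hypothetical free disk; the command is part of the
bos.alt_disk_install.rte file set):
alt_disk_copy -d hdisk9    # clone the running rootvg to hdisk9 as altinst_rootvg
bootlist -m normal hdisk9  # boot from the clone at the next restart
shutdown -Fr               # reboot the LPAR into the alternative image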
A test cluster also allows testing of hardware maintenance procedures. These procedures
include, but are not limited to the following updates and replacement:
System firmware updates
Adapter firmware updates
Adapter replacement
Disk replacement
More testing can be accomplished by using the Cluster Test Tool and event emulation.
After PowerHA is installed, the cluster manager process (clstrmgrES) is always running,
regardless of whether the cluster is online. It can be in one of the following states as displayed
by running the lssrc -ls clstrmgrES command:
NOT_CONFIGURED The cluster is not configured or node is not synchronized.
ST_INIT The cluster is configured but not active on this node.
ST_STABLE The cluster services are running with resources online.
ST_JOINING The cluster node is joining the cluster.
ST_VOTING The cluster nodes are voting to decide event execution.
ST_RP_RUNNING The cluster is running a recovery program.
RP_FAILED A recovery program event script failed.
ST_BARRIER The clstrmgr process is between events, waiting at the barrier.
ST_CBARRIER The clstrmgr process is exiting a recovery program.
ST_UNSTABLE The cluster is unstable, usually due to an event error.
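For example, a quick way to check the current state on the local node (the output line
shown is illustrative):
lssrc -ls clstrmgrES | grep -i state
Current state: ST_STABLE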
Changes in the state of the cluster are referred to as cluster events. The Cluster Manager
monitors local hardware and software subsystems on each node for events such as an
application failure event. In response to such events, the Cluster Manager runs one or more
event scripts such as a restart application script. Cluster Managers running on all nodes
exchange messages to coordinate required actions in response to an event.
During maintenance periods, you might need to stop and start cluster services. But before
you do that, be sure to understand the node interactions that this causes and the impact on
your systems' availability. The cluster must be synchronized, and verification should detect
no errors. The following section briefly describes the processes themselves and then the
processing involved in startup or shutdown of these services. Later in this section, we
describe the procedures necessary to start or stop cluster services on a node.
As the root user, complete the following steps to start the cluster services on a node:
1. Run the SMIT fast path smitty clstart and press Enter. The Start Cluster Services panel
opens (see Figure 6-1 on page 193).
[Entry Fields]
* Start now, on system restart or both now +
Start Cluster Services on these nodes [Maddi,Patty] +
* Manage Resource Groups Automatically +
BROADCAST message at startup? true +
Startup Cluster Information Daemon? true +
Ignore verification errors? false +
Automatically correct errors found during Interactively +
cluster start?
The reason for this is directly related to what happens after a system failure. If a
resource-group-owning system crashes, and AIX is set to reboot after crash, it can
restart cluster services in the middle of a takeover that is already in progress. Depending
on the cluster configuration, this might cause resource group contention, resource group
processing errors, or even a fallback to occur, all of which can extend an outage.
However, during test and maintenance periods, and even on dedicated standby
nodes, using this option might be convenient.
Note: There are situations when choosing Interactively will correct some errors.
More details are in 7.6.5, Running automatically corrective actions during
verification on page 291.
After you complete the fields and press Enter, the system starts the cluster services on the
nodes specified, activating the cluster configuration that you defined. The time that it takes the
commands and scripts to run depends on your configuration (that is, the number of disks, the
number of interfaces to configure, the number of file systems to mount, and the number of
applications being started).
During the node_up event, resource groups are acquired. The time it takes to run each
node_up event is dependent on the resource processing during the event. The node_up
events for the joining nodes are processed sequentially.
When the command completes running and PowerHA cluster services are started on all
specified nodes, SMIT displays a command status window. Note that when the SMIT panel
indicates the completion of the cluster startup, event processing in most cases has not yet
completed. To verify the nodes are up you can use clstat or even tail on the hacmp.out file
on any node. More information about this is in 7.7.1, Cluster status checking utilities on
page 294.
[Entry Fields]
* Stop now, on system restart or both now +
Stop Cluster Services on these nodes [Maddi,Patty]+
BROADCAST cluster shutdown? true +
* Select an Action on Resource Groups Bring Resource Group>+
Understanding each of these actions is important, along with stopping and starting cluster
services, because they are often used during maintenance periods.
In the following topics, we assume that cluster services are running, the resource groups are
online, the applications are running, and the cluster is stable. If the cluster is not in the stable
state, then operations related to resource groups are not possible.
All three resource group options we describe can be done by using the clRGmove command.
However, in our examples, we use C-SPOC. They also all have similar SMIT panels and pick
lists. In an effort to streamline this documentation, we show only one SMIT panel in each of
the following sections.
2. Select a resource group from the list and press Enter. Another pick list is displayed (Select
an Online Node). The pick list contains only the nodes that are currently active in the
cluster and that currently are hosting the previously selected resource group.
3. Select an online node from the pick list and press Enter.
4. The final SMIT menu opens with the information that was selected in the previous pick
lists, as shown in Figure 6-4. Verify the entries you previously specified and then press
Enter to start the processing of the resource group to be brought offline.
[Entry Fields]
Resource Group to Bring Offline Maddi_rg
Node On Which to Bring Resource Group Offline Maddi
After processing is completed, the resource group is offline, but cluster services remain
active on the node. The standby node will not acquire the resource group.
This option is also available by using either the clRGinfo or clmgr command. For more
information about these commands, see the man pages.
Upon successful completion, PowerHA displays a message with the status, location, and
type of location of the resource group that was successfully started on the specified node.
This option is also available using either the clRGinfo or clmgr command. For more
information about these commands, see the man pages.
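As a sketch of the equivalent clmgr operations, using the resource group and node names
from our example:
clmgr offline resource_group Maddi_rg NODE=Maddi   # bring the resource group offline
clmgr online resource_group Maddi_rg NODE=Maddi    # bring it back online
clRGinfo                                           # confirm the resource group state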
+--------------------------------------------------------------------------+
| Select a Destination Node |
| |
| Move cursor to desired item and press Enter. |
| |
| jessica |
| shanley |
| |
| F1=Help F2=Refresh F3=Cancel |
| F8=Image F10=Exit Enter=Do |
F1| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
PowerHA also has the ability to move a resource group to another site. The concept is the
same as moving it between local nodes. For our example, we use the option to move to
another node rather than to another site.
[Entry Fields]
Resource Group(s) to be Moved xsiteGLVMRG
Destination Node shanley
4. Verify the entries that you previously specified and then press Enter to start moving the
resource group.
Upon successful completion, PowerHA displays a message with the status, location, and
type of location of the resource group that was successfully moved to the specified node, as
shown in Figure 6-7 on page 200.
[MORE...7]
Resource group xsiteGLVMRG is online on node shanley.
[BOTTOM]
This option is also available by using either the clRGinfo or clmgr command. For more
information about these commands, see the man pages.
Any time that a resource group is moved to another node, application monitoring for the
applications is suspended during the application stop. After the application has restarted on
the destination node, application monitoring will resume. Additional information can be found
in 6.3.4, Suspending and resuming application monitoring on page 200.
The monitoring remains suspended until either manually resumed or until the resource group
is stopped and restarted.
6.4 Scenarios
In this section, we cover the following common scenarios:
PCI hot-plug replacement of a NIC
Installing AIX and PowerHA fixes
Replacing an LVM mirrored disk
Application maintenance
Note: Although PowerHA continues to provide this facility, it is rarely needed now that
virtualization is primarily used.
Special considerations
Consider the following factors before you replace a PCI hot-pluggable network interface card:
You should manually record the IP address settings of the network interface being
replaced to prepare for unplanned failures.
Be aware that if a network interface you are hot-replacing is the only available keepalive
path on the node where it resides, you must shut down PowerHA on this node to prevent a
partitioned cluster while the interface is being replaced.
This situation is easily avoidable by having a working non-IP network between the cluster
nodes.
SMIT gives you the option of doing a graceful shutdown on this node. From this point, you
can manually hot-replace the network interface card.
Hot-replacement of Ethernet network interface cards is supported.
Do not attempt to change any configuration settings while hot-replacement is in progress.
The SMIT interface simplifies the process of replacing a PCI hot-pluggable network interface
card. PowerHA supports only one PCI hot-pluggable network interface card replacement
using SMIT at one time per node.
Note: If the network interface was alive before the replacement process began, then
between the initiation and completion of the hot-replacement, the interface being replaced
is in a maintenance mode. During this time, network connectivity monitoring is suspended
on the interface for the duration of the replacement process.
Go to the node on which you want to replace a hot-pluggable PCI network interface card and
use the following steps.
1. Run smitty cspoc and then select Communication Interfaces → PCI Hot Plug Replace
a Network Interface Card. Press Enter.
Tip: You can also get to this panel with the fast path smitty cl_pcihp.
SMIT displays a list of available PCI network interfaces that are hot-pluggable.
2. Select the network interface you want to hot-replace. Press Enter. The service address of
the PCI interface is moved to the available non-service interface.
3. SMIT prompts you to physically replace the network interface card. After you replace the
card, you confirm that the replacement occurred.
If you select Yes, the service address is moved back to the network interface that was
hot-replaced. On aliased networks, the service address does not move back to the
original network interface, but remains as an alias on the same network interface. The
hot-replacement is complete.
If you select No, you must manually reconfigure the interface settings to their original
values:
i. Run the drslot command to take the PCI slot out of the removed state.
ii. Run mkdev on the physical interface.
iii. Use ifconfig manually (rather than smitty chinet, cfgmgr, or mkdev) to avoid
configuring duplicate IP addresses or an unwanted boot address.
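A hypothetical sketch of those manual recovery steps, assuming adapter ent1 (interface en1)
in PCI slot P1-C3 and the boot address from our earlier example (the slot name, device
names, and address are all illustrative):
drslot -a -c pci -s P1-C3                         # take the PCI slot out of the removed state
mkdev -l ent1                                     # make the replaced adapter available again
ifconfig en1 10.1.10.21 netmask 255.255.255.0 up  # restore the original interface settings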
We begin again from the fast path of smitty cl_pcihp as in the previous scenario:
1. Select the network interface that you want to hot-replace and press Enter.
SMIT prompts you to physically replace the network interface card. After you replace it,
confirm that replacement occurred.
If you select Yes, the hot-replacement is complete.
If you select No, you must manually reconfigure the interface settings to their original
values:
i. Run the drslot command to take the PCI slot out of the removed state.
ii. Run mkdev on the physical interface.
iii. Use ifconfig manually (rather than smitty chinet, cfgmgr, or mkdev) to avoid
configuring duplicate IP addresses or an unwanted boot address.
Some AIX fixes can be loaded dynamically without a reboot. Kernel and device driver updates
often require a reboot because installing updates to them runs a bosboot. One way to
determine if a reboot is required is to check the .toc file that is created by using the inutoc
command before installing the fixes. The file contains file set information similar to
Example 6-1.
In the example, the file set bos.64bit requires a reboot, as indicated by the b character in the
fourth column. An N character indicates that a reboot is not required.
Note: Follow this same general rule for fixes to the application, along with any instructions
that are specific to the application.
The general procedure for applying AIX fixes that require a reboot is as follows:
1. Stop cluster services on standby node.
2. Apply, do not commit, TL or SP to the standby node (and reboot as needed).
3. Start cluster services on the standby node.
4. Stop cluster services on the production node by using the Move Resource Groups option
to move the resource groups to the standby machine.
5. Apply TL or SP to the primary node (and reboot as needed).
6. Start cluster services on the primary node.
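For the apply steps (2 and 5), a minimal sketch using installp, assuming the fixes are staged
in the hypothetical directory /tmp/fixes:
installp -agXd /tmp/fixes all   # apply with requisites; without -c, the updates are not committed
shutdown -Fr                    # reboot if the TL or SP requires it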
If you install either AIX or PowerHA fixes that do not require a reboot, using the Unmanage
Resource Groups option is now possible when stopping cluster services, as described in
6.2.3, Stopping cluster services on page 195. The general procedure for doing this for a
two-node hot-standby cluster is as follows:
1. Stop cluster services on standby by using the Unmanage option.
2. Apply, do not commit, SP to the standby node.
3. Start cluster services on the standby node.
4. Stop cluster services on the production node by using the Unmanage option.
5. Apply SP to the primary node.
6. Start cluster services on the primary node.
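A sketch of steps 1 and 3 using clmgr (the node name is illustrative; MANAGE=unmanage
leaves the resources running while cluster services stop):
clmgr stop node standby1 WHEN=now MANAGE=unmanage   # stop cluster services, leave resources as-is
clmgr start node standby1 WHEN=now MANAGE=auto      # restart cluster services after the update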
Important: Never unmanage more than one node at a time. Complete the procedures
thoroughly on one node before beginning on another node. Of course, be sure to test these
procedures in a test environment before ever attempting them in production.
6.4.3 Storage
Most shared storage environments today use some level of RAID for data protection and
redundancy. In those cases, individual disk failures normally do not require AIX LVM
maintenance to be performed. Any procedures required are often external to cluster nodes
and do not affect the cluster itself. However, if protection is provided by using LVM mirroring,
then LVM maintenance procedures are required.
To physically replace an existing disk, remove the old disk and insert the new one in its
place. This, of course, assumes that the drive is hot-plug replaceable, which is common.
Note: During the command execution, SMIT tells you the name of the recovery
directory to use should replacepv fail. Make note of this information as it is required in
the recovery process.
Configuration of the destination disk on all nodes in the resource group occurs at this time.
If a node in the resource group fails to import the updated volume group, you can use the
C-SPOC Import a Shared Volume Group facility as shown in Importing volume groups using
C-SPOC on page 262.
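For the replacepv step itself, a minimal sketch, with hdisk3 as the failed mirrored disk and
hdisk9 as its hypothetical replacement:
replacepv hdisk3 hdisk9   # move the contents and mirror copies of hdisk3 to hdisk9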
6.4.4 Applications
Each application varies; however, most application maintenance requires that the application
be brought offline. This can be done in several ways. The most appropriate method for any
particular environment depends on the overall cluster configuration.
It is also common to minimize the overall downtime of the application by performing the
application maintenance first on the non-production nodes for that application. Traditionally
this means on a standby node; however, it is not common for a backup/fallover node to truly
be a standby only. If it is not a true standby node, then any workload or applications currently
running on that node must be accounted for to minimize any adverse effects of installing the
maintenance. All of this should have been tested previously in a test cluster.
In most cases, stopping cluster services is not needed. You can bring the resource group
offline as described in 6.3.1, Bringing a resource group offline by using SMIT on page 196. If
the shared volume group must be online during the maintenance, you can suspend
application monitoring and run the application server stop script to bring the application
offline. However, this keeps the service IP address online, which might not be desirable.
In a multiple resource group or multiple application environment all running on the same
node, stopping cluster services on the local node might not be feasible. Be aware of the
possible effects of not stopping cluster services on the node in which application maintenance
is being performed.
If, during the maintenance period, the system encounters a catastrophic error that results in a
crash, a fallover will occur. This might be undesirable if the maintenance was not performed
on the fallover candidates first, or if the maintenance is incomplete on the local node.
Although this might be a rare occurrence, the possibility exists and must be understood.
Another possibility is that if another production node fails during this maintenance period, a
fallover can occur successfully on the local node without adverse effects. If this is not what
you want, and there are multiple resource groups, then you might want to move the other
resource groups to another node first and then stop cluster services on the local node.
If you use persistent addresses and you stop cluster services, local adapter swap protection
is no longer provided. Although again rare, the possibility then exists that, while you are using
the persistent address to do maintenance, the hosting NIC fails and your connection is
dropped.
After application maintenance, always test the cluster again. Depending on what actions you
selected to stop the application, you must then either restart cluster services, bring the
resource group back online through C-SPOC, or manually run the application start server
script and resume application monitoring as needed.
Beginning in PowerHA 7.1.3 SP1, clmgr was updated to allow CAA to be stopped on either a
node, cluster, or site level. These steps can be done on one node at a time or on the entire
cluster. Our scenarios show how to do it either way.
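For example, a minimal sketch of stopping cluster services and CAA across the entire
cluster, bringing the resource groups offline:
clmgr offline cluster WHEN=now MANAGE=offline STOP_CAA=yes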
Note: In some cases, this step might change the device numbering. This does not
cause a problem because PowerHA and CAA know the repository disk by the PVID.
However, also check the disk device attributes (such as reserve_policy, queue_depth,
and others) to be sure they are still what you want.
3. Start cluster services by running the following command on any node in the cluster:
clmgr online cluster WHEN=now MANAGE=auto START_CAA=yes
Important: If you use third-party storage multipathing device drivers, contact the vendor
for support assistance. Consult IBM only if you use native AIX MPIO.
The results of step 1 are shown in Example 6-2. Notice that CAA is inactive, and that the CAA
cluster and caavg_private no longer exist. This result is the same for all nodes in the cluster.
[cassidy:root] / # lspv
hdisk0 00f70c99013e28ca rootvg active
hdisk1 00f6f5d015a4310b None
hdisk2 00f6f5d015a44307 None
hdisk3 00f6f5d01660fbd1 None
hdisk4 00f6f5d0166106fa xsitevg
hdisk5 00f6f5d0166114f3 xsitevg
hdisk6 00f6f5d029906df4 xsitevg
hdisk7 00f6f5d0596beebf xsitevg
hdisk8 00f70c995a1bc94a None
After you do the maintenance that you want, restart the cluster services as shown in
Example 6-3. Notice that the CAA cluster and caavg_private are back and active.
Example 6-3 Starting cluster services cluster wide after maintenance performed
[cassidy:root] / # clmgr online cluster WHEN=now MANAGE=auto START_CAA=yes
[cassidy:root] / # lspv
hdisk0 00f70c99013e28ca rootvg active
hdisk1 00f6f5d015a4310b caavg_private active
hdisk2 00f6f5d015a44307 None
hdisk3 00f6f5d01660fbd1 None
hdisk4 00f6f5d0166106fa xsitevg concurrent
hdisk5 00f6f5d0166114f3 xsitevg concurrent
hdisk6 00f6f5d029906df4 xsitevg concurrent
hdisk7 00f6f5d0596beebf xsitevg concurrent
hdisk8 00f70c995a1bc94a None
Note: In some cases, this step might change the device numbering. This does not
cause a problem because PowerHA and CAA know the repository disk by the PVID.
However, also check the disk device attributes (such as reserve_policy, queue_depth,
and others) to be sure they are still what you want.
3. Start cluster services on the selected node by running the following command:
clmgr online node <nodename> WHEN=now MANAGE=auto START_CAA=yes
Important: If you use third-party storage multipathing device drivers, contact the
vendor for support assistance. Consult IBM support only if you use IBM device drivers.
The results of step 1 are shown in Example 6-4. Notice that CAA is inactive, and that the CAA
cluster and caavg_private no longer exist on node cassidy. This applies only to the individual
node in this case. Also, as shown, the cluster still exists and is active on node jessica.
[jessica:root] / # lspv
hdisk0 00f6f5d00146570c rootvg active
hdisk1 00f6f5d015a4310b caavg_private active
hdisk2 00f6f5d01660fbd1 amyvg
hdisk3 00f6f5d015a44307 amyvg
hdisk4 00f6f5d0166106fa xsitevg concurrent
hdisk5 00f6f5d0166114f3 xsitevg concurrent
hdisk6 00f6f5d029906df4 xsitevg concurrent
hdisk7 00f6f5d0596beebf xsitevg concurrent
Then, after performing the maintenance, restart the cluster services on node cassidy as
shown in Example 6-5. Notice that afterward the CAA cluster and caavg_private are back and
active.
Example 6-5 Starting cluster services on individual node after maintenance performed
[cassidy:root] / # clmgr start node cassidy WHEN=now MANAGE=auto START_CAA=yes
....
"cassidy" is now online.
[cassidy:root] / # lspv
hdisk0 00f70c99013e28ca rootvg active
hdisk1 00f6f5d015a4310b caavg_private active
hdisk2 00f6f5d015a44307 None
hdisk3 00f6f5d01660fbd1 None
hdisk4 00f6f5d0166106fa xsitevg concurrent
hdisk5 00f6f5d0166114f3 xsitevg concurrent
hdisk6 00f6f5d029906df4 xsitevg concurrent
hdisk7 00f6f5d0596beebf xsitevg concurrent
hdisk8 00f70c995a1bc94a None
[Entry Fields]
Site Name fortworth
* Repository Disk [00f61ab216646614] +
+--------------------------------------------------------------------------+
| Repository Disk |
| |
| Move cursor to desired item and press Enter. |
| |
| hdisk2 (00f61ab216646614) on all nodes at site fortworth |
| |
| F1=Help F2=Refresh F3=Cancel |
F1| F8=Image F10=Exit Enter=Do |
F5| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
Figure 6-8 Add a repository disk
3. Run smitty sysmirror, select Problem Determination Tools → Replace the Primary
Repository Disk, and then press Enter.
4. If sites are defined, you can select a site through the pop-up list. Otherwise, you are
directed to the last SMIT menu.
5. Select the new repository disk by pressing F4. See Figure 6-9 on page 213.
6. Synchronize the cluster.
This procedure of replacing a repository disk can also be accomplished by using the clmgr
command, as shown in Example 6-6. Of course, if you are not using sites, you can exclude
the site option from the syntax.
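As a minimal sketch of that clmgr form (hdisk9 is a hypothetical replacement disk; omit the
SITE option if sites are not used):
clmgr replace repository hdisk9 SITE=fortworth
clmgr sync cluster   # verify and synchronize afterward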
[Entry Fields]
Site Name fortworth
* Repository Disk [00f61ab216646614] +
+--------------------------------------------------------------------------+
| Repository Disk |
| |
| Move cursor to desired item and press Enter. |
| |
| 00f61ab216646614 |
| |
| F1=Help F2=Refresh F3=Cancel |
F1| F8=Image F10=Exit Enter=Do |
F5| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
Figure 6-9 Replace repository disk
Critical volume groups safeguard the Oracle RAC voting disks. PowerHA continuously
monitors the read-write accessibility of the voting disks. You can set up one of the following
recovery actions if you lose access to a volume group:
Notify only.
Halt the node.
Fence the node so that the node remains up but it cannot access the Oracle database.
Shut down cluster services and bring all resource groups offline.
Important: The critical volume groups and the Multi-Node Disk Heart Beat do not replace
the SAN-based disk heartbeat. These technologies are used for separate purposes.
Important: If you change the critical volume groups, verify and synchronize the cluster.
You can start a test, let it run unattended, and return later to evaluate the results of your
testing. You should run the utility under both low load and high load conditions to observe how
system load affects your PowerHA cluster.
You run the Cluster Test Tool from SMIT on one node in a PowerHA cluster. For testing
purposes, this node is referred to as the control node. From the control node, the tool runs a
series of specified tests, some on other cluster nodes, gathers information about the success
or failure of the tests processed, and stores this information in the Cluster Test Tool log file for
evaluation or future reference.
Important: If you uninstall PowerHA, the program removes any files that you might have
customized for the Cluster Test Tool. If you want to retain these files, copy them before you
uninstall PowerHA.
You create a custom test plan, a file that lists a series of tests to be run, to meet requirements
specific to your environment and apply that test plan to any number of clusters. You specify
the order in which tests run and the specific components to be tested. After you set up your
custom test environment, you run the test procedure from SMIT and view test results in SMIT
and in the Cluster Test Tool log file.
Individual tests can take approximately three minutes to run. The following conditions affect
the length of time to run the tests:
Cluster complexity.
Testing in complex environments takes considerably longer.
Network latency.
Cluster testing relies on network communication between the nodes. Any degradation in
network performance slows the performance of the Cluster Test Tool.
Use of verbose logging for the tool.
If you customize verbose logging to run additional commands from which to capture
output, testing takes longer to complete. In general, the more commands you add for
verbose logging, the longer a test procedure takes to complete.
Custom user-defined resources or events.
Manual intervention on the control node.
At some points in the test, you might need to intervene.
Running custom tests.
If you run a custom test plan, the number of tests run also affects the time required to run
the test procedure. If you run a long list of tests, or if any of the tests require a substantial
amount of time to complete, then the time to process the test plan increases.
6.8.3 Considerations
The Cluster Test Tool has several considerations. It does not support testing of the following
PowerHA cluster-related components:
Resource groups with dependencies
Replicated resources
In addition, the Cluster Test Tool might not recover from the following situations:
A node that fails unexpectedly, that is, a failure not initiated by testing.
The cluster does not stabilize.
The automated test procedure runs a predefined set of tests on a node that the tool randomly
selects. The tool ensures that the node selected for testing varies from one test to another.
You can run the automated test procedure on any PowerHA cluster that is not currently in
service.
One network is used to test a network becoming unavailable then available. The second
network provides network connectivity for the Cluster Test Tool. Both networks are tested, one
at a time.
The automated test procedure runs sets of predefined tests in this order:
1. General topology tests
2. Resource group tests on non-concurrent resource groups
The Cluster Test Tool discovers information about the cluster configuration and randomly
selects cluster components, such as nodes and networks, to be used in the testing.
Which nodes are used in testing varies from one test to another. The Cluster Test Tool can
select some nodes for the initial battery of tests and then, for subsequent tests, either
intentionally select the same nodes or choose from nodes on which no tests were run
previously. In general, the logic in the automated test sequence ensures that all components
are sufficiently tested in all necessary combinations.
The automated test procedure runs a node_up event at the beginning of the test to ensure
that all cluster nodes are up and available for testing.
The Cluster Test Tool uses the terminology for stopping cluster services that was used before
HACMP 5.4.
When the automated test procedure starts, the tool runs each of the following tests in the
order shown:
1. NODE_UP, ALL, Start cluster services on all available nodes
2. NODE_DOWN_GRACEFUL, node1, Stop cluster services gracefully on a node
3. NODE_UP, node1, Restart cluster services on the node that was stopped
4. NODE_DOWN_TAKEOVER, node2, Stop cluster services with takeover on a node
5. NODE_UP, node2, Restart cluster services on the node that was stopped
6. NODE_DOWN_FORCED, node3, Stop cluster services forced on a node
7. NODE_UP, node3, Restart cluster services on the node that was stopped
The Cluster Test Tool runs each of the following tests, in the order listed here, for each
resource group:
1. Bring a resource group offline and online on a node:
RG_OFFLINE, RG_ONLINE
2. Bring a local network down on a node to produce a resource group fallover:
NETWORK_DOWN_LOCAL, rg_owner, svc1_net, Selective fallover on local network
down
3. Recover the previously failed network:
NETWORK_UP_LOCAL, prev_rg_owner, svc1_net, Recover previously failed network
4. Move a resource group to another node:
RG_MOVE
5. Bring an application server down and recover from the application failure:
SERVER_DOWN, ANY, app1, /app/stop/script, Recover from application failure
Network tests
The tool runs tests for IP networks and for non-IP networks. For each IP network, the tool
runs these tests:
Bring a network down and up:
NETWORK_DOWN_GLOBAL, NETWORK_UP_GLOBAL
Fail a network interface, join a network interface. This test is run for the service interface
on the network. If no service interface is configured, the test uses a random interface
defined on the network:
FAIL_LABEL, JOIN_LABEL
Site-specific tests
If sites are present in the cluster, the tool runs tests for them. The automated testing
sequence that the Cluster Test Tool uses contains two site-specific tests:
auto_site: This sequence of tests runs if you have any cluster configuration with sites. For
example, this sequence is used for clusters with cross-site LVM mirroring configured that
does not use XD_data networks. The tests in this sequence include:
SITE_DOWN_GRACEFUL Stops the cluster services on all nodes in a site while taking
resources offline.
SITE_UP Restarts the cluster services on the nodes in a site.
SITE_DOWN_TAKEOVER Stops the cluster services on all nodes in a site and moves the
resources to nodes at another site.
SITE_UP Restarts the cluster services on the nodes at a site.
RG_MOVE_SITE Moves a resource group to a node at another site.
auto_site_isolation: This sequence of tests runs only if you configured sites and an
XD-type network. The tests in this sequence include:
SITE_ISOLATION Isolates sites by failing XD_data networks.
SITE_MERGE Merges sites by bringing up XD_data networks.
When the tool terminates the Cluster Manager on the control node, you most likely will need
to reactivate the node.
You create a custom test plan, a file that lists a series of tests to be run, to meet requirements
specific to your environment and apply that test plan to any number of clusters. You specify
the order in which tests run and the specific components to be tested. After you set up your
custom test environment, you run the test procedure from SMIT and view test results in SMIT
and in the Cluster Test Tool log file.
Your test procedure should bring each component offline then online, or cause a resource
group fallover, to ensure that the cluster recovers from each failure. Start your test by running
a node_up event on each cluster node to ensure that all cluster nodes are up and available for
testing.
Note: The Cluster Test Tool uses existing terminology for stopping cluster services as
follows:
Graceful = Bring Resource Groups Offline
Takeover = Move Resource Groups
Forced = Unmanage Resource Groups
When the Cluster Test Tool starts, it uses a variables file if you specified the location of one in
SMIT. If it does not locate a variables file, it uses values set in an environment variable. If a
value is not specified in an environment variable, it uses the value in the test plan. If the value
set in the test plan is not valid, the tool displays an error message.
Note: One of the success indicators for each test is that the cluster becomes stable.
Where:
The test name is in uppercase letters.
Parameters follow the test name.
Italic text indicates parameters expressed as variables.
Commas separate the test name from the parameters and the parameters from each
other. A space around the commas is also supported.
The syntax line shows parameters as parameter1 and parametern with n representing the
next parameter. Tests typically have 2 - 4 parameters.
The vertical bar, or pipe character (|), indicates parameters that are mutually exclusive
alternatives.
Optional: The comments part of the syntax is user-defined text that appears at the end of
the line. The Cluster Test Tool displays the text string when the Cluster Test Tool runs.
Node tests
The node tests start and stop cluster services on specified nodes.
The following command starts the cluster services on a specified node that is offline or on
all nodes that are offline:
NODE_UP, node | ALL, comments
Where:
node The name of a node on which cluster services start
ALL Any nodes that are offline have cluster services start
comments User-defined text to describe the configured test
Example
NODE_UP, node1, Bring up node1
Entrance criteria
Any node to be started is inactive.
Success indicators
The following conditions indicate success for this test:
The cluster becomes stable.
The cluster services successfully start on all specified nodes.
No resource group enters the error state.
No resource group moves from online to offline.
Entrance criteria
The specified node is active and has at least one active interface on the specified network.
Success indicators
The following conditions indicate success for this test:
The cluster becomes stable.
Cluster services continue to run on the cluster nodes where they were active before
the test.
Resource groups on other nodes remain in the same state; however, some might be
hosted on a different node.
If the node hosts a resource group for which the recovery method is set to notify, the
resource group does not move.
The following command brings up the specified network on all nodes that have interfaces
on the network. The specified network can be an IP network or a serial network.
NETWORK_UP_GLOBAL, network, comments
Where:
network The name of the network to which the interface is connected
comments User-defined text to describe the configured test
Example
NETWORK_UP_GLOBAL, hanet1, Bring up hanet1 on node 6
Entrance criteria
The specified network is active on at least one node.
Success indicators
The following conditions indicate success for this test:
The cluster becomes stable.
Cluster services continue to run on the cluster nodes where they were active before
the test.
Resource groups that are in the ERROR state on the specified node and that have a
service IP label available on the network can go online, but should not enter the
ERROR state.
Resource groups on other nodes remain in the same state.
The following command brings the specified network down on all nodes that have
interfaces on the network. The network specified can be an IP network or a serial network:
NETWORK_DOWN_GLOBAL, network, comments
Where:
network The name of the network to which the interface is connected
comments User-defined text to describe the configured test
Example
NETWORK_DOWN_GLOBAL, hanet1, Bring down hanet1 on node 6
Site tests
These tests are for the site.
The following command fails all the XD_data networks, causing the site_isolation event:
SITE_ISOLATION, comments
Where:
comments User-defined text to describe the configured test.
Example
SITE_ISOLATION, Fail all the XD_data networks
Entrance criteria
At least one XD_data network is configured and is up on any node in the cluster.
Success indicators
The following conditions indicate success for this test:
The XD_data network fails, no resource groups change state.
The cluster becomes stable.
The following command runs when at least one XD_data network is up to restore
connections between the sites, and remove site isolation. Run this test after running the
SITE_ISOLATION test:
SITE_MERGE, comments
Where:
comments User-defined text to describe the configured test.
Example
SITE_MERGE, Heal the XD_data networks
For the Cluster Test Tool to accurately assess the success or failure of a CLSTRMGR_KILL
test, do not do other activities in the cluster while the Cluster Test Tool is running.
Example
CLSTRMGR_KILL, node5, Bring down node5 hard
Entrance criteria
The specified node is active.
Success indicators
The following conditions indicate success for this test:
The cluster becomes stable.
Cluster services stop on the specified node.
Cluster services continue to run on other nodes.
Resource groups that were online on the node where the Cluster Manager fails move
to other nodes.
All resource groups on other nodes remain in the same state.
The following command generates a wait period for the Cluster Test Tool for a specified
number of seconds:
WAIT, seconds, comments
Where:
seconds Number of seconds that the Cluster Test Tool waits before
proceeding with processing
comments User-defined text to describe the configured test
Example
WAIT, 300, We need to wait for five minutes before the next test
Entrance criteria
Not applicable
Success indicators
Not applicable
It also includes a WAIT interval. The comment text at the end of the line describes the action
that the test performs.
Note: The tool stops running and issues an error if a test fails and Abort On Error is set
to Yes.
[Entry Fields]
* Test Plan [/cluster/custom] /
Variables File [/cluster/testvars] /
Verbose Logging [Yes] +
Cycle Log File [Yes] +
Abort On Error [No] +
Important: If you uninstall PowerHA, the program removes any files that you might have
customized for the Cluster Test Tool. If you want to retain these files, copy them before you
uninstall PowerHA.
Log files
If a test fails, the Cluster Test Tool collects information in the automatically created log files.
You evaluate the success or failure of tests by reviewing the contents of the Cluster Test Tool
log file, /var/hacmp/log/cl_testtool.log. PowerHA never deletes the files in this directory.
For each test plan that has any failures, the tool creates a new directory under
/var/hacmp/log/. If the test plan has no failures, the tool does not create a log directory. The
directory name is unique and consists of the name of the Cluster Test Tool plan file, and the
time stamp when the test plan was run.
Note: Detailed output from an automated cluster test is in Appendix C, Cluster Test Tool
log on page 531.
The tool also rotates the files: the oldest file is overwritten. If you do not want the tool to rotate
the log files, you can disable this feature from SMIT.
Highly available environments require special consideration when you plan changes to the
environment. Be sure to follow a strict change management discipline.
Before we describe cluster management in more detail, we emphasize the following general
preferred practices for cluster administration:
Where possible, use the PowerHA C-SPOC facility to make changes to the cluster.
Document routine operational procedures (for example, shutdown, startup, and increasing
the size of a file system).
Restrict access to the root password to trained PowerHA administrators.
Always take a snapshot of your existing configuration before making any changes.
Monitor your cluster regularly.
The C-SPOC function is provided through its own set of cluster administration commands,
accessible through SMIT menus. The commands are in the /usr/es/sbin/cluster/cspoc
directory. C-SPOC uses the Cluster Communications daemon (clcomdES) to run commands
on remote nodes. If this daemon is not running, the command might not be run and C-SPOC
operation might fail.
Note: After PowerHA is installed, clstrmgrES is started from inittab, so it is always running
whether cluster services are started or not.
C-SPOC operations fail if any target node is down at the time of execution or if the selected
resource is not available. It requires a correctly configured cluster in the sense that all nodes
within the cluster can communicate.
If node failure occurs during a C-SPOC operation, an error is displayed to the SMIT panel and
the error output is recorded in the C-SPOC log file (cspoc.log). Check this log if any C-SPOC
problem occurs. For more information about PowerHA logs, see 7.7.6, Log files on
page 305.
Always use full path names. Each file can be added to only one file collection, except those
files that are automatically added to the HACMP_Files collection. The files do not need to
exist on the remote nodes; PowerHA creates them during the first synchronization. However,
zero-length or nonexistent files are not propagated from the local node.
PowerHA creates a backup copy of the modified files during synchronization on all nodes.
These backups are stored in the /var/hacmp/filebackup directory. Only one previous version
is retained, and you can restore it only manually.
Important: You are responsible for ensuring that files on the local node (where you start
the propagation) are the most recent and are not corrupted.
Configuration_Files
This collection contains the essential AIX configuration files:
/etc/hosts
/etc/services
/etc/snmpd.conf
/etc/snmpdv3.conf
/etc/rc.net
/etc/inetd.conf
/usr/es/sbin/cluster/netmon.cf
/usr/es/sbin/cluster/etc/clhosts
/usr/es/sbin/cluster/etc/rhosts
/usr/es/sbin/cluster/etc/clinfo.rc
You can add to or remove files from these file collections. See Adding files to a file collection
on page 244 for more information.
Next, we look at an example of how this works. Our cluster has an application server,
app_server_1. It has the following three files:
A start script: /usr/app_scripts/app_start
A stop script: /usr/app_scripts/app_stop
A custom post-event script to the PowerHA node_up event:
/usr/app_scripts/post_node_up
These three files were automatically added to the HACMP_Files file collection when we
defined them during PowerHA configuration.
[Entry Fields]
File Collection Name HACMP_Files
New File Collection Name []
File Collection Description [User-defined scripts >
+--------------------------------------------------------------------------+
| Collection files |
| |
| The value for this entry field must be in the |
| range shown below. |
| Press Enter or Cancel to return to the entry field, |
| and enter the desired value. |
| |
| /tmp/app_scripts/app_start |
| /tmp/app_scripts/app_stop |
| |
| F1=Help F2=Refresh F3=Cancel |
F1| F8=Image F10=Exit Enter=Do |
F5| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
Note: You cannot add files to or remove them from this file collection. If you start using the
HACMP_Files collection, be sure that your scripts work as designed on all nodes.
If you do not want to synchronize all of your user-defined scripts or if they are not the same on
all nodes, then disable this file collection and create another one, which includes only the
required files.
[Entry Fields]
* File Collection Name [application_files]
File Collection Description [Application config fi>
Propagate files during cluster synchronization? yes +
Propagate files automatically when changes are det no +
ected?
[Entry Fields]
File Collection Name Configuration_Files
New File Collection Name []
File Collection Description [AIX and HACMP configu>
Propagate files during cluster synchronization? no +
Propagate files automatically when changes are det no +
ected?
Collection files +
[Entry Fields]
File Collection Name app_files
File Collection Description Application configura>
Propagate files during cluster synchronization? no
Propagate files automatically when changes are det no
ected?
Collection files
* New File [/usr/app/config_file]/
+--------------------------------------------------------------------------+
| Select one or more files to remove from this File Collection |
| |
| Move cursor to desired item and press F7. |
| ONE OR MORE items can be selected. |
| Press Enter AFTER making all selections. |
| |
| /usr/app/data.conf |
| /usr/app/app.conf |
| /usr/app/config_file |
| |
| F1=Help F2=Refresh F3=Cancel |
| F7=Select F8=Image F10=Exit |
| Enter=Do /=Find n=Find Next |
+--------------------------------------------------------------------------
Figure 7-4 Removing files from a file collection
Here are a couple of options to consider for user and password synchronization:
Using C-SPOC: PowerHA provides utilities in C-SPOC for easy user administration. See
7.3.1, C-SPOC user and group administration on page 247.
LDAP is the best solution for managing a large number of users in a complex environment.
LDAP can be set up to work together with PowerHA. For more information about LDAP,
see Understanding LDAP - Design and Implementation, SG24-4986.
Note: PowerHA C-SPOC does provide SMIT panels for configuring both LDAP servers
and clients. The fast path is smitty cl_ldap. However, we do not provide additional details
about that topic.
Adding a user
To add a user on all nodes in the cluster, follow these steps:
1. Start SMIT: Run smitty cspoc and then select Security and Users.
Or you can use the fast path by entering smitty cl_usergroup.
2. Select Users in a PowerHA SystemMirror Cluster.
3. Select Add a User to the Cluster.
4. Select either LOCAL(FILES) or LDAP.
5. Select the nodes on which you want to create users. If the Select Nodes by Resource
Group field is kept empty, the user will be created on all nodes in the cluster. If you select
a resource group here, the user will be created only on the subset of nodes on which that
resource group is configured to run. In the case of a two-node cluster, leave this field
blank.
If you have more than two nodes in your cluster, you can create users that are related to
specific resource groups. If you want to create a user for certain nodes only (for example, the
user can log in to node1 and node2, but is not allowed to log in to node3 or node4), select the
appropriate resource group name from the pick list. See Figure 7-6.
[Entry Fields]
Select nodes by Resource Group [] +
*** No selection means all nodes! ***
+-----------------------------------------------------------------------+
Select nodes by Resource Group
*** No selection means all nodes! ***
Move cursor to desired item and press Enter.
rg1
rg2
rg3
rg4
F1=Help F2=Refresh F3=Cancel
F1 F8=Image F10=Exit Enter=Do
F5 /=Find n=Find Next
F9+-----------------------------------------------------------------------+
Figure 7-6 Select nodes by resource group
Note: When you create a user's home directory, and if it is to reside on a shared file
system, C-SPOC does not check whether the file system is mounted or whether the volume
group is varied on. In this case, C-SPOC creates the user home directory under the empty
mount point of the shared file system. You can correct this by moving the home directory
under the shared file system.
If a user's home directory is on a shared file system, the user can log in only on the node
where the file system is mounted.
COMMAND STATUS
node1 root 0 /
node1 daemon 1 /etc
node1 bin 2 /bin
node1 sys 3 /usr/sys
node1 adm 4 /var/adm
node1 sshd 207 /var/empty
node1 sbodily 302 /home/sbodily
node1 killer 303 /home/killer
node1 alexm 305 /home/alexm
node2 root 0 /
node2 daemon 1 /etc
node2 bin 2 /bin
node2 sys 3 /usr/sys
node2 adm 4 /var/adm
node2 sshd 207 /var/empty
node2 sbodily 302 /home/sbodily
node2 killer 303 /home/killer
node2 alexm 305 /home/alexm
Removing a user
To remove a user, follow these steps:
1. Start C-SPOC Security and Users by entering smitty cl_usergroup and selecting Users
in a PowerHA SystemMirror Cluster → Remove a User from the Cluster.
2. Select either LOCAL(FILES) or LDAP from pop-up list.
3. Select the nodes from which you want to remove the user. If you leave the Select Nodes by
Resource Group field empty, the user is removed from all nodes.
If you select a resource group here, C-SPOC will remove the user from only the nodes that
belong to the specified resource group.
4. Enter the user name to remove or press F4 to select a user from the pick list.
5. For Remove AUTHENTICATION information, select Yes (the default) to delete the user
password and other authentication information. Select No to leave the user password in
the /etc/security/passwd file. See Figure 7-10 on page 252.
[Entry Fields]
Select nodes by resource group
*** No selection means all nodes! ***
Table 7-1 is a cross-reference of resource groups, nodes, and groups. It shows that a group
such as support is present on all nodes (leave the Select Nodes by Resource Group field
empty), while a group such as dbadmin is created only on node1 and node2 (select rg1 in the
Select Nodes by Resource Group field).
4. Create the group. See SMIT panel in Figure 7-11 on page 253. Supply the group name,
user list, and other relevant information just as when you create any normal group. Press
F4 for the list of the available users to include in the group.
You can specify the group ID here. However, if it is already used on a node, the command
will fail. If you leave the Group ID field blank, the group will be created with the first
available ID on all cluster nodes.
[Entry Fields]
Select nodes by resource group
*** No selection means all nodes! ***
[Entry Fields]
Select nodes by resource group
Removing a group
To remove a group from a cluster, follow these steps:
1. Start C-SPOC Security and Users by entering smitty cl_usergroup and selecting
Groups in a PowerHA SystemMirror Cluster → Remove a Group from the Cluster.
2. Select either LOCAL(FILES) or LDAP from pop-up menu list.
3. Select the nodes from which you want to remove the group. If you leave the Select Nodes
by Resource Group option empty, C-SPOC removes the selected group from all cluster
nodes. If you select a resource group here, C-SPOC removes the group from only the
nodes that belong to the specified resource group.
4. Enter the name of the group that you want to remove or press F4 to select it from the pick
list.
[Entry Fields]
* /bin/passwd utility is [Link to Cluster Passw> +
+-----------------------------------------------------------------------+
/bin/passwd utility is
Move cursor to desired item and press Enter.
Original AIX System Command
Link to Cluster Password Utility
F1=Help F2=Refresh F3=Cancel
F1 F8=Image F10=Exit Enter=Do
F5 /=Find n=Find Next
F9+-----------------------------------------------------------------------+
Figure 7-14 Modifying the system password utility
[Entry Fields]
Users allowed to change password [logan longr] +
cluster-wide
c. To modify the list of the users who are allowed to change their password cluster-wide,
press F4 and select the user names from the pop-up list. Choose ALL_USERS to
enable all current and future cluster users to use C-SPOC password management. See
Figure 7-16 on page 258.
We suggest that you include only real named users here, and manually change the
password for the technical users.
[Entry Fields]
Users allowed to change password [logan longr] +
+-----------------------------------------------------------------------+
Users allowed to change password
cluster-wide
Move cursor to desired item and press F7.
ONE OR MORE items can be selected.
Press Enter AFTER making all selections.
ALL_USERS
sshd
sbodily
killer
dbadm
dbuser
F1=Help F2=Refresh F3=Cancel
F1 F7=Select F8=Image F10=Exit
F5 Enter=Do /=Find n=Find Next
F9+-----------------------------------------------------------------------+
Figure 7-16 Selecting users allowed to change their password cluster-wide
Note: If you enable C-SPOC password utilities for all users in the cluster, but
you have users who only exist on one node, an error message occurs similar
to this example:
# passwd shane
Changing password for "shane"
shane's New password:
Enter the new password again:
node2: clpasswdremote: User shane does not exist on node node2
node2: cl_rsh had exit code = 1, see cspoc.log and/or clcomd.log for
more information
[Entry Fields]
Selection nodes by resource group
*** No selection means all nodes! ***
Tip: You can still use the AIX passwd command to change a specific user's password on all
nodes.
[Entry Fields]
Selection nodes by resource group [] +
*** No selection means all nodes! ***
Tip: You can use the passwd command to change your password on all nodes.
When you use C-SPOC, the command runs on the local node and propagates the changes to
the other cluster nodes where the operation is to be run.
If you use C-SPOC to make LVM changes within a PowerHA cluster, the changes are
propagated automatically to all nodes selected for the LVM operation.
Note: Ownership and permissions on logical volume devices are reset when a volume
group is exported and then reimported. After exporting and importing, a volume group is
owned by root:system. Some applications that use raw logical volumes might be affected
by this. Check the ownership and permissions before you export the volume group, and
restore them manually afterward if they were not the default root:system.
Instead of the export and import commands, you can use the importvg -L VGNAME HDISK
command on the remote nodes, but be aware that the -L option requires that the volume
group has not been exported on the remote nodes. The importvg -L command preserves the
ownership of the logical volume devices.
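For example, a sketch of refreshing a remote node's ODM for the shared volume group from
our examples (assumes xsitevg was never exported on that node; the disk name is the one
that holds the volume group there):
importvg -L xsitevg hdisk4   # re-learn the VG changes; device ownership is preserved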
Lazy update
In a cluster, PowerHA controls when volume groups are activated. PowerHA implements a
function called lazy update.
This function examines the volume group time stamp, which is maintained in both the volume
group's VGDA and the local ODM. AIX updates both of these time stamps whenever a change
is made to the volume group. When PowerHA is going to vary on a volume group, it compares
the copy of the time stamp in the local ODM with the one in the VGDA. If the values differ,
PowerHA causes the local ODM information about the volume group to be refreshed from the
information in the VGDA.
If a volume group under PowerHA control is updated directly (that is, without going through
C-SPOC), information of other nodes on that volume group will be updated when PowerHA
Note: Use C-SPOC to make LVM changes rather than relying on lazy update. C-SPOC will
import these changes to all nodes at the time of the C-SPOC operation unless a node is
powered off. Also consider using the C-SPOC CLI. See 7.4.6, C-SPOC command-line
interface (CLI) on page 283 for more information.
To use this feature, run smitty sysmirror, select Cluster Applications and Resources →
Resource Groups → Change/Show Resources and Attributes for a Resource Group.
Then, select the resource group and set the Automatically Import Volume Groups option to
true. This operation runs after you press Enter. It also automatically switches the setting back
to false, which prevents unwanted future imports until you specifically set the option again.
The following guidelines must be met for PowerHA to import available volume groups:
Logical volumes and file systems must have unique names cluster wide.
All physical disks must be known to AIX and have appropriate PVIDs assigned.
The physical disks on which the volume group resides are available to all of the nodes in
the resource group.
Note: To use this C-SPOC function, the volume group must belong to a resource group.
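For example, you can confirm the PVID guideline above from any node (a hedged sketch;
the hdisk number is illustrative):
# List disks on every cluster node; "none" in the PVID column means a
# PVID still must be assigned before the disk can be imported
clcmd lspv
# Assign a PVID to a disk that reports "none"
chdev -l hdisk6 -a pv=yes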
To increase the size of a shared LUN allocated to your cluster, use the following steps:
1. Verify that the volume group is active in concurrent mode on each node in the cluster.
2. Increase the size of the LUNs.
3. Run the cfgmgr command on each cluster node so that AIX detects the new size.
Note: This step might not be required because both VIOS and AIX are good at
automatically detecting the change. However, doing this step is a good practice.
4. Verify that the disk size is what you want by running the bootinfo -s hdisk# command.
5. Run the chvg -g vgname command on only the node that has the volume group in full
active, read/write mode.
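Assembled from these steps, a minimal command sketch looks like this (the volume group
and disk names are taken from the example that follows):
# Step 1: confirm that the volume group is online on every node
clcmd lsvg -o
# Step 3: rediscover the devices after growing the LUNs on the storage side
clcmd cfgmgr
# Step 4: confirm that AIX reports the new size, in MB
clcmd bootinfo -s hdisk6
# Step 5: update the volume group, only on the node that has it in
# full active, read/write mode
chvg -g xsitevg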
DVE example
In this scenario, we have two disks, hdisk6 and hdisk7, that are originally 30 GB each in size
as shown in Example 7-3. They are both members of the xsitevg volume group.
Demonstration: See the demonstration about DVE in an active PowerHA v7.1.3 cluster:
http://youtu.be/iUB7rUG1nkw
Example 7-3 Original disk sizes, in MB, as reported by bootinfo -s
# clcmd bootinfo -s hdisk6
-------------------------------
NODE cassidy
-------------------------------
30624
-------------------------------
NODE jessica
-------------------------------
30624
# clcmd bootinfo -s hdisk7
-------------------------------
NODE cassidy
-------------------------------
30624
-------------------------------
NODE jessica
-------------------------------
30624
We begin with the cluster active on both nodes and the volume group is online in concurrent,
albeit active/passive, mode as shown in Example 7-4 on page 265. We parsed out the other
irrelevant fields to more easily show the differences after the changes are made. Notice that
the volume group is in active full read/write mode on node jessica. Also, notice that the total
volume group size is approximately 122 GB.
-------------------------------
NODE cassidy
-------------------------------
hdisk4 00f6f5d0166106fa xsitevg concurrent
hdisk5 00f6f5d0166114f3 xsitevg concurrent
hdisk6 00f6f5d029906df4 xsitevg concurrent
hdisk7 00f6f5d0596beebf xsitevg concurrent
-------------------------------
NODE jessica
-------------------------------
hdisk4 00f6f5d0166106fa xsitevg concurrent
hdisk5 00f6f5d0166114f3 xsitevg concurrent
hdisk6 00f6f5d029906df4 xsitevg concurrent
hdisk7 00f6f5d0596beebf xsitevg concurrent
-------------------------------
NODE cassidy
-------------------------------
VOLUME GROUP: xsitevg VG IDENTIFIER: 0f6f5d000004c00000001466765fb16
VG STATE: active PP SIZE: 32 megabyte(s)
VG PERMISSION: passive-only TOTAL PPs: 3828 (122496 megabytes)
MAX LVs: 256 FREE PPs: 3762 (120384 megabytes)
Concurrent: Enhanced-Capable Auto-Concurrent: Disabled
VG Mode: Concurrent
-------------------------------
NODE jessica
-------------------------------
VOLUME GROUP: xsitevg VG IDENTIFIER: 0f6f5d000004c00000001466765fb16
VG STATE: active PP SIZE: 32 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 3828 (122496 megabytes)
MAX LVs: 256 FREE PPs: 3762 (120384 megabytes)
Concurrent: Enhanced-Capable Auto-Concurrent: Disabled
VG Mode: Concurrent
We provision more space onto the disks (LUNs) by adding 9 GB to hdisk6 and 7 GB to
hdisk7. Next, we run cfgmgr on both nodes. Then, we use bootinfo -s to verify that the new
sizes are being reported properly, as shown in Example 7-5.
Example 7-5 New disk sizes, in MB, after the LUNs are grown
# clcmd bootinfo -s hdisk6
-------------------------------
NODE cassidy
-------------------------------
39936
-------------------------------
NODE jessica
-------------------------------
39936
# clcmd bootinfo -s hdisk7
-------------------------------
NODE cassidy
-------------------------------
37888
-------------------------------
NODE jessica
-------------------------------
37888
Now we need to update the volume group to be aware of the new space. We do so by running
chvg -g xsitevg on node jessica, which has the volume group active. Then, we verify the
results of the new hdisk size and the new total space to the volume group as shown in
Example 7-6. Notice that hdisk6 is now reporting 39 GB, hdisk7 is 37 GB, and the total
volume group size is now 138 GB.
-------------------------------
NODE cassidy
-------------------------------
VOLUME GROUP: xsitevg VG IDENTIFIER: 0f6f5d000004c00000001466765fb16
VG STATE: active PP SIZE: 32 megabyte(s)
VG PERMISSION: passive-only TOTAL PPs: 4340 (138880 megabytes)
MAX LVs: 256 FREE PPs: 4274 (136768 megabytes)
LVs: 2 USED PPs: 66 (2112 megabytes)
Concurrent: Enhanced-Capable Auto-Concurrent: Disabled
VG Mode: Concurrent
-------------------------------
NODE jessica
-------------------------------
VOLUME GROUP: xsitevg VG IDENTIFIER: 0f6f5d000004c00000001466765fb16
VG STATE: active PP SIZE: 32 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 4340 (138880 megabytes)
MAX LVs: 256 FREE PPs: 4274 (136768 megabytes)
LVs: 2 USED PPs: 66 (2112 megabytes)
Concurrent: Enhanced-Capable Auto-Concurrent: Disabled
VG Mode: Concurrent
To select the LVM C-SPOC menu for logical volume management, run smitty cspoc and then
select Storage. The following menu options are available:
Volume Groups:
List All Volume Groups
Create a Volume Group
Create a Volume Group with Data Path Devices
Enable a Volume Group for Fast Disk Takeover or Concurrent Access
Set Characteristics of a Volume Group
Import a Volume Group
Mirror a Volume Group
Unmirror a Volume Group
Manage Critical Volume Groups
Synchronize LVM Mirrors
Synchronize a Volume Group Definition
Remove a Volume Group
Manage Mirror Pools for Volume Groups
Logical Volumes:
List All Logical Volumes by Volume Group
Add a Logical Volume
Show Characteristics of a Logical Volume
Set Characteristics of a Logical Volume
Change a Logical Volume
Remove a Logical Volume
File Systems:
List All File Systems by Volume Group
Add a File System
Change / Show Characteristics of a File System
Remove a File System
Physical Volumes:
Remove a Disk From the Cluster
Cluster Disk Replacement
Cluster Data Path Device Management
List all shared Physical Volumes
Change/Show Characteristics of a Physical Volume
Rename a Physical Volume
Show UUID for a Physical Volume
Manage Mirror Pools for Volume Groups
For more details about the specific tasks, see 7.4.5, Examples on page 268.
In our examples, we used a two-node cluster based on VIO clients, one Ethernet network
using IPAT through aliasing, and a disk heartbeat network for non-IP communication. The
storage is DS4k presented through VIO servers. Figure 7-20 shows our test cluster setup.
We add the enhanced concurrent capable volume group by using these steps:
1. Run smitty cspoc and then select Storage → Volume Groups → Create a Volume
Group.
2. Press F7, select nodes, and press Enter.
3. Press F7, select disk or disks, and press Enter.
4. Select a volume group type from the pick list.
As a result of the volume group type that we chose, we create a scalable volume group as
shown in Example 7-7. From here, if we also want to add this new volume group to a
resource group, we can either select an existing resource group from the pick list or we
can create a new resource group.
Important: When you choose to create a new resource group from the C-SPOC
Logical Volume Management menu, the resource group will be created with the
following default policies. After the group is created, you may change the policies in the
Resource Group Configuration:
Startup: Online On Home Node Only
Fallover: Fallover To Next Priority Node In The List
Fallback: Never Fallback
Before creating a shared volume group for the cluster using C-SPOC, we check that the
following conditions are true:
All disk devices are properly configured on all cluster nodes and the devices are listed as
available on all nodes.
Disks have a PVID.
We add the concurrent volume group and resource group by using these steps:
1. Run smitty cspoc and then select Storage → Volume Groups → Create a Volume
Group.
2. Press F7, select nodes, and then press Enter.
3. Press F7, select disks, and then press Enter.
4. Select a volume group type from the pick list.
As a result of the volume group type that we chose, we created a big, concurrent volume
group as displayed in Example 7-8.
Example 7-8 Create a new concurrent volume group and concurrent resource group
Create a Big Volume Group
Warning:
Changing the volume group major number may result
in the command being unable to execute
[MORE...5]
Example 7-9 on page 271 shows the output from the command we used to create this volume
group and resource group. The cluster must now be synchronized for the resource group
changes to take effect; however, the volume group information was imported immediately to
all cluster nodes selected for the operation.
jessica: concdbvg
jessica: mkvg: This concurrent capable volume group must be varied on manually.
jessica: synclvodm: No logical volumes in volume group concdbvg.
jessica: Volume group concdbvg has been updated.
cassidy: synclvodm: No logical volumes in volume group concdbvg.
cassidy: 0516-783 importvg: This imported volume group is concurrent capable.
cassidy: Therefore, the volume group must be varied on manually.
cassidy: 0516-1804 chvg: The quorum change takes effect immediately.
cassidy: Volume group concdbvg has been imported.
0516-306 getlvodm: Unable to find volume group concdbvg in the Device
Configuration Database.
cl_mkvg: The PowerHA SystemMirror configuration has been changed - Resource Group
newconcRG has been added. The configuration must be synchronized to make this
change effective across the cluster
Important: When you create a new concurrent resource group from the C-SPOC
Concurrent Logical Volume Management menu, the resource group will be created with the
following default policies:
Startup: Online On All Available Nodes
Fallover: Bring Offline (On Error Node Only)
Fallback: Never Fallback
The new logical volume, jerryclv, is created and the information is propagated to the other
cluster nodes.
Important: If a logical volume of type jfs2log is created, C-SPOC automatically runs the
logform command so that the volume can be used.
Important: File systems are not allowed on volume groups that are a resource in an
Online on All Available Nodes type resource group.
The /jerrycfs file system is now created. The contents of /etc/filesystems on both nodes
are now updated with the correct jfs2log. If the resource group and volume group are online,
the file system is mounted automatically after creation.
Tip: With JFS2, we also have the option to use inline logs that can be configured from the
options in the example.
Important: Always add more space to a file system by adding more space to the logical
volume first. Never add the extra space to the JFS first when using cross-site LVM
mirroring because the mirroring might not be maintained properly.
Similar to creating a logical volume, be sure to allocate the extra space properly to maintain
the mirrored copies at each site. To add more space, complete the following steps:
1. Run smitty cl_lvsc, select Increase the Size of a Shared Logical Volume, and press
Enter.
2. Choose a volume group and resource group from the pop-up list (Figure 7-21).
3. Then choose the logical volume from the next pop-up list (Figure 7-22 on page 275). A list
of disks is displayed that belong to the same volume group as the logical volume
previously chosen. The list is similar to the list displayed when you create a new logical
volume (Figure 7-21). Press F7, choose the disks, and press Enter.
Important: Do not use the Auto-select option, which is at the top of the pop-up list.
+--------------------------------------------------------------------------+
| Select the Volume Group that holds the Logical Volume to Extend |
| |
| Move cursor to desired item and press Enter. |
| |
| #Volume Group Resource Group Node List |
| xsitevg xsitelvmRG cassidy,jessica |
| |
| F1=Help F2=Refresh F3=Cancel |
| F8=Image F10=Exit Enter=Do |
F1| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
Figure 7-21 Shared volume group pop-up list
+--------------------------------------------------------------------------+
| Select the Logical Volume to Extend |
| |
| Move cursor to desired item and press Enter. Use arrow keys to scroll. |
| |
| #xsitevg: |
| # LV NAME TYPE LPs PPs PVs LV STATE MO |
| xsitelv1 jfs2 25 50 2 closed/syncd N/ |
| |
| F1=Help F2=Refresh F3=Cancel |
| F8=Image F10=Exit Enter=Do |
F1| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
Figure 7-22 Logical volume pop-up list selection
4. After selecting the target disks, the final menu opens (shown in Figure 7-23 on page 276).
Set the following options:
RANGE of physical volumes: minimum
Allocate each logical partition copy on a SEPARATE physical volume: superstrict
This is already set correctly if the logical volume was originally created correctly.
5. After adding extra space, verify that the partition mapping is correct by running the lslv -m
lvname command again, as shown in Example 7-13. Highlighted in bold are the two new
partitions that were just added to the logical volume.
[Entry Fields]
Volume Group Name xsitevg
Resource Group Name xsitelvmRG
* LOGICAL VOLUME name xsitelv1
Reference node
* NEW TOTAL number of logical partition 2 +
copies
PHYSICAL VOLUME names
POSITION on physical volume outer_middle +
RANGE of physical volumes minimum +
MAXIMUM NUMBER of PHYSICAL VOLUMES [] #
to use for allocation
Allocate each logical partition copy yes +
on a SEPARATE physical volume?
SYNCHRONIZE the data in the new no +
logical partition copies?
[Entry Fields]
* VOLUME GROUP name xsitevg
Resource Group Name xsitelvmRG
Node List jessica,cassidy
Reference node
PHYSICAL VOLUME names
[Entry Fields]
VOLUME GROUP name xsitevg
Resource Group Name xsitelvmRG
* Node List jessica,cassidy
[Entry Fields]
Volume Group Name xsitevg
Resource Group Name xsitelvmRG
* LOGICAL VOLUME name xsitelv1
Reference node
[Entry Fields]
VOLUME GROUP name xsitevg
Resource Group Name xsitelvmRG
Node List jessica,cassidy
Reference node jessica
PHYSICAL VOLUME names hdisk3
[Entry Fields]
Volume group name concdbvg
Resource Group Name newconcRG
* File system name /jerrycfs
* Node Names cassidy,jessica
The CLI is oriented toward root users who need to run certain tasks with shell scripts rather
than through a SMIT menu. The C-SPOC CLI commands are located in the
/usr/es/sbin/cluster/cspoc directory, and they all have a name with the cli_ prefix.
Similar to the C-SPOC SMIT menus, the CLI commands log their operations in the cspoc.log
file on the node where the CLI command was run.
A list of the commands is shown in Figure 7-30. Although the names are descriptive regarding
what function each one offers, we also provide their corresponding man pages in Appendix B,
C-SPOC LVM CLI commands on page 501.
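For example, you can list the available commands on a node; a simple sketch, using the
directory given above and assuming the default log location:
# List all C-SPOC CLI commands; each one mirrors a C-SPOC SMIT task
ls /usr/es/sbin/cluster/cspoc/cli_*
# All of them log their operations to cspoc.log on the node where
# they are run
tail /var/hacmp/log/cspoc.log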
PowerHA stores the information about all cluster resources and cluster topology, and also
several other parameters, in PowerHA-specific object classes in the ODM. The PowerHA
ODM files must be consistent across all cluster nodes so that cluster behavior works as
designed. Cluster verification checks the consistency of the PowerHA ODM files across all
nodes and also verifies that the PowerHA ODM information is consistent with the required
AIX ODM information. If verification is successful, the cluster configuration can be
synchronized across all the nodes. Synchronization is effective immediately in an active
cluster. Cluster synchronization copies the PowerHA ODM from the local node to all remote
nodes.
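Verification and synchronization can also be driven from the command line with clmgr; a
minimal sketch, assuming the verify and sync actions as named here (the SMIT paths
described next are the equivalent):
# Check the consistency of the cluster configuration
clmgr verify cluster
# Propagate the PowerHA ODM from the local node to all remote nodes
clmgr sync cluster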
Note: If the cluster is not synchronized and failure of a cluster topology or resource
component occurs, the cluster might not be able to fallover as designed. Be sure to
regularly verify the cluster configuration and synchronize all changes when complete.
When you use the Cluster Nodes and Networks path, synchronization takes place
automatically after successful verification of the cluster configuration. There are no
additional options in this menu. The feature to automatically correct errors found during
verification is always active. For information about automatically correcting errors that are
found during verification, see 7.6.5, Running automatically corrective actions during
verification on page 291.
Figure 7-31 on page 286 shows the SMIT panel that is displayed when cluster services are
active. Performing a synchronization in an active cluster is also called Dynamic
Reconfiguration (DARE).
The Custom Cluster Configuration Verification and Synchronization path parameters depend
on the cluster services state (active or inactive) on the node, where verification is initiated.
In an active cluster, the SMIT panel parameters are as follows (Figure 7-31 on page 286):
Verify changes only:
Select No to run the full check of topology and resources
Select Yes to verify only the changes made to the cluster configuration (PowerHA
ODM) since the last verification.
Logging:
Select Verbose to send full output to the console, which otherwise is directed to
the clverify.log file.
In an inactive cluster, the SMIT panel parameters are as follows (Figure 7-32 on page 286):
Verify, Synchronize or Both:
Select Verify to run verification only.
Select Synchronize to run synchronization only
Select Both to run verification, and when that completes, synchronization is done (with
this option, the Force synchronization if verification fails option can be used).
Include custom verification library checks:
This can be set to either yes or no.
Automatically correct errors found during verification:
For a detailed description, see 7.6.5, Running automatically corrective actions during
verification on page 291.
Force synchronization if verification fails:
Select No to stop synchronization from commencing if the verification procedure
returns errors.
Select Yes to force synchronization regardless of the result of verification. In general,
do not force synchronization. In the specific situations where synchronization must be
forced, ensure that you fully understand the consequences of the resulting cluster
configuration changes.
Verify changes only:
Select No to run a full check of topology and resources
Select Yes to verify only the changes that occurred in the PowerHA ODM files since the
time of the last verification operation.
Logging:
Select Verbose to send full output to the console, which is otherwise directed to the
clverify.log file.
Note: Synchronization can be initiated on either an active or inactive cluster. If some nodes
in the cluster are inactive, synchronization can be initiated only from an active node, using
DARE (Dynamic Reconfiguration). For more information about DARE, see 7.6.2, Dynamic
cluster reconfiguration with DARE on page 287.
[Entry Fields]
* Verify changes only? [No] +
* Logging [Standard] +
[Entry Fields]
* Verify, Synchronize or Both [Both] +
* Include custom verification library checks [Yes] +
* Automatically correct errors found during [No] +
verification?
If you are using the Problem Determination Tools path, you have more options for verification,
such as defining custom verification methods. However, synchronizing the cluster from here is
not possible. The SMIT panel of the Problem Determination Tools verification path is shown in
Figure 7-33 on page 287.
If verification fails, correct the errors and repeat verification to ensure that the problems are
resolved as soon as possible. The messages that are output from verification indicate where
the error occurred (for example, on a node, a device, or a command). In 7.6.4, Verification log
files on page 290, we describe the location and purpose of the verification logs.
Considerations:
When cluster synchronization (DARE) takes place, action is taken immediately on any
resource or topology component that is to be changed or removed.
Running a DARE operation is not supported on a cluster that has nodes running
different versions of the PowerHA code, for example, during a cluster migration.
You cannot perform a DARE operation while any node in the cluster is in the
unmanaged state.
The following changes can be made to topology in an active cluster using DARE:
Add or remove nodes.
Add or remove network interfaces.
Add or remove networks.
Restriction: At the time of writing, no PowerHA v7.1 release supports dynamically
changing a private network to a public network, as shown in the following example error
output. At this time, there are no plans to remove this restriction.
The following changes can be made to resources in an active cluster using DARE:
Add, remove, or change an application server.
Add, remove, or change application monitoring.
Add or remove the contents of one or more resource groups.
Add, remove, or change a tape resource.
Add or remove resource groups.
Add, remove, or change the order of participating nodes in a resource group.
Change the node relationship of the resource group.
Change resource group processing order.
Add, remove, or change the fallback timer policy associated with a resource group. The
new fallback timer will not have any effect until the resource group is brought online on
another node.
Add, remove, or change the settling time for resource groups.
Add or remove the node distribution policy for resource groups.
Add, change, or remove parent/child or location dependencies for resource groups (some
limitations apply here).
Add, change, or remove inter-site management policy for resource groups.
Add, remove, or change pre-events or post-events.
A dynamic reconfiguration can be initiated only from an active cluster node, that is, from a
node that has cluster services running. The change must be made from an active node so
that the cluster can be synchronized.
Before making changes to a cluster definition, ensure that these items are true:
The same version of PowerHA is installed on all nodes.
Some nodes are up and running PowerHA and they are able to communicate with each
other. No node should be in an UNMANAGED state.
The cluster is stable and the hacmp.out log file does not contain recent event errors or
config_too_long events.
Depending on your cluster configuration and on the specific changes you plan to make in your
cluster environment, there are many possibilities and possible limitations while running a
dynamic reconfiguration event. You must understand all of the consequences of changing an
active cluster configuration. Be sure to read the Administering PowerHA SystemMirror guide
for further details before making dynamic changes in your live PowerHA environment:
[Entry Fields]
* Cluster Name xsite_cluster
* Heartbeat Mechanism Unicast +
Repository Disk 00f6f5d015a4310b
Cluster Multicast Address 228.168.100.51
(Used only for multicast heartbeat)
Example 7-19 shows the /var/hacmp/clverify/ directory contents with verification log files.
Notes:
To be able to run, verification requires 4 MB of free space per node in the /var file
system. Typically, the /var/hacmp/clverify/clverify.log files require an extra
1 - 2 MB of disk space. At least 42 MB of free space is suggested for a four-node
cluster.
The default log file location for most PowerHA log files is now /var/hacmp; however,
there are some exceptions. For more details, see the PowerHA Administration Guide.
The automatic corrective action feature can correct only some types of errors, which are
detected during the cluster verification. The following errors can be addressed:
PowerHA shared volume group time stamps are not up-to-date on a node.
The /etc/hosts file on a node does not contain all PowerHA-managed IP addresses.
A file system is not created on a node, although disks are available.
Disks are available, but the volume group has not been imported to a node.
Shared volume groups that are configured as part of a PowerHA resource group have their
automatic varyon attribute set to Yes.
Required /etc/services entries are missing on a node.
Required PowerHA snmpd entries are missing on a node.
Required RSCT network options settings.
Required PowerHA network options setting.
Required routerevalidate network option setting.
Corrective actions when using IPv6.
Create WPAR if added to a resource group but WPAR does not exist yet.
With no prompting:
Correct error conditions that appear in /etc/hosts.
Correct error conditions that appear in /usr/es/sbin/cluster/etc/clhosts.client.
Update /etc/services with missing entries.
Update /etc/snmpd.peers and /etc/snmp.conf files with missing entries.
With prompting:
Update auto-varyon on this volume group.
Update volume group definitions for this volume group.
Keep PowerHA volume group timestamps in sync with the VGDA.
Auto-import volume groups.
If cluster services are inactive, you can select the mode of the automatic error correction
feature directly in the Extended Configuration verification path menu by running smitty
sysmirror and then selecting Extended Configuration → Extended Verification and
Synchronization.
As shown in Figure 7-32 on page 286, you can change the mode with the Automatically
correct errors found during verification field, by setting it to one of these choices:
Yes
No
Interactively
If the cluster is active, the automatic corrective action feature is enabled by default. You can
change the mode of automatic error correction for an active cluster directly in the SMIT menu
for starting cluster services. Run smitty sysmirror and then select System Management
(C-SPOC) → PowerHA SystemMirror Services → Start Cluster Services.
Setting values to Yes, No, or Interactive will set the automatic error correction mode for these
items:
The PowerHA Extended Configuration verification path.
Automatic cluster verification to run at cluster services start time.
Automatic cluster configuration monitoring, which runs daily if enabled.
During automatic verification and synchronization, PowerHA will detect and correct several
common configuration issues. This automatic behavior ensures that if you did not manually
verify and synchronize a node in your cluster before starting cluster services, PowerHA will
do so.
Using the SMIT menus, you can set the parameters for the periodic automatic cluster
verification checking utility by running smitty sysmirror and then selecting Problem
Determination Tools → PowerHA SystemMirror Verification → Automatic Cluster
Configuration Monitoring.
Figure 7-34 shows the SMIT panel for setting the Automatic Cluster Configuration Monitoring
parameters, which you can also reach by running smitty clautover.dialog.
[Entry Fields]
* Automatic cluster configuration verification Enabled +
Node name Default +
* HOUR (00 - 23) [00] +
Debug yes +
As a result, it is possible that a component in the cluster has failed and that you are unaware
of the fact. The danger is that, while PowerHA can survive one or possibly several failures,
each failure that escapes your notice threatens the cluster's ability to provide a highly
available environment, because the redundancy of cluster components is diminished.
To avoid this situation, we suggest that you regularly check and monitor the cluster. PowerHA
offers various utilities to help you with cluster monitoring and other items:
Automatic cluster verification. See 7.6.6, Automatic cluster verification on page 293.
Cluster status checking utilities
Resource group information commands
Topology information commands
Log files
Error notification methods
Application monitoring
Measuring application availability
Monitoring clusters from the enterprise system administration and monitoring tools
You can use ASCII SMIT, the IBM Systems Director PowerHA SystemMirror plug-in, or the
clmgr command line to configure and manage cluster environments.
This utility requires the clinfoES subsystem to be active on nodes where the clstat command
is initiated.
The clstat command is supported in two modes: ASCII mode and X Window mode. ASCII
mode can run on any physical or virtual ASCII terminal, including xterm or aixterm windows. If
the cluster node runs graphical X Window mode, clstat displays the output in a graphical
window. Before running the command, ensure that the DISPLAY variable is exported to the X
server and that X client access is allowed.
Figure 7-35 on page 295 shows the syntax of the clstat command.
Consider the following information about the clstat command in the figure:
clstat -a runs the program in ASCII mode.
clstat -o runs the program once in ASCII mode and exits (useful for capturing output
from a shell script or cron job).
clstat -s displays service labels that are both up and down; otherwise, only active
service labels are displayed.
Example 7-20 shows the clstat -o command output from our test cluster.
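The single-shot mode lends itself to periodic collection from cron; a minimal sketch (the log
file path is our own choice, not from the original text, and clinfoES must be active as noted
above):
# AIX crontab entry: capture an ASCII status snapshot every 15 minutes
0,15,30,45 * * * * /usr/es/sbin/cluster/clstat -o >> /var/log/clstat.history 2>&1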
The cldump command does not have any arguments, so you simply run cldump from the
command line.
Tip: Another good source of resolving common issues with clstat can be found at:
http://www.ibm.com/support/docview.wss?uid=isg3T1020101
Access permission
Check for access permission to the PowerHA portion of the SNMP Management Information
Base (MIB) in the SNMP configuration file:
1. Find the defaultView entries in the /etc/snmpdv3.conf file, shown in Example 7-21.
Beginning with AIX 7.1, as a security precaution, the snmpdv3.conf file is included with the
Internet access commented out (#). The preceding example shows the unmodified
configuration file; the Internet descriptor is commented out, which means that there is no
access to most of the MIB, including the PowerHA information. Other included entries
provide access to other limited parts of the MIB. By default, in AIX 7.1 and later, the
PowerHA SNMP-based status commands do not work, unless you edit the snmpdv3.conf
file. The two ways to provide access to the PowerHA MIB are by modifying the
snmpdv3.conf file as follows:
Uncomment (remove the number sign, #, from) the following Internet line, which will
give you access to the entire MIB:
VACM_VIEW defaultView internet - included -
If you do not want to provide access to the entire MIB, add the following line, which
gives you access to only the PowerHA MIB:
VACM_VIEW defaultView risc6000clsmuxpd - included -
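After the file is modified, the change takes effect only when the daemons reread their
configuration. A minimal sketch using the standard AIX SRC commands:
# Restart the SNMP daemon so that it picks up the modified MIB view
stopsrc -s snmpd
startsrc -s snmpd
# Then restart clinfoES so that clstat can query the PowerHA MIB again
stopsrc -s clinfoES
startsrc -s clinfoES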
IPv6 entries
If you use PowerHA SystemMirror 7.1.2 or later, check for the correct IPv6 entries in the
configuration files for clinfoES and snmpd. In PowerHA 7.1.2, an entry is added to the
/usr/es/sbin/cluster/etc/clhosts file to support IPv6. However, the required corresponding
entry is not added to the /etc/snmpdv3.conf file. This causes intermittent problems with the
clstat command.
Example 7-22 shows output of the clshowres command from our test cluster, when cluster
services are running.
You can also use the clshowsrv -v command through the SMIT menus by running smitty
sysmirror and then selecting System Management (C-SPOC) → PowerHA SystemMirror
Services → Show Cluster Services.
The following tools are publicly available but are not included with the PowerHA software:
Query HA (qha)
Query CAA (qcaa)
Note: Custom examples of qha and other tools are in the Guide to IBM PowerHA
SystemMirror for AIX Version 7.1.3, SG24-8167:
Query HA (qha)
The Query HA tool, qha, was created in approximately 2001 and has been updated since
then to support the most recent code levels, up to version 7.1.3 at the time of writing. It
primarily provides an in-cluster status view that does not rely on the SNMP protocol and the
clinfo infrastructure. Query HA can also be easily customized.
Rather than reporting about whether the cluster is running or unstable, the focus is on the
internal status of the cluster manager. Although not officially documented, see 6.2, Starting
and stopping the cluster on page 191 for a list of the internal clstrmgr states. This status
information helps you understand what is happening within the cluster, especially during
event processing (cluster changes such as start, stop, resource groups moves, application
failures, and more). When viewed next to other information, such as the running event, the
resource group status, online network interfaces, and the varied on volume groups, it provides
an excellent overall status view of the cluster. It also helps with problem determination, for
example, in understanding the PowerHA event flow during node_up or fallover events and
when searching through the cluster logs and hacmp.out files.
Note: The qha tool is usually available for download from the following website:
http://www.powerha.lpar.co.uk
Example 7-23 shows sample status output from the qha -nev command.
Note: The qcaa tool is usually available for download from the following website:
http://www.powerha.lpar.co.uk
Example 7-24 shows sample status output from the qcaa -nev command.
Node: jessica
Node: UUID = 8fb52cd6-e5e7-11e3-af34-eeaf01717802
1, en0                  = UP
2, en1                  = UP
3, Cluster repos. comms = UP RESTRICTED AIX_CONTROLLED
Node: shanley
Node: UUID = aaedd9e4-e5e7-11e3-9e0f-eeaf01717802
1, en0                  = UP
2, en1                  = UP
3, Cluster repos. comms = UP RESTRICTED AIX_CONTROLLED
Node: cassidy
Node: UUID = 8fb52d58-e5e7-11e3-af34-eeaf01717802
1, en0                  = UP
2, en1                  = UP
3, Cluster repos. comms = UP RESTRICTED AIX_CONTROLLED
Note: At the time of writing, the man page for cltopinfo still shows the -m option. However,
this option is no longer available in PowerHA v7 and later.
You can also use SMIT menus to display various formats of the topology information:
Display by cluster:
Run smitty sysmirror and then select Cluster Nodes and Networks → Manage the
Cluster → Display PowerHA SystemMirror Configuration.
SMIT panel output: Exactly the same as running the cltopinfo command.
Display by node:
Run smitty sysmirror and then select Cluster Nodes and Networks → Manage
Nodes → Show Topology Information by Node → Show All Nodes.
SMIT panel output: Exactly the same as running the cltopinfo -n command.
Display by network:
SMIT panel output: Exactly the same as running the cltopinfo -w command.
Display by network interface:
SMIT panel output: Exactly the same as running the cltopinfo -i command.
If cluster services are not running on the local node, the command determines a node where
the cluster services are active and obtains the resource group information from the active
cluster manager.
The default locations of log files are used in this section. If you redirected any logs, check the
appropriate location.
For example, for a four-node cluster, you need the following amount of space in the /var
file system:
2 + (4x4) + 20 + (4x1) = 42 MB
Some additional log files that gather debug data might require further space in the /var file
system. This depends on other factors, such as the number of shared volume groups and
file systems. Cluster verification issues a warning if not enough space is allocated to the
/var file system.
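A quick per-node check against the 42 MB guideline above (a simple sketch):
# Show free space in /var, in MB, on every cluster node
clcmd df -m /var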
The SMIT fast path is smitty clusterlog_redir.select. The default log directory is changed
for all nodes in the cluster. The cluster should be synchronized after changing the log
parameters.
Note: We suggest using only local file systems if changing default log locations rather than
shared or NFS file systems. Having logs on shared or NFS file systems can cause
problems if the file system needs to unmount during a fallover event. Redirecting logs to
shared or NFS file systems can also prevent cluster services from starting during node
reintegration.
For more information about automatic error notification, with examples of using and
configuring it, see 11.5, Automatic error notification on page 413.
In addition, the introduction of the Unmanaged Resource Groups option, while stopping
cluster services (which leaves the applications running without cluster services), makes
application monitors a crucial factor in maintaining application availability.
When cluster services are restarted to begin managing the resource groups again, the
process of acquiring resources will check each resource to determine if it is online. If it is
running, acquiring that resource is skipped.
For the application, this check is done by using an application monitor instead of, for
example, simply running the server start script. The application monitor's returned status
determines whether the application server start script will be run.
What if no application monitor is defined? In that case, the cluster manager runs the
application server start script. This might cause problems for applications that cannot deal
with another instance being started, for example, if the start script is run again when the
application is already running.
For each PowerHA application server configured in the cluster, you can configure up to
128 application monitors, but the total number of application monitors in a cluster cannot
exceed 128.
In long-running mode, the monitor periodically checks that the application is running
successfully. The checking frequency is set through the Monitor Interval. The checking begins
after the stabilization interval expires, the Resource Group that owns the application server is
marked online, and the cluster has stabilized.
In startup mode, PowerHA checks the process (or calls the custom monitor), at an interval
equal to one-twentieth of the stabilization interval of the startup monitor. The monitoring
continues until one of the following events occurs:
The process is active.
The custom monitor returns a 0.
The stabilization interval expires.
If successful, the resource group is put into the online state; otherwise, the cleanup method is
invoked. In both modes, the monitor checks for the successful startup of the application
server and periodically checks that the application is running successfully.
Tip: The SMIT fast path for application monitor configuration is smitty cm_cfg_appmon.
When PowerHA finds that the monitored application process (or processes) has terminated, it
tries to restart the application on the current node until a specified retry count is exhausted.
To add a new process application monitor using SMIT, use one of the following steps:
Run smitty sysmirror and then select Cluster Applications and Resources →
Resources → Configure User Applications (Scripts and Monitors) → Application
Monitors → Configure Process Application Monitors → Add a Process Application
Monitor.
Use the smitty cm_appmon fast path.
Figure 7-40 shows the SMIT panel with field entries for configuring an example process
application monitor.
In our example, the application monitor is called APP1_monitor and is configured to monitor
the APP1 application server. The default monitor mode, Long-running monitoring, was
selected. A stabilization interval of 120 seconds was selected.
Note: The stabilization interval is one of the most critical values in the monitor
configuration. It must be set to a value that is determined to be long enough that if it
expires, the application has definitely failed to start. If the application is in the process of a
successful start and the stabilization interval expires, cleanup will be attempted and the
resource group will be placed into ERROR state. The consequences of the cleanup
process will vary by application and the method might provide undesirable results.
If the application fails, the Restart Method is run to recover the application. If the application
fails to recover to a running state after the number of restart attempts exceed the Retry
Count, the Action on Application Failure is taken. The action can be notify or fallover. If
notify is selected, no further action is taken after running the Notify Method. If fallover is
selected, the resource group containing the monitored application moves to the next available
node in the resource group.
The Cleanup Method and Restart Method define the scripts for stopping and restarting the
application after failure is detected. The default values are the start and stop scripts as
defined in the application server configuration.
To add a new custom application monitor using SMIT, use one of the following steps:
Run smitty sysmirror and then select Cluster Applications and Resources →
Resources → Configure User Applications (Scripts and Monitors) → Application
Monitors → Configure Custom Application Monitors → Add a Custom Application
Monitor.
Use the smitty cm_cfg_custom_appmon fast path.
The SMIT panel and its entries for adding this method into the cluster configuration are similar
to the process application monitor add SMIT panel, as shown in Figure 7-40 on page 311.
The only different fields in configuring custom application monitors SMIT menu are as follows:
Monitor Method Defines the full path name for the script that provides a method to
check the application status. If the application is a database, this
script could connect to the database and run an SQL select
statement against a specific table in the database. If the result of
the select statement is correct, the database is working normally.
Monitor Interval Defines the time (in seconds) between each occurrence of Monitor
Method being run.
Hung Monitor Signal Defines the signal that is sent to stop the Monitor Method if it
doesn't return within Monitor Interval seconds. The default action
is SIGKILL(9).
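As an illustration only (the process name, instance owner, and SQL check are hypothetical
and not from this cluster), a custom Monitor Method might look like the following ksh script.
PowerHA treats an exit code of 0 as healthy; any other exit code counts as a monitor failure:
#!/bin/ksh
# Is the database server process present? The [d] keeps grep from
# matching its own command line in the process table.
ps -eo args | grep -q "[d]bserver" || exit 1
# Deeper check: run a trivial SQL select as the instance owner
su - dbadm -c "db2 -x 'select 1 from sysibm.sysdummy1'" >/dev/null 2>&1 || exit 1
exit 0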
If only one application monitor is defined for an application server, the process is as simple as
stated previously.
If more than one application monitor is defined, the selection priority is based on the Monitor
Type (custom or process) and the Invocation (both, long-running or startup). The ranking of
the combinations of these two monitor characteristics is as follows:
Both, Process
Long-running, Process
Both, Custom
Long-running, Custom
Startup, Process
Startup, Custom
The highest priority application monitor found is used to test the state of the application.
When creating multiple application monitors for an application, be sure that your highest
ranking monitor according to the foregoing list returns a status that can be used by the cluster
manager to decide whether to invoke the application server start script.
When more than one application monitor meets the criteria as the highest ranking, the sort
order is unpredictable (because qsort is used), although it does consistently produce the
same result.
Fortunately, there is a way to test which monitor will be used. The routine that is used by the
cluster manager to determine the highest ranking monitor is as follows:
/usr/es/sbin/cluster/utilities/cl_app_startup_monitor
An example of using this utility for the application server called testmonApp, which has three
monitors configured, is as follows:
/usr/es/sbin/cluster/utilities/cl_app_startup_monitor -s testmonApp -a
The output for this command, shown in Example 7-27, shows three monitors:
Mon: Custom, Long-running
bothuser_testmon: Both, Custom
longproctestmon: Process, Long-running
application = [testmonApp]
monitor_name = [bothuser_testmon]
resourceGroup = [NULL]
MONITOR_TYPE = [user]
PROCESSES = [NULL]
PROCESS_OWNER = [NULL]
MONITOR_METHOD = [/tmp/Bothtest]
INSTANCE_COUNT = [NULL]
MONITOR_INTERVAL = [10]
HUNG_MONITOR_SIGNAL = [9]
STABILIZATION_INTERVAL = [20]
INVOCATION = [both]
application = [testmonApp]
monitor_name = [longproctestmon]
resourceGroup = [NULL]
MONITOR_TYPE = [process]
PROCESSES = [httpd]
PROCESS_OWNER = [root]
MONITOR_METHOD = [NULL]
INSTANCE_COUNT = [4]
MONITOR_INTERVAL = [NULL]
HUNG_MONITOR_SIGNAL = [9]
STABILIZATION_INTERVAL = [60]
INVOCATION = [longrunning]
In the example, three monitors can be used for initial status checking. The highest ranking is
the long-running process monitor, longproctestmon. Recall that the Monitor Type for custom
monitors is user.
Note: A startup monitor will be used for initial application status checking only if no
long-running (or both) monitor is found.
If necessary, the application server start script is invoked. Simultaneously, all startup monitors
are invoked. Only when all of the startup monitors indicate that the application has started, by
returning a successful status, does resource group processing continue.
The stabilization interval is the timeout period for the startup monitor. If the startup monitor
fails to return a successful status, the application's resource group goes to the ERROR state.
After the startup monitor returns a successful status, there is a short time during which the
resource group state transitions to ONLINE, usually from ACQUIRING.
For each long-running monitor, the stabilization interval is allowed to elapse and then the
long-running monitor is invoked. The long-running monitor continues to run until a problem is
encountered with the application.
If the long-running monitor returns a failure status, the retry count is examined. If it is
non-zero, it is decremented, the Cleanup Method is invoked, and then the Restart Method is
invoked. If the retry count is zero, the cluster manager will process either a fallover event or a
notify event. This is determined by the Action on Application Failure setting for the monitor.
After the Restart Interval expires, the retry count is reset to the configured value.
Both /tmp/longR and /tmp/start-up methods check for /tmp/App in the process table. If
/tmp/App is found in the process table, the return code (RC) is 0; if not found, the RC is 1.
The /tmp/Stop method finds and kills the /tmp/App process in the process table to cause a
failure.
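In script form, the check that both methods perform can be sketched as follows (the
bracketed character keeps grep from matching its own entry in the process table):
#!/bin/ksh
# Return 0 if /tmp/App is in the process table, 1 if it is not
ps -eo args | grep -q "[/]tmp/App" && exit 0
exit 1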
To see what happened in more detail, the failure as logged in /var/hacmp/log/hacmp.out (on
the final RC=1 from the startup monitor) is shown in Example 7-30.
}
+testmon:start_server[+284] cat /var/hacmp/log/.start_server.700610
+testmon:start_server[+293] SUCCESS=1
+testmon:start_server[+295] [[ 1 = 0 ]]
+testmon:start_server[+299] exit 1
Mar 11 11:33:13 EVENT FAILED: 1: start_server testmonApp 1
+testmon:node_up_local_complete[+148] RC=1
+testmon:node_up_local_complete[+149] : exit status of start_server testmonApp is:
1
This can be done through the SMIT C-SPOC menus by running smitty cspoc, selecting
Resource Group and Applications → Suspend/Resume Application Monitoring →
Suspend Application Monitoring, and then selecting the application server that is
associated with the monitor you want to suspend.
Use the same SMIT path to resume the application monitor. The output of resuming the
application monitor associated with the application server APP1 is shown in Example 7-31.
Based on the information that is collected by the application availability analysis tool, you can
select a measurement period, and the tool displays the uptime and downtime statistics for a
specific application during that period. Using SMIT, you can display this information:
Percentage of uptime
Amount of uptime
Longest period of uptime
Percentage of downtime
Amount of downtime
Longest period of downtime
Percentage of time application monitoring was suspended
The application availability analysis tool reports application availability from the PowerHA
cluster perspective. It can analyze only those applications that were correctly configured in
the cluster configuration.
This tool shows only the statistics that reflect the availability of the PowerHA application
server, resource group, and the application monitor (if configured). It cannot measure an
internal failure in the application that is detectable by the user but not by the application
monitor.
Figure 7-41 shows the SMIT panel that is displayed for the application availability analysis
tool in our test cluster. You can use the smitty cl_app_AAA.dialog fast path to get to the
SMIT panel.
[Entry Fields]
* Select an Application [App1] +
* Begin analysis on YEAR (1970-2038) [2012] #
* MONTH (01-12) [03] #
* DAY (1-31) [24] #
* Begin analysis at HOUR (00-23) [16] #
* MINUTES (00-59) [20] #
* SECONDS (00-59) [00] #
* End analysis on YEAR (1970-2038) [2012] #
* MONTH (01-12) [03] #
* DAY (1-31) [24] #
* End analysis at HOUR (00-23) [17] #
* MINUTES (00-59) [42] #
* SECONDS (00-59) [00] #
Figure 7-41 Adding Application Availability Analysis SMIT panel
In the SMIT menu of the Application Availability Analysis tool, enter the selected application
server, enter start and stop time for statistics, and run the tool. Example 7-32 shows the
Application Availability Analysis tool output from our test cluster.
Log records terminated before the specified ending time was reached.
Application monitoring was suspended for 75.87% of the time period analyzed.
Application monitoring state was manually changed during the time period analyzed.
Cluster services were manually restarted during the time period analyzed.
Typically, data center clusters are deployed in trusted environments and therefore might not
need any security to protect cluster packets (which are custom to begin with and also carry
no user-related data). Additionally, in version 7, CAA provides inherent security because of
the need for a repository disk.
The repository disk is a shared disk across all the nodes of the CAA cluster and is used
extensively and continuously by CAA for health monitoring and configuration purposes.
The expectation is that individual nodes have connectivity to the repository disk through the
SAN fabric and pass all security controls of the SAN fabric, regarding host access, to the disk.
Hosts can join the CAA cluster and become a member only if they have access to the shared
repository disk. As a result, any other node trying to spoof and join the cluster cannot
succeed unless it has an enabled physical connection to the repository disk.
The repository disk does not host any file system. This disk is accessed by the clustering
components in raw format to maintain their internal data structures. These structures are
internal to the clustering software and are not published anywhere.
For these reasons, most customers might choose to deploy clusters without enabling any
encryption or decryption for the cluster.
However, an administrator can choose to deploy CAA security; the supported configuration
modes are described in later sections.
All cluster utilities intended for public use have the setgid bit set for the hacmp group so that
they can read the PowerHA ODM files. The hacmp group is created during PowerHA
installation, if it is not already there.
CAA encrypts packets exchanged between the nodes by using a symmetric key. This
symmetric key can be one of the types listed in Table 8-1.
For certain configuration methods, CAA exchanges the symmetric key by using host-specific
certificate and private key pairs with asymmetric encryption and decryption.
CAA provides these methods of security setup regarding asymmetric or symmetric keys:
Self-signed certificate-private key pair
The administrator can choose this option for easy setup. When the administrator uses this
option, CAA generates a certificate and private key pair. The asymmetric key pair
generated will be of the type RSA (1024 bits). In this case, the administrator also provides
a symmetric key algorithm to be used (key size is determined by the symmetric algorithm
selected, as shown in Table 8-1).
CAA security also supports the following levels of security. Currently these levels are not
differentiated at a fine granular level.
Medium or High All cluster packets will be encrypted and decrypted
Low or Disable CAA security will be disabled
Various CAA security keys are stored in the /etc/security/cluster/ directory. The default
files are as follows (the location and file names are internal to CAA and should not be
assumed to remain fixed):
Certificate: /etc/security/cluster/cacert.der
Private key: /etc/security/cluster/cakey.der
Symmetric key: /etc/security/cluster/SymKey
Enabling security
Example 8-1 shows how to enable security by using clmgr.
[Entry Fields]
* Security Level [High] +
+--------------------------------------------------------------------------+
| Security Level |
| |
| Move cursor to desired item and press Enter. |
| |
| Disable |
| Low |
| Medium |
| High |
| |
| F1=Help F2=Refresh F3=Cancel |
F1| F8=Image F10=Exit Enter=Do |
F5| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
Figure 8-1 SMIT setting security level to high
[Entry Fields]
* Symmetric Algorithm [AES] +
+--------------------------------------------------------------------------+
| Symmetric Algorithm |
| |
| Move cursor to desired item and press Enter. |
| |
| DES |
| 3DES |
| AES |
| |
| F1=Help F2=Refresh F3=Cancel |
F1| F8=Image F10=Exit Enter=Do |
F5| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
Figure 8-2 Set symmetric algorithm through SMIT
4. Upon successful completion, verify that the settings exist on the other node, shanley, as
shown in Example 8-2.
Important: The settings are effective immediately and dynamically. Synchronizing the
cluster is not required.
[Entry Fields]
* Security Level [High] +
* Auto create & distribute Security Configuration [Self Signed Certifica> +
Certificate Location []
Private Key Location []
Figure 8-3 Set security mechanism through SMIT
Disabling security
Disabling security can also be done with the clmgr command or with SMIT. Example 8-3
shows disabling by using the clmgr command.
To disable through SMIT, run smitty clustersec, and select Cluster Security Level, and
then select Disable from the F4 pick list, as shown in Figure 8-4.
[Entry Fields]
* Security Level [Disable] +
Figure 8-4 Disabling security through SMIT
The -c and -f flags are optional. Example 8-4 shows two samples of the command: one with
the bare minimum requirements and one with the exact file paths.
In either case the changes are effective immediately and are automatically updated across
the cluster. There is no need for cluster synchronization.
If administrators want to replace the existing symmetric key with a new one, they can do so
by updating CAA with the new key and algorithm. When security is already enabled on the
cluster, a request to enable security with a different (or even the same) symmetric key
algorithm causes the CAA security mechanism to first disable the existing security and then
enable security with the requested symmetric key and algorithm.
Note: An easy method to generate a symmetric key is to collect a random set of bytes and
store it in a file. The example shows a key being generated for AES 256 algorithm use.
To generate a 256-bit key from the random device to a symmetric key (SymKey) file, use
these steps:
1. Use the dd command as follows:
[shanley:root] / # dd if=/dev/random of=/tmp/SymKey bs=8 count=4
4+0 records in.
4+0 records out.
2. Copy the SymKey file to the /etc/security/cluster directory on each node in the cluster.
Then enable security with the symmetric key as follows:
[shanley:root]clctrl -sec -x /etc/security/cluster/SymKey -s AES
savesecconf: Security enabled successfully.
We suggest using SSH. DLPAR operations require SSH also. SSH and Secure Sockets Layer
(SSL) together provide authentication, confidentiality, and data integrity. The SSH
authentication scheme is based on public and private key infrastructure; SSL encrypts
network traffic.
Through the federated security cluster, administrators are able to manage roles and the
encryption of data across the cluster.
In summary, the EFS keystore can be stored either in LDAP or on a shared file system, and
cluster data is encrypted at the file system level through EFS.
LDAP
The LDAP method is used by cluster nodes to allow centralized security authentication and
access to user and group information.
The following supported LDAP servers can be configured for federated security:
IBM Tivoli Directory Server
Windows Active Directory server
All cluster nodes must be configured with the LDAP server and the client file sets. PowerHA
provides options to configure the LDAP server and client across all cluster nodes.
SSL: Secure Sockets Layer (SSL) connection is mandatory for binding LDAP clients to
servers. Remember to configure SSL in the cluster nodes.
The LDAP server and client configuration is provided through the PowerHA smitty options
and the IBM Systems Director PowerHA plug-in.
For the LDAP server and client setup, SSL must be configured. The SSL connection is
mandatory for binding LDAP clients to servers.
LDAP server: The LDAP server must be configured on all cluster nodes. If an LDAP
server exists, it can be incorporated into PowerHA for federated security usage.
Details of the LDAP server and client configuration are explained in Configuring LDAP on
page 333.
RBAC
Cluster administration is an important aspect of high availability operations, and security in
the cluster is an inherent part of most system administration functions. Federated security
integrates the AIX RBAC features to enhance the operational security.
During LDAP client configuration, four PowerHA defined roles are created in LDAP. These
roles can be assigned to the user to provide restricted access to the cluster functionality
based on the role.
ha_admin Provides administrator authorization for the relevant cluster
functionality. For example, taking a cluster snapshot is under
administrator authorization.
ha_op Provides operator authorization for the relevant cluster functionality. For example, moving a cluster resource group is under operator authorization.
ha_mon Provides monitor authorization for the relevant cluster functionality.
For example, the command clRGinfo is under monitor authorization.
ha_view Provides viewer authorization. It has all read permissions for the
cluster functionality.
Role creation: PowerHA roles are created while you configure the LDAP client in the
cluster nodes.
From the federated security perspective, the EFS keystores are stored in LDAP. There is an
option to store the keystores through a shared file system in the cluster environment if LDAP
is not configured in the cluster.
Tip: Store the EFS keystore in LDAP. As an option, if the LDAP environment is not
configured, the keystore can be stored in a Network File System (NFS) mounted file
system.
The file sets for RBAC and EFS are available by default in AIX 6.1 and later versions, and no
specific prerequisites are required. The challenge is to configure LDAP.
More information: For complete DB2 and LDAP configuration details, see this website:
https://www.ibm.com/developerworks/mydeveloperworks/wikis/home/wiki/PowerHA%20SystemMirror/page/PowerHA%20Cluster%20with%20Federated%20Security?lang=en
Configuring LDAP
Use the following steps to install and configure LDAP:
1. Install and configure DB2.
2. Install the GSKit file sets.
3. Install IBM Tivoli Directory Server (LDAP server and client) file sets.
Installing DB2
The DB2 installation steps are shown in Example 8-5.
Ensure that the SSL file sets are configured as shown in Example 8-7.
LDAP configuration
The LDAP configuration by using the smitty panel can be reached through System Management (C-SPOC) → LDAP, as shown in Figure 8-6.
Storage
PowerHA SystemMirror Services
Communication Interfaces
Resource Group and Applications
PowerHA SystemMirror Logs
PowerHA SystemMirror File Collection Management
Security and Users
LDAP
Configure GPFS
If an LDAP server is already configured, the cluster nodes can use the existing LDAP server
or configure a new LDAP server.
(Figure: two LDAP configuration options. Option 1: each cluster node, nodeA and nodeB, hosts both the LDAP server and the LDAP client. Option 2: the cluster nodes run only the LDAP client and bind to an external LDAP server.)
[Entry Fields]
* Hostname(s) +
* LDAP Administrator DN [cn=admin]
* LDAP Administrator password []
Schema type rfc2307aix
* Suffix / Base DN [cn=aixdata,o=ibm]
* Server port number [636] #
* SSL Key path [] /
* SSL Key password []
* Version +
* DB2 instance password []
* Encryption seed for Key stash files []
Example 8-9 ODM command to verify LDAP configuration for federated security
# odmget -q "group=LDAPServer and name=ServerList" HACMPLDAP
HACMPLDAP:
group = "LDAPServer"
type = "IBMExisting"
name = "ServerList"
value = "selma06,selma07"
[Entry Fields]
* LDAP server(s) []
* Bind DN [cn=admin]
* Bind password []
* Suffix / Base DN [cn=aixdata,o=ibm]
* Server port number [636] #
* SSL Key path [] /
* SSL Key password []
The success of adding an existing LDAP server is verified with the ODM command that is
shown in Example 8-10.
Example 8-10 ODM command to verify the existing LDAP configuration for federated security
# odmget -q "group=LDAPServer and name=ServerList" HACMPLDAP
HACMPLDAP:
group = "LDAPServer"
type = "IBMExisting"
name = "ServerList"
value = "selma06,selma07"
LDAP Client
[Entry Fields]
* LDAP server(s) [] +
* Bind DN [cn=admin]
* Bind password []
Authentication type ldap_auth
* Suffix / Base DN [cn=aixdata,o=ibm]
* +--------------------------------------------------------------------------+#
* | LDAP server(s) |/
* | |
| Move cursor to desired item and press F7. |
| ONE OR MORE items can be selected. |
| Press Enter AFTER making all selections. |
| |
| quimby06 |
| |
| F1=Help F2=Refresh F3=Cancel |
F1| F7=Select F8=Image F10=Exit |
F5| Enter=Do /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
Figure 8-11 LDAP client configuration parameters
Verify the client configuration by using the ODM command that is shown in Example 8-11.
HACMPLDAP:
group = "LDAPClient"
type = "ITDSClinet"
name = "ServerList"
value = "selma06,selma07"
You can also verify the client configuration by checking the LDAP client daemon status, by
using the command that is shown in Example 8-12.
Example 8-12 Verify the client daemon status after LDAP client configuration
# ps -eaf | grep secldapclntd
root 4194478 1 2 04:30:09 - 0:10 /usr/sbin/secldapclntd
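In addition to checking the process, the AIX ls-secldapclntd command reports the status of the secldapclntd daemon, including the LDAP servers that it is connected to (a quick check, run as root):
# /usr/sbin/ls-secldapclntd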
RBAC
During the LDAP client configuration, the PowerHA defined roles are created in the LDAP
server.
Verify the configuration of the RBAC roles in the LDAP server by using the ODM command
that is shown in Example 8-13.
Example 8-13 ODM command to verify RBAC configuration into LDAP server
# odmget -q "group=LDAPClient and name=RBACConfig" HACMPLDAP
HACMPLDAP:
group = "LDAPClient"
type = "RBAC"
name = "RBACConfig"
value = "YES"
Verify the four PowerHA defined roles that are created in LDAP, as shown in Example 8-14.
Example 8-14 shows that RBAC is configured and can be used by the cluster users and groups. The usage scenarios for roles by cluster users and groups are described in EFS on page 341.
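You can also query a role directly with the AIX lsrole command against the LDAP module (a minimal sketch; repeat for ha_op, ha_mon, and ha_view):
# lsrole -R LDAP ha_admin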
(Flowchart excerpt: if LDAP is not configured, create an EFS keystore (shared) file system and store the EFS keystore in that shared file system.)
To configure the EFS management configuration (Figure 8-13), run smitty sysmirror and select System Management (C-SPOC) → Security and Users → EFS Management.
Under EFS management, the options are provided to enable EFS and to store keystores
either in LDAP or a shared file system.
[Entry Fields]
* EFS keystore mode LDAP +
EFS admin password []
Volume group for EFS Keystore [] +
Service IP [] +
+--------------------------------------------------------------------------+
| EFS keystore mode |
| |
| Move cursor to desired item and press Enter. |
| |
| LDAP |
| Shared Filesystem |
| |
| F1=Help F2=Refresh F3=Cancel |
F1| F8=Image F10=Exit Enter=Do |
F5| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
Important: The volume group and service IP fields do not apply and are ignored in LDAP mode.
Verify the EFS enablement as understood by the cluster, by using the command that is shown
in Example 8-15.
HACMPLDAP:
group = "EFSKeyStore"
type = "EFS"
name = "mode"
value = "1"
As shown in Figure 8-13 on page 341, to enable EFS and to store the EFS keystore in the
shared file system, provide the volume group and service IP details:
The volume group to store the EFS keystore in a file system
The service IP to mount the file system where the keystore is stored so that it is highly
available to cluster nodes
Important: The file system creation, mount point, and NFS export are performed internally
under the EFS keystore in a shared file system option.
Verify the configuration by using the ODM command that is shown in Example 8-16.
Example 8-16 ODM command to verify EFS configuration in shared file system mode
# odmget -q "group=EFSKeyStore AND name=mode" HACMPLDAP
HACMPLDAP:
group = "EFSKeyStore"
type = "EFS"
name = "mode"
value = "2"
When planning a virtual environment in which to run a PowerHA cluster, we must focus on improving hardware redundancy at the Virtual I/O Server level and also in the PowerHA cluster nodes. Typically, the Virtual I/O Server hosts the physical hardware being presented to the cluster nodes, so a critical question to address is: What would happen to your cluster if each or any of those devices were to fail?
A single Virtual I/O Server is a single point of failure, so present shared disks and virtual Ethernet to your cluster nodes from at least two Virtual I/O Server partitions. Figure 9-1 shows an example of considerations for PowerHA clusters in a virtualized environment.
For more information about configuring Virtual I/O Servers, see IBM PowerVM Virtualization
Introduction and Configuration, SG24-7940.
Several ways are available to configure AIX client partitions and resources for higher availability with PowerHA. We suggest that you use at least two Virtual I/O Servers so that maintenance tasks can be performed at that level without an outage. An example of a PowerHA configuration that is based on VIO clients is shown in Figure 9-2.
All volume group creation and maintenance is done by using the C-SPOC function of PowerHA, and the bos.clvm.enh file set must be installed.
Example configuration
The following steps describe an example of how to set up concurrent disk access for a SAN
disk that is assigned to two client partitions. Each client partition sees the disk through two
Virtual I/O Servers. On the disk, an enhanced concurrent volume group is created. This kind
of configuration can be used to build a two-node PowerHA test cluster on a single POWER
machine:
1. Create the disk on the storage device.
2. Assign the disk to the Virtual I/O Servers.
3. On the first Virtual I/O Server, do the following tasks:
a. Scan for the newly assigned disk:
$ cfgdev
b. Change the SCSI reservation of that disk to no_reserve so that the SCSI reservation
bit on that disk is not set if the disk is accessed:
$ chdev -dev hdiskN -attr reserve_policy=no_reserve
Where N is the number of the disk, and reservation commands are specific to the
multipathing disk driver in use. This parameter is used with IBM DS4000 disks; it can
be different with other disk subsystems.
c. Assign the disk to the first partition:
$ mkvdev -vdev hdiskN -vadapter vhostN [ -dev Name ]
Where N is the number of the disk. The vhost number and the device name can be chosen as you want; the -dev parameter can also be left out entirely, in which case the system creates a name automatically.
d. Assign the disk to the second partition:
$ mkvdev -f -vdev hdiskN -vadapter vhostN [ -dev Name ]
4. On the second Virtual I/O Server, do the following tasks:
a. Scan for the disk:
$ cfgdev
b. Change the SCSI reservation of that disk:
$ chdev -dev hdiskN -attr reserve_policy=no_reserve
c. Assign the disk to the first cluster node:
$ mkvdev -vdev hdiskN -vadapter vhostN [ -dev Name ]
d. Assign the disk to the second cluster node:
$ mkvdev -f -vdev hdiskN -vadapter vhostN [ -dev Name ]
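On each client partition, you can then verify that the disk is visible through both Virtual I/O Servers before creating the enhanced concurrent volume group with C-SPOC (a minimal sketch; the actual hdisk numbering depends on your configuration):
# cfgmgr (scan for the new virtual disk)
# lsdev -Cc disk (the disk appears as a Virtual SCSI Disk Drive)
# lspv (the same PVID must be visible on both client partitions)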
For cluster nodes that use virtual Ethernet adapters, multiple configurations are possible for maintaining high availability at the network layer. Consider the following factors:
Configure dual VIOS to ensure high availability of virtualized network paths.
Use the servers that are already configured with virtual Ethernet settings because no
special modification is required. For a VLAN-tagged network, the preferred solution is to
use SEA fallover; otherwise, consider using the network interface backup.
One client-side virtual Ethernet interface simplifies the configuration; however, PowerHA might miss network events. For a more comprehensive cluster configuration, configure two virtual Ethernet interfaces on the cluster LPAR. PowerHA requires two network interfaces to track network changes, similar to physical network cards. Be sure that the two client-side virtual Ethernet adapters use different SEAs. This ensures that changes in the physical network environment are reported to the PowerHA cluster through the virtual Ethernet adapters, just as in a cluster with physical network adapters.
Important: You can perform LPM on a PowerHA SystemMirror LPAR that is configured
with SAN communication. However, when you use LPM, the SAN communication is not
automatically migrated to the destination system. You must configure the SAN
communication on the destination system before you use LPM. Full details can be found
at:
http://www-01.ibm.com/support/knowledgecenter/SSPHQG_7.1.0/com.ibm.powerha.admngd/ha_admin_config_san.htm
PowerHA SystemMirror 7.1.3 supports SAN-based heartbeat within a site. The SAN
heartbeating infrastructure can be accomplished in many ways:
Using real or physical adapters on cluster nodes and enabling the storage framework
capability (sfwcomm device) of the HBAs. Currently FC and SAS technologies are
supported. For more details about the HBAs and the steps to set up the storage
framework communication, see the following IBM Knowledge Center topic:
http://pic.dhe.ibm.com/infocenter/aix/v7r1/index.jsp?topic=/com.ibm.aix.clusteraware/claware_comm_setup.htm
In a virtual environment using NPIV or vSCSI with a Virtual I/O Server, enabling the sfwcomm interface requires activating target mode (the tme attribute) on the real adapters in the VIOS and defining a private virtual LAN (VLAN) with VLAN ID 3358 for communication between the partitions that contain the sfwcomm interface and the VIOS (the VLAN acts as a control channel, as in SEA fallover). The real adapter on the VIOS must be a supported HBA.
Using FC or SAN heartbeat requires zoning of corresponding FC adapter ports (real FC
adapters or virtual FC adapters on VIOS).
After the zoning is complete, the next step is to enable the target mode enabled (tme) attribute. The tme attribute for a supported adapter is available only when the minimum AIX level for CAA is installed (AIX 6.1 TL6 or later, or AIX 7.1 TL0 or later). This must be performed on all Virtual I/O Servers. The configuration steps are as follows:
1. Configure the FC adapters for SAN heartbeating on VIOS:
# chdev -l fcsX -a tme=yes -P
2. Repeat step 1 for all FC adapters.
3. Set dynamic tracking to yes and FC error recovery to fast_fail:
# chdev -l fscsiX -a dyntrk=yes -a fc_err_recov=fast_fail -P
4. Reboot the VIOS.
5. Repeat steps 1 - 4 for every VIOS that serves the cluster LPARs.
6. On the HMC, create a new virtual Ethernet adapter for each cluster LPAR and VIOS. Set the VLAN ID to 3358. Do not put any other VLAN ID or any other traffic on this interface. Save the LPAR profile.
7. On the VIOS, run the cfgmgr command and check for the virtual Ethernet and sfwcomm
device by using the lsdev command:
# lsdev -C | grep sfwcomm
sfwcomm0 Available 01-00-02-FF Fibre Channel Storage Framework Communication
sfwcomm1 Available 01-01-02-FF Fibre Channel Storage Framework Communication
8. On the cluster nodes, run the cfgmgr command, and check for the virtual Ethernet adapter
and sfwcomm with the lsdev command.
9. No other configuration is required at the PowerHA level. When the cluster is configured and running, you can check the status of the SAN heartbeat by using the lscluster -i command and checking the state of the sfwcom interface, as sketched below.
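A quick check from a cluster node looks like the following sketch; the sfwcom line of the lscluster -i output shows the interface state, which is UP when the SAN heartbeat is working:
# lscluster -i | grep sfwcom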
We expect that proper LPAR and DLPAR planning is part of your overall process before
implementing any similar configuration. Understanding the following topics is important:
The requirements and how to implement them
The effects that each decision has on the overall implementation.
9.3.1 Requirements
To use the integrated DLPAR functions, or capacity on demand (CoD) of PowerHA on
POWER5 and later, all LPAR nodes in the cluster must have at least the following levels
installed:
PowerHA 5.5 or later
Appropriate AIX level for specific PowerHA version
Appropriate RSCT level for specific AIX level
OpenSSH
The OpenSSH software can be obtained from the base AIX media and often is installed by
default.
The HMC attachment to the LPARs is required for proper management and DLPAR
capabilities. The HMC must be network attached on a common network with the LPARs to
allow remote DLPAR operations.
Important: A key configuration requirement, and one that PowerHA assumes is in place, is
that the LPAR partition name, the cluster node name, and AIX host name must match.
Other considerations
When planning a cluster to include DLPAR operations, the following considerations apply:
Encountering possible config_too_long message during DLPAR events
Mix of dedicated and shared LPARs
Mix of POWER server types
CoD provisioning
Although HMC versions might not support all POWER platforms, in general, PowerHA does.
This means that a POWER7 production system can fallover to a POWER6 system.
As with any cluster, the configuration must be tested thoroughly. This includes anything that
can be done to simulate or produce a real work load for the most realistic test scenarios as
possible.
CoD Limitations
To use CoD, a license key must already be entered into the HMC so that PowerHA can
enable it.
The CoD types and the resources that PowerHA can activate are as follows:
Permanent: CPU=Yes, Memory=Yes
On/Off: CPU=Yes, Memory=No
Trial: CPU=Yes, Memory=Yes
Utility: Utility CoD is performed at the hypervisor/system level; PowerHA cannot perform this role.
Overview
When you configure an LPAR on the HMC (outside of PowerHA), you provide LPAR minimum,
desired, and maximum values for the number of CPUs and amount of memory. These values
can be obtained by running the lshwres command on the HMC. The stated minimum values
of the resources must be available when an LPAR node starts. If more resources are available
in the free pool on the frame, an LPAR can allocate up to the stated desired values.
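For example, these profile values can be displayed from the HMC command line (a minimal sketch; p750_4 and jessica are the managed system and LPAR names used later in this chapter, and the attributes shown apply to dedicated resources; shared-processor partitions report proc_units attributes instead):
# lshwres -r mem -m p750_4 --level lpar --filter "lpar_names=jessica" -F curr_min_mem,curr_mem,curr_max_mem
# lshwres -r proc -m p750_4 --level lpar --filter "lpar_names=jessica" -F curr_min_procs,curr_procs,curr_max_procs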
During dynamic allocation operations, the system does not allow the values for CPU and
memory to go below the minimum or above the maximum amounts specified for the LPAR.
PowerHA obtains the LPAR minimum and maximum values and uses them to allocate and
release CPU and memory when application servers are started and stopped on the LPAR
node.
PowerHA requests the DLPAR resource allocation on the HMC before the application servers
are started, and releases the resources after the application servers are stopped. The Cluster
Manager waits for the completion of these events before continuing the event processing in
the cluster.
PowerHA handles the resource allocation and release for application servers serially, regardless of whether the resource groups are processed in parallel. This minimizes conflicts between application servers that try to allocate or release the same CPU or memory resources. Therefore, you must carefully configure the cluster to correctly handle all CPU and memory requests on an LPAR.
If you want, you can create a custom event or customize application start and stop scripts to stop LPAR nodes.
More details about this topic are in Administering PowerHA SystemMirror:
http://public.dhe.ibm.com/systems/power/docs/powerha/71/hacmpadmngd_pdf.pdf
In general, PowerHA tries to allocate as many resources as possible to meet the desired
amount for the application, and uses CoD, if allowed, to do this.
In general, PowerHA starts counting the extra resources that are required for the application from the minimum amount. That is, the minimum resources are retained for the node's overhead operations and are not used to host an application.
In this case, PowerHA does not allocate any extra resources and the application can be
successfully started on the LPAR node. PowerHA also calculates that the node has enough
resources for this application in addition to hosting all other application servers that might be
currently running on the node.
Resources requested from the free pool and from the CoD pool
If the number of resources in the free pool is insufficient to satisfy the total amount requested
for allocation (minimum requirements for one or more applications), PowerHA requests
resources from CoD (if enabled).
If PowerHA meets the requirement for a minimum amount of resources for the application
server, application server processing continues. Application server processing continues
even if the total desired resources (for one or more applications) have not been met or are
only partially met. In general, PowerHA attempts to acquire up to the desired amount of
resources requested for an application.
If the amount of resources is still insufficient to host an application, PowerHA starts resource
group recovery actions to move the resource group to another node.
Based on the resource group processing order, some resource groups (hence the
applications) might not be started. This is explained further in Examples of using DLPAR and
CoD resources on page 358.
PowerHA first releases the DLPAR or CoD resources it acquired last. This implies that the
CoD resources might not always be released before the dynamic LPAR resources are
released.
The free pool is limited to only the single frame. That is, for clusters configured on two frames,
PowerHA does not request resources from the second frame for an LPAR node residing on
the first frame.
Also, if LPAR 1 releases an application that puts some DLPAR resources into free pool,
LPAR 2, which is using the CoD resources, does not attempt to release its CoD resources
and acquire the free DLPAR resources.
If the LPAR is not stopped after the Cluster Manager is forced down on the node, the CPU
and memory resources remain allocated to the LPAR for use when the LPAR rejoins the
cluster.
The new configuration is not reflected until the next event that causes the application (hence
the resource group) to be released and reacquired on another node. This means that a
change in the resource requirements for CPUs, memory, or both does not cause the
recalculation of the DLPAR resources. PowerHA does not stop and restart application servers
solely for the purpose of making the application provisioning changes.
If another dynamic reconfiguration change (for example, an rg_move event) causes the
resource groups to be released and reacquired, the new resource requirements for DLPAR
and CoD are used at the end of this dynamic reconfiguration event.
Note: Be aware that after PowerHA acquires additional resources for an application server,
when the server moves again to another node, it takes the resources with it. That is, the
LPAR node releases all the additional resources it acquired.
The configuration is an eight-CPU frame, with a two-node (each an LPAR) cluster. There are two CPUs available in the CoD pool, that is, through the CoD activations. The nodes have the partition profile characteristics shown in Table 9-2, and Table 9-3 lists the three defined application servers, each belonging to a separate resource group.
Table 9-2 LPAR CPU profile values
Node    Minimum    Maximum
Bear    1          9
Aggie   1          5
Table 9-3 Application server CPU requirements
Application    Minimum    Desired    Use CoD
App1           1          1          Yes
App2           2          2          No
App3           4          4          No
Note: If the minimum resources for App2 had been set to zero instead of one, the acquisition would have succeeded because no additional resources would be required.
Example 5: Resource group failure, LPAR min and max are same
This example demonstrates a real case that we encountered during our early testing. It is a direct result of improper planning in regard to how application provisioning works.
We are still using an 8-CPU frame, however, the additional application servers and nodes are
not relevant to this example. The LPAR configuration for node Bear is shown in Table 9-4. The
App1 application server has the settings shown in Table 9-5.
Table 9-4 LPAR CPU profile for node Bear: Minimum 4, Desired 4, Maximum 4
Table 9-5 App1 application server CPU settings: Minimum 1, Desired 4
App1 application server is started locally on node Bear. During acquisition, the LPAR
minimum is checked and added to the application server minimum, which returns a total of 5.
This total exceeds the LPAR maximum setting and results in the resource group going into the
error state.
Although technically the LPAR might already have enough resources to host the application,
because of the combination of settings, it results in a failure. Generally, you will not have the
minimum and maximum settings equal.
This scenario might have been avoided in any one of these three ways:
Change LPAR minimum to 3 or less.
Change LPAR maximum to more than 4.
Change App1 minimum CPUs to 0.
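The first two changes are partition profile updates that are made on the HMC, either through the GUI or with the chsyscfg command. A minimal sketch follows (the managed system name sys1 and profile name normal are hypothetical; profile changes take effect at the next activation of the partition):
# chsyscfg -r prof -m sys1 -i "name=normal,lpar_name=Bear,min_procs=3"
# chsyscfg -r prof -m sys1 -i "name=normal,lpar_name=Bear,max_procs=5"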
Ensure that all nodes and the HMC are configured identically by checking the following list. All systems (all PowerHA nodes and the HMC) must do these tasks:
Resolve the participating host names and IP addresses identically. This includes reverse name resolution.
Use the same type of name resolution, either short or long name resolution. All systems should use the same name resolution order, either local or remote.
To ensure these requirements, check the following files:
/etc/hosts on all systems
/etc/netsvc.conf on all AIX nodes
/etc/host.conf, if applicable, on the HMC
We expect that knowing how to check these files on the AIX systems is common knowledge.
However, not as well known is how to check the files on the HMC, which is covered in the
following sections.
With each version of SSH and HMC code, these steps might differ slightly. We document our
processes, which we used to successfully implement our environment.
You can now install it by using smitty install_all. The core file sets that are required and the results of our installation are shown in Example 9-1 on page 362.
Tip: Be sure to choose yes in the field to accept the license agreement.
Now that SSH is installed, we need to configure the PowerHA nodes to access the HMC without passwords for remote DLPAR operations.
Note: Although we normally suggest that you create a separate HMC user for remote
command execution, PowerHA uses hscroot.
You must create the SSH directory $HOME/.ssh for the root user to store the authentication keys. PowerHA runs the SSH remote DLPAR operations as the root user. By default, this is /.ssh, which is what we used.
To generate public and private keys, run the following command on each PowerHA node:
/usr/bin/ssh-keygen -t rsa
Ensure that the write bits for both group and other are turned off, and that the private key has permissions of 600.
The HMC's public key must be in the known_hosts file on each PowerHA node, and vice versa.
This is easily accomplished by running ssh to the HMC from each PowerHA node. The first
time you run it, you are prompted to insert the key into the file. Answer yes to continue. You are then prompted to enter a password, which is still required at this point because we have not yet completed the setup to allow non-password SSH access, as shown in Example 9-2.
When using two HMCs, you must repeat this process for each HMC. You can also do this
between all member nodes to allow SSH types of operations between them (for example, scp,
sftp, and ssh).
To allow non-password SSH access, we put each PowerHA node's public key into the authorized_keys2 file on the HMC. This can be done in more than one way; you can consult the HMC documentation about using mkauthkeys. However, here is an overview of the steps we used:
1. Copy (scp) the authorized_keys2 file from the HMC to the local node.
2. Concatenate (cat) the public key for each node into the authorized_keys2 file.
3. Repeat on each node.
4. Copy (scp) the concatenated file to the HMC /home/hscroot/.ssh.
1. In the /.ssh directory, we copied the authorized_keys2 file from the HMC to the local node by running this command:
scp hscroot@itsohmc:~/.ssh/authorized_keys2 ./authorized_keys2.hmc
Next, from /.ssh on the AIX LPARs, we made a copy of the public key and renamed it to include the local node name as part of the file name. We then copied, through scp, the public key of each machine (jessica and shanley) to one node (cassidy).
2. We then ran the cat command to create an authorized_keys2 file that contains the public
key information for all PowerHA nodes. The commands run on each node are shown in
Example 9-4.
When running the scp command to the HMC, you are prompted to enter the password for the hscroot user. Then, the authorized_keys2 file is copied. You can then test whether each node can use ssh to the HMC without a password; when this works, this step is completed and PowerHA verification of the HMC communications succeeds.
You can obtain the managed system names through the HMC console in the navigation area.
The managed system name can be a user-created name, or the default name is the machine
type and serial number.
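From a cluster node, you can verify the passwordless access and list the managed system names in one step (a sketch; itsohmc is the HMC host name used in our environment). If the key setup is correct, the command returns the names without prompting for a password:
# ssh hscroot@itsohmc lssyscfg -r sys -F name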
To define the HMC communication for each PowerHA node, use these steps:
1. In smitty sysmirror, select Cluster Applications and Resources → Resources → Configure User Applications (Scripts and Monitors) → Configure Application for Dynamic LPAR and CoD Resources → Configure Communication Path to HMC → Add HMC IP addresses for a node, and then press Enter.
Tip: You can use the SMIT fast path of smitty cladd_apphmc.dialog.
2. Complete the following fields (Figure 9-4 on page 366 shows the SMIT panel):
Node Name
Select a node name to associate with one or more Hardware Management Console (HMC) IP addresses and a Managed System.
HMC IP Addresses
Enter one or more space-separated IP addresses for the HMC. If addresses are added
for more than one HMC, PowerHA tries to communicate with each HMC until a working
communication path is found. After the communication path is established, PowerHA
uses this path to run the dynamic logical partition commands on that HMC.
[Entry Fields]
* Node Name [jessica] +
* HMC IP Address(es) [192.168.100.2] +
Managed System Name [p750_4] +
Figure 9-4 Defining HMC and Managed System to PowerHA
3. Press Enter.
4. Repeat this process for each node in the cluster. In our case, we perform it three times.
During cluster verification, PowerHA verifies that the HMC is reachable by first issuing a ping command to the specified IP address. If the HMC responds, then PowerHA verifies that each specified PowerHA node is DLPAR-capable by issuing an lssyscfg command, through ssh, on the HMC.
Verification error: The Managed System Name is not a required field; PowerHA can extrapolate the name from the HMC and partition name. During our testing, a verification error was encountered when we provided the managed system name. If you encounter this, contact IBM support for a formal resolution. In our case, we removed the managed system name and everything then worked correctly.
Example 9-6 on page 367 shows the HMC definitions for each of our cluster nodes.
Tip: You can use the SMIT fast path of smitty cm_cfg_appdlpar.
When the application requires more resources to be allocated on this node, PowerHA
performs its calculations to determine whether it needs to request only the DLPAR resources
from the free pool on the frame and whether that will satisfy the requirement, or if CoD
resources are also needed for the application server. After that, PowerHA proceeds with
requesting the desired amounts of memory and numbers of CPU, if you selected to use them.
PowerHA also verifies that the total of required resources for all application servers that can
run concurrently on the LPAR is less than the LPAR maximum. If this requirement is not met,
PowerHA issues a warning.
Note: This scenario can happen upon subsequent fallovers. That is, if the LPAR node is
already hosting application servers that require DLPAR and CoD resources, then upon
acquiring yet another application server, the possibility exists that the LPAR cannot acquire
any more resources beyond its LPAR maximum. PowerHA verifies this case and issues a
warning.
An application provisioning example for our test configuration is shown in Figure 9-6.
After adding both the HMC communications and application provisioning, synchronize the
cluster.
Example 9-7 shows an error message that gives good probable causes for a problem.
However, the following two actions can help you to discover the source of the problem:
Ping the HMC IP address.
Use the ssh hscroot@hmcip command to the HMC.
If ssh is unsuccessful (Example 9-8) or prompts for a password, this is an indication that SSH
was not correctly configured.
If the message in Example 9-8 appears by itself, it is normally an indication that access to the HMC is working, but the particular node's matching LPAR definition is not reporting that it is DLPAR-capable. This might be caused by RMC not updating properly. Generally, this is rare, and usually applies only to POWER4 systems. You can verify this manually from the HMC command line, as shown in Example 9-9.
Note: The HMC command syntax can vary by HMC code levels and type.
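A manual check of this type looks like the following sketch (assuming managed system p750_4; the exact field names can vary by HMC level):
# lssyscfg -r lpar -m p750_4 -F name,dlpar_mem_capable,dlpar_proc_capable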
Also, be sure that RMC communication to the HMC (port 657) is working, and restart the RSCT daemons by running these commands, in this order, on the cluster node:
1. /usr/sbin/rsct/install/bin/recfgct
2. /usr/sbin/rsct/bin/rmcctrl -z
3. /usr/sbin/rsct/bin/rmcctrl -A
4. /usr/sbin/rsct/bin/rmcctrl -p
During our testing, we ran several events within short periods of time. At certain points, our
LPAR reported that it was no longer DLPAR-capable. Then after a short period, it reported
normally again. We believe that this occurred because RMC information became out-of-sync
between the LPARs and the HMC.
For shared storage, we used a DS4800 with 10 GB LUNs, two of which were assigned to
each production resource group. For purposes of our testing, the shared storage was not
important other than trying to set up a more complete cluster configuration.
These LPARs are configured with the partition profile settings listed in Table 9-6. This table
shows CPUs, virtual processor (VP), and memory values.
We have two configured resource groups, jessdlparrg and cassdlparrg, each containing their
own corresponding application servers, dummyapp1 and dummyapp2 respectively. Each
resource group is configured as online on home node. Jessdlparrg has participating nodes of
Jessica and Shanley. Cassdlparrg has participating nodes of Cassidy and Shanley. These
make our cluster a 2+1 setup, with node Shanley as the standby node.
The application server DLPAR configuration settings (minimum and desired) are shown in
Table 9-7 on page 372.
We specifically chose the minimum settings of zero to always allow our resource group to be
acquired.
Note: We suggest setting the partition profile values of minimum and desired to the same value, which should be the minimum that the application will run with. This allows the PowerHA DLPAR minimum to always be set to 0. This applies to both processor and memory.
Important: In most POWER systems, even if partitions are inactive, if they have resources assigned to them (that is, their minimum, desired, and maximum settings are not set to zero), those resources are not available in the free pool.
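You can check the current free pool on a frame from the HMC command line (a minimal sketch, assuming managed system p750_4):
# lshwres -r proc -m p750_4 --level sys -F curr_avail_sys_proc_units
# lshwres -r mem -m p750_4 --level sys -F curr_avail_sys_mem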
Upon starting cluster services on Jessica, dummyapp1 is started locally and attempts to acquire the desired amount of resources assigned. Because there are enough CPU resources, another .9 CPUs are added: 1.4 is the desired number in the application DLPAR setting and .5 is the partition minimum, so a total of 1.9 is allocated.
For memory, an additional 8 GB is allocated, totaling 12 GB. The partition minimum is 4 GB and the desired application DLPAR setting is 9 GB; adding the entire desired 9 GB would require 13 GB total. However, our partition has a memory maximum setting of 12 GB. Therefore, the result is 12 GB, as shown in Figure 9-8 on page 373.
We repeat this scenario, this time with the other node Cassidy and application controller
dummyapp2. Cassidy is active on its desired partition settings of .5 CPU and 4 GB of memory
as shown in Figure 9-9.
Upon starting cluster services on Cassidy, dummyapp2 is started locally and attempts to acquire the desired amount of resources assigned. Because the partition has a maximum setting of 2 CPUs, it can allocate only up to 2. Remember, our processor DLPAR application minimum was 0 and desired was 2.2.
For memory, an extra 10 GB is allocated, totaling 14 GB. The partition minimum is 4 GB and the desired application DLPAR setting is 11 GB; adding the entire desired 11 GB would require 15 GB total. However, our partition has a memory maximum setting of 14 GB, so the maximum allowed memory size was allocated, as shown in Figure 9-10 on page 374.
Upon stopping cluster services on Jessica, dummyapp1 is stopped locally and releases its resources. When releasing resources, PowerHA releases resources back to the partition minimum settings. In our example, this is more than it originally acquired: although the memory amount released, 8 GB, is the same, it actually releases 1.4 CPU when it originally acquired only .9 CPU.
We repeat our scenario, but this time node Cassidy is online in the cluster with dummyapp2 running on its partition with the settings shown in the previous scenario.
Upon stopping cluster services on Cassidy, dummyapp2 is stopped locally and releases its resources. When releasing resources, PowerHA releases resources back to the partition minimum settings. In this example, it is the same amount it originally
acquired. This is primarily because the partition profile minimum and desired settings are the
same. Unlike in the previous example of node Jessica, we essentially return to where we
started, as shown in Figure 9-9 on page 373.
Shanley is currently online with desired resources of .5 CPU and 8 GB of memory, as shown
in Figure 9-12. For the first part, we perform a resource group move of jessdlparrg to the
standby node Shanley. Because it involves acquiring the resource group, it gives the same net effect as though it were a hard fallover.
This results in a fallover to node Shanley. Shanley acquires the jessdlparrg resource
group and allocates 1.1 CPU and 5 GB of memory. The reason is that the minimum profile
setting for node Shanley is .2 CPU, so it added the application DLPAR setting of 1.4 more
CPU. Similarly for memory, the minimum profile setting is 4 GB. It was able to add the desired
application DLPAR memory setting of 9 GB. The final result of 1.6 CPU and 13 GB of memory
is shown in Figure 9-13.
Node Shanley now actually has less CPU but more memory than the original hosting node
Jessica. This is a direct result of the differing partition minimums.
In the second part of this Scenario 3, we move cassdlparrg to node Shanley. Node Cassidy
has the assigned resources shown in Figure 9-10 on page 374. Node Shanley still has the
resource from the previous test, as shown in Figure 9-13.
Node Shanley takes over the cassdlparrg resource group and acquires more resources, as
shown in Figure 9-14 on page 376.
This time we move dummyapp2 from node Cassidy first. This results in node Shanley
acquiring 1.9 more CPU. This is the result of adding 2.2 CPU from the application DLPAR
setting to the .2 partition minimum setting. It also acquires the full 11 GB of memory, as
shown in Figure 9-15.
For the second part of Scenario 4, we continue by moving dummyapp1 from node Jessica to
the backup node Shanley.
Node Shanley takes over the jessdlparrg resource group and acquires the desired resources, as shown in Figure 9-16. The result is not the same as in the first part of Scenario 4, because the 1.9 CPU requirement was already met, although it still adds extra memory.
Node Shanley is currently hosting both resource groups. We run rg_move on jessdlparrg
from node Shanley back to its home node of Jessica. Shanley releases 9 GB of memory,
while Jessica acquires 1.4 CPU and 8 GB of memory, as shown in Figure 9-8 on page 373.
Shanley's current state is shown in Figure 9-17. This again is a direct result of the combination of application provisioning and the LPAR profile settings.
Figure 9-17 Jessica resource group release from backup node back to primary node Jessica
Next, we perform an rg_move event of cassdlparrg with dummyapp2 from the backup node
Shanley to its original home node Cassidy. This results in node Shanley resource reverting to
the original minimum LPAR settings, as shown in Figure 9-18.
Figure 9-18 Cassidy resource group release from backup node back to primary node Cassidy
Cassidy resumes the resources as it did in its original resource group hosting state, shown in
Figure 9-10 on page 374.
Important: It is important to test to make sure that the desired results are achieved. As we have shown, the order of acquisition and release can change the amounts of resources that are acquired and released.
Live Partition Mobility provides the facility to avoid downtime for planned hardware maintenance. However, it does not offer the same for software maintenance or unplanned downtime. That is why PowerHA is still relevant today.
PowerHA can be used within a partition that is capable of being moved with Live Partition
Mobility. This does not mean that PowerHA uses Live Partition Mobility in any way. PowerHA
is treated as another application within the partition.
With the requisite levels of software and firmware installed on the source and destination
POWER6 (or later) servers, the Live Partition Mobility feature can be used to migrate an
LPAR running as a PowerHA node without affecting the state or operation of the PowerHA
cluster, provided that the PowerHA cluster is configured to use standard (that is, default)
heartbeat parameters.
In that case, the effect on the application servers running under PowerHA control is a brief
suspension of operations during the migration. Neither PowerHA nor the application servers
must be restarted.
To perform LPM on a PowerHA SystemMirror node in a cluster, the following steps can be used. If your environment also has SANcomm defined, then also see Performing LPM with SANcomm defined on page 379.
Note: To complete step 2 and step 4, you must apply the following APARs that correspond
to the version of PowerHA SystemMirror in your environment:
PowerHA SystemMirror Version 7.1.1 - IV65537
PowerHA SystemMirror Version 7.1.2 - IV65538
PowerHA SystemMirror Version 7.1.3 - IV65536
1. Optional: Stop PowerHA SystemMirror cluster services by completing the following steps: From the command line, enter smit cl_admin. From the SMIT interface, select PowerHA SystemMirror Services → Stop Cluster Services. Then, for Select an Action on Resource Groups, choose Unmanage Resource Groups and press Enter.
2. Disable Dead Man Switch monitoring in the cluster by entering the following commands, where lpm_lpar_name is the name of the node you are performing LPM on:
/usr/sbin/clctrl -tune -n lpm_lpar_name -o deadman_mode=e
/usr/sbin/rsct/bin/dms/stopdms -s cthags
Note: The previous step is required because the LPAR time stamp might change when the
LPAR is moving during the migration.
Important: You can perform LPM on a PowerHA SystemMirror LPAR that is configured
with SAN communication. However, when you use LPM, the SAN communication is not
automatically migrated to the destination system. You must configure SAN communication
on the destination system before you use LPM. Full details can be found at:
http://www-01.ibm.com/support/knowledgecenter/SSPHQG_7.1.0/com.ibm.powerha.admngd/ha_admin_config_san.htm
In our scenario we have a two-node cluster, Jessica and Shanley, each on their own managed
systems named p750_4 and p750_2 respectively. Both systems and nodes are using
sancomm. We have a third managed system, p750_3, in which sancomm is not configured. However, its VIOS adapters are target-mode-capable and currently enabled.
In both scenarios, when we first start running the migration during the verification process, a
warning is displayed, as shown in Figure 9-19.
Because this is only a warning, we can continue. The LPM process completes and node jessica is active and running on p750_3. However, sancomm is no longer functioning, as shown by the lack of output from the following lscluster command:
[jessica:root] /utilities # lscluster -m |grep sfw
[jessica:root] /utilities #
Next, we add new virtual Ethernet adapters that use VLAN 3358 to each VIOS. We then run cfgmgr on each VIOS to configure the sfwcomm device. No further action is required on node jessica because its profile already contains the proper virtual adapter.
In the second scenario, we repeat the LPM. However, this time the target system already has both sancomm devices configured on its VIOS and the appropriate virtual Ethernet adapters. During the LPM, we did notice a couple of seconds in which sfwcom registered as being down, but then it automatically came back online.
During the acquisition of the resource groups on cluster startup, you can also see the settling
time value by running the clRGinfo -t command as shown in Example 10-1 on page 383.
Note: A settling time with a non-zero value will be displayed only during the acquisition of
the resource group. The value will be set to 0 after the settling time expires and the
resource group is acquired by the appropriate node.
We specified a settling time of 6 minutes and configured a resource group named SettleRG1 to use the startup policy Online on First Available Node. We set the node list for the resource group so that node jessica would fall over to node cassidy.
For the first test, the following steps demonstrate how we let the settling time expire and how
the secondary node acquires the resource group:
1. With cluster services inactive on all nodes, define a settling time value of 360 seconds.
2. Synchronize the cluster.
3. Validate the settling time by running clsettlingtime as follows:
[jessica:root] / # /usr/es/sbin/cluster/utilities/clsettlingtime list
#SETTLING_TIME
360
4. Start cluster services on node cassidy.
We started cluster services on this node because it was the last node in the list for the resource group. After starting cluster services, the resource group was not immediately acquired by node cassidy. Running the clRGinfo -t command displays the 360-second settling time, as shown in Example 10-2.
For the next test scenario, we demonstrate how the primary node starts the resource group when the settling time has not yet expired.
1. Repeat the previous step 1 on page 383 through step 4 on page 383.
2. Start cluster services on node jessica.
About two minutes later, after the cluster stabilized on node cassidy, we start cluster services on node jessica. This results in the resource group being brought online on node jessica, as shown in Figure 10-2 on page 385.
Note: This feature is effective only when cluster services on a node are started. This is not
enforced when C-SPOC is used to bring a resource group online.
This policy causes resource groups having this startup policy to spread across cluster nodes
in such a way that only one resource group is acquired by any node during startup. This can
be used, for instance, for distributing CPU-intensive applications on different nodes.
If two or more resource groups are offline when a particular node joins the cluster, this
policy determines which resource group is brought online based on the following criteria and
order of precedence:
1. The resource group with the least number of participating nodes will be acquired.
2. In case of a tie, the resource group to be acquired is chosen alphabetically.
3. A parent resource group is preferred over a resource group that does not have any child
resource group.
[Entry Fields]
* Resource Group Name [RG1]
* Participating Nodes (Default Node Priority) [node1 node2 node3]
2. Start cluster services on node cassidy. RG2 was acquired because of alphabetical order. RG3 stays offline. RG3 can be brought online manually through C-SPOC. This is done by running smitty cspoc, selecting Resource Group and Applications → Bring a Resource Group Online, choosing RG3, and then choosing the node on which you want to start it.
Important: Dynamic node priority is relevant only to clusters with three or more nodes
participating in the resource group.
The cluster manager queries the RMC subsystem every three minutes to obtain the current
value of these attributes on each node and distributes them cluster wide. The interval at which
the queries of the RMC subsystem are performed is not user-configurable. During a fallover
event of a resource group with dynamic node priority configured, the most recently collected
values are used in the determination of the best node to acquire the resource group.
For dynamic node priority (DNP) to be effective, consider the following information:
DNP cannot be used with fewer than three nodes.
DNP cannot be used for Online on All Available Nodes resource groups.
DNP is most useful in a cluster where all nodes have equal processing power and
memory.
Important: The highest free memory calculation is performed based on the amount of
paging activity taking place. It does not consider whether one cluster node has less real
physical memory than another.
For more details about how predefined DNP values are used, see step 3 on page 393.
When you select one of these criteria, you must also provide values for the DNP script path and DNP time-out attributes for the resource group. When the DNP script path attribute is specified, the given script is invoked on all nodes and the return values are collected from all nodes. The fallover node decision is made by using these values and the specified criteria. If you choose the cl_highest_udscript_rc attribute, the collected values are sorted and the node that returned the highest value is selected as the candidate node to fall over to. Similarly, if you choose the cl_lowest_nonzero_udscript_rc attribute, the collected values are sorted and the node that returned the lowest nonzero positive value is selected as the candidate node.
Demonstration: See the demonstration about user-defined adaptive fallover node priority:
https://www.youtube.com/watch?v=ajsIpeMkf38
[Entry Fields]
* Resource Group Name [DNP_test1]
* Participating Nodes (Default Node Priority) [alexis jessica jordan] +
3. Assign the resources to the resource group by selecting Change/Show Resources and
Attributes for a Resource Group and press Enter, as shown in Example 10-7.
4. Select one of the three available RMC-based policies from the pull-down list:
cl_highest_free_mem
cl_highest_idle_cpu
cl_lowest_disk_busy
You can display the current DNP policy for an existing resource group (Example 10-8).
HACMPresource:
group = "test_rg"
name = "NODE_PRIORITY_POLICY"
value = "cl_highest_free_mem"
id = 21
monitor_method = ""
Notes:
Using the information retrieved directly from the ODM is for informational purposes only
because the format within the stanzas might change with updates or new versions.
Hardcoding ODM queries within user-defined applications is not supported and should
be avoided.
The following resource monitors contain the information for each policy:
IBM.PhysicalVolume
IBM.Host
Each of these monitors can be queried during normal operation by running the commands
shown in Example 10-9.
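For reference, these resource monitors can be queried with the RSCT lsrsrc command; the -Ad flag displays the dynamic attribute values that the predefined policies rely on:
# lsrsrc -Ad IBM.Host
# lsrsrc -Ad IBM.PhysicalVolume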
You can display the current table maintained by clstrmgrES by running the command shown
in Example 10-10.
The values in the table are used for the DNP calculation in the event of a fallover. If
clstrmgrES is in the middle of polling the current state when a fallover occurs, then the value
last taken when the cluster was in a stable state is used to determine the DNP.
[Entry Fields]
* Resource Group Name [DNPrg]
* Participating Nodes (Default Node Priority) [jessica cassidy shanl> +
2. Assign the resources to the resource group by selecting Change/Show Resources and
Attributes for a Resource Group and then press Enter, as shown in Example 10-12.
3. Select one of the two adaptive fallover policies from the pull-down list:
cl_highest_udscript_rc
cl_lowest_nonzero_udscript_rc
Continue selecting the resources that will be part of the resource group.
4. Verify and synchronize the cluster.
You can display the current DNP policy for an existing resource group as shown in
Example 10-13.
HACMPresource:
group = "DNPrg"
type = ""
name = "NODE_PRIORITY_POLICY"
value = "cl_lowest_nonzero_udscript_rc"
id = 1
monitor_method = ""
-------------------------------
NODE cassidy
-------------------------------
#!/bin/ksh
exit 5
-------------------------------
NODE shanley
-------------------------------
#!/bin/ksh
exit 3
-------------------------------
NODE jessica
-------------------------------
#!/bin/ksh
exit 1
(Example fallback timer: Sunday 12:00 PM)
Consider a simple scenario with a cluster having two nodes and a resource group. In the event of a node failure, the resource group falls over to the standby node. The resource group remains on that node until the fallback timer expires. If cluster services are active on the primary node at that time, the resource group falls back to the primary node. If the primary node is not available at that moment, the fallback timer is reset and the fallback is postponed until the timer expires again.
[Entry Fields]
* Name of the Fallback Policy [daily515]
* HOUR (0-23) [17] #
* MINUTES (0-59) [15]
To assign a fallback timer policy to a resource group, complete the following steps:
1. Use the smitty sysmirror fast path and select Cluster Applications and Resources → Resource Groups → Change/Show Resources and Attributes for a Resource Group. Select a resource group from the list and press Enter.
2. Press F4 to select one of the policies configured in the previous steps. The display is similar to Example 10-16 on page 396.
3. Select a fallback timer policy from the pick list and press Enter.
4. Add any extra resources to the resource group and press Enter.
5. Run verification and synchronization on the cluster to propagate the changes to all
cluster nodes.
HACMPtimer:
policy_name = "daily515"
recurrence = "daily"
year = -3800
month = 0
day_of_month = 1
week_day = 0
hour = 17
minutes = 30
For instance, a database must be online before the application server is started. If the
database goes down and falls over to a different node, the resource group that contains the
application server will also be brought down and back up on any of the available cluster
nodes. If the fallover of the database resource group is not successful, then both resource
groups (database and application) will be put offline.
When you plan to use Online on Different Nodes dependencies, consider these factors:
Only one Online On Different Nodes dependency is allowed per cluster.
Each resource group must have a different home node for startup.
When using this policy, a higher priority resource group takes precedence over a lower
priority resource group during startup, fallover, and fallback:
If a resource group with High priority is online on a node, no other resource group that
is part of the Online On Different Nodes dependency can be put online on that node.
If a resource group that is part of the Online On Different Nodes dependency is online on a cluster node, and a resource group that is part of the Online On Different Nodes dependency and has a higher priority falls over or falls back to the same cluster node, the lower priority resource group is moved to another node or taken offline.
HACMPrgdependency:
id = 0
group_parent = "rg_parent"
group_child = "rg_child"
dependency_type = "PARENT_CHILD"
dep_type = 0
group_name = ""
root@xdsvc1[] odmget HACMPrg_loc_dependency
HACMPrg_loc_dependency:
id = 1
set_id = 1
group_name = "rg_same_node2"
priority = 0
loc_dep_type = "NODECOLLOCATION"
loc_dep_sub_type = "STRICT"
HACMPrg_loc_dependency:
id = 2
set_id = 1
group_name = "rg_same_node_1"
priority = 0
loc_dep_type = "NODECOLLOCATION"
loc_dep_sub_type = "STRICT"
HACMPrg_loc_dependency:
id = 4
set_id = 2
group_name = "rg_different_node1"
priority = 1
loc_dep_type = "ANTICOLLOCATION"
loc_dep_sub_type = "STRICT"
HACMPrg_loc_dependency:
id = 5
set_id = 2
group_name = "rg_different_node2"
priority = 2
loc_dep_type = "ANTICOLLOCATION"
loc_dep_sub_type = "STRICT"
Note: Using the information retrieved directly from the ODM is for informational purposes
only, because the format within the stanzas might change with updates or new versions.
Hardcoding ODM queries within user-defined applications is not supported and should be
avoided.
Note: The options of when and where to process the resource are not as granular as using
custom events. However they are suitable for most requirements.
[Entry Fields]
* Resource Type Name [specialrestype]
* Processing Order [FIRST] +
Verification Method []
Verification Type [Script] +
* Start Method [/HA713/custom.sh star>
* Stop Method [/HA713/custom.sh stop]
Monitor Method []
Cleanup Method []
Restart Method []
Failure Notification Method []
Required Attributes []
Optional Attributes []
Description []
Figure 11-1 Create user-defined resource type
Note: With monitoring, because the resource is already stopped when this script is
called, the resource stop script might fail.
Restart Method
The default restart method is the resource start script defined previously. You can specify
a different method here, if desired. If you change the monitor mode to be used only in the
startup monitoring mode, the method specified in this field does not apply, and PowerHA
SystemMirror ignores values entered in this field.
Failure Notification Method
Define a notify method to run when the user-defined resource fails. This custom method
runs during the restart process and during notify activity. If you are changing the monitor
mode to be used only in the startup monitoring mode, the method specified in this field
does not apply, and PowerHA SystemMirror ignores values entered in this field.
Required Attributes
Specify a list of attribute names, with each name separated by a comma. These attributes
must be assigned with values when you create the user-defined resource, for example,
Rattr1,Rattr2. The purpose of the attributes is to store resource-specific attributes, which
can be used in the different methods specified in the resource type configuration.
[Entry Fields]
* Resource Type Name specialrestype
* Resource Name [shawnsresource]
Attribute data []
Note: The resource name must be unique across the cluster. When you define a
volume group as a user-defined resource for a Peer-to-Peer Remote Copy (PPRC)
configuration or a HyperSwap configuration, the resource name must match the volume
group.
Attribute data
Specify a list of attributes and values in the form of attribute=value, with each pair
separated by a space as in the following example:
Rattr1="value1" Rattr2="value2" Oattr1="value3"
When you are done, add the resource to a resource group to put it into use.
Tape Resources [] +
Raw Disk PVIDs [] +
Raw Disk UUIDs/hdisks [] +
Disk Error Management? no +
Miscellaneous Data []
WPAR Name [] +
User Defined Resources [shawnsresource] +
Figure 11-3 Add user-defined resource into resource group
5. Upon completion, synchronize the cluster for the new resource to be used.
Important: Your cluster will not continue processing events until your custom pre-event or
post-event script finishes running.
Only the following events occur during parallel processing of resource groups:
node_up
node_down
acquire_svc_addr
acquire_takeover_addr
release_svc_addr
release_takeover_addr
start_server
stop_server
Always be attentive to the list of events when you upgrade from an older version and choose
parallel processing for some of the pre-existing resource groups in your configuration.
Note: When trying to adjust the default behavior of an event script, always use pre-event or
post-event scripts. Do not modify the built-in event script files. This option is neither
supported nor safe because these files can be modified without notice when applying fixes
or performing upgrades.
To define a pre-event or post-event script, you must create a custom event and then associate
the custom event with a cluster event as follows:
1. Write and test your event script carefully (a minimal sketch follows these steps). Ensure
that you copy the file to all cluster nodes under the same path and name.
2. Define the custom event:
a. Run the smitty sysmirror fast path and select Custom Cluster Configuration →
Events → Cluster Events → Pre/Post-Event Commands → Add a Custom Cluster
Event.
b. Complete the following information:
Cluster Event Name: The name of the event.
Cluster Event Description: A short description of the event.
Cluster Event Script Filename: The full path of the event script.
3. Connect the custom event with pre/post-event cluster event:
a. Run the smitty sysmirror fast path and select Custom Cluster Configuration →
Events → Cluster Events → Change/Show Pre-Defined Events.
b. Select the event that you want to adjust.
c. Enter the following values:
Notify Command (optional): The full path name of the notification command, if any.
Pre-event Command (optional): The name of the custom cluster event that you want
to run as a pre-event. You can choose from the list of custom cluster events that
have been previously defined.
Post-event Command (optional): The name of the custom cluster event that you
want to run as a post-event. You can choose from the list of custom cluster events
that have been previously defined.
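As an illustration only, a pre-event or post-event script can be as small as the following
sketch; the script path and log destination are assumptions, not from the original text:

#!/usr/bin/ksh
# /usr/local/ha/pre_node_down.sh -- hypothetical pre-event script.
# PowerHA passes the cluster event's arguments through to this script.
[[ "$VERBOSE_LOGGING" = "high" ]] && set -x
echo "$(date) pre-event for node_down, args: $*" >> /var/hacmp/adm/preevent.log
# Site-specific preparation work goes here (assumption).
exit 0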
Tips:
You can use the cluster file collection feature to ensure that custom event files are
propagated automatically to all cluster nodes.
If you use pre-event and post-event scripts to ensure proper sequencing and correlation
of resources used by applications running on the cluster, consider simplifying or even
eliminating them by specifying parent/child dependencies between resource groups.
PowerHA provides a SMIT interface to the AIX error notification function. Use this function to
detect an event that is not specifically monitored by PowerHA (for example, a disk adapter
failure) and to trigger a response to it.
Before you configure automatic error notification, a valid cluster configuration must be in
place.
Automatic error notification applies to selected hard, non-recoverable error types such as
those that are related to disks or disk adapters. This utility does not support media errors,
recovered errors, or temporary errors.
Enabling automatic error notification assigns one of two error notification methods for all error
types as follows:
The non-recoverable errors pertaining to resources that have been determined to
represent a single point of failure are assigned the cl_failover method and will trigger a
failover.
All other non-critical errors are assigned the cl_logerror method, and an error entry is
logged in the hacmp.out file.
PowerHA automatically configures error notifications and recovery actions for several
resources and error types including these items:
All disks in the rootvg volume group
All disks in cluster volume groups, concurrent volume groups, and file systems
All disks defined as cluster resources
To set up automatic error notifications, use the smitty sysmirror fast path and select
Problem Determination Tools → PowerHA SystemMirror Error Notification →
Configure Automatic Error Notification → Add Error Notify Methods for Cluster
Resources.
Note: You cannot configure automatic error notification while the cluster is running.
xdsvc1:
xdsvc1: HACMP Resource Error Notify Method
xdsvc1:
With PowerHA, you can customize the error notification method for other devices and error
types and define a specific notification method, rather than using one of the two automatic
error notification methods.
After an error notification is defined, PowerHA offers the means to emulate it. You can
emulate an error log entry with a selected error label. The error label is listed in the error log
and the notification method is run by errdemon.
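For reference, a customized notification is stored as an AIX errnotify ODM object. The
following stanza is a hedged sketch only (the name and method script are hypothetical); it
can be loaded with odmadd and removed with odmdelete:

errnotify:
        en_name = "my_disk_notify"
        en_persistenceflg = 1
        en_class = "H"
        en_type = "PERM"
        en_method = "/usr/local/ha/notify_admin.sh $1"

Here, $1 is the error log sequence number that errdemon passes to the notification method.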
To change the total event duration time before receiving a config_too_long warning
message, complete these steps:
1. Use the smitty sysmirror fast path and select Custom Cluster Configuration →
Events → Cluster Events → Change/Show Time Until Warning.
2. Complete these fields:
Max. Event-only Duration (in seconds)
The maximum time (in seconds) to run a cluster event. The default is 180 seconds.
Max. Resource Group Processing Time (in seconds)
The maximum time (in seconds) to acquire or release a resource group. The default is
180 seconds.
Total time to process a Resource Group event before a warning is displayed
The total time for the Cluster Manager to wait before running the config_too_long
script. The default is 6 minutes. This field is the sum of the two other fields and is not
editable.
3. Press Enter to create the error notification object.
4. Verify and synchronize the cluster to propagate the changes.
For more details regarding resources and RMC see the IBM RSCT website:
http://www.ibm.com/support/knowledgecenter/SGVKBA/welcome
Recovery programs
A recovery program consists of a sequence of recovery command specifications that has the
following format:
:node_set recovery_command expected_status NULL
Where:
node_set: The set of nodes on which the recovery program will run and can take one of
the following values:
all: The recovery command runs on all nodes.
event: The node on which the event occurred.
other: All nodes except the one on which the event occurred.
recovery_command: String (delimited by quotation marks) that specifies a full path to the
executable program. The command cannot include any arguments. Any executable
program that requires arguments must be a separate script. The recovery program must
have the same path on all cluster nodes. The program must specify an exit status.
expected_status: Integer status to be returned when the recovery command completes
successfully. The Cluster Manager compares the actual status returned against the
expected status. A mismatch indicates unsuccessful recovery. If you specify the character
X in the expected status field, Cluster Manager will skip the comparison.
NULL: Not used, included for future functions.
Multiple recovery command specifications can be separated by the barrier command. All
recovery command specifications before a barrier start in parallel. When a node encounters a
barrier command, all nodes must reach the same barrier before the recovery program
resumes.
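Putting this format together, a recovery program might look like the following sketch; the
script paths are hypothetical illustrations. The first two commands start in parallel; the last
command runs only after all nodes reach the barrier, and its exit status is not checked (X):

all "/usr/local/ha/log_event.sh" 0 NULL
event "/usr/local/ha/local_recovery.sh" 0 NULL
barrier
other "/usr/local/ha/verify_peers.sh" X NULL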
[Entry Fields]
* Event name [user_defined_event]
* Recovery program path [/user_defined.rp]
* Resource name [IBM.FileSystem]
* Selection string [name = "/var"]
* Expression [PercentTotUsed > 70]
Rearm expression [PercentTotUsed < 50]
For additional details regarding user-defined events, see the Planning PowerHA
SystemMirror guide:
http://public.dhe.ibm.com/systems/power/docs/powerha/71/hacmpplangd_pdf.pdf
Note: PowerHA v7.1.0 through v7.1.2 require the use of multicast within a site. PowerHA
v7.1.3 introduced unicast as an option, making multicast optional also.
PowerHA uses a new redesigned cluster health management layer embedded as part of the
operating system called Cluster Aware AIX (CAA). CAA uses kernel-level code to exchange
heartbeats over network, SAN fabric (when correct Fibre Channel adapters are deployed),
and also disk-based messaging through the central repository.
If multicast is selected during the initial cluster configuration, multicast traffic must be able to
flow between the cluster hosts in the data center before cluster formation can be attempted.
Plan to test and verify the multicast traffic flow between the would-be cluster nodes before
attempting to create the cluster. Review the guidelines in the following sections to test the
multicast packet flow between the hosts.
Network switches
Hosts communicate over the network fabric that might consist of many switches and routers.
A switch connects separate hosts and network segments and allows for network traffic to be
sent to the correct place. A switch refers to a multiport network bridge that processes and
routes data at the data link layer (Layer 2) of the OSI model. Some switches can also process
data at the network layer (Layer 3).
The IGMP communication protocol is used by the hosts and the adjacent routers on IP
networks to interact and establish the rules for multicast communication, in particular to
establish multicast group membership. Switches that feature IGMP snooping derive useful
information by observing these IGMP transactions between the hosts and routers. This
enables the switches to correctly forward the multicast packets, when needed, to the next
switch in the network path.
IGMP snooping
IGMP snooping is an activity performed by switches to track the IGMP packet exchanges
and adapt their filtering of multicast packets accordingly. Switches monitor the IGMP traffic
and let multicast packets out only when necessary. The switch typically builds an IGMP
snooping table that lists all the ports that have requested a particular multicast group, and
uses this table to allow or disallow multicast packets to flow.
Multicast routing
The network entities that forward multicast packets by using special routing algorithms are
referred to as mrouters. Router vendors might implement multicast routing in different ways;
refer to the router vendor's documentation and guidance. Hosts and other network elements
can implement mrouters to allow the multicast network traffic to flow appropriately. Some
traditional routers also support multicast packet routing.
When switches are cascaded, or chained, setting up the switches to forward the packets by
implementing mrouting might be necessary. This is only one of the possible approaches to
solving multicast traffic flow issues in the environment. See the switch vendor's
documentation and guidance about setting up the switches for multicast traffic.
Multicast testing
Do not attempt to create the cluster by using multicast until you verify that multicast traffic
flows without interruption between the nodes that will be part of the cluster. Clustering will not
continue if the mping test fails. If problems occur with the multicast communication in your
network environment, contact the network administrator and review the switches involved and
the setup needed. After the setup is complete, retest the multicast communication.
The mping command can be invoked with a particular multicast address or it chooses a
default multicast address. A test for our cluster is shown in Example 12-1.
As the address input to mping, use the actual multicast address that will be used during
clustering. CAA creates a default multicast address if one is not specified during cluster
creation. This default multicast address is formed by combining (using OR) 228.0.0.0 with the
lower 24 bits of the IP address of the host. As an example, in our case the host IP address is
192.168.100.51, so the default multicast address is 228.168.100.51.
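For illustration, using the default address computed above, a receiver can be started on one
prospective node and a sender on another; the packet count of 5 is an arbitrary choice:

# On the first node, receive multicast packets on the CAA group address:
mping -v -r -a 228.168.100.51

# On the second node, send test packets to the same group address:
mping -v -s -a 228.168.100.51 -c 5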
This network-wide attribute can be used to customize the load balancing of PowerHA service
IP labels, taking into consideration any persistent IP labels that are already configured. The
distribution selected is maintained during cluster startup and subsequent cluster events. The
distribution preference will be maintained, if acceptable network interfaces are available in the
cluster. However, PowerHA will always keep service IP labels active, even if the preference
cannot be satisfied.
The placement of the service IP labels can be specified with these distribution preferences:
Anti-Collocation
This is the default, and PowerHA distributes the service IP labels across all boot IP
interfaces in the same PowerHA network on the node.
Collocation
PowerHA allocates all service IP addresses on the same boot IP interface.
Collocation with persistent label
PowerHA allocates all service IP addresses on the boot IP interface that is hosting the
persistent IP label. This can be useful in environments with VPN and firewall configuration,
where only one interface is granted external connectivity.
Collocation with source
Service labels are mapped using the Collocation preference. You can choose one service
label as a source for outgoing communication. The service label chosen in the next field is
the source address.
Anti-Collocation with source
Service labels are mapped using the Anti-Collocation preference. If not enough adapters
are available, more than one service label can be placed on one adapter. This choice
allows one label to be selected as the source address for outgoing communication.
[Entry Fields]
* Network Name net_ether_01
* Distribution Preference Collocation with Persistent Label +
Source IP Label for outgoing packets +
+--------------------------------------------------------------------------+
| Source IP Label for outgoing packets |
| |
| Move cursor to desired item and press Enter. |
| |
| dallasserv |
| ftwserv |
| |
| F1=Help F2=Refresh F3=Cancel |
F1| F8=Image F10=Exit Enter=Do |
F5| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
Figure 12-1 Specify service IP for outgoing packets
6. Press Enter to accept your selection and update the PowerHA ODM on the local node.
7. For the change to take effect and to be propagated to all nodes, synchronize your cluster.
Use the smitty sysmirror fast path, select Cluster Applications and Resources →
Verify and Synchronize Cluster Configuration, and press Enter. This triggers a
dynamic reconfiguration event.
Network net_ether_01
NODE cassidy:
ftwserv 10.10.10.52
dallasserv 10.10.10.51
cassidy_xd 192.168.150.52
NODE jessica:
ftwserv 10.10.10.52
dallasserv 10.10.10.51
jessica_xd 192.168.150.51
NODE shanley:
ftwserv 10.10.10.52
dallasserv 10.10.10.51
shanley_xd 192.168.150.53
Network net_ether_01 is using the following distribution preference for service labels:
Collocation with persistent - service label(s) will be mapped to the same interface as the
persistent label.
Network net_ether_010
NODE cassidy:
cassidy 192.168.100.52
NODE jessica:
jessica 192.168.100.51
NODE shanley:
shanley 192.168.100.53
This was visible in the output of the netstat -i command, as shown in Example 12-3.
jessica-# netstat -i
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
en0 1500 link#2 0.2.55.4f.c4.ab 5044669 0 1828909 0 0
en0 1500 10.10.31 jessicaa 5044669 0 1828909 0 0
en0 1500 192.168.100 p750n01 5044669 0 1828909 0 0
en0 1500 192.168.100 cassidysvc 5044669 0 1828909 0 0
en0 1500 192.168.100 shanleysvc 5044669 0 1828909 0 0
en3 1500 link#3 0.20.35.e2.7f.8d 3191047 0 1410806 0 0
en3 1500 10.10.32 jessicab 3191047 0 1410806 0 0
lo0 16896 link#1 1952676 0 1957548 0 0
lo0 16896 127 localhost 1952676 0 1957548 0 0
lo0 16896 localhost 1952676 0 1957548 0 0
Our testing of the dynamic change of this policy resulted in no move of any of the labels after
a synchronization. The following message was logged during the synchronization of the
cluster after making the service IP distribution policy change:
Verifying additional pre-requisites for Dynamic Reconfiguration...
Note: For this instance, the message logged is generic and gets reported only because a
change was detected. As long as that was the only change made, no actual resources will
be brought offline.
A change to the service IP distribution policy is only enforced when we manually invoke a
swap event or stop and restart PowerHA on a node. This is the intended behavior of the
feature to avoid any potential disruption of connectivity to those IP addresses. The remaining
cluster nodes will not enforce the policy unless cluster services are also stopped and
restarted on them.
The round-trip time (rtt) value is shown in the output of the lscluster -i and lscluster -m
commands, together with the mean deviation in network rtt; both values are managed
automatically by CAA.
To change the cluster heartbeat settings, modify the failure cycle and the grace period for the
PowerHA cluster from the custom cluster configuration in the SMIT panel (Example 12-4).
Use smitty sysmirror and select Custom Cluster Configuration → Cluster Nodes and
Networks → Manage the Cluster → Cluster heartbeat settings.
Note: Unlike previous versions, this setting is global across all networks.
When cluster sites are used, specifically linked sites, two more parameters are available:
Link failure detection timeout: This is the time (in seconds) that the health management
layer waits before declaring that the inter-site link failed. A link failure detection can cause
the cluster to switch to another link and continue the communication. If all the links failed,
this results in declaring a site failure. The default is 30 seconds.
Site heartbeat cycle: This is a number (1 - 10) that controls the heartbeat between the
sites.
Also, like most changes, a cluster synchronization is required for the changes to take effect.
However, the change is dynamic, so a cluster restart is not required.
For more details see CAA failure detection tunables on page 102.
Site-specific service IP labels can also be used in combination with regular service IP labels
and persistent IP labels. In general, use persistent IP labels, especially one that is
node-bound, with XD_data networks, because no communication occurs through the
service IP label that is configurable on multiple nodes.
To configure and use site-specific service IP labels, sites must first be defined to the
cluster. After you add a cluster and add nodes to the cluster, complete these steps:
1. Add sites.
2. Add more networks as needed (ether, XD_data, or XD_ip):
Add interfaces to each network
3. Add service IP labels:
Configurable on multiple nodes
Specify the associated site
4. Add resource groups.
5. Add service IP labels to the resource groups.
6. Synchronize the cluster.
In our test scenario, we have a two-node cluster (cassidy and jessica nodes) that currently
has a single ether network with a single interface defined to it. We also have a volume group
available on each node named xsitevg. Our starting topology is shown in Example 12-5 on
page 432.
NODE cassidy:
Network net_ether_010
cassidy 192.168.100.52
NODE jessica:
Network net_ether_010
jessica 192.168.100.51
Adding sites
To define the sites, complete these steps:
1. Run the smitty sysmirror fast path, select Cluster Nodes and Networks → Manage Site →
Add a Site, and press Enter.
2. We add the two sites dallas and fortworth. Node jessica is a part of the dallas site; node
cassidy is a part of the fortworth site. The Add a Site menu is shown in Figure 12-2.
Add a Site
[Entry Fields]
* Site Name [dallas]
* Site Nodes jessica +
Cluster Type Standard
Adding a network
To define the additional network, complete these steps (see Figure 12-4):
1. Using the smitty sysmirror fast path, select Cluster Nodes and Networks → Manage
Networks and Network Interfaces → Networks → Add a Network, and press Enter.
Choose the network type (in our case, XD_ip) and press Enter.
2. You can keep the default network name, as we did, or specify one.
[Entry Fields]
* Network Name [net_XD_ip_01]
* Network Type XD_ip
* Netmask(IPv4)/Prefix Length(IPv6) [255.255.255.0]
* Network attribute public +
Figure 12-4 Add network
[Entry Fields]
* IP Label/Address [jessica_xd] +
* Network Type XD_ip
* Network Name net_XD_ip_01
* Node Name [jessica] +
Network Interface []
[Entry Fields]
* IP Label/Address dallasserv +
Netmask(IPv4)/Prefix Length(IPv6) []
* Network Name net_XD_ip_01
Associated Site dallas +
The topology as configured at this point is shown in Example 12-6 on page 435.
NODE cassidy:
Network net_XD_ip_01
ftwserv 10.10.10.52
dallasserv 10.10.10.51
cassidy_xd 192.168.150.52
Network net_ether_010
cassidy 192.168.100.52
NODE jessica:
Network net_XD_ip_01
ftwserv 10.10.10.52
dallasserv 10.10.10.51
jessica_xd 192.168.150.51
Network net_ether_010
jessica 192.168.100.51
[jessica:root] / # cllssite
---------------------------------------------------
Sitename Site Nodes Dominance Protection Type
---------------------------------------------------
dallas jessica NONE
fortworth cassidy NONE
[Entry Fields]
* Resource Group Name [xsiteRG]
Note: The additional options, shown in bold in this figure, are available only if sites are
defined.
-------------------------------
NODE cassidy
-------------------------------
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
en0 1500 link#2 7a.40.c8.b3.15.2 42474 0 41723 0 0
en0 1500 192.168.100 cassidy 42474 0 41723 0 0
en1 1500 link#3 7a.40.c8.b3.15.3 5917 0 4802 0 0
en1 1500 192.168.150 cassidy_xd 5917 0 4802 0 0
lo0 16896 link#1 5965 0 5965 0 0
lo0 16896 127 loopback 5965 0 5965 0 0
lo0 16896 loopback 5965 0 5965 0 0
-------------------------------
NODE jessica
-------------------------------
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
en0 1500 link#2 ee.af.1.71.78.2 45506 0 44229 0 0
en0 1500 192.168.100 jessica 45506 0 44229 0 0
en1 1500 link#3 ee.af.1.71.78.3 5685 0 5040 0 0
en1 1500 192.168.150 jessica_xd 5685 0 5040 0 0
lo0 16896 link#1 15647 0 15647 0 0
lo0 16896 127 loopback 15647 0 15647 0 0
lo0 16896 loopback 15647 0 15647 0 0
The bold text indicates that en1 is currently configured with only the boot IP
address on both nodes. Upon starting cluster services, because jessica is the primary, it
acquires the service address that is specific to the dallas site, dallasserv. The secondary
node, cassidy, remains unchanged, as shown in Example 12-8.
-------------------------------
NODE cassidy
-------------------------------
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
en0 1500 link#2 7a.40.c8.b3.15.2 52447 0 51059 0 0
en0 1500 192.168.100 cassidy 52447 0 51059 0 0
-------------------------------
NODE jessica
-------------------------------
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
en0 1500 link#2 ee.af.1.71.78.2 55556 0 54709 0 0
en0 1500 192.168.100 jessica 55556 0 54709 0 0
en1 1500 link#3 ee.af.1.71.78.3 10161 0 8680 0 0
en1 1500 10.10.10 dallasserv 10161 0 8680 0 0
en1 1500 192.168.150 jessica_xd 10161 0 8680 0 0
lo0 16896 link#1 24361 0 24361 0 0
lo0 16896 127 loopback 24361 0 24361 0 0
lo0 16896 loopback 24361 0 24361 0 0
We then move the resource group to the fortworth site through the SMIT fast path, smitty
cl_resgrp_move_node.select. Upon success, the primary site service IP, dallasserv, is
removed and the secondary site IP, ftwserv is brought online, as shown in Example 12-9.
-------------------------------
NODE cassidy
-------------------------------
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
en0 1500 link#2 7a.40.c8.b3.15.2 54634 0 53164 0 0
en0 1500 192.168.100 cassidy 54634 0 53164 0 0
en1 1500 link#3 7a.40.c8.b3.15.3 9722 0 9264 0 0
en1 1500 10.10.10 ftwserv 9722 0 9264 0 0
en1 1500 192.168.150 cassidy_xd 9722 0 9264 0 0
lo0 16896 link#1 8332 0 8332 0 0
lo0 16896 127 loopback 8332 0 8332 0 0
lo0 16896 loopback 8332 0 8332 0 0
-------------------------------
NODE jessica
-------------------------------
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
en0 1500 link#2 ee.af.1.71.78.2 57916 0 57041 0 0
en0 1500 192.168.100 jessica 57916 0 57041 0 0
en1 1500 link#3 ee.af.1.71.78.3 10262 0 8730 0 0
en1 1500 192.168.150 jessica_xd 10262 0 8730 0 0
lo0 16896 link#1 25069 0 25069 0 0
lo0 16896 127 loopback 25069 0 25069 0 0
lo0 16896 loopback 25069 0 25069 0 0
There is already a general consensus against having two PowerHA nodes in the same cluster
using the same VIOS, because this can mean that heartbeats can be passed between the
nodes through the server even when no real network connectivity exists. The problem
addressed by this netmon.cf format is not the same as that issue, although similarities exist.
This decision of which adapter is bad, local or remote, is made based on whether any network
traffic can be seen on the local adapter, using the inbound byte count of the interface. Where
VIO is involved, this test becomes unreliable because there is no way to distinguish whether
inbound traffic came in from the Virtual I/O Server's connection to the outside world, or from a
neighboring virtual I/O client. This is a design point of VIO: its virtual adapters are
indistinguishable, to the LPAR, from real adapters.
Problem resolution
The netmon.cf format was added to help in virtual environments. This new format allows
customers to declare that a given adapter should be considered up only if it can ping a set of
specified targets.
Important: For this fix to be effective, the customer must select targets that are outside the
VIO environment, and not reachable simply by hopping from one Virtual I/O Server to
another. Cluster verification will not determine if they are valid or not.
Configuring netmon.cf
The netmon.cf file must be placed in the /usr/es/sbin/cluster directory on all cluster nodes.
Up to 32 targets can be provided for each interface. If any specific target is pingable, the
adapter will be considered up.
Targets are specified by using the existing netmon.cf configuration file with this new format,
as shown in Example 12-10 on page 440.
Parameters:
----------
!REQD : An explicit string; it *must* be at the beginning of the line (no
leading spaces).
<owner> : The interface this line is intended to be used by; that is, the code
monitoring the adapter specified here will determine its own up/down
status by whether it can ping any of the targets (below) specified in
these lines. The owner can be specified as a hostname, IP address, or
interface name. In the case of hostname or IP address, it *must* refer
to the boot name/IP (no service aliases). In the case of a hostname,
it must be resolvable to an IP address or the line will be ignored.
The string "!ALL" will specify all adapters.
<target> : The IP address or hostname you want the owner to try to ping. As with
normal netmon.cf entries, a hostname target must be resolvable to an
IP address in order to be usable.
Attention: The traditional format of the netmon.cf file is not valid in PowerHA v7, and
later, and is ignored. Only the !REQD lines are used.
The order of the lines is unimportant. Comments (lines beginning with the number sign
character, #) are allowed on or between lines and are ignored. If more than 32 !REQD lines
are specified for the same owner, any extra lines are ignored.
Examples
These examples explain the syntax.
The adapter owning host1.ibm is considered up only if it can ping 100.12.7.9 or whatever
host4.ibm resolves to. The adapter owning 100.12.7.20 is considered up only if it can ping
100.12.7.10 or whatever host5.ibm resolves to. It is possible that 100.12.7.20 is the IP
address for host1.ibm.
!REQD host1.ibm 100.12.7.9
!REQD host1.ibm host4.ibm
!REQD 100.12.7.20 100.12.7.10
!REQD 100.12.7.20 host5.ibm
However, we cannot tell from this example whether that is true. If it is, then all four targets
belong to the same adapter.
In the next example, every adapter is considered up only if it can ping at least one of the
three !ALL targets; interface en1 additionally has 9.12.11.10 as a target.
!REQD !ALL 100.12.7.9
!REQD !ALL 110.12.7.9
!REQD !ALL 111.100.1.10
!REQD en1 9.12.11.10
12.5.2 Implications
Any interface that is not included as an owner of one of the !REQD lines in the netmon.cf file
continues to behave in the old manner, even if you are using this new function for other
interfaces.
This format does not change heartbeating behavior in any way. It changes only how the
decision is made regarding whether a local adapter is up or down. This new logic is used
in these situations:
Upon startup, before heartbeating rings are formed
During heartbeat failure, when contact with a neighbor is initially lost
During periods when heartbeating is not possible, such as when a node is the only one up
in the cluster
Invoking the format changes the definition of a good adapter (from "Am I able to receive any
network traffic?" to "Can I successfully ping certain addresses?") regardless of how much
traffic is seen.
Because of this, an adapter is inherently more likely to be falsely considered down because
the second definition is more restrictive.
For this same reason, if you find that you must take advantage of this new functionality, be as
generous as possible with the number of targets you provide for each interface.
192.168.150.51 #jessica_xd
192.168.150.52 #cassidy_xd
192.168.150.53 #shanley_xd
192.168.100.53 #shanley
Notice that all of the addresses are pulled in including the boot, service, and persistent IP
labels. Before using any of the monitor utilities from a client node, the clhosts.client file
must be copied over to all clients as /usr/es/sbin/cluster/etc/clhosts. Remember to
remove the client extension when you copy the file to the client nodes.
Important: The clhosts file on a client must never contain 127.0.0.1, loopback, or
localhost.
In this type of environment, implementing a clhosts file on the client is critical. This file gives
the clinfoES daemon the addresses to attempt communication with the SNMP process
running on the PowerHA cluster nodes.
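For example, from a cluster node, the file can be copied to a client while renaming it (the
client host name client1 is hypothetical):

# Copy clhosts.client to a client node, dropping the .client extension.
scp /usr/es/sbin/cluster/etc/clhosts.client \
    client1:/usr/es/sbin/cluster/etc/clhosts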
There are two types of WPARs: application WPARs and system WPARs. System WPARs are
autonomous virtual system environments with their own private file systems, users and
groups, login, network space, and administrative domain.
By default, a system WPAR shares two file systems (/usr and /opt) from the global
environment by using read-only namefs mounts. You can configure WPARs to have
non-shared, writable /usr and /opt file systems; such WPARs are also called private.
For more information about IBM AIX WPARs, see Exploiting IBM AIX Workload Partitions,
SG24-7955.
In AIX Version 7, administrators can create WPARs that run AIX 5.2 or AIX 5.3 inside an
AIX 7 operating system instance. Both are supported on the POWER7 server platform and
by PowerHA. PowerHA support details are at the following web page:
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/FLASH10782
PowerHA: PowerHA does not manage or monitor the WPAR. It manages and monitors
only the applications that run within the WPAR.
Important: PowerHA does not have integration support to manage rootvg WPARs,
although it has integration support for system WPARs.
Configuring a rootvg WPAR by using one or more storage devices gives the WPAR
administrator complete control of the storage devices that are exported to the WPAR,
the volume groups on these devices, and the logical volumes and file systems within
these volume groups. A system WPAR that is not a rootvg WPAR does not have its own
root volume group. Its file systems are created in logical volumes that reside externally,
such as on a Network File System (NFS) server (NFS WPARs) or in a volume group of the
global system (local WPARs).
If PowerHA creates the WPAR, this type of installation and configuration results. For more
details, see 13.3.2, Creating a WPAR with the Resource Group menu on page 450.
Planning steps
The setup sequence and necessary planning are summarized in these steps:
1. Set up the NFS server.
We include the main items:
If you use a dedicated NFS server or network-attached storage (NAS) system, ensure
that it has the same or better availability capabilities as your systems.
Verify that you have enough disk space available on your NFS or NAS server.
Ensure that the root equivalency is defined. See Example 13-1 for details.
Create the directory for your WPAR.
2. Configure the WPAR.
The file system for a WPAR is structured from a starting directory, for instance,
/wpars/<wpar_name>. This directory contains subdirectories for each private file system in
the WPAR.
The starting directory of the WPAR must be created in the NFS server before the
execution of the mkwpar command.
Important: The wpar_name must equal the PowerHA resource group name that you plan
to use.
For an NFS-based WPAR, each file system must be specified at creation time.
An example is in Defining WPAR on page 447.
3. Configure PowerHA.
NFS setup
For an NFS-based WPAR in a PowerHA environment, each node in the cluster must have
root access to the NFS shared file systems that contain the WPAR data. Example 13-1 shows
how the entry in /etc/exports might look. In this example, the PowerHA cluster nodes are
sys5lpar3 and sys5lpar4. The NFS server is a third system (not part of the cluster).
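A minimal sketch of such an entry follows; the exported directory is an assumption based on
the paths used later in this chapter:

# /etc/exports on the NFS server: the cluster nodes need root access
# to the directory tree that holds the WPAR file systems.
/install/wpar/wpars -root=sys5lpar3:sys5lpar4,access=sys5lpar3:sys5lpar4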
For an NFS-based WPAR, each file system must be specified at creation time (a sketch of
the corresponding mkwpar invocation follows this list). These file systems are as follows:
/
/home
/var/hacmp/adm
/var
Optional (for a private system WPAR): /usr and /opt
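As referenced above, the following is a hedged sketch of creating an NFS-based WPAR with
each file system specified through the -M flag of mkwpar; the WPAR name, NFS server
address, and paths are illustrative assumptions:

# Create an NFS-based system WPAR, specifying each file system explicitly.
mkwpar -n testwpar -h testwpar \
  -M directory=/ vfs=nfs host=172.16.21.65 dev=/install/wpar/wpars/testwpar \
  -M directory=/home vfs=nfs host=172.16.21.65 dev=/install/wpar/wpars/testwpar/home \
  -M directory=/var vfs=nfs host=172.16.21.65 dev=/install/wpar/wpars/testwpar/var \
  -M directory=/var/hacmp/adm vfs=nfs host=172.16.21.65 dev=/install/wpar/wpars/testwpar/var/hacmp/adm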
When the WPAR is created on the first node, you can define it on the next node or nodes by
adding the -p option to the command that is used in Example 13-2. If you forget the -p option,
an error message is issued. Example 13-3 shows the command that we used.
Example 13-4 shows the important fields in the Change/Show window for resource groups.
[Entry Fields]
Resource Group Name testwpar
Participating Nodes (Default Node Priority) sys5lpar3 sys5lpar4
Volume Groups [] +
...
Miscellaneous Data []
WPAR Name [testwpar] +
User Defined Resources [ ]
A versioned WPAR provides a different version of the runtime environment than the global
system. Support for AIX 5.2 or AIX 5.3 versioned WPARs requires the installation of
additional licensed program products:
AIX 5.2 WPARs for AIX 7
AIX 5.3 WPARs for AIX 7
A mksysb backup of a system that runs an earlier version of AIX is used to create the
versioned WPAR. Applications that run in a versioned WPAR use the commands and libraries
from the operating system files where the backup is made to create the versioned WPAR, for
example, AIX 5.2 or AIX 5.3. These versioned WPARs own writable /opt and /usr file
systems. Applications that run in the versioned WPAR do not need to know that the global
system has a different version of the AIX operating system.
The product media contains the required installation images (vwpar.images) to support the
creation of versioned WPARs. The product media contains optional software that provides
System Management Interface Tool (SMIT) support to create and manage versioned WPARs.
Each WPAR has an isolated network environment with unique IP addresses and a unique
host name. You can access WPARs through standard networking programs, such as Telnet,
FTP, and rlogin.
The command creates the WPAR according to your backup. The initial output of the mkwpar
command is similar to the following example:
mkwpar: Extracting file system information from backup...
mkwpar: Creating file systems...
Creating file system '/' specified in image.data
/bff
Creating file system '/bff' specified in image.data
/home
Creating file system '/home' specified in image.data
Important: PowerHA automatically assigns and unassigns resources to and from a WPAR
as the corresponding WPAR-enabled resources come online (or go offline). You must not
assign any PowerHA resources to a WPAR.
Considerations
Consider the following important information:
PowerHA Smart Assist scripts are not supported for a WPAR-enabled RG. Therefore, any
application server or application monitoring script that uses the PowerHA Smart Assist
scripts cannot be configured as a part of a WPAR-enabled RG.
Process application monitoring is not supported for WPAR-enabled RGs.
For every WPAR-capable node that is a part of a WPAR-enabled RG and contains a
WPAR for a WPAR-enabled RG, at least one of the service labels (of the WPAR-enabled
RG) must be accessible from the corresponding global WPAR.
Important: Only the Global instance can run PowerHA. A WPAR can be considered only
as an RG of the WPAR-enabled type.
The Add a Resource Group panel opens (Figure 13-2 on page 451). Use this menu to specify
the RG name and the start and stop script full path that is available in the global instance.
[Entry Fields]
* Resource Group Name [ApplA]
* Participating Nodes (Default Node Priority) [wpar1 wpar2] +
Run smit sysmirror and then select Cluster Applications and Resources → Resource
Groups → Change/Show Resources and Attributes for a Resource Group. Or, use the
fast path smitty cm_resource_groups and select Change/Show All Resources and
Attributes for a Resource Group. See Figure 13-3 on page 452.
Volume Groups [] +
Use forced varyon of volume groups, if necessary false +
Automatically Import Volume Groups false +
Tape Resources [] +
Raw Disk PVIDs [] +
Miscellaneous Data []
WPAR Name [ApplA] +
User Defined Resources [ ] +
Figure 13-3 Adding a WPAR-enabled RG
Important: If the WPAR name does not exist when you synchronize the configuration,
you are asked to correct the error, and the system creates a simple WPAR by using the
mkwpar -n WPAR-name command on the specified node. The service address is attached
to the WPAR when the RG is brought online.
When you run the lswpar -M ApplA command on your nodes, the output is similar to
Example 13-6.
Requirement: For this scenario, you must have some AIX administrator knowledge and
also some WPAR knowledge.
Figure: two-node WPAR cluster layout with nodes wpar1 (IP labels 10.1.1.37 and 10.1.2.37)
and wpar2 (IP labels 10.1.1.45 and 10.1.2.45), the cluster repository disk caa_private
(hdisk2), and the WPAR file systems /, /home, /var, and /tmp.
The WPAR is started and functional. We can create a specfile configuration file to use on the
other node (Example 13-8).
We need to vary off the volume group for the other node to use it and create the WPAR again,
as though it is new (Example 13-9).
varyoffvg haapvg
# GOTO WPAR2
lspv
importvg -y haapvg -V 111 hdisk1
varyonvg haapvg
Remove the /wpars/ApplA directory. You can create the WPAR again from the beginning by
using the mkwpar -f ApplA.spec command (Example 13-11).
Create the file systems again, as on the initial node, by using chlv (Example 13-12).
The WPAR is created again on node wpar2, and it can be started. To start the WPAR on node
wpar1 instead, vary the volume group offline on wpar2 and vary it online on wpar1.
The WPAR is now defined on both nodes and can be started on the node where the volume
group haapvg is active.
All commands can be issued by using tools such as smit, clmgr on the command line, or the
IBM Systems Director plug-in. Where a command-line alternative exists, it is listed.
We create a simple cluster with two nodes and one repository disk, and we create the
appropriate RG for PowerHA functionality with the WPAR.
NODE wpar1:
Network net_ether_01
wparsvc1 172.16.21.63
wpar1p1 172.16.21.37
wpar1 10.1.1.37
Network net_ether_02
wpar1b2 10.1.2.37
NODE wpar2:
Network net_ether_01
wparsvc1 172.16.21.63
wpar2p1 172.16.21.45
wpar2 10.1.1.45
Network net_ether_02
------------------------------
10.Create the ApplA application controller scripts by using cm_add_app_scripts and ensure
that they are executable on both nodes. Examples of these scripts are shown in
Example 13-15.
if [ "$Name" = "StartA" ]
then
echo "$(date) \"Application A started\" " >> /var/hacmp/adm/applA.log
touch /var/hacmp/adm/AppAup
nohup /usr/local/bin/ApplA &
exit 0
elif [ "$Name" = "StopA" ]
then
rm -f /var/hacmp/adm/AppAup
echo "$(date) \"Application A stopped\" " >> /var/hacmp/adm/applA.log
exit 0
else
echo "$(date) \"ERROR - Application A start/stop script called with wrong
name\" " >> /var/hacmp/adm/applA.log
exit 999
fi
#---------------------------------------------------#
# cat /usr/local/ha/StopA
#!/usr/bin/ksh
#
[[ "$VERBOSE_LOGGING" = "high" ]] && set -x
#
Name=$(basename $0 )
if [ "$Name" = "StartA" ]
then
echo "$(date) \"Application A started\" " >> /var/hacmp/adm/applA.log
touch /var/hacmp/adm/AppAup
nohup /usr/local/bin/ApplA &
exit 0
elif [ "$Name" = "StopA" ]
11.Create the application monitor by using the Add Custom Application Monitor menu
(Figure 13-5). The SMIT fast path cm_cfg_custom_appmon opens the menu that is
shown in Figure 13-5.
[Entry Fields]
* Monitor Name [ApplA]
* Application Controller(s) to Monitor ApplA +
* Monitor Mode [Long-running monitori>
* Monitor Method [/usr/local/ha/MonA]
Monitor Interval [2] #
Hung Monitor Signal [9] #
* Stabilization Interval [1] #
* Restart Count [3] #
Restart Interval [] #
* Action on Application Failure [fallover] +
Notify Method []
Cleanup Method [/usr/local/ha/StopA]
Restart Method [/usr/local/ha/StartA]
12.Add the application controller scripts (scripts for starting and stopping the WPAR) by using
the cm_add_app_scripts SMIT command; the menu is shown in Figure 13-6. Perform a
quick check by using the cllsserv command.
[Entry Fields]
* Application Controller Name [ApplA]
* Start Script [/usr/local/ha/StartA]
* Stop Script [/usr/local/ha/StopA]
Application Monitor Name(s) +
Application startup mode [background] +
13.Add a service label for the WPAR address 10.1.1.50 by using the command line or SMIT
cm_add_a_service_ip_label_address.select_net (Example 13-17).
NODE wpar1:
Network net_ether_01
wpar1sap 10.1.1.50
wparsvc1 172.16.21.63
wpar1 10.1.1.37
Network net_ether_02
wpar1b2 10.1.2.37
NODE wpar2:
Network net_ether_01
wpar1sap 10.1.1.50
wparsvc1 172.16.21.63
wpar2 10.1.1.45
Network net_ether_02
[Entry Fields]
Resource Group Name ApplA
Participating Nodes (Default Node Priority) wpar1 wpar2
Volume Groups [] +
Use forced varyon of volume groups, if necessary true +
Automatically Import Volume Groups true +
Tape Resources [] +
Raw Disk PVIDs [] +
Miscellaneous Data []
WPAR Name [ApplA] +
User Defined Resources [ ] +
[MORE...17]
Node State
---------------------------- ---------------
wpar1 OFFLINE
wpar2 OFFLINE
19.Move the RG to the other node. The volume group must be varied online on the other
node, the WPAR must be started, and the application must be running. Select the RG to
move as shown in Figure 13-8.
+--------------------------------------------------------------------------+
| Select a Destination Node |
| |
| Move cursor to desired item and press Enter. |
| |
| # *Denotes Originally Configured Highest Priority Node |
| wpar1 |
| |
| F1=Help F2=Refresh F3=Cancel |
| F8=Image F10=Exit Enter=Do |
F1| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
[Entry Fields]
Resource Group(s) to be Moved ApplA
Destination Node wpar1
The result is a summary of the RG status as shown in Figure 13-10 on page 466.
--------------------------------------------------------------------------------
Group Name           State        Application state        Node
--------------------------------------------------------------------------------
ApplA                ONLINE                                wpar2
WPAR address: When the WPAR is handled by PowerHA and is online, it gets the
address that is associated with the WPAR. When the resource group is offline, the
address is no longer associated with the WPAR; PowerHA internally remembers the
association, as shown in Example 13-21:
# clRGinfo
-----------------------------------------------------------------------------
Group Name State Node
-----------------------------------------------------------------------------
ApplA OFFLINE wpar1
OFFLINE wpar2
Figure 13-11 on page 468 shows the network configuration that includes the WPAR.
Figure 13-11: network configuration including the WPAR; node wpar1 (IP labels 10.1.1.37
and 10.1.2.37), node wpar2 (IP labels 10.1.1.45 and 10.1.2.45), and the cluster repository
disk.
The /usr and /opt file systems can be shared with the Global environment.
Create links
For our scenario, we must create the /usr/sap directory under the Global environment. This
/usr/sap directory is seen within the WPAR and can be overmounted within the WPAR.
Administrator: File systems must be created correctly for the /etc/filesystems file to
mount them in the correct order, because multiple file systems overmount each other.
Allocate addresses
The WPAR service address that is allocated on our network, for example 172.16.21.63, is
named wparsvc1 in /etc/hosts. There is no need to create another specific address for the
WPAR. The PowerHA environment uses its own set of addresses.
The command line to create the WPAR is shown in Example 13-23 on page 470. It also can
be created by using the SMIT wpar fast path.
#!/usr/bin/ksh
WPARNAME=wparsvc1
addr=172.16.21.63
NFSHOST=172.16.21.65
NFSROOT=/install/wpar/wpars
echo Creating wpar $WPARNAME with address $addr on server $NFSHOST in $NFSROOT
OStype: 0
UUID: 68f2fcdc-ca7e-4835-b9e1-9edaa2a36dcd
Important: To change the user, you need to set the CLUSTER_OVERRIDE=yes variable.
2. Create the correct file directories by using the script that is shown in Example 13-27.
22.03.2012 11:27:23
GetProcessList
OK
name, description, dispstatus, textstatus, starttime, elapsedtime, pid
msg_server, MessageServer, GREEN, Running, 2012 03 22 11:21:59, 0:05:24, 1572936
disp+work, Dispatcher, GREEN, Running, Message Server connection ok, Dialog Queue
time: 0.00 sec, 2012 03 22 11:21:59, 0:05:24, 2097170
igswd_mt, IGS Watchdog, GREEN, Running, 2012 03 22 11:21:59, 0:05:24, 2621502
All commands can be issued by using the usual tools, such as smit, clmgr on the command
line, or the IBM Systems Director plug-in. Where the command line is usable, it is listed as
an option.
We create a cluster with two nodes and one repository disk, and we create the appropriate
RG for the PowerHA function with the WPAR.
[Entry Fields]
* Resource Type Name [wparsvc1]
* Processing order [WPAR] +
Verification Method []
Verification Type [Script] +
* Start Method [/usr/local/startwpar.sh]
* Stop Method [/usr/local/stopwpar.sh]
Monitor Method [/usr/local/monitorwpar.sh]
Cleanup Method []
Restart Method [/usr/local/restartwpar.sh]
Failure Notification Method []
Required Attributes []
Optional Attributes []
Description [wpar]
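The start, stop, and monitor methods above wrap the standard AIX WPAR commands. The
following are minimal, hypothetical sketches of such scripts; the hardcoded WPAR name and
the omitted error handling are simplifying assumptions:

#!/usr/bin/ksh
# /usr/local/startwpar.sh -- hypothetical start method for the WPAR resource.
startwpar wparsvc1
exit $?
#---------------------------------------------------#
#!/usr/bin/ksh
# /usr/local/stopwpar.sh -- hypothetical stop method; -F forces the stop.
stopwpar -F wparsvc1
exit $?
#---------------------------------------------------#
#!/usr/bin/ksh
# /usr/local/monitorwpar.sh -- hypothetical monitor method: succeed only
# while lswpar reports the WPAR in the Active (A) state.
lswpar | grep -q "^wparsvc1 *A" && exit 0
exit 1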
3. Add a service label for the WPAR address 10.1.1.50 by using the command line or SMIT
cm_add_a_service_ip_label_address.select_net (Example 13-33).
4. Add the wparsvc1 RG by using SMIT cm_add_resource_group. Check the output by using
the cllsclstr command that is shown in Example 13-34.
[Entry Fields]
Resource Group Name wparsvc1
Participating Nodes (Default Node Priority) wpar1 wpar2
Volume Groups [] +
Use forced varyon of volume groups, if necessary false +
Automatically Import Volume Groups false +
Tape Resources [] +
Raw Disk PVIDs [] +
Miscellaneous Data []
WPAR Name [wparsvc1] +
User Defined Resources [ ] +
cltopinfo -w
claddserv -s'sap' -b'/usr/local/startwpar.sh' -e'/usr/local/stopwpar.sh' -O 'foreground'
clmgr add service_ip 'wparsvc1' NETWORK='net_ether_01'
claddgrp -g 'wparsvc1' -n 'wpar1' -S 'OHN' -O 'FNPN' -B 'NFB'
/usr/es/sbin/cluster/utilities/claddres -g 'wparsvc1' SERVICE_LABEL='wparsvc1'
APPLICATIONS='sap' VOLUME_GROUP= FORCED_VARYON='false' VG_AUTO_IMPORT='false'
FILESYSTEM= FSCHECK_TOOL='fsck' RECOVERY_METHOD='sequential'
FS_BEFORE_IPADDR='false' EXPORT_FILESYSTEM= EXPORT_FILESYSTEM_V4=
STABLE_STORAGE_PATH= MOUNT_FILESYSTEM= NFS_NETWORK= SHARED_TAPE_RESOURCES=
DISK= MISC_DATA= WPAR_NAME='wparsvc1' USERDEFINED_RESOURCES=
clmgr sync cluster wpar1_cluster
cldare -t
clmgr start cluster wpar1_cluster
/usr/es/sbin/cluster/utilities/cldare -rt -V 'normal' -C'interactive'
clRGinfo
clshowsrv -av
lswpar
# In case setting disabled auto start
# clmgr start resource_group wparsvc1
# clRGinfo
# lswpar
The configuration of the network is similar to the previous NFS scenario. Only the WPAR
changed because it moved from a standard shared WPAR to a private NFS versioned WPAR
as shown in Figure 13-12.
Figure 13-12: network configuration for the private NFS versioned WPAR; node wpar1 (IP
labels 10.1.1.37 and 10.1.2.37) and node wpar2 (IP labels 10.1.1.45 and 10.1.2.45).
Example 13-37 Sample start and stop controller scripts for application ApplA
# cat /usr/local/ha/StartA
#!/usr/bin/ksh
#
[[ "$VERBOSE_LOGGING" = "high" ]] && set -x
#
Name=$(basename $0 )
if [ "$Name" = "StartA" ]
then
echo "$(date) \"Application A started\" " >> /var/hacmp/adm/applA.log
touch /var/hacmp/adm/AppAup
nohup /usr/local/bin/ApplA &
exit 0
elif [ "$Name" = "StopA" ]
then
rm -f /var/hacmp/adm/AppAup
echo "$(date) \"Application A stopped\" " >> /var/hacmp/adm/applA.log
exit 0
else
echo "$(date) \"ERROR - Application A start/stop script called with wrong name\"
" >> /var/hacmp/adm/applA.log
exit 999
fi
#---------------------------------------------------#
# cat /usr/local/ha/StopA
#!/usr/bin/ksh
#
[[ "$VERBOSE_LOGGING" = "high" ]] && set -x
#
Name=$(basename $0 )
if [ "$Name" = "StartA" ]
then
echo "$(date) \"Application A started\" " >> /var/hacmp/adm/applA.log
touch /var/hacmp/adm/AppAup
nohup /usr/local/bin/ApplA &
exit 0
elif [ "$Name" = "StopA" ]
then
rm -f /var/hacmp/adm/AppAup
echo "$(date) \"Application A stopped\" " >> /var/hacmp/adm/applA.log
exit 0
else
echo "$(date) \"ERROR - Application A start/stop script called with wrong name\"
" >> /var/hacmp/adm/applA.log
exit 1
fi
while [ -a /var/hacmp/adm/AppAup ]
do
echo "$(date) \"Application A is running\" " >> /var/hacmp/adm/applA.log
sleep 10
done
The command-line script to create the WPAR is shown in Example 13-39. You may also use
the SMIT wpar fast path.
WPARNAME=wparsvc3
addr=172.16.21.64
NFSHOST=172.16.21.65
NFSROOT=/install/wpar/wpars
MKSYSB=/install/wpar/AIX52New.mksysb
echo Creating wpar $WPARNAME with address $addr on server $NFSHOST in $NFSROOT from mksysb $MKSYSB
NFSVERS=3
The major difference is the use of the -C -B -l parameters. For more explanation, see
Exploiting IBM AIX Workload Partitions, SG24-7955.
When the WPAR is created, you can use the lswpar command to check for the disk and the
main parameters as shown in Example 13-40.
Example 13-40 Main lswpar command that is used to check the WPAR creation
# lswpar -M wparsvc3
Name MountPoint Device Vfs Nodename
Options
-------------------------------------------------------------------------------------------
----------
wparsvc3 /wpars/wparsvc3 /install/wpar/wpars/wparsvc3/ nfs 172.16.21.65
vers=3
wparsvc3 /wpars/wparsvc3/home /install/wpar/wpars/wparsvc3/home nfs 172.16.21.65
vers=3
wparsvc3 /wpars/wparsvc3/nre/opt /opt namefs ro
wparsvc3 /wpars/wparsvc3/nre/sbin /sbin namefs ro
wparsvc3 /wpars/wparsvc3/nre/usr /usr namefs ro
wparsvc3 /wpars/wparsvc3/opt /install/wpar/wpars/wparsvc3/opt nfs 172.16.21.65
vers=3
wparsvc3 /wpars/wparsvc3/proc /proc namefs rw
wparsvc3 /wpars/wparsvc3/var/hacmp/adm /install/wpar/wpars/wparsvc3/var/hacmp/adm
nfs 172.16.21.65 vers=3
wparsvc3 /wpars/wparsvc3/usr /install/wpar/wpars/wparsvc3/usr nfs 172.16.21.65
vers=3
wparsvc3 /wpars/wparsvc3/var /install/wpar/wpars/wparsvc3/var nfs 172.16.21.65
vers=3
# lswpar -G wparsvc3
=================================================================
wparsvc3 - Defined
=================================================================
Type: S
RootVG WPAR: no
Owner: root
Hostname: wparsvc3
WPAR-Specific Routing: no
OStype: 1
UUID: 2767381f-5de7-4cb7-a43c-5475ecde54f6
# df |grep wparsvc3
172.16.21.65:/install/wpar/wpars/wparsvc3/ 309329920 39976376 88% 118848 3%
/wpars/wparsvc3
172.16.21.65:/install/wpar/wpars/wparsvc3/home 309329920 39976376 88% 118848 3%
/wpars/wparsvc3/home
172.16.21.65:/install/wpar/wpars/wparsvc3/opt 309329920 39976376 88% 118848 3%
/wpars/wparsvc3/opt
/proc - - - - - /wpars/wparsvc3/proc
172.16.21.65:/install/wpar/wpars/wparsvc3/var/hacmp/adm 309329920 39976376 88% 118848
3% /wpars/wparsvc3/var/hacmp/adm
172.16.21.65:/install/wpar/wpars/wparsvc3/usr 309329920 39976376 88% 118848 3%
/wpars/wparsvc3/usr
172.16.21.65:/install/wpar/wpars/wparsvc3/var 309329920 39976376 88% 118848 3%
/wpars/wparsvc3/var
/usr 6291456 1787184 72% 50979 20% /wpars/wparsvc3/nre/usr
/opt 4194304 3809992 10% 7078 2% /wpars/wparsvc3/nre/opt
/sbin 2097152 1566912 26% 11155 6%
/wpars/wparsvc3/nre/sbin
Because we created the WPAR, it is possible to create it on the other system by using the
mkwpar -pf command.
**********************************************************************
mkwpar: Creating file systems...
/
/home
/nre/opt
/nre/sbin
/nre/usr
/opt
/proc
[Entry Fields]
* Resource Group Name [wparsvc3]
* Participating Nodes (Default Node Priority) [wpar1 wpar2] +
[Entry Fields]
* IP Label/Address wparsvc3 +
Netmask(IPv4)/Prefix Length(IPv6) []
* Network Name net_ether_01
[Entry Fields]
Resource Group Name wparsvc3
Participating Nodes (Default Node Priority) wpar1 wpar2
Volume Groups [] +
Use forced varyon of volume groups, if necessary false +
Automatically Import Volume Groups false +
Tape Resources [] +
Raw Disk PVIDs [] +
Miscellaneous Data []
WPAR Name [wparsvc3] +
User Defined Resources [ ] +
Part 5 Appendixes
This part includes the following appendixes:
Appendix A, Paper planning worksheets on page 489
Appendix B, C-SPOC LVM CLI commands on page 501
Appendix C, Cluster Test Tool log on page 531
(Appendix A provides blank paper planning worksheets with these fields: Node Names,
Major Number, Physical Volumes, Size; Export Options, Directory; Executable Files,
Configuration Files; Cluster Name, Node, Strategy, Verification Commands, Node A,
Node B; Server name, Start Script, and Stop Script (one set per application server);
Monitor Method, Monitor Interval, Stabilization Interval, Restart Count, Restart Interval,
Notify Method, Cleanup Method, Restart Method; Startup Policy, Fallover Policy, Fallback
Policy, Settling Time, Runtime Policies, Service IP Label, File systems, Volume Groups,
Tape Resources, Application Servers, Miscellaneous Data, WPAR Name, Comments;
Event Command, Notify Command, Pre-Event Command, Post-Event Command,
Recovery Counter.)
cli_assign_pvids
Assign a PVID to each of the disks passed as arguments, then update all other cluster nodes
with those PVIDs.
Syntax
cli_assign_pvids PhysicalVolume ...
Description
Directs LVM to assign a PVID to each of the physical volumes in the list (if one is not already
present), and then makes those PVIDs known on all cluster nodes.
Examples
To assign PVIDs to a list of disks, and have those PVIDs known across the cluster, enter:
cli_assign_pvids hdisk10 hdisk11 hdisk13
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should only be used
on physical disks accessible from all cluster nodes.
cli_chfs
Change the attributes of a file system on all nodes in a cluster.
Syntax
cli_chfs [ -m NewMountPoint ] [ -u MountGroup ] [ -p { ro | rw } ] [ -t { yes
| no } ] [ -a Attribute=Value ] [ -d Attribute ] FileSystem
Description
Uses C-SPOC to run the chfs command with the parameters, and make the updated file
system definition known on all cluster nodes.
Flags
Only the following flags from the chfs command are supported:
-d Attribute
Deletes the specified attribute from the /etc/filesystems file for the specified file
system.
-m NewMountPoint
Specifies a new mount point for the specified file system.
-p
Sets the permissions for the file system: ro (read-only) or rw (read-write).
-t
Specifies whether the file system is to be processed by the accounting subsystem: yes
or no (the default).
-u MountGroup
Specifies the mount group.
-a Attribute=Value
Specifies a virtual file system-dependent attribute/value pair.
Examples
In general, any operation that is valid with the chfs command and that uses the supported
operands is valid with cli_chfs. For example, to change the size of the shared file system
/test:
cli_chfs -a size=32768 /test
To increase the size of the shared file system /shawnfs, first check its current size on
jessica, where it is locally mounted:
jessica# df -m /shawnfs
Filesystem MB blocks Free %Used Iused %Iused Mounted on
/dev/lv1 64.00 61.95 4% 17 1% /shawnfs
Then run cli_chfs from the other cluster node, cassidy, with a command such as the
following, and verify the result:
cassidy# cli_chfs -a size=163840 /shawnfs
cassidy# df -m /shawnfs
Filesystem MB blocks Free %Used Iused %Iused Mounted on
/dev/lv1 80.00 77.45 4% 17 1% /shawnfs
Note: Although the /shawnfs file system is locally mounted on jessica, we ran cli_chfs
from the other cluster node, cassidy. This is valid; the communication between the nodes
occurs through the clcomd daemon.
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used on
file systems in rootvg, or that otherwise might appear multiple times across the cluster. The
automount attribute is not supported.
cli_chlv
Change the attributes of a logical volume on all nodes in a cluster.
Syntax
cli_chlv [-a Position] [-b BadBlocks] [-d Schedule] [-e Range]
[-L label] [-p Permission] [-r Relocate] [-s Strict]
[-t Type] [-u Upperbound] [-v Verify] [-w MirrorWriteConsistency]
[-x Maximum] [-U userid] [-G groupid] [-P modes] LogicalVolume
Description
Uses C-SPOC to run the chlv command with the given parameters, and make the updated
logical volume definition known on all cluster nodes.
Flags
Only the following flags from the chlv command are supported:
-a Position
Sets the intra-physical volume allocation policy (the position of the logical partitions on
the physical volume). The Position variable is represented by one of these values:
m
Allocates logical partitions in the outer middle section of each physical volume. This
is the default position.
c
Allocates logical partitions in the center section of each physical volume.
e
Allocates logical partitions in the outer edge section of each physical volume.
ie
Allocates logical partitions in the inner edge section of each physical volume.
im
Allocates logical partitions in the inner middle section of each physical volume.
-b BadBlocks
Sets the bad-block relocation policy. The BadBlocks variable is represented by one of
these values:
y
Causes bad-block relocation to occur.
n
Prevents bad block relocation from occurring.
-d Schedule
Sets the scheduling policy when more than one logical partition is written. Must use
parallel or sequential to mirror a striped LV. The Schedule variable is represented by
one of these values:
p
Establishes a parallel scheduling policy.
s
Establishes a sequential scheduling policy.
-p Permission
Sets the access permission: w (read-write) or r (read-only).
Note: Mounting a JFS file system on a read-only logical volume is not supported.
-P Modes
Specifies permissions (file modes) for the logical volume special file.
-r Relocate
Sets the reorganization relocation flag: y allows the logical volume to be relocated
during reorganization, and n prevents relocation.
-t Type
Sets the logical volume type. The maximum size is 31 characters. If the logical volume
is striped, you cannot change Type to boot.
-U Userid
Specifies the user ID for the logical volume special file.
-u Upperbound
Sets the maximum number of physical volumes for new allocation. The value of the
Upperbound variable must be between one and the total number of physical volumes.
When using super strictness, the upper bound indicates the maximum number of
physical volumes allowed for each mirror copy. When using striped logical volumes, the
upper bound must be a multiple of the Stripe_width parameter.
-v Verify
Sets the write-verify state for the logical volume. Causes all writes to the logical volume
either to be verified with a follow-up read or not to be verified with a follow-up read. The
Verify variable is represented by one of these values:
y
Causes all writes to the logical volume to be verified with a follow-up read.
n
Causes all writes to the logical volume not to be verified with a follow-up read.
-w MirrorWriteConsistency
Sets the mirror write consistency state. The MirrorWriteConsistency variable is
represented by one of these values:
y
Causes active mirror write consistency to be used.
n
No mirror write consistency. See the -f flag of the syncvg command.
-x Maximum
Sets the maximum number of logical partitions that can be allocated to the logical
volume. The maximum number of logical partitions per logical volume is 32,512.
Examples
In general, any operation that is valid with the chlv command and that uses the supported
operands is valid with cli_chlv. For example, to change the inter-physical volume allocation
of logical volume jesslv, enter:
cli_chlv -e m jesslv
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used on
any logical volume in rootvg, or that otherwise might appear multiple times across the cluster.
The '-f' flag is passed to cl_chlv to suppress unnecessary checking. As a consequence, the
operation will proceed even if some nodes are not accessible.
cli_chvg
Change the attributes of a volume group on all nodes in a cluster.
Syntax
cli_chvg [ -s Sync { y | n }] [ -L LTGSize ] [ -Q { n | y } ] [ -u ] [ -t [factor
] ] [ -B ] [ -C ] VolumeGroup
Description
Uses C-SPOC to run the chvg command with the given parameters, and make the updated
volume group definition known on all cluster nodes.
Flags
Only the following flags from the chvg command are supported:
-B
Changes the volume group to Big VG format. This can accommodate up to 128
physical volumes and 512 logical volumes. Consider these notes:
The -B flag cannot be used if there are any stale physical partitions.
The -B flag cannot be used if the volume group is varied on in concurrent mode.
Note: This flag is not supported for the concurrent capable volume groups.
-t [factor]
Changes the limit of the number of physical partitions per physical volume, specified by
factor. factor should be in the range of 1 - 16 for 32-disk volume groups and 1 - 64 for
128-disk volume groups.
If factor is not supplied, it is set to the lowest value such that the number of physical
partitions of the largest disk in the volume group is less than factor x 1016.
If factor is specified, the maximum number of physical partitions per physical volume
for this volume group changes to factor x 1016. Consider these notes:
This option is ignored for Scalable-type volume groups.
If the volume group was created in AIX 4.1.2 in violation of the limit of 1016 physical
partitions per physical volume, this flag can be used to convert the marking of partitions.
factor cannot be changed if there are any stale physical partitions in the volume
group.
This flag cannot be used if the volume group is varied on in concurrent mode.
The maximum number of physical volumes that can be included in this volume
group will be reduced to (MAXPVS/factor).
-u
Unlocks the volume group. This option is provided if the volume group is left in a locked
state by abnormal termination of another LVM operation (such as the command core
dumping, or the system crashing).
Note: Before using the -u flag, make sure that the volume group is not being used
by another LVM command.
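Examples
In general, any operation that is valid with the chvg command and that uses the supported
operands is valid with cli_chvg. For example, to unlock a shared volume group that was left
locked by an interrupted LVM operation (the volume group name here is illustrative), enter:
cli_chvg -u sbodilyvg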
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used on
rootvg, or any other volume group that otherwise might appear multiple times across the
cluster.
The '-f' flag is passed to cl_chvg to suppress unnecessary checking. As a consequence, the
operation will proceed even if some nodes are not accessible.
cli_crfs
Create a new file system, and make it known on all nodes in a cluster.
Syntax
cli_crfs -v VfsType { -g VolumeGroup | -d Device } [ -l LogPartitions ] -m
MountPoint [ -u MountGroup ] [ -A { yes | no } ] [ -p {ro | rw } ] [ -a
Attribute=Value ... ] [ -t { yes | no } ]
Description
Uses C-SPOC to run the crfs command with the given parameters, and make the updated
file system definition known on all cluster nodes.
Flags
Only the following flags from the crfs command are supported:
-a Attribute=Value
Specifies a virtual file system-dependent attribute/value pair. To specify more than one
attribute/value pair, provide multiple -a Attribute=Value parameters.
-d Device
Specifies the device name of a device or logical volume on which to make the file
system. This is used to create a file system on an already existing logical volume.
-g VolumeGroup
Specifies an existing volume group on which to make the file system. A volume group
is a collection of one or more physical volumes.
-l LogPartitions
Specifies the size of the log logical volume, expressed as a number of logical partitions.
This flag applies only to JFS and JFS2 file systems that do not already have a log
device.
-m MountPoint
Specifies the mount point, which is the directory where the file system will be made
available.
Note: If you specify a relative path name, it is converted to an absolute path name
before being inserted into the /etc/filesystems file.
Note: The agblksize attribute is set at file system creation and cannot be changed
after the file system is successfully created. The size attribute defines the minimum
file system size, and you cannot decrease it after the file system is created.
Examples
In general, any operation that is valid with the crfs command and that uses the supported
operands is valid with cli_crfs. For example, to create a JFS file system on an existing
logical volume jesslv, enter:
cli_crfs -v jfs -d jesslv -m /shawnfs -a 'size=32768'
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used to
create a file system in rootvg, or one that otherwise might appear multiple times across the cluster.
The '-f' flag is passed to cl_crfs to suppress unnecessary checking. As a consequence, the
operation will proceed even if some nodes are not accessible.
cli_crlvfs
Create a new logical volume and file system on it, and make it known on all nodes in a cluster.
Syntax
cli_crlvfs -v VfsType -g VolumeGroup [ -l LogPartitions ] -m MountPoint [ -u
MountGroup ] [ -A { yes | no } ] [ -p {ro | rw } ] [ -a Attribute=Value ... ] [ -t
{ yes | no } ]
Description
Uses C-SPOC to create the logical volume, run the crfs command with the given parameters
to create a file system on it, and make the new file system definition known on all cluster
nodes.
Note: If you specify a relative path name, it is converted to an absolute path name
before being inserted into the /etc/filesystems file.
-p
Sets the permissions for the file system.
ro
Read-only permissions
rw
Read-write permissions
-t
Specifies whether the file system is to be processed by the accounting subsystem:
yes
Accounting is enabled on the file system.
no
Accounting is not enabled on the file system (default value).
-u MountGroup
Specifies the mount group.
-v VfsType
Specifies the virtual file system type.
Note: The agblksize attribute is set at file system creation and cannot be changed
after the file system is successfully created. The size attribute defines the minimum
file system size, and you cannot decrease it after the file system is created.
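Examples
In general, any operation that is valid with the crfs command and that uses the supported
operands is valid with cli_crlvfs. For example, to create a JFS2 file system and its logical
volume in a shared volume group (the volume group name and mount point here are
illustrative), enter:
cli_crlvfs -v jfs2 -g sharedvg -m /testfs -a size=32768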
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used to
create a file system in rootvg, or that otherwise might appear multiple times across the
cluster.
The '-f' flag is passed to cl_crlvfs to suppress unnecessary checking. As a consequence, the
operation will proceed even if some nodes are not accessible.
cli_extendlv
Increases the size of a logical volume on all nodes in a cluster by adding unallocated physical
partitions from within the volume group.
Syntax
cli_extendlv [ -a Position ] [ -e Range ] [ -u Upperbound ] [ -s Strict ]
LogicalVolume Partitions [ PhysicalVolume ... ]
Description
Uses C-SPOC to run the extendlv command with the given parameters, and make the
updated logical volume definition known on all cluster nodes.
Flags
Only the following flags from the extendlv command are supported:
-a Position
Sets the intra-physical volume allocation policy (the position of the logical partitions on
the physical volume). The Position variable can be one of these values:
m
Allocates logical partitions in the outer middle section of each physical volume. This
is the default position.
c
Allocates logical partitions in the center section of each physical volume.
e
Allocates logical partitions in the outer edge section of each physical volume.
ie
Allocates logical partitions in the inner edge section of each physical volume.
im
Allocates logical partitions in the inner middle section of each physical volume.
-e Range
Sets the inter-physical volume allocation policy (the number of physical volumes to
extend across, using the volumes that provide the best allocation). The value of the
Range variable is limited by the Upperbound variable (set with the -u flag) and can be
x, to allocate across the maximum number of physical volumes, or m, to allocate
across the minimum number of physical volumes.
-u Upperbound
Sets the maximum number of physical volumes for new allocation. The value of the
Upperbound variable should be between one and the total number of physical volumes.
When using super strictness, the upper bound indicates the maximum number of
physical volumes allowed for each mirror copy. When using striped logical volumes, the
upper bound must be a multiple of Stripe_width.
Examples
In general, any operation that is valid with the extendlv command and that uses the
supported operands is valid with cli_extendlv. For example, to increase the size of the
logical volume emilylv by three logical partitions, enter:
cli_extendlv emilylv 3
cli_extendvg
Adds physical volumes to a volume group on all nodes in a cluster.
Syntax
cli_extendvg VolumeGroup PhysicalVolume ...
Description
Uses C-SPOC to run the extendvg command with the given parameters, and make the
updated volume group definition known on all cluster nodes.
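Examples
In general, any operation that is valid with the extendvg command and that uses the
supported operands is valid with cli_extendvg. For example, to add the physical volume
hdisk11 to the shared volume group sbodilyvg (both names here are illustrative), enter:
cli_extendvg sbodilyvg hdisk11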
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used on
rootvg, or any other volume group that otherwise might be duplicated across the cluster.
cli_importvg
Imports a new volume group definition from a set of physical volumes on all nodes in a
cluster.
Syntax
cli_importvg [ -y VolumeGroup ] [ -V MajorNumber ] PhysicalVolume
Description
Uses C-SPOC to run the importvg command, which causes LVM on each cluster node to
read the LVM information on the disks in the volume group, and update the local volume
group definition.
Flags
-V MajorNumber
Specifies the major number of the imported volume group.
-y VolumeGroup
Specifies the name to use for the new volume group. If this flag is not used, the system
automatically generates a new name.
The volume group name can contain only the following characters (all other characters
are considered invalid):
A-Z
a-z
0-9
Underscore (_)
Minus sign (-)
Period (.)
Examples
In general, any operation that is valid with the importvg command and that uses the
supported operands is valid with cli_importvg. For example, to make the volume group
emilyvg from physical volume hdisk07 known on all cluster nodes, enter:
cli_importvg -y emilyvg hdisk07
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used on
rootvg, or any other volume group that otherwise might appear multiple times across the
cluster.
cli_mirrorvg
Mirrors all the logical volumes that exist on a given volume group on all nodes in a cluster.
Syntax
cli_mirrorvg [-S | -s] [-Q] [-c Copies] [-m] VolumeGroup [PhysicalVolume...]
Description
Uses C-SPOC to run the mirrorvg command with the given parameters, and make the
updated volume group definition known on all cluster nodes.
Flags
Only the following flags from the mirrorvg command are supported:
-c Copies
Specifies the minimum number of copies that each logical volume must have after the
mirrorvg command has finished running. It is possible, through the independent use of
mklvcopy, that some logical volumes have more than the minimum number specified after
the mirrorvg command runs. The minimum value is 2; the maximum value is 3. A value of 1
is ignored.
-m Exact map
Mirrors the logical volumes in the exact physical-partition order of the original copy.
This option requires that you specify one or more physical volumes where the exact
map copy is to be placed. If the space is insufficient for an exact mapping, the
command fails. Add new drives, or pick a different set of drives, that can satisfy an
exact logical volume mapping of the entire volume group. The designated disks must
be equal to or exceed the size of the drives that are to be exactly mirrored, regardless
of whether the entire disk is used. Also, if any logical volume to be mirrored is already
mirrored, this command fails.
-Q Quorum Keep
By default in mirrorvg, when the contents of a volume group become mirrored, volume
group quorum is disabled. If you want to keep the volume group quorum requirement
after mirroring is complete, use this option in the command.
For later quorum changes, refer to the chvg command.
-S Background Sync
Returns the mirrorvg command immediately and starts a background syncvg of the
volume group. With this option, it is not obvious when the mirrors have completely
finished their synchronization. However, as portions of the mirrors become
synchronized, they are immediately used by LVM for mirroring.
-s Disable Sync
Returns the mirrorvg command immediately without performing any type of mirror
synchronization. If this option is used, the mirror might exist for a logical volume but is
not used by the operating system until it is synchronized with the syncvg command.
Examples
In general, any operation that is valid with the mirrorvg command and that uses the
supported operands is valid with cli_mirrorvg. For example, to specify two copies for every
logical volume in shared volume group vg01, enter:
cli_mirrorvg -c 2 vg01
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used on
rootvg, or any other volume group that otherwise might appear multiple times across the
cluster.
cli_mklv
Create a new logical volume on all nodes in a cluster.
Syntax
cli_mklv [ -a Position ] [ -b BadBlocks ] [ -c Copies ] [ -d Schedule ] [ -e
Range ] [ -i ] [ -L Label ] [ -o { y | n } ] [ -r Relocate ] [ -s Strict ] [ -t Type ]
[ -u UpperBound ] [ -v Verify ] [ -w MirrorWriteConsistency ] [ -x Maximum ] [ -y
NewLogicalVolume | -Y Prefix ] [ -S StripSize ] [ -U Userid ] [ -G Groupid ] [ -P
Modes ] VolumeGroup NumberOfLPs [ PhysicalVolume ... ]
Description
Uses C-SPOC to run the mklv command with the given parameters, and make the new logical
volume definition known on all cluster nodes.
Flags
Only the following flags from the mklv command are supported:
-a Position
Sets the intra-physical volume allocation policy (the position of the logical partitions on
the physical volume). The Position variable can be one of these values:
m
Allocates logical partitions in the outer middle section of each physical volume. This
is the default position.
c
Allocates logical partitions in the center section of each physical volume.
e
Allocates logical partitions in the outer edge section of each physical volume.
ie
Allocates logical partitions in the inner edge section of each physical volume.
im
Allocates logical partitions in the inner middle section of each physical volume.
-b BadBlocks
Sets the bad-block relocation policy. The BadBlocks variable can be one of these
values:
y
Causes bad-block relocation to occur. This is the default.
n
Prevents bad-block relocation from occurring.
Examples
In general, any operation that is valid with the mklv command and that uses the supported
operands is valid with cli_mklv. For example, to make a logical volume in volume group
stevehvg with one logical partition and a total of two copies of the data, enter:
cli_mklv -c 2 stevehvg 1
In general, it is preferable to specify a name when creating a new logical volume. To create a
new logical volume called emilylv of 10 logical partitions inside the volume group called
sbodilyvg, enter:
cli_mklv -y emilylv sbodilyvg 10
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster.
The '-f' flag is passed to cl_mklv to suppress unnecessary checking. As a consequence, the
operation will proceed even if some nodes are not accessible.
cli_mklvcopy
Increase the number of copies in each logical partition in a logical volume on all nodes in a
cluster.
Syntax
cli_mklvcopy [ -a Position ] [ -e Range ] [ -k ] [ -s Strict ] [ -u UpperBound ]
LogicalVolume Copies [ PhysicalVolume... ]
Description
Uses C-SPOC to run the mklvcopy command with the given parameters, and make the
updated logical volume definition known on all cluster nodes.
Flags
Only the following flags from the mklvcopy command are supported:
-a Position
Sets the intra-physical volume allocation policy (the position of the logical partitions on
the physical volume). The Position variable can be one of these values:
m
Allocates logical partitions in the outer middle section of each physical volume. This
is the default position.
c
Allocates logical partitions in the center section of each physical volume.
e
Allocates logical partitions in the outer edge section of each physical volume.
ie
Allocates logical partitions in the inner edge section of each physical volume.
im
Allocates logical partitions in the inner middle section of each physical volume.
-e Range
Sets the inter-physical volume allocation policy (the number of physical volumes to
extend across, using the volumes that provide the best allocation). The value of the
Range variable is limited by the Upperbound variable (set with the -u flag) and can be
one of these values:
x
Allocates logical partitions across the maximum number of physical volumes.
m
Allocates logical partitions across the minimum number of physical volumes.
Note: When changing a non-super strict logical volume to a super strict logical
volume, you must specify physical volumes or use the -u flag.
-u UpperBound
Sets the maximum number of physical volumes for new allocation. The value of the
Upperbound variable should be between one and the total number of physical volumes.
When using super strictness, the upper bound indicates the maximum number of
physical volumes allowed for each mirror copy. When using striped logical volumes, the
upper bound must be a multiple of Stripe_width.
Examples
In general, any operation that is valid with the mklvcopy command and that uses the
supported operands is valid with cli_mklvcopy. For example, to add copies to the logical
partitions of logical volume jesslv, so that a total of three copies exist for each logical
partition, enter:
cli_mklvcopy jesslv 3
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used on
any logical volume in rootvg, or that otherwise might be duplicated across the cluster.
cli_mkvg
Create a new volume group on all nodes in a cluster.
Syntax
cli_mkvg [ -B ] [ -t factor ] [ -C ] [ -G ] [ -x ] [ -s Size ]
[ -V MajorNumber ] [ -v LogicalVolumes ] [ -y VolumeGroup ]
PhysicalVolume ...
Description
Uses C-SPOC to run the mkvg command with the given parameters, and make the new
volume group definition known on all cluster nodes.
Flags
Only the following flags from the mkvg command are supported:
-B
Creates a big-type volume group. This can accommodate up to 128 physical volumes
and 512 logical volumes. Note that because the VGDA space has been increased
substantially, every VGDA update operation (creating a logical volume, changing a
logical volume, adding a physical volume, and so on) might take considerably longer to
run.
-C
Creates an enhanced concurrent capable volume group. Only use the -C flag in a
configured PowerHA cluster.
Notes:
1) Enhanced concurrent volume groups use group services. Group services ships
with HACMP and must be configured prior to activating a volume group in this
mode.
2) Only enhanced concurrent capable volume groups are supported when running
with a 64-bit kernel. Concurrent capable volume groups are not supported when
running with a 64-bit kernel.
-G
Same as the -B flag.
-p partitions
Total number of partitions in the volume group, where the partitions variable is
represented in units of 1024 partitions. Valid values are 32, 64, 128, 256, 512, 768,
1024, and 2048. The default is 32 (32,768 partitions). The chvg command can be used
to increase the number of partitions up to the maximum of 2048 (2,097,152 partitions).
This option is valid only with the -s option.
-s size
Sets the number of megabytes in each physical partition, where the size variable is
expressed in units of megabytes from 1 (1 MB) through 131072 (128 GB). The size
variable must be equal to a power of 2 (for example: 1, 2, 4, 8). The default value for 32 and
128 PV volume groups will be the lowest value to remain within the limitation of 1016
physical partitions per PV. The default value for scalable volume groups will be the
lowest value to accommodate 2040 physical partitions per PV.
-t factor
Changes the limit of the number of physical partitions per physical volume, specified by
factor. The factor should be in the range of 1 - 16 for 32 PV volume groups and 1 - 64
for 128 PV volume groups. The maximum number of physical partitions per physical
volume for this volume group changes to factor x 1016. The default will be the lowest
value to remain within the physical partition limit of factor x 1016. The maximum
number of pvs that can be included in the volume group is maxpvs/factor. The -t
option is ignored with the -s option.
-V MajorNumber
Specifies the major number of the volume group that is created.
Examples
In general, any operation that is valid with the mkvg command and that uses the supported
operands is valid with cli_mkvg. For example, to create a volume group named stevehvg that
contains hdisk3, hdisk5, and hdisk6 with a physical partition size of 1 MB, enter:
cli_mkvg -s 1 -y stevehvg hdisk3 hdisk5 hdisk6
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster.
The '-f' flag is passed to cl_mkvg to suppress unnecessary checking. As a consequence, the
operation will proceed even if some nodes are not accessible.
cli_on_cluster
Run an arbitrary command on all nodes in the cluster.
Syntax
cli_on_cluster [ -S | -P ] 'command string'
Description
Runs a given command as root on all cluster nodes, either serially or in parallel. Any output
(stdout or stderr) from the command is sent to the terminal. Each line of output is preceded by
the node name followed by a colon (:) character.
Flags
-S
Runs the command on each node in the cluster in turn, waiting for completion before
going to the next.
-P
Runs the command in parallel on all nodes in the cluster simultaneously.
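Examples
For example, to check the status of the cluster manager subsystem on each node in turn
(the command string here is illustrative), enter:
cli_on_cluster -S 'lssrc -s clstrmgrES'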
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster.
Note: CAA introduced the clcmd command, which also provides parallel command functionality.
cli_on_node
Run an arbitrary command on a specific node in the cluster.
Syntax
cli_on_node [ -V <volume group> | -R <resource group> | -N <node> ] 'command
string'
Description
Runs a given command as root on either an explicitly specified node, or on the cluster node
that owns a specified volume group or resource group. Any output from the command (stdout
and stderr) is sent to the terminal.
Flags
Only one of the following flags can be specified.
-V volume group
Runs the command on the node on which the given volume group is varied on. If the
volume group is varied on in concurrent mode on multiple nodes, the command will be
run on all those nodes.
-R resource group
Runs the command on the node that currently owns the given resource group.
-N node
Runs the command on the given node. This is the PowerHA node name.
Examples
To run the ps -efk command on the node named jessica, enter:
cli_on_node -N jessica 'ps -efk'
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster.
cli_reducevg
Removes a physical volume from a volume group, and makes the change known on all cluster
nodes. When all physical volumes are removed from the volume group, the volume group is
deleted on all cluster nodes.
Syntax
cli_reducevg VolumeGroup PhysicalVolume ...
Description
Uses C-SPOC to run the reducevg command with the given parameters, and make the
updated volume group definition known on all cluster nodes.
Examples
In general, any operation that is valid with the reducevg command and that uses the
supported operands is valid with cli_reducevg. For example, to remove physical disk hdisk10
from volume group sbodilyvg, enter:
cli_reducevg sbodilyvg hdisk10
cli_replacepv
Replace a physical volume in a volume group with another, and make the change known on
all cluster nodes.
Syntax
cli_replacepv SourcePhysicalVolume DestinationPhysicalVolume
Description
Uses C-SPOC to run the replacepv command with the given parameters, and make the
updated volume group definition known on all cluster nodes.
Examples
In general, any operation that is valid with the replacepv command and that uses the
supported operands is valid with cli_replacepv. For example, to replace hdisk10 with
hdisk20 in the volume group that owns hdisk10, enter:
cli_replacepv hdisk10 hdisk20
The '-f' flag is passed to cl_replacepv to suppress unnecessary checking and provide
automatic confirmation of physical disk removal. As a consequence, the operation will
proceed even if some nodes are not accessible.
cli_rmfs
Remove a file system from all nodes in a cluster.
Syntax
cli_rmfs [ -r ] FileSystem
Description
Uses C-SPOC to run the rmfs command with the given parameters, and remove the file
system definition from all cluster nodes.
Flags
Only the following flag from the rmfs command is supported:
-r
Removes the mount point of the file system
Examples
In general, any operation that is valid with the rmfs command and that uses the supported
operands is valid with cli_rmfs. For example, to remove the shared file system /vanfs, enter:
cli_rmfs -r /vanfs
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used on
any file systems in rootvg, or that otherwise might appear multiple times across the cluster.
The '-f' flag is passed to cl_rmfs to suppress unnecessary checking and provide automatic
confirmation of the removal. As a consequence, the operation will proceed even if some
nodes are not accessible.
cli_rmlv
Remove a logical volume from all nodes in a cluster.
Syntax
cli_rmlv LogicalVolume ...
Description
Uses C-SPOC to run the rmlv command with the given parameters, and make the updated
logical volume definition known on all cluster nodes.
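Examples
In general, any operation that is valid with the rmlv command and that uses the supported
operands is valid with cli_rmlv. For example, to remove a shared logical volume named
jesslv (an illustrative name), enter:
cli_rmlv jesslv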
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used on
any logical volume in rootvg, or that otherwise might be duplicated across the cluster.
cli_rmlvcopy
Remove copies from a logical volume on all nodes in a cluster.
Syntax
cli_rmlvcopy LogicalVolume Copies [ PhysicalVolume... ]
Description
Uses C-SPOC to run the rmlvcopy command with the given parameters, and make the
updated logical volume definition known on all cluster nodes.
Examples
In general, any operation that is valid with the rmlvcopy command and that uses the
supported operands is valid with cli_rmlvcopy. For example, to reduce the number of copies
of each logical partition belonging to logical volume jesslv so that each has only a single
copy, enter:
cli_rmlvcopy jesslv 1
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used on
any logical volume in rootvg, or that otherwise might be duplicated across the cluster.
cli_syncvg
Run the syncvg command with the given parameters and make the updated volume group
definition known on all cluster nodes.
Syntax
cli_syncvg [-f] [-H] [-P NumParallelLps] {-l|-v} Name
Description
Uses C-SPOC to run the syncvg command, which synchronizes logical volume copies that
are not current (stale), and make the updated volume group definition known on all cluster
nodes.
Examples
In general, any operation that is valid with the syncvg command and that uses the supported
operands is valid with cli_syncvg. For example, to synchronize the copies on volume group
sbodilyvg, enter:
cli_syncvg -v sbodilyvg
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used on
rootvg, or any other volume group that otherwise might be duplicated across the cluster.
cli_unmirrorvg
Unmirror a volume group on all nodes in a cluster.
Syntax
cli_unmirrorvg [ -c Copies ] VolumeGroup [ PhysicalVolume ... ]
Description
Uses C-SPOC to run the unmirrorvg command with the given parameters, and make the
updated volume group definition known on all cluster nodes.
Examples
In general, any operation that is valid with the unmirrorvg command and that uses the
supported operands is valid with cli_unmirrorvg. For example, to specify only a single copy
for shared volume group bodilyvg, enter:
cli_unmirrorvg -c 1 bodilyvg
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used on
rootvg, or any other volume group that otherwise might be duplicated across the cluster.
cli_updatevg
Updates the definition of a volume group on all cluster nodes to match the current actual state
of the volume group.
Syntax
cli_updatevg VolumeGroup
Description
Uses C-SPOC to run the updatevg command, which causes LVM on each cluster node to
read the LVM information on the disks in the volume group, and update the local volume
group definition.
Examples
To update the volume group definition for volume group shawnvg on all cluster nodes, enter:
cli_updatevg shawnvg
Implementation specifics
It must be run as root, on a node in a PowerHA SystemMirror cluster. It should not be used on
rootvg, or any other volume group that otherwise might be duplicated across the cluster.
Related publications
The publications listed in this section are considered particularly suitable for a more detailed
discussion of the topics covered in this book.
IBM Redbooks
The following IBM Redbooks publications provide additional information about the topics in
this document. Some publications in this list might be available in softcopy only.
Exploiting IBM AIX Workload Partitions, SG24-7955
Guide to IBM PowerHA SystemMirror for AIX Version 7.1.3, SG24-8167
IBM PowerHA SystemMirror 7.1.2 Enterprise Edition for AIX, SG24-8106
IBM PowerHA SystemMirror Standard Edition 7.1.1 for AIX Update, SG24-8030
IBM PowerVM Virtualization Introduction and Configuration, SG24-7940
Understanding LDAP - Design and Implementation, SG24-4986
You can search for, view, download, or order these documents and other Redbooks,
Redpapers, Web Docs, drafts, and additional materials at the following website:
ibm.com/redbooks
Online resources
These websites are also relevant as further information sources:
List of current service packs for PowerHA:
http://www14.software.ibm.com/webapp/set2/sas/f/hacmp/home.html
PowerHA frequently asked questions:
http://www-03.ibm.com/systems/power/software/availability/aix/faq/index.html
List of devices supported by PowerHA:
http://ibm.co/1EvK8cG
PowerHA Enterprise Edition Cross Reference:
http://tinyurl.com/haEEcompat
IBM Systems Director download page:
http://www-03.ibm.com/systems/software/director/downloads/
IBM PowerHA SystemMirror for AIX Cookbook

Explore the most recent enterprise-ready features
Use the planning worksheets and practical examples
Learn details about the advanced options

This IBM Redbooks publication can help you install, tailor, and configure the new IBM
PowerHA Version 7.1.3, and understand new and improved features such as migrations,
cluster administration, and advanced topics like configuring in a virtualized environment,
including workload partitions (WPARs).

With this book, you can gain a broad understanding of the IBM PowerHA SystemMirror
architecture. If you plan to install, migrate, or administer a high availability cluster, this book
is right for you.

This book can help IBM AIX professionals who seek a comprehensive and task-oriented
guide for developing the knowledge and skills required for PowerHA cluster design,
implementation, and daily system administration. It provides a combination of theory and
practical experience.

This book is targeted toward technical professionals (consultants, technical support staff, IT
architects, and IT specialists) who are responsible for providing high availability solutions
and support with the IBM PowerHA SystemMirror Standard on IBM POWER systems.

IBM Redbooks are developed by the IBM International Technical Support Organization.
Experts from IBM, Customers, and Partners from around the world create timely technical
information based on realistic scenarios. Specific recommendations are provided to help you
implement IT solutions more effectively in your environment.