IBM System Storage N series Hardware Guide
Roland Tretau, Jeff Lin, Dirk Peitzmann, Steven Pemberton, Tom Provost, Marco Schwarz
ibm.com/redbooks
International Technical Support Organization IBM System Storage N series Hardware Guide September 2012
SG24-7840-02
Note: Before using this information and the product it supports, read the information in Notices on page xi.
Third Edition (September 2012) This edition applies to the IBM System Storage N series portfolio as of June 2012.
Copyright International Business Machines Corporation 2012. All rights reserved. Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents

Notices
Trademarks

Preface
  The team who wrote this book
  Now you can become a published author, too!
  Comments welcome
  Stay connected to IBM Redbooks

Summary of changes
  September 2012, Third Edition

Part 1. Introduction to N series hardware

Chapter 1. Introduction to IBM System Storage N series
  1.1 Overview
  1.2 IBM System Storage N series hardware
  1.3 Software licensing structure
    1.3.1 Mid-range and high-end
    1.3.2 Entry-level
  1.4 Data ONTAP 8 supported systems

Chapter 2. Entry-level systems
  2.1 Overview
  2.2 N3220
    2.2.1 N3220 model 2857-A12
    2.2.2 N3220 model 2857-A22
    2.2.3 N3220 hardware
  2.3 N3240
    2.3.1 N3240 model 2857-A14
    2.3.2 N3240 model 2857-A24
    2.3.3 N3240 hardware
  2.4 N32x0 common information
  2.5 N3400
    2.5.1 N3400 model 2859-A11
    2.5.2 N3400 model 2859-A21
    2.5.3 N3400 hardware
  2.6 N3000 technical specifications at a glance

Chapter 3. Mid-range systems
  3.1 Overview
    3.1.1 Common features
    3.1.2 Hardware summary
    3.1.3 Functions and features common to all models
  3.2 Hardware
    3.2.1 N6210, N6240, and N6270 hardware overview
    3.2.2 IBM N62x0 MetroCluster / gateway models
    3.2.3 IBM N62x0 series technical specifications
  3.3 N62x0 technical specifications at a glance

Chapter 4. High-end systems
  4.1 Overview
  4.2 Hardware
    4.2.1 Base components
    4.2.2 IBM N series N7950T slot configuration rules
    4.2.3 N7950T hot-pluggable FRUs
    4.2.4 N7950T cooling architecture
    4.2.5 System-level diagnostic procedures
    4.2.6 N7950T supported back-end storage
    4.2.7 MetroCluster, Gateway, and FlexCache
    4.2.8 N7950T guidelines
    4.2.9 N7950T SFP+ modules
  4.3 N7950T technical specifications at a glance

Chapter 5. Expansion units
  5.1 Shelf technology overview
  5.2 Expansion unit EXN3000
    5.2.1 Overview
    5.2.2 Supported EXN3000 drives
    5.2.3 Environmental and technical specification
  5.3 Expansion unit EXN3500
    5.3.1 Overview
    5.3.2 Intermix support
    5.3.3 Supported EXN3500 drives
    5.3.4 Environmental and technical specification
  5.4 Expansion unit EXN4000
    5.4.1 Supported EXN4000 drives
    5.4.2 Environmental and technical specification
  5.5 Self-Encrypting Drive
    5.5.1 SED at a glance
    5.5.2 SED overview
    5.5.3 Threats mitigated by self-encryption
    5.5.4 Effect of self-encryption on Data ONTAP features
    5.5.5 Mixing drive types
    5.5.6 Key management

Chapter 6. Cabling expansions
  6.1 EXN3000 and EXN3500 disk shelves cabling
    6.1.1 Controller-to-shelf connection rules
    6.1.2 SAS shelf interconnects
    6.1.3 Top connections
    6.1.4 Bottom connections
    6.1.5 Verifying SAS connections
    6.1.6 Connecting the optional ACP cables
  6.2 EXN4000 disk shelves cabling
    6.2.1 Non-multipath Fibre Channel cabling
    6.2.2 Multipath Fibre Channel cabling
  6.3 Multipath High-Availability cabling

Chapter 7. Highly Available controller pairs
  7.1 HA pair overview
    7.1.1 Benefits of HA pairs
    7.1.2 Characteristics of nodes in an HA pair
    7.1.3 Preferred practices for deploying an HA pair
    7.1.4 Comparison of HA pair types
  7.2 HA pair types and requirements
    7.2.1 Standard HA pairs
    7.2.2 Mirrored HA pairs
    7.2.3 Stretched MetroCluster
    7.2.4 Fabric-attached MetroCluster
  7.3 Configuring the HA pair
    7.3.1 Configuration variations for standard HA pair configurations
    7.3.2 Preferred practices for HA pair configurations
    7.3.3 Enabling licenses on the HA pair configuration
    7.3.4 Configuring Interface Groups (VIFs)
    7.3.5 Configuring interfaces for takeover
    7.3.6 Setting options and parameters
    7.3.7 Testing takeover and giveback
    7.3.8 Eliminating single points of failure with HA pair configurations
  7.4 Managing an HA pair configuration
    7.4.1 Managing an HA pair configuration
    7.4.2 Halting a node without takeover
    7.4.3 Basic HA pair configuration management
    7.4.4 HA pair configuration failover basic operations
    7.4.5 Connectivity during failover

Chapter 8. MetroCluster
  8.1 Overview of MetroCluster
  8.2 Business continuity solutions
  8.3 Stretch MetroCluster
    8.3.1 Planning Stretch MetroCluster configurations
    8.3.2 Cabling Stretch MetroClusters
  8.4 Fabric Attached MetroCluster
    8.4.1 Planning Fabric MetroCluster configurations
    8.4.2 Cabling Fabric MetroClusters
  8.5 Synchronous mirroring with SyncMirror
    8.5.1 SyncMirror overview
    8.5.2 SyncMirror without MetroCluster
  8.6 MetroCluster zoning and TI zones
  8.7 Failure scenarios
    8.7.1 MetroCluster host failure
    8.7.2 N series and expansion unit failure
    8.7.3 MetroCluster interconnect failure
    8.7.4 MetroCluster site failure
    8.7.5 MetroCluster site recovery

Chapter 9. FibreBridge 6500N
  9.1 Description
  9.2 Architecture
  9.3 Administration and management

Chapter 10. Data protection with RAID Double Parity
  10.1 Background
  10.2 Why use RAID-DP
    10.2.1 Single-parity RAID using larger disks
    10.2.2 Advantages of RAID-DP data protection
  10.3 RAID-DP overview
    10.3.1 Protection levels with RAID-DP
    10.3.2 Larger versus smaller RAID groups
  10.4 RAID-DP and double parity
    10.4.1 Internal structure of RAID-DP
    10.4.2 RAID 4 horizontal row parity
    10.4.3 Adding RAID-DP double-parity stripes
    10.4.4 RAID-DP reconstruction
    10.4.5 Protection levels with RAID-DP
  10.5 Hot spare disks

Chapter 11. Core technologies
  11.1 Write Anywhere File Layout (WAFL)
  11.2 Disk structure
  11.3 NVRAM and system memory
  11.4 Intelligent caching of write requests
    11.4.1 Journaling write requests
    11.4.2 NVRAM operation
  11.5 N series read caching techniques
    11.5.1 Introduction of read caching
    11.5.2 Read caching in system memory

Chapter 12. Flash Cache
  12.1 About Flash Cache
  12.2 Flash Cache module
  12.3 How Flash Cache works
    12.3.1 Data ONTAP disk read operation
    12.3.2 Data ONTAP clearing space in the system memory for more data
    12.3.3 Saving useful data in Flash Cache
    12.3.4 Reading data from Flash Cache

Chapter 13. Disk sanitization
  13.1 Data ONTAP disk sanitization
  13.2 Data confidentiality
    13.2.1 Background
    13.2.2 Data erasure and standards compliance
    13.2.3 Technology drivers
    13.2.4 Costs and risks
  13.3 Data ONTAP sanitization operation
  13.4 Disk sanitization with encrypted disks

Chapter 14. Designing an N series solution
  14.1 Primary issues that affect planning
    14.1.1 IBM Capacity Magic
    14.1.2 IBM Disk Magic
  14.2 Performance and throughput
    14.2.1 Capacity requirements
    14.2.2 Other effects of Snapshot
    14.2.3 Capacity overhead versus performance
    14.2.4 Processor utilization
    14.2.5 Effects of optional features
    14.2.6 Future expansion
    14.2.7 Application considerations
    14.2.8 Backup servers
    14.2.9 Backup and recovery
    14.2.10 Resiliency to failure
  14.3 Summary

Part 2. Installation and administration

Chapter 15. Preparation and installation
  15.1 Installation prerequisites
    15.1.1 Pre-installation checklist
    15.1.2 Before arriving on site
  15.2 Configuration worksheet
  15.3 Initial hardware setup
  15.4 Troubleshooting if the system does not boot

Chapter 16. Basic N series administration
  16.1 Administration methods
    16.1.1 FilerView interface
    16.1.2 Command-line interface
    16.1.3 N series System Manager
    16.1.4 OnCommand
  16.2 Starting, stopping, and rebooting the storage system
    16.2.1 Starting the IBM System Storage N series storage system
    16.2.2 Stopping the IBM System Storage N series storage system
    16.2.3 Rebooting the system

Part 3. Client hardware integration

Chapter 17. Host Utilities Kits
  17.1 What Host Utilities Kits are
  17.2 The components of a Host Utilities Kit
    17.2.1 What is included in the Host Utilities Kit
    17.2.2 Current supported operating environments
  17.3 Functions provided by Host Utilities
    17.3.1 Host configuration
    17.3.2 IBM N series controller and LUN configuration
  17.4 Windows installation example
    17.4.1 Installing and configuring Host Utilities
    17.4.2 Preparation
    17.4.3 Running the Host Utilities installation program
    17.4.4 Host configuration settings
    17.4.5 Overview of settings used by the Host Utilities
  17.5 Setting up LUNs
    17.5.1 LUN overview
    17.5.2 Initiator group overview
    17.5.3 About mapping LUNs for Windows clusters
    17.5.4 Adding iSCSI targets
    17.5.5 Accessing LUNs on hosts

Chapter 18. Boot from SAN
  18.1 Overview
  18.2 Configure SAN boot for IBM System x servers
    18.2.1 Configuration limits and preferred configurations
    18.2.2 Preferred practices
    18.2.3 Basics of the boot process
    18.2.4 Configuring SAN booting before installing Windows or Linux systems
    18.2.5 Windows 2003 Enterprise SP2 installation
    18.2.6 Windows 2008 Enterprise installation
    18.2.7 Red Hat Enterprise Linux 5.2 installation
  18.3 Boot from SAN and other protocols
    18.3.1 Boot from iSCSI SAN
    18.3.2 Boot from FCoE

Chapter 19. Host multipathing
  19.1 Overview
  19.2 Multipathing software options
    19.2.1 Third-party multipathing solution
    19.2.2 Native multipathing solution
    19.2.3 Asymmetric Logical Unit Access (ALUA)
    19.2.4 Why ALUA?

Part 4. Performing upgrades

Chapter 20. Designing for nondisruptive upgrades
  20.1 System NDU
    20.1.1 Types of system NDU
    20.1.2 Supported Data ONTAP upgrades
    20.1.3 System NDU hardware requirements
    20.1.4 System NDU software requirements
    20.1.5 Prerequisites for a system NDU
    20.1.6 Steps for major version upgrades NDU in NAS and SAN environments
    20.1.7 System commands compatibility
  20.2 Shelf firmware NDU
    20.2.1 Types of shelf controller module firmware NDUs supported
    20.2.2 Upgrading the shelf firmware
    20.2.3 Upgrading the AT-FCX shelf firmware on live systems
    20.2.4 Upgrading the AT-FCX shelf firmware during system reboot
  20.3 Disk firmware NDU
    20.3.1 Overview of disk firmware NDU
    20.3.2 Upgrading the disk firmware non-disruptively
  20.4 ACP firmware NDU
    20.4.1 Upgrading ACP firmware non-disruptively
    20.4.2 Upgrading ACP firmware manually
  20.5 RLM firmware NDU

Chapter 21. Hardware and software upgrades
  21.1 Hardware upgrades
    21.1.1 Connecting a new disk shelf
    21.1.2 Adding a PCI adapter
    21.1.3 Upgrading a storage controller head
  21.2 Software upgrades
    21.2.1 Upgrading to Data ONTAP 7.3
    21.2.2 Upgrading to Data ONTAP 8.1

Part 5. Appendixes

Appendix A. Getting started
  Preinstallation planning
    Collecting documents
    Initial worksheet for setting up the nodes
  Start with the hardware
  Power on N series
  Data ONTAP update
  Obtaining the Data ONTAP software from the IBM NAS website
  Installing Data ONTAP system files
  Downloading Data ONTAP to the storage system
  Setting up the network using console
  Changing the IP address
  Setting up the DNS

Appendix B. Operating environment
  N3000 entry-level systems
    N3400
    N3220
    N3240
  N6000 mid-range systems
    N6210
    N6240
    N6270
  N7000 high-end systems
    N7950T
  N series expansion shelves
    EXN1000
    EXN3000
    EXN3500
    EXN4000

Appendix C. Useful resources
  N series to NetApp model reference
  Interoperability matrix

Related publications
  IBM Redbooks
  Other publications
  Online resources
  Help from IBM

Index
Notices
This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A. The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you. This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of those websites. The materials at those websites are not part of the materials for this IBM product and use of those websites is at your own risk. IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. 
All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental. COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:
AIX, DB2, DS4000, DS6000, DS8000, Enterprise Storage Server, IBM, Redbooks, Redpapers, Redbooks (logo), System i, System p, System Storage, System x, System z, Tivoli, XIV, xSeries, and z/OS
The following terms are trademarks of other companies: Intel Xeon, Intel, Intel logo, Intel Inside logo, and Intel Centrino logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Linux is a trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows NT, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. Snapshot, SecureAdmin, RAID-DP, FlexShare, FlexCache, WAFL, SyncMirror, SnapVault, SnapRestore, SnapMirror, SnapManager, SnapLock, SnapDrive, NearStore, MultiStore, FlexVol, FlexClone, FilerView, Data ONTAP, NetApp, and the NetApp logo are trademarks or registered trademarks of NetApp, Inc. in the U.S. and other countries. UNIX is a registered trademark of The Open Group in the United States and other countries. Other company, product, or service names may be trademarks or service marks of others.
Preface
This IBM Redbooks publication provides a detailed look at the features, benefits, and capabilities of the IBM System Storage N series hardware offerings. The IBM System Storage N series systems can help you tackle the challenge of effective data management by using virtualization technology and a unified storage architecture. The N series delivers low- to high-end enterprise storage and data management capabilities with midrange affordability. Built-in serviceability and manageability features help support your efforts to increase reliability, simplify and unify storage infrastructure and maintenance, and deliver exceptional economy. The IBM System Storage N series systems provide a range of reliable, scalable storage solutions to meet various storage requirements. These capabilities are achieved by using network access protocols such as Network File System (NFS), Common Internet File System (CIFS), HTTP, and iSCSI, and storage area network technologies such as Fibre Channel. Using built-in Redundant Array of Independent Disks (RAID) technologies, all data is protected, with options to enhance protection through mirroring, replication, Snapshots, and backup. These storage systems also have simple management interfaces that make installation, administration, and troubleshooting straightforward. This book also addresses high-availability solutions, including clustering and MetroCluster, which support the highest business continuity requirements. MetroCluster is a unique solution that combines array-based clustering with synchronous mirroring to deliver continuous availability. This book is a companion to IBM System Storage N series Software Guide, SG24-7129, which can be found at: http://www.redbooks.ibm.com/abstracts/sg247129.html?Open
analysis and the sizing of SAN and NAS solutions. He holds an engineering diploma in Computer Sciences from the University of Applied Science in Isny, Germany, and is an Open Group Master Certified IT Specialist. Steven Pemberton is a senior storage architect with IBM GTS in Melbourne, Australia. He has broad experience as an IT solution architect, pre-sales specialist, consultant, instructor, and enterprise IT customer. He is a member of the IBM Technical Experts Council for Australia and New Zealand (TEC A/NZ), has multiple industry certifications, and is the co-author of five previous IBM Redbooks. Tom Provost is a Field Technical Sales Specialist for the IBM Systems and Technology Group in Belgium. Tom has multiple years of experience as an IT professional providing design, implementation, migration, and troubleshooting support for IBM System x, IBM System Storage, storage software, and virtualization. Tom is also the co-author of several other IBM Redbooks and IBM Redpapers publications. He joined IBM in 2010. Marco Schwarz is an IT specialist and team leader for Techline, part of the Techline Global Center of Excellence, who lives in Germany. He has multiple years of experience in designing IBM System Storage solutions. His expertise spans all recent technologies in the IBM storage portfolio, including tape, disk, and NAS technologies.
Figure 1 The team, from left: Dirk, Tom, Roland, Marco, Jeff, and Steven
Thanks to the following people for their contributions to this project: Bertrand Dufrasne, International Technical Support Organization, San Jose Center. Thanks to the authors of the previous editions of this book: Alex Osuna, Sandro De Santis, Carsten Larsen, Tarik Maluf, and Patrick P. Schill.
Comments welcome
Your comments are important to us! We want our books to be as helpful as possible. Send us your comments about this book or other IBM Redbooks publications in one of the following ways:
- Use the online Contact us review Redbooks form found at:
  ibm.com/redbooks
- Send your comments in an email to:
  redbooks@us.ibm.com
- Mail your comments to:
  IBM Corporation, International Technical Support Organization
  Dept. HYTD Mail Station P099
  2455 South Road
  Poughkeepsie, NY 12601-5400
Summary of changes
This section describes the technical changes made in this edition of the book and in previous editions. This edition might also include minor corrections and editorial changes that are not identified. Summary of Changes for SG24-7840-02 for IBM System Storage N series Hardware Guide as created or updated on September 21, 2012.
New information
The N series hardware portfolio has been updated to reflect the June 2012 status quo. Information about changes in Data ONTAP 8.1 has been included. High-Availability and MetroCluster information has been updated to include SAS shelf technology.
Changed information
Hardware information for products that are no longer available has been removed. Information that is valid only for Data ONTAP 7.x has been removed, or modified to highlight differences and improvements in the current Data ONTAP 8.1 release.
Part 1. Introduction to N series hardware

Chapter 1. Introduction to IBM System Storage N series
1.1 Overview
This section introduces the IBM System Storage N series and describes its hardware features. The IBM System Storage N series provides a range of reliable, scalable storage solutions for a variety of storage requirements. These capabilities are achieved by using network access protocols such as Network File System (NFS), Common Internet File System (CIFS), HTTP, FTP, and iSCSI. They are also achieved by using storage area network technologies such as Fibre Channel and Fibre Channel over Ethernet (FCoE). Using built-in Redundant Array of Independent Disks (RAID) technologies, all data is protected, with options to enhance protection through mirroring, replication, Snapshots, and backup. These storage systems also have simple management interfaces that make installation, administration, and troubleshooting straightforward. The N series unified storage solution supports file and block protocols as shown in Figure 1-1. Further, converged networking is supported for all protocols.
This type of flexible storage solution offers many benefits:
- Heterogeneous unified storage solution: Unified access for multiprotocol storage environments.
- Versatile: A single integrated architecture designed to support concurrent block I/O and file servicing over Ethernet and Fibre Channel SAN infrastructures.
- Comprehensive software suite designed to provide robust system management, copy services, and virtualization technologies.
- Ease of changing storage requirements, allowing fast, dynamic changes: If additional storage is required, you can expand it quickly and non-disruptively. If existing storage is deployed incorrectly, you can reallocate available storage from one application to another quickly and easily.
- Maintains availability and productivity during upgrades. If outages are necessary, downtime is kept to a minimum.
- Easily and quickly implement nondisruptive upgrades.
- Create effortless backup and recovery solutions that operate in a common manner across all data access methods.
- Tune the storage environment to a specific application while maintaining its availability and flexibility.
- Change the deployment of storage resources easily, quickly, and non-disruptively. Online storage resource redeployment is possible.
- Achieve robust data protection with support for online backup and recovery.
- Include added-value features such as deduplication to optimize space management.
All N series storage systems use a single operating system (Data ONTAP) across the entire platform. They offer advanced function software features that provide one of the industry's most flexible storage platforms. This functionality includes comprehensive system management, storage management, onboard copy services, virtualization technologies, disaster recovery, and backup solutions.
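Because every model runs the same operating system, the same console commands apply across the portfolio. The following minimal sketch assumes a Data ONTAP 7-Mode system; the node prompt n1> is a hypothetical example, and output varies by model and release:

    n1> version        # show the Data ONTAP release running on this controller
    n1> sysconfig -a   # list controller hardware, adapters, and attached shelves
    n1> license        # list the licensed protocols and software features
    n1> aggr status    # show aggregates and their RAID type, such as RAID-DP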
[Figure: N series portfolio highlights] N series Gateways use existing storage assets while introducing N series software; gateway functionality is achieved by adding a gateway feature code to the N6000 or N7000 appliance. The portfolio offers excellent performance, flexibility, and scalability at a proven lower overall TCO; highly efficient capacity utilization; and a comprehensive set of storage resiliency features, including RAID 6 (RAID-DP). The N series Unified Storage Architecture provides unmatched simplicity. (Maximum capacity is quoted with 3 TB HDDs.)
Features and benefits include: Data compression Transparent in-line data compression can store more data in less space, reducing the amount of storage you need to purchase and maintain. Reduces the time and bandwidth required to replicate data during volume SnapMirror transfers. Deduplication Runs block-level data deduplication on NearStore data volumes. Scans and deduplicates volume data automatically, resulting in fast, efficient space savings with minimal effect on operations. Data ONTAP Provides full-featured and multiprotocol data management for both block and file serving environments through N series storage operating system. Simplifies data management through single architecture and user interface, and reduces costs for SAN and NAS deployment. Disk sanitization Obliterates data by overwriting disks with specified byte patterns or random data. Prevents recovery of current data by any known recovery methods.
FlexCache Creates a flexible caching layer within your storage infrastructure that automatically adapts to changing usage patterns to eliminate bottlenecks. Improves application response times for large compute farms, speeds data access for remote users, or creates a tiered storage infrastructure that circumvents tedious data management tasks. FlexClone Provides near-instant creation of LUN and volume clones without requiring additional storage capacity. Accelerates test and development, and storage capacity savings. FlexShare Prioritizes storage resource allocation to highest-value workloads on a heavily loaded system. Ensures that best performance is provided to designated high-priority applications. FlexVol Creates flexibly sized LUNs and volumes across a large pool of disks and one or more RAID groups. Enables applications and users to get more space dynamically and non-disruptively without IT staff intervention. Enables more productive use of available storage and helps improve performance. Gateway Supports attachment to IBM Enterprise Storage Server (ESS) series, IBM XIV Storage System, and IBM System Storage DS8000 and DS5000 series. Also supports a broad range of IBM, EMC, Hitachi, Fujitsu, and HP storage subsystems. MetroCluster Offers an integrated high-availability/disaster-recovery solution for campus and metro-area deployments. Ensures high data availability when a site failure occurs. Supports Fibre Channel attached storage with SAN Fibre Channel switch; SAS attached storage with Fibre Channel -SAS bridge; and Gateway storage with SAN Fibre Channel switch. MultiStore Partitions a storage system into multiple virtual storage appliances. Enables secure consolidation of multiple domains and controllers. NearStore (near-line) Increases the maximum number of concurrent data streams (per storage controller). Enhances backup, data protection, and disaster preparedness by increasing the number of concurrent data streams between two N series systems. OnCommand Enables the consolidation and simplification of shared IT storage management by providing common management services, integration, security, and role-based access controls delivering greater flexibility and efficiency. Manages multiple N series systems from a single administrative console. Speeds deployment and consolidated management of multiple N series systems.
Flash Cache (Performance Acceleration Module): Improves throughput and reduces latency for file services and other random-read-intensive workloads. Offers power savings by consuming less power than adding more disk drives to optimize performance.
RAID-DP: Offers double-parity RAID protection (the N series RAID 6 implementation). Protects against data loss caused by double disk failures and media bit errors that occur during drive rebuild processes.
SecureAdmin: Authenticates both the administrative user and the N series system, creating a secure, direct communication link to the N series system. Protects administrative logins, passwords, and session commands from cleartext snooping by replacing RSH and Telnet with the strongly encrypted SSH protocol.
Single Mailbox Recovery for Exchange (SMBR): Enables the recovery of a single mailbox from a Microsoft Exchange Information Store. Extracts a single mailbox or email directly in minutes with SMBR, compared to hours with traditional methods. This process eliminates the need for staff-intensive, complex, and time-consuming Exchange server and mailbox recovery.
SnapDrive: Provides host-based data management of N series storage from Microsoft Windows, UNIX, and Linux servers. Simplifies host-consistent Snapshot copy creation and automates error-free restores.
SnapLock: Write-protects structured application data files within a volume to provide Write Once Read Many (WORM) disk storage. Provides storage that enables compliance with government records retention regulations.
SnapManager: Provides host-based data management of N series storage for databases and business applications. Simplifies application-consistent Snapshot copies, automates error-free data restores, and enables application-aware disaster recovery.
SnapMirror: Enables automatic, incremental data replication between systems, in either synchronous or asynchronous mode. Provides flexible, efficient site-to-site mirroring for disaster recovery and data distribution.
SnapRestore: Restores single files, directories, or entire LUNs and volumes rapidly from any Snapshot backup. Enables near-instant recovery of files, databases, and complete volumes.
Snapshot: Makes incremental, data-in-place, point-in-time copies of a LUN or volume with minimal performance effect. Enables frequent, nondisruptive, space-efficient, and quickly restorable backups.
SnapVault: Exports Snapshot copies to another N series system, providing an incremental block-level backup solution. Enables cost-effective, long-term retention of rapidly restorable disk-based backups.
Storage Encryption: Provides support for Full Disk Encryption (FDE) drives in N series disk shelf storage, and integration with license key managers, including IBM Tivoli Key Lifecycle Manager (TKLM).
SyncMirror: Maintains two online copies of data with RAID-DP protection on each side of the mirror. Protects against all types of hardware outages, including triple disk failure.
Gateway: Reduces data management complexity in heterogeneous storage environments for data protection and retention.
Software bundles: Provide the flexibility to take advantage of breakthrough capabilities while maximizing value with a considerable discount. Simplify the ordering of combinations of software features: Windows Bundle, Complete Bundle, and Virtual Bundle.
For more information about N series software features, see the companion book IBM System Storage N series Software Guide, SG24-7129. This book can be found at:
http://www.redbooks.ibm.com/abstracts/sg247129.html?Open
All N series systems support the storage efficiency features shown in Figure 1-3.
Figure 1-3 Storage efficiency features: Snapshot copies (save up to 33%), deduplication (removes data redundancies in primary and secondary storage; save up to 95%), and data compression (reduces the footprint of primary and secondary storage; save up to 87%)
Figure 1-4 provides an overview of the software structure introduced with the availability of Data ONTAP 8.1.
Figure 1-4 Data ONTAP 8.1 software structure, showing packaged bundles such as the SnapManager Suite and the Complete Bundle. Note: For Data ONTAP 8.0 and earlier, every feature requires its own license key to be installed separately.
To streamline licensing, the 7-Mode licensing infrastructure was modified so that features that are free of charge are handled in a more bundled, packaged manner. You no longer need to add license keys on your system for most features that are distributed at no additional fee. For some platforms, the features in a software bundle require only one license key. Other features are enabled when you add certain other software bundle keys.
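As a minimal sketch, you can verify which license keys are installed from the 7-Mode console and add a bundle key; the key shown is a hypothetical placeholder:
itsonas1> license
itsonas1> license add XXXXXXX
The license command without arguments lists the installed licenses; license add installs a new key. Under Data ONTAP 8.1, a single bundle key can enable several features, as described above.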
1.3.2 Entry-level
The entry-level software structure is similar to the mid-range and high-end structures outlined in the previous section. The following changes apply:
All protocols (CIFS, NFS, Fibre Channel, iSCSI) are included with entry-level systems
The Gateway feature is not available
The MetroCluster feature is not available
The current portfolio of Data ONTAP 8 supported systems includes the following models: N3220, N3240, N3400, N5300, N5600, N6040, N6060, N6070, N6210, N6240, N6270, N7600, N7700, N7800, N7900, and N7950T.
Chapter 2. Entry-level systems
This chapter describes the IBM System Storage N series 3000 systems, which address the entry-level segment. This chapter includes the following sections:
Overview
N3220
N3240
N32x0 common information
N3400
N3000 technical specifications at a glance
2.1 Overview
Figure 2-1 shows the N3000 modular disk storage systems, which are designed to provide primary and auxiliary storage for midsize enterprises. N3000 systems offer integrated data access, intelligent management software, data protection capabilities, and expandability to 432 TB of raw capacity in a cost-effective package. Furthermore, N3000 series innovations include internal controller support for serial-attached SCSI (SAS) or SATA drives, expandable I/O connectivity, and onboard remote management.
IBM System Storage N3220 is available as a single-node (Model A12) and as a dual-node (Model A22) (active-active) base unit. The IBM System Storage N3240 consists of single-node (Model A14) and dual-node (Model A24) (active-active) base units. The IBM System Storage N3400 is available as a single-node (Model A11) and as a dual-node (Model A21) (active-active) base unit.
2.2 N3220
This section addresses the N series 3220 models.
2.3 N3240
This section addresses the N series 3240 models.
Figure 2-4 shows the front and rear view of the N3240.
Figure 2-6 shows the controller with the 8 Gb FC Mezzanine card option.
Figure 2-7 shows the controller with the 10 GbE Mezzanine card option.
Table 2-2 provides ordering information for N32x0 systems with Mezzanine cards.
Table 2-2 N32x0 controller configuration
Feature code  Configuration
-             Controller with no Mezzanine card (blank cover)
2030          Controller with dual-port FC Mezzanine card (includes SFP+)
2031          Controller with dual-port 10 GbE Mezzanine card (no SFP+)
Table 2-3 provides information about the maximum number of supported shelves by expansion type.
Table 2-3 N32x0 number of supported shelves (total of 114 spindles)
Expansion shelf  Number of shelves supported
EXN1000          Up to six shelves (500 GB, 750 GB, and 1 TB SATA disk drives)
EXN3000          Up to nine shelves (300 GB, 450 GB, and 600 GB SAS, or 500 GB, 1 TB, 2 TB, and 3 TB SATA disk drives)
EXN3500          Up to nine shelves (450 GB and 600 GB SAS SFF disk drives)
EXN4000          Up to six shelves (144 GB, 300 GB, 450 GB, and 600 GB Fibre Channel disk drives)
2.5 N3400
This section addresses the N series 3400 models.
Figure 2-8 shows the front view of the N3400 controller module.
Figure 2-9 shows the rear view of the N3400 controller module. Both clustered (dual-controller) and stand-alone (single-controller) options are shown in the rear panel.
N3400 has one SAS expansion port per controller with one Alternate Control Path (ACP). If you need to attach the EXN3000 shelf to the controller, you can configure the shelf ACP during the setup process. Doing so enables Data ONTAP to manage the EXN3000 on a separate network to increase availability and stability. The ACP is shown in Figure 2-10.
The N3400 has the following key specifications:
2U high
Up to six external EXN1000 or EXN4000 expansion units
Up to five external SAS EXN3000 or EXN3500 expansion units
High-performance SAS infrastructure
Single controller or dual controller (for HA)
Unified storage: iSCSI, NAS, Fibre Channel
Each controller: up to 8 Gigabit Ethernet ports and two dual 4 Gbps Fibre Channel ports
Onboard remote platform management
Internal SAS drive bays
Starting with SAS firmware 0500, you can perform a nondisruptive update (NDU), so disk I/O is not interrupted while the SAS firmware is being updated.
The N3400 technical specifications are as follows (single-node / dual-node active/active):
Weight: 66 lb (29.9 kg) with drives
AC power: 3.9 A @ 100 V, 2 A @ 200 V / 4.6 A @ 100 V, 2.3 A @ 200 V
Thermal dissipation (BTU/hr): 1319 @ 100 V, 1288 @ 200 V / 1558 @ 100 V, 1524 @ 200 V
Controller configuration: single / dual active/active
Processor: 1x 32-bit dual-core / 2x 32-bit dual-core
Memory: 4 GB / 8 GB
NVRAM: 512 MB / 512 MB per controller
Fibre Channel ports: 4x 4 Gb SFP / 8x 4 Gb SFP
Ethernet ports: 4 GbE RJ45 / 8 GbE RJ45
SAS ports: 1x 3 Gb QSFP / 2x 3 Gb QSFP
The N32x0 and N3400 specifications at a glance (only partially recoverable from the source layout):
N3220: max capacity TB (e) (7.3.x / 8.0.x / 8.1.x): - / - / 374
N3240: max capacity TB (e) (7.3.x / 8.0.x / 8.1.x): - / - / 432
N3400: 136 disk drives; max capacity up to 408 TB (f)
Max aggregate size (g) (7.3.x / 8.0.x / 8.1.x): - / - / 60 TB
Max FlexVol size (7.3.x / 8.0.x / 8.1.x): - / - / 60 TB
Data ONTAP (minimum release): 8.1
a. AC power values shown are based on typical system values with two power supply units.
b. Thermal dissipation values shown are based on typical system values.
c, d. NVMEM on the N3240 and N3220 uses a portion of the 6 GB of controller memory, resulting in ~5.25 GB of memory for Data ONTAP.
e. System capacity is calculated by using base 10 arithmetic (1 TB = 1,000,000,000,000 bytes) and is derived from the type, size, and number of drives.
f. The maximum capacity shown can be achieved only by using 3 TB drives under Data ONTAP 8.0.2 or later.
g. Maximum aggregate size is calculated by using base 2 arithmetic (1 TB = 2^40 bytes).
For more information about N series 3000 systems, see the following website: http://www.ibm.com/systems/storage/network/n3000/appliance/index.html
Chapter 3. Mid-range systems
This chapter addresses the IBM System Storage N series 6000 systems, which address the mid-range segment. This chapter includes the following sections:
Overview
Hardware
N62x0 technical specifications at a glance
3.1 Overview
Figure 3-1 shows the N62x0 modular disk storage systems, which are designed to provide these advantages:
Increase NAS storage flexibility and expansion capabilities by consolidating block and file data sets onto a single multiprotocol storage platform.
Provide performance when your applications need it most with high bandwidth, 64-bit architecture, and the latest I/O technologies.
Maximize storage efficiency and growth, and preserve investments in staff expertise and capital equipment, with data-in-place upgrades to more powerful IBM System Storage N series systems.
Improve business efficiency by taking advantage of the N6000 series capabilities, which are also available with a Gateway feature. These capabilities reduce data management complexity in heterogeneous storage environments for data protection and retention.
IBM System Storage N62x0 series systems help you meet your network-attached storage (NAS) needs. They provide high levels of application availability for everything from critical business operations to technical applications. You can also address NAS and storage area network (SAN) as primary and auxiliary storage requirements. In addition, you get outstanding value. These flexible systems offer excellent performance and impressive expandability at a low total cost of ownership.
N6210
The IBM System Storage N6210 includes these storage controllers:
Model C20: An active/active dual-node base unit
Model C10: A single-node base unit
N6240
The IBM System Storage N6240 includes these storage controllers:
Model C21: An active/active dual-node base unit
Model E11: A single-node base unit
Model E21: The coupling of two Model E11s
Exx models contain an I/O expansion module that provides additional PCIe slots. The I/O expansion module is not available on Cxx models.
N6270
The IBM System Storage N6270 includes these storage controllers:
Model C22: An active/active dual-node base unit that consists of a single chassis with two controllers and no I/O expansion modules
Model E12: A single-node base unit that consists of a single chassis with one controller and one I/O expansion module
Model E22: The coupling of two E12 models
Exx models contain an I/O expansion module that provides additional PCIe slots. The I/O expansion module is not available on Cxx models.
The IBM System Storage N series supports these expansion units:
EXN1000 SATA storage expansion unit
EXN2000 and EXN4000 FC storage expansion units
EXN3000 SAS/SATA expansion unit
EXN3500 SAS expansion unit
At least one storage expansion unit must be attached to the N series system. All eight models must be mounted in a standard 19-inch rack. None of the eight models includes storage in the base chassis.
Reliability improvements
The N6000 series improves reliability as compared to its predecessors. Highlights include the following improvements:
Fewer cables, eliminating external cables in cluster configurations
Embedded NVRAM, eliminating the PCIe connector
Fewer components, specifically two fewer power supplies
Improved component de-rating guidelines, resulting in less stress on components
Upgrade path
You can make the following types of upgrades:
The Model C10 can be upgraded to a Model C20
The Model E11 can be upgraded to a Model E21
The Model E12 can be upgraded to a Model E22
Model upgrades are disruptive.
3.2 Hardware
This section gives an overview of the N62x0 systems.
Figure 3-4 shows the IBM N62x0 slots and interfaces for a stand-alone controller:
2 PCIe v2.0 (Gen 2) x8 slots (top: full height, full length; bottom: full height)
2x 6 Gb SAS (0a, 0b)
2x HA interconnect (c0a, c0b)
2x 4 Gb FCP (0c, 0d)
2x GbE (e0a, e0b)
USB port (not currently used)
Management port (wrench): SP and e0M
Private management ACP port (wrench with lock)
Serial console port
I/O expansion module: 4x PCIe x8 full-length, full-height slots
The IBM N62x0 I/O Expansion Module (IOXM) is displayed in Figure 3-7. It has these characteristics:
Components are not hot-swappable: the controller panics if the IOXM is removed
If inserted into a running IBM N62x0, the IOXM is not recognized until the controller is rebooted
4 full-length PCIe v1.0 (Gen 1) x8 slots
Figure 3-9 shows the IBM N62x0 USB Flash Module, which has the following features:
It is the boot device for Data ONTAP and the environment variables
It replaces CompactFlash
It has the same resiliency levels as CompactFlash
A 2 GB density is currently used
It is a replaceable FRU
Two chassis with single-enclosure HA (twin) Fabric MetroCluster requires EXN4000 disk shelves, or SAS shelves (EXN3000 and EXN3500) with a Fibre Channel-SAS bridge.
Gateway models are supported on both models, but have these requirements:
A 4-port 4 Gb FC adapter is required for IBM Gateway N6240 array attachment
The N6210 Gateway is limited to one LUN group from a single array
Stretching the HA pair (also called the SFO pair) by using the c0x ports is qualified with optical SFPs up to a distance of 30 m. Beyond that distance, you need the FC-VI adapter. When the FC-VI card is present, the c0x ports are disabled. Tip: Always use an FC-VI card in any N62xx MetroCluster, regardless of whether it is a stretched or fabric-attached MetroCluster.
The system-level diagnostics tool SLDIAG has these features:
SLDIAG replaces SYSDIAG; both run system-level diagnostic procedures
SLDIAG runs from maintenance mode, whereas SYSDIAG booted from a separate binary
SLDIAG has a CLI, whereas SYSDIAG used menu tables
SLDIAG is used on the IBM N6210 and N6240, and on all new platforms going forward.
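The following is a minimal sketch of a system-level diagnostics session; boot the controller to maintenance mode first (for example, from the boot menu), and verify the exact sldiag subcommand syntax in the diagnostics guide for your release before running it:
*> sldiag device types
*> sldiag device run -dev mem
*> sldiag device status -dev mem -long -state failed
The first command lists the device types that can be tested, the second runs the memory tests, and the third reports any failed test results.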
The N62x0 technical specifications at a glance (only partially recoverable from the source layout):
N6210: 1x 64-bit dual-core processor, 4 GB memory, 512 MB NVMEM, two 4 Gb SFP Fibre Channel ports, 2 PCIe expansion slots, 2 GbE RJ45 Ethernet ports, 2x 6 Gb QSFP SAS ports; max capacity TB (c) (7.3.x / 8.0.x / 8.1.x): 240 / 720 (d) / 720 (e)
N6240: values not recoverable from the source layout
a. AC power values shown are based on typical system values with two power supply units.
b. Thermal dissipation values shown are based on typical system values.
c. System capacity is calculated by using base 10 arithmetic (1 TB = 1,000,000,000,000 bytes) and is derived from the type, size, and number of drives.
d, e, f, g. The maximum capacity shown can be achieved only by using 3 TB drives under Data ONTAP 8.0.2 or later.
h. Maximum aggregate size is calculated by using base 2 arithmetic (1 TB = 2^40 bytes).
For more information about N series 6000 systems, see the following website: http://www.ibm.com/systems/storage/network/n6000/appliance/index.html
Chapter 4. High-end systems
This chapter describes the IBM System Storage N series 7000 systems, which address the high-end segment. This chapter includes the following sections:
Overview
Hardware
N7950T technical specifications at a glance
4.1 Overview
Figure 4-1 shows the N7950T Model E22 modular disk storage system. It is designed to provide these advantages:
High data availability and system-level redundancy
Support for concurrent block I/O and file serving over Ethernet and Fibre Channel SAN infrastructures
High throughput and fast response times
Support for enterprise customers who require network-attached storage (NAS), with Fibre Channel or iSCSI connectivity
Attachment of Fibre Channel, serial-attached SCSI (SAS), and Serial Advanced Technology Attachment (SATA) disk expansion units
The IBM System Storage N7950T (2867 Model E22) system is an active/active dual-node base unit. It consists of two cable-coupled chassis with one controller and one I/O expansion module per node. It is designed to provide fast data access, simultaneous multiprotocol support, expandability, upgradability, and low maintenance requirements. The N7950T can be configured as a gateway and is designed to provide these advantages:
High data availability and system-level redundancy, designed to address the needs of business-critical and mission-critical applications
A single, integrated architecture designed to support concurrent block I/O and file serving over Ethernet and Fibre Channel SAN infrastructures
High throughput and fast response times for database, email, and technical applications
Enterprise customer support for unified access requirements for NAS through Fibre Channel or iSCSI
Fibre Channel, SAS, and SATA attachment options for disk expansion units, designed to allow deployment in multiple environments, including data retention, NearStore, disk-to-disk backup scenarios, and high-performance, mission-critical, I/O-intensive operations
The N7950T supports the EXN1000 SATA storage expansion unit, the EXN4000 FC storage expansion unit, the EXN3000 SAS/SATA expansion unit, and the EXN3500 SAS expansion unit. At least one storage expansion unit must be attached to the N series system. The IBM System Storage N series is designed to interoperate with products capable of data transmission in the industry-standard iSCSI, CIFS, FCP, FCoE, and NFS protocols. Supported systems include the IBM System p, IBM System i (NFS only), IBM System x, and IBM System z (NFS only) servers. The N7950T system consists of Model E22 and associated software.
The N7950T can be configured, by using optional features, to be either a storage controller or a gateway. It includes clustered failover (CFO) support (by using the required feature), which provides a failover and failback function to improve overall availability. N series systems must be mounted in a standard 19-inch rack. The N7950T includes the following hardware:
Up to 4320 TB raw storage capacity
192 GB random access memory (192 GB of physical memory; the actual memory allocated depends on the Data ONTAP release in use)
8 GB nonvolatile memory
Integrated Fibre Channel, Ethernet, and SAS ports
Support for Flash Cache 2 modules, with a maximum of 16 TB
Diagnostic LED/LCD
Dual redundant hot-plug integrated cooling fans and auto-ranging power supplies
19-inch, rack-mountable enclosure
4.2 Hardware
This section provides an overview of the N7950T E22 hardware.
Figure 4-4 shows the IBM N series N7950T Slots and Interfaces Controller Module.
Figure 4-4 IBM N series N7950T Slots and Interfaces Controller Module
The N7950T includes the following features:
2 onboard I/O slots (vertical)
NVRAM8 always goes into slot 2
A special 8 Gb FC or 6 Gb SAS system board must be in slot 1
Fibre Channel system board ports can be target or initiator
The N7950T also provides these expansion slots and onboard ports:
4 PCIe v2.0 (Gen 2) x8 slots (horizontal), full length and full height, for regular expansion adapters
4x 10 GbE (e0c, e0d, e0e, e0f); the SFP+ module is not interchangeable with other 10 GbE ports
4x 8 Gb FCP (0a, 0b, 0c, 0d)
Figure 4-5 shows the IBM N series N7950T controller I/O.
Figure 4-6 shows the IBM N series N7950T I/O Expansion Module (IOXM).
The N7950T IOXM has these characteristics:
All slots are PCIe v2.0 (Gen 2)
Vertical slots have a different form factor
Not hot-swappable: the controller panics if the IOXM is removed
Hot-pluggable, but not recognized until reboot
FlexCache uses the N7950T chassis (controller with IOXM) and supports the dual-enclosure HA configuration.
The NVRAM8 and SAS I/O system boards use the QSFP connector. Mixing the cables does not cause physical damage, but the cables will not work. Label your HA and SAS cables when you remove them.
The N7950T technical specifications at a glance:
Weight: 251.4 lb (114 kg)
AC power (a): 13.8 A @ 100 V, 7 A @ 200 V
Thermal dissipation, BTU/hr (b): 4540 @ 100 V, 4404 @ 200 V
Controller configuration: dual active/active with IOXM
Processor: 4x 64-bit six-core
Memory: 192 GB
NVRAM: 8 GB
Fibre Channel ports: 8 - 32 8 Gb SFP+ (c)
Expansion slots: 24 PCIe
Ethernet ports: 4x GbE RJ45 and 8x 10 GbE SFP+
SAS ports: 0 - 24 6 Gb QSFP (d)
Max capacity TB (e) (7.3.x / 8.0.x / 8.1.x): - / 4320 (f) / 4320 (g)
Number of disk drives: 1440
Max shelves (EXN3500 / EXN3000 / EXN4000): 60 / 60 / 84
a. AC power values shown are based on typical system values with two power supply units.
b. Thermal dissipation values shown are based on typical system values.
c. N7950T embedded Fibre Channel ports and Fibre Channel ports on the vertical FC system boards are considered onboard ports. They can be set as targets or initiators, and support operation at 2, 4, or 8 Gb speeds. Operation at 1 Gb speeds is not supported.
d. The number of onboard SAS ports differs based on the configuration.
e. System capacity is calculated by using base 10 arithmetic (1 TB = 1,000,000,000,000 bytes) and is derived from the type, size, and number of drives.
f, g. The maximum capacity shown can be achieved only by using 3 TB drives under Data ONTAP 8.0.2 or later.
h. Maximum aggregate size is calculated by using base 2 arithmetic (1 TB = 2^40 bytes).
For more information about N series 7000 systems, see the following website: http://www.ibm.com/systems/storage/network/n7000/appliance/index.html
Chapter 5. Expansion units
This chapter provides detailed information about the IBM N series expansion units, also called disk shelves. This chapter includes the following sections:
Shelf technology overview
Expansion unit EXN3000
Expansion unit EXN3500
Expansion unit EXN4000
Self-Encrypting Drive
5.2.1 Overview
The IBM System Storage EXN3000 SAS/SATA expansion unit is available for attachment to all N series systems except the N3300, N3700, N5200, and N5500. The EXN3000 provides low-cost, high-capacity serial-attached SCSI (SAS) and Serial Advanced Technology Attachment (SATA) disk storage for IBM N series system storage.
The EXN3000 is a 4U disk storage expansion unit that can be mounted in any industry-standard 19-inch rack. The EXN3000 contains these features:
Dual redundant hot-pluggable integrated power supplies and cooling fans
Dual redundant disk expansion unit switched controllers
24 hard disk drive slots
The EXN3000 SAS/SATA expansion unit is shown in Figure 5-2.
The EXN3000 SAS/SATA expansion unit can be shipped with no disk drives installed. Disk drives ordered with the EXN3000 are installed by IBM in the plant before shipping. Requirement: For an initial order of an N series system, at least one of the storage expansion units must be ordered with at least five disk drive features. Figure 5-3 shows the rear view and the fans.
The EXN3500 SAS expansion unit is a 2U small form factor (SFF) disk storage expansion unit that must be mounted in an industry-standard 19-inch rack. It can be attached to all N series systems except the N3300, N3700, N5200, and N5500. It includes the following features:
Third-generation SAS product
Increased density: 24x 2.5-inch 10k RPM drives in 2 rack units at the same capacity points (450 GB and 600 GB) offer double the GB per rack unit of the EXN3000
Increased IOPS per rack unit
Greater bandwidth: 6 Gb SAS 2.0 offers approximately 24 Gb (6 Gb x4) of combined bandwidth per wide port
Improved power consumption: power consumption per GB is reduced by approximately 30-50%
Only SAS drives are supported in the EXN3500; SATA is not supported
What has not changed:
Same underlying architecture and firmware base as the EXN3000
All existing EXN3000 features and functions
Still uses the 3 Gb PCIe quad-port SAS HBA (already 6 Gb capable) or onboard SAS ports
5.3.1 Overview
The EXN3500 includes the following hardware:
Dual, redundant, hot-pluggable, integrated power supplies and cooling fans
Dual, redundant disk expansion unit switched controllers
24 SFF hard disk drive slots
Diagnostic and status LEDs
Figure 5-4 shows the EXN3500 front view.
The EXN3500 SAS expansion unit can be shipped with no disk drives installed. Disk drives ordered with the EXN3500 are installed by IBM in the plant before shipping. Disk drives can be of 450 GB and 600 GB physical capacity, and must be ordered as features of the EXN3500. Requirement: For an initial order of an N series system, at least one of the storage expansion units must be ordered with at least five disk drive features.
Figure 5-5 shows the rear view of the EXN3500, including its connectivity and resiliency features.
Figure 5-7 shows the front view of the EXN4000 expansion unit.
EXN4000 is the replacement for the EXN2000 Fibre Channel storage expansion unit.
Self-encryption protects encrypted data at rest on powered-off disk drives. That is, it prevents someone from removing a shelf or drive and mounting it on an unauthorized system. This security minimizes the risk of unauthorized access to data if drives are stolen from a facility or compromised during physical movement of the storage array between facilities. Additionally, self-encryption prevents unauthorized data access when drives are returned as spares or after a drive failure. This security includes cryptographic shredding of data for non-returnable disk (NRD) and disk repurposing scenarios, and simplified disposal of the drive through disk destroy commands. These processes render a disk completely unusable, which greatly simplifies the disposal of drives and eliminates the need for costly, time-consuming physical drive shredding. Remember that all data on the drives is automatically encrypted. If you do not want to track where the most sensitive data is, or risk it being outside an encrypted volume, use NSE to ensure that all data is encrypted.
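As a hedged sketch of how these operations surface at the 7-Mode console (assuming the Storage Encryption disk encrypt command set; verify the exact syntax for your release), the following commands display key status and then cryptographically shred a hypothetical drive 0c.00.3:
itsonas1> disk encrypt show
itsonas1> disk encrypt sanitize 0c.00.3
Because sanitizing destroys the encryption key material for the drive, the data on it becomes unreadable without any physical shredding.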
Overview of KMIP
Key Management Interoperability Protocol (KMIP) is an encryption key interoperability standard created by OASIS, a consortium of security and storage vendors. Version 1.0 was ratified in September 2010, and participating vendors have since released compatible products. KMIP appears to have superseded IEEE P1619.3, an earlier proposed standard. With KMIP-compatible tools, organizations can manage their encryption keys from a single point of control. This approach improves security, reduces complexity, and achieves regulatory compliance more quickly and easily. It is a considerable improvement over the previous approach of using many different encryption key management tools for many different business purposes and IT assets.
IBM Tivoli Key Lifecycle Manager V1.0 supports the following operating systems:
AIX V5.3 (64-bit, Technology Level 5300-04 and Service Pack 5300-04-02) and AIX V6.1 (64-bit)
Red Hat Enterprise Linux AS Version 4.0 on x86 (32-bit)
SUSE Linux Enterprise Server Version 9 on x86 (32-bit) and Version 10 on x86 (32-bit)
Sun Server Solaris 10 (SPARC 64-bit). Remember: On Sun Server Solaris, Tivoli Key Lifecycle Manager runs in a 32-bit JVM.
Microsoft Windows Server 2003 R2 (32-bit Intel)
IBM z/OS V1 Release 9 or later
For more information about Tivoli Key Lifecycle Manager, see this website:
http://www.ibm.com/software/tivoli/products/key-lifecycle-mgr/
Chapter 6. Cabling expansions
This chapter addresses the multipath cabling of expansions, covering standard multipath cabling, multipath HA cabling, and cabling of different expansion types. This chapter includes the following sections:
EXN3000 and EXN3500 disk shelves cabling
EXN4000 disk shelves cabling
Multipath High-Availability cabling
Connecting the quad-port SAS HBAs follows these rules for connecting to SAS shelves:
HBA port A and port C always connect to the top storage expansion unit in a stack of storage expansion units.
HBA port B and port D always connect to the bottom storage expansion unit in a stack of storage expansion units.
Think of the four HBA ports as two units of ports. Port A and port C are the top connection unit, and port B and port D are the bottom connection unit (Figure 6-2). Each unit (A/C and B/D) connects to each of the two ASIC chips on the HBA. If one chip fails, the HBA maintains connectivity to the stack of storage expansion units.
Figure 6-2 Top and bottom cabling for quad-port SAS HBAs
SAS cabling is based on the rule that each controller is connected to the top storage expansion unit and the bottom storage expansion unit in a stack:
Controller 1 always connects to the top storage expansion unit IOM A and the bottom storage expansion unit IOM B in a stack of storage expansion units.
Controller 2 always connects to the top storage expansion unit IOM B and the bottom storage expansion unit IOM A in a stack of storage expansion units.
Figure 6-3 shows how the SAS shelves are interconnected for two stacks with three shelves each.
Figure 6-5 shows a fully redundant example of SAS shelf connectivity. No single cable or shelf controller failure causes any interruption of service.
2. Review the output and perform the following steps:
If the output lists all of the IOMs, the IOMs have connectivity. Return to the cabling procedure for your storage configuration to complete the cabling steps.
If some IOMs are not shown, an IOM might be cabled incorrectly. The incorrectly cabled IOM and all of the IOMs downstream from it are not displayed in the output. Return to the cabling procedure for your storage configuration, review the cabling to correct any cabling errors, and verify SAS connectivity again.
Enable ACP on the storage system by entering the following command at the console:
options acp.enabled on
Verify that the ACP cabling is correct by entering the following command:
storage show acp
For more information about cabling SAS stacks and ACP to an HA pair, see the IBM System Storage EXN3000 Storage Expansion Unit Hardware and Service Guide found at:
http://www.ibm.com/storage/support/nas
Attention: Do not mix Fibre Channel and SATA expansion units in the same loop.
Tip: For N series controllers to communicate with an EXN4000 disk shelf, the Fibre Channel ports on the controller or gateway must be set as initiators. You can change the behavior of the Fibre Channel ports on the N series system with the fcadmin command.
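The following console sketch illustrates checking and changing the personality of an onboard Fibre Channel port with fcadmin; the port name 0c is a hypothetical example, and the change takes effect only after the controller is rebooted:
itsonas1> fcadmin config
itsonas1> fcadmin config -t initiator 0c
The first command lists each onboard FC port with its current type (target or initiator); the second sets port 0c as an initiator so it can attach to an EXN4000 shelf.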
N6270A> storage show disk -p
PRIMARY  PORT  SECONDARY  PORT  SHELF  BAY
-------  ----  ---------  ----  -----  ---
0a.16    A                       1      0
0a.18    A                       1      2
0a.19    A                       1      3
0a.20    A                       1      4
Multipath High-Availability (MPHA) cabling adds redundancy, reducing the number of conditions that can trigger a failover (Example 6-2).
Example 6-2 Clustered system with MPHA connections to disks
storage show disk -p
PORT  SECONDARY  PORT  SHELF  BAY
----  ---------  ----  -----  ---
A     0c.16      B     1      0
B     0a.17      A     1      1
B     0a.18      A     1      2
A     0c.19      B     1      3
With only a single connection to the A channel, a disk loop is technically a daisy chain. When any component (fiber cable, shelf cable, shelf controller) in the loop fails, access is lost to all shelves after the break, triggering a cluster failover event. MPHA cabling creates a true loop by providing a path into the A channel and out of the B channel. Multiple shelves can experience failures without losing communication to the controller. A cluster failover is only triggered when a single shelf experiences failures to both the A and B channels.
Chapter 7. Highly Available controller pairs
In a standard HA pair, Data ONTAP functions so that each node monitors the functioning of its partner through a heartbeat signal sent between the nodes. Data from the NVRAM of one node is mirrored to its partner. Each node can take over the partner's disks or array LUNs if the partner fails. Also, the nodes synchronize time.
Nondisruptive hardware maintenance: When you halt one node and allow takeover, the partner node continues to serve data for the halted node. You can then replace or repair hardware in the node you halted. Figure 7-2 shows an HA pair where Controller A has failed and Controller B took over services from the failing node.
Clarification: Disk ownership is established by Data ONTAP or the administrator, rather than by which disk shelf the disk is attached to. The nodes own their spare disks, spare array LUNs, or both, and do not share them with the other node. Each node has mailbox disks or array LUNs on the root volume:
Two if it is an N series controller system (four if the root volume is mirrored by using the SyncMirror feature)
One if it is an N series gateway system (two if the root volume is mirrored by using the SyncMirror feature)
Tip: The mailbox disks or LUNs are used to do the following tasks:
Maintain consistency between the pair
Continually check whether the other node is running or whether it has performed a takeover
Store configuration information that is not specific to any particular node
The nodes can be on the same Windows domain or on separate domains.
The HA pair configuration variants compare as follows:
Standard HA pair: Data duplication: No. Distance between nodes: up to 500 meters. Use this configuration to provide higher availability by protecting against many hardware single points of failure.
Mirrored HA pair: Data duplication: Yes. Distance between nodes: up to 500 meters. Failover after loss of an entire site: No. Use this configuration to add increased data protection to the benefits of a standard HA pair configuration.
Stretch MetroCluster: Data duplication: Yes. Distance between nodes: up to 500 meters (270 meters if the Fibre Channel speed is 4 Gbps, and 150 meters if the Fibre Channel speed is 8 Gbps). Failover after loss of an entire site: Yes. Use this configuration to provide data and hardware duplication to protect against a local disaster.
Fabric-attached MetroCluster: Data duplication: Yes. Distance between nodes: up to 100 km, depending on the switch configuration (up to 30 km for gateway systems). Failover after loss of an entire site: Yes. Use this configuration to provide data and hardware duplication to protect against a larger-scale disaster.
Certain terms have particular meanings when used to refer to HA pair configuration. The specialized meanings of these terms are as follows: An HA pair configuration is a pair of storage systems configured to serve data for each other if one of the two systems becomes impaired. In Data ONTAP documentation and other information resources, HA pair configurations are sometimes also called HA pairs. When in an HA pair configuration, systems are often called nodes. One node is sometimes called the local node, and the other node is called the partner node or remote node.
Controller failover, also called cluster failover (CFO), refers to the technology that enables two storage systems to take over each other's data, improving data availability.
FC direct-attached topologies are topologies in which the hosts are directly attached to the storage system. Direct-attached systems do not use a fabric or Fibre Channel switches.
FC dual fabric topologies are topologies in which each host is attached to two physically independent fabrics that are connected to storage systems. Each independent fabric can consist of multiple Fibre Channel switches. A fabric that is zoned into two logically independent fabrics is not a dual fabric connection.
FC single fabric topologies are topologies in which the hosts are attached to the storage systems through a single Fibre Channel fabric. The fabric can consist of multiple Fibre Channel switches.
iSCSI direct-attached topologies are topologies in which the hosts are directly attached to the storage controller. Direct-attached systems do not use networks or Ethernet switches.
iSCSI network-attached topologies are topologies in which the hosts are attached to storage controllers through Ethernet switches. Networks can contain multiple Ethernet switches in any configuration.
Mirrored HA pair configuration is similar to the standard HA pair configuration, except that there are two copies, or plexes, of the data. This configuration is also called data mirroring.
Remote storage refers to the storage that is accessible to the local node, but is at the location of the remote node.
Single storage controller configurations are topologies in which only one storage controller is used. Single storage controller configurations have a single point of failure and do not support cfmodes in Fibre Channel SAN configurations.
Figure 7-3 shows a standard HA pair with native disk shelves without Multipath Storage.
Figure 7-3 Standard HA pair with native disk shelves without Multipath Storage
In the example shown in Figure 7-3, cabling is without redundant paths to disk shelves. If one controller loses access to disk shelves, the partner controller can take over services. Takeover scenarios are addressed later in this chapter.
Disks and disk shelf compatibility: Fibre Channel, SAS, and SATA storage are all supported in a standard HA pair configuration, if the two storage types are not mixed on the same loop. One node can have only Fibre Channel storage and the partner node can have only SATA storage if needed.
HA interconnect adapters and cables must be installed unless the system has two controllers in the chassis and an internal interconnect.
Nodes must be attached to the same network, and the network interface cards (NICs) must be configured correctly.
The same system software, such as Common Internet File System (CIFS), Network File System (NFS), or SyncMirror, must be licensed and enabled on both nodes.
For an HA pair that uses third-party storage, both nodes in the pair must be able to see the same array LUNs. However, only the node that is the configured owner of a LUN has read and write access to that LUN.
Tip: If a takeover occurs, the takeover node can provide only the functionality for the licenses that are installed on it. If the takeover node does not have a license that the partner node was using to serve data, your HA pair configuration loses functionality at takeover.
License requirements
The cf (cluster failover) license must be enabled on both nodes.
License requirements
The following licenses must be enabled on both nodes:
cf (cluster failover)
syncmirror_local
A Stretch MetroCluster can be cabled to be redundant or non-redundant, and aggregates can be mirrored or unmirrored. Cabling for Stretch MetroCluster basically follows the same rules
as for a standard HA pair. The main difference is that a Stretch MetroCluster spans two sites with a maximum distance of up to 500 meters. A MetroCluster provides the cf forcetakeover -d command, giving you a single command to initiate a failover if an entire site becomes lost or unavailable. If a disaster occurs at one of the node locations, your data survives on the other node and can be served by that node while you address the issue or rebuild the configuration. In a site disaster, unmirrored data cannot be retrieved from the failing site. For the surviving site to perform a successful takeover, the root volume must be mirrored.
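A minimal console sketch of a site disaster takeover, assuming hypothetical nodes itsonas1 (failed site) and itsonas2 (surviving site); run cf forcetakeover -d on the surviving node only after confirming that the partner site is truly down and unreachable:
itsonas2> cf forcetakeover -d
After the failed site is repaired, the mirrored aggregates must be resynchronized before a giveback is performed.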
License requirements
The following licenses must be enabled on both nodes:
cf (cluster failover)
syncmirror_local
cf_remote
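A minimal sketch of enabling these licenses from the console of each node; the license codes shown are hypothetical placeholders:
itsonas1> license add XXXXXXX
itsonas1> license add YYYYYYY
itsonas1> license add ZZZZZZZ
Repeat the same commands, with the corresponding codes, on the partner node.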
operates at up to 8 Gbps. With a Fabric-attached MetroCluster, the distance between sites can be expanded from 500 meters up to a maximum of 100 km. Fabric-attached MetroClusters have the following characteristics:
They contain two complete, separate copies of the data volumes or file systems that you configured as mirrored volumes or file systems in your HA pair.
The fabric-attached MetroCluster nodes can be physically distant from each other, beyond the 500-meter limit of a Stretch MetroCluster. The maximum distance between the nodes is up to 100 km, depending on the switch configuration.
A fabric-attached MetroCluster connects the two controller nodes and the disk shelves through four SAN switches, called the back-end switches. The back-end switches are IBM/Brocade Fibre Channel switches in a dual-fabric configuration for redundancy.
Figure 7-5 shows a simplified Fabric-attached MetroCluster. Use a single disk shelf per Fibre Channel switch port. Up to two shelves are allowed.
Tip: The back-end Fibre Channel switches can be used for HA node pair and disk shelf pair connectivity only.
Node requirements
The following are the requirements for the nodes:
The nodes must be one of the following system models configured for mirrored volume use. Each node in the pair must be the same model:
- N5000 series systems, except for the N5500 and N5200 systems
- N6040, N6060, and N6070 systems
- N7600, N7700, N7800, and N7900 systems
- N6210 and N6240 systems
Each node requires a 4-Gbps FC-VI (Fibre Channel/Virtual Interface) adapter. The slot position depends on the controller model. The FC-VI adapter is also called a VI-MC or VI-MetroCluster adapter.
Tip: For information about supported cards and slot placement, see the appropriate hardware and service guide on the IBM NAS support site.
The 8-Gbps FC-VI (Fibre Channel/Virtual Interface) adapter is supported only on the N6210 and N6240 systems.
License requirements
The following licenses must be enabled on both nodes:
cf (cluster failover)
syncmirror_local
cf_remote
Consideration: Strict rules apply for how the back-end switches are configured. For more information, see the IBM System Storage N series Brocade 300 and Brocade 5100 Switch Configuration Guide found at:
http://www.ibm.com/storage/support/nas
Strict rules also apply for which firmware versions are supported on the back-end switches. For more information, see the latest IBM System Storage N series and TotalStorage NAS interoperability matrixes found at:
http://www.ibm.com/support/docview.wss?uid=ssg1S7003897
Consider the following questions about your installation before proceeding through the setup program:
Do you want to configure VIFs for your network interfaces?
How do you want to configure your interfaces for takeover?
Attention: Use VIFs with HA pairs to reduce single points of failure (SPOFs).
If you do not want to configure your network for use in an HA pair when you run setup for the first time, you can configure it later. You can do so either by running setup again, or by using the ifconfig command and editing the /etc/rc file manually. However, you must provide at least one local IP address to exit setup.
Make sure that the /etc/rc file is correctly configured as shown in Example 7-1.
Example 7-1 Example of /etc/rc files
/etc/rc on itsotuc1:
hostname itsotuc1
ifconfig e0 `hostname`-e0 mediatype 100tx-fd netmask 255.255.255.0
vif create multi vif1 e3a e3b e3c e3d
ifconfig vif1 `hostname`-vif1 mediatype 100tx-fd netmask 255.255.255.0 partner vif2
route add default 10.10.10.1 1
routed on
savecore
exportfs -a
nfs on

/etc/rc on itsotuc2:
hostname itsotuc2
ifconfig e0 `hostname`-e0 mediatype 100tx-fd netmask 255.255.255.0
vif create multi vif2 e3a e3b e3c e3d
ifconfig vif2 `hostname`-vif2 mediatype 100tx-fd netmask 255.255.255.0 partner vif1
route add default 10.10.10.1 1
routed on
savecore
exportfs -a
nfs on
2. Reboot both nodes by using the reboot command.
3. Enable HA pair capability on each node by entering the cf enable command on the local node console.
4. Verify that HA pair capability is enabled by entering the cf status command on each node console, as shown in Example 7-3.
Example 7-3 Confirming whether an HA pair configuration is enabled
cf status
Cluster enabled, nas2 is up
5. Repeat for any other licenses that you need to enable, using the license type and code for each licensed product installed on the HA pair configuration.
The Interface Groups can also be configured by using Data ONTAP FilerView or IBM System Manager for IBM N series.
After finishing setup, the system prompts you to reboot to make the new settings effective. Attention: If the partner is a VIF, you must use the VIF interface name.
The local node takes over the partner node, and the following message is displayed:
takeover completed
4. Test communication between the local node and partner node. For example, you can use the fcstat device_map command to ensure that one node can access the other node's disks.
5. Give back the partner node by entering the following command:
cf giveback
The local node releases the partner node, which reboots and resumes normal operation. The following message is displayed on the console when the process is complete:
giveback completed
6. Proceed as shown in Table 7-3, depending on whether the takeover and giveback completed successfully.
Table 7-3 Takeover and giveback messages
If takeover and giveback is completed successfully: repeat steps 2 through 5 on the partner node.
If takeover and giveback fails: attempt to correct the takeover or giveback failure.
The original table compares single points of failure (SPOF) in stand-alone and HA pair configurations for these hardware components: processor fan, FC-AL card, disk, disk shelf, power supply, fan, cluster adapter, and cluster interconnect. The recoverable failure behaviors are as follows:
FC-AL card: If an FC-AL card for the primary loop fails, the partner node attempts a failover at the time of failure. If the FC-AL card for the secondary loop fails, the failover capability is disabled, but both storage systems continue to serve data to their respective applications and users, with no effect or delay.
Disk: If a disk fails, the storage system can reconstruct the data from RAID 4 or RAID-DP parity. No failover is needed in this situation.
Disk shelf: A disk shelf is a passive backplane with dual power supplies, dual fans, dual ESH2s, and dual FC-AL loops. It is the most reliable component in a storage system.
Power supply: Both the storage system and the disk shelf have dual power supplies. If one power supply fails, the second power supply automatically takes over. No failover is needed in this situation.
Fan: Both the storage system head and the disk shelf have multiple fans. If one fan fails, the second fan automatically provides cooling. No failover is needed in this situation.
Cluster adapter: If a cluster adapter fails, the failover capability is disabled, but both storage systems continue to serve data to their respective applications and users.
Cluster interconnect: The cluster adapter supports dual cluster interconnect cables. If one cable fails, the HA pair traffic (heartbeat and NVRAM data) is automatically sent over the second cable with no delay or interruption. If both cables fail, the failover capability is disabled, but both storage systems continue to serve data to their respective applications and users.
Example 7-8 shows how an HA pair node is halted by using the halt -f command. You can monitor the entire shutdown process to the LOADER prompt by logging on through the RLM module. Doing so gives you console access even during reboot.
Example 7-8 Halting by using the halt -f command
itsonas2> cf status
Cluster enabled, itsonas1 is up.
itsonas2> cf monitor
current time: 09Apr2011 01:49:12
UP 8+23:34:29, partner 'itsonas1', cluster monitor enabled
VIA Interconnect is up (link 0 up, link 1 up), takeover capability on-line
partner update TAKEOVER_ENABLED (09Apr2011 01:49:12)
itsonas2> halt -f
CIFS local server is shutting down...
CIFS local server has shut down...
Sat Apr 9 01:49:21 GMT-7 [itsonas2: kern.shutdown:notice]: System shut down because : "halt".
Sat Apr 9 01:49:21 GMT-7 [itsonas2: fcp.service.shutdown:info]: FCP service shutdown
Sat Apr 9 01:49:21 GMT-7 [itsonas2: perf.archive.stop:info]: Performance archiver stopped.
Sat Apr 9 01:49:21 GMT-7 [itsonas2: cf.fsm.takeoverOfPartnerDisabled:notice]: Cluster monitor: takeover of itsonas1 disabled (local halt in progress)
Sat Apr 9 01:49:28 GMT-7 [itsonas2: cf.fsm.takeoverByPartnerDisabled:notice]: Cluster monitor: takeover of itsonas2 by itsonas1 disabled (partner halted in notakeover mode)
CFE version 3.1.0 based on Broadcom CFE: 1.0.40
Copyright (C) 2000,2001,2002,2003 Broadcom Corporation.
Portions Copyright (c) 2002-2006 Network Appliance, Inc.
CPU type 0xF29: 2800MHz
Total memory: 0x80000000 bytes (2048MB)
CFE>
The same result can be accomplished by using the cf disable command followed by the halt command. From the CFE prompt or the boot LOADER prompt, depending on the model, the system can be rebooted by using the boot_ontap command.
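A minimal sketch of that alternative sequence, using the hypothetical node itsonas2:
itsonas2> cf disable
itsonas2> halt
CFE> boot_ontap
Disabling cluster failover first prevents the partner from taking over while the node is intentionally halted.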
2. Issue the cf takeover command. Example 7-10 shows the console output during takeover.
Example 7-10 cf takeover command
itsonas2> cf takeover
cf: takeover initiated by operator
itsonas2> Sat Apr 9 02:00:22 GMT-7 [itsonas2: cf.misc.operatorTakeover:warning]: Cluster monitor: takeover initiated by operator
Sat Apr 9 02:00:22 GMT-7 [itsonas2: cf.fsm.nfo.acceptTakeoverReq:warning]: Negotiated failover: accepting takeover request by partner, reason: operator initiated cf takeover. Asking partner to shutdown gracefully; will takeover in at most 180 seconds.
Sat Apr 9 02:00:33 GMT-7 [itsonas2: cf.fsm.firmwareStatus:info]: Cluster monitor: partner rebooting
Sat Apr 9 02:00:33 GMT-7 [itsonas2: cf.fsm.takeoverByPartnerDisabled:notice]: Cluster monitor: takeover of itsonas2 by itsonas1 disabled (interconnect error)
Sat Apr 9 02:00:33 GMT-7 [itsonas2: cf.fsm.nfo.partnerShutdown:warning]: Negotiated failover: partner has shutdown
Sat Apr 9 02:00:33 GMT-7 [itsonas2: cf.fsm.takeover.nfo:info]: Cluster monitor: takeover attempted after 'cf takeover' command
Sat Apr 9 02:00:33 GMT-7 [itsonas2: cf.fsm.stateTransit:warning]: Cluster monitor: UP --> TAKEOVER
Sat Apr 9 02:00:33 GMT-7 [itsonas2: cf.fm.takeoverStarted:warning]: Cluster monitor: takeover started
Sat Apr 9 02:00:33 GMT-7 [itsonas1/itsonas2: coredump.spare.none:info]: No sparecore disk was found.
Sat Apr 9 02:00:34 GMT-7 [itsonas2: nv.partner.disabled:info]: NVRAM takeover: Partner NVRAM was disabled. Replaying takeover WAFL log
Sat Apr 9 02:00:36 GMT-7 [itsonas1/itsonas2: wafl.takeover.nvram.missing:info]: WAFL takeover: No WAFL nvlog records were found to replay.
Sat Apr 9 02:00:36 GMT-7 [itsonas1/itsonas2: wafl.replay.done:info]: WAFL log replay completed, 0 seconds
Sat Apr 9 02:00:36 GMT-7 [itsonas1/itsonas2: fcp.service.startup:info]: FCP service startup
Vdisk Snap Table for host:1 is initialized
Sat Apr 9 02:00:40 GMT-7 [itsonas2 (takeover): cf.fm.takeoverComplete:warning]: Cluster monitor: takeover completed
Sat Apr 9 02:00:40 GMT-7 [itsonas2 (takeover): cf.fm.takeoverDuration:warning]: Cluster monitor: takeover duration time is 7 seconds
Sat Apr 9 02:00:44 GMT-7 [itsonas1/itsonas2: cmds.sysconf.validDebug:debug]: sysconfig: Validating configuration.
Sat Apr 9 02:00:47 GMT-7 [itsonas1/itsonas2: kern.syslogd.restarted:info]: syslogd: Restarted.
Sat Apr 9 02:00:52 GMT-7 [itsonas1/itsonas2: asup.smtp.host:info]: Autosupport cannot connect to host mailhost (Unknown mhost) for message: SYSTEM CONFIGURATION WARNING
Sat Apr 9 02:00:52 GMT-7 [itsonas1/itsonas2: asup.smtp.unreach:error]: Autosupport mail was not sent because the system cannot reach any of the mail hosts from the autosupport.mailhost option. (SYSTEM CONFIGURATION WARNING)
Sat Apr 9 02:01:00 GMT-7 [itsonas2 (takeover): monitor.globalStatus.critical:CRITICAL]: This node has taken over itsonas1.
Sat Apr 9 02:01:00 GMT-7 [itsonas1/itsonas2: monitor.volume.nearlyFull:debug]: /vol/mp3_files is nearly full (using or reserving 97% of space and 1% of inodes, using 97% of reserve).
Sat Apr 9 02:01:00 GMT-7 [itsonas1/itsonas2: monitor.globalStatus.critical:CRITICAL]: itsonas2 has taken over this node.
Sat Apr 9 02:01:03 GMT-7 [itsonas1/itsonas2: nbt.nbns.registrationComplete:info]: NBT: All CIFS name registrations have completed for the partner server.
itsonas2(takeover)>
3. Check the status of the cluster by using the cf status command. Example 7-11 shows that system is in takeover condition, and that the partner controller is waiting for giveback.
Example 7-11 cf status: Verification if takeover completed itsonas2(takeover)> cf status itsonas2 has taken over itsonas1. itsonas1 is ready for giveback.
In the example, the N series itsonas1 rebooted when you ran the cf takeover command. When one N series storage system node is in takeover mode, the partner N series node does not reboot until the cf giveback command is run.
itsonas2/itsonas1> aggr status
Aggr   State   Status
aggr0  online  raid_dp, aggr
itsonas2/itsonas1> partner
Logoff from partner shell: itsonas2
itsonas1(takeover)>
To give back resources, issue the cf giveback command as shown in Example 7-13.
Example 7-13 cf giveback
itsonas1(takeover)> cf status
itsonas1 has taken over itsonas2.
itsonas2 is ready for giveback.
Takeover due to negotiated failover, reason: operator initiated cf takeover
itsonas1(takeover)> cf giveback
itsonas1(takeover)> Tue Apr 12 03:17:11 GMT-7 [itsonas1 (takeover): kern.cli.cmd:debug]: Command line input: the command is 'cf'. The full command line is 'cf giveback'.
Tue Apr 12 03:17:11 GMT-7 [itsonas1 (takeover): cf.misc.operatorGiveback:info]: Cluster monitor: giveback initiated by operator
Tue Apr 12 03:17:11 GMT-7 [itsonas1: cf.fm.givebackStarted:warning]: Cluster monitor: giveback started
CIFS partner server is shutting down...
CIFS partner server has shut down...
Tue Apr 12 03:17:11 GMT-7 [itsonas2/itsonas1: scsitgt.ha.state.changed:debug]: STIO HA State : In Takeover --> Giving Back after 5060 seconds.
Tue Apr 12 03:17:11 GMT-7 [itsonas2/itsonas1: fcp.service.shutdown:info]: FCP service shutdown
Tue Apr 12 03:17:11 GMT-7 [itsonas2/itsonas1: scsitgt.ha.state.changed:debug]: STIO HA State : Giving Back --> Normal after 0 seconds.
Tue Apr 12 03:17:15 GMT-7 [itsonas1: cf.rsrc.transitTime:notice]: Top Giveback transit times raid=2963, wafl=974 {giveback_sync=367, sync_clean=316, forget=254, finish=35, vol_refs=2, mark_abort=0, wait_offline=0, wait_create=0, abort_scans=0, drain_msgs=0}, wafl_gb_sync=301, registry_giveback=35, sanown_replay=24, nfsd=14, java=7, ndmpd=6, httpd=1, ifconfig=1
Tue Apr 12 03:17:15 GMT-7 [itsonas1: asup.msg.giveback.delayed:info]: giveback AutoSupport delayed 5 minutes (until after the giveback process is complete).
Tue Apr 12 03:17:15 GMT-7 [itsonas1: time.daemon.targetNotResponding:error]: Time server '0.north-america.pool.ntp.org' is not responding to time synchronization requests.
Tue Apr 12 03:17:15 GMT-7 [itsonas1: cf.fm.givebackComplete:warning]: Cluster monitor: giveback completed
Tue Apr 12 03:17:15 GMT-7 [itsonas1: cf.fm.givebackDuration:warning]: Cluster monitor: giveback duration time is 4 seconds
Tue Apr 12 03:17:15 GMT-7 [itsonas1: cf.fsm.stateTransit:warning]: Cluster monitor: TAKEOVER --> UP
Tue Apr 12 03:17:16 GMT-7 [itsonas1: cf.fsm.takeoverByPartnerDisabled:notice]: Cluster monitor: takeover of itsonas1 by itsonas2 disabled (unsynchronized log)
Tue Apr 12 03:17:16 GMT-7 [itsonas1: cf.fm.timeMasterStatus:info]: Acting as cluster time slave
Tue Apr 12 03:17:17 GMT-7 [itsonas1: cf.fsm.takeoverOfPartnerDisabled:notice]: Cluster monitor: takeover of itsonas2 disabled (partner booting)
Tue Apr 12 03:17:22 GMT-7 [itsonas1: cf.fsm.takeoverOfPartnerDisabled:notice]: Cluster monitor: takeover of itsonas2 disabled (unsynchronized log)
Tue Apr 12 03:17:23 GMT-7 [itsonas1: cf.fsm.takeoverByPartnerEnabled:notice]: Cluster monitor: takeover of itsonas1 by itsonas2 enabled
Tue Apr 12 03:17:24 GMT-7 [itsonas1: cf.fsm.takeoverOfPartnerEnabled:notice]: Cluster monitor: takeover of itsonas2 enabled
itsonas1>
You can check the HA pair status by issuing the cf status command as shown in Example 7-14.
Example 7-14  cf status: Check for successful giveback
itsonas1> cf status
Cluster enabled, itsonas2 is up.
itsonas1>
Tip: Under normal conditions, you do not need to perform takeover/giveback on an IBM N series system. Usually you need to use it only if a controller must be halted or rebooted for maintenance.

1. As illustrated in Figure 7-6, you can perform the takeover by using System Manager and clicking Active/Active Configuration → Takeover.
2. Figure 7-7 shows the Active/Active takeover wizard step 1. Click Next to continue.
3. Figure 7-8 shows the Active/Active takeover wizard step 2. Click Next to continue.
4. Figure 7-9 shows the Active/Active takeover wizard step 3. Click Finish to continue.
5. Figure 7-10 shows the Active/Active takeover wizard final step where takeover has been run successfully. Click Close to continue.
6. Figure 7-11 shows that System Manager now displays the status of the takeover. The only option at this stage is to perform giveback.
Figure 7-14 shows that System Manager now reports that the systems are back to normal after a successful giveback.
its local A loop shelf count, the system concludes that it is impaired. It then prompts that node's partner to initiate a takeover.
Chapter 8.
MetroCluster
This chapter addresses the MetroCluster feature. This integrated, high-availability, business continuance solution allows the clustering of two N6000 or N7000 storage controllers at distances of up to 100 kilometers. The primary goal of MetroCluster is to provide mission-critical applications with redundant storage services in case of site-specific disasters. By synchronously mirroring data between two sites, it tolerates site-specific disasters with minimal interruption to applications and no data loss. The following topics are covered:
Benefits of using MetroCluster
Synchronous mirroring with SyncMirror
Business continuity with IBM System Storage N series
Implementing MetroCluster
MetroCluster configurations
Prerequisites for MetroCluster usage
SyncMirror setup
Failure scenarios
This chapter includes the following sections:
Overview of MetroCluster
Business continuity solutions
Stretch MetroCluster
Fabric Attached MetroCluster
Synchronous mirroring with SyncMirror
MetroCluster zoning and TI zones
Failure scenarios
MetroCluster is a fully integrated solution that is built on proven technology and designed to be easy to administer. It provides automatic failover to a remote data center to achieve these goals:
Helps protect business continuity in the event of a failure in the primary data center
Helps reduce dependency on IT staff for manual actions
Provides synchronous mirroring up to 100 km
Its data replication capabilities are designed to perform these functions:
Maintain a constantly up-to-date copy of data at a remote data center
Support replication of data from a primary to a remote site to maintain data currency
MetroCluster software provides an enterprise solution for high availability over wide area networks (WANs). MetroCluster deployments of N series storage systems are used for the following functions:
Business continuance
Disaster recovery
Achieving recovery point and recovery time objectives (instant failover); you also have more options regarding recovery point/time objectives in conjunction with other features
MetroCluster technology is an important component of enterprise data protection strategies. If a failure occurs in one location (the local node fails or its disks fail), MetroCluster provides automatic failover to the remaining node. This failover allows access to the data copy (because of SyncMirror) in the second location. A MetroCluster system is made up of the following components:
Two N series storage controllers in an HA configuration: These provide the nodes for serving the data in case of a failure. N62x0 and N7950T systems are supported in MetroCluster configurations, whereas N3x00 is not supported.
MetroCluster VI FC HBA: Used for the cluster interconnect.
SyncMirror license: Provides an up-to-date copy of data at the remote site. Data is ready for access after failover without administrator intervention. This license comes with Data ONTAP Essentials.
MetroCluster/Cluster remote and CFO license: Provides a mechanism to fail over (automatically or administrator driven).
FC switches: Provide storage system connectivity between sites/locations. These are used for Fabric MetroClusters only.
FibreBridges: Required if you are going to use EXN3000 or EXN3500 SAS shelves.
Cables: Multimode fiber optic cables (single-mode cables are not supported).
MetroCluster allows the Active/Active configuration to be spread across data centers up to 100 kilometers apart. During an outage at one data center, the second data center can assume the storage operations that were lost with the original data center. SyncMirror is required as part of MetroCluster to ensure that an identical copy of the data exists in the second data center. If site A goes down, MetroCluster allows you to rapidly resume operations at the remote site minutes after a disaster. SyncMirror is used in MetroCluster environments to mirror data in two locations, as illustrated in Figure 8-2 on page 104. Aggregate mirroring must be between like disk types.
Remember: Since the Data ONTAP 7.3 release, the cluster license and SyncMirror license are part of the base software bundle.
Geographical separation of N series nodes is implemented by physically separating controllers and storage, creating two MetroCluster halves. For distances under 500 m (campus distances), long cables are used to create Stretch MetroCluster configurations. For distances of more than 500 m but less than 100 km (metro distances), a fabric is implemented across the two locations, creating a Fabric MetroCluster configuration. The Cluster_Remote license provides features that enable the administrator to declare a site disaster and initiate a site failover by using a single command. The cf forcetakeover -d command initiates a takeover of the partner even in the absence of a quorum of partner mailbox disks. This command gives the administrator the ability to declare a site-specific disaster and have one node take over its partner's identity without a quorum of disks. Several requirements must be in place to enable takeover in a site disaster:
Root volumes of both storage systems must be synchronously mirrored.
Only synchronously mirrored aggregates are available during a site disaster.
Administrator intervention, that is, issuing the forcetakeover command, is required as a safety precaution against a split-brain scenario.
Attention: Site-specific disasters are not the same as a normal cluster failover.
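As a brief sketch of the sequence (the node names are hypothetical and the output is abbreviated), the surviving node declares the disaster and takes over its partner's identity:

siteB> cf forcetakeover -d
siteB(takeover)> cf status
siteB has taken over siteA.

After the failed site is recovered and the mirrors have been resynchronized, normal operation is restored with the cf giveback command.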
Business continuity options form a continuum: Clustered Failover (CFO) within the data center provides high system protection; Stretch MetroCluster and Fabric MetroCluster provide cost-effective zero-RPO protection; async SnapMirror is the most cost-effective option, with an RPO from 10 minutes to 1 day; and sync SnapMirror provides the most robust zero-RPO protection.
Table 8-1 addresses the differences between synchronous SnapMirror and MetroCluster with SyncMirror.
Table 8-1  Differences between Sync SnapMirror and MetroCluster SyncMirror

Feature                        Synchronous SnapMirror                   MetroCluster (SyncMirror)
Network for replication        Fibre Channel or IP                      Fibre Channel only
Concurrent transfer limited    Yes                                      No
Distance limitation            Up to 200 km (depending on latency)      100 km (Fabric MetroCluster)
Replication between HA pairs   Yes                                      No
Deduplication                  Yes (deduplicated volume and sync        Yes
                               volume cannot be in the same aggregate)
Figure 8-4 Stretch MetroCluster setup with only one stack per site
Stretch MetroCluster has no imposed spindle limits, just the platform limit. Take care in planning N6210 MetroCluster configurations, because the N6210 has only two onboard FC initiator ports and two PCI expansion slots. Because you use one slot for the FC/VI adapter, you have only one remaining slot for an FC initiator card. Because four FC ports are needed for Stretch MetroCluster, two configurations are possible:
Two onboard FC ports + a dual-port FC initiator adapter
A quad-port FC initiator HBA (frees up the onboard FC ports)
Remember that all slots are then in use, and the N6210 cannot be upgraded with other adapters.
Mixed SATA and FC configurations are allowed if the following requirements are met:
There is no intermixing of Fibre Channel and SATA shelves on the same loop.
Mirrored shelves must be of the same type as their parents.
The Stretch MetroCluster heads can have a distance of up to 500 m (at 2 Gbps). Greater distances might be available at lower speeds (check with RPQ/SCORE). Qualified distances are up to 500 m. If you have distances greater than 500 m, choose Fabric MetroCluster. Table 8-2 lists theoretical Stretch MetroCluster distances.
Table 8-2  Theoretical MetroCluster distances in meters

Data rate (Gbps)   OM-2 (50/125 um)   OM-3 (50/125 um)   OM-3+
1                  500                860                1130
2                  300                500                650
4                  150                270                350
Remember: The following are the maximum distances supported for Stretch MetroCluster:
2 Gbps: 500 meters
4 Gbps: 270 meters
8 Gbps: 150 meters
If you decide to use SAS shelves (EXN3000 and EXN3500), you must use FibreBridges. Starting with Data ONTAP 8.1, the EXN3000 (SAS or SATA) and EXN3500 are supported on Stretch MetroCluster (and Fabric MetroCluster as well) through the SAS-FC bridge (FibreBridge). The FibreBridge runs protocol conversion from SAS to FC and enables connectivity between Fibre Channel initiators and SAS storage enclosure devices. This process enables SAS disks to display as LUNs in a MetroCluster fabric. You need a minimum of four FibreBridges (two per stack) in a MetroCluster environment. A sample is shown in Figure 8-6.
Figure 8-6 Cabling a Stretch MetroCluster with FibreBridges and SAS Shelves
For more information about SAS bridges, see Chapter 9, FibreBridge 6500N.
initiator card. Because a minimum of four Fibre Channel ports is needed for the MetroCluster configuration, two configurations are possible:
Two onboard Fibre Channel ports + a dual-port Fibre Channel initiator adapter
A quad-port FC initiator HBA (frees up the onboard Fibre Channel ports)
Remember that all slots are then in use, and the N6210 cannot be upgraded with other adapters. Currently, when using SAS shelves, there is no spindle limit with Fabric MetroCluster and Data ONTAP 8.x. Only the platform spindle limit applies (N62x0 and N7950T), as shown in Table 8-3.
Table 8-3  Maximum number of spindles with DOT 8.x and Fabric MetroCluster

Platform   SAS/SATA spindles (requires FibreBridges)   Maximum number of FC disks
N6210      480                                         480
N6240      600                                         600
N6270      960                                         840 (672 with DOT 7.3.2 or 7.3.4)
N7950T     1176                                        840 (672 with DOT 7.3.2 or 7.3.4)
Requirement: Fabric MetroClusters need four dedicated FC switches in two fabrics. Each fabric must be dedicated to the traffic for a single MetroCluster. No other devices can be connected to the MetroCluster fabric. Beginning with Data ONTAP 8.1, MetroCluster supports shared-switches configuration with Brocade 5100 switches. Two MetroCluster configurations can be built with four Brocade 5100 switches. For more information about shared-switches configuration, see the Data ONTAP High Availability Configuration Guide.
Attention: Always see the MetroCluster Interoperability Matrix on the IBM Support site for the latest information about components and compatibility.
Fabric MetroCluster configurations use Fibre Channel switches as the means to separate the controllers by a greater distance. The switches are connected between the controller heads and the disk shelves, and to each other. Each disk drive or LUN individually logs in to a Fibre Channel fabric. The nature of this architecture requires, for performance reasons, that the two fabrics be dedicated to Fabric MetroCluster. Extensive testing was done to ensure adequate performance with switches included in a Fabric MetroCluster configuration. For this reason, Fabric MetroCluster requirements prohibit the use of any other model or vendor of Fibre Channel switch than the Brocade included with the Fabric MetroCluster. If you decide to use SAS Shelves (EXN3000 and EXN3500) you must use the FibreBridges. Starting with Data ONTAP 8.1, EXN3000 (SAS or SATA) and EXN3500 are supported on Stretch MetroCluster (and Fabric MetroCluster as well) through SAS Fibre Channel bridge (FibreBridge). The FibreBridge runs protocol conversion from SAS to Fibre Channel, and enables connectivity between Fibre Channel initiators and SAS storage enclosure devices.
This process allows SAS disks to display as LUNs in a MetroCluster fabric. You need at least four FibreBridges (minimum is two per stack) in a MetroCluster environment as shown in Figure 8-8.
Figure 8-8 Cabling a Fabric MetroCluster with FibreBridges and SAS Shelves
For more information about SAS bridges, see Chapter 9, FibreBridge 6500N.
Read performance is optimized by performing application reads from both plexes as shown in Figure 8-9.
SyncMirror is used to create aggregate mirrors. When planning for SyncMirror environments, keep in mind the following considerations:
Aggregate mirrors need to be on the remote site (geographically separated)
In normal mode (no takeover), aggregate mirrors cannot be served out
Aggregate mirrors can exist only between like drive types
When the SyncMirror license is installed, disks are divided into pools (pool0: local, pool1: remote/mirror). When a mirror is created, Data ONTAP pulls disks from pool0 for the local aggregate and from pool1 for the mirrored aggregate. Verify the correct number of disks in each pool before creating the aggregates. Any of the following commands can be used, as shown in Example 8-1.
Example 8-1 Verification of SyncMirror
itsosj_n1> sysconfig -r
itsosj_n1> aggr status -r
itsosj_n1> vol status -r

To see the volume/plex/RAID group relationship, use the sysconfig -r command as shown in Example 8-2. Use the aggr mirror command to start mirroring the plexes.
Example 8-2 Viewing the aggregate status
n5500-ctr-tic-1> sysconfig -r
Aggregate aggr0 (online, raid_dp, mirrored) (block checksums)
  Plex /aggr0/plex0 (online, normal, active, pool0)
    RAID group /aggr0/plex0/rg0 (normal)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM   Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ----  ----- ----------------  ----------------
      dparity   0a.16   0a  1     0   FC:A  0   FCAL  15000 136000/278528000  137104/280790184
      parity    0a.17   0a  1     1   FC:A  0   FCAL  15000 136000/278528000  137104/280790184
      data      0a.18   0a  1     2   FC:A  0   FCAL  15000 136000/278528000  137104/280790184
  Plex /aggr0/plex2 (online, normal, active, pool1)
    RAID group /aggr0/plex2/rg0 (normal)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM   Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ----  ----- ----------------  ----------------
      dparity   0c.25   0c  1     9   FC:B  1   FCAL  15000 136000/278528000  137104/280790184
      parity    0c.24   0c  1     8   FC:B  1   FCAL  15000 136000/278528000  137104/280790184
      data      0c.23   0c  1     7   FC:B  1   FCAL  15000 136000/278528000  137104/280790184
Aggregate aggr1 (online, raid4, mirrored) (block checksums)
  Plex /aggr1/plex0 (online, normal, active, pool0)
    RAID group /aggr1/plex0/rg0 (normal)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM   Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ----  ----- ----------------  ----------------
      parity    0a.19   0a  1     3   FC:A  0   FCAL  15000 136000/278528000  137104/280790184
      data      0a.21   0a  1     5   FC:A  0   FCAL  15000 136000/278528000  137104/280790184
      data      0a.20   0a  1     4   FC:A  0   FCAL  15000 136000/278528000  137104/280790184
  Plex /aggr1/plex1 (online, normal, active, pool1)
    RAID group /aggr1/plex1/rg0 (normal)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM   Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ----  ----- ----------------  ----------------
      parity    0c.26   0c  1    10   FC:B  1   FCAL  15000 272000/557056000  274845/562884296
      data      0c.20   0c  1     4   FC:B  1   FCAL  15000 136000/278528000  280104/573653840
      data      0c.29   0c  1    13   FC:B  1   FCAL  15000 136000/278528000  280104/573653840
Pool1 spare disks

RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM   Used (MB/blks)    Phys (MB/blks)
--------- ------  ------------- ---- ---- ----  ----- ----------------  ----------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare     0c.28   0c  1    12   FC:B  1   FCAL  15000 272000/557056000  280104/573653840

Pool0 spare disks

RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM   Used (MB/blks)    Phys (MB/blks)
--------- ------  ------------- ---- ---- ----  ----- ----------------  ----------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare     0a.22   0a  1     6   FC:A  0   FCAL  15000 136000/278528000  137104/280790184

Partner disks

RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM   Used (MB/blks)    Phys (MB/blks)
--------- ------  ------------- ---- ---- ----  ----- ----------------  ----------------
partner   0a.25   0a  1     9   FC:A  1   FCAL  15000 0/0               137104/280790184
partner   0a.27   0a  1    11   FC:A  1   FCAL  15000 0/0               137104/280790184
partner   0a.26   0a  1    10   FC:A  1   FCAL  15000 0/0               137104/280790184
partner   0c.16   0c  1     0   FC:B  0   FCAL  15000 0/0               137104/280790184
partner   0c.21   0c  1     5   FC:B  0   FCAL  15000 0/0               137104/280790184
partner   0c.22   0c  1     6   FC:B  0   FCAL  15000 0/0               137104/280790184
partner   0a.29   0a  1    13   FC:A  1   FCAL  15000 0/0               137104/280790184
partner   0c.17   0c  1     1   FC:B  0   FCAL  15000 0/0               137104/280790184
partner   0c.27   0c  1    11   FC:B  0   FCAL  15000 0/0               137104/280790184
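The following sketch (the aggregate names and disk count are hypothetical) shows the two common ways to establish a SyncMirror relationship: creating an aggregate that is mirrored from the start, or adding a plex to an existing aggregate with the aggr mirror command mentioned above.

itsosj_n1> aggr create aggrnew -m 10
itsosj_n1> aggr mirror aggr1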
You can benefit from using two ISLs per fabric (instead of one ISL per fabric) to separate high-priority cluster interconnect traffic from other traffic. This configuration prevents contention on the back-end fabric, and provides additional bandwidth in some cases. The TI feature is used to enable this separation, and provides better resiliency and performance. Traffic isolation is implemented by using a special zone, called a traffic isolation zone (TI zone). A TI zone indicates the set of ports and ISLs to be used for a specific traffic flow. When a TI zone is activated, the fabric attempts to isolate all interswitch traffic that enters from a member of the zone. The traffic is isolated to only those ISLs that have been included in the zone. The fabric also attempts to exclude traffic not in the TI zone from using ISLs within that TI zone. TI zones are a feature of Fabric OS v6.0.0b that have the following restrictions:
TI zones exist only in the Defined Zoning Configuration
TI zones must be created with Domain,Index notation only
TI zones must include both E_Ports and N_Ports to create a complete, dedicated, end-to-end route from initiator to target
Each fabric is configured to prohibit probing of the FCVI ports by the fabric nameserver. Figure 8-12 shows the dedicated traffic between Domain 1 and Domain 2. Data from system A stays in TI zone 1-2-3-4 and does not pass into TI zone 5-6-7-8. So the traffic is routed on 2-3 for system A and 6-7 for system B.
Figure 8-13 shows an example of TI in a Fabric MetroCluster environment. VI traffic (orange) is separated from data/backend traffic (black) by TI zones.
To continue access, a failover must be performed by the administrator by issuing the cf forcetakeover -d command. Data access is restored because the DC1 mirror was in sync with the DC1 primary.
Through connectivity provided by the fabric switches, all hosts again have access to the required data.
During this period, data access is uninterrupted for all hosts. No automated controller takeover occurs. Both controller heads continue to serve their LUNs/volumes. However, mirroring and failover are disabled, thus reducing data protection. When the interconnect failure is resolved, resyncing of the mirrors occurs.
Attention: If the site failure is staggered in nature and the interconnect fails before the rest of the site, data loss might occur. Data loss occurs because processing continues after the interconnect fails. However, typically site failures occur pervasively and at the same time.
Chapter 9.
FibreBridge 6500N
This chapter contains information about the FC-SAS FibreBridge. The ATTO FibreBridge 6500N provides an innovative bridging solution between the Fibre Channel and SAS protocols. It is used as an FC/SAS bridge for EXN3000 (2857-003) and EXN3500 (2857-006) storage expansion units attached to IBM System Storage N series storage systems in a MetroCluster configuration. The ATTO FibreBridge is a performance-tuned, intelligent protocol translator that allows upstream initiators connected through Fibre Channel to communicate with downstream targets connected through SAS. It is a high-performance bridge that adds 8-gigabit Fibre Channel connectivity to 6-gigabit SAS storage devices. The ATTO FibreBridge provides a complete, highly available connectivity solution for MetroCluster. This chapter includes the following sections:
Description
Architecture
Administration and management
9.1 Description
MetroCluster adds high availability to N series systems, but was previously limited to Fibre Channel drive shelves. Before Data ONTAP 8.1, both SATA and Fibre Channel drive shelves were supported in active/active stretch MetroCluster configurations, although both plexes of the same aggregate had to use the same type of storage. In a fabric MetroCluster configuration, only Fibre Channel drive shelves were supported. Starting with Data ONTAP 8.1, the EXN3000 (SAS or SATA) and EXN3500 are supported on Fabric MetroCluster and on Stretch MetroCluster through the SAS Fibre Channel bridge (FibreBridge). The FibreBridge (shown in Figure 9-1) runs protocol conversion from SAS to Fibre Channel. It enables connectivity between Fibre Channel initiators and SAS storage enclosure devices so that SAS disks display as LUNs in a MetroCluster fabric.
The FibreBridge is only available as part of the MetroCluster solution, and is intended for back-end shelf cabling only.
9.2 Architecture
FibreBridge 6500N bridges are used in MetroCluster systems when SAS disk shelves are used. You can install the bridges by using these methods:
As part of a new MetroCluster installation
As a hot-add to an existing MetroCluster system with SAS or Fibre Channel disk shelves
As a hot-swap to replace a failed bridge
You can also hot-add a SAS disk shelf to an existing stack of SAS disk shelves.
Attention: At the time of writing, Data ONTAP 8.1 has these limitations:
The FibreBridge does not support mixing EXN3000 and EXN3500 in the same stack.
FibreBridge configurations currently do not support SSD drives.
The FibreBridge does not support SNMP.
Table 9-1  Shelf combinations in a FibreBridge stack

Shelf                  EXN3000 (SAS disks)   EXN3000 (SATA disks)   EXN3500 (SAS disks)
EXN3000 (SAS disks)    SAME                  YES                    NO
EXN3000 (SATA disks)   YES                   SAME                   NO
EXN3500 (SAS disks)    NO                    NO                     SAME
The FC-SAS FibreBridge product has the following specifications:
Two 8 Gb/s FC ports (optical SFP+ modules included)
Four 6 Gb/s SAS ports (only one SAS port used)
Dual 100/1000 RJ-45 Ethernet ports
Serial port (RS-232)
1U enclosure, mountable into a standard 19-inch rack
Figure 9-2 provides a view of the bridge ports.
Restriction: Only the SAS port labeled A can be used to connect expansion shelves because SAS port B is disabled. An Ethernet port and a serial port are available for bridge management. At a minimum, MetroCluster requires four FibreBridges, two per stack, with one stack on either site. Therefore, two FibreBridges (one for redundancy) are required per stack of SAS shelves. Current maximum is 10 SAS shelves per stack of SAS or SATA disks. A sample cabling diagram is provided in Figure 9-3.
The normal platform spindle limits apply to the entire MetroCluster configuration. Because each controller sees all storage, the platform spindle limit applies to the entire configuration rather than to each controller. For example, if the spindle limit for an N series N62x0 is n, then despite the two controllers, the spindle limit for an N62x0 fabric MetroCluster configuration remains n. Figure 9-4 shows an example of an N series Stretch MetroCluster environment. Fibre Channel ports of the N series nodes are connected to the Fibre Channel ports on the FibreBridge (FC1 and FC2). SAS ports of the first and last shelf in a stack are connected to the SAS ports (SAS port A) on the FibreBridge. Each stack has two bridges. MetroCluster uses at least four FibreBridges.
Figure 9-5 shows an example of a Fabric MetroCluster that uses FibreBridges to connect to SAS disk shelves. Each of the two nodes connects through four Fibre Channel links to the SAN fabrics for data traffic plus two additional Fibre Channel links intended for VI traffic. Each of the FibreBridges is connected with one link per bridge to the SAN. The first and last SAS shelves in a stack are each connected through one SAS link to a bridge.
[Figure 9-5: the N series heads connect through the SAN switches to the FibreBridges. Each FibreBridge exposes Ethernet (Et1), Fibre Channel (FC1, FC2), and SAS (SAS A) ports; the SAS ports connect to the IOM A and IOM B modules of the first and last shelves in each stack.]
N series gateway configurations do not use the FibreBridge. Storage is presented through FCP as LUNs from whatever back-end array the gateway head is front ending.
Install an ATTO-supported web browser so that you can use the ATTO ExpressNAV GUI. The most effective browsers are Internet Explorer 8 and Mozilla Firefox 3. The ATTO FibreBridge 6500N Installation and Operation Manual contains a list of supported web browsers. The FibreBridge has the following environmental specifications:
Power consumption: 55 W (110 V, 0.5 A / 220 V, 0.25 A)
Input: 85-264 VAC, 1 A, 47-63 Hz
BTU: 205 BTU/hr
Weight: 8.75 lbs
Operating environment: temperature 5-40 C at up to 10,000 feet, humidity 10-90%
Thermal monitoring possible
Front-to-rear cooling
Monitoring options for the device include:
Event Management System (EMS) messages and AutoSupport messages
Data ONTAP commands such as storage show bridge -v
FibreBridge commands such as DumpConfiguration
The FibreBridge does not support SNMP in the DOT 8.1 release.
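For example, the bridges can be checked from the controller console (a sketch only; the output, which lists each bridge with its firmware and status details, is omitted here):

itso> storage show bridge -v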
Chapter 10.
10.1 Background
In this chapter, the term volume, when used alone, is defined to mean both traditional volumes and aggregates. Data ONTAP volumes have two distinct versions:
Traditional volumes
Virtual volumes, called FlexVols
FlexVols offer flexible and unparalleled functionality, housed in a construct known as an aggregate. For more information about FlexVol and thin provisioning, see N series Thin Provisioning, REDP-4747, at:
http://www.redbooks.ibm.com/abstracts/redp4747.html?Open
Traditional single-parity RAID technology offers protection from a single disk drive failure. If a secondary event occurs during reconstruction, the RAID array might experience data corruption or a lost volume. The single-parity RAID solution can improve performance, but presents a greater risk of data loss. Select the solution carefully so that it complies with your organization's policies and application-specific requirements. Although disk drive technology has increased capacities and improved seek times, it has not delivered a matching improvement in reliability. In addition, bit error rates have actually increased. The result is an increase in potential uncorrectable bit errors, and reduced confidence that traditional single-parity RAID can adequately protect data. Today, traditional RAID is stretching past its limitations. By increasing fault tolerance against various disk failures and using block-level striping, double-parity distribution provides the RAID data protection called RAID Double Parity (RAID-DP), illustrated in Figure 10-1. RAID-DP is available on the entire IBM System Storage N series data storage product line.
[Figure 10-1 RAID-DP overview: survives any two-disk-failure scenario. Compared to single-parity RAID, RAID-DP has better protection (>4,000 MTTDL), equal and often better performance, and the same capacity overhead (typically 1 parity per 6 data drives). RAID-DP outperforms any other double-parity offering. Combined with SyncMirror (RAID 1), N series storage systems are designed to survive failure of any five disks in one disk protection group.]
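As a brief sketch of the FlexVol construct (the aggregate and volume names and sizes are hypothetical), a FlexVol is carved out of an aggregate rather than being bound to specific disks:

itso> aggr create aggr1 16
itso> vol create flexvol1 aggr1 100g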
centers, and smaller disks result in less capacity per square foot. Also, storage vendors are forced to offer products based on what disk manufacturers are supplying, and smaller disks are not readily available, if at all. The second way to protect data on larger disks with single-parity RAID is slightly more practical, but still not effective for various reasons. By keeping the size of arrays or volumes small, the time to reconstruct is reduced: an array or volume built with more disks takes longer to reconstruct data from one failed disk than one built with fewer disks. However, smaller arrays and volumes have two costs that cannot be overcome:
1. Additional disks are lost to parity, thus reducing usable capacity and increasing total cost of ownership (TCO).
2. Performance is generally slower with smaller arrays, aggregates, and volumes, affecting business and users.
The most reliable protection offered by single-parity RAID is RAID 1, or mirroring. In RAID 1, the mirroring process replicates an exact copy of all data on an array, aggregate, or volume to a second array or volume. Although RAID 1 mirroring affords maximum fault tolerance from disk failure, the cost of the implementation is severe: RAID 1 requires twice the disk capacity to store the same amount of data. Using smaller arrays and volumes to improve fault tolerance increases the total cost of ownership of storage because of less usable capacity per dollar spent. A RAID 1 mirror, with its requirement for double the amount of capacity, is the most expensive type of storage solution, with the highest total cost of ownership (Figure 10-3).
RAID-DP significantly increases the fault tolerance from failed disk drives over traditional RAID. Based on the standard mean time to data loss (MTTDL) formula, RAID-DP is about 10,000 times more reliable than single-parity RAID on the same underlying disk drives. With this level of reliability, RAID-DP offers significantly better data protection than RAID 1 mirroring, but at RAID 4 pricing. RAID-DP offers businesses the most compelling TCO storage option without putting their data at increased risk.
Figure 10-7 represents a traditional RAID 4 group that uses row parity. It consists of four data disks (the first four columns, labeled D) and the single row parity disk (the last column, labeled P). The rows represent the standard 4 KB blocks used by the traditional RAID 4 implementation. The second row is populated with sample data in each 4 KB block. Parity calculated for data in the row is then stored in the corresponding block on the parity disk. In this case, the way parity is calculated is to add the values in each of the horizontal blocks. That sum is stored as the parity value (3 + 1 + 2 + 3 = 9). In practice, parity is calculated by an exclusive OR (XOR) process, but addition is fairly similar and works as well for the purposes of this example. If you need to reconstruct data from a single failure, the process used to generate parity is reversed. If the first disk fails, RAID 4 re-creates the data value 3 in the first column. It subtracts the values on the remaining disks from what is stored in parity (9 - 3 - 2 - 1 = 3). This example of reconstruction with single-parity RAID shows why data is protected up to, but not beyond, one disk failure event.
The diagonal parity stripe was calculated by using the addition approach for this example, rather than the XOR used in practice. It was then stored on the second parity disk (1 + 2 + 2 + 7 = 12). Note that the diagonal parity stripe includes an element from row parity as part of its diagonal parity sum. RAID-DP treats all disks in the original RAID 4 construct, including both data and row parity disks, the same. Figure 10-9 adds in the rest of the data for each block and creates corresponding row and diagonal parity stripes.
Figure 10-9 Block representation of RAID-DP corresponding with row and diagonal parity
One RAID-DP condition that is apparent from Figure 10-9 is that the diagonal stripes wrap at the edges of the row parity construct. The following are important conditions for RAID-DP's ability to recover from double disk failures: each diagonal parity stripe misses one (and only one) disk, and each diagonal misses a different disk. Figure 10-9 also illustrates one omitted diagonal parity stripe (white blocks) that is not stored on the second diagonal parity disk.
Omitting the one diagonal stripe does not affect RAID-DP's ability to recover all data in a double-disk failure, as illustrated in the reconstruction example. The same RAID-DP diagonal parity conditions covered in this example hold true in real storage deployments, even deployments that involve dozens of disks in a RAID group and millions of rows of data written horizontally across the RAID 4 group. Recovery of larger RAID groups works the same, regardless of the number of disks in the RAID group. RAID-DP, based on proven mathematical theorems, provides the ability to recover all data in the event of a double-disk failure. You can explore this further in two ways:
1. For more information about the mathematical theorems and proofs used in RAID-DP, see the Double Disk Failure Correction document available at the USENIX Organization website:
http://www.usenix.org
2. Go through the double-disk failure and subsequent recovery process presented in 10.4.4, RAID-DP reconstruction on page 137.
When engaged after a double-disk failure, RAID-DP first begins looking for a chain to begin reconstruction with. In this case, the first diagonal parity stripe in the chain that it finds is represented by the blue series of diagonal blocks. Remember that when reconstructing data for a single disk failure under RAID 4, no more than one element can be missing or failed. If an additional element is missing, data loss is inevitable.
With this in mind, traverse the blue series diagonal blocks in Figure 10-10 on page 137. Notice that only one of the five blue series blocks is missing. With four out of five elements available, RAID-DP has all of the information needed to reconstruct the data in the missing blue series block. Figure 10-11 shows that this data is recovered onto an available hot spare disk.
The data has been re-created from the missing diagonal blue block by using the same arithmetic addressed earlier (12 - 7 - 2 - 2 = 1). Now that the missing blue series diagonal information has been re-created, the recovery process switches from using diagonal parity to using horizontal row parity. Specifically, the top row after the blue block re-creates the missing diagonal block. There is now enough information available to reconstruct the single missing horizontal gray block in column 1, row 1, disk 3 parity (9 - 3 - 2 - 1 = 3). This process is shown in Figure 10-12.
The algorithm continues determining whether additional diagonal blocks can be re-created. The upper left block is re-created from row parity, and RAID-DP can proceed to re-create the gray diagonal block in column two, row two. See Figure 10-13.
RAID-DP recovers the gray diagonal block in column two, row two. Adequate information is now available for row parity to re-create the missing horizontal white block (value 1) in the first column, row two (Figure 10-14).
As noted earlier, the white diagonal stripe is not stored, and no additional diagonal blocks can be re-created in the existing chain. RAID-DP continues to search for a new chain to start re-creating diagonal blocks. In this example, the procedure determines that it can re-create missing data in the gold stripe, as shown in Figure 10-15.
After RAID-DP re-creates a missing diagonal block, the process again switches to re-creating a missing horizontal block from row parity. When the missing diagonal block in the gold stripe is re-created, enough information is available to re-create the missing horizontal block from row parity, as shown in Figure 10-16.
After the missing block in the horizontal row is re-created, reconstruction switches back to diagonal parity to re-create a missing diagonal block. RAID-DP can continue in the current chain on the red stripe, as shown in Figure 10-17.
Again, after the recovery of a diagonal block, the process switches back to row parity because it has enough information to re-create the data for the one horizontal block. At this point in the double-disk failure scenario, all data has been re-created with RAID-DP, as shown in Figure 10-18.
consist of four concurrent disk failures followed by a bad block or bit error before reconstruction is completed.
Figure 10-20 vol status command showing itso volume as traditional RAID4 volume
When the command is entered, the aggregate or, as in the following examples, traditional volumes are instantly denoted as RAID-DP. However, all diagonal parity stripes still need to be calculated and stored on the second parity disk. Figure 10-21 shows using the command to convert the volume.
In these examples, when the volume itso is changed, the aggregates pl_install and TPC are also converted to RAID-DP when the command is run. Protection against double disk failure is not available until all diagonal parity stripes are calculated and stored on the diagonal parity disk. Figure 10-22 shows a reconstruct status that signifies that diagonal parity creation is in progress.
Figure 10-22 itso volume in reconstruct status during conversion to RAID-DP
Calculating the diagonals as part of a conversion to RAID-DP takes time, and slightly affects performance on the storage controller. The amount of time and the performance effect of a conversion to RAID-DP depend on the storage controller and how busy it is during the conversion. Generally, run conversions to RAID-DP during off-peak hours to minimize the potential performance effect on business or users. Conversions from RAID4 to RAID-DP have certain requirements. Conversions at the aggregate or traditional volume level require an available disk for the second (diagonal) parity disk of each RAID4 group. The size of the disks used for diagonal parity needs to be at least the size of the original RAID4 row parity disks. In the example, the volume itso is altered from RAID4 to RAID-DP.
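The conversion itself is a single option change. The following sketch (the volume and aggregate names are hypothetical) shows the form the commands take on a 7-mode system:

itso> vol options itso raidtype raid_dp
itso> aggr options aggr1 raidtype raid_dp

Converting back to RAID4 uses the same option with raidtype raid4.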
Figure 10-25 shows the conversion of itso back to RAID4. In this case, the conversion is instantaneous because the old RAID4 row parity construct is still in place as a subsystem in RAID-DP.
Figure 10-26 shows the completed process. If a RAID-DP group is converted to RAID4, each RAID group's second (diagonal) parity disk is released and put back into the spare disk pool.
[Figure: RAID-DP protection of Vol_1, which is built from RAID groups rg0-rg3; each RAID group contains data drives, a parity drive, and a dParity drive, with hot spares available outside the RAID groups.]
Tip: You need at least one spare disk available per aggregate, but no more than three. In addition, the available spares need to include at least one disk of each disk size and disk type installed in your storage system. This configuration allows the storage system to use a disk of the same size and type as a failed disk when reconstructing it. If a disk fails and a hot spare disk of the same size is not available, the storage system uses a spare disk of the next available size up.
During a disk failure, the storage system replaces the failed disk with a spare and reconstructs data. If a disk fails, the storage system performs these actions:
1. The storage system replaces the failed disk with a hot spare disk. If RAID-DP is enabled and a double-disk failure occurs in the RAID group, the storage system replaces each failed disk with a separate spare disk. Data ONTAP first attempts to use a hot spare disk of the same size as the failed disk. If no disk of the same size is available, Data ONTAP replaces the failed disk with a spare disk of the next available size up.
2. The storage system reconstructs, in the background, the missing data onto the hot spare disks. 3. The storage system logs the activity in the /etc/messages file on the root volume. With RAID-DP, these processes can be carried out even in the event of simultaneous failure of two disks in a RAID group. During reconstruction, file service can slow down. After the storage system is finished reconstructing data, replace the failed disks with new hot spare disks as soon as possible. Hot spare disks must always be available in the system.
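To verify that suitable spares are available before or after a failure, the spare inventory can be listed per controller (a sketch; output omitted):

itso> aggr status -s
itso> vol status -s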
Chapter 11.
Core technologies
This chapter addresses N series core technologies such as the WAFL file system, disk structures, and NVRAM access methods. This chapter includes the following sections:
Write Anywhere File Layout (WAFL)
Disk structure
NVRAM and system memory
Intelligent caching of write requests
N series read caching techniques
WAFL also includes the necessary file and directory mechanisms to support file-based storage, and the read and write mechanisms to support block storage or LUNs. Notice that the protocol access layer is above the data placement layer of WAFL. This layer allows all of the data to be effectively managed on disk independently of how it is accessed by the host. This level of storage virtualization offers significant advantages over other architectures that have a tight association between the network protocol and data. To improve performance, WAFL attempts to avoid having the disk head write data and then move to a special portion of the disk to update the inodes, which contain the metadata. This movement across the physical disk medium increases the write time. Head seeks happen quickly, but on server-class systems there are thousands of disk accesses per second. This additional time adds up quickly, and greatly affects the performance of the system, particularly on write operations. WAFL does not have that handicap, and writes the metadata in line with the rest of the data. Write anywhere refers to the file system's capability to write any class of data at any location on the disk. The basic goal of WAFL is to write to the first best available location. First is the closest available block. Best is the same address block on all disks, that is, a complete stripe. The first best available location is always going to be a complete stripe across an entire RAID group that
uses the least amount of head movement to access. That is arguably the most important criterion for choosing where WAFL is going to locate data on a disk. Data ONTAP has control over where everything goes on the disks, so it can decide on the optimal location for data and metadata. This fact has significant ramifications for the way Data ONTAP does everything, but particularly in the operation of RAID and the operation of Snapshot technology.
To write new data into a RAID stripe that already contains data (and parity), you must read the parity block. You then calculate a new parity value for the stripe, and write the data block plus the new parity block. This process adds a significant amount of extra work for each block to be written. The N series reduces this penalty by buffering NVRAM-protected writes in memory, and then writing full RAID stripes plus parity whenever possible. This process makes reading parity data before writing unnecessary, and requires only a single parity calculation for a full stripe of data blocks. WAFL does not overwrite existing blocks when they are modified, and it can write
data and metadata to any location. In other data layouts, modified data blocks are usually overwritten, and metadata is often required to be at fixed locations. This approach offers much better write performance, even for double-parity RAID (RAID 6). Unlike other RAID 6 implementations, RAID-DP performs so well that it is the default option for N series storage systems. Tests show that random write performance declines only 2% versus the N series RAID 4 implementation. By comparison, another major storage vendor's RAID 6 random write performance decreases by 33% relative to RAID 5 on the same system. RAID 4 and RAID 5 are both single-parity RAID implementations. RAID 4 uses a designated parity disk. RAID 5 distributes parity information across all disks in a RAID group.
acknowledgement from the storage system that a write has been completed. To reply to a write request, a storage system without any NVRAM must run these steps:
a. Update its in-memory data structures
b. Allocate disk space for new data
c. Wait for all modified data to reach disk
A storage system with an NVRAM write cache runs the same steps, but copies modified data into NVRAM instead of waiting for disk writes. Data ONTAP can reply to a write request much more quickly because it needs only to update its in-memory data structures and log the request. It does not have to allocate disk space for new data or copy modified data and metadata to NVRAM. Journaling all write data immediately and acknowledging the client or host not only improves response times, but also gives Data ONTAP more time to schedule and optimize disk writes. Storage systems that cache writes in the disk driver layer must accelerate processing in all the intervening layers to provide a quick response to the host or client. This requirement gives them less time to optimize. For more information about how Data ONTAP benefits from NVRAM, see the following document:
http://www.redbooks.ibm.com/abstracts/redp4086.html?Open
calculations, parity calculations, and gathers enough data to write a full stripe across the entire RAID group. A sample client request is displayed in Figure 11-3.
WAFL never holds data longer than 10 seconds before it establishes a consistency point (CP). At least every 10 seconds, WAFL takes the contents of NVRAM and commits it to disk. As soon as a write request is committed to a block on disk, WAFL clears it from the journal. On a system that is lightly loaded, an administrator can actually see the 10-second CPs happen: every 10 seconds, the lights cascade across the system. Most systems run with a heavier load than that, and CPs happen at shorter intervals, depending on the system load. NVRAM does not cause a performance bottleneck. The response time of RAM and NVRAM is measured in microseconds. Disk response times are always in milliseconds: it takes a few milliseconds for a disk to respond to an I/O. Disks therefore are always the performance bottleneck of any storage system, because they are radically slower than any other component in the system. When a system starts committing back-to-back CPs, the disks are taking writes as fast as they possibly can. That is a platform limit for that system. To improve performance when the platform limit is reached, you can spread the traffic across more heads or upgrade the head to a system with greater capacity. NVRAM could function faster if the disks could keep up. For more information about the technical details of N series RAID-DP, see this document:
http://www.redbooks.ibm.com/abstracts/redp4169.html?Open
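On a running system, CP behavior can be observed with the sysstat command; the extended view includes CP time and CP type columns (a sketch; output omitted):

itso> sysstat -x 1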
applications require additional disk spindles to achieve optimum performance even when the additional capacity is not needed.
writes overflow the cache and cause other, more valuable data to be ejected. However, some read-modify-write type workloads benefit from caching recent writes. Examples include stock market simulations and some engineering applications.
Sequential reads: Sequential reads can often be satisfied by reading a large amount of contiguous data from disk at one time. In addition, as with writes, caching large sequential reads can cause more valuable data to be ejected from system cache. Therefore, it is preferable to read such data from disk and preserve the available read cache for data that is more likely to be read again. The N series provides algorithms to recognize sequential read activity and read data ahead, making it unnecessary to retain this type of data in cache with a high priority.
Metadata: Metadata describes where and how data is stored on disk (name, size, block locations, and so on). Because metadata is needed to access user data, it is normally cached with high priority to avoid the need to read metadata from disk before every read and write.
Small, random reads: Small, random reads are the most expensive disk operation because they require a higher number of head seeks per kilobyte than sequential reads. Head seeks are a major source of the read latency associated with reading from disk. Therefore, data that is randomly read is a high priority for caching in system memory.
The default caching behavior for the Data ONTAP buffer cache is to prioritize small, random reads and metadata over writes and sequential reads.
Chapter 12.
Flash Cache
This chapter provides an overview of Flash Cache and all of its components. This chapter includes the following sections: About Flash Cache Flash Cache module How Flash Cache works
12.3.2 Data ONTAP clearing space in the system memory for more data
When more space is needed in memory, Data ONTAP analyzes what it currently holds and looks for the lowest-priority data to clear out to make more space. Depending on the workload, this data might be in system memory for seconds or hours. Either way, it must be cleared, as shown in Figure 12-3 on page 160.
When it is there, access to it is far faster than having to go to disk. This process is how a workload is accelerated (Figure 12-6).
[Figure 12-6 decision flow labels: Data in module? / Not in module? / Need space in system memory? / Read from module.]
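In 7-mode, Flash Cache behavior is controlled through the flexscale options. The following sketch shows the common settings being adjusted, with usage then monitored through the flexscale-access statistics (option availability varies by Data ONTAP version, so treat this as illustrative):

itso> options flexscale.enable on
itso> options flexscale.normal_data_blocks on
itso> options flexscale.lopri_blocks off
itso> stats show -p flexscale-access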
Chapter 13.
Disk sanitization
This chapter addresses disk sanitization and the process of physically removing data from a disk. This process involves overwriting patterns on the disk in a manner that precludes the recovery of that data by any known recovery methods. It also presents the Data ONTAP disk sanitization feature and briefly addresses data confidentiality, technology drivers, costs and risks, and the sanitizing operation. This chapter includes the following sections: Data ONTAP disk sanitization Data confidentiality Data ONTAP sanitization operation Disk Sanitization with encrypted disks
13.2.1 Background
Data confidentiality has always been an issue of ethical concern. But with the enactment of laws to protect the privacy of individual health and financial records, it has become a legal concern as well. Most IT managers have a strategy in place for securing customer information within their networks, especially in the healthcare industry, where controlling data interchange with vendors to ensure patient privacy is a major concern. The market offers various products and services to assist managers with these challenges. Many offer ways to integrate confidentiality and compliance into daily operations.
Purging is the process of preventing the retrieval of information from the erased media by using all known techniques, including specialist laboratory tools. This level of security is achieved by securely erasing the physical media by using firmware-level tools. Destruction, as the name implies, is the physical destruction of the decommissioned media. This level of security is usually only required in defense or other high security environments.
Attention: Do not turn off the storage system, disrupt the storage connectivity, or remove target disks while sanitizing. If sanitizing is interrupted while target disks are being formatted, the disks must be reformatted before sanitizing can finish. If you need to cancel the sanitization process, use the disk sanitize abort command. If the specified disks are undergoing the disk formatting phase of sanitization, the abort does not occur until the disk formatting is complete. At that time, Data ONTAP displays a message that the sanitization was stopped. If the sanitization process is interrupted by power failure, system panic, or by the user, the sanitization process must be repeated from the beginning.
Example 13-2 shows the progress of disk sanitization, starting with sanitization on drives 8a.43, 8a.44 and 8a.45. The process then formats these drives and writes a pattern (hex 0x47) multiple times (cycles) to the disks.
Example 13-2  Disk sanitization progress
Tue Jun 24 02:40:10 Disk sanitization initiated on drive 8a.43 [S/N 3FP20XX400007313LSA8]
Tue Jun 24 02:40:10 Disk sanitization initiated on drive 8a.44 [S/N 3FP0RFAZ00002218446B]
Tue Jun 24 02:40:10 Disk sanitization initiated on drive 8a.45 [S/N 3FP0RJMR0000221844GP]
Tue Jun 24 02:53:55 Disk 8a.44 [S/N 3FP0RFAZ00002218446B] format completed in 00:13:45.
Tue Jun 24 02:53:59 Disk 8a.43 [S/N 3FP20XX400007313LSA8] format completed in 00:13:49.
Tue Jun 24 02:54:04 Disk 8a.45 [S/N 3FP0RJMR0000221844GP] format completed in 00:13:54.
Tue Jun 24 02:54:11 Disk 8a.44 [S/N 3FP0RFAZ00002218446B] cycle 1 pattern write of 0x47 completed in 00:00:16.
Tue Jun 24 02:54:11 Disk sanitization on drive 8a.44 [S/N 3FP0RFAZ00002218446B] completed.
Tue Jun 24 02:54:15 Disk 8a.43 [S/N 3FP20XX400007313LSA8] cycle 1 pattern write of 0x47 completed in 00:00:16.
Tue Jun 24 02:54:15 Disk sanitization on drive 8a.43 [S/N 3FP20XX400007313LSA8] completed.
Tue Jun 24 02:54:20 Disk 8a.45 [S/N 3FP0RJMR0000221844GP] cycle 1 pattern write of 0x47 completed in 00:00:16.
Tue Jun 24 02:54:20 Disk sanitization on drive 8a.45 [S/N 3FP0RJMR0000221844GP] completed.
Tue Jun 24 02:58:42 Disk sanitization initiated on drive 8a.43 [S/N 3FP20XX400007313LSA8]
Tue Jun 24 03:00:09 Disk sanitization initiated on drive 8a.32 [S/N 43208987]
Tue Jun 24 03:11:25 Disk 8a.32 [S/N 43208987] cycle 1 pattern write of 0x47 completed in 00:11:16.
Tue Jun 24 03:12:32 Disk 8a.43 [S/N 3FP20XX400007313LSA8] sanitization aborted by user.
Tue Jun 24 03:22:41 Disk 8a.32 [S/N 43208987] cycle 2 pattern write of 0x47 completed in 00:11:16.
Tue Jun 24 03:22:41 Disk sanitization on drive 8a.32 [S/N 43208987] completed.
The sanitization process can take a long time. To view the progress, use the disk sanitize status command as shown in Example 13-3.
Example 13-3  disk sanitize status command
itsotuc4*> disk sanitize status
sanitization for 0c.24 is 10 % complete
The disk sanitize release command allows the user to return a sanitized disk to the spare pool. The disk sanitize abort command is used to terminate the sanitization process for the specified disks:
disk sanitize abort <disk_list>
If the disk is in the format stage, the process is canceled when the format is complete. A message is displayed when the format and the cancellation are complete.
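Putting the commands together, a complete run might look like the following sketch (the disk names, pattern, and cycle count are hypothetical; the disk sanitization feature must be licensed, and a sanitized disk must be released before it can be reused):

itso*> disk sanitize start -p 0x47 -c 3 8a.43 8a.44 8a.45
itso*> disk sanitize status
itso*> disk sanitize release 8a.43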
Chapter 14.
Designing an N series solution
Percentage mix of read and write operations
Percentage mix of random and sequential operations
I/O sizes
Working set sizes for random I/O
Latency requirements
Background tasks running on the storage system (for example, SnapMirror)
Tip: Always size a storage system to have reserve capacity beyond what is expected to be its normal workload.
Some systems use a third option, where they define 1 GB as 1000 x 1024 x 1024 bytes. This conversion between binary and decimal units causes most of the capacity apparently lost when calculating the correct size of capacity in an N series design. The two methods represent the same capacity, a bit like measuring distance in kilometers or miles; the confusion comes from using the incorrect suffix.
For more information, see the following website:
http://en.wikipedia.org/wiki/Gigabyte
Remember: This document uses decimal values exclusively, so 1 MB = 10^6 bytes.
Raw capacity
Raw capacity is determined by taking the number of disks connected and multiplying by their capacity. For example, 24 disks (the maximum in the IBM System Storage N series disk shelves) times 2 TB per drive is a raw capacity of approximately 48,000 GB, or 48 TB.
Usable capacity
Usable capacity is determined by factoring out the portion of the raw capacity that goes to support the infrastructure of the storage system. This capacity includes space used for operating system information, disk drive formatting, file system formatting, RAID protection, spare disk allocation, mirroring, and the Snapshot protection mechanism. The following example shows where the storage goes in the example 24 x 2 TB drive system. Capacity usually gets used in the following areas:
Disk ownership: In an N series dual-controller (active/active) cluster, the disks are assigned to one or the other controller. In the example 24-disk system, the disks are split evenly between the two controllers (12 disks each).
Spare disks: It is good practice to allocate spare disk drives to every system. These drives are used if a disk drive fails so that the data on the failed drive can automatically be rebuilt without any operator intervention or downtime. The minimum acceptable practice is to allocate one spare drive, per drive type, per controller head. In the example, that would be two disks because it is a two-node cluster.
RAID: When a drive fails, it is the RAID information that allows the lost data to be recovered.
RAID-4: Protects against a single disk failure in any RAID group, and requires that one disk is reserved for RAID parity information (not user data). Because disk capacities have increased greatly over time, with a corresponding increase in the risk of an error during the RAID rebuild, do not use RAID-4 for production use. The remaining 11 drives (per controller), divided into 2 x RAID-4 groups, require two disks to be reserved for RAID-4 parity, per controller.
RAID-DP: Protects against a double disk failure in any RAID group, and requires that two disks be reserved for RAID parity information (not user data). With the IBM System Storage N series, the maximum protection against loss is provided by using the RAID-DP facility. RAID-DP has many thousands of times better availability than traditional RAID-4 (or RAID-5), often for little or no additional capacity. The remaining 11 drives (per controller), allocated to 1 x RAID-DP group, require two disks to be reserved for RAID-DP parity, per controller.
The RAID groups are combined to create storage aggregates that then have volumes (also called file systems) or LUNs allocated on them. Normal practice would be to treat the nine remaining disks (per controller) as data disks, thus creating a single large aggregate on each controller.
All 24 available disks are now allocated:
- Spare disk drives: 2 (1 per controller)
- RAID parity disks: 4 (2 per controller)
- Data disks: 18 (9 per controller)
About 25% of the raw capacity is used by hardware protection. This amount varies depending on the ratio of data disks to protection disks. Beyond this point, the remaining usable capacity becomes less deterministic because of an ever-increasing number of variables, but a few firm guidelines are still available. A short sketch of the allocation arithmetic follows.
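The following Python sketch (illustrative only) reproduces the allocation described above for the 24 x 2 TB example:

# Disk allocation for the example 24 x 2 TB dual-controller system.
total_disks    = 24
controllers    = 2
spares         = 1 * controllers   # one spare per controller
raid_dp_parity = 2 * controllers   # two parity disks per RAID-DP group, one group per controller
data_disks     = total_disks - spares - raid_dp_parity

raw_tb     = total_disks * 2                             # 48 TB raw
protection = (spares + raid_dp_parity) / total_disks     # fraction used for protection
print(data_disks, raw_tb, f"{protection:.0%}")           # 18 48 25%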
Right-sizing
A commonly misunderstood capacity overhead is the one imposed by the right-sizing process. This overhead results from three main factors:
- Block leveling: Disks from different batches (or vendors) can contain a slightly different number of addressable blocks. Therefore, the N series controller assigns a common maximum capacity across all drives of the same basic type. For example, this process makes all 1 TB disks exactly equal. Block leveling has a negligible capacity overhead because disks of the same type are already similar.
- Decimal to binary conversion: Because disk vendors measure capacity in decimal units and array vendors usually work in binary units, the stated usable capacity differs. However, no capacity is really lost because both measurements refer to the same number of bytes. For example, 1000 GB decimal = 1,000,000,000,000 bytes = 931 GB binary.
- Checksums for data integrity: Fibre Channel (FC) disks natively use 520-byte sectors, of which only 512 bytes store user data. The remaining 8 bytes per sector store a checksum value, which imposes a minimal capacity overhead. SATA disks natively use 512-byte sectors, all of which store user data. Therefore, one sector is reserved per eight data sectors to store the checksum value, which imposes a higher capacity overhead than for FC disks.
Table 14-2 Right-sized disk capacities

Disk type  Capacity (decimal GB)  Capacity (binary GB)  Checksum type                        Right-sized capacity (binary GB)
FC         72                     68                    512/520 block (approximately 2.4%)   66
FC         144                    136                   512/520 block (approximately 2.4%)   132
FC         300                    272                   512/520 block (approximately 2.4%)   265
FC         600                                          512/520 block (approximately 2.4%)
As a result, the example 2000 GB (decimal) disk drives are down to only a little under 1500 GB (binary) before any user data is put on them. If you take the nine data drives per controller and allocate them to a single large volume, the resulting capacity is approximately 13,400 GB (binary) (Figure 14-1).
The example in Figure 14-1 is for a small system. The ratio of usable to raw capacity varies depending on factors such as RAID group size, disk type, and space efficiency features that can be applied later. Examples of these features include thin provisioning, deduplication, compression, and Snapshot backup.
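The per-drive arithmetic behind this result can be approximated as follows. The checksum ratio and the 10% file-system reserve used in this Python sketch are illustrative assumptions chosen to reproduce the approximate figures in the text; they are not exact Data ONTAP values:

# Approximate usable capacity of one 2 TB SATA data drive (illustrative factors).
decimal_gb = 2000
binary_gb  = decimal_gb * 1000**3 / 1024**3   # ~1862.6 GB (binary)
after_checksum = binary_gb * 8 / 9            # SATA checksum: 1 sector reserved per 8 data sectors
after_reserve  = after_checksum * 0.90        # assumed ~10% file-system reserve

print(round(after_reserve))      # ~1490 GB usable per drive
print(round(9 * after_reserve))  # ~13,400 GB for the nine data drives per controller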
Enterprise applications
Although enterprise applications were previously the domain of direct-attached storage (DAS) architectures, it is now much more common to deploy them on SAN or NAS storage systems. These environments have significantly different requirements than the home directory environment: the emphasis is commonly on performance, uptime, and backup rather than on flexibility and individual file recovery. These environments often use a block protocol such as iSCSI or FCP because block protocols mimic DAS more closely than NAS technologies do. Increasingly, however, the advantages and flexibility provided by NAS solutions have been drawing more attention. Rather than being designed to serve individual files, the configuration focuses on LUNs, or on the use of files as though they were LUNs. An example is a database application that uses files for its storage instead of LUNs. At its most fundamental, the database application does not treat I/O to files any differently than it does I/O to LUNs. This configuration allows you to choose the deployment that provides the required combination of flexibility and performance.

Enterprise environments are usually deployed with their storage systems clustered. This configuration minimizes the possibility of a service outage caused by a failure of the storage appliance. In clustered environments, there is always the opportunity to spread workload across at least two active storage systems. Therefore, getting good throughput for the enterprise application is generally not difficult.
This approach assumes that the application administrator has a good idea of where the workloads are concentrated in the environment so that beneficial balancing can be accomplished. Clustered environments always have multiple I/O paths available, so it is important to balance the workload across these I/O paths and across server heads. For mission-critical environments, it is important to plan for the worst-case scenario: running the enterprise when one of the storage systems fails and the remaining single unit must carry the entire load. In most circumstances, the mere fact that the enterprise keeps running despite a significant failure is viewed as positive. However, there are situations in which the full performance expectation must be met even after a failure. In this case, the storage systems must be sized accordingly. Block protocols with iSCSI or FCP are also common. The use of a few files or LUNs to support the enterprise application means that the distribution of the workload is relatively easy to plan and predict.
Microsoft Exchange
Microsoft Exchange has a number of parameters that affect the total storage required of the N series. The following are examples of those parameters (a sizing sketch follows this list):
- Number of instances: With Microsoft Exchange, you can specify how many instances of an email or document are saved. The default is 1. If you elect to save multiple instances, take this into consideration for storage sizing.
- Number of logs kept: Microsoft Exchange uses a 5 MB log size. The data change rate determines the number of logs generated per day for recovery purposes. A highly active Microsoft Exchange server can generate up to 100 logs per day.
- Number of users: This number, along with the mailbox limit, user load, and percentage of concurrent access, has a significant effect on the sizing.
- Mailbox limit: The mailbox limit usually represents the quota assigned to users for their mailboxes. If you have multiple quotas for separate user groups, this limit represents the average. This average, multiplied by the number of users, determines the initial storage space required for the mailboxes.
- I/O load per user: For a new installation, it is difficult to determine the I/O load per user, but you can estimate the load by grouping the users. Engineering and development tend to have a high workload because of drawings and technical documents. Legal might also have a high workload because of the size of legal documents. Normal staff usage, however, consists of smaller, more frequent transaction workloads. Use the following formula to calculate the usage:
  IOPS/Mailbox = (average disk transfers/sec) / (number of mailboxes)
- Concurrent users: Typically, an enterprise's employees do not all work in the same time zone or location. Estimate the number of concurrent users for the peak period, which is usually the time when the most employees have daytime operations.
- Number of storage groups: Because a storage group cannot span N series storage systems, the number of storage groups affects sizing. There is no recommendation on the number of storage groups per IBM System Storage N series storage system. However, the number and type of users per storage group help determine the number of storage groups per storage system.
- Volume type: Are FlexVol volumes or traditional volumes used? The type of volume used affects both performance and capacity.
- Drive type: Earlier, this chapter addressed the storage capacity effect of drive type. For Microsoft Exchange, the drive type and its performance characteristics are also significant, especially with a highly used Exchange server. In an active environment, use smaller drives with higher performance characteristics, such as higher RPM and Fibre Channel rather than SATA.
- Read-to-write ratio: The typical read-to-write ratio is 70% to 30%.
- Growth rate: Industry estimates place data storage growth rates at 50% or higher. Size for at least two years into the future.
- Deleted mailbox cache space: This feature of Microsoft Exchange must also be sized for storage usage on the N series. Microsoft allows for a time-specified retention of documents even after deletion of a mailbox, and the storage effect of this feature must be included in the sizing.
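As a rough illustration of how these parameters combine, the following Python sketch estimates initial mailbox capacity, daily log volume, and the per-mailbox I/O rate from the formula above. All input values are hypothetical:

# Hypothetical Microsoft Exchange sizing inputs.
users            = 2000
mailbox_limit_gb = 0.5    # average mailbox quota (GB)
logs_per_day     = 100    # highly active server, 5 MB per log
avg_transfers    = 800    # measured average disk transfers/sec at peak

mailbox_capacity_gb = users * mailbox_limit_gb   # 1000 GB initial mailbox space
log_volume_mb       = logs_per_day * 5           # 500 MB of logs per day
iops_per_mailbox    = avg_transfers / users      # 0.4 IOPS per mailbox
print(mailbox_capacity_gb, log_volume_mb, iops_per_mailbox)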
However, all of these backups run more or less at the same time. Therefore, the greatest I/O load on the storage environment frequently occurs during these backup activities, rather than during normal production. IBM System Storage N series storage systems have a number of backup mechanisms available. With prior planning, you can deploy an environment that provides maximum protection against failure while also optimizing the storage and performance capabilities. Keep in mind the following issues:
- Storage capacity used by Snapshots: How much extra storage must be available for Snapshots to use?
- Networking bandwidth used by SnapMirror: In addition to the production storage I/O paths, SnapMirror needs bandwidth to duplicate data to the remote server (see the sketch after this list).
- Number of possible simultaneous SnapMirror threads: How many parallel backup operations can be run at the same time before some resource runs out? Resources to consider include processor cycles, network throughput, maximum parallel threads (which is platform-dependent), and the amount of data that requires transfer.
- Frequency of SnapMirror operations: The more frequently data is synchronized, the fewer the changes transferred each time. More frequent operations result in background operations running almost all the time.
- Rate at which stored data is modified: Data that does not change much (for example, archive repositories) does not need to be synchronized as often, and each operation takes less time.
- Use and effect of third-party backup facilities (for example, IBM Tivoli Storage Manager): Each third-party backup tool has unique I/O effects that must be accounted for.
- Data synchronization requirements of enterprise applications: Certain applications, such as IBM DB2, Oracle, and Microsoft Exchange, must be quiesced and flushed before backup operations are performed. This process ensures the consistency of backed-up data images.
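For example, the bandwidth that SnapMirror needs can be estimated from the data change rate, the update frequency, and the transfer window. The following sketch uses hypothetical values and decimal units:

# Hypothetical SnapMirror bandwidth estimate.
changed_gb_per_day = 200   # data modified per day
updates_per_day    = 24    # hourly SnapMirror updates
window_minutes     = 15    # each transfer should finish within 15 minutes

gb_per_update = changed_gb_per_day / updates_per_day          # ~8.3 GB per update
mbit_per_sec  = gb_per_update * 8000 / (window_minutes * 60)  # GB -> Mbit, spread over the window
print(round(mbit_per_sec))  # ~74 Mbit/s sustained during each transfer window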
Spare servers
Some enterprises keep spare equipment around in case of failure. Generally, this is the most expensive solution and is only practical for the largest enterprises. An often overlooked similar situation is the installation of new servers. Additional or replacement equipment is always being brought into most data environments. Bringing this equipment in a bit early and using it as spare or test equipment is a good practice. Storage administrators can practice new procedures and configurations, and test new software without having to do so on production equipment.
Local clustering
The decision to use the high-availability features of the IBM System Storage N series is driven by the availability requirements and service level agreements that apply to the data and applications running on the storage systems. If it is determined that an active/active configuration is needed, this affects sizing: rather than sizing for all data, applications, and clients serviced by one IBM System Storage N series node, the workload is instead divided over two or more nodes.
Failover performance
Another aspect of an active/active configuration is failover performance. Suppose you have determined that the data, applications, or clients require constant availability of the IBM System Storage N series, and therefore use an active/active configuration. However, you might have sized each node for normal operations, not for failover. What was originally a normal workload for a single node has now doubled. You also must consider the service level agreement for response time, data access, and application performance. How long can your customers work in a degraded performance environment? If the answer is "not long at all", the initial sizing of each node must also take the failover workload into consideration. Because failover operation is infrequent and usually remedied quickly, it is difficult to justify these additional standby resources unless maintaining optimum performance is critical. An example is a product ordering system with its data storage or application on an IBM System Storage N series storage system: any effect on the ability to place an order affects sales.
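Expressed as a simple rule: if full performance must be maintained after a failover in a two-node pair, each node must normally run at no more than half of what one controller can sustain. A sketch with hypothetical numbers:

# Hypothetical failover head-room check for a two-node active/active pair.
node_capability_iops = 40000   # what one controller can sustain
normal_load_iops     = 25000   # measured workload per node

failover_load = 2 * normal_load_iops          # the surviving node carries both workloads
print(failover_load <= node_capability_iops)  # False: size larger or accept degraded performance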
Software upgrades
IBM regularly releases minor upgrades and patches for the Data ONTAP software. Less frequently, there are also major release upgrades, such as version 8.1. You need to be aware of new software versions for these reasons:
- Patches address recently corrected software flaws.
- Minor upgrades bundle multiple patches together, and might introduce new features.
- Major upgrades generally introduce significant new features.
To remain informed of new software releases, subscribe to the relevant sections at the IBM automatic support notification website at:
https://www.ibm.com/support/mynotifications
Upgrades for Data ONTAP, along with mechanisms for implementing the upgrade, are available on the web at:
http://www.ibm.com/storage/support/nas
Be sure that you understand the recommendations from the vendor and the risks. Use all the available protection tools, such as Snapshots and mirrors, to provide a fallback in case the upgrade introduces more problems than it solves. And whenever possible, perform incremental unit tests on an upgrade before putting it into critical production.
Testing
As storage environments become ever more complex and critical, customer-specific testing increases in importance. Work with your storage vendors to determine an appropriate and cost-effective approach to testing solutions, to ensure that your storage configurations run optimally. Even more important is that testing of disaster recovery procedures becomes a regular and ingrained process for everyone involved with storage management.
14.3 Summary
This chapter provided only a high-level set of guidelines for planning. Consideration of the issues addressed maximizes the likelihood for a successful initial deployment of an IBM System Storage N series storage system. Other sources of specific planning templates exist or are under development. Locate them by using web search queries. Deploying a network of storage systems is not greatly challenging, and most customers can successfully deploy it themselves by following these guidelines. Because of the simplicity that appliances provide, if a mistake is made in the initial deployment, corrective actions are generally not difficult or overly disruptive. For many years customers have iterated their storage system environments into scalable, reliable, and smooth-running configurations. So getting it correct the first time is not nearly as critical as it was before the introduction of storage appliances. If storage system planners and architects remember to keep things simple and flexible, success in deploying an IBM System Storage N series system can be expected.
Part 2

Chapter 15. Preparation and installation
Appropriate tools and equipment:
- Pallet jack, forklift, or hand truck, depending on the hardware that you receive
- #1 and #2 Phillips head screwdrivers, and a flathead screwdriver for cable adapters
- A method for connecting to the serial console:
  - A USB-to-serial adapter
  - Null modem cable (with appropriate connectors)
Documentation stored locally on your mobile computer, such as Data ONTAP documentation and hardware documentation
Sufficient people to safely install the equipment into a rack:
- Two or three people are required, depending on the hardware model
- See the specific hardware installation guide for your equipment
Record your values for the following types of information:

Ethernet interfaces:
- Interface name
- IPv4 address
- IPv4 subnet mask
- IPv6 address
- IPv6 subnet prefix length
- Partner IP address or interface
- Media type (network type)
- Are jumbo frames supported?
- MTU size for jumbo frames
- Flow control

e0M interface (if available):
- IP address
- Network mask
- Partner IP address
- Flow control

Router (if used):
- Gateway name
- IPv4 address
- IPv6 address

HTTP:
- Location of HTTP directory

DNS:
- Domain name
- Server addresses 1, 2, 3

NIS:
- Domain name
- Server addresses 1, 2, 3
CIFS:
- Windows domain
- WINS servers (1, 2, 3)
- Multiprotocol or NTFS-only filer?
- Should CIFS create default /etc/passwd and /etc/group files?
- Enable NIS group caching?
- Hours to update the NIS cache?
- CIFS server name (if different from default)
- User authentication style: (1) Active Directory domain, (2) Windows NT 4 domain, (3) Windows Workgroup, (4) /etc/passwd or NIS/LDAP

Windows Active Directory domain:
- Domain name
- Time server name/IP address
- Windows user name
- Windows user password
- Local administrator name
- Local administrator password
- CIFS administrator or group
- Active Directory container

BMC:
- MAC address
- IP address
- Network mask (subnet mask)
- Gateway
- Mailhost
RLM:
- MAC address
- IPv4 address
- IPv4 subnet mask
- IPv4 gateway
- IPv6 address
- IPv6 subnet prefix length
- IPv6 gateway
- AutoSupport mailhost
- AutoSupport recipients

ACP:
- Network interface name
- Domain (subnet) for network interface
- Netmask (subnet mask) for network interface

Key management servers (if using Storage Encryption):
- IP address(es)
- Key tag name
Proceed to the next step. Remove the power supply and reinstall it, making sure that it connects with the backplane.
5. Verify disk shelf compatibility and check the disk shelf IDs.
6. Ensure that the Fibre Channel disk shelf speed is correct. If you have DS14mk2 Fibre Channel and DS14mk4 Fibre Channel shelves mixed in the same loop, set the shelf speed to 2 Gb, regardless of module type.
7. Check disk ownership to ensure that the disks are assigned to the system:
   a. Verify that disks are assigned to the system by entering the disk show command.
   b. Validate that storage is attached to the system, and verify any changes that you made, by entering disk show -v.
8. Turn off your controller and disk shelves, then turn on the disk shelves. For information about LED responses, check the quick reference card that came with the disk shelf or the hardware guide for your disk shelf.
9. Use the onboard diagnostic tests to check that Fibre Channel disks in the storage system are operating properly:
   a. Turn on your system and press Ctrl-C.
   b. Enter boot_diags at the LOADER> prompt.
   c. Enter fcal in the Diagnostic Monitor program that starts at boot.
   d. Enter 73 at the prompt to show all disk drives.
   e. Exit the Diagnostic Monitor by entering 99 at the prompt.
   f. Enter the exit command to return to LOADER.
   g. Start Data ONTAP by entering autoboot at the prompt.
10. Use the onboard diagnostic tests to check that SAS disks in the storage system are operating properly:
   a. Enter mb in the Diagnostic Monitor program.
   b. Enter 6 to select the SAS test menu.
   c. Enter 42 to scan and show disks on the selected SAS. Doing so displays the number of SAS disks.
   d. Enter 72 to show the attached SAS devices.
   e. Exit the Diagnostic Monitor by entering 99 at the prompt.
   f. Enter the exit command to return to LOADER.
   g. Start Data ONTAP by entering autoboot at the prompt.
11. Try starting your system again.
Table 15-3 Starting the system

If your system...              Then...
Starts successfully            Proceed to setting up the software.
Does not start successfully    Call IBM technical support. The system might not have the boot image downloaded on the boot device.
Chapter 16.
To proceed, specify a valid user name and password.

Tip: By default, the FilerView interface is unencrypted. Enable the HTTPS protocol as soon as possible if you plan to use FilerView. Generally, do not use FilerView; instead, use either the CLI or OnCommand System Manager to perform administrative tasks.
Enter <command> help for a list of the available options of the specified command as shown in Figure 16-1.
The manual pages can be accessed by entering the man command (man <command>). Figure 16-2 shows the detailed description and options of a command.
16.1.4 OnCommand
OnCommand is an operations-management solution for managing multiple N series storage systems that provides these features:
- Scalable management, monitoring, and reporting software for enterprise-class environments
- Centralized monitoring and reporting of information for fast problem resolution
- Management policies with custom reporting to capture specific, relevant information to address business needs
- Flexible, hierarchical device grouping to allow monitoring
The cost of OnCommand depends on the product purchased.
Example 16-2 shows the boot options. Typically, you boot in normal boot mode.
Example 16-2 Boot menu
1) Normal Boot
2) Boot without /etc/rc
3) Change Password
4) Initialize all disks
4a) Same as option 4 but create a flexible root volume
5) Maintenance boot
Selection (1-5)?
With the IBM System Storage N series storage systems, you can specify which users receive CIFS shutdown messages. When you issue the cifs terminate command, Data ONTAP by default sends a message to all open client connections. This setting can be changed by issuing the following command:
options cifs.shutdown_msg_level 0 | 1 | 2
The options are:
0: Never send CIFS shutdown messages.
1: Send CIFS messages only to clients that are connected and have open files.
2: Send CIFS messages to all open connections (default).
The cifs terminate command shuts down CIFS, ends CIFS service for a volume, or logs off a single station. The -t option can be used to specify a delay interval in minutes before CIFS stops as shown in Example 16-4.
Example 16-4 The cifs terminate -t command
itsosj-n1> cifs terminate -t 3
Total number of connected CIFS users: 1
Total number of open CIFS files: 0
Warning: Terminating CIFS service while files are open may cause data loss!!
3 minutes left until termination (^C to abort)...
2 minutes left until termination (^C to abort)...
1 minute left until termination (^C to abort)...
CIFS local server is shutting down...
CIFS local server has shut down...
itsosj-n1>
You can even select single workstations for which the CIFS service should stop as shown in Example 16-5.
Example 16-5 The cifs terminate command for a single workstation
itsosj-n1> cifs terminate -t 3 workstation_01
3 minutes left until termination (^C to abort)...
2 minutes left until termination (^C to abort)...
1 minute left until termination (^C to abort)...
itsosj-n1> Thu Sep 8 09:41:43 PDT [itsosj-n1: cifs.terminationNotice:warning]: CIFS: shut down completed: disconnected workstation workstation_01. itsosj-n1>
When you shut down an N series system, there is no need to issue the cifs terminate command; during shutdown, the operating system runs this command automatically.

Tip: Workstations running Windows 95/98 or Windows for Workgroups do not see the notification unless they are running WinPopup.

Depending on the CIFS message settings, messages such as those shown in Figure 16-3 are displayed on the affected workstations.
To restart CIFS, issue the cifs restart command as shown in Example 16-6. The N series startup procedure starts the CIFS services automatically.
Example 16-6 The cifs restart command
itsosj-n1> cifs restart
CIFS local server is running.
itsosj-n1>
You can verify whether CIFS is running by using the cifs sessions command. If CIFS is not running, a message is displayed as shown in Example 16-7.
Example 16-7 Checking whether CIFS is running on the N series
itsosj-n1> cifs sessions
CIFS not running.
Use "cifs restart" to restart
Use "cifs prefdc" to set preferred DCs
Use "cifs testdc" to test WINS and DCs
Use "cifs setup" to configure
itsosj-n1>
netboot
Usually you boot the N series after you issue the halt command with the boot_ontap or bye command. These commands end the CFE prompt and restart the N series as shown in Example 16-9.
Example 16-9 Starting the N series at the CFE prompt
CFE>bye
CFE version 1.2.0 based on Broadcom CFE: 1.0.35
Copyright (C) 2000,2001,2002,2003 Broadcom Corporation.
Portions Copyright (C) 2002,2003 Network Appliance Corporation.
CPU type 0x1040102: 650MHz
Total memory: 0x40000000 bytes (1024MB)
Starting AUTOBOOT press any key to abort...
Loading: 0xffffffff80001000/21792 0xffffffff80006520/10431377 Entry at 0xffffffff80001000
Starting program at 0xffffffff80001000
Press CTRL-C for special boot menu
....................................................................................
Interconnect based upon M-VIA ERing Support
Copyright (c) 1998-2001 Berkeley Lab
http://www.nersc.gov/research/FTG/via
Wed Aug 31 19:00:46 GMT [cf.nm.nicTransitionUp:info]: Interconnect link 0 is UP
Wed Aug 31 19:00:46 GMT [cf.nm.nicTransitionDown:warning]: Interconnect link 0 is DOWN
Data ONTAP Release 7.1H1: Mon Aug 15 16:02:45 PDT 2005 (IBM)
Copyright (c) 1992-2005 Network Appliance, Inc.
Starting boot on Wed Aug 31 19:00:45 GMT 2005
Wed Aug 31 19:00:51 GMT [diskown.isEnabled:info]: software ownership has been enabled for this system
Wed Aug 31 19:00:56 GMT [raid.cksum.replay.summary:info]: Replayed 0 checksum blocks.
Wed Aug 31 19:00:56 GMT [raid.stripe.replay.summary:info]: Replayed 0 stripes.
Wed Aug 31 19:00:57 GMT [localhost: cf.fm.launch:info]: Launching cluster monitor
Wed Aug 31 19:00:57 GMT [localhost: cf.fm.notkoverClusterDisable:warning]: Cluster monitor: cluster takeover disabled (restart)
add net 127.0.0.0: gateway 127.0.0.1
DBG: Failed to get partner serial number from VTIC
DBG: Set filer.serialnum to: 310070722
Wed Aug 31 19:00:58 GMT [rc:notice]: The system was down for 71 seconds
Wed Aug 31 12:01:00 PDT [itsosj-n1: dfu.firmwareUpToDate:info]: Firmware is up-to-date on all disk drives
Wed Aug 31 12:01:00 PDT [ltm_services:info]: Ethernet e0a: Link up
add net default: gateway 192.186.101.57: network unreachable
Wed Aug 31 12:01:02 PDT [rc:ALERT]: timed: time daemon started
Wed Aug 31 12:01:03 PDT [itsosj-n1: mgr.boot.disk_done:info]: Data ONTAP Release 7.1H1 boot complete. Last disk update written at Wed Aug 31 11:59:46 PDT 2005
Wed Aug 31 12:01:03 PDT [itsosj-n1: mgr.boot.reason_ok:notice]: System rebooted.
Password:
itsosj-n1> Wed Aug 31 12:01:20 PDT [console_login_mgr:info]: root logged in from console
itsosj-n1>
Depending on the CIFS Message settings and Microsoft Windows Client settings, you might receive messages on your CIFS client about the shutdown. These messages are shown in Figure 16-3 on page 201.
Network File System (NFS) clients can maintain use of a file over a halt or reboot because NFS is a stateless protocol. CIFS, FCP, and iSCSI clients behave differently. Therefore, use the -t option to allow users time before the shutdown to save their work. Depending on the shutdown message settings, CIFS clients might receive messages such as those shown in Figure 16-3 on page 201.
Part 3

Chapter 17.
HUKs are available that support the following environments:
- AIX with Fibre Channel Protocol (FCP) and iSCSI
- Linux with FCP/iSCSI
- HP-UX with FCP/iSCSI
- Solaris Platform Edition (SPARC and x86) with FCP/iSCSI
- VMware ESX with FCP/iSCSI
- Windows with FCP/iSCSI
7. Configure a multipathing solution.
8. Install Veritas Storage Foundation.
9. Install the Host Utilities.
10. Install SnapDrive for Windows.

Remember: If you add a Windows 2008 R2 host to a failover cluster after installing the Host Utilities, run the Repair option of the Host Utilities installation program. This process sets the required ClusSvcHangTimeout parameter.
17.4.2 Preparation
Before you install the Host Utilities, verify that the Host Utilities version supports your host and storage system configuration.
Add the iSCSI or FCP license, and start the target service. The Fibre Channel and iSCSI protocols are licensed features of Data ONTAP software. If you need to purchase a license, contact your IBM or sales partner representative. Next, verify your cabling. See the FC and iSCSI Configuration Guide for detailed cabling and configuration information at: http://www.ibm.com/storage/support/nas/
connectivity. An iSCSI HBA offloads most iSCSI processing to the HBA card, which also provides network connectivity. The iSCSI software initiator typically provides excellent performance. In fact, an iSCSI software initiator provides better performance than an iSCSI HBA in most configurations. The iSCSI initiator software for Windows is available from Microsoft for no additional charge. In some cases, you can even SAN boot a host with an iSCSI software initiator and a supported NIC. iSCSI HBAs are best used for SAN booting. An iSCSI HBA implements SAN booting just like a Fibre Channel HBA. When booting from an iSCSI HBA, use an iSCSI software initiator to access your data LUNs. Select the appropriate iSCSI software initiator for your host configuration. Table 17-1 lists operating systems and their iSCSI software initiator options.
Table 17-1 iSCSI initiator instructions

- Windows Server 2003: Download and install the iSCSI software initiator.
- Windows Server 2008: The iSCSI initiator is built into the operating system. The iSCSI Initiator Properties dialog is available from Administrative Tools.
- Windows Server 2008 R2: The iSCSI initiator is built into the operating system. The iSCSI Initiator Properties dialog is available from Administrative Tools.
- Windows XP guest systems on Hyper-V: For guest systems on Hyper-V virtual machines that access storage directly (not as a virtual hard disk mapped to the parent system), download and install the iSCSI software initiator. You cannot select the Microsoft MPIO Multipathing Support for iSCSI option; Microsoft does not support MPIO with Windows XP. A Windows XP iSCSI connection to IBM N series storage is supported only on Hyper-V virtual machines.
- Windows Vista guest systems on Hyper-V: For guest systems on Hyper-V virtual machines that access storage directly (not as a virtual hard disk mapped to the parent system), the iSCSI initiator is built into the operating system. The iSCSI Initiator Properties dialog is available from Administrative Tools. A Windows Vista iSCSI connection to IBM N series storage is supported only on Hyper-V virtual machines.
- SUSE Linux Enterprise Server guest systems on Hyper-V: For guest systems on Hyper-V virtual machines that access storage directly (not as a virtual hard disk mapped to the parent system), use an iSCSI initiator solution. This solution must be on a Hyper-V guest that is supported for stand-alone hardware. A supported version of Linux Host Utilities is required.
- Linux guest systems on Virtual Server 2005: For guest systems on Virtual Server 2005 virtual machines that access storage directly (not as a virtual hard disk mapped to the parent system), use an iSCSI initiator solution. This solution must be on a Virtual Server 2005 guest that is supported for stand-alone hardware. A supported version of Linux Host Utilities is required.
On a Windows system, there are two main components to any MPIO solution: a DSM and the Windows MPIO components. Install a supported DSM before you install the Windows Host Utilities. Select from the following choices:
- The Data ONTAP DSM for Windows MPIO
- The Veritas DMP DSM
- The Microsoft iSCSI DSM (part of the iSCSI initiator package)
- The Microsoft msdsm (included with Windows Server 2008 and Windows Server 2008 R2)
MPIO is supported for Windows Server 2003, Windows Server 2008, and Windows Server 2008 R2 systems. MPIO is not supported for Windows XP and Windows Vista running in a Hyper-V virtual machine. When you select MPIO support, the Windows Host Utilities installs the Microsoft MPIO components on Windows Server 2003, or enables the included MPIO feature of Windows Server 2008 and Windows Server 2008 R2.
c. Select the N series software that you want to download, and then select the Download view.
d. Use the Software Packages link on the website presented, and follow the online instructions to download the software.
3. Run the executable file, and follow the instructions in the window.

Tip: The Windows Host Utilities installer checks for required Windows hotfixes. If it detects a missing hotfix, it displays an error. Download and install the requested hotfixes, then restart the installer.

4. Reboot the Windows host when prompted.
For Windows Server 2008 or Windows Server 2008 R2, use the Windows Storage Explorer application to display the WWPNs. For Windows Server 2003, use the Microsoft fcinfo.exe program. You can instead use the HBA manufacturer's management software if it is installed on the Windows host. Examples include HBAnyware for Emulex HBAs and SANsurfer for QLogic HBAs. If the system is SAN booted and not yet running an operating system, or the HBA management software is not available, obtain the WWPNs by using the boot BIOS.
On systems that use Fibre Channel, the Host Utilities installer sets the required timeout values for Emulex and QLogic Fibre Channel HBAs. If Data ONTAP DSM for Windows MPIO is detected on the host, the Host Utilities installer does not set any HBA values.
There are many ways to create and manage initiator groups and LUNs on your storage system. These processes vary depending on your configuration. These topics are covered in detail in the Data ONTAP Block Access Management Guide for iSCSI and Fibre Channel for your version of the Data ONTAP software.
Chapter 18.
18.1 Overview
FCP SAN boot, remote boot, and root boot refer to a configuration where the operating system is installed on a logical drive that is not resident locally in the server chassis. SAN boot has the following primary benefits over booting the host OS from local storage:
- The ability to create a Snapshot of the host OS: You can create a Snapshot of the OS before installing a hotfix, service pack, or other risky change to the OS. If the change goes bad, you can restore the OS from the copy. For more information about Snapshot technology, see:
  http://www.ibm.com/systems/storage/network/software/snapshot/index.html
- Performance: The host is likely to boot significantly faster in a SAN boot configuration because you can put several spindles under the boot volume.
- Fault tolerance: There are multiple disks under the volume in a RAID 4 or RAID-DP configuration.
- The ability to clone FlexVol volumes, creating FlexClone volumes: A cloned LUN containing the host OS can be used for testing purposes. Further information about FlexClone software can be found at:
  http://www.ibm.com/systems/storage/network/software/flexvol/index.html
- Interchangeable servers: By allowing boot images to be stored on the SAN, servers are no longer physically bound to their startup configurations. Therefore, if a server fails, you can easily replace it with another generic server and resume operations with the exact same boot image from the SAN. Only some minor reconfiguration is required on the storage system. This quick interchange helps reduce downtime and increases host application availability.
- Provisioning for peak usage: Because the boot image is available on the SAN, it is easy to deploy additional servers to temporarily cope with high workloads.
- Centralized administration: SAN boot enables simpler management of the startup configurations of servers. You do not need to manage boot images at the distributed level at each individual server. Instead, SAN boot allows you to manage and maintain the images at a central location in the SAN. This feature enhances storage personnel productivity and helps to streamline administration.
- Use of the high-availability features of SAN storage: SANs and SAN-based storage are typically designed with high availability in mind. SANs can use redundant features in the storage network fabric and RAID controllers to ensure that users do not incur any downtime. Most boot images on local disks or direct-attached storage do not share this protection. Using SAN boot allows boot images to take advantage of the inherent availability built into most SANs. This configuration helps to increase the availability and reliability of the boot image, and to reduce downtime.
- Efficient disaster recovery process: You can have data (boot image and application data) mirrored over the SAN between a primary site and a recovery site. With this configuration, servers at the secondary site can take over if a disaster occurs on servers at the primary site.
- Reduced overall cost of servers: Locating server boot images on external SAN storage eliminates the need for a local disk in the server. This configuration helps lower costs and allows SAN boot users to purchase servers at a reduced cost while still maintaining the same functionality. In addition, SAN boot minimizes IT costs through consolidation, which reduces the use of electricity and floor space, and through more efficient centralized management.
The examples in this chapter use the following configuration:
- Operating systems: Red Hat Enterprise Linux 5.2 and Windows 2003 Enterprise SP2
- Data ONTAP: 7.3
High latency during pagefile access can cause systems to fail with a STOP message (blue screen) or perform poorly. Carefully monitor the disk array to prevent oversubscription of the storage, which can result in high latency. Some administrators concerned about paging performance might opt to keep the pagefile on a local disk while storing the operating system on an N series SAN. There are issues with this configuration as well. If the pagefile is moved to a drive other than the boot drive, system crash memory dumps cannot be written. This can be an issue when trying to debug operating system instability in the environment. If the local disk fails and is not mirrored, the system fails and cannot boot until the problem is corrected. In addition, do not create two pagefiles on devices with different performance profiles, such as a local disk and a SAN device. Attempting to distribute the pagefile in this manner might result in kernel inpage STOP errors. In general, if the system is paging heavily, performance suffers regardless of whether the pagefile is on a SAN device or a local disk. The best way to address this problem is to add more physical memory to the system or to correct the condition that is causing severe paging. At the time of publication, the cost of physical memory is such that a small investment can prevent paging and preserve the performance of the environment. It is also possible to limit the pagefile size or disable it completely to prevent SAN resource contention. However, if the pagefile is severely restricted or disabled to preserve performance, application instability is likely to result when memory is fully used. Use this option only for servers that have enough physical memory to cover the anticipated maximum requirements of the application.

Microsoft Cluster Services and SCSI port drivers: The Microsoft Cluster Service uses bus-level resets in its operation and cannot isolate these resets from the boot device. Therefore, installations that use the SCSIport driver with Microsoft Windows 2000 or 2003 must use separate HBAs for the boot device and the shared cluster disks. In deployments where full redundancy is wanted, a minimum of four HBAs is required for MPIO. In Fibre Channel implementations, employ zoning to separate the boot and shared cluster HBAs.
Deploy Microsoft Cluster Services on a Windows Server 2003 platform using STORport drivers. With this configuration, both the boot disks and shared cluster disks can be accessed through the same HBA (Figure 18-1). A registry entry is required to enable a single HBA to connect to both shared and non-shared disks in an MSCS environment. For details, see the Server Clusters: Storage Area Networks - For Windows 2000 and Windows Server 2003 topic at: http://www.microsoft.com/en-us/download/details.aspx?id=13153
4. Load the boot sector: The first sector of the boot device, which contains the MBR (Master Boot Record), is loaded. The MBR contains the address of the bootable partition on the disk where the operating system is located.
3. Select the appropriate adapter and press Enter as shown in Figure 18-2.
BootBIOS displays the configuration information for the HBA, including the WWPN, as shown in Figure 18-3.
3. BootBIOS displays a menu of available adapters. Select the appropriate HBA and press Enter as shown in Figure 18-4.
4. The Fast!UTIL options are displayed. Select Configuration Settings and press Enter as shown in Figure 18-5.
The adapter settings are displayed including the WWPN, as shown in Figure 18-7.
Requirement: Ensure that you are using the version of firmware required by this FCP Windows Host Utility.

BootBIOS firmware is disabled by default. To configure SAN booting, you must first enable BootBIOS firmware and then configure it to boot from a SAN disk. You can enable and configure BootBIOS on the HBA by using one of the following tools:
- Emulex LP6DUTIL.EXE: In the default configuration of the Emulex Universal Boot Code image, the expansion card x86 BootBIOS is not enabled at startup, which prevents access to the BIOS utility at power-up. When x86 BootBIOS is enabled at startup, as in Figure 18-8, press Alt+E to access the BIOS utility.
- QLogic Fast!UTIL: Enable BootBIOS for QLogic HBAs by using Fast!UTIL.
3. Select 2 to configure the adapter's parameters and press Enter as shown in Figure 18-9.
4. From the Configure Adapters Parameters menu, select 1 to enable the BIOS as shown in Figure 18-10.
5. This panel shows the BIOS disabled. Select 1 to enable the BIOS as shown in Figure 18-11.
6. Press Esc to return to the configure adapters parameters menu as shown in Figure 18-13.
7. Press Esc to return to the main configuration menu. You are now ready to configure your boot devices. Select 1 to configure the boot devices as shown in Figure 18-14. Tip: The Emulex adapter supports FC_AL (public and private loop) and fabric point-to-point. During initialization, the adapter determines the appropriate network topology and scans for all possible target devices.
8. The eight boot entries are zero by default. The primary boot device is listed first; it is the first bootable device. Select a boot entry to configure and select 1 as shown in Figure 18-15.
Clarification: In target device failover, if the first boot entry fails because of a hardware error, the system can boot from the second bootable entry. If the second boot entry fails, the system boots from the third bootable entry, and so on, up to eight distinct entries. This process provides failover protection by automatically redirecting the boot device without user intervention. 9. At initialization, Emulex scans for all possible targets or boot devices. If the HBA is attached to a storage array, the storage device is visible. To view the LUNs, select the storage array controller. Figure 18-16 shows two arrays within the entry field. Select 01 and press Enter.
Clarification: In device scanning, the adapter scans the fabric for Fibre Channel devices and lists all the connected devices by DID and WWPN. Information about each device is listed, including starting LUN number, vendor ID, product ID, and product revision level.

10. A pop-up window requests entry of the starting LUN number to display. Enter 00 to display the first 16 LUNs as shown in Figure 18-17.
11.BootBIOS displays a menu of bootable devices. The devices are listed in boot order. The primary boot device is the first device listed. If the primary boot device is unavailable, the host boots from the next available device in the list. In the example shown in Figure 18-18, only one LUN is available. This is because SAN zoning is configured to one path as suggested in 18.2.1, Configuration limits and preferred configurations on page 221. Select 01 to select the primary boot entry, and press Enter.
12.After the LUN is selected, another menu prompts you to specify how the boot device will be identified. Generally, use the WWPN for all boot-from-SAN configurations. Select item 1 to boot this device using the WWPN as shown in Figure 18-19.
13. After this process is complete, press X to exit and save your configuration as shown in Figure 18-20. Your HBA's BootBIOS is now configured to boot from SAN on the attached storage device.
Figure 18-20 Exit Emulex Boot Utility and saved boot device panel
3. The Qlogic Fast!UTIL displays the available adapters, listed in boot order. The primary boot device is the first device listed. If the primary boot device is unavailable, the host boots from the next available device in the list. Select the first Fibre Channel adapter port and press Enter as shown in Figure 18-23.
6. Scroll to Host Adapter BIOS as shown in Figure 18-26. If this option is disabled, press Enter to enable it. If this option is enabled, go to the next step.
7. Press Esc to return to the Configuration Settings panel. Scroll to Selectable Boot Settings and press Enter as shown in Figure 18-27.
8. Scroll to Selectable Boot as shown in Figure 18-28. If this option is disabled, press Enter to enable it. If this option is enabled, go to the next step.
9. Select the entry in the (Primary) Boot Port Name, LUN field, as shown in Figure 18-29, and press Enter.
10.The available Fibre Channel devices are displayed as shown in Figure 18-30. Select the boot LUN 0 from the list of devices and press Enter.
11.Press Esc to return to the previous panel. Press Esc again and you are prompted to save the configuration settings as shown in Figure 18-31. Select Save changes and press Enter.
12.The changes are saved and you are returned to the configuration settings. Press Esc and you are prompted to reboot the system as shown in Figure 18-32. Select Reboot system and press Enter.
IBM BIOS
There can be slight differences within the System BIOS configuration and setup utility, depending on the server model and BIOS version that are used. Knowledge of BIOS and ROM memory space usage can be required in certain situations. Some older PC architectures limit ROM image memory space to a maximum of 128 KB. This limit becomes a concern if you have more devices that require ROM space. If you have many HBAs in your server, you might receive a PCI allocation error message during the boot process. To avoid this error, disable the boot options in the HBAs that are not being used for SAN boot installation. To configure the IBM BIOS setup program, perform these steps:
1. Reboot the host.
2. Press F1 to enter BIOS setup as shown in Figure 18-33.
4. Scroll to the PCI Device Boot Priority option and select the slot in which the HBA is installed as shown in Figure 18-35.
Figure 18-35 Selecting PCI Device Boot Priority in Start Options panel
5. Scroll up to Startup Sequence Options and press Enter. Make sure that the Startup Sequence Option is configured as shown in Figure 18-36.
4. The Boot tab lists the boot device order. Ensure that the HBA is configured as the first boot device. Select Hard Drive. 5. Configure the LUN as the first boot device.
LUN: 00 NETAPP LUN
BIOS is installed successfully!

Tip: If the message does not display, do not continue installing Windows. Check to ensure that the LUN is created and mapped, and that the target HBA is in the correct mode for directly connected hosts. Also, ensure that the WWPN for the HBA is the same WWPN that you entered when creating the igroup. If the LUN is displayed but the message indicates that the BIOS is not installed, reboot and enable the BIOS.

2. When prompted, press any key to boot from the CD.
3. When prompted, press F6 to install a third-party SCSI array driver.
4. Insert the HBA driver diskette that you created previously when the following message is displayed:
Setup could not determine the type of one or more mass storage devices installed in your system, or you have chosen to manually specify an adapter.
5. Press S to continue.
6. From the list of HBAs, select the supported HBA that you are using and press Enter. The driver for the selected HBA is configured in the Windows operating system.
7. Follow the prompts to set up the Windows operating system. When prompted, set up the Windows operating system in a partition formatted with NTFS.
8. The host system reboots and then prompts you to complete the server setup process as you normally would. The rest of the Windows installation is the same as a normal installation.

Prerequisites: After you successfully install Windows 2003, you must add the WWPNs for all additional HBAs to the igroup and install the FCP Windows Host Utilities.
A few boot configuration changes were introduced with Windows Server 2008. The major change is that Boot Configuration Data (BCD) stores now contain the boot configuration parameters that control how the operating system is started in Microsoft Windows Server 2008 operating systems. These parameters were previously in the Boot.ini file (in BIOS-based operating systems) or in the nonvolatile RAM (NVRAM) entries (in Extensible Firmware Interface-based operating systems). You can use the Bcdedit.exe command-line tool to modify the Windows code that runs in the pre-operating system environment by changing entries in the BCD store. Bcdedit.exe is in the \Windows\System32 directory of the Windows 2008 active partition. BCD was created to provide an improved mechanism for describing boot configuration data. With the development of new firmware models (for example, the Extensible Firmware Interface (EFI)), an extensible and interoperable interface was required to abstract the underlying firmware. Windows Server 2008 R2 supports the ability to boot from a SAN, which eliminates the need for local hard disks in the individual server computers. In addition, performance for accessing storage on SANs has greatly improved. Figure 18-37 shows how booting from a SAN can dramatically reduce the number of hard disks, decreasing power consumption.
To install the Windows Server 2008 full installation option, perform these steps: 1. Insert the appropriate Windows Server 2008 installation media into your DVD drive. Reboot the server as shown in Figure 18-38.
2. Select an installation language, regional options, and keyboard input, and click Next, as shown in Figure 18-39.
Figure 18-39 Selecting the language to install, regional options, and keyboard input
3. Click Install now to begin the installation process as shown in Figure 18-40.
4. Enter the product key and click Next as shown in Figure 18-41.
5. Select I accept the license terms and click Next as shown in Figure 18-42.
7. If the window shown in Figure 18-44 does not show any hard disk drives, or if you prefer to install the HBA device driver now, click Load Driver.
8. As shown in Figure 18-45, insert appropriate media that contains the HBA device driver files and click Browse.
9. Click OK, and then click Next.
10. Click Next to accept the default partitioning (Windows creates the partition automatically), or click Drive options (advanced) to create the partition yourself. Then click Next to start the installation process as shown in Figure 18-46.
11. When Windows Server 2008 Setup has completed the installation, the server automatically restarts.
12. After Windows Server 2008 restarts, you are prompted to change the administrator password before you can log on.
13. After you are logged on as the administrator, a configuration wizard window is displayed. Use the wizard for naming and basic networking setup.
14. Use the Microsoft Server 2008 Roles and Features functions to set up the server to your specific needs.

Tip: After you successfully install Windows 2008, add the WWPNs for all additional HBAs to the igroup, and install the FCP Windows Host Utilities.
System BIOS
The process starts when you power up or reset your System x. The processor runs the basic input/output system (BIOS) code, which then runs a power-on self-test (POST) to check and initialize the hardware. It then locates a valid device to boot the system.
Boot loader
If a boot device is found, the BIOS loads the first stage boot loader stored in the master boot record (MBR) into memory. The MBR is the first 512 bytes of the bootable device. This first stage boot loader is then run to locate and load into memory the second stage boot loader. Boot loaders are in two stages because of the limited size of the MBR. In an x86 system, the second stage boot loader can be the Linux Loader (LILO) or the GRand Unified Bootloader (GRUB). After it is loaded, it presents a list of available kernels to boot.
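The 512-byte MBR layout described here can be inspected directly: by convention it holds 446 bytes of first-stage boot code, a 64-byte partition table (four 16-byte entries), and a 2-byte boot signature of 0x55 0xAA. The following Python sketch checks a disk image; the file name is hypothetical:

# Inspect the first 512 bytes (the MBR) of a disk image (hypothetical file name).
with open("disk.img", "rb") as f:
    mbr = f.read(512)

boot_code       = mbr[:446]     # first-stage boot loader code
partition_table = mbr[446:510]  # four 16-byte partition entries
signature       = mbr[510:512]
print(signature == b"\x55\xaa") # True for a valid MBR boot signature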
OS kernel
After a kernel is selected, the second stage boot loader locates the kernel binary and loads the initial RAM disk image into memory. The kernel then checks and configures hardware and peripherals, and extracts from the initial RAM disk image the drivers and modules needed to boot the system. It also mounts the root device.
Tip: RHEL5 can detect, create, and install to dm-multipath devices during installation. To enable this feature, add the parameter mpath to the kernel boot line: at the initial Linux installation panel, type linux mpath and press Enter to start the Red Hat installation.

The installation process is similar to a local disk installation. To set up a Linux SAN boot, perform these steps:
1. Insert the Linux installation CD and reboot the host. During the installation, you are able to see the LUN and install the OS on it.
2. Click Next and follow the installation wizard as you normally would with a local disk installation.

Attention: After you successfully install Red Hat Enterprise Linux 5.2, add the WWPNs for all additional HBAs to the igroup, and install the FCP Linux Host Utilities.

IBM LUNs connected by way of a block protocol (for example, iSCSI or FCP) to Linux hosts using partitions might require special partition alignment for best performance. For more information about this issue, see:
http://www.ibm.com/support/docview.wss?uid=ssg1S1002716&rs=573
FCoE encapsulates the Fibre Channel frame in an Ethernet packet to enable transporting storage traffic over an Ethernet interface. By transporting the entire Fibre Channel frame in Ethernet packets, FCoE makes sure that no changes are required to Fibre Channel protocol mappings, information units, session management, exchange management, and services. With FCoE technology, servers that host both HBAs and network adapters reduce their adapter count to a smaller number of converged network adapters (CNAs). CNAs support both TCP/IP networking traffic and Fibre Channel SAN traffic. Combined with native FCoE storage arrays and switches, an end-to-end FCoE solution can be deployed with all the benefits of a converged network in the data center. FCoE CNAs provide FCoE offload, and support boot from SAN. Configuring it is similar to the boot from SAN with the Fibre Channel protocol.
Chapter 19. Host multipathing
This chapter introduces the concepts of host multipathing. It addresses the installation steps and describes the management interfaces for the Windows, Linux, and IBM AIX operating systems. The following topics are covered:
- Overview
- Multipathing software options, including ALUA
- Installation of IBM Data ONTAP DSM
- Managing DSM by using the GUI
- Managing DSM by using the CLI
- Multiple path I/O support for Red Hat Linux
- Multiple path I/O support for native AIX
19.1 Overview
Multipath I/O (MPIO) provides multiple storage paths from hosts (initiators) to their IBM System Storage N series targets. The multiple paths provide redundancy against failures of hardware such as cabling, switches, and adapters. They also provide higher performance thresholds by aggregation or optimum path selection. Multipathing solutions provide the host-side logic to use the multiple paths of a redundant network to provide highly available and higher bandwidth connectivity between hosts and block level devices. Multipath software has these main objectives:
- Present the OS with a single virtualized path to the storage. Figure 19-1 includes two scenarios: an OS with no multipath management software, and an OS with multipath management software. Without multipath management software, the OS believes that it is connected to two different physical storage devices. With multipath management software, the OS correctly interprets that both HBAs are connected to the same storage device.
- Seamlessly recover from a path failure. Multipath software detects failed paths and recovers from the failure by routing traffic through another available path. The recovery is automatic, usually fast, and transparent to the IT organization. The data ideally remains available at all times.
- Enable load balancing. Load balancing is the use of multiple data paths between server and storage to provide greater throughput of data than with only one connection. Multipathing software improves throughput by enabling load balancing across multiple paths between server and storage.
When multiple paths to a LUN are available, a consistent method of using those paths must be determined. This method is called the load balance policy. There are five standard policies in Windows Server 2008 that apply to multiconnection sessions and MPIO. Other operating systems can implement different load balancing policies.
- Failover only: Only one path is active at a time, and alternate paths are reserved for path failure.
- Round robin: I/O operations are sent down each path in turn.
- Round robin with subset: Some paths are used as in round robin, while the remaining paths act as failover only.
- Least queue depth: I/O is sent down the path with the fewest outstanding I/Os.
- Weighted paths: Each path is given a weight that identifies its priority, with the lowest number having the highest priority.
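On Windows Server 2008 R2, the mpclaim utility that ships with the MPIO feature can be used to inspect and change the load balance policy from the command line. A short sketch follows; the disk number is an example only:

mpclaim -s -d        # list MPIO disks and their current load balance policies
mpclaim -s -d 2      # show the individual paths of MPIO disk 2
mpclaim -l -d 2 4    # set MPIO disk 2 to Least Queue Depth (policy 4)

The policy codes are 1 = Failover Only, 2 = Round Robin, 3 = Round Robin with Subset, 4 = Least Queue Depth, and 5 = Weighted Paths.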
With the implicit ALUA style, the host multipathing software can monitor the path states but cannot change them, either automatically or manually. Of the active paths, a path can be specified as preferred (optimized in T10) or non-preferred (non-optimized). If there are active preferred paths, only those paths receive commands, and the commands are load balanced evenly across them. If there are no active preferred paths, the active non-preferred paths are used in a round-robin fashion. If there are no active non-preferred paths, the LUN cannot be accessed until the controller activates its standby paths.

Tip: Generally, use ALUA on hosts that support it. Verify that a host supports ALUA before implementing it, because otherwise a cluster failover might result in system interruption or data loss. All N series LUNs presented to an individual host must have ALUA enabled because the host's MPIO software expects ALUA to be consistent for all LUNs from the same vendor.

Traditionally, you had to manually identify and select the optimal paths for I/O. Utilities such as dotpaths for AIX are used to set path priorities in environments where ALUA is not supported. With ALUA, the administrator of the host computer does not need to manually intervene in path management; it is handled automatically. Running MPIO on the host is still required, but no additional host-specific plug-ins are required. This process allows the host to maximize I/O by using the optimal path consistently and automatically.

ALUA has the following limitations:
- ALUA can be enabled only on FCP initiator groups.
- ALUA is not available on non-clustered storage systems for FCP initiator groups.
- ALUA is not supported for iSCSI initiator groups.

To enable ALUA on existing non-ALUA LUNs, perform these steps:
1. Validate that the host OS, the multipathing software, and the storage controller software support ALUA. For example, ALUA is not supported for VMware ESX until vSphere 4.0. Check with the host OS vendor for supportability.
2. Check the host system for any script that might be managing the paths automatically, and disable it.
3. If you are using SnapDrive, verify that there are no settings in the configuration file that disable ALUA.

ALUA is enabled or disabled on the igroup that is mapped to a LUN on the N series controller. The default ALUA setting in Data ONTAP varies by version and by igroup type. Check the output of the igroup show -v <igroup name> command to confirm the setting. Enabling ALUA on the igroup activates ALUA, as shown in the sketch that follows.
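A minimal console sketch of this check on a Data ONTAP 7-mode controller follows; the igroup name aix_host1 is an example only:

igroup show -v aix_host1        # the ALUA line reports the current setting
igroup set aix_host1 alua yes   # enable ALUA for this igroup
igroup show -v aix_host1        # confirm that ALUA is now set to Yes

The host might need to rediscover its paths (for example, by rescanning or rebooting) before the changed ALUA state becomes visible.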
Part 4
Performing upgrades
This part addresses the design and operational considerations for nondisruptive upgrades on the N series platform. It also provides some high-level example procedures for common hardware and software upgrades. This part contains the following chapters:
- System NDU
- Hardware upgrades
Chapter 20.
System NDU
a. Customers upgrading from Data ONTAP 7.3.2 can perform a major version NDU to the Data ONTAP 8.0 and 8.1 releases. This is an exception to the guidelines for major version NDU. Customers running Data ONTAP 7.3 or 7.3.1 must first perform a minor version NDU to 7.3.2 before upgrading to 8.1.
Generally, regardless of the system limits, run a system with processor and disk performance utilization no greater than 50% per storage controller.
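Utilization can be sampled at the console with the sysstat command before an upgrade window is planned. A short sketch:

sysstat -x 1    # extended statistics at one-second intervals

Watch the CPU and Disk util columns during peak load; sustained values above roughly 50% suggest rescheduling the NDU to a quieter period.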
Table 20-3 Maximum number of FlexVols for NDU

                   Minor version NDU release family     Major version NDU release family
Platform           7.2      7.3      8.0 / 8.1          7.3      8.0 / 8.1
N3300 (see note)   100      150      N/A                150      N/A
N3400              N/A      200      200                200      200
N3600              100      150      N/A                150      N/A
N5300              150      150      500                150      500
N6040              150      150      500                150      500
N6060              200      300      500                200      500
N5600              250      300      500                300      500
N6070              250      300      500                300      500
N6210              N/A      300      500                300      500
N6240              N/A      300      500                300      500
N6270              N/A      300      500                300      500
N7600              250      300      500                300      500
N7700              250      300      500                300      500
N7800              250      300      500                300      500
N7900              250      300      500                300      500
N7550              N/A      N/A      500                300      500
N7750              N/A      N/A      500                300      500
N7950              N/A      N/A      500                300      500
Restriction: Major NDU from Data ONTAP 7.2.2L1 to 7.3.1 is not supported on IBM N3300 systems that contain aggregates larger than 8 TB. Therefore, a disruptive upgrade is required. Aggregates larger than 8 TB also prevent the system from running a minor version NDU from Data ONTAP 7.2.2L1 to 7.2.x.
The maximum FlexVol volume limit of 500 per controller matches the native Data ONTAP FlexVol volume limit. Fields that contain N/A in the 8.0 / 8.1 columns indicate platforms that are not supported by Data ONTAP 8.0.
Table 20-4 shows the maximum number of dense volumes, snapshot copies, LUNs, and vFiler units that are supported for NDU.
Table 20-4 Maximum limits for NDU

Data ONTAP        Dense      Snapshot copies        LUNs     vFiler units
                  volumes    FC, SAS     SATA                FC, SAS   SATA
7.3.x to 8.0      500        12,000      4,000      2,000    64        5
7.3.x to 8.0.1    500        12,000      4,000      2,000    64        5
7.3.x to 8.1      500        12,000      4,000      2,000    64        5
8.0 to 8.0.x      500        20,000      20,000     2,000    64        5
8.0.x to 8.1      500        20,000      20,000     2,000    64        5
8.1 to 8.1.x      500        20,000      20,000     2,000    64        5
20.1.6 Steps for major version upgrades NDU in NAS and SAN environments
The procedural documentation for running an NDU is in the product documentation on the IBM Support site. See the Upgrade and Revert Guide of the product documentation for the destination release of the planned upgrade. For example, when doing an NDU from Data ONTAP 7.3.3 to 8.1, see the Data ONTAP 8.1 7-Mode Upgrade and Revert/Downgrade Guide at:
http://www.ibm.com/support/docview.wss?uid=ssg1S7003776
a. AT-FCX modules incur two 70-second pauses in I/O for all storage (Fibre Channel, SATA) attached to the system. AT-FCX NDU functions are available with the release of Data ONTAP 7.3.2 when using AT-FCX firmware version 37 or later.
b. IOM (SAS) modules in an N3000 incur two 40-second pauses in I/O if running firmware versions before 5.0 for all storage (SAS, Fibre Channel, or SATA) attached to the system. For firmware version 5.0 and later, the pauses in I/O are greatly reduced but not completely eliminated.
Normal approach
The storage download shelf process requires 5 minutes to download the code to all A shelf modules. During this time, I/O is allowed to occur. When the download completes, all A shelf modules are rebooted. This process incurs up to a 70-second disruption in I/O for the shelf on both controller modules (when running a firmware version before version 37). This disruption affects data access to the shelves regardless of whether multipath is configured. When the upgrade of the A shelf modules completes, the process repeats for all B modules. It takes 5 minutes to download the code (nondisruptively), followed by up to a 70-second disruption in I/O. The entire operation incurs two separate pauses of up to 70 seconds in I/O to all attached storage, including Fibre Channel if present in the system. Systems employing multipath HA or
SyncMirror are also affected. The storage download shelf command is issued only once to perform both A and B shelf module upgrades.
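A hedged console sketch of the normal approach follows; review the current module firmware before starting the download:

sysconfig -v             # lists each shelf with its module type and firmware version
storage download shelf   # downloads firmware to the A modules, then the B modules

Plan to issue the command in a low-activity window because of the two I/O pauses described above.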
Alternative approach
If your system is configured as multipath HA, the loss of either A or B loops does not affect the ability to serve data. Therefore, by employing another (spare) storage controller, you can upgrade all your AT-FCX modules out of band. You remove them from your production system and put them in your spare system to conduct the upgrade there. The pause in I/O then occurs on the spare (nonproduction) storage controller rather than on the production system. This approach does not eliminate the risk of latent shelf module failure on the systems in which modules are being swapped in. It also has no effect on the risk of running different shelf controller firmware, even if only for a short time.
The underlying feature that enables disk firmware NDU, called momentary disk offline, is provided by the option raid.background_disk_fw_update.enable. This option is set to On (enabled) by default. Momentary disk offline is also used as a resiliency feature as part of the error recovery process for abnormally slow or nonresponsive disk drives. Services and data continue to be available throughout the disk firmware upgrade process. Beginning with Data ONTAP 8.0.2, all drives that are members of RAID-DP or RAID 4 aggregates are upgraded nondisruptively in the background. Still, upgrade all disk firmware before doing a Data ONTAP NDU.
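A short console sketch for checking and, if needed, manually driving disk firmware updates before an NDU (exact behavior varies by release):

options raid.background_disk_fw_update.enable   # verify that the option is set to on
sysconfig -v                                    # shows the firmware revision of every disk
disk_fw_update                                  # manually update disk firmware on all disks

On releases without momentary disk offline, a manual disk_fw_update can interrupt I/O, so schedule it outside production hours.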
Chapter 21.
Hardware and software upgrades
Attention: The high-level procedures described in this section are of a generic nature. They are not intended to be your only guide to performing a software upgrade. For more information about procedures that are specific to your environment, see the IBM support site.
Before performing the storage controller NDU, perform the following steps:
1. Validate the high-availability controller configuration.
2. Remove all failed disks so that giveback operations can succeed.
3. Upgrade the disk and shelf firmware.
4. Verify that system loads are within the acceptable range. The load should be less than 50% on each system.
Table 21-1 shows the supported NDU upgrade paths.
Table 21-1 Supported high-availability configuration upgrade paths

Source    Release    Upgrade    Revert    NDU
7.2.x     7-mode     yes        yes       no
7.3.x     7-mode     yes        yes       yes
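A pre-upgrade validation sketch from the 7-mode console, matching the steps above:

cf status        # confirm that controller failover is enabled and healthy
vol status -f    # list failed or broken disks; remove them before the NDU
aggr status      # confirm that all aggregates are online
sysstat -x 1     # confirm that the load is below roughly 50% on each node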
System requirements
Generally, DOT 8 requires 64-bit hardware. Older 32-bit hardware is not supported. At the time of writing, the following systems and hardware are supported:
- N series: N7900, N7700, N6070, N6060, N6040, N5600, N5300, N3040
- Performance Acceleration Module (PAM) cards
Revert considerations
The N series does not support NDU for the revert process for DOT 8 7-mode. The following restrictions apply to the revert process:
- User data is temporarily offline and unavailable during the revert. Plan when the data is offline to limit the unavailability window and make it fall within the timeout window for the host attach kits.
- You must disable DOT 8.x 7-mode features before reverting.
- 64-bit aggregates and 64-bit volumes cannot be reverted. Therefore, the data must be migrated.
- You cannot revert while an upgrade is in progress.
- The revert_to command reminds you of the features that must be disabled to complete the reversion.
- FlexVols must be online during the reversion.
- Space guarantees should be checked after the reversion.
- You must delete any Snapshots made on Data ONTAP 8.0.
- You must reinitialize all SnapVault relationships after the revert because all snapshots associated with Data ONTAP 8.0 are deleted.
- SnapMirror sources must be reverted before SnapMirror destinations are reverted.
A revert cannot be nondisruptive, so plan for system downtime. Example 21-1 shows details of the revert_to command.
Example 21-1 revert_to command
TUCSON1> revert_to usage: revert_to [-f] 7.2 (for 7.2 and 7.2.x) revert_to [-f] 7.3 (for 7.3 and 7.3.x) -f Attempt to force revert. TUCSON1> You cannot revert while the upgrade is still in progress. Use the command shown in Example 21-2 to check for upgrade processes that are still running.
Example 21-2 WAFL scan status
TUCSON1> priv set advanced Warning: These advanced commands are potentially dangerous; use them only when directed to do so by IBM personnel. TUCSON1*> wafl scan status Volume vol0: Scan id Type of scan progress 1 active bitmap rearrangement fbn 454 of 1494 w/ max_chain_len 7 ...
Example 21-3 shows output from the revert process. First, all 64-bit aggregates were removed, all snapshots were deleted for all volumes and aggregates (the snap delete command in Example 21-3), and snapshot schedules were disabled. SnapMirror also had to be disabled. Then the software update command was issued. Finally, the revert_to command was issued. The system rebooted to the firmware-level prompt. You can then perform a netboot or use the autoboot command.
Example 21-3 The revert process
TUCSON1> snapmirror off ... TUCSON1> snap delete -A -a aggr0 ... TUCSON1> software list 727_setup_q.exe 732_setup_q.exe 8.0RC3_q_image.zip TUCSON1> software update 732_setup_q.exe ... TUCSON1> revert_to 7.3 ... autoboot ... TUCSON1> version Data ONTAP Release 7.3.2: Thu Oct 15 04:39:55 PDT 2009 (IBM) TUCSON1> You can use the netboot option for a fresh installation of the storage system. This installation boots from a Data ONTAP version stored on a remote HTTP or Trivial File Transfer Protocol (TFTP) server. Prerequisites: This procedure assumes that the hardware is functional, and includes a 1 GB CompactFlash card, an RLM card, and a network interface card. Perform the following steps for a netboot installation: 1. Upgrade BIOS if necessary: ifconfig e0c -addr=10.10.123.??? -mask=255.255.255.0 -gw=10.10.123.1 ping 10.10.123.45 flash tftp://10.10.123.45/folder.(system_type).flash 2. Enter one of the following commands at the boot environment prompt: If you are configuring DHCP, enter: ifconfig e0a -auto If you are configuring manual connections, enter: ifconfig e0a -addr=filer_addr -mask=netmask -gw=gateway -dns=dns_addr -domain=dns_domain where: filer_addr is the IP address of the storage system. netmask is the network mask of the storage system. gateway is the gateway for the storage system. dns_addr is the IP address of a name server on your network.
dns_domain is the Domain Name System (DNS) domain name. If you use this optional parameter, you do not need a fully qualified domain name in the netboot server URL. You need only the server's host name.
3. Set up the boot environment:
set-defaults
setenv ONTAP_NG true
setenv ntap.rlm.gdb 1
setenv ntap.init.usebootp false
setenv ntap.mgwd.autoconf.disable true
Depending on whether the system is an N6xxx or N7xxx, set the BSD port name to e0c for now. You can set it back to e1a later:
setenv ntap.bsdportname e0f
setenv ntap.bsdportname e0c (a new variable for BR may be needed)
setenv ntap.givebsdmgmtport true #before installing build
setenv ntap.givebsdmgmtport false #after installing build
For Cluster-Mode only:
setenv ntap.init.boot_clustered true
ifconfig e0c -addr=10.10.123.??? -mask=255.255.255.0 -gw=10.10.123.1
ping 10.10.123.45
4. Netboot from the loader prompt:
netboot http://10.10.123.45/home/bootimage/kernel
5. Enter the NFS root path:
10.10.123.45/vol/home/web/bootimage/rootfs.img
The NFS root path is the IP address of an NFS server followed by the export path.
6. Press Ctrl+C to display the Boot menu.
7. Select Software Install (option 7).
8. Enter the URL to install the image:
http://10.10.123.45/bootimage/image.tgz
Tip: The provided URLs are examples only. Replace them with the URLs for your environment.
Update example
The test environment was composed of two N6070 storage systems, each with a dedicated EXN4000 shelf. An upgrade is performed from DOT 7.3.7 to DOT 8.1 7-mode. If a clean installation is required, DOT 8.1 7-mode also supports the netboot process. First, review the current system configuration by using the sysconfig -a command. The output is displayed in Example 21-4.
Example 21-4 sysconfig command
N6070A> sysconfig -a
Data ONTAP Release 7.3.7: Thu May 3 04:32:51 PDT 2012 (IBM)
System ID: 0151696979 (N6070A); partner ID: 0151697146 (N6070B)
System Serial Number: 2858133001611 (N6070A)
System Rev: A1
System Storage Configuration: Multi-Path HA
System ACP Connectivity: NA
slot 0: System Board 2.6 GHz (System Board XV A1)
Model Name: N6070
Machine Type: IBM-2858-A21
Part Number: 110-00119
Revision: A1
Serial Number: 702035
BIOS version: 4.4.0
Loader version: 1.8
Agent FW version: 3
Processors: 4
Processor ID: 0x40f13
Microcode Version: 0x0
Processor type: Opteron
Memory Size: 16384 MB
Memory Attributes: Node Interleaving, Bank Interleaving, Hoisting, Chipkill ECC
CMOS RAM Status: OK
Controller: A
Remote LAN Module Status: Online
To verify the existing firmware level, use the version -b command as shown in Example 21-5.
Example 21-5 version command
n5500-ctr-tic-1> version -b
1:/x86_elf/kernel/primary.krn: OS 7.3.7
1:/backup/x86_elf/kernel/primary.krn: OS 7.3.6P5
1:/x86_elf/diag/diag.krn: 5.6.1
1:/x86_elf/firmware/deux/firmware.img: Firmware 3.1.0
1:/x86_elf/firmware/SB_XIV/firmware.img: BIOS/NABL Firmware 3.0
1:/x86_elf/firmware/SB_XIV/bmc.img: BMC Firmware 1.3
1:/x86_elf/firmware/SB_XVII/firmware.img: BIOS/NABL Firmware 6.1
1:/x86_elf/firmware/SB_XVII/bmc.img: BMC Firmware 1.3

You can also use the license command to verify what software is licensed on the system. The output of the license command is not shown here for confidentiality reasons. Next, review all necessary documentation, including the Data ONTAP Upgrade Guide and Data ONTAP Release Notes for the destination version of Data ONTAP. You can obtain these documents from the IBM support website at:
http://www.ibm.com/storage/support/nas
The directory /etc/software hosts installable ONTAP releases (Figure 21-1). The installation images have been copied from a Windows client by using the administrative share \\filer_ip\c$.
Starting with DOT 8, software images end with .zip and are no longer .exe or .tar files. The software command must be used to install or upgrade DOT 8 versions. At the time of this writing, only DOT 8.1 7-mode was available, so all tasks were performed with this software version.

Generally, use the software command. Perform the following steps (a console sketch follows):
1. Use software get to obtain the Data ONTAP code from an HTTP server. A simple freeware HTTP server is sufficient for smaller environments.
2. Run software list to verify that the code downloaded correctly.
3. Run the software install command with your selected code level.
4. Run the download command.
5. Run reboot to finalize your upgrade.
When the system reboots, press Ctrl+C if you need to access the first boot menu.

Requirement: The boot loader must be upgraded. Otherwise, Data ONTAP 8 will not load and the previously installed version will continue to boot. Upgrade the boot loader of the system by using the update_flash command as shown in Figure 21-2 on page 287.

Attention: Ensure that all firmware is up to date. If you are experiencing long boot times, you can disable the automatic update of disk firmware before downloading Data ONTAP by using the following command:
options raid.background_disk_fw_update.enable off
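A console sketch of this sequence follows; the HTTP server address and image file name are examples only:

software get http://10.10.123.45/81_q_image.zip   # step 1: fetch the image
software list                                     # step 2: verify the download
software install 81_q_image.zip                   # step 3: install the code level
download                                          # step 4: write the boot image
reboot                                            # step 5: activate the new release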
Next, use the autoboot command and perform another reboot if DOT 8 did not load immediately after the flash update. After the boot process is complete, verify the version by using the version and sysconfig commands as shown in Example 21-6.
Example 21-6 Version and sysconfig post upgrade
N6070A> version Data ONTAP Release 8.1 7-Mode: Wed Apr 25 23:47:02 PDT 2012 N6070A> sysconfig Data ONTAP Release 8.1 7-Mode: Wed Apr 25 23:47:02 PDT 2012 System ID: 0151696979 (N6070A); partner ID: 0151697146 (N6070B) System Serial Number: 2858133001611 (N6070A) System Rev: A1 System Storage Configuration: Multi-Path HA System ACP Connectivity: NA slot 0: System Board Processors: 4 Processor type: Opteron Memory Size: 16384 MB Memory Attributes: Node Interleaving Bank Interleaving Hoisting Chipkill ECC Controller: A Remote LAN Module Status: Online ....
Part 5
Appendixes
Appendix A.
Getting started
This appendix provides information to help you document, install, and set up your IBM System Storage N series storage system. This appendix includes the following sections:
- Preinstallation planning
- Start with the hardware
- Power on N series
- Data ONTAP update
- Obtaining the Data ONTAP software from the IBM NAS website
- Installing Data ONTAP system files
- Downloading Data ONTAP to the storage system
- Setting up the network using console
- Changing the IP address
- Setting up the DNS
Preinstallation planning
Successful installation of the IBM System Storage N series storage system requires careful planning. This section provides information about this preparation.
Collecting documents
N series product documentation is available at:
https://www-947.ibm.com/support/entry/myportal/overview/hardware/system_storage/network_attached_storage_(nas)/
Collect all documents needed for installing new storage systems:
1. N series information requires unregistered users to complete the one-time registration and then log in to the site using their registered IBM Identity with each visit. Detailed step-by-step instructions for N series registration can be downloaded from:
http://www-304.ibm.com/support/docview.wss?uid=ssg1S7003278
2. Prepare the site and requirements of your system. For information about planning for the physical environment where the equipment will operate, see IBM System Storage N series Introduction and Planning Guide, GA32-0543. This planning step includes the physical space, electrical, temperature, humidity, altitude, air flow, service clearance, and similar requirements. Also check this document for rack, power supply, power, and thermal considerations.
3. Use the hardware guides to install the N series storage system:
- Installation and Setup instructions for N series storage system, GC26-7784
- Hardware and Service Guide for N series storage system, GC26-7785
There are separate cabling instructions for single-node and Active/Active configurations.
Further reading: For more information about clustering, see the Cluster Installation and Administration Guide or the Active/Active Configuration Guide, GC26-7964, for your version of Data ONTAP.
4. For information about how to set up N series Data ONTAP, see IBM System Storage N series Data ONTAP Software Setup Guide, GC27-2206. This document describes how to set up and configure new storage systems that run Data ONTAP software. To ensure interoperability of third-party hardware, software, and the N series storage system, see the appropriate Interoperability Matrix at:
http://www-304.ibm.com/support/docview.wss?uid=ssg1S7003897
- Link names (physical interface names such as e0, e0a, e5a, or e9b)
- Number of links (number of physical interfaces to include in the vif)
- Name of virtual interface (name of vif, such as vif0)
Ethernet interfaces
- IP address
- Subnet mask
- Partner IP address: If your storage system is licensed for controller takeover, record the interface name or IP address of the partner that this interface takes over during an Active/Active configuration takeover.
- Media type (network type): 100tx-fd, tp-fd, 100tx, tp, or auto (10/100/1000)
- Are jumbo frames supported? The default is set to no for most installations.
- MTU size for jumbo frames
- Flow control (none, receive, send, full): The default is set to full.
- IP address
- Subnet mask
- Partner IP address
- Flow control (none, receive, send, full): The default is set to full. The takeover default is set to no for most installations.
Would you like to continue setup through the web interface? You do this through the Setup Wizard.
DNS:
- Domain name
- Server address 1, 2, 3
NIS:
- Domain name
- Server address 1, 2, 3
Customer contact:
- Primary: name, phone, alternate phone, email address or IBM Web ID
- Secondary: name, phone, alternate phone, email address or IBM Web ID
Machine location
- Business name
- Address
- City
- State
- Country code (value must be two uppercase letters)
- Postal code
CIFS
- Multiprotocol or NTFS-only storage system
- Should CIFS create default /etc/passwd and /etc/group files? Enter y here if you have a multiprotocol environment. Default UNIX accounts are created that are used when running user mapping. As an alternative to storing this information in a local file, the generic user accounts can be stored in the NIS or LDAP databases. If generic accounts are stored in the local passwd file, mapping of a Windows user to a generic UNIX user, and of a generic UNIX user to a Windows user, works better than when generic accounts are stored in NIS or LDAP.
- NIS group caching: NIS group caching is used when access is requested to data with UNIX security style. UNIX file and directory permissions of rwxrwxrwx are used to determine access for both Windows and UNIX clients. This security style uses UNIX group information.
  - Enable?
  - Hours to update the cache
User authentication style:
1. Active Directory authentication (Active Directory domains only)
2. Windows NT 4 authentication (Windows NT or Active Directory domains only)
3. Windows workgroup authentication using the storage system's local user accounts
4. /etc/passwd or NIS/LDAP authentication
Windows Active Directory domain:
- Windows domain name
- Time server names or IP addresses
- Windows user name
- Windows user password
- Local administrator name
- Local administrator password
- CIFS administrator or group: You can specify an additional user or group to be added to the storage system's local BUILTIN\Administrators group, giving them administrative privileges as well.
To connect an ASCII terminal console to the N series system, perform these steps:
1. Set the communications parameters of your system as shown in Table A-2. For example, Windows users can use HyperTerminal or PuTTY, and Linux users can use a terminal program such as minicom.
Table A-2 Communication parameters

Parameter      Setting
Baud           9600
Data bit       8
Parity         None
Stop bits      1
Flow control   None
Tip: See your terminal documentation for information about changing your ASCII console terminal settings.
2. Connect the DB-9 null modem cable to the DB-9 to RJ-45 adapter cable.
3. Connect the RJ-45 end to the console port on the N series system and the other end to the ASCII terminal.
4. Connect to the ASCII terminal console.
Power on N series
After you connect all power cords to the power sources, perform these steps:
1. Sequentially power on the N series systems:
a. Turn on the power to only the expansion units, making sure that you turn them on within 5 minutes of each other.
b. Turn on the N series storage systems.
2. Initialize Data ONTAP. Perform this step only if you want to format all disks on a filer and reinstall Data ONTAP. This step can also be used to troubleshoot when a newly purchased storage system cannot find a root volume (vol0) when trying to boot. Otherwise, skip this step and continue to step 3.
Attention: This procedure removes all data from all disks.
a. Turn on the system.
b. The system begins to boot. At the storage system prompt, enter the following command:
halt
The storage system console then displays the boot environment prompt. The boot environment prompt can be CFE> or LOADER>, depending on your storage system. See Example A-1.
Example A-1 N series halt
n3300a> halt
CIFS local server is shutting down...
CIFS local server has shut down...
Wed May 2 03:00:13 GMT [n3300a: kern.shutdown:notice]: System shut down because : "halt".
AMI BIOS8 Modular BIOS
Copyright (C) 1985-2006, American Megatrends, Inc. All Rights Reserved
Portions Copyright (C) 2006 Network Appliance, Inc. All Rights Reserved
BIOS Version 3.0X11
................
Boot Loader version 1.3
Copyright (C) 2000,2001,2002,2003 Broadcom Corporation.
Portions Copyright (C) 2002-2006 Network Appliance Inc.
CPU Type: Mobile Intel(R) Celeron(R) CPU 2.20GHz
LOADER>
c. When the message Press Ctrl C for special menu is displayed, press Ctrl+C to access the special boot menu. See Example A-2.
Example A-2 Boot menu LOADER> boot_ontap Loading:...............0x200000/33342524 0x21cc43c/31409732 0x3fc0a80/2557763 0x42311c3/5 Entry at 0x00200000 Starting program at 0x00200000 cpuid 0x80000000: 0x80000004 0x0 0x0 0x0 Press CTRL-C for special boot menu Special boot options menu will be available. Wed May 2 03:01:27 GMT [fci.initialization.failed:error]: Initialization failed on Fibre Channel adapter 0a. Wed May 2 03:01:27 GMT [fci.initialization.failed:error]: Initialization failed on Fibre Channel adapter 0b. Data ONTAP Release 7.2.4L1: Wed Nov 21 06:07:37 PST 2007 (IBM) Copyright (c) 1992-2007 Network Appliance, Inc. Starting boot on Wed May 2 03:01:12 GMT 2007 Wed May 2 03:01:28 GMT [nvram.battery.turned.on:info]: The NVRAM battery is turned ON. It is turned OFF during system shutdown. Wed May 2 03:01:31 GMT [diskown.isEnabled:info]: software ownership has been enabled for this system
d. At the special boot menu, choose either option 4 or option 4a. Option 4 creates a RAID 4 traditional volume. Option 4a creates a RAID-DP aggregate with a root FlexVol. The size of the root FlexVol is dependent upon the platform type. See Example A-3.
Example A-3 Special boot menu
(1) Normal boot.
(2) Boot without /etc/rc.
(3) Change password.
(4) Initialize owned disks (6 disks are owned by this filer).
(4a) Same as option 4, but create a flexible root volume.
(5) Maintenance mode boot.

Selection (1-5)? 4
e. Answer Y to the next two displayed prompts to zero your disks. See Example A-4.
Example A-4 Initializing disks Zero disks and install a new file system? y This will erase all the data on the disks, are you sure? y Zeroing disks takes about 45 minutes. Wed May 2 03:01:47 GMT [coredump.spare.none:info]: No sparecore disk was found. .................................................................................... .................................................................................... ........
Attention: Zeroing disks can take 40 minutes or more to complete. Do not turn off power to the system or interrupt the zeroing process. f. After the disks are zeroed, the system begins to boot. It stops at the first installation question, which is displayed on the console windows: Please enter the new hostname [ ]: See Example A-5.
Example A-5 Initialize complete
Wed May 2 03:32:00 GMT [raid.disk.zero.done:notice]: Disk 0c.00.7 Shelf ? Bay ? [NETAPP X286_S15K5146A15 NQ06] S/N [3LN11RGT0000974325E5] : disk zeroing complete
Wed May 2 03:32:01 GMT [raid.disk.zero.done:notice]: Disk 0c.00.8 Shelf ? Bay ? [NETAPP X286_S15K5146A15 NQ06] S/N [3LN1322S0000974208ZC] : disk zeroing complete
Wed May 2 03:32:02 GMT [raid.disk.zero.done:notice]: Disk 0c.00.1 Shelf ? Bay ? [NETAPP X286_S15K5146A15 NQ06] S/N [3LN11G4G00009742TXB2] : disk zeroing complete
Wed May 2 03:32:02 GMT [raid.disk.zero.done:notice]: Disk 0c.00.9 Shelf ? Bay ? [NETAPP X286_S15K5146A15 NQ06] S/N [3LN11RCB00009742TX02] : disk zeroing complete
Wed May 2 03:32:09 GMT [raid.disk.zero.done:notice]: Disk 0c.00.10 Shelf ? Bay ? [NETAPP X286_S15K5146A15 NQ06] S/N [3LN1321A0000974209ZM] : disk zeroing complete
Wed May 2 03:32:10 GMT [raid.disk.zero.done:notice]: Disk 0c.00.11 Shelf ? Bay ? [NETAPP X286_S15K5146A15 NQ06] S/N [3LN120QE00009742TT87] : disk zeroing complete
Wed May 2 03:32:11 GMT [raid.vol.disk.add.done:notice]: Addition of Disk /vol0/plex0/rg0/0c.00.7 Shelf 0 Bay 7 [NETAPP X286_S15K5146A15 NQ06] S/N [3LN11RGT0000974325E5] to volume vol0 has completed successfully
Wed May 2 03:32:11 GMT [raid.vol.disk.add.done:notice]: Addition of Disk /vol0/plex0/rg0/0c.00.1 Shelf 0 Bay 1 [NETAPP X286_S15K5146A15 NQ06] S/N [3LN11G4G00009742TXB2] to volume vol0 has completed successfully
Wed May 2 03:32:11 GMT [wafl.vol.add:notice]: Volume vol0 has been added to the system.
g. Complete the initial setup. See Example A-6.
h. Install the full operating system. FilerView can be used after the full operating system is installed. The full installation procedure is similar to the Data ONTAP update procedure. For more information, see Data ONTAP update on page 301.
3. The system begins to boot. Complete the initial setup by answering all the installation questions as in the initial worksheet. For more information, see the IBM System Storage Data ONTAP Software Setup Guide, GA32-0530. See Example A-6 for the N3300 setup.
Example A-6 Setup Please enter the new hostname []: n3000a Do you want to configure virtual network interfaces? [n]: Please enter the IP address for Network Interface e0a []: 9.11.218.246 Please enter the netmask for Network Interface e0a [255.0.0.0]: 255.255.255.0 Should interface e0a take over a partner IP address during failover? [n]: Please enter media type for e0a {100tx-fd, tp-fd, 100tx, tp, auto (10/100/1000)} [auto]: Please enter flow control for e0a {none, receive, send, full} [full]: Do you want e0a to support jumbo frames? [n]: Please enter the IP address for Network Interface e0b []: Should interface e0b take over a partner IP address during failover? [n]: Would you like to continue setup through the web interface? [n]: Please enter the name or IP address of the default gateway: 9.11.218.1 The administration host is given root access to the filer's /etc files for system administration. To allow /etc root access to all NFS clients enter RETURN below. Please enter the name or IP address of the administration host: Where is the filer located? []: Tucson Do you want to run DNS resolver? [n]: Do you want to run NIS client? [n]: This system will send event messages and weekly reports to IBM Technical Support. To disable this feature, enter "options autosupport.support.enable off" within 24 hours. Enabling Autosupport can significantly speed problem determination and resolution should a problem occur on your system.
2-character country code (Required) []: us Postal code where business resides []: The root volume currently contains 2 disks; you may add more disks to it later using the "vol add" or "aggr add" commands. Now apply the appropriate licenses to the system and install the system files (supplied on the Data ONTAP CD-ROM or downloaded from the NOW site) from a UNIX or Windows host. When you are finished, type "download" to install the boot image and "reboot" to start using the system. Thu May 3 05:33:10 GMT [n3300a: init_java:warning]: Java disabled: Missing /etc/java/rt131.jar. Thu May 3 05:33:10 GMT [dfu.firmwareUpToDate:info]: Firmware is up-to-date on all disk drives Thu May 3 05:33:13 GMT [n3300a: 10/100/1000/e0a:info]: Ethernet e0a: Link up add net default: gateway 9.11.218.1 Thu May 3 05:33:15 GMT [n3300a: httpd_servlet:warning]: Java Virtual Machine not accessible There are 4 spare disks; you may want to use the vol or aggr command to create new volumes or aggregates or add disks to the existing volume. Thu May 3 05:33:15 GMT [mgr.boot.disk_done:info]: Data ONTAP Release 7.2.5.1 boot complete. Last disk update written at Thu Jan 1 00:00:00 GMT 1970 Clustered failover is not licensed. Thu May 3 05:33:15 GMT [cf.fm.unexpectedAdapter:warning]: Warning: clustering is not licensed yet an interconnect adapter was found. NVRAM will be divided into two parts until adapter is removed Thu May 3 05:33:15 GMT [cf.fm.unexpectedPartner:warning]: Warning: clustering is not licensed yet the node once had a cluster partner Thu May 3 05:33:16 GMT [mgr.boot.reason_ok:notice]: System rebooted. Thu May 3 05:33:16 GMT [asup.config.minimal.unavailable:warning]: Minimal Autosupports unavailable. Could not read /etc/asup_content.conf n3300a> Thu May 3 05:33:18 GMT [n3300a: console_login_mgr:info]: root logged in from console
4. Add software licenses by entering the command: license add <license> See Example A-7.
Example A-7 Example NFS license n3300a> license add XXXXXXX n3300a> Wed May 3 23:19:30 GMT [rc:notice]: nfs licensed
5. Always consider updating the firmware and Data ONTAP to the preferred version. For more information, see Data ONTAP update on page 301.
6. Repeat these steps on the second controller for N series models A20 or A21.
Upgrading Data ONTAP software requires several prerequisites, installing system files, and downloading the software to the system CompactFlash. Required procedures might include the following steps:
- Update the system board firmware (system firmware). To determine whether your storage system needs a system firmware update, compare the version of installed system firmware with the latest version available.
- Update the disk firmware. When you update the storage system software, disk firmware is updated automatically as part of the storage system software update process. A manual update is not necessary unless the new firmware is not compatible with the storage system disks.
- Update the Data ONTAP kernel.
The latest system firmware is included with Data ONTAP update packages for CompactFlash-based storage systems. New disk firmware is sometimes included with Data ONTAP update packages. For more information, see the Data ONTAP Upgrade Guide at:
http://www.ibm.com/support/docview.wss?uid=ssg1S7001558
There are two methods to upgrade storage systems in an Active/Active configuration:
- Nondisruptive: The nondisruptive update method is appropriate when you need to maintain service availability during system updates. When you halt one node and allow takeover, the partner node continues to serve data for the halted node.
- Standard: The standard update method is appropriate when you can schedule downtime for system updates. Upgrading Data ONTAP for a single node always requires downtime.
Remember: Review the Data ONTAP Release Notes and IBM System Storage N series Data ONTAP Upgrade Guide for your version of Data ONTAP at:
http://www.ibm.com/support/docview.wss?uid=ssg1S7001558
Obtaining the Data ONTAP software from the IBM NAS website
To obtain Data ONTAP, perform these steps: 1. Log in to IBM Support using a registered user account at: https://www-947.ibm.com/support/entry/myportal/overview/hardware/system_storage /network_attached_storage_%28nas%29/n_series_software/data_ontap 2. Enter a search query for Data ONTAP under Search support and downloads.
3. Select the Data ONTAP version. 4. Select the installation kit that you want to download. Check and confirm the license agreement to start downloading the software.
b. Set up the CIFS to install Data ONTAP by entering the following command: cifs setup See Example A-9.
Example A-9 Basic CIFS setup n3300a*> cifs setup This process will enable CIFS access to the filer from a Windows(R) system. Use "?" for help at any prompt and Ctrl-C to exit without committing changes. Your filer does not have WINS configured and is visible only to clients on the same subnet. Do you want to make the system visible via WINS? [n]: A filer can be configured for multiprotocol access, or as an NTFS-only filer. Since NFS, DAFS, VLD, FCP, and iSCSI are not licensed on this filer, we recommend that you configure this filer as an NTFS-only filer (1) NTFS-only filer (2) Multiprotocol filer Selection (1-2)? [1]: 1 CIFS requires local /etc/passwd and /etc/group files and default files will be created. The default passwd file contains entries for 'root', 'pcuser', and 'nobody'. Enter the password for the root user []: Retype the password: The default name for this CIFS server is 'N3300A'. Would you like to change this name? [n]: Data ONTAP CIFS services support four styles of user authentication. Choose the one from the list below that best suits your situation. (1) Active Directory domain authentication (Active Directory domains only) (2) Windows NT 4 domain authentication (Windows NT or Active Directory domains) (3) Windows Workgroup authentication using the filer's local user accounts (4) /etc/passwd and/or NIS/LDAP authentication Selection (1-4)? [1]: 4 What is the name of the Workgroup? [WORKGROUP]: CIFS - Starting SMB protocol... Welcome to the WORKGROUP Windows(R) workgroup CIFS local server is running.
n3300a*> cif Wed May 2 04:25:30 GMT [nbt.nbns.registrationComplete:info]: NBT: All CIFS name registrations have completed for the local server.
c. Give share access for C$. This access needs to be set again later for security purposes. Use this command:
cifs access <share> <user|group> <rights>
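For example, to give the Windows administrator account full rights to the hidden administrative share (the user and rights shown are illustrative):

cifs access c$ administrator "Full Control"

Remember to tighten this access again after the installation files are copied.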
2. Map the system storage to a drive. You must log in as administrator or log in using an account that has full control on the storage system C$ directory. a. Click Tools Map Network Drive (Figure A-1).
c. Enter a user name and password to access the storage system (Figure A-3).
3. Run the Data ONTAP installer: a. Go to the drive to which you previously downloaded the software (see Obtaining the Data ONTAP software from the IBM NAS website on page 302). b. Double-click the files that you downloaded. A dialog box is displayed as shown in Figure A-5.
c. In the WinZip dialog box, enter the letter of the drive to which you mapped the storage system. For example, if you chose drive Y, replace DRIVE:\ETC with the following path: Y:\ETC See Figure A-6.
d. Ensure that the following check boxes are selected: Overwrite files without prompting When done unzipping open
Leave the options as they are. e. Click Unzip. A window displays the confirmation messages as files are extracted (Figure A-7).
2. Check whether your system requires a firmware update. At the console of each storage system, enter the following command to compare the installed version of system firmware with the version on the CompactFlash card. To display the version of your current system firmware: sysconfig -a See Example A-12.
Example A-12 sysconfig -a n3300a*> sysconfig -a Data ONTAP Release 7.2.5.1: Wed Jun 25 11:01:02 PDT 2008 (IBM) System ID: 0135018677 (n3300a); partner ID: 0135018673 (n3300b) System Serial Number: 2859138306700 (n3300a) System Rev: B0 slot 0: System Board 2198 MHz (System Board XIV D0) Model Name: N3300 Machine Type: IBM-2859-A20 Part Number: 110-00049 Revision: D0 Serial Number: 800949 BIOS version: 3.0 Processors: 1 Processor ID: 0xf29 Microcode Version: 0x2f Memory Size: 896 MB NVMEM Size: 128 MB of Main Memory Used CMOS RAM Status: OK Controller: B ...
4. Shut down the system by using the halt command. After the storage system shuts down, the firmware boot environment prompt is displayed (Example A-14).
Example A-14 Halting process n3300a*> halt CIFS local server is shutting down... waiting for CIFS shut down (^C aborts)... CIFS local server has shut down... Thu May 3 05:51:54 GMT [kern.shutdown:notice]: System shut down because : "halt". AMI BIOS8 Modular BIOS Copyright (C) 1985-2006, American Megatrends, Inc. All Rights Reserved Portions Copyright (C) 2006 Network Appliance, Inc. All Rights Reserved BIOS Version 3.0 ................ Boot Loader version 1.3 Copyright (C) 2000,2001,2002,2003 Broadcom Corporation. Portions Copyright (C) 2002-2006 Network Appliance Inc. CPU Type: Mobile Intel(R) Celeron(R) CPU 2.20GHz LOADER>
5. From the boot environment prompt, you can update your firmware by using the update_flash command.
6. At the firmware environment boot prompt, enter bye to reboot the system. The reboot uses the new software and, if applicable, the new firmware (Example A-15).
Example A-15 Rebooting the system LOADER> bye AMI BIOS8 Modular BIOS Copyright (C) 1985-2006, American Megatrends, Inc. All Rights Reserved Portions Copyright (C) 2006 Network Appliance, Inc. All Rights Reserved BIOS Version 3.0 ..................
Restriction: In Data ONTAP 7.2 and later, disk firmware updates for RAID 4 aggregates must complete before the new Data ONTAP version can finish booting. Storage system services are not available until the disk firmware update completes. 7. Check the /etc/messages and sysconfig -v outputs to verify that the updates were successful.
Prerequisite: You must be connected to the console to use this command. If you are connected by telnet, the connection will be terminated after running the ifconfig command. See Example A-17.
Example A-17 Changing network IP
n3300a> ifconfig e0a 9.11.218.147 netmask 255.255.255.0
n3300a> netstat -in
Name  Mtu   Network      Address        Ipkts  Ierrs  Opkts  Oerrs  Collis  Queue
e0a   1500  9.11.218/24  9.11.218.147   33k    0      13k    0      0       0
e0b*  1500  none         none           0      0      0      0      0       0
lo    8160  127          127.0.0.1      52     0      52     0      0       0
3. If you want this IP address to be persistent after the N series is rebooted, update the /etc/hosts for IP address changes in the associated interface. For netmask and other network parameters, update the /etc/rc. You can modify this file from the N series console, CIFS, or NFS. The example uses a CIFS connection to update these files. See Figure A-10.
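A sketch of the /etc/rc lines that make this configuration persistent, using the addresses from this example (exact contents vary by system):

hostname n3300a
ifconfig e0a 9.11.218.147 netmask 255.255.255.0 mediatype auto flowcontrol full
route add default 9.11.218.1 1
routed on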
2. Update/confirm the DNS domain name with the following commands: To display the current DNS domain name: options dns.domainname To update the DNS domain name (Example A-18): options dns.domainname <domain name>
Example A-18 Updating DNS domain name
#--- check the dns domainname ---
n3300a> options dns.domainname
dns.domainname               (value might be overwritten in takeover)
#--- update ---
n3300a> options dns.domainname itso.tucson.ibm.com
You are changing option dns.domainname which applies to both members of the cluster in takeover mode. This value must be the same in both cluster members prior to any takeover or giveback, or that next takeover/giveback may not work correctly. Sun May 6 03:41:01 GMT [n3300a: reg.options.cf.change:warning]: Option dns.domainname changed on one cluster node. n3300a> options dns.domainname dns.domainname itso.tucson.ibm.com (value might be overwritten in takeover)
3. Enable the DNS by using the following command, and check that it is enabled by using the dns info command (Example A-19):
options dns.enable on
Example A-19 Enabling DNS n3300a> dns info DNS is disabled n3300a> n3300a> n3300a> options dns.enable on Sun May 6 03:50:06 GMT [n3300a: reg.options.overrideRc:warning]: Setting option dns.enable to 'on' conflicts with /etc/rc that sets it to 'off'. ** Option dns.enable is being set to "on", but this conflicts ** with a line in /etc/rc that sets it to "off". ** Options are automatically persistent, but the line in /etc/rc ** will override this persistence, so if you want to make this change ** persistent, you will need to change (or remove) the line in /etc/rc. You are changing option dns.enable which applies to both members of the cluster in takeover mode. This value must be the same in both cluster members prior to any takeover or giveback, or that next takeover/giveback may not work correctly. Sun May 6 03:50:06 GMT [n3300a: reg.options.cf.change:warning]: Option dns.enable changed on one cluster node. n3300a> n3300a> n3300a> dns info DNS is enabled DNS caching is enabled 0 0 0 0 0 cache hits cache misses cache entries expired entries cache replacements
IP Address      State     Last Polled   Avg RTT   Calls   Errs
------------------------------------------------------------------
9.11.224.114    NO INFO                 0         0       0
9.11.224.130    NO INFO                 0         0       0
4. To make this change persistent after filer reboot, update the /etc/rc to ensure that the name server exists as shown in Figure A-13.
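For DNS, persistence is split between two files. A sketch using the name servers shown in Example A-19 (the addresses are from this environment only):

In /etc/rc:
options dns.enable on

In /etc/resolv.conf:
nameserver 9.11.224.114
nameserver 9.11.224.130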
Appendix B.
Operating environment
This appendix provides information about the physical and operating environment specifications of the N series controllers and disk shelves. This appendix includes the following sections:
- N3000 entry-level systems: N3400, N3220, N3240
- N6000 mid-range systems: N6210, N6240, N6270
- N7000 high-end systems: N7950T
- N series expansion shelves: EXN1000, EXN3000, EXN3500, EXN4000
N3400
Physical Specifications, IBM System Storage N3400:
- Width: 446 mm (17.6 in)
- Depth: 569 mm (22.4 in)
- Height: 88.5 mm (3.5 in)
- Weight: 19.5 kg (43.0 lb) for model A11; 21.5 kg (47.4 lb) for model A21. Add 0.8 kg (1.8 lb) for each SAS drive and 0.65 kg (1.4 lb) for each SATA drive.
Operating Environment:
- Temperature: maximum range 10 - 40 degrees C (50 - 104 degrees F); recommended 20 - 25 degrees C (68 - 77 degrees F); non-operating -40 - 70 degrees C (-40 - 158 degrees F)
- Relative humidity: maximum operating range 20% - 80% (non-condensing); recommended operating range 40% - 55%; non-operating range 10% - 95% (non-condensing)
- Maximum wet bulb: 28 degrees C
- Maximum altitude: 3050 m (10,000 ft.)
Warning: Operating at extremes of environment can increase failure probability.
- Wet bulb (caloric value): 853 Btu/hr
- Maximum electrical power: 100-240 V ac, 10-4 A per node, 47-63 Hz
- Nominal electrical power: 100-120 V ac, 4 A; 200-240 V ac, 2 A, 50-60 Hz
- Noise level: 54 dBa @ 1 m @ 23 degrees C; 7.2 bels @ 1 m @ 23 degrees C
N3220
Physical Specifications, IBM System Storage N3220 Model A12/A22:
- Width: 44.7 cm (17.61 in.)
- Depth: 61.9 cm (24.4 in.) with cable management arms; 54.4 cm (21.4 in.) without cable management arms
- Height: 8.5 cm (3.4 in.)
- Weight: 25.4 kg (56 lb) (two controllers)
Operating Environment:
- Temperature: maximum range 10 - 40 degrees C (50 - 104 degrees F); recommended 20 - 25 degrees C (68 - 77 degrees F); non-operating -40 - 70 degrees C (-40 - 158 degrees F)
- Relative humidity: maximum operating range 20% - 80% (non-condensing); recommended operating range 40% - 55%; non-operating range 10% - 95% (non-condensing)
- Maximum wet bulb: 29 degrees C
- Maximum altitude: 3050 m (10,000 ft.)
Warning: Operating at extremes of environment can increase failure probability.
- Wet bulb (caloric value): 2270 Btu/hr
- Maximum electrical power: 100-240 V ac, 8-3 A per node, 50-60 Hz
- Nominal electrical power: 100-120 V ac, 16 A; 200-240 V ac, 6 A, 50-60 Hz
- Noise level: 66 dBa @ 1 m @ 23 degrees C; 7.2 bels @ 1 m @ 23 degrees C
N3240
Physical Specifications, IBM System Storage N3240 Model A14/A24:
- Width: 44.9 cm (17.7 in.)
- Depth: 65.7 cm (25.8 in.) with cable management arms; 65.4 cm (25.7 in.) without cable management arms
- Height: 17.48 cm (6.88 in.)
- Weight: 45.4 kg (100 lb)
Operating Environment:
- Temperature: maximum range 10 - 40 degrees C (50 - 104 degrees F); recommended 20 - 25 degrees C (68 - 77 degrees F); non-operating -40 - 70 degrees C (-40 - 158 degrees F)
- Relative humidity: maximum operating range 20% - 80% (non-condensing); recommended operating range 40% - 55%; non-operating range 10% - 95% (non-condensing)
- Maximum wet bulb: 29 degrees C
- Maximum altitude: 3000 m (10,000 ft.)
Warning: Operating at extremes of environment can increase failure probability.
- Wet bulb (caloric value): 2270 Btu/hr
- Maximum electrical power: 100-240 V ac, 8-3 A per node, 50-60 Hz
- Nominal electrical power: 100-120 V ac, 16 A; 200-240 V ac, 6 A, 50-60 Hz
- Noise level: 66 dBa @ 1 m @ 23 degrees C; 7.2 bels @ 1 m @ 23 degrees C
N6210
Physical Specifications, IBM System Storage N6210 Models C10, C20, C21, E11, and E21:
- Width: 44.7 cm (17.6 in.)
- Depth: 71.3 cm (28.1 in.) with cable management arms; 65.5 cm (25.8 in.) without cable management arms
- Height: 13 cm (5.12 in.) (times 2 for E21)
Operating environment:
- Temperature: maximum range 10 - 40 degrees C (50 - 104 degrees F); recommended 20 - 25 degrees C (68 - 77 degrees F); non-operating -40 - 70 degrees C (-40 - 158 degrees F)
- Relative humidity: maximum operating range 20% - 80% (non-condensing); recommended operating range 40% - 55%; non-operating range 10% - 95% (non-condensing)
- Maximum wet bulb: 28 degrees C
- Maximum altitude: 3050 m (10,000 ft.)
Warning: Operating at extremes of environment can increase failure probability.
- Wet bulb (caloric value): 1553 Btu/hr
- Maximum electrical power: 100-240 V ac, 12-8 A per node, 50-60 Hz
- Nominal electrical power: 100-120 V ac, 4.7 A; 200-240 V ac, 2.3 A, 50-60 Hz
- Noise level: 55.5 dBa @ 1 m @ 23 degrees C; 7.5 bels @ 1 m @ 23 degrees C
N6240
Physical Specifications, IBM System Storage N6240 Models C10, C20, C21, E11, and E21:
- Width: 44.7 cm (17.6 in.)
- Depth: 71.3 cm (28.1 in.) with cable management arms; 65.5 cm (25.8 in.) without cable management arms
- Height: 13 cm (5.12 in.) (times 2 for E21)
Operating environment:
- Temperature: maximum range 10 - 40 degrees C (50 - 104 degrees F); recommended 20 - 25 degrees C (68 - 77 degrees F); non-operating -40 - 70 degrees C (-40 - 158 degrees F)
- Relative humidity: maximum operating range 20% - 80% (non-condensing); recommended operating range 40% - 55%; non-operating range 10% - 95% (non-condensing)
- Maximum wet bulb: 28 degrees C
- Maximum altitude: 3050 m (10,000 ft.)
Warning: Operating at extremes of environment can increase failure probability.
- Wet bulb (caloric value): 1553 Btu/hr
- Maximum electrical power: 100-240 V ac, 12-8 A per node, 50-60 Hz
- Nominal electrical power: 100-120 V ac, 4.7 A; 200-240 V ac, 2.3 A, 50-60 Hz
N6270
Physical Specifications, N6270 Models C22, E12, and E22:
- Width: 44.7 cm (17.6 in.)
- Depth: 71.3 cm (28.1 in.) with cable management arms; 64.6 cm (25.5 in.) without cable management arms
- Height: 13 cm (5.12 in.) (times 2 for E22)
Operating Environment:
- Temperature: maximum range 10 - 40 degrees C (50 - 104 degrees F); recommended 20 - 25 degrees C (68 - 77 degrees F); non-operating -40 - 70 degrees C (-40 - 158 degrees F)
- Relative humidity: maximum operating range 20% - 80% (non-condensing); recommended operating range 40% - 55%; non-operating range 10% - 95% (non-condensing)
- Maximum wet bulb: 28 degrees C
- Maximum altitude: 3050 m (10,000 ft.)
Warning: Operating at extremes of environment can increase failure probability.
- Wet bulb (caloric value): 1847 Btu/hr
- Maximum electrical power: 100-240 V ac, 12-8 A per node, 50-60 Hz
- Nominal electrical power: 100-120 V ac, 4.7 A; 200-240 V ac, 2.3 A, 50-60 Hz
- Noise level: 55.5 dBa @ 1 m @ 23 degrees C; 7.5 bels @ 1 m @ 23 degrees C
N7950T
Physical Specifications, IBM System Storage N7950T Model E22:
- Width: 44.7 cm (17.6 in.)
- Depth: 74.6 cm (29.4 in.) with cable management arms; 62.7 cm (24.7 in.) without cable management arms
- Height: 51.8 cm (20.4 in.)
- Weight: 117.2 kg (258.4 lb)
Operating Environment:
- Temperature: maximum range 10 - 40 degrees C (50 - 104 degrees F); recommended 20 - 25 degrees C (68 - 77 degrees F); non-operating -40 - 70 degrees C (-40 - 158 degrees F)
- Relative humidity: maximum operating range 20% - 80% (non-condensing); recommended operating range 40% - 55%; non-operating range 10% - 95% (non-condensing)
- Maximum wet bulb: 28 degrees C
- Maximum altitude: 3050 m (10,000 ft.)
Warning: Operating at extremes of environment can increase failure probability.
- Wet bulb (caloric value): 2270 Btu/hr
- Maximum electrical power: 100-240 V ac, 12-7.8 A per node, 50-60 Hz
- Nominal electrical power: 100-120 V ac, 6.9 A; 200-240 V ac, 3.5 A, 50-60 Hz
- Noise level: 66 dBa @ 1 m @ 23 degrees C; 8.1 bels @ 1 m @ 23 degrees C
EXN1000
Because the EXN1000 was withdrawn from the market and is no longer being sold, it is not covered in this book.
EXN3000
Physical Specifications, EXN3000 SAS/SATA expansion unit:
- Width: 448.7 mm (17.7 in)
- Depth: 653.5 mm (25.7 in)
- Height: 174.9 mm (6.9 in)
- Weight (minimum configuration): 24 kg (52.8 lb)
- Weight (maximum configuration): 44.6 kg (98.3 lb)
Operating Environment:
- Temperature: maximum range 10 - 40 degrees C (50 - 104 degrees F); recommended 20 - 25 degrees C (68 - 77 degrees F); non-operating -40 - 70 degrees C (-40 - 158 degrees F)
- Relative humidity: maximum operating range 20% - 80% (non-condensing); recommended operating range 40% - 55%; non-operating range 10% - 95% (non-condensing)
- Maximum wet bulb: 28 degrees C
- Maximum altitude: 3045 m (10,000 ft.)
Warning: Operating at extremes of environment can increase failure probability.
- Wet bulb (caloric value): 2,201 Btu/hr (fully loaded shelf, SAS drives); 1,542 Btu/hr (fully loaded shelf, SATA drives)
- Maximum electrical power: 100-240 V ac, 16-6 A (8-3 A max per inlet)
- Nominal electrical power: 100-120 V ac, 6 A; 200-240 V ac, 3 A, 50/60 Hz (SAS drives); 100-120 V ac, 4.4 A; 200-240 V ac, 2.1 A, 50/60 Hz (SATA drives)
- Noise level: 5.7 bels @ 1 m @ 23 degrees C (SATA drives) idle; 6.0 bels @ 1 m @ 23 degrees C (SAS drives) idle; 6.7 bels @ 1 m @ 23 degrees C (SATA drives) operating; 7.0 bels @ 1 m @ 23 degrees C (SAS drives) operating
EXN3500
Physical Specifications, EXN3500 SAS expansion unit:
- Width: 447.2 mm (17.6 in)
- Depth: 542.6 mm (21.4 in)
- Height: 85.3 mm (3.4 in)
- Weight (minimum configuration, 0 HDDs): 17.6 kg (38.9 lb)
- Weight (maximum configuration, 24 HDDs): 22.3 kg (49 lb)
Operating Environment:
- Temperature: maximum range 10 - 40 degrees C (50 - 104 degrees F); recommended 20 - 25 degrees C (68 - 77 degrees F); non-operating -40 - 70 degrees C (-40 - 158 degrees F)
- Relative humidity: maximum operating range 20% - 80% (non-condensing); recommended operating range 40% - 55%; non-operating range 10% - 95% (non-condensing)
- Maximum wet bulb: 28 degrees C
- Maximum altitude: 3050 m (10,000 ft.)
Warning: Operating at extremes of environment can increase failure probability.
- Wet bulb (caloric value): 1,724 Btu/hr (fully loaded shelf)
- Maximum electrical power: 100-240 V ac, 12-5.9 A
- Nominal electrical power: 100-120 V ac, 3.6 A; 200-240 V ac, 1.9 A, 50/60 Hz
- Noise level: 6.4 bels @ 1 m @ 23 degrees C
EXN4000
Physical Specifications, EXN4000 FC expansion unit:
- Width: 447 mm (17.6 in)
- Depth: 508 mm (20.0 in)
- Height: 133 mm (5.25 in)
- Weight: 35.8 kg (78.8 lb)
Operating environment:
- Temperature: maximum range 10 - 40 degrees C (50 - 104 degrees F); recommended 20 - 25 degrees C (68 - 77 degrees F); non-operating -40 - 65 degrees C (-40 - 149 degrees F)
- Relative humidity: 10 - 90 percent (non-condensing)
- Wet bulb (caloric value): 1,215 Btu/hr (fully loaded shelf)
- Electrical power: 100-120/200-240 V ac, 7/3.5 A, 50/60 Hz
- Noise level:
Appendix C. Useful resources
This appendix provides links to important online resources:
- N series to NetApp model reference
- Interoperability matrix
Interoperability matrix
The IBM System Storage N series interoperability matrixes help you select the best combination of integrated storage technologies. This information helps reduce expenses, increase efficiency, and expedite storage infrastructure implementation. The information in the matrixes is intended to aid in the design of high-quality solutions for leading storage platforms and to help reduce solution design time. With it, you can identify supported combinations of N series systems with the following items:
- Tape drives and libraries
- Storage subsystems and storage management
- Middleware and virus protection software
- System management software
- Other tested independent software vendor (ISV) applications
The interoperability matrix is available at:
http://www-304.ibm.com/support/docview.wss?uid=ssg1S7003897
Related publications
The publications listed in this section are considered particularly suitable for a more detailed discussion of the topics covered in this book.
IBM Redbooks
The following IBM Redbooks publications provide additional information about the topics in this document. Note that some publications referenced in this list might be available in softcopy only.
- IBM System Storage N series Software Guide, SG24-7129
- IBM System Storage N series MetroCluster, REDP-4259
- IBM N Series Storage Systems in a Microsoft Windows Environment, REDP-4083
- IBM System Storage N series A-SIS Deduplication Deployment and Implementation Guide, REDP-4320
- IBM System Storage N series with FlexShare, REDP-4291
- Managing Unified Storage with IBM System Storage N series Operation Manager, SG24-7734
- Using an IBM System Storage N series with VMware to Facilitate Storage and Server Consolidation, REDP-4211
- Using the IBM System Storage N series with IBM Tivoli Storage Manager, SG24-7243
- IBM System Storage N series and VMware vSphere Storage Best Practices, SG24-7871
- IBM System Storage N series with VMware vSphere 4.1, SG24-7636
- IBM System Storage N series with VMware vSphere 4.1 using Virtual Storage Console 2, REDP-4863
- Introduction to IBM Real-time Compression Appliances, SG24-7953
- Designing an IBM Storage Area Network, SG24-5758
- Introduction to Storage Area Networks, SG24-5470
- IP Storage Networking: IBM NAS and iSCSI Solutions, SG24-6240
- Storage and Network Convergence Using FCoE and iSCSI, SG24-7986
- IBM Data Center Networking: Planning for Virtualization and Cloud Computing, SG24-7928
You can search for, view, download, or order these documents and other Redbooks, Redpapers, Web Docs, draft and additional materials, at the following website:
ibm.com/redbooks
Other publications
These publications are also relevant as further information sources:
- Network-attached storage: http://www.ibm.com/systems/storage/network/
- IBM support: Documentation: http://www.ibm.com/support/entry/portal/Documentation
- IBM Storage Network Attached Storage: Resources: http://www.ibm.com/systems/storage/network/resources.html
- IBM System Storage N series Machine Types and Models (MTM) Cross Reference: http://www-304.ibm.com/support/docview.wss?uid=ssg1S7001844
- IBM N Series to NetApp Machine type comparison table: http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD105042
- Interoperability matrix: http://www-304.ibm.com/support/docview.wss?uid=ssg1S7003897
Online resources
These websites are also relevant as further information sources:
- IBM NAS support website: http://www.ibm.com/storage/support/nas/
- NAS product information: http://www.ibm.com/storage/nas/
- IBM Integrated Technology Services: http://www.ibm.com/planetwide/
Index
Symbols
SyncMirror 141 reconstruct status 143 /etc/hosts file 311 /etc/rc 314 /etc/resolv.conf 312 /etc/sanitization.log 166 /etc/software 286 /sbin/init program 251 asymmetrical standard active/active configuration 81 ATA-based 131 ATTO 123 autoboot command 287 Automated Deployment System (ADS) 244 automatic failover capability 103 AutoSupport email 281
B
backed-up data images 181 back-end operations 98 Back-end Switches 79 background operations 181 backup mechanisms 181 backup processes 180 backup/recovery 5 BIOS 221, 224, 230-231, 283 configuration utility 236 routines 224 setup 240-242 program 240 bit error 131, 133 block protocols 179 block-level 19 Boot Configuration Data (BCD) 245 boot device 225, 232 boot images 220 Boot menu 284 boot options 199 Boot Port Name 239 boot sector 225 Boot tab 243 boot_backup 202 boot_ontap 202 boot_primary 202 BootBIOS 225-227, 229 firmware 228 bundles 9 bus resets 222 business continuity 101, 103 solutions 105 byte patterns 166
Numerics
1 GB CompactFlash card 283 4 Gbps capability 53 4 KB block 135 4-Gb FC-VI adapter 78 4-Gbps FC-VI 80 4-port SAS-HBA controllers 60 64-bit aggregates 282-283 6500N 123 8-Gb FC-VI adapter 78 8-Gbps FC-VI 80
A
ACP 20, 66 cabling 66 rules 65 connections 65 firmware upgraded 274 active/active configuration 69, 73, 82, 103, 182, 280, 302 nodes 98 pairs 73 takeover wizard 94-95 active/passive configuration 81 Adapter Port Name field 228 Adapter Settings 228 administration 195 methods 196 aggregate 130, 132, 142, 144 introduction 129 Alternative Control Path see ACP ALUA 255, 258 enable 259 explained 258 limitations 259 architecture compatibility 75 array 131-132 array LUN 71 failure 76 ASCII terminal console 296-297 ASIC chips 61 Asymmetric Logical Unit Access see ALUA asymmetrical configurations 81
C
C$ directory 304 cabling 59 cache management 155 campus-level distance 105 capacity density 131 per square foot 132 CAT6 65 CD-ROM 243 centralized administration 220
centralized monitoring 198 cf 76-77, 80 basic operations 98 cf disable command 89 cf forcetakeover -d command 78, 104 cf giveback command 91, 122 cf status command 82, 89-90, 92 cf takeover command 89, 91 cf_remote 78, 80 CFE 202, 298 CFE-prompt 89 cfo -d command 119 changing network configuration 310 checking cluster status 99 cifs restart command 201 cifs sessions command 200-201 cifs terminate command 200-201 CLI 196 cloned LUN 220 cluster configuration best practices 72, 81 status and management 69 cluster failover see cf Cluster remote 103 Cluster_Remote license 104 clustered controllers 20 clustering eliminating single points of failure 86 local 182 reasons for failover 98 command reboot 82 command-line interface see CLI Common Internet File System (CIFS) 76, 198, 303 clients 204 license 303 message settings 201 service 200 services 200 shutdown messages 200 CompactFlash card 199, 202, 281, 308-309 compliance standards 164 compression 6 configuration worksheet 189 consistency point 151 controller failover 73 core business responsibilities 165 core dump 88 core installation 244 counterpart nodes 69 CP 152 CPU utilization 177
D
daisy-chained 61 data access methods 5 confidentiality 164 drives 133 fault tolerance 129-130 data confidentiality 164
Data ONTAP 56, 73, 81, 85, 101, 182, 292 command-line interface 87 disk sanitization feature 163 FilerView 87, 92 installer 306 installing 303 from Windows client 303 obtaining 302 software 292 update 280, 302 update procedure 300, 309-310 upgrade image 281 version 7.2 310 version 7G 69 version 8.0 282 version 8.x 69 Data ONTAP 8 supported systems 12 data protection strategies 103 data synchronization 181 data volume copies 77 DB-9 null modem cable 297 dedicated interfaces 83-84 dedicated network interface 84 deduplication 6, 150, 154 degraded mode 131 DHCP 283 diagonal block 138 diagonal parity 134, 136, 138 disk 136, 144 stripes 136, 143 sum 136 diagonal parity stripe 136 dialog 307 direct-attached storage (DAS) 178 direct-attached systems 73 disabling takeover 88 disaster recovery 103 procedures 183 process 220 disk 47 disk drive technology 130 disk failure event 135 disk firmware 272, 280, 302 disk pool assignments 76 disk sanitization 163 feature 165 disk sanitize abort command 168 disk sanitize release command 168 disk sanitize start command 166 -c option 166 disk shelf loops 71 disk shelves 47 disk structure 149 distributed application-based storage 19 DNS see Domain Name Service (DNS) Domain Name Service (DNS) 181, 284, 312 domain name 312 setup 312-313 DOT 8 286
DOT 8 7-mode 281 DOT 8.0 7-mode features 282 double parity 134 double-disk failure 134, 137, 143, 145 recovery scenario 129 scenario 141 double-parity RAID 134 DP 136 dparity 142 DSM 212 dual gigabit Ethernet ports 21
E
efficiency features 10 electronic data security 165 Emulex 233 BIOS Utility 225, 229 LP6DUTIL.EXE 229 encryption 9, 55 enterprise environments 178 entry-level 13 environment specifications 315 ESH module 53 ESH4 53 Exchange server 180 exclusive OR 135 EXN1000 21 EXN2000 loops 53 EXN3000 48, 60 cabling 60 drives 50 shelf 20, 60 technical specification 50 EXN3500 50 cabling 60 drives 53 technical specification 53 EXN4000 53-54 disk shelves 66-67 drives 54 FC storage expansion unit 53 technical specification 55 expansion units 21, 47 ExpressNAV 127 external SAN storage 221
Fast!UTIL 227 fault tolerance 70, 132, 220 FCoE 208 FCP 219 SAN boot 220 fcstat device_map 86 FC-VI adapter 78 FCVI card 32 Fibre Channel 4, 76, 105, 108, 208 aggregate 78 devices 239 media type 211 queue depth 222 SAN 4 SAN booting 219 single fabric topologies 73 storage 76 switches 103 FibreBridge 123 administration 127 architecture 124 cabling 125 environmental specifications 128 ExpressNAV 127 management 127 MetroCluster 124 ports 125 shelf combinations 124 specifications 125 Filer booting 202 halting 202 FilerView 96, 142, 196, 300 interface 202 Flash Cache 8, 150, 154 algorithm 160 function 158 module 158 FlexCache 7, 154 FlexClone 7, 150 FlexScale 157 FlexShare 7, 150 FlexVols 7, 72, 130, 180, 220 forcetakeover command 104 Full Disk Encryption 9
G
Gateway 7, 9 gigabit Ethernet interfaces 177 giveback 86, 97 global hot spare disk 131 growth rate 180 GUI 244
F
fabric switch 119 interconnects 120 fabric-attached 78 Fabric-attached MetroCluster 78-79 configurations 111 nodes 79 failover 84, 98, 182 connectivity 98 disk mismatch 98 events 84 performance 182
H
HA Configuration Checker 72 HA interconnect 71 adapters 76 HA pair 69-72, 74, 80-81, 98
capability 82 configuration 73, 80, 82, 84-85, 88 checker 81 interconnect cables 85 management 89 controller 64 units 81 nodes 81, 88 status 69 storage system 81 configuration 81 types 74 halt command 203 hard drive sanitization practices 165 hardware overview 5 HBA 53, 112 BIOS 243 device driver 249 files 249 driver 243 floppy disk 243 headless system 296 health care industry 164 heartbeat signal 74 heterogeneous unified storage 4 high availability (HA) 74 features 220 high-end 35 high-end value 3 high-performance SAS 21 home directories 178 horizontal row parity 138 solution 136 Host Adapter BIOS 238 Host Utilities Kit see HUK hot fix 220 hot spare disk 145 hot spare disks 129, 138, 145-146 hot-pluggable 177 HTTP 202, 283 HUK 207 command line 214 components 208 defined 208 Fibre Channel HBAs and switches 211 functionalities 209 host configuration 209 iSCSI 211 LUN configuration 209 media type 211 multipath I/O software 212 parameters used 215 supported operating environment 208 Windows installation 209 hyperterminal 297
IA32 architecture 224 IBM 69 IBM BIOS setup 241 IBM LUN 252 IBM System Manager for N series 87 IBM System Storage N series 130, 145 data protection with RAID-DP 129 IBM System Storage N3400 19 IBM/Brocade Fibre Channel switches 78-79 igroups 216, 225 ALUA 259 initial setup 300 initiator group 216 install from a Windows client 303 installation checklist 188 installation planning 292 Intel 32-bit 228 Interchangeable servers 220 interconnect cable 98 Interface Group 83 internal interconnect 70 internal structure 134 interoperability matrix 326 inter-switch link (ISL) 78 IOM 61 IOM A circle port 61 IOM B circle port 61 IOMs 61, 64-65 IP address 83, 283 changing 311 iSCSI 178, 208, 215 direct-attached topologies 74 initiators 211 network-attached topologies 74 SAN 221 target add 217
J
journaling write 151
K
Key Management Interoperability Protocol 56 key manager 57
L
LAN interfaces 83 large RAID group advantages 133 versus smaller 133 latency 171 license command 285 license type 82 licensed protocol 85 licensing structure 10 Linux install CD 252 SAN boot 252 Linux-based operating systems 221
I
I/O load 179 per user 179 I/O sizes 171
load balancing 178 local node 73, 86 console 82 LUN 70-71, 216, 243 access 217 igroup mapping 217 number 233 setup 216 LUN 0 244
multipathing 212, 255 I/O 256 native solutions 258 software 221 software options 257 third-party solutions 257 multiple disk failures 134 multiple I/O paths 179 MultiStore 7
M
mailbox disks 72 limit 179 man command 197 management policies 198 mathematical theorems 137 MBR 225, 251 Metadata 155 MetroCluster 7, 31, 78, 101-102 and N series failure 119 configuration 78-79, 101 fabric-attached 108 cabling 111 feature 101 FibreBridge 123 host failure 119 interconnect failure 120 N62x0 configuration 31 N62x0 FCVI card 31 overview 102 site failure 121 site recovery 122 stretch 105 stretch cabling 107 synchronous mirroring with SyncMirror 112 SyncMirror 115 TI zones 116 Microsoft Cluster Services 223 Microsoft Exchange 179 server 179 Microsoft Server 2008 roles and features functions 250 Microsoft Windows Client settings 203 Microsoft Windows XP 198 mid-range 23 mirrored active/active configuration 74 mirrored HA pairs 76 mirrored volumes 77, 80 mirroring process 132 mission-critical applications 101 environments 179 MMC 198 modern disk architectures 131 MPIO 212, 215 MS Exchange 179 MTTDL 133 formula 134 multipath storage 75, 81
N
N series expansion unit failure 119 hardware 5 registration 292 starting the system 199 stopping the system 199 storage system administration 195 System Manager tool 187 N3000 14, 21, 33, 43 family hardware specification 19, 21-22 hardware 19, 21 N3220 14 N3240 16 N3300 21, 78 setup 300 N3400 19-20 N3700 5 N5000 5, 80 N5600 284 N6040 80 N6210 27, 80 N6240 27 N62x0 MetroCluster 31 N7000 5 N7600 80 N7900 281 N7950T 37 SFP+ modules 43 NDU 21, 263, 281-282 ACP 274 CIFS 265 compression 266 deduplication 266 disk firmware 272 FC 265 hardware requirements 266 iSCSI 265 limits 266 major version 269 NFS 265 prepare 268 RLM 275 shelf firmware 270 supported 264 NearStore 7 NetApp model reference 326 netboot 202, 284 server URL 284
netmask 311 Network File System (NFS) 4, 178 clients 204 root path 284 server 284 network interface cards (NICs) 76 network mask 283 network-attached storage solutions 178 networking bandwidth 181 NFS see Network File System (NFS) Non Disruptive Update (NDU) see NDU nondisruptive software upgrades 70 update method 280, 302 non-disruptive expansion 4 nondisruptive 280, 302 nonvolatile memory 70 nonvolatile random access memory see NVRAM NTFS 244 NVMEM 70 NVRAM 70-71, 74, 149-151, 153, 202 adapter 70 failure 98 operation 152 virtualization 153
worksheets 294-296 plex 76-77 power on procedure 297-301 prefetch data 155 prerequisites for installation 188 primary boot device 232, 234 primary image 202 protection tools 182 provisioning 220
Q
QLogic BootBIOS 236 driver 243 Fast!UTIL 225, 229, 236 qtrees 198 quad-port Ethernet card 65 SAS 60 HBA 60 QuickNAV 127
R
RAID 129, 131, 133 array 130 Double Parity 129 group 133-134, 137, 145-146 configurations 134 size 133 type 142 protection schemes with single-parity using larger disks 131 reliability 132 technology 131 RAID 1 132 mirror 132 mirroring 132 RAID 4 129, 134-135, 137, 142-143, 145, 152, 220 construct 136 group 134, 137, 143 horizontal parity 134 Horizontal Row Parity 135 row parity 144 row parity disks 143 storage 145 traditional volumes 299 RAID-DP 8, 129-130, 133-141, 143-145, 150, 152, 172 adding double-parity stripes 136 configuration 133 construct 134, 136 data protection 132 double parity 129, 134 group 134, 144 need for 129, 131 operation 141 operation summary 141 overview 129, 133 protection levels 133 reconstruction 129, 137
O
OnCommand 7, 198 online storage resource 5 operating system pagefile 222 Operations Manager 87 optimum path selection 256 overview 4 hardware 5
P
pagefile 223 access 223 paging operations 222 PAM 158, 281 panicked partner 88 parallel backup operations 181 parallel SCSI 244 parity 131 parity disk 135 partner command 91 partner filer 75 partner node 70-71, 85-86, 88, 98 passive node 81 pattern 0x55 166 PC BIOS 225 PC Compact Flash card 202 PCI Device Boot Priority option 242 Performance Acceleration Module see Flash Cache Phoenix BIOS 242 planning pre-installation 169 primary issues 170
volume 143 random data 166 random I/O 171 raw capacity 172 read caching 150, 153 details 154 read-to-write ratio 180 reboot 204 reconstruct data 132, 135, 137, 141, 146 reconstruction 131, 133, 137, 141 times 131 recovery site 220 recovery time objectives 103 Red Hat Enterprise Linux 5.2 250 Redbooks website 327 Contact us xv Remote LAN module see RLM remote node 74 resiliency to failure 181 re-syncing of mirrors 120 reversion 282 revert_to command 282 RHEL5 222 RLM 72, 275 firmware 275 root volume 104, 146 row parity 141 component 134
S
SAN 198, 220 boot 219-220, 229 boot configuration 220 device 222 environment 281 SAN-based storage 220 sanitization 6, 166 method 165 sanitize command 166 SAS 19-21, 64 cabling 61 connections 60, 64 drive bays 21 firmware 21 HBA 60 shelf 62-63 connectivity 64 interconnect 61 SATA 78 drives 19 scalable storage 4 script installer 307 SCSI card 251 SCSI-3 storage standards 244 SCSI-based 131 second parity disk 143 second stage boot loader 251 secondary disk failure 131 SecureAdmin 8 SED 55
key manager 57 overview 55 Self-Encrypting Disk see SED sequential operations 171 sequential reads 155 sequential workload 53 Serial-Attached SCSI see SAS service clearance 292 shared boot images 244 shared interfaces 83 shared loops 81 shared network interface 83 shelf firmware 270 upgrade 271 shelf technology 48 simultaneous backup streams 180 simultaneous failure 146 simultaneous SnapMirror threads 181 single controller 21 Single Mailbox Recovery for Exchange see SMBR single parity RAID 130 single power outage 72 single row parity disk 135 single storage controller configurations 74 single-node 292 single-parity 129 RAID 129-132, 134-135 solution 130 site disaster 78, 104, 121 small RAID group advantages 133 small random reads 155 smaller arrays 132 SMBR 8 SnapDrive 8 SnapLock 8 SnapManager 8 SnapMirror 8, 171 operations 181 option 177 sources 282 SnapRestore 8 recovery 187 Snapshot 9, 175-176, 178, 181, 220 data integrity 222 facilities 178 protection consideration 175 technology 176 SnapVault 9 relationships 282 software licensing structure 10 space guarantees 282 spare disk 72 pool 144 spare servers 182 SPOFs 86 standard HA pair 70, 74 configuration 74 standard update method 280, 302 standby interfaces 83
standby network interface 84 Startup Sequence options 242 STOP errors 223 STOR Miniport driver 243 storage controller 143 deployment 137 efficiency technologies 154 environment 5 HBA 224 I/O paths 181 infrastructure 3 resources 5 system software 280, 302 STORport 224 Stretch MetroCluster 77 nodes 77 simplified 77 striping 130 strong data protection 5 switch bandwidth 222 configuration 79 synchronous mirroring 101 synchronously mirrored aggregates 104 SyncMirror 9, 72, 76, 103, 105, 133 local 115 setup 101 without MetroCluster 115 syncmirror_local 77-78, 80 SyncMirrored plex 115 sysconfig -r command 113 sysconfig -v outputs 310 system board firmware 280, 302 system buffer cache 154 system crash dumps 223 system firmware 280, 302 System Manager 93, 96-97, 198 key features 198 release 1.1 198 system memory 150 System x machine 251 series 221 Servers with Red Hat Enterprise Linux 5.2 219
time zone 179 Tivoli Key Lifecycle Manager 9, 57 TKLM see Tivoli Key Lifecycle Manager total cost of ownership (TCO) 132 storage option 133 traditional SAN 116 traditional volumes 130, 142-143, 145 triple disk failure 134 troubleshooting 193 two disk failure 134
U
uncorrectable bit errors 130 unified storage 4, 21 update 308 update_flash 310 update_flash command 286 upgrade 263, 278 controller head 279 Data ONTAP 279 disk shelf 278 hardware 278 non-disruptive 263 PCI adapter 278 usable capacity 172 USENIX 137
V
Veritas Storage Foundation 218 versatile-single integrated architecture 4 version -b command 285 VIF 72, 83 virtual interfaces (VIFs) 81 virtual volumes 130 vol status command 142 volume 133
W
WAFL 148-149, 152-153 data objects 148 features 148 file system 174 impact of 174 Windows 2003 243 CD-ROM 243 Enterprise for System x Servers 219 Enterprise SP2 243 installation 243 Windows 2008 Enterprise Server for System x Servers 219 server 244 Windows operating system 244, 255 boot 244 domain 72 installation 244 pagefile 222 Windows Server 2003 243 Windows Server 2008
T
takeover 81, 88, 98 condition 90 mode 71-72, 91 takeover/giveback 87 TCO 132 TCP/IP 178 third-party backup facilities 181 backup tool 181 disposal 164 SCSI array driver 243 storage 76 TI zones 116
full installation option 246 installation media 246 R2 245 setup 250 workload mix 170 worksheets 294-296 Worldwide Port Name (WWPN) see WWPN Write Anywhere File Layout see WAFL write caching 151 write workloads 154 WWPN 214, 225-226, 228, 234
X
x86 environment 251 XOR 136
Back cover
Select the right N series hardware for your environment
Understand N series unified storage solutions
Take storage efficiency to the next level
This IBM Redbooks publication provides a detailed look at the features, benefits, and capabilities of the IBM System Storage N series hardware offerings.

The IBM System Storage N series systems can help you tackle the challenge of effective data management by using virtualization technology and a unified storage architecture. The N series delivers low- to high-end enterprise storage and data management capabilities with midrange affordability. Built-in serviceability and manageability features help support your efforts to increase reliability; simplify and unify storage infrastructure and maintenance; and deliver exceptional economy.

The IBM System Storage N series systems provide a range of reliable, scalable storage solutions to meet various storage requirements. These capabilities are achieved by using network access protocols such as Network File System (NFS), Common Internet File System (CIFS), HTTP, and iSCSI, and storage area network technologies such as Fibre Channel. Using built-in Redundant Array of Independent Disks (RAID) technologies, all data is protected, with options to enhance protection through mirroring, replication, Snapshots, and backup. These storage systems also have simple management interfaces that make installation, administration, and troubleshooting straightforward.

This book also addresses high-availability solutions, including clustering and MetroCluster, that support the highest business continuity requirements. MetroCluster is a unique solution that combines array-based clustering with synchronous mirroring to deliver continuous availability.