Wednesday, May 18, 2011

NetApp upgrade to 8.0.1 7-Mode

So, the upgrade to 8.0.1. Why would you want to go to 8.0.1? The link below lists the general reasons. Please check sources other than NetApp as well.

https://now.netapp.com/NOW/knowledge/docs/ontap/rel801/html/ontap/rnote/GUID-5A3E2A88-F6FA-4473-8127-81ED0C39FE89.html


If you are already on an 8.0 release (such as an RC), this should be easy enough, but run the Upgrade Advisor anyway.

Step 1: 
-Download 801_q_image.tgz (from the link above) and copy it to /etc/software on each filer. Use CIFS to mount the etc$ share from the filer.
-On the filer, run 'software list' to confirm that it sees the new image.
Step 2:
-Check the filer syslog messages for errors or issues. If you find any, stop and confirm with NetApp support that it is OK to go ahead with the upgrade. It is a good idea to replace failed HDDs before starting the upgrade.
Step 3:
-Generate an AutoSupport message that flags the start of the non-disruptive upgrade:  options autosupport.doit starting_NDU
Step 4:
From the filer, back up these files to a machine other than the filer:
/etc/hosts
/etc/rc 
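
The backup in this step can be scripted. Here is a minimal Python sketch, assuming the filer's etc$ share is already mounted on the admin machine. The function name backup_filer_config and both directory paths in the usage note are hypothetical, not anything NetApp ships.

```python
import shutil
from pathlib import Path


def backup_filer_config(etc_mount, backup_dir, names=("hosts", "rc")):
    """Copy the named files from a mounted etc$ share into backup_dir.

    etc_mount and backup_dir are assumptions: point them at wherever
    etc$ is mounted and wherever the copies should be kept.
    """
    src = Path(etc_mount)
    dst = Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        source_file = src / name
        if not source_file.is_file():
            # Fail loudly rather than start an upgrade without a backup.
            raise FileNotFoundError(f"{source_file} not found; is etc$ mounted?")
        target = dst / name
        shutil.copy2(source_file, target)  # copy2 also keeps timestamps
        copied.append(target)
    return copied
```

Usage would look something like backup_filer_config("/mnt/netappA_etc", "/var/backups/netappA"), with both paths replaced by your own.
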
Step 5:
On the filer, run 'version -b', which shows the current firmware and ONTAP versions on the cfcard:

/cfcard/x86_64/diag/diag.krn:  5.4.7
/cfcard/x86_64/firmware/excelsio/firmware.img: Firmware 1.7.0
/cfcard/x86_64/firmware/DrWho/firmware.img: Firmware 2.4.0
/cfcard/x86_64/firmware/SB_XV/firmware.img: Firmware 4.3.0
/cfcard/x86_64/firmware/SB_XVI/firmware.img: Firmware 5.0.0
/cfcard/x86_64/firmware/SB_XVIII/firmware.img: Firmware 7.0.1
/cfcard/boot/loader: Loader 1.7
/cfcard/common/firmware/zdi/zdi_fw.zpk: PAM II Firmware 1.8 (Build )
/cfcard/common/firmware/zdi/zdi_fw.zpk: X1936A FPGA Configuration PROM 1.0 (Build 0x201558)
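
If you save the 'version -b' output before and after the upgrade, a small parser makes the two easy to compare. This is only a sketch in Python, assuming the output has the "path: description" shape shown above. The name parse_version_b is made up for illustration.

```python
def parse_version_b(text):
    """Return (path, description) pairs from saved 'version -b' output.

    A list is used rather than a dict because the same path can appear
    more than once (zdi_fw.zpk does in the output above).
    """
    entries = []
    for line in text.splitlines():
        path, sep, desc = line.strip().partition(":")
        if not sep or not path.startswith("/"):
            continue  # skip blank lines and anything that is not an entry
        entries.append((path.strip(), desc.strip()))
    return entries


SAMPLE = """\
/cfcard/x86_64/diag/diag.krn:  5.4.7
/cfcard/boot/loader: Loader 1.7
/cfcard/common/firmware/zdi/zdi_fw.zpk: PAM II Firmware 1.8 (Build )
"""

if __name__ == "__main__":
    for path, desc in parse_version_b(SAMPLE):
        print(path, "=>", desc)
```

Comparing set(after) - set(before) on two parsed outputs would then show which entries changed.
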

Step 6 (Do this on both filers):

Start the upgrade. CIFS will break; NFS may or may not.
netappA*> software update 801_q_image.tgz -r
software: You can cancel this operation by hitting Ctrl-C in the next 6 seconds.
software: Depending on system load, it may take many minutes
software: to complete this operation. Until it finishes, you will
software: not be able to use the console.
cmd = ngsh -c system image update -node local -package file://localhost/mroot/etc/software/801_q_image.tgz -setdefault true
Software update started on node calnetapp1a. Updating image2 package: file://localhost/mroot/etc/software/801_q_image.tgz current image: image1
Listing package contents.
Untarring package contents.
Invoking script (validation phase).
INSTALL running in check only mode
Mode of operation is UPDATE
Current image is image1
Alternate image is image2
Available space on boot device is 594 MB
Required  space on boot device is 379 MB
Kernel binary matches install machine type
Package MD5 checksums pass
Versions are compatible
Invoking script (install phase). This may take up to 30 minutes.
Mode of operation is UPDATE
Current image is image1
Alternate image is image2
Available space on boot device is 594 MB
Required  space on boot device is 379 MB

Package MD5 checksums pass
Versions are compatible
Getting ready to install image
Directory /cfcard/x86_64/freebsd/image2 created
Syncing device...
Extracting to /cfcard/x86_64/freebsd/image2...
x BUILD
x CHECKSUM
x COMPAT.TXT
x INSTALL
x README.TXT
x VERSION
x cap.xml
x diags.tgz
x kernel
x perl.tgz
x platform.ko
x platfs.img
x rootfs.img
Installed MD5 checksums pass
Installing diagnostics and firmware files
Installation complete. image2 updated on node calnetapp1a
image2 has been set as the default
software: installation of 801_q_image.tgz completed.
Please type "reboot" for the changes to take effect.
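
The installer checks boot device space itself, but if you keep the console log it is easy to pull out how much headroom there was. A rough Python sketch follows, assuming the two "space on boot device" lines look exactly like the ones above. The function name boot_device_headroom is hypothetical.

```python
import re

# Matches both the "Available" and "Required" lines; \s+ tolerates the
# double space that appears after "Required" in the log.
SPACE_RE = re.compile(r"^(Available|Required)\s+space on boot device is (\d+) MB")


def boot_device_headroom(log_text):
    """Return available minus required space in MB, read from the log."""
    values = {}
    for line in log_text.splitlines():
        m = SPACE_RE.match(line.strip())
        if m:
            values[m.group(1)] = int(m.group(2))
    if set(values) != {"Available", "Required"}:
        raise ValueError("space lines not found in log")
    return values["Available"] - values["Required"]


SAMPLE = """\
Available space on boot device is 594 MB
Required  space on boot device is 379 MB
"""

if __name__ == "__main__":
    print(boot_device_headroom(SAMPLE), "MB spare")  # 215 MB spare
```
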


Step 7: 
After installing the above on both filers
netappA> cf takeover [netappB will reboot at this time]
---------cut---------------
Wed Apr 13 22:24:55 PDT [netappA (takeover): cf.fm.takeoverComplete:notice]: Failover monitor: takeover completed
Wed Apr 13 22:24:55 PDT [netappA (takeover): cf.fm.takeoverDuration:info]: Failover monitor: takeover duration time is 6 seconds
Wed Apr 13 22:25:01 PDT [netappA (takeover): monitor.globalStatus.critical:CRITICAL]: This node has taken over netappB.
--------cut----------------

Step 8:
NetApp suggests waiting about 10 minutes between cf takeover and cf giveback.
netappA> cf giveback -f
--------cut----------------
Wed Apr 13 22:37:04 PDT [netappA: cf.fm.givebackComplete:notice]: Failover monitor: giveback completed
Wed Apr 13 22:37:04 PDT [netappA: cf.fm.givebackComplete:notice]: Failover monitor: giveback completed
--------cut----------------
-Run version on the filer that just rebooted and confirm the new version.
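
To check after the fact how long you actually waited between takeover and giveback, the timestamps in the console messages can be compared. Below is a small Python sketch, assuming the message format shown in steps 7 and 8. The log lines carry no year, so the year parameter (defaulting to 2011) is an assumption, and event_times is a made-up name.

```python
import re
from datetime import datetime

# Captures "Apr 13 22:24:55" and the cf.fm.* event name from a console line.
STAMP_RE = re.compile(r"^\w{3} (\w{3} +\d+ \d\d:\d\d:\d\d) \w+ \[.*?(cf\.fm\.\w+):")


def event_times(log_text, year=2011):
    """Map each cf.fm.* event name to the datetime it was logged at."""
    times = {}
    for line in log_text.splitlines():
        m = STAMP_RE.match(line.strip())
        if m:
            stamp = f"{year} {m.group(1)}"
            times[m.group(2)] = datetime.strptime(stamp, "%Y %b %d %H:%M:%S")
    return times


SAMPLE = """\
Wed Apr 13 22:24:55 PDT [netappA (takeover): cf.fm.takeoverComplete:notice]: Failover monitor: takeover completed
Wed Apr 13 22:37:04 PDT [netappA: cf.fm.givebackComplete:notice]: Failover monitor: giveback completed
"""

if __name__ == "__main__":
    t = event_times(SAMPLE)
    waited = t["cf.fm.givebackComplete"] - t["cf.fm.takeoverComplete"]
    print(int(waited.total_seconds()), "seconds")  # 729 seconds
```

For the run above that comes to a little over 12 minutes, so slightly more than the suggested 10.
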

Step 9:
Do the same steps for the other filer in the cluster.

Step 10:
Check services. I found that CIFS had to be restarted.