Friday, May 15, 2009

Re-IP RAC hosts

Recently we went through the exercise of re-IPing our RAC hosts. White papers and other documentation are available on this, but the exact syntax of the steps wasn't clear when "IPMP" (IP multipathing) is in use on the public and private networks.

Here is the step-by-step approach that we followed:

Configuration
4-node RAC
Sun Solaris 10
Oracle 10.2.0.4

Steps
1) Check current nodeapps configurations

[pwqa@db01-lisqa /home/pwqa] $ srvctl config nodeapps -n db01-lisqa -a
VIP exists.: /vip-db01-lisqa.ctn/192.168.222.44/255.255.255.0/ce0:ce2
[pwqa@db01-lisqa /home/pwqa] $ srvctl config nodeapps -n db02-lisqa -a
VIP exists.: /vip-db02-lisqa.ctn/192.168.222.45/255.255.255.0/ce0:ce2
[pwqa@db01-lisqa /home/pwqa] $ srvctl config nodeapps -n db03-lisqa -a
VIP exists.: /vip-db03-lisqa.ctn/192.168.222.46/255.255.255.0/ce0:ce2
[pwqa@db01-lisqa /home/pwqa] $ srvctl config nodeapps -n db04-lisqa -a
VIP exists.: /vip-db04-lisqa.ctn/192.168.222.47/255.255.255.0/ce0:ce2


As we can see, the old VIPs are 192.168.222.[44-47] for our 4-node QA cluster. Interfaces "ce0" and "ce2" are used for the public network.
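Since the same command is repeated once per node, a small loop can save some typing. The following is only a sketch: the NODES list and the print_nodeapps_checks name are illustrative and not part of our original procedure. It prints the commands as a dry run; remove the "echo" to actually execute them.

```shell
# Node names taken from the output above; adjust for your cluster.
NODES="db01-lisqa db02-lisqa db03-lisqa db04-lisqa"

# Dry run: print the per-node check command instead of running it.
print_nodeapps_checks() {
    for node in $NODES; do
        echo "srvctl config nodeapps -n $node -a"
    done
}

print_nodeapps_checks
```

The same loop shape works for the per-node stop, disable and enable commands in the later steps.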


2) Stop the databases and ASM instances on all the nodes.

[pwqa@db01-lisqa /home/pwqa] $ srvctl stop database -d PWQA
[pwqa@db01-lisqa /home/pwqa] $ srvctl stop database -d PLQA
[pwqa@db01-lisqa /home/pwqa] $ srvctl stop database -d CGQA
[pwqa@db01-lisqa /home/pwqa] $ srvctl stop database -d HLPR
[pwqa@db01-lisqa /home/pwqa] $ srvctl stop asm -n db01-lisqa
[pwqa@db01-lisqa /home/pwqa] $ srvctl stop asm -n db02-lisqa
[pwqa@db01-lisqa /home/pwqa] $ srvctl stop asm -n db03-lisqa
[pwqa@db01-lisqa /home/pwqa] $ srvctl stop asm -n db04-lisqa



3) Stop Nodeapps on all 4 nodes

[pwqa@db01-lisqa /home/pwqa] $ srvctl status nodeapps -n db01-lisqa
VIP is running on node: db01-lisqa
GSD is running on node: db01-lisqa
Listener is running on node: db01-lisqa
ONS daemon is running on node: db01-lisqa

[pwqa@db01-lisqa /home/pwqa] $ srvctl stop nodeapps -n db01-lisqa
[pwqa@db01-lisqa /home/pwqa] $ srvctl stop nodeapps -n db02-lisqa
[pwqa@db01-lisqa /home/pwqa] $ srvctl stop nodeapps -n db03-lisqa
[pwqa@db01-lisqa /home/pwqa] $ srvctl stop nodeapps -n db04-lisqa

[pwqa@db01-lisqa /home/pwqa] $ srvctl status nodeapps -n db01-lisqa
VIP is not running on node: db01-lisqa
GSD is not running on node: db01-lisqa
Listener is not running on node: db01-lisqa
ONS daemon is not running on node: db01-lisqa



4) Disable the databases and ASM instances (so they don't automatically restart on a CRS restart)
[pwqa@db01-lisqa /home/pwqa] $ srvctl disable database -d HLPR
[pwqa@db01-lisqa /home/pwqa] $ srvctl disable database -d CGQA
[pwqa@db01-lisqa /home/pwqa] $ srvctl disable database -d PLQA
[pwqa@db01-lisqa /home/pwqa] $ srvctl disable database -d PWQA
[pwqa@db01-lisqa /home/pwqa] $ srvctl disable asm -n db01-lisqa
[pwqa@db01-lisqa /home/pwqa] $ srvctl disable asm -n db02-lisqa
[pwqa@db01-lisqa /home/pwqa] $ srvctl disable asm -n db03-lisqa
[pwqa@db01-lisqa /home/pwqa] $ srvctl disable asm -n db04-lisqa


[Note: I believe disabling only ASM should be sufficient, since ASM is a dependency of the database resource (this could be verified with the "crs_stat -p" command)]
Make sure that nothing else is running except CRS.


5) Perform the IP and DNS changes at the OS level. No reboot of servers needed at this time.


6) From any one node (preferably node 1), check the new VIPs (/etc/hosts) and the nodeapps configuration for the 4 RAC nodes
[Note: It might show the new VIP address for the local node from where you're working - we still need to run the "srvctl modify" to change the VIPs]

[pwqa@db01-lisqa /home/pwqa] $ grep vip /etc/hosts|grep lisqa
156.30.179.98 vip-db01-lisqa.ctn vip-db01-lisqa
156.30.179.102 vip-db02-lisqa.ctn vip-db02-lisqa
156.30.179.106 vip-db03-lisqa.ctn vip-db03-lisqa
156.30.179.110 vip-db04-lisqa.ctn vip-db04-lisqa

[pwqa@db01-lisqa /home/pwqa] $ srvctl config nodeapps -n db01-lisqa -a
VIP exists.: /vip-db01-lisqa.ctn/156.30.179.98/255.255.255.0/ce0:ce2
[pwqa@db01-lisqa /home/pwqa] $ srvctl config nodeapps -n db02-lisqa -a
VIP exists.: /vip-db02-lisqa.ctn/192.168.222.45/255.255.255.0/ce0:ce2
[pwqa@db01-lisqa /home/pwqa] $ srvctl config nodeapps -n db03-lisqa -a
VIP exists.: /vip-db03-lisqa.ctn/192.168.222.46/255.255.255.0/ce0:ce2
[pwqa@db01-lisqa /home/pwqa] $ srvctl config nodeapps -n db04-lisqa -a
VIP exists.: /vip-db04-lisqa.ctn/192.168.222.47/255.255.255.0/ce0:ce2
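To spot quickly which nodes still carry an old VIP, the address can be pulled out of the "VIP exists" line. This is a minimal sketch that assumes the line format shown above (fields separated by "/"); the vip_ip function name is made up for illustration.

```shell
# Extract the IP address (the third "/"-separated field) from a line of the
# form "VIP exists.: /name/ip/netmask/interfaces".
vip_ip() {
    echo "$1" | awk -F/ '{print $3}'
}

# Sample line copied from the db02-lisqa output above.
line='VIP exists.: /vip-db02-lisqa.ctn/192.168.222.45/255.255.255.0/ce0:ce2'
vip_ip "$line"
```

For the sample line this prints 192.168.222.45, i.e. db02-lisqa still has its old VIP registered.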


7) Run SRVCTL MODIFY to change the VIP addresses. Make sure to use the correct netmask.
The netmask can be found by running the "ifconfig -a" command:
[pwqa@db01-lisqa /home/pwqa] $ ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
ce0: flags=1000843 mtu 1500 index 2
        inet 156.30.179.97 netmask fffffe00 broadcast 156.30.179.255
        groupname main
...

The netmask here is fffffe00 (255.255.254.0).
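Solaris prints the netmask in hex, while srvctl wants it in dotted decimal. If you'd rather not convert by hand, something like the following should work in bash or a POSIX sh; hex_to_dotted is just an illustrative helper name.

```shell
# Convert an 8-digit hex netmask (as printed by Solaris ifconfig) to
# dotted-decimal notation.
hex_to_dotted() {
    hex=$1
    # Split the 8 hex digits into four 2-digit octets.
    o1=$(echo "$hex" | cut -c1-2)
    o2=$(echo "$hex" | cut -c3-4)
    o3=$(echo "$hex" | cut -c5-6)
    o4=$(echo "$hex" | cut -c7-8)
    # printf %d accepts 0x-prefixed hex constants.
    printf '%d.%d.%d.%d\n' "0x$o1" "0x$o2" "0x$o3" "0x$o4"
}

hex_to_dotted fffffe00
```

For fffffe00 this prints 255.255.254.0, matching the value used in the commands that follow.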

Execute the SRVCTL MODIFY command for all nodes (from any one node) -

[pwqa@db01-lisqa /home/pwqa] $ sudo srvctl modify nodeapps -n db01-lisqa -A 156.30.179.98/255.255.254.0/"ce0|ce2"
[pwqa@db01-lisqa /home/pwqa] $ sudo srvctl modify nodeapps -n db02-lisqa -A 156.30.179.102/255.255.254.0/"ce0|ce2"
[pwqa@db01-lisqa /home/pwqa] $ sudo srvctl modify nodeapps -n db03-lisqa -A 156.30.179.106/255.255.254.0/"ce0|ce2"
[pwqa@db01-lisqa /home/pwqa] $ sudo srvctl modify nodeapps -n db04-lisqa -A 156.30.179.110/255.255.254.0/"ce0|ce2"



8) Modify the public IP using "oifcfg"
From any node (preferably node 1),

[pwqa@db01-lisqa /home/pwqa] $ . oraenv
ORACLE_SID = [CRS] ?
[pwqa@db01-lisqa /home/pwqa] $
[pwqa@db01-lisqa /home/pwqa] $ oifcfg getif
ce0 192.168.222.0 global public
ce2 192.168.222.0 global public


[Note: The private interfaces are not shown here as they are configured using IPMP]
We need to change the public interfaces from the old network (192.168.x.x) to the new network (156.30.x.x).

[pwqa@db01-lisqa /home/pwqa] $ oifcfg delif -global ce0
[pwqa@db01-lisqa /home/pwqa] $ oifcfg delif -global ce2
[pwqa@db01-lisqa /home/pwqa] $ oifcfg setif -global ce0/156.30.179.0:public
[pwqa@db01-lisqa /home/pwqa] $ oifcfg setif -global ce2/156.30.179.0:public
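The value given to "oifcfg setif" is a subnet, which is the bitwise AND of an interface address and its netmask. The sketch below (subnet_of is an illustrative name, not an Oracle tool) computes it. Note that for the ce0 address and netmask shown in step 7 it yields 156.30.178.0 rather than the 156.30.179.0 we used above, so it may be worth confirming which value your environment expects before running setif.

```shell
# Compute the subnet for a dotted-decimal IP address and netmask by ANDing
# them octet by octet.
subnet_of() {
    ip=$1
    mask=$2
    oldifs=$IFS
    IFS=.
    # Split both values on "." into eight positional parameters.
    set -- $ip $mask
    IFS=$oldifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# ce0 address and netmask from the ifconfig output in step 7.
subnet_of 156.30.179.97 255.255.254.0
```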



9) Reboot the servers (all 4 nodes) and verify the changes using "srvctl config nodeapps -n <node> -a" and "oifcfg getif". They should now reflect the new IPs.


10) Update listener.ora (if it has any hard-coded IP addresses) and re-enable ASM and the databases.

[pwqa@db01-lisqa /home/pwqa] $ srvctl enable asm -n db01-lisqa
[pwqa@db01-lisqa /home/pwqa] $ srvctl enable asm -n db02-lisqa
[pwqa@db01-lisqa /home/pwqa] $ srvctl enable asm -n db03-lisqa
[pwqa@db01-lisqa /home/pwqa] $ srvctl enable asm -n db04-lisqa
[pwqa@db01-lisqa /home/pwqa] $ srvctl enable database -d PWQA
[pwqa@db01-lisqa /home/pwqa] $ srvctl enable database -d PLQA
[pwqa@db01-lisqa /home/pwqa] $ srvctl enable database -d CGQA
[pwqa@db01-lisqa /home/pwqa] $ srvctl enable database -d HLPR


Finally, make sure any tnsnames.ora entries with hard-coded IP addresses are updated.
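One way to hunt for leftovers is to grep the Oracle Net files for the old subnet. This is a sketch: the find_old_ips name is made up, the pattern assumes the old 192.168.222.x range from step 1, and the file locations in the usage comment assume the default $ORACLE_HOME/network/admin directory, which may differ on your hosts.

```shell
# Print every line (with its line number) that references the old
# 192.168.222.x subnet in the files given as arguments.
find_old_ips() {
    grep -n '192\.168\.222\.' "$@"
}

# Example usage (paths are an assumption; adjust to your installation):
#   find_old_ips $ORACLE_HOME/network/admin/tnsnames.ora \
#                $ORACLE_HOME/network/admin/listener.ora
```

No output means no references to the old range were found in the files checked.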