Also, the database can be started and stopped fine using SQL*Plus. No problems there.
$ srvctl start database -d HLPR
PRKP-1001 : Error starting instance HLPR1 on node db01-lisqa
CRS-0215: Could not start resource 'ora.HLPR.HLPR1.inst'.
PRKP-1001 : Error starting instance HLPR2 on node db02-lisqa
CRS-0215: Could not start resource 'ora.HLPR.HLPR2.inst'.
PRKP-1001 : Error starting instance HLPR3 on node db03-lisqa
CRS-0215: Could not start resource 'ora.HLPR.HLPR3.inst'.
PRKP-1001 : Error starting instance HLPR4 on node db04-lisqa
CRS-0215: Could not start resource 'ora.HLPR.HLPR4.inst'.
$
As seen above, it is a 4-node RAC database.
2 issues here
1) REQUIRED_RESOURCE missing for one of the nodes
Doing a CRS_STAT on the above resources, shows that the REQUIRED_RESOURCES for node 4 instance (HLPR4) is missing. For the other 3 nodes the ASM dependency is defined. It is already verified that ASM is up and running on the 4th node (also there are other RAC databases on the same cluster that do not have any issues).
$ crs_stat -p ora.HLPR.HLPR1.inst|grep REQUIRED
REQUIRED_RESOURCES=ora.db01-lisqa.ASM1.asm
$ crs_stat -p ora.HLPR.HLPR2.inst|grep REQUIRED
REQUIRED_RESOURCES=ora.db02-lisqa.ASM2.asm
$ crs_stat -p ora.HLPR.HLPR3.inst|grep REQUIRED
REQUIRED_RESOURCES=ora.db03-lisqa.ASM3.asm
$ crs_stat -p ora.HLPR.HLPR4.inst|grep REQUIRED
REQUIRED_RESOURCES=
$
Fix
Add the REQUIRED_RESOURCE using "srvctl modify". First lets look at the syntax (-h for help) -
$ srvctl modify instance -h
Usage: srvctl modify database -d
-d
-n
-o
-m
-p
-r
-s
-y
-h Print usage
$ srvctl modify instance -d HLPR -i HLPR4 -s +ASM4
$
$ crs_stat -p ora.HLPR.HLPR4.inst|grep REQUIRE
REQUIRED_RESOURCES=ora.db04-lisqa.ASM4.asm
$
2) SPFILE was incorrectly defined in the OCR
The imon log for the database under
$ cd $ORACLE_HOME/log/
$ vi imon_HLPR.log
...
...
2009-03-19 20:36:02.729: [ RACG][1] [8400][1][ora.HLPR.HLPR1.inst]: racgimon exiting
2009-03-19 20:43:34.680: [ RACG][1] [12432][1][ora.HLPR.HLPR1.inst]: racgimon started
2009-03-19 20:43:37.137: [ RACG][6] [12432][6][ora.HLPR.HLPR1.inst]:
SQL*Plus: Release 10.2.0.4.0 - Production on Thu Mar 19 20:43:36 2009
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Enter user-name: Connected to an idle instance.
SQL> ORA-01565: error in identifying file '+RAC_REDO02/HLPR/spfileHLPR.ora'
2009-03-19 20:43:37.137: [ RACG][6] [12432][6][ora.HLPR.HLPR1.inst]: ORA-17503: ksfdopn:2 Failed to open file
+RAC_REDO02/HLPR/spfileHLPR.ora
ORA-15056: additional error message
ORA-17503: ksfdopn:DGOpenFile05 Failed to open file +RAC_REDO02/hlpr/spfile
ORA-01078: failure in processing system parameters
SQL> Disconnected
This shows that the SPFILE was incorrectly defined in the OCR
Fix
Modified OCR for the correct location of the SPFILE -
$ cat $ORACLE_HOME/dbs/initHLPR1.ora
spfile='+HLPR_DATA/HLPR/PARAMETERFILE/spfile.286.640265465'
$
$ srvctl modify database -d HLPR -p +HLPR_DATA/HLPR/PARAMETERFILE/spfile.286.640265465
$
This fixed the OCR and database could then be started using SRVCTL.
There is also one more issue that I noticed today in the listener configuration w/ CRS. On node4, the listener was up under db_home (typically it is seen under ASM home). Doing a "ps -ef|grep tns" will tell what OH is being used for the listener home. Also, the crs_stat -p output shows that.
[pwqa@db04-lisqa /home/pwqa] $ ps -ef|grep tns
pwqa 1982 1 0 May 13 ? 2:06 /d01/app/pwqa/product/10.2/pwqadb/bin/tnslsnr LISTENER_DB04-LISQA -inherit
pwqa 1965 1 0 May 13 ? 5:49 /d01/app/pwqa/product/10.2/pwqadb/bin/tnslsnr LISTENER_CGQA_DB04-LISQA -inherit
pwqa 2007 1 0 May 13 ? 4:33 /d01/app/pwqa/product/10.2/pwqadb/bin/tnslsnr LISTENER_HLPR_DB04-LISQA -inherit
pwqa 2041 1 0 May 13 ? 5:43 /d01/app/pwqa/product/10.2/pwqadb/bin/tnslsnr LISTENER_PWQA_DB04-LISQA -inherit
pwqa 2024 1 0 May 13 ? 6:33 /d01/app/pwqa/product/10.2/pwqadb/bin/tnslsnr LISTENER_PLQA_DB04-LISQA -inherit
pwqa 22092 23479 0 11:18:42 pts/1 0:00 grep tns
As we can see from above, listener "
[pwqa@db04-lisqa /home/pwqa] $ crs_stat -p ora.db04-lisqa.LISTENER_DB04-LISQA.lsnr
NAME=ora.db04-lisqa.LISTENER_DB04-LISQA.lsnr
TYPE=application
ACTION_SCRIPT=/d01/app/pwqa/product/10.2/pwqadb/bin/racgwrap
ACTIVE_PLACEMENT=0
AUTO_START=1
CHECK_INTERVAL=600
DESCRIPTION=CRS application for listener on node
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
HOSTING_MEMBERS=db04-lisqa
OPTIONAL_RESOURCES=
PLACEMENT=restricted
REQUIRED_RESOURCES=ora.db04-lisqa.vip
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=600
START_TIMEOUT=0
STOP_TIMEOUT=0
UPTIME_THRESHOLD=7d
USR_ORA_ALERT_NAME=
USR_ORA_CHECK_TIMEOUT=0
USR_ORA_CONNECT_STR=/ as sysdba
USR_ORA_DEBUG=0
USR_ORA_DISCONNECT=false
USR_ORA_FLAGS=
USR_ORA_IF=
USR_ORA_INST_NOT_SHUTDOWN=
USR_ORA_LANG=
USR_ORA_NETMASK=
USR_ORA_OPEN_MODE=
USR_ORA_OPI=false
USR_ORA_PFILE=
USR_ORA_PRECONNECT=none
USR_ORA_SRV=
USR_ORA_START_TIMEOUT=0
USR_ORA_STOP_MODE=immediate
USR_ORA_STOP_TIMEOUT=0
USR_ORA_VIP=
[pwqa@db04-lisqa /home/pwqa] $
The action script clearly points to the database home.
To change this, we again use the "srvctl modify" command.
[pwqa@db04-lisqa /home/pwqa] $ srvctl modify listener -h
Usage: srvctl modify listener -n
-n
-l "
-o
-h Print usage
[pwqa@db04-lisqa /home/pwqa] $
[pwqa@db04-lisqa /home/pwqa] $ srvctl modify listener -n db04-lisqa -l LISTENER_DB04-LISQA -o /d01/app/pwqa/product/10.2/ASM
[pwqa@db04-lisqa /home/pwqa] $ crs_stat -p ora.db04-lisqa.LISTENER_DB04-LISQA.lsnr
NAME=ora.db04-lisqa.LISTENER_DB04-LISQA.lsnr
TYPE=application
ACTION_SCRIPT=/d01/app/pwqa/product/10.2/ASM/bin/racgwrap
ACTIVE_PLACEMENT=0
AUTO_START=1
CHECK_INTERVAL=600
DESCRIPTION=CRS application for listener on node
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
HOSTING_MEMBERS=db04-lisqa
OPTIONAL_RESOURCES=
PLACEMENT=restricted
REQUIRED_RESOURCES=ora.db04-lisqa.vip
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=600
START_TIMEOUT=0
STOP_TIMEOUT=0
UPTIME_THRESHOLD=7d
USR_ORA_ALERT_NAME=
USR_ORA_CHECK_TIMEOUT=0
USR_ORA_CONNECT_STR=/ as sysdba
USR_ORA_DEBUG=0
USR_ORA_DISCONNECT=false
USR_ORA_FLAGS=
USR_ORA_IF=
USR_ORA_INST_NOT_SHUTDOWN=
USR_ORA_LANG=
USR_ORA_NETMASK=
USR_ORA_OPEN_MODE=
USR_ORA_OPI=false
USR_ORA_PFILE=
USR_ORA_PRECONNECT=none
USR_ORA_SRV=
USR_ORA_START_TIMEOUT=0
USR_ORA_STOP_MODE=immediate
USR_ORA_STOP_TIMEOUT=0
USR_ORA_VIP=
[pwqa@db04-lisqa /home/pwqa] $
The CRS registration has been changed.
Now we only need to stop the listener and restart it under the ASM home.
1 comment:
Hi Ravi,
We are also facing same issue with 2node cluster. 2Node cluster are not coming up due spfile parameter value.
Your document helped me to troubleshoot my issue.
Thanks for putting good docs here.
Jitu
Post a Comment