How can I add a virtual IP for the groups in Cluster Server?
Thanks
Regards
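In case it helps: a virtual IP is normally added as an IP resource inside the service group, linked on top of the group's NIC resource. A hedged sketch with the VCS CLI, where the group, resource, interface and address names are all placeholders:

```shell
# All names and addresses below are placeholders.
haconf -makerw                          # open the configuration read-write
hares -add app_ip IP app_sg             # add an IP resource to group app_sg
hares -modify app_ip Device bge0        # interface that will carry the VIP
hares -modify app_ip Address 10.10.10.100
hares -modify app_ip NetMask 255.255.255.0
hares -modify app_ip Enabled 1
hares -link app_ip app_nic              # VIP depends on the NIC resource
haconf -dump -makero                    # save and close the configuration
```

On Solaris with IPMP-managed interfaces (MultiNICB, as used elsewhere in this thread), the resource type would instead be IPMultiNICB with its BaseResName attribute pointing at the MultiNICB resource; check the bundled agents guide for the exact attributes of your version.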
Solaris 10, SFRAC 5.1SP1RP2
===========
after reboot:
Hi
After creating the resource and trying to bring it online, this message appeared (
Regards
SFHA 5.1SP1RP3 on AIX.
The customer has 8 processors and CPU usage is 97%; GAB panicked the system. Is that normal?
From a common-sense standpoint, perhaps one processor is busy, but why not use the other processors? Does 800% mean all CPUs are used up?
========
Hello Guys,
I need to find the root cause of a service failure in our Veritas cluster.
The service groups did not fail over to the other node.
Below are the logs; as far as I can see, all of this started with the NIC failures and the IPMultiNICB resource going faulty.
I'd appreciate it if anyone can help me here.
engine logs
====================================
2013/05/17 12:41:41 VCS INFO V-16-2-13075 (pk-ercoss1) Resource(ossfs_ip) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(1).
2013/05/17 12:41:42 VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss1
2013/05/17 12:41:42 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pub_p (Owner: Unspecified, Group: PubLan) on System pk-ercoss1
2013/05/17 12:41:42 VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss2
2013/05/17 12:41:42 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pub_p (Owner: Unspecified, Group: PubLan) on System pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-6-0 (pk-ercoss1) resfault:(resfault) Invoked with arg0=pk-ercoss1, arg1=pub_mnic, arg2=ONLINE
2013/05/17 12:41:43 VCS INFO V-16-6-0 (pk-ercoss2) resfault:(resfault) Invoked with arg0=pk-ercoss2, arg1=pub_mnic, arg2=ONLINE
2013/05/17 12:41:43 VCS INFO V-16-0 (pk-ercoss1) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss1 ,arg2=pub_mnic
2013/05/17 12:41:43 VCS INFO V-16-0 (pk-ercoss2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss2 ,arg2=pub_mnic
2013/05/17 12:41:43 VCS INFO V-16-6-15002 (pk-ercoss1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss1 pub_mnic ONLINE successfully
2013/05/17 12:41:43 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss2 pub_mnic ONLINE successfully
2013/05/17 12:41:43 VCS INFO V-16-1-10305 Resource pub_p (Owner: Unspecified, Group: PubLan) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:41:43 VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss2
2013/05/17 12:41:43 VCS NOTICE V-16-1-10446 Group PubLan is offline on system pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss2 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss1 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss1
2013/05/17 12:41:43 VCS NOTICE V-16-1-10235 Restart is set for group PubLan. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
2013/05/17 12:41:43 VCS INFO V-16-1-10305 Resource pub_p (Owner: Unspecified, Group: PubLan) is offline on pk-ercoss1 (VCS initiated)
2013/05/17 12:41:43 VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss1
2013/05/17 12:41:43 VCS NOTICE V-16-1-10446 Group PubLan is offline on system pk-ercoss1
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss2 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss1 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss1
2013/05/17 12:41:43 VCS NOTICE V-16-1-10235 Restart is set for group PubLan. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline:(postoffline) Invoked with arg0=pk-ercoss2, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline:Executing /ericsson/core/cluster/scripts/postoffline.sh with arg0=pk-ercoss2, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline:(postoffline) Invoked with arg0=pk-ercoss1, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline:Executing /ericsson/core/cluster/scripts/postoffline.sh with arg0=pk-ercoss1, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline.sh:PubLan:Nothing done
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline:Completed execution of /ericsson/core/cluster/scripts/postoffline.sh for group PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline.sh:PubLan:Nothing done
2013/05/17 12:41:44 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/postoffline pk-ercoss2 PubLan successfully
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline:Completed execution of /ericsson/core/cluster/scripts/postoffline.sh for group PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-15002 (pk-ercoss1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/postoffline pk-ercoss1 PubLan successfully
2013/05/17 12:41:49 VCS INFO V-16-2-13075 (pk-ercoss1) Resource(snmp_ip) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(1).
2013/05/17 12:41:50 VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys pk-ercoss2
2013/05/17 12:41:50 VCS NOTICE V-16-1-10300 Initiating Offline of Resource stop_sybase (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:41:50 VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys pk-ercoss2
2013/05/17 12:41:50 VCS INFO V-16-6-0 (pk-ercoss2) resfault:(resfault) Invoked with arg0=pk-ercoss2, arg1=syb1_p1, arg2=ONLINE
2013/05/17 12:41:50 VCS INFO V-16-6-0 (pk-ercoss2) resfault:(resfault) Invoked with arg0=pk-ercoss2, arg1=ossfs_p1, arg2=ONLINE
2013/05/17 12:41:50 VCS INFO V-16-10001-88 (pk-ercoss2) Application:stop_sybase:offline:Executed [/ericsson/core/cluster/scripts/stop_sybase.sh stop] successfully.
2013/05/17 12:41:50 VCS INFO V-16-0 (pk-ercoss2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss2 ,arg2=syb1_p1
2013/05/17 12:41:50 VCS INFO V-16-0 (pk-ercoss2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss2 ,arg2=ossfs_p1
2013/05/17 12:41:50 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss2 syb1_p1 ONLINE successfully
2013/05/17 12:41:50 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss2 ossfs_p1 ONLINE successfully
2013/05/17 12:41:53 VCS INFO V-16-1-10305 Resource stop_sybase (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:41:53 VCS NOTICE V-16-1-10300 Initiating Offline of Resource masterdataservice_BACKUP (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:00 VCS NOTICE V-16-20018-26 (pk-ercoss2) SybaseBk:masterdataservice_BACKUP:offline:Sybase Backup service masterdataservice_BACKUP has been stopped
2013/05/17 12:42:00 VCS INFO V-16-2-13716 (pk-ercoss2) Resource(masterdataservice_BACKUP): Output of the completed operation (offline)
==============================================
Password:
Backup Server: 3.48.1.1: The Backup Server will go down immediately.
Terminating sessions.
==============================================
2013/05/17 12:42:00 VCS WARNING V-16-20018-301 (pk-ercoss2) SybaseBk:masterdataservice_BACKUP:monitor:Open for backupserver failed, setting cookie to NULL
2013/05/17 12:42:00 VCS INFO V-16-1-10305 Resource masterdataservice_BACKUP (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:00 VCS NOTICE V-16-1-10300 Initiating Offline of Resource masterdataservice (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:02 VCS NOTICE V-16-20018-18 (pk-ercoss2) Sybase:masterdataservice:offline:Sybase service masterdataservice has been stopped
2013/05/17 12:42:03 VCS INFO V-16-2-13716 (pk-ercoss2) Resource(masterdataservice): Output of the completed operation (offline)
==============================================
Password:
Server SHUTDOWN by request.
ASE is terminating this process.
CT-LIBRARY error:
ct_results(): network packet layer: internal net library error: Net-Library operation terminated due to disconnect
==============================================
2013/05/17 12:42:03 VCS WARNING V-16-20018-301 (pk-ercoss2) Sybase:masterdataservice:monitor:Open for dataserver failed, setting cookie to NULL
2013/05/17 12:42:03 VCS INFO V-16-1-10305 Resource masterdataservice (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syb1_ip (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS INFO V-16-1-10305 Resource syb1_ip (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sybmaster_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syblog_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sybdata_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pmsyblog_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pmsybdata_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource fmsyblog_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource fmsybdata_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource dbdumps_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syb1bak_ip (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource sybmaster_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource syblog_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource sybdata_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource pmsyblog_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource pmsybdata_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
===========================================================================================
message file
============================================================================================
May 17 12:41:37 pk-ercoss1 in.mpathd[6024]: [ID 594170 daemon.error] NIC failure detected on oce9 of group pub_mnic
May 17 12:41:37 pk-ercoss1 in.mpathd[6024]: [ID 832587 daemon.error] Successfully failed over from NIC oce9 to NIC oce0
May 17 12:41:38 pk-ercoss1 in.mpathd[6024]: [ID 168056 daemon.error] All Interfaces in group pub_m have failed
May 17 12:41:42 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss1
May 17 12:41:42 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss2
May 17 12:41:43 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss2
May 17 12:41:43 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss1
May 17 12:41:50 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys pk-ercoss2
May 17 12:41:50 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys pk-ercoss2
May 17 12:41:56 pk-ercoss1 in.mpathd[6024]: [ID 299542 daemon.error] NIC repair detected on oce0 of group pub_mnic
May 17 12:41:56 pk-ercoss1 in.mpathd[6024]: [ID 620804 daemon.error] Successfully failed back to NIC oce0
May 17 12:41:56 pk-ercoss1 in.mpathd[6024]: [ID 237757 daemon.error] At least 1 interface (oce0) of group pub_mnic has repaired
May 17 12:41:57 pk-ercoss1 in.mpathd[6024]: [ID 299542 daemon.error] NIC repair detected on oce9 of group pub_mnic
May 17 12:41:57 pk-ercoss1 in.mpathd[6024]: [ID 620804 daemon.error] Successfully failed back to NIC oce9
May 17 12:42:10 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group Sybase1 is faulted on system pk-ercoss2
May 17 12:42:11 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(5) Agent is calling clean for resource(ossfs_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:11 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (pk-ercoss1) Agent is calling clean for resource(ossfs_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:11 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(5) Resource(ossfs_ip) - clean completed successfully.
May 17 12:42:12 pk-ercoss1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,3a40@1c/pci103c,3245@0/sd@1,0 (sd9):
May 17 12:42:12 pk-ercoss1 drive offline
May 17 12:42:12 pk-ercoss1 vxdmp: [ID 480808 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 disabled path 30/0x240 belonging to the dmpnode 264/0x40 due to open failure
May 17 12:42:12 pk-ercoss1 vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 264/0x40
May 17 12:42:12 pk-ercoss1 vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x108/0x42
May 17 12:42:12 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
May 17 12:42:12 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 sybasedg: dg import with I/O fence enabled
May 17 12:42:15 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-10001-5004 IPMultiNICB:syb1_ip:online:Can not online. No interfaces available
May 17 12:42:15 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-10001-5004 (pk-ercoss1) IPMultiNICB:syb1_ip:online:Can not online. No interfaces available
May 17 12:42:19 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(5) Agent is calling clean for resource(snmp_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:19 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (pk-ercoss1) Agent is calling clean for resource(snmp_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:19 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(5) Resource(snmp_ip) - clean completed successfully.
May 17 12:42:21 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys pk-ercoss1
May 17 12:42:21 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys pk-ercoss1
May 17 12:42:50 pk-ercoss1 svc.startd[9]: [ID 652011 daemon.warning] svc:/ericsson/eric_3pp/activemq_oss_loggingbroker:default: Method "/ericsson/activemq/bin/activeMQ.sh stopActiveMqLogger" failed with exit status 1.
May 17 12:42:50 pk-ercoss1 svc.startd[9]: [ID 652011 daemon.warning] svc:/ericsson/eric_3pp/activemq:default: Method "/ericsson/activemq/bin/activeMQ.sh stopActiveMq" failed with exit status 1.
May 17 12:43:16 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13066 Thread(4) Agent is calling clean for resource(syb1_ip) because the resource is not up even after online completed.
May 17 12:43:16 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13066 (pk-ercoss1) Agent is calling clean for resource(syb1_ip) because the resource is not up even after online completed.
May 17 12:43:16 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(4) Resource(syb1_ip) - clean completed successfully.
May 17 12:43:16 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13072 Thread(4) Resource(syb1_ip): Agent is retrying online (attempt number 1 of 1).
May 17 12:43:28 pk-ercoss1 vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x108/0x42
May 17 12:43:28 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
May 17 12:43:28 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 sybasedg: dg import with I/O fence enabled
May 17 12:43:30 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method or service exit timed out. Killing contract 322855.
May 17 12:43:30 pk-ercoss1 svc.startd[9]: [ID 636263 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method "/etc/init.d/TBS stop" failed due to signal KILL.
May 17 12:44:31 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method or service exit timed out. Killing contract 322861.
May 17 12:44:31 pk-ercoss1 svc.startd[9]: [ID 636263 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method "/etc/init.d/TBS stop" failed due to signal KILL.
May 17 12:44:49 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(13) Agent is calling clean for resource(tomcat) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:44:49 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (pk-ercoss1) Agent is calling clean for resource(tomcat) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:44:49 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(13) Resource(tomcat) - clean completed successfully.
May 17 12:44:49 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13073 Thread(13) Resource(tomcat) became OFFLINE unexpectedly on its own. Agent is restarting (attempt number 1 of 2) the resource.
May 17 12:44:49 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13073 (pk-ercoss1) Resource(tomcat) became OFFLINE unexpectedly on its own. Agent is restarting (attempt number 1 of 2) the resource.
May 17 12:45:32 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method or service exit timed out. Killing contract 322865.
May 17 12:45:32 pk-ercoss1 svc.startd[9]: [ID 636263 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method "/etc/init.d/TBS stop" failed due to signal KILL.
May 17 12:45:32 pk-ercoss1 svc.startd[9]: [ID 748625 daemon.error] ericsson/eric_ep/TBS:default failed: transitioned to maintenance (see 'svcs -xv' for details)
May 17 12:46:18 pk-ercoss1 su: [ID 810491 auth.crit] 'su sybase' failed for sybase on /dev/???
May 17 12:47:19 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_3pp/glassfish:default: Method or service exit timed out. Killing contract 540.
May 17 12:47:19 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13011 Thread(14) Resource(glassfish): offline procedure did not complete within the expected time.
May 17 12:47:19 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13063 Thread(14) Agent is calling clean for resource(glassfish) because offline did not complete within the expected time.
May 17 12:47:19 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13063 (pk-ercoss1) Agent is calling clean for resource(glassfish) because offline did not complete within the expected time.
May 17 12:47:20 pk-ercoss1 svc.startd[9]: [ID 748625 daemon.error] ericsson/eric_3pp/glassfish:default failed: transitioned to maintenance (see 'svcs -xv' for details)
May 17 12:47:21 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(14) Resource(glassfish) - clean completed successfully.
======================================================
hastatus output at present
========================================================
-- SYSTEM STATE
-- System State Frozen
A pk-ercoss1 RUNNING 0
A pk-ercoss2 RUNNING 0
-- GROUP STATE
-- Group System Probed AutoDisabled State
B BkupLan pk-ercoss1 Y N ONLINE
B BkupLan pk-ercoss2 Y N ONLINE
B DDCMon pk-ercoss1 Y N ONLINE
B DDCMon pk-ercoss2 Y N PARTIAL
B Oss pk-ercoss1 Y N ONLINE
B Oss pk-ercoss2 Y N OFFLINE
B Ossfs pk-ercoss1 Y N ONLINE
B Ossfs pk-ercoss2 Y N OFFLINE
B PrivLan pk-ercoss1 Y N ONLINE
B PrivLan pk-ercoss2 Y N ONLINE
B PubLan pk-ercoss1 Y N ONLINE
B PubLan pk-ercoss2 Y N ONLINE
B StorLan pk-ercoss1 Y N ONLINE
B StorLan pk-ercoss2 Y N ONLINE
B Sybase1 pk-ercoss1 Y N OFFLINE
B Sybase1 pk-ercoss2 Y N ONLINE
============================================================================
pk-ercoss1{root} # hagrp -resources PubLan
pub_mnic
pub_p
pk-ercoss1{root} # hares -display pub_mnic
#Resource Attribute System Value
pub_mnic Group global PubLan
pub_mnic Type global MultiNICB
pub_mnic AutoStart global 1
pub_mnic Critical global 1
pub_mnic Enabled global 1
pub_mnic LastOnline global pk-ercoss2
pub_mnic MonitorOnly global 0
pub_mnic ResourceOwner global
pub_mnic TriggerEvent global 0
pub_mnic ArgListValues pk-ercoss1 UseMpathd 1 1 MpathdCommand 1 /usr/lib/inet/in.mpathd ConfigCheck 1 1 MpathdRestart 1 1 Device 4 oce0 0 oce9 1 NetworkHosts 1 10.207.1.254 LinkTestRatio 1 1 IgnoreLinkStatus 1 1 NetworkTimeout 1 100 OnlineTestRepeatCount 1 3 OfflineTestRepeatCount 1 3 NoBroadcast 1 0 DefaultRouter 1 0.0.0.0 Failback 1 0 GroupName 1 "" Protocol 1 IPv4
pub_mnic ArgListValues pk-ercoss2 UseMpathd 1 1 MpathdCommand 1 /usr/lib/inet/in.mpathd ConfigCheck 1 1 MpathdRestart 1 1 Device 4 oce0 0 oce9 1 NetworkHosts 1 10.207.1.254 LinkTestRatio 1 1 IgnoreLinkStatus 1 1 NetworkTimeout 1 100 OnlineTestRepeatCount 1 3 OfflineTestRepeatCount 1 3 NoBroadcast 1 0 DefaultRouter 1 0.0.0.0 Failback 1 0 GroupName 1 "" Protocol 1 IPv4
pub_mnic ConfidenceLevel pk-ercoss1 0
pub_mnic ConfidenceLevel pk-ercoss2 0
pub_mnic ConfidenceMsg pk-ercoss1
pub_mnic ConfidenceMsg pk-ercoss2
pub_mnic Flags pk-ercoss1
pub_mnic Flags pk-ercoss2
pub_mnic IState pk-ercoss1 not waiting
pub_mnic IState pk-ercoss2 not waiting
pub_mnic MonitorMethod pk-ercoss1 Traditional
pub_mnic MonitorMethod pk-ercoss2 Traditional
pub_mnic Probed pk-ercoss1 1
pub_mnic Probed pk-ercoss2 1
pub_mnic Start pk-ercoss1 0
pub_mnic Start pk-ercoss2 0
pub_mnic State pk-ercoss1 ONLINE
pub_mnic State pk-ercoss2 ONLINE
pub_mnic ComputeStats global 0
pub_mnic ConfigCheck global 1
pub_mnic DefaultRouter global 0.0.0.0
pub_mnic Failback global 0
pub_mnic GroupName global
pub_mnic IgnoreLinkStatus global 1
pub_mnic LinkTestRatio global 1
pub_mnic MpathdCommand global /usr/lib/inet/in.mpathd
pub_mnic MpathdRestart global 1
pub_mnic NetworkHosts global 10.207.1.254
pub_mnic NetworkTimeout global 100
pub_mnic NoBroadcast global 0
pub_mnic OfflineTestRepeatCount global 3
pub_mnic OnlineTestRepeatCount global 3
pub_mnic Protocol global IPv4
pub_mnic TriggerResStateChange global 0
pub_mnic UseMpathd global 1
pub_mnic ContainerInfo pk-ercoss1 Type Name Enabled
pub_mnic ContainerInfo pk-ercoss2 Type Name Enabled
pub_mnic Device pk-ercoss1 oce0 0 oce9 1
pub_mnic Device pk-ercoss2 oce0 0 oce9 1
pub_mnic MonitorTimeStats pk-ercoss1 Avg 0 TS
pub_mnic MonitorTimeStats pk-ercoss2 Avg 0 TS
pub_mnic ResourceInfo pk-ercoss1 State Valid Msg TS
pub_mnic ResourceInfo pk-ercoss2 State Valid Msg TS
pk-ercoss1{root} # hares -display pub_p
#Resource Attribute System Value
pub_p Group global PubLan
pub_p Type global Phantom
pub_p AutoStart global 1
pub_p Critical global 1
pub_p Enabled global 1
pub_p LastOnline global pk-ercoss1
pub_p MonitorOnly global 0
pub_p ResourceOwner global
pub_p TriggerEvent global 0
pub_p ArgListValues pk-ercoss1 ""
pub_p ArgListValues pk-ercoss2 ""
pub_p ConfidenceLevel pk-ercoss1 100
pub_p ConfidenceLevel pk-ercoss2 100
pub_p ConfidenceMsg pk-ercoss1
pub_p ConfidenceMsg pk-ercoss2
pub_p Flags pk-ercoss1
pub_p Flags pk-ercoss2
pub_p IState pk-ercoss1 not waiting
pub_p IState pk-ercoss2 not waiting
pub_p MonitorMethod pk-ercoss1 Traditional
pub_p MonitorMethod pk-ercoss2 Traditional
pub_p Probed pk-ercoss1 1
pub_p Probed pk-ercoss2 1
pub_p Start pk-ercoss1 1
pub_p Start pk-ercoss2 1
pub_p State pk-ercoss1 ONLINE
pub_p State pk-ercoss2 ONLINE
pub_p ComputeStats global 0
pub_p TriggerResStateChange global 0
pub_p ContainerInfo pk-ercoss1 Type Name Enabled
pub_p ContainerInfo pk-ercoss2 Type Name Enabled
pub_p MonitorTimeStats pk-ercoss1 Avg 0 TS
pub_p MonitorTimeStats pk-ercoss2 Avg 0 TS
pub_p ResourceInfo pk-ercoss1 State Valid Msg TS
pub_p ResourceInfo pk-ercoss2 State Valid Msg TS
One 2-node VCS cluster; the heartbeat NICs are eth2 and eth3 on each node.
If eth2 on node1 is down and eth3 on node2 is down, does that mean both heartbeat links are down and the cluster is in a split-brain situation?
Can LLT heartbeats pass between NIC eth2 and NIC eth3?
The VCS Installation Guide requires the two heartbeat links to be on different networks, so we should put eth2 of both nodes in one VLAN (VLAN1) and eth3 of both nodes in another VLAN (VLAN2). In that situation heartbeats cannot pass between eth2 and eth3.
But in a production cluster we found that all four NICs (eth2 and eth3 of both nodes) are in the same VLAN, and this led me to post this discussion thread to ask:
If eth2 on node1 is down and eth3 on node2 is down, what will happen to the cluster (which is in active-standby mode)?
Thanks!
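Whether the two links can actually reach each other can be checked directly from the running cluster: LLT reports, per link, which peer nodes it can see. If each link really is in its own VLAN, the counters for the cross-link paths will show no traffic. A hedged sketch (output format varies by version):

```shell
# Per-node, per-link peer visibility; shows whether node2 is reachable
# over the link tagged eth2, the one tagged eth3, or both.
lltstat -nvv | more

# List the configured links (tags and devices from /etc/llttab).
lltstat -l
```

If `lltstat -nvv` shows each peer reachable on both link tags even though the tags map to "different networks", the four NICs are in practice on one VLAN, which matches what you observed in production.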
I recently downloaded the Veritas Cluster media for learning/testing purposes.
The issue is that when I try to extract the media:
gtar -xvzf VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz
it started to extract, but after extracting 865 MB of data it returned this error:
./dvd1-sol_sparc/readme_first.txt
gzip: stdin: invalid compressed data--crc error
/usr/sfw/bin/gtar: Child returned status 1
/usr/sfw/bin/gtar: Error is not recoverable: exiting now
bash-3.2# du -sh dvd1-sol_sparc/
865M dvd1-sol_sparc
then I tried
bash-3.2# gunzip -d VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz
gunzip: VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz: invalid compressed data--crc error
then
gzip -d VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz | tar xvf -
gzip: VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz: invalid compressed data--crc error
tar: blocksize = 0
I also tried WinRAR, 7-Zip, etc.
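A CRC error at the same offset from every tool usually means the downloaded file itself is truncated or corrupt rather than a tar problem. Also note that `gzip -d file.tar.gz | tar xvf -` decompresses in place and pipes nothing to tar (hence `blocksize = 0`); the piping form is `gzip -dc`. A small sketch of verifying before extracting (the file name is taken from the post):

```shell
# Check the archive's integrity before extracting; gzip -t reads the
# whole stream and reports CRC errors without writing any files.
archive=VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz

if gzip -t "$archive" 2>/dev/null; then
    # -dc writes the decompressed stream to stdout, keeping the .gz intact
    gzip -dc "$archive" | tar xf -
else
    echo "CRC failure: re-download and compare size/checksum with the source" >&2
fi
```

If `gzip -t` fails, re-download the media and compare its size (and checksum, if the download site publishes one) before trying to extract again.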
Thanks in advance, gurus.
Owais Hyder.
VCS 5.1SP1 on Red Hat.
The customer configured the notifier with the wizard and got the following VCS alert mail:
Hello
I need to change the NIC ports used for the VCS heartbeat. Normally I use vi to edit /etc/llttab and reboot the nodes.
Is that the correct way to change the LLT connections?
Thank you.
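For reference, the lines that name the heartbeat NICs in /etc/llttab are the `link` directives. A hedged example of what the edited file might look like (node name, cluster ID and device names are placeholders, and the exact device-field syntax varies by platform and version):

```
set-node node1
set-cluster 100
link eth4 eth4 - ether - -
link eth5 eth5 - ether - -
```

Instead of rebooting, it is usually possible to take VCS, GAB and LLT down on one node at a time, edit the file, and restart the stack, but the safe per-version procedure should be taken from the installation guide for your release.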
Hello,
I'm currently creating a VCS cluster (without SFW for disk management) and I get some blue screens when I mount disk resources in the cluster. In VCS with the Windows LDM, you need to use DiskRes and Mount resources in the cluster.
When I online the DiskRes, all is OK (it works), but when I try to online the Mount resource I get a blue screen, stop 0xBADFFFFF.
I've found a solution on this site: http://www.experts-exchange.com/OS/Microsoft_Operating_Systems/Server/Windows_Server_2008/Q_27781502.html. MPIO must be configured with the Failover policy, not the Round Robin policy. In Disk Administrator, click the disk (not the partition) and choose Properties. Go to the MPIO tab and change the policy to Failover Only.
Hi Support
We have a two-node VCS cluster. One node is down and unable to boot, so we brought the cluster up locally on the other node. The node that will not boot gives the following errors (I have included only the last few lines). The OS is Solaris 9 and the platform is SF4900.
=============================================================
Rebooting with command: boot -s
Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
hba0: QLogic QLA2300 Fibre Channel Host Adapter fcode version 2.00.05 01/29/03
hba0: Firmware v3.3.12 (ipx)
QLogic Fibre Channel Driver v4.20 Instance: 1
hba1: QLogic QLA2300 Fibre Channel Host Adapter fcode version 2.00.05 01/29/03
hba1: Firmware v3.3.12 (ipx)
Hardware watchdog enabled
VxVM sysboot INFO V-5-2-3390 Starting restore daemon...
VxVM sysboot INFO V-5-2-3409 starting in boot mode...
NOTICE: VxVM vxdmp V-5-0-34 added disk array DISKS, datype = Disk
VxVM vxconfigd ERROR V-5-1-0 Segmentation violation
/etc/rcS.d/S25vxvm-sysboot: egettxt: not found
VxVM sysboot NOTICE V-5-2-3388 Halting system...
syncing file systems... done
NOTICE:
==========================================================
with thanks & regards
Arup
Getting the following error in engine_A.log:
2013/05/31 14:16:12 VCS INFO V-16-1-10307 Resource cvmvoldg5 (Owner: unknown, Group: lic_DG) is offline on dwlemm2b (Not initiated by VCS)
2013/05/31 14:16:14 VCS INFO V-16-6-15004 (dwlemm2b) hatrigger:Failed to send trigger for resfault; script doesn't exist
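The V-16-6-15004 line simply means no resfault trigger script is installed under /opt/VRTSvcs/bin/triggers, which is harmless unless you want custom fault handling. If one is wanted, a minimal hedged sketch follows; the argument order matches how hatrigger invokes resfault (system, resource, previous state), while the log path and message are placeholders:

```shell
#!/bin/sh
# Hypothetical minimal resfault trigger -- install as
# /opt/VRTSvcs/bin/triggers/resfault (mode 0755).
# Invoked as: resfault <system> <faulted resource> <previous state>
sys=$1; res=$2; prev=$3

# Log destination is a placeholder; RESFAULT_LOG is only for testing.
log=${RESFAULT_LOG:-./resfault_trigger.log}
echo "$(date): resource ${res:-unknown} faulted on ${sys:-unknown} (was ${prev:-unknown})" >> "$log"
```

Replace the logging line with paging or ticketing logic as needed; a real deployment would write under /var/VRTSvcs/log.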
Hi All,
Can you please help me with a guide or a step-by-step document for installing SQL Server 2005 and 2008 in a Veritas cluster environment? Since the start of my career I have worked only with MCS.
I am very new to this environment and want to know the install method and cluster configuration.
thanks,
Shiv
Hello All,
Maybe this already has a known answer, but I tried to find out how the VCS 6.0.1 HostMonitor calculates CPU usage on Linux hosts and didn't find one.
Can somebody answer this question?
I would like to know where it collects the CPU data from, and whether it is an average CPU usage. (It looks like the collected information is wrong.)
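I don't know HostMonitor's internals, but on Linux the usual source for such figures is /proc/stat, where utilisation is necessarily an average over a sampling interval. A hedged sketch of that style of calculation (an assumption about the mechanism, not the agent's actual code):

```shell
# Sample the aggregate "cpu" counters twice and compute the busy
# percentage over the interval (jiffies: user+nice+system vs. total).
read_cpu() { awk '/^cpu /{print $2+$3+$4, $2+$3+$4+$5}' /proc/stat; }

set -- $(read_cpu); busy1=$1; total1=$2
sleep 1
set -- $(read_cpu); busy2=$1; total2=$2

echo "average CPU usage over 1s: $(( 100 * (busy2 - busy1) / (total2 - total1) ))%"
```

Note that the `cpu` line in /proc/stat aggregates all CPUs, so this is a whole-host average; per-process tools such as top can instead report up to N×100% on an N-CPU host, which is one common source of "wrong-looking" numbers.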
Thank you very much.
Iv4n
Hi
Now my service group is online on node 1; its type is failover, and I mounted the resources inside the group.
So if node 1 goes down for any reason, the resources should go to node 2. Is this high availability?
Regards
Hi,
I have a customer who has two VCS clusters running on RHEL 5.6 servers. These clusters are further protected by site failover using GCO (Global Cluster Option). All was working fine since installation, with remote cluster operations showing up on the local cluster, etc. But then this error started to appear in the wac_A.log file:
VCS WARNING V-16-1-10543 IpmServer::open Cannot create socket errno = 97
Since then the cluster cannot see the remote cluster's state, but it can ping it, as seen in the hastatus output below:
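For reference, errno 97 on Linux is EAFNOSUPPORT ("Address family not supported by protocol"), which often indicates an IPv4/IPv6 mismatch, for instance the WAC trying to open an IPv6 socket on a host where IPv6 has been disabled. The number can be decoded like this:

```shell
# Translate a Linux errno number into its symbolic name and message.
python3 -c 'import errno, os; print(errno.errorcode[97], "-", os.strerror(97))'
```

Comparing the cluster address configured for GCO against the address families actually enabled on the host would be a reasonable next diagnostic step.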
site-ab04# hastatus -sum
-- SYSTEM STATE
-- System State Frozen
A site-ab04 RUNNING 0
-- GROUP STATE
-- Group System Probed AutoDisabled State
B ClusterService site-ab04 Y N ONLINE
B SG_commonsg site-ab04 Y N ONLINE
B SG_site-b04g3 site-ab04 Y N OFFLINE
B SG_site-b04g4 site-ab04 Y N OFFLINE
B SG_site-a04g0 site-ab04 Y N OFFLINE
B SG_site-a04g1 site-ab04 Y N OFFLINE
B SG_site-a04g2 site-ab04 Y N OFFLINE
B vxfen site-ab04 Y N ONLINE
-- WAN HEARTBEAT STATE
-- Heartbeat To State
M Icmp site-b04c ALIVE
-- REMOTE CLUSTER STATE
-- Cluster State
N site-b04c INIT
Does any one have any ideas? Networking all seems to be in order.
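For what it's worth, errno 97 on Linux is EAFNOSUPPORT ("Address family not supported by protocol"); when socket creation fails with it, one common cause is a process trying to open an IPv6 socket on a host where IPv6 has been disabled. I can't confirm that is what wac is doing here, but the errno mapping itself is easy to check:

```python
import errno
import os

# errno 97 on Linux is EAFNOSUPPORT.
print(errno.errorcode[97])  # symbolic name
print(os.strerror(97))      # human-readable description
```

If it prints EAFNOSUPPORT, checking whether IPv6 was recently disabled on these hosts would be a reasonable next step.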
Thanks,
Rich
Environment
AIX= 6.1 TL8 SP2
HA/VCS = 6.0.1
Cluster Nodes = 2
We just upgraded from VCS 5.1 to 6.0.1 and receive the following error when we attempt to test failover (in either direction):
12:11:15 V-16-10011-350 (clustertest0) Application:lab_app:online:Execution failed with return value [110]
This is followed by:
12:11:15 V-16-10011-260 (clustertest0) Application:lab_app:online:Execution of start Program (/etc/clu_app.x) returned (1)
12:11:20 V-16-1-10298 Resource lab_app (Owner Unspecified, Group: LABSG) is online on clustertest0 (VCS initiated)
12:11:20 V-16-1-10447 Group LABSG is online on clustertest0
The application comes up and is running post fail over, but we are wondering what is causing the error so we can remedy this issue.
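Since the "Execution of start Program ... returned (1)" message is just VCS reporting the start program's exit status, one way to narrow this down is to wrap the start program and log what it actually returns. A hedged sketch (here /bin/true stands in for the real start program, /etc/clu_app.x in your output; the wrapper itself is hypothetical):

```shell
#!/bin/sh
# Sketch of a logging wrapper for a VCS Application StartProgram.
# Replace /bin/true with the real start program to capture its exit status.
run_start() {
    /bin/true "$@"
    rc=$?
    echo "start program exited with $rc"
    return $rc
}

run_start
```

If the real program exits nonzero even though the application comes up (e.g. it backgrounds itself and returns before completion), that would explain the warning without an actual failure.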
Hello
I have installed SFHA 6.0 for Windows 2008 with the keyless option. However, after I applied the permanent key, the system still shows the error in the event viewer log.
I searched Google and found that the keyless option should be disabled.
But the vxkeyless command seems to be missing on Windows Server.
Thank you.
Hello,
I'm working on a VCS 6.0.1 cluster under Windows 2008 R2 with SQL 2008 R2. I cannot use the SQL wizard to configure service groups for SQL 2008, because I have some issues with a shared disk (a call to Symantec support is in progress).
So I am trying to configure the SG manually, but I have an issue with the RegRep resource. I cannot find solid information on configuring this resource. Currently my main.cf contains this:
RegRep SQLInst1_SQL_RegRep (
MountResName = SQLInst1_RegRep_Mount
Keys = { "HKLM\\Software\\VERITAS\\" = "",
"HKLM\\Software\\Microsoft\\Microsoft SQL Server" = "",
"HKLM\\Software\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQLINST1" = "" }
)
But I think it's not sufficient, because I have many problems with the failover of the service groups.
Does someone have a cluster configured with the wizard? Can anyone share the complete definition of a service group for SQL 2008? Thank you.
Note: when you use the wizard, you obtain this:
RegRep SG_SQL03-RegRep-MSSQL (
MountResName = SG_SQL03-Mount
ReplicationDirectory = "\\RegRep\\SG_SQL03-RegRep-MSSQL"
Keys = {
"HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\MSSQLServer" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_MSSQLServer.reg",
"HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\PROVIDERS" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_PROVIDERS.reg",
"HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\Replication" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_Replication.reg",
"HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\SQLServerAgent" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_SQLServerAgent.reg",
"HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\SQLServerSCP" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_SQLServerSCP.reg" }
ExcludeKeys = {
"HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\MSSQLServer\\CurrentVersion" }
)
Hi all,
I use three shared disks over iSCSI (the target uses LIO) as the I/O fencing disks in VCS 6.0 on SLES 11 SP1. After configuration, the kernel panics when vxfen starts. The log is here:
[   83.640266] BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
[   83.641248] IP: [<ffffffff81396113>] down_read+0x3/0x10
[   83.641933] PGD 37b62067 PUD 3c7b2067 PMD 0
[   83.642512] Oops: 0002 [#1] SMP
[   83.643016] last sysfs file: /sys/devices/platform/host2/iscsi_host/host2/initiatorname
[   83.644009] CPU 0
[   83.644054] Modules linked in: vxfen(PN) dmpalua(PN) vxspec(PN) vxio(PN) vxdmp(PN) snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc gab(PN) ipv6 crc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi af_packet llt(PN) amf(PN) microcode fuse loop fdd(PN) exportfs vxportal(PN) vxfs(PN) dm_mod virtio_blk virtio_balloon virtio_net sg rtc_cmos rtc_core rtc_lib tpm_tis button tpm tpm_bios floppy i2c_piix4 virtio_pci virtio_ring pcspkr i2c_core virtio uhci_hcd ehci_hcd sd_mod crc_t10dif usbcore edd ext3 mbcache jbd fan processor ide_pci_generic piix ide_core ata_generic ata_piix libata scsi_mod thermal thermal_sys hwmon
[   83.644054] Supported: Yes, External
[   83.644054] Pid: 4670, comm: vxfen Tainted: P 2.6.32.12-0.7-default #1 Bochs
[   83.644054] RIP: 0010:[<ffffffff81396113>]  [<ffffffff81396113>] down_read+0x3/0x10
[   83.644054] RSP: 0018:ffff88002e831638  EFLAGS: 00010286
[   83.644054] RAX: 0000000000000060 RBX: 0000000000000000 RCX: ffff88003c4e2480
[   83.644054] RDX: 0000000000000001 RSI: 0000000000002000 RDI: 0000000000000060
[   83.644054] RBP: ffff88002c9aa000 R08: 0000000000000000 R09: ffff88003c4e2480
[   83.644054] R10: ffff88003d6da140 R11: 00000000000000d0 R12: 0000000000000060
[   83.644054] R13: ffff88002c9a8000 R14: 0000000000000000 R15: ffff88002c9a8001
[   83.644054] FS:  0000000000000000(0000) GS:ffff880006200000(0000) knlGS:0000000000000000
[   83.644054] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[   83.644054] CR2: 0000000000000060 CR3: 000000003c72a000 CR4: 00000000000006f0
[   83.644054] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   83.644054] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   83.644054] Process vxfen (pid: 4670, threadinfo ffff88002e830000, task ffff88002c8e8140)
[   83.644054] Stack:
[   83.644054]  ffffffff810325ab 0000000000000000 0000001000000000 ffff88003ca9d690
[   83.644054] <0> ffff88003c4e2480 000000013d0181c0 0000000000000002 0000000000000002
[   83.644054] <0> 0000000000000002 0000000000001010 0000000000000000 0000000000000000
[   83.644054] Call Trace:
[   83.644054]  [<ffffffff810325ab>] get_user_pages_fast+0x11b/0x1a0
[   83.644054]  [<ffffffff81127c27>] __bio_map_user_iov+0x167/0x2a0
[   83.644054]  [<ffffffff81127d69>] bio_map_user_iov+0x9/0x30
[   83.644054]  [<ffffffff81127dac>] bio_map_user+0x1c/0x30
[   83.644054]  [<ffffffff811bab91>] __blk_rq_map_user+0x111/0x140
[   83.644054]  [<ffffffff811bacc5>] blk_rq_map_user+0x105/0x190
[   83.644054]  [<ffffffff811bebe7>] sg_io+0x3c7/0x3e0
[   83.644054]  [<ffffffff811bf1ec>] scsi_cmd_ioctl+0x2ac/0x470
[   83.644054]  [<ffffffffa01b75b1>] sd_ioctl+0xa1/0x120 [sd_mod]
[   83.644054]  [<ffffffffa0ccaa93>] vxfen_ioctl_by_bdev+0xc3/0xd0 [vxfen]
[   83.644054]  [<ffffffffa0ccb6ac>] vxfen_ioc_kernel_scsi_ioctl+0xec/0x3c0 [vxfen]
[   83.644054]  [<ffffffffa0ccbe1f>] vxfen_lnx_pgr_in+0xff/0x380 [vxfen]
[   83.644054]  [<ffffffffa0ccc11a>] vxfen_plat_pgr_in+0x7a/0x1c0 [vxfen]
[   83.644054]  [<ffffffffa0cd26c3>] vxfen_readkeys+0xa3/0x380 [vxfen]
[   83.644054]  [<ffffffffa0cd3514>] vxfen_membreg+0x84/0xae0 [vxfen]
[   83.644054]  [<ffffffffa0cce6d6>] vxfen_preexist_split_brain_scsi3+0x96/0x2d0 [vxfen]
[   83.644054]  [<ffffffffa0ccf96d>] vxfen_reg_coord_disk+0x7d/0x660 [vxfen]
[   83.644054]  [<ffffffffa0ca5e0b>] vxfen_reg_coord_pt+0xfb/0x250 [vxfen]
[   83.644054]  [<ffffffffa0cb849c>] vxfen_handle_local_config_done+0x14c/0x8d0 [vxfen]
[   83.644054]  [<ffffffffa0cbad57>] vxfen_vrfsm_cback+0xad7/0x17b0 [vxfen]
[   83.644054]  [<ffffffffa0cd5b20>] vrfsm_step+0x1b0/0x3b0 [vxfen]
[   83.644054]  [<ffffffffa0cd7e1c>] vrfsm_recv_thread+0x32c/0x970 [vxfen]
[   83.644054]  [<ffffffffa0cd85b4>] vxplat_lx_thread_base+0xa4/0x100 [vxfen]
[   83.644054]  [<ffffffff81003fba>] child_rip+0xa/0x20
[   83.644054] Code: 48 85 f6 74 0f 48 89 e7 e8 5b 09 cd ff 85 c0 48 63 d0 7e 07 48 c7 c2 fc fd ff ff 48 83 c4 78 48 89 d0 5b 5d c3 00 00 00 48 89 f8 <3e> 48 ff 00 79 05 e8 52 fa e4 ff c3 90 48 89 f8 48 ba 01 00 00
[   83.644054] RIP  [<ffffffff81396113>] down_read+0x3/0x10
[   83.644054]  RSP <ffff88002e831638>
[   83.644054] CR2: 0000000000000060

vcs1:~ # vxdisk list
DEVICE       TYPE          DISK   GROUP    STATUS
aluadisk0_1  auto:cdsdisk  -      -        online
aluadisk0_2  auto:cdsdisk  -      -        online
aluadisk0_3  auto:cdsdisk  -      -        online
sda          auto:none     -      -        online invalid
vda          simple        vda    data_dg  online
vdb          simple        vdb    data_dg  online

vcs1:~ # cat /proc/partitions
major minor  #blocks  name
   8     0    8388608 sda
   8     1    7333641 sda1
   8     2    1052257 sda2
 253     0   10485760 vda
 253    16    4194304 vdb
   8    16    1048576 sdb
   8    19    1046528 sdb3
   8    24    1046528 sdb8
   8    32    1048576 sdc
   8    35    1046528 sdc3
   8    40    1046528 sdc8
   8    48    1048576 sdd
   8    51    1046528 sdd3
   8    56    1046528 sdd8
 201     0    8388608 VxDMP1
 201     1    7333641 VxDMP1p1
 201     2    1052257 VxDMP1p2
 201    16    1048576 VxDMP2
 201    19    1046528 VxDMP2p3
 201    24    1046528 VxDMP2p8
 201    32    1048576 VxDMP3
 201    35    1046528 VxDMP3p3
 201    40    1046528 VxDMP3p8
 201    48    1048576 VxDMP4
 201    51    1046528 VxDMP4p3
 201    56    1046528 VxDMP4p8
 199  6000    8388608 VxVM6000
 199  6001     153600 VxVM6001

LIO target server messages:
[60529.780169] br0: port 3(vnet2) entered forwarding state
[60675.372132] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.372217] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.373337] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.373426] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.374304] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.374374] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
All tests have passed:
vcs1:~ # vxfentsthdw -m
Veritas vxfentsthdw version 6.0.000.000-GA Linux

The utility vxfentsthdw works on the two nodes of the cluster.
The utility verifies that the shared storage one intends to use is
configured to support I/O fencing. It issues a series of vxfenadm
commands to setup SCSI-3 registrations on the disk, verifies the
registrations on the disk, and removes the registrations from the disk.

******** WARNING!!!!!!!! ********
THIS UTILITY WILL DESTROY THE DATA ON THE DISK!!

Do you still want to continue : [y/n] (default: n) y
The logfile generated for vxfentsthdw is /var/VRTSvcs/log/vxfen/vxfentsthdw.log.9431
Enter the first node of the cluster:
vcs1
Enter the second node of the cluster:
vcs2
Enter the disk name to be checked for SCSI-3 PGR on node vcs1 in the format:
for dmp: /dev/vx/rdmp/sdx
for raw: /dev/sdx
Make sure it is the same disk as seen by nodes vcs1 and vcs2
/dev/sdb
Enter the disk name to be checked for SCSI-3 PGR on node vcs2 in the format:
for dmp: /dev/vx/rdmp/sdx
for raw: /dev/sdx
Make sure it is the same disk as seen by nodes vcs1 and vcs2
/dev/sdb
***************************************************************************
Testing vcs1 /dev/sdb vcs2 /dev/sdb
Evaluate the disk before testing ........................ No Pre-existing keys
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs2 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs2 .................... Passed
Unregister keys on disk /dev/sdb from node vcs1 ........................ Passed
Verify registrations for disk /dev/sdb on node vcs2 .................... Passed
Unregister keys on disk /dev/sdb from node vcs2 ........................ Passed
Check to verify there are no keys from node vcs1 ....................... Passed
Check to verify there are no keys from node vcs2 ....................... Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Read from disk /dev/sdb on node vcs1 ................................... Passed
Write to disk /dev/sdb from node vcs1 .................................. Passed
Read from disk /dev/sdb on node vcs2 ................................... Passed
Write to disk /dev/sdb from node vcs2 .................................. Passed
Reserve disk /dev/sdb from node vcs1 ................................... Passed
Verify reservation for disk /dev/sdb on node vcs1 ...................... Passed
Read from disk /dev/sdb on node vcs1 ................................... Passed
Read from disk /dev/sdb on node vcs2 ................................... Passed
Write to disk /dev/sdb from node vcs1 .................................. Passed
Expect no writes for disk /dev/sdb on node vcs2 ........................ Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs2 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Verify registrations for disk /dev/sdb on node vcs2 .................... Passed
Write to disk /dev/sdb from node vcs1 .................................. Passed
Write to disk /dev/sdb from node vcs2 .................................. Passed
Preempt and abort key KeyA using key KeyB on node vcs2 ................. Passed
Test to see if I/O on node vcs1 terminated ............................. Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Preempt key KeyC using key KeyB on node vcs2 ........................... Passed
Test to see if I/O on node vcs1 terminated ............................. Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Verify registrations for disk /dev/sdb on node vcs2 .................... Passed
Verify reservation for disk /dev/sdb on node vcs1 ...................... Passed
Verify reservation for disk /dev/sdb on node vcs2 ...................... Passed
Remove key KeyB on node vcs2 ........................................... Passed
Check to verify there are no keys from node vcs1 ....................... Passed
Check to verify there are no keys from node vcs2 ....................... Passed
Check to verify there are no reservations on disk /dev/sdb from node vcs1 Passed
Check to verify there are no reservations on disk /dev/sdb from node vcs2 Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Clear PGR on node vcs1 ................................................. Passed
Check to verify there are no keys from node vcs1 ....................... Passed
ALL tests on the disk /dev/sdb have PASSED.
The disk is now ready to be configured for I/O Fencing on node vcs1.
ALL tests on the disk /dev/sdb have PASSED.
The disk is now ready to be configured for I/O Fencing on node vcs2.
Removing test keys and temporary files, if any...