Hello guys,
I need to find the root cause of a service failure in our Veritas cluster (VCS).
The service groups did not fail over to the other node.
Below are the logs. As far as I can see, it all started with a NIC failure and the IPMultiNICB resources going faulty.
Any help would be appreciated.
Engine logs
====================================
2013/05/17 12:41:41 VCS INFO V-16-2-13075 (pk-ercoss1) Resource(ossfs_ip) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(1).
2013/05/17 12:41:42 VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss1
2013/05/17 12:41:42 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pub_p (Owner: Unspecified, Group: PubLan) on System pk-ercoss1
2013/05/17 12:41:42 VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss2
2013/05/17 12:41:42 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pub_p (Owner: Unspecified, Group: PubLan) on System pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-6-0 (pk-ercoss1) resfault:(resfault) Invoked with arg0=pk-ercoss1, arg1=pub_mnic, arg2=ONLINE
2013/05/17 12:41:43 VCS INFO V-16-6-0 (pk-ercoss2) resfault:(resfault) Invoked with arg0=pk-ercoss2, arg1=pub_mnic, arg2=ONLINE
2013/05/17 12:41:43 VCS INFO V-16-0 (pk-ercoss1) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss1 ,arg2=pub_mnic
2013/05/17 12:41:43 VCS INFO V-16-0 (pk-ercoss2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss2 ,arg2=pub_mnic
2013/05/17 12:41:43 VCS INFO V-16-6-15002 (pk-ercoss1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss1 pub_mnic ONLINE successfully
2013/05/17 12:41:43 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss2 pub_mnic ONLINE successfully
2013/05/17 12:41:43 VCS INFO V-16-1-10305 Resource pub_p (Owner: Unspecified, Group: PubLan) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:41:43 VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss2
2013/05/17 12:41:43 VCS NOTICE V-16-1-10446 Group PubLan is offline on system pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss2 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss1 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss1
2013/05/17 12:41:43 VCS NOTICE V-16-1-10235 Restart is set for group PubLan. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
2013/05/17 12:41:43 VCS INFO V-16-1-10305 Resource pub_p (Owner: Unspecified, Group: PubLan) is offline on pk-ercoss1 (VCS initiated)
2013/05/17 12:41:43 VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss1
2013/05/17 12:41:43 VCS NOTICE V-16-1-10446 Group PubLan is offline on system pk-ercoss1
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss2 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss1 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss1
2013/05/17 12:41:43 VCS NOTICE V-16-1-10235 Restart is set for group PubLan. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline:(postoffline) Invoked with arg0=pk-ercoss2, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline:Executing /ericsson/core/cluster/scripts/postoffline.sh with arg0=pk-ercoss2, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline:(postoffline) Invoked with arg0=pk-ercoss1, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline:Executing /ericsson/core/cluster/scripts/postoffline.sh with arg0=pk-ercoss1, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline.sh:PubLan:Nothing done
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline:Completed execution of /ericsson/core/cluster/scripts/postoffline.sh for group PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline.sh:PubLan:Nothing done
2013/05/17 12:41:44 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/postoffline pk-ercoss2 PubLan successfully
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline:Completed execution of /ericsson/core/cluster/scripts/postoffline.sh for group PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-15002 (pk-ercoss1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/postoffline pk-ercoss1 PubLan successfully
2013/05/17 12:41:49 VCS INFO V-16-2-13075 (pk-ercoss1) Resource(snmp_ip) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(1).
2013/05/17 12:41:50 VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys pk-ercoss2
2013/05/17 12:41:50 VCS NOTICE V-16-1-10300 Initiating Offline of Resource stop_sybase (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:41:50 VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys pk-ercoss2
2013/05/17 12:41:50 VCS INFO V-16-6-0 (pk-ercoss2) resfault:(resfault) Invoked with arg0=pk-ercoss2, arg1=syb1_p1, arg2=ONLINE
2013/05/17 12:41:50 VCS INFO V-16-6-0 (pk-ercoss2) resfault:(resfault) Invoked with arg0=pk-ercoss2, arg1=ossfs_p1, arg2=ONLINE
2013/05/17 12:41:50 VCS INFO V-16-10001-88 (pk-ercoss2) Application:stop_sybase:offline:Executed [/ericsson/core/cluster/scripts/stop_sybase.sh stop] successfully.
2013/05/17 12:41:50 VCS INFO V-16-0 (pk-ercoss2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss2 ,arg2=syb1_p1
2013/05/17 12:41:50 VCS INFO V-16-0 (pk-ercoss2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss2 ,arg2=ossfs_p1
2013/05/17 12:41:50 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss2 syb1_p1 ONLINE successfully
2013/05/17 12:41:50 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss2 ossfs_p1 ONLINE successfully
2013/05/17 12:41:53 VCS INFO V-16-1-10305 Resource stop_sybase (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:41:53 VCS NOTICE V-16-1-10300 Initiating Offline of Resource masterdataservice_BACKUP (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:00 VCS NOTICE V-16-20018-26 (pk-ercoss2) SybaseBk:masterdataservice_BACKUP:offline:Sybase Backup service masterdataservice_BACKUP has been stopped
2013/05/17 12:42:00 VCS INFO V-16-2-13716 (pk-ercoss2) Resource(masterdataservice_BACKUP): Output of the completed operation (offline)
==============================================
Password:
Backup Server: 3.48.1.1: The Backup Server will go down immediately.
Terminating sessions.
==============================================
2013/05/17 12:42:00 VCS WARNING V-16-20018-301 (pk-ercoss2) SybaseBk:masterdataservice_BACKUP:monitor:Open for backupserver failed, setting cookie to NULL
2013/05/17 12:42:00 VCS INFO V-16-1-10305 Resource masterdataservice_BACKUP (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:00 VCS NOTICE V-16-1-10300 Initiating Offline of Resource masterdataservice (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:02 VCS NOTICE V-16-20018-18 (pk-ercoss2) Sybase:masterdataservice:offline:Sybase service masterdataservice has been stopped
2013/05/17 12:42:03 VCS INFO V-16-2-13716 (pk-ercoss2) Resource(masterdataservice): Output of the completed operation (offline)
==============================================
Password:
Server SHUTDOWN by request.
ASE is terminating this process.
CT-LIBRARY error:
ct_results(): network packet layer: internal net library error: Net-Library operation terminated due to disconnect
==============================================
2013/05/17 12:42:03 VCS WARNING V-16-20018-301 (pk-ercoss2) Sybase:masterdataservice:monitor:Open for dataserver failed, setting cookie to NULL
2013/05/17 12:42:03 VCS INFO V-16-1-10305 Resource masterdataservice (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syb1_ip (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS INFO V-16-1-10305 Resource syb1_ip (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sybmaster_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syblog_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sybdata_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pmsyblog_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pmsybdata_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource fmsyblog_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource fmsybdata_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource dbdumps_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syb1bak_ip (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource sybmaster_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource syblog_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource sybdata_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource pmsyblog_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource pmsybdata_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
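To make the fault sequence in the engine log easier to follow, here is a minimal Python sketch that pulls out only the "Resource ... is FAULTED" events (message ID V-16-1-10303). The regular expression and the names `FAULT_RE`, `SAMPLE` and `faulted_events` are my own assumptions based on the line format shown above; this is not a VCS tool.

```python
import re

# Pattern for "resource FAULTED" lines (V-16-1-10303).
# Inferred from the log lines in this post, so treat it as an assumption.
FAULT_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) VCS ERROR V-16-1-10303 "
    r"Resource (?P<res>\S+) \(Owner: [^,]*, Group: (?P<grp>[^)]+)\) "
    r"is FAULTED \((?P<why>[^)]*)\) on sys (?P<sys>\S+)"
)

# A few lines copied from the engine log above (one of them is not a fault).
SAMPLE = [
    "2013/05/17 12:41:42 VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss1",
    "2013/05/17 12:41:42 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pub_p (Owner: Unspecified, Group: PubLan) on System pk-ercoss1",
    "2013/05/17 12:41:42 VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss2",
    "2013/05/17 12:41:50 VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys pk-ercoss2",
]


def faulted_events(lines):
    """Return (timestamp, system, group, resource, reason) tuples in log order."""
    events = []
    for line in lines:
        m = FAULT_RE.match(line.strip())
        if m:
            events.append((m["ts"], m["sys"], m["grp"], m["res"], m["why"]))
    return events


if __name__ == "__main__":
    for event in faulted_events(SAMPLE):
        print(*event)
```

Run against the full engine_A.log, this should list the faults per node in order, which makes it easier to see that pub_mnic faulted on both systems at the same second before the proxy resources (syb1_p1, ossfs_p1) followed.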
===========================================================================================
messages file
============================================================================================
May 17 12:41:37 pk-ercoss1 in.mpathd[6024]: [ID 594170 daemon.error] NIC failure detected on oce9 of group pub_mnic
May 17 12:41:37 pk-ercoss1 in.mpathd[6024]: [ID 832587 daemon.error] Successfully failed over from NIC oce9 to NIC oce0
May 17 12:41:38 pk-ercoss1 in.mpathd[6024]: [ID 168056 daemon.error] All Interfaces in group pub_m
have failed
May 17 12:41:42 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss1
May 17 12:41:42 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss2
May 17 12:41:43 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss2
May 17 12:41:43 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss1
May 17 12:41:50 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys pk-ercoss2
May 17 12:41:50 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys pk-ercoss2
May 17 12:41:56 pk-ercoss1 in.mpathd[6024]: [ID 299542 daemon.error] NIC repair detected on oce0 of group pub_mnic
May 17 12:41:56 pk-ercoss1 in.mpathd[6024]: [ID 620804 daemon.error] Successfully failed back to NIC oce0
May 17 12:41:56 pk-ercoss1 in.mpathd[6024]: [ID 237757 daemon.error] At least 1 interface (oce0) of group pub_mnic has repaired
May 17 12:41:57 pk-ercoss1 in.mpathd[6024]: [ID 299542 daemon.error] NIC repair detected on oce9 of group pub_mnic
May 17 12:41:57 pk-ercoss1 in.mpathd[6024]: [ID 620804 daemon.error] Successfully failed back to NIC oce9
May 17 12:42:10 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group Sybase1 is faulted on system pk-ercoss2
May 17 12:42:11 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(5) Agent is calling clean for resource(ossfs_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:11 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (pk-ercoss1) Agent is calling clean for resource(ossfs_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:11 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(5) Resource(ossfs_ip) - clean completed successfully.
May 17 12:42:12 pk-ercoss1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,3a40@1c/pci103c,3245@0/sd@1,0 (sd9):
May 17 12:42:12 pk-ercoss1 drive offline
May 17 12:42:12 pk-ercoss1 vxdmp: [ID 480808 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 disabled path 30/0x240 belonging to the dmpnode 264/0x40 due to open failure
May 17 12:42:12 pk-ercoss1 vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 264/0x40
May 17 12:42:12 pk-ercoss1 vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x108/0x42
May 17 12:42:12 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
May 17 12:42:12 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 sybasedg: dg import with I/O fence enabled
May 17 12:42:15 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-10001-5004 IPMultiNICB:syb1_ip:online:Can not online. No interfaces available
May 17 12:42:15 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-10001-5004 (pk-ercoss1) IPMultiNICB:syb1_ip:online:Can not online. No interfaces available
May 17 12:42:19 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(5) Agent is calling clean for resource(snmp_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:19 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (pk-ercoss1) Agent is calling clean for resource(snmp_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:19 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(5) Resource(snmp_ip) - clean completed successfully.
May 17 12:42:21 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys pk-ercoss1
May 17 12:42:21 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys pk-ercoss1
May 17 12:42:50 pk-ercoss1 svc.startd[9]: [ID 652011 daemon.warning] svc:/ericsson/eric_3pp/activemq_oss_loggingbroker:default: Method "/ericsson/activemq/bin/activeMQ.sh stopActiveMqLogger" failed with exit status 1.
May 17 12:42:50 pk-ercoss1 svc.startd[9]: [ID 652011 daemon.warning] svc:/ericsson/eric_3pp/activemq:default: Method "/ericsson/activemq/bin/activeMQ.sh stopActiveMq" failed with exit status 1.
May 17 12:43:16 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13066 Thread(4) Agent is calling clean for resource(syb1_ip) because the resource is not up even after online completed.
May 17 12:43:16 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13066 (pk-ercoss1) Agent is calling clean for resource(syb1_ip) because the resource is not up even after online completed.
May 17 12:43:16 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(4) Resource(syb1_ip) - clean completed successfully.
May 17 12:43:16 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13072 Thread(4) Resource(syb1_ip): Agent is retrying online (attempt number 1 of 1).
May 17 12:43:28 pk-ercoss1 vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x108/0x42
May 17 12:43:28 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
May 17 12:43:28 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 sybasedg: dg import with I/O fence enabled
May 17 12:43:30 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method or service exit timed out. Killing contract 322855.
May 17 12:43:30 pk-ercoss1 svc.startd[9]: [ID 636263 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method "/etc/init.d/TBS stop" failed due to signal KILL.
May 17 12:44:31 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method or service exit timed out. Killing contract 322861.
May 17 12:44:31 pk-ercoss1 svc.startd[9]: [ID 636263 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method "/etc/init.d/TBS stop" failed due to signal KILL.
May 17 12:44:49 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(13) Agent is calling clean for resource(tomcat) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:44:49 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (pk-ercoss1) Agent is calling clean for resource(tomcat) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:44:49 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(13) Resource(tomcat) - clean completed successfully.
May 17 12:44:49 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13073 Thread(13) Resource(tomcat) became OFFLINE unexpectedly on its own. Agent is restarting (attempt number 1 of 2) the resource.
May 17 12:44:49 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13073 (pk-ercoss1) Resource(tomcat) became OFFLINE unexpectedly on its own. Agent is restarting (attempt number 1 of 2) the resource.
May 17 12:45:32 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method or service exit timed out. Killing contract 322865.
May 17 12:45:32 pk-ercoss1 svc.startd[9]: [ID 636263 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method "/etc/init.d/TBS stop" failed due to signal KILL.
May 17 12:45:32 pk-ercoss1 svc.startd[9]: [ID 748625 daemon.error] ericsson/eric_ep/TBS:default failed: transitioned to maintenance (see 'svcs -xv' for details)
May 17 12:46:18 pk-ercoss1 su: [ID 810491 auth.crit] 'su sybase' failed for sybase on /dev/???
May 17 12:47:19 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_3pp/glassfish:default: Method or service exit timed out. Killing contract 540.
May 17 12:47:19 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13011 Thread(14) Resource(glassfish): offline procedure did not complete within the expected time.
May 17 12:47:19 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13063 Thread(14) Agent is calling clean for resource(glassfish) because offline did not complete within the expected time.
May 17 12:47:19 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13063 (pk-ercoss1) Agent is calling clean for resource(glassfish) because offline did not complete within the expected time.
May 17 12:47:20 pk-ercoss1 svc.startd[9]: [ID 748625 daemon.error] ericsson/eric_3pp/glassfish:default failed: transitioned to maintenance (see 'svcs -xv' for details)
May 17 12:47:21 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(14) Resource(glassfish) - clean completed successfully.
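To put a number on how long the public NICs were reported down on pk-ercoss1, here is a rough Python sketch that reads the in.mpathd lines from the messages file and measures the time from the first "NIC failure detected" to the first "NIC repair detected". The regex and the names `MPATHD_RE`, `SAMPLE`, `mpathd_events` and `outage_seconds` are assumptions of mine, and since syslog lines carry no year I pass 2013 in, taken from the engine log timestamps.

```python
import re
from datetime import datetime

# Syslog line layout for in.mpathd, inferred from the messages file above (assumption).
MPATHD_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) in\.mpathd\[\d+\]: "
    r"\[ID \d+ daemon\.\w+\] (?P<msg>.*)$"
)

# in.mpathd lines copied from the messages file above.
SAMPLE = [
    "May 17 12:41:37 pk-ercoss1 in.mpathd[6024]: [ID 594170 daemon.error] NIC failure detected on oce9 of group pub_mnic",
    "May 17 12:41:37 pk-ercoss1 in.mpathd[6024]: [ID 832587 daemon.error] Successfully failed over from NIC oce9 to NIC oce0",
    "May 17 12:41:56 pk-ercoss1 in.mpathd[6024]: [ID 299542 daemon.error] NIC repair detected on oce0 of group pub_mnic",
    "May 17 12:41:57 pk-ercoss1 in.mpathd[6024]: [ID 299542 daemon.error] NIC repair detected on oce9 of group pub_mnic",
]


def mpathd_events(lines, year=2013):
    """Return (datetime, message) pairs for in.mpathd lines; the year is supplied by the caller."""
    events = []
    for line in lines:
        m = MPATHD_RE.match(line.strip())
        if m:
            ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            events.append((ts, m["msg"]))
    return events


def outage_seconds(events):
    """Seconds between the first NIC failure and the first NIC repair after it."""
    down = next(ts for ts, msg in events if msg.startswith("NIC failure detected"))
    up = next(
        ts for ts, msg in events
        if msg.startswith("NIC repair detected") and ts >= down
    )
    return (up - down).total_seconds()


if __name__ == "__main__":
    print(outage_seconds(mpathd_events(SAMPLE)))
```

With the four lines above, the first failure is at 12:41:37 and the first repair at 12:41:56, so the gap is 19 seconds. The pub_mnic fault at 12:41:42 on both nodes falls inside that window.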
======================================================
hastatus output at present
========================================================
-- SYSTEM STATE
-- System        State      Frozen
A  pk-ercoss1    RUNNING    0
A  pk-ercoss2    RUNNING    0
-- GROUP STATE
-- Group      System        Probed    AutoDisabled    State
B  BkupLan    pk-ercoss1    Y         N               ONLINE
B  BkupLan    pk-ercoss2    Y         N               ONLINE
B  DDCMon     pk-ercoss1    Y         N               ONLINE
B  DDCMon     pk-ercoss2    Y         N               PARTIAL
B  Oss        pk-ercoss1    Y         N               ONLINE
B  Oss        pk-ercoss2    Y         N               OFFLINE
B  Ossfs      pk-ercoss1    Y         N               ONLINE
B  Ossfs      pk-ercoss2    Y         N               OFFLINE
B  PrivLan    pk-ercoss1    Y         N               ONLINE
B  PrivLan    pk-ercoss2    Y         N               ONLINE
B  PubLan     pk-ercoss1    Y         N               ONLINE
B  PubLan     pk-ercoss2    Y         N               ONLINE
B  StorLan    pk-ercoss1    Y         N               ONLINE
B  StorLan    pk-ercoss2    Y         N               ONLINE
B  Sybase1    pk-ercoss1    Y         N               OFFLINE
B  Sybase1    pk-ercoss2    Y         N               ONLINE
============================================================================
pk-ercoss1{root} # hagrp -resources PubLan
pub_mnic
pub_p
pk-ercoss1{root} # hares -display pub_mnic
#Resource Attribute System Value
pub_mnic Group global PubLan
pub_mnic Type global MultiNICB
pub_mnic AutoStart global 1
pub_mnic Critical global 1
pub_mnic Enabled global 1
pub_mnic LastOnline global pk-ercoss2
pub_mnic MonitorOnly global 0
pub_mnic ResourceOwner global
pub_mnic TriggerEvent global 0
pub_mnic ArgListValues pk-ercoss1 UseMpathd 1 1 MpathdCommand 1 /usr/lib/inet/in.mpathd ConfigCheck 1 1 MpathdRestart 1 1 Device 4 oce0 0 oce9 1 NetworkHosts 1 10.207.1.254 LinkTestRatio 1 1 IgnoreLinkStatus 1 1 NetworkTimeout 1 100 OnlineTestRepeatCount 1 3 OfflineTestRepeatCount 1 3 NoBroadcast 1 0 DefaultRouter 1 0.0.0.0 Failback 1 0 GroupName 1 "" Protocol 1 IPv4
pub_mnic ArgListValues pk-ercoss2 UseMpathd 1 1 MpathdCommand 1 /usr/lib/inet/in.mpathd ConfigCheck 1 1 MpathdRestart 1 1 Device 4 oce0 0 oce9 1 NetworkHosts 1 10.207.1.254 LinkTestRatio 1 1 IgnoreLinkStatus 1 1 NetworkTimeout 1 100 OnlineTestRepeatCount 1 3 OfflineTestRepeatCount 1 3 NoBroadcast 1 0 DefaultRouter 1 0.0.0.0 Failback 1 0 GroupName 1 "" Protocol 1 IPv4
pub_mnic ConfidenceLevel pk-ercoss1 0
pub_mnic ConfidenceLevel pk-ercoss2 0
pub_mnic ConfidenceMsg pk-ercoss1
pub_mnic ConfidenceMsg pk-ercoss2
pub_mnic Flags pk-ercoss1
pub_mnic Flags pk-ercoss2
pub_mnic IState pk-ercoss1 not waiting
pub_mnic IState pk-ercoss2 not waiting
pub_mnic MonitorMethod pk-ercoss1 Traditional
pub_mnic MonitorMethod pk-ercoss2 Traditional
pub_mnic Probed pk-ercoss1 1
pub_mnic Probed pk-ercoss2 1
pub_mnic Start pk-ercoss1 0
pub_mnic Start pk-ercoss2 0
pub_mnic State pk-ercoss1 ONLINE
pub_mnic State pk-ercoss2 ONLINE
pub_mnic ComputeStats global 0
pub_mnic ConfigCheck global 1
pub_mnic DefaultRouter global 0.0.0.0
pub_mnic Failback global 0
pub_mnic GroupName global
pub_mnic IgnoreLinkStatus global 1
pub_mnic LinkTestRatio global 1
pub_mnic MpathdCommand global /usr/lib/inet/in.mpathd
pub_mnic MpathdRestart global 1
pub_mnic NetworkHosts global 10.207.1.254
pub_mnic NetworkTimeout global 100
pub_mnic NoBroadcast global 0
pub_mnic OfflineTestRepeatCount global 3
pub_mnic OnlineTestRepeatCount global 3
pub_mnic Protocol global IPv4
pub_mnic TriggerResStateChange global 0
pub_mnic UseMpathd global 1
pub_mnic ContainerInfo pk-ercoss1 Type Name Enabled
pub_mnic ContainerInfo pk-ercoss2 Type Name Enabled
pub_mnic Device pk-ercoss1 oce0 0 oce9 1
pub_mnic Device pk-ercoss2 oce0 0 oce9 1
pub_mnic MonitorTimeStats pk-ercoss1 Avg 0 TS
pub_mnic MonitorTimeStats pk-ercoss2 Avg 0 TS
pub_mnic ResourceInfo pk-ercoss1 State Valid Msg TS
pub_mnic ResourceInfo pk-ercoss2 State Valid Msg TS
pk-ercoss1{root} # hares -display pub_p
#Resource Attribute System Value
pub_p Group global PubLan
pub_p Type global Phantom
pub_p AutoStart global 1
pub_p Critical global 1
pub_p Enabled global 1
pub_p LastOnline global pk-ercoss1
pub_p MonitorOnly global 0
pub_p ResourceOwner global
pub_p TriggerEvent global 0
pub_p ArgListValues pk-ercoss1 ""
pub_p ArgListValues pk-ercoss2 ""
pub_p ConfidenceLevel pk-ercoss1 100
pub_p ConfidenceLevel pk-ercoss2 100
pub_p ConfidenceMsg pk-ercoss1
pub_p ConfidenceMsg pk-ercoss2
pub_p Flags pk-ercoss1
pub_p Flags pk-ercoss2
pub_p IState pk-ercoss1 not waiting
pub_p IState pk-ercoss2 not waiting
pub_p MonitorMethod pk-ercoss1 Traditional
pub_p MonitorMethod pk-ercoss2 Traditional
pub_p Probed pk-ercoss1 1
pub_p Probed pk-ercoss2 1
pub_p Start pk-ercoss1 1
pub_p Start pk-ercoss2 1
pub_p State pk-ercoss1 ONLINE
pub_p State pk-ercoss2 ONLINE
pub_p ComputeStats global 0
pub_p TriggerResStateChange global 0
pub_p ContainerInfo pk-ercoss1 Type Name Enabled
pub_p ContainerInfo pk-ercoss2 Type Name Enabled
pub_p MonitorTimeStats pk-ercoss1 Avg 0 TS
pub_p MonitorTimeStats pk-ercoss2 Avg 0 TS
pub_p ResourceInfo pk-ercoss1 State Valid Msg TS
pub_p ResourceInfo pk-ercoss2 State Valid Msg TS