Channel: Symantec Connect - Storage and Clustering - Discussions

How can I add a virtual IP to a cluster service group?

I need a solution

How can I add a virtual IP to a service group in the cluster?
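
For reference, here is a minimal command-line sketch of adding an IP resource to an existing group; the group, resource, NIC, and address names (mySG, mySG_ip, eth0, 10.0.0.100) are placeholders, and on Solaris the device would be something like bge0:

haconf -makerw                             # open the cluster configuration for writing
hares -add mySG_ip IP mySG                 # add an IP-type resource to group mySG (placeholder names)
hares -modify mySG_ip Device eth0          # NIC that will host the virtual IP
hares -modify mySG_ip Address 10.0.0.100   # the virtual IP address itself
hares -modify mySG_ip NetMask 255.255.255.0
hares -modify mySG_ip Enabled 1
haconf -dump -makero                       # save and close the configuration
hares -online mySG_ip -sys node1           # bring the IP up on one node

If the group already has a NIC or MultiNICB resource, you would typically also link the IP resource to it (hares -link mySG_ip mySG_nic) so the address depends on a healthy interface.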

Thanks 

Regards


After a server reboot, "VCS RAC INFO V-10-1-15047 mmpl_reconfig_ioctl: dev_ioctl failed, vxfen May not be configured" shows in messages

I need a solution

Solaris 10, SF RAC 5.1SP1RP2

===========

 

 PKGINST:  VRTSdbac
      NAME:  Veritas Oracle Real Application Cluster Support Package by Symantec
  CATEGORY:  system
      ARCH:  sparc
   VERSION:  5.1
   BASEDIR:  /
    VENDOR:  Symantec Corporation
      DESC:  Veritas Oracle Real Application Cluster Support Package by Symantec
    PSTAMP:  5.1.102.100-5.1SP1RP2P1-2012-03-06_21:43:16
  INSTDATE:  Jul 31 2012 15:02
    STATUS:  completely installed
     FILES:      219 installed pathnames
                  33 shared pathnames
                   5 linked files
                  56 directories
                 135 executables
               10306 blocks used (approx)

 

after reboot:

 

 

Jan23 15:06:38 hostXX llt: [ID 860062 kern.notice] LLT INFO V-14-1-10024 link 1 (ethXXX) node 1 active
Jan23 15:06:42 hostXX gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port o gen   562e0e membership 01
Jan23 15:06:42 hostXX gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port a gen   562e0d membership 01
Jan23 15:06:42 hostXX vcsmm: [ID 357760 kern.notice] VCS RAC INFO V-10-1-15047 mmpl_reconfig_ioctl: dev_ioctl failed, vxfen May not be configured
...
 
 
=======
SF RAC starts OK; I just need to check whether this is an issue that is not fixed in this version.
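
In case it helps anyone hitting the same message, a quick sketch of confirming whether fencing is actually configured (vxfen registers with GAB on port b):

vxfenadm -d          # displays the fencing mode and current fencing membership
cat /etc/vxfenmode   # should show scsi3 (or disabled) as the configured mode
gabconfig -a         # port b membership indicates the vxfen module is up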

Resources state offline

I need a solution

Hi

 

After creating the resource and trying to bring it online, I got the message "Resource has not been probed on system". When I probe the resource with hares -probe <resource> -sys <system>, it reports success, but the resource is still offline. I am working from the VOM GUI.
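
For anyone checking the same thing from the command line, a minimal sequence (resource and system names are placeholders):

hares -probe myres -sys sysA            # ask the agent to monitor the resource once
hares -state myres                      # confirm the state VCS now reports per system
hares -online myres -sys sysA           # then attempt the online
tail -f /var/VRTSvcs/log/engine_A.log   # the engine log records why it stays offline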

 

Regards

 

 

8 processors, CPU usage 97%, GAB panics the system: is this normal?

I need a solution

SFHA 5.1SP1RP3 on AIX.

The customer's system has 8 processors; CPU usage reached 97% and GAB panicked the system. Is this normal?

From a common-sense standpoint, one processor may be busy, but why not use the other processors? Does 800% mean all the CPUs are used up?

 

========

2013/05/17 02:51:09 VCS CRITICAL V-16-1-50086 CPU usage on hostXXX is 97%  <<<<
2013/05/17 03:16:13 VCS INFO V-16-1-10196 Cluster logger started
===
 
 
=========errpt log============
  LABEL:          KERNEL_PANIC
IDENTIFIER:     225E3B63
 
Date/Time:       Mon May 17 03:07:16 2013
...
Node Id:         hostXXX
...
PANIC STRING
GAB: Port h halting system due to client process failure at [14:1019]
 
==========
000818 02:52:04 1c1cfee6  0 ... 2803 0 GAB WARNING V-15-1-20058 Port h process 10879486: heartbeat failed, killing process
 
000819 02:52:04 1c1cfee6  0 ... 2803 0 GAB INFO V-15-1-20059 Port h heartbeat interval 30000 msec. Statistics:
 
000820 02:52:04 1c1cfee6  0 ... 2803 0 GAB INFO V-15-1-20129 Port h: heartbeats in 0 ~ 6000 msec: 615408
 
000821 02:52:04 1c1cfee6  0 ... 2803 0 GAB INFO V-15-1-20129 Port h: heartbeats in 6000 ~ 12000 msec: 0
 
000822 02:52:04 1c1cfee6  0 ... 2803 0 GAB INFO V-15-1-20129 Port h: heartbeats in 12000 ~ 18000 msec: 0
 
000823 02:52:04 1c1cfee6  0 ... 2803 0 GAB INFO V-15-1-20129 Port h: heartbeats in 18000 ~ 24000 msec: 0
 
000824 02:52:04 1c1cfee6  0 ... 2803 0 GAB INFO V-15-1-20129 Port h: heartbeats in 24000 ~ 30000 msec: 0
 
000825 02:52:04 1c1cfee6  0 ... 2803 0 GAB INFO V-15-1-20041 Port h: client process failure: killing process
 
000826 02:52:19 1c1d04c2  0 ... 2803 0 GAB WARNING V-15-1-20035 Port h attempting to kill process due to client process failure
 
000827 02:52:34 1c1d0a9e  0 ... 2803 0 GAB WARNING V-15-1-20035 Port h attempting to kill process due to client process failure
 
000828 02:52:49 1c1d107a  0 ... 2803 0 GAB WARNING V-15-1-20035 Port h attempting to kill process due to client process failure
 
000829 02:53:19 1c1d1c2c  0 ... 2803 0 GAB WARNING V-15-1-20138 Port h isolated due to client process failure
 
000830 02:54:11 1c1d30be  0 ... 2803 0 GAB WARNING V-15-1-20139 Port h has been isolated and marked invalid
==========
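
The panic string shows GAB port h (the had daemon's port) failing its client heartbeat; it is not a CPU-count question as such. A sketch of what to check; VCS_GAB_TIMEOUT is the documented tunable for how long had may be unresponsive before GAB acts, and the vcsenv location is the usual one but should be verified for your release:

gabconfig -a                                   # current GAB port memberships
grep VCS_GAB_TIMEOUT /opt/VRTSvcs/bin/vcsenv   # heartbeat allowance for had, in msec (assumed location)
# the statistics above show a 30000 msec interval; raising the timeout only hides
# the CPU starvation, so find the busy process first (e.g. with topas/nmon on AIX)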

Need to find the root cause of a service failure in a Veritas cluster

I need a solution

Hello guys,

I need to find the root cause of the service failure in our Veritas cluster.

The service groups did not fail over to the other node.

Below are the logs; as far as I can see, it all started with the NICs failing and the IPMultiNICB resource going faulty.

If anyone can help me here, please do.

engine logs

====================================

2013/05/17 12:41:41 VCS INFO V-16-2-13075 (pk-ercoss1) Resource(ossfs_ip) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(1).
2013/05/17 12:41:42 VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss1
2013/05/17 12:41:42 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pub_p (Owner: Unspecified, Group: PubLan) on System pk-ercoss1
2013/05/17 12:41:42 VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss2
2013/05/17 12:41:42 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pub_p (Owner: Unspecified, Group: PubLan) on System pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-6-0 (pk-ercoss1) resfault:(resfault) Invoked with arg0=pk-ercoss1, arg1=pub_mnic, arg2=ONLINE
2013/05/17 12:41:43 VCS INFO V-16-6-0 (pk-ercoss2) resfault:(resfault) Invoked with arg0=pk-ercoss2, arg1=pub_mnic, arg2=ONLINE
2013/05/17 12:41:43 VCS INFO V-16-0 (pk-ercoss1) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss1 ,arg2=pub_mnic
2013/05/17 12:41:43 VCS INFO V-16-0 (pk-ercoss2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss2 ,arg2=pub_mnic
2013/05/17 12:41:43 VCS INFO V-16-6-15002 (pk-ercoss1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss1 pub_mnic ONLINE  successfully
2013/05/17 12:41:43 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss2 pub_mnic ONLINE  successfully
2013/05/17 12:41:43 VCS INFO V-16-1-10305 Resource pub_p (Owner: Unspecified, Group: PubLan) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:41:43 VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss2
2013/05/17 12:41:43 VCS NOTICE V-16-1-10446 Group PubLan is offline on system pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss2 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss1 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss1
2013/05/17 12:41:43 VCS NOTICE V-16-1-10235 Restart is set for group PubLan. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
2013/05/17 12:41:43 VCS INFO V-16-1-10305 Resource pub_p (Owner: Unspecified, Group: PubLan) is offline on pk-ercoss1 (VCS initiated)
2013/05/17 12:41:43 VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss1
2013/05/17 12:41:43 VCS NOTICE V-16-1-10446 Group PubLan is offline on system pk-ercoss1
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss2 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss1 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss1
2013/05/17 12:41:43 VCS NOTICE V-16-1-10235 Restart is set for group PubLan. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline:(postoffline) Invoked with arg0=pk-ercoss2, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline:Executing /ericsson/core/cluster/scripts/postoffline.sh with arg0=pk-ercoss2, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline:(postoffline) Invoked with arg0=pk-ercoss1, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline:Executing /ericsson/core/cluster/scripts/postoffline.sh with arg0=pk-ercoss1, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline.sh:PubLan:Nothing done
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline:Completed execution of /ericsson/core/cluster/scripts/postoffline.sh for group PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline.sh:PubLan:Nothing done
2013/05/17 12:41:44 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/postoffline pk-ercoss2 PubLan   successfully
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline:Completed execution of /ericsson/core/cluster/scripts/postoffline.sh for group PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-15002 (pk-ercoss1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/postoffline pk-ercoss1 PubLan   successfully
2013/05/17 12:41:49 VCS INFO V-16-2-13075 (pk-ercoss1) Resource(snmp_ip) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(1).
2013/05/17 12:41:50 VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys pk-ercoss2
2013/05/17 12:41:50 VCS NOTICE V-16-1-10300 Initiating Offline of Resource stop_sybase (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:41:50 VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys pk-ercoss2
2013/05/17 12:41:50 VCS INFO V-16-6-0 (pk-ercoss2) resfault:(resfault) Invoked with arg0=pk-ercoss2, arg1=syb1_p1, arg2=ONLINE
2013/05/17 12:41:50 VCS INFO V-16-6-0 (pk-ercoss2) resfault:(resfault) Invoked with arg0=pk-ercoss2, arg1=ossfs_p1, arg2=ONLINE
2013/05/17 12:41:50 VCS INFO V-16-10001-88 (pk-ercoss2) Application:stop_sybase:offline:Executed [/ericsson/core/cluster/scripts/stop_sybase.sh stop] successfully.
2013/05/17 12:41:50 VCS INFO V-16-0 (pk-ercoss2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss2 ,arg2=syb1_p1
2013/05/17 12:41:50 VCS INFO V-16-0 (pk-ercoss2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss2 ,arg2=ossfs_p1
2013/05/17 12:41:50 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss2 syb1_p1 ONLINE  successfully
2013/05/17 12:41:50 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss2 ossfs_p1 ONLINE  successfully
2013/05/17 12:41:53 VCS INFO V-16-1-10305 Resource stop_sybase (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:41:53 VCS NOTICE V-16-1-10300 Initiating Offline of Resource masterdataservice_BACKUP (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:00 VCS NOTICE V-16-20018-26 (pk-ercoss2) SybaseBk:masterdataservice_BACKUP:offline:Sybase Backup service masterdataservice_BACKUP has been stopped
2013/05/17 12:42:00 VCS INFO V-16-2-13716 (pk-ercoss2) Resource(masterdataservice_BACKUP): Output of the completed operation (offline)
==============================================
Password:
Backup Server: 3.48.1.1: The Backup Server will go down immediately.
Terminating sessions.
==============================================

2013/05/17 12:42:00 VCS WARNING V-16-20018-301 (pk-ercoss2) SybaseBk:masterdataservice_BACKUP:monitor:Open for backupserver failed, setting cookie to NULL
2013/05/17 12:42:00 VCS INFO V-16-1-10305 Resource masterdataservice_BACKUP (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:00 VCS NOTICE V-16-1-10300 Initiating Offline of Resource masterdataservice (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:02 VCS NOTICE V-16-20018-18 (pk-ercoss2) Sybase:masterdataservice:offline:Sybase service masterdataservice has been stopped
2013/05/17 12:42:03 VCS INFO V-16-2-13716 (pk-ercoss2) Resource(masterdataservice): Output of the completed operation (offline)
==============================================
Password:
Server SHUTDOWN by request.
ASE is terminating this process.
CT-LIBRARY error:
        ct_results(): network packet layer: internal net library error: Net-Library operation terminated due to disconnect
==============================================

2013/05/17 12:42:03 VCS WARNING V-16-20018-301 (pk-ercoss2) Sybase:masterdataservice:monitor:Open for dataserver failed, setting cookie to NULL
2013/05/17 12:42:03 VCS INFO V-16-1-10305 Resource masterdataservice (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syb1_ip (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS INFO V-16-1-10305 Resource syb1_ip (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sybmaster_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syblog_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sybdata_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pmsyblog_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pmsybdata_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource fmsyblog_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource fmsybdata_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource dbdumps_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syb1bak_ip (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource sybmaster_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource syblog_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource sybdata_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource pmsyblog_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource pmsybdata_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)

===========================================================================================

message file

============================================================================================

May 17 12:41:37 pk-ercoss1 in.mpathd[6024]: [ID 594170 daemon.error] NIC failure detected on oce9 of group pub_mnic
May 17 12:41:37 pk-ercoss1 in.mpathd[6024]: [ID 832587 daemon.error] Successfully failed over from NIC oce9 to NIC oce0
May 17 12:41:38 pk-ercoss1 in.mpathd[6024]: [ID 168056 daemon.error] All Interfaces in group pub_mnic have failed
May 17 12:41:42 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss1
May 17 12:41:42 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss2
May 17 12:41:43 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss2
May 17 12:41:43 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss1
May 17 12:41:50 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys pk-ercoss2
May 17 12:41:50 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys pk-ercoss2
May 17 12:41:56 pk-ercoss1 in.mpathd[6024]: [ID 299542 daemon.error] NIC repair detected on oce0 of group pub_mnic
May 17 12:41:56 pk-ercoss1 in.mpathd[6024]: [ID 620804 daemon.error] Successfully failed back to NIC oce0
May 17 12:41:56 pk-ercoss1 in.mpathd[6024]: [ID 237757 daemon.error] At least 1 interface (oce0) of group pub_mnic has repaired
May 17 12:41:57 pk-ercoss1 in.mpathd[6024]: [ID 299542 daemon.error] NIC repair detected on oce9 of group pub_mnic
May 17 12:41:57 pk-ercoss1 in.mpathd[6024]: [ID 620804 daemon.error] Successfully failed back to NIC oce9
May 17 12:42:10 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group Sybase1 is faulted on system pk-ercoss2
May 17 12:42:11 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(5) Agent is calling clean for resource(ossfs_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:11 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (pk-ercoss1) Agent is calling clean for resource(ossfs_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:11 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(5) Resource(ossfs_ip) - clean completed successfully.
May 17 12:42:12 pk-ercoss1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,3a40@1c/pci103c,3245@0/sd@1,0 (sd9):
May 17 12:42:12 pk-ercoss1      drive offline
May 17 12:42:12 pk-ercoss1 vxdmp: [ID 480808 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 disabled path 30/0x240 belonging to the dmpnode 264/0x40 due to open failure
May 17 12:42:12 pk-ercoss1 vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 264/0x40
May 17 12:42:12 pk-ercoss1 vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x108/0x42
May 17 12:42:12 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
May 17 12:42:12 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 sybasedg: dg import with I/O fence enabled
May 17 12:42:15 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-10001-5004 IPMultiNICB:syb1_ip:online:Can not online. No interfaces available
May 17 12:42:15 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-10001-5004 (pk-ercoss1) IPMultiNICB:syb1_ip:online:Can not online. No interfaces available
May 17 12:42:19 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(5) Agent is calling clean for resource(snmp_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:19 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (pk-ercoss1) Agent is calling clean for resource(snmp_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:19 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(5) Resource(snmp_ip) - clean completed successfully.
May 17 12:42:21 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys pk-ercoss1
May 17 12:42:21 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys pk-ercoss1
May 17 12:42:50 pk-ercoss1 svc.startd[9]: [ID 652011 daemon.warning] svc:/ericsson/eric_3pp/activemq_oss_loggingbroker:default: Method "/ericsson/activemq/bin/activeMQ.sh stopActiveMqLogger" failed with exit status 1.
May 17 12:42:50 pk-ercoss1 svc.startd[9]: [ID 652011 daemon.warning] svc:/ericsson/eric_3pp/activemq:default: Method "/ericsson/activemq/bin/activeMQ.sh stopActiveMq" failed with exit status 1.
May 17 12:43:16 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13066 Thread(4) Agent is calling clean for resource(syb1_ip) because the resource is not up even after online completed.
May 17 12:43:16 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13066 (pk-ercoss1) Agent is calling clean for resource(syb1_ip) because the resource is not up even after online completed.
May 17 12:43:16 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(4) Resource(syb1_ip) - clean completed successfully.
May 17 12:43:16 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13072 Thread(4) Resource(syb1_ip): Agent is retrying online (attempt number 1 of 1).
May 17 12:43:28 pk-ercoss1 vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x108/0x42
May 17 12:43:28 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
May 17 12:43:28 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 sybasedg: dg import with I/O fence enabled
May 17 12:43:30 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method or service exit timed out.  Killing contract 322855.
May 17 12:43:30 pk-ercoss1 svc.startd[9]: [ID 636263 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method "/etc/init.d/TBS stop" failed due to signal KILL.
May 17 12:44:31 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method or service exit timed out.  Killing contract 322861.
May 17 12:44:31 pk-ercoss1 svc.startd[9]: [ID 636263 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method "/etc/init.d/TBS stop" failed due to signal KILL.
May 17 12:44:49 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(13) Agent is calling clean for resource(tomcat) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:44:49 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (pk-ercoss1) Agent is calling clean for resource(tomcat) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:44:49 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(13) Resource(tomcat) - clean completed successfully.
May 17 12:44:49 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13073 Thread(13) Resource(tomcat) became OFFLINE unexpectedly on its own. Agent is restarting (attempt number 1 of 2) the resource.
May 17 12:44:49 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13073 (pk-ercoss1) Resource(tomcat) became OFFLINE unexpectedly on its own. Agent is restarting (attempt number 1 of 2) the resource.
May 17 12:45:32 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method or service exit timed out.  Killing contract 322865.
May 17 12:45:32 pk-ercoss1 svc.startd[9]: [ID 636263 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method "/etc/init.d/TBS stop" failed due to signal KILL.
May 17 12:45:32 pk-ercoss1 svc.startd[9]: [ID 748625 daemon.error] ericsson/eric_ep/TBS:default failed: transitioned to maintenance (see 'svcs -xv' for details)
May 17 12:46:18 pk-ercoss1 su: [ID 810491 auth.crit] 'su sybase' failed for sybase on /dev/???
May 17 12:47:19 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_3pp/glassfish:default: Method or service exit timed out.  Killing contract 540.
May 17 12:47:19 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13011 Thread(14) Resource(glassfish): offline procedure did not complete within the expected time.
May 17 12:47:19 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13063 Thread(14) Agent is calling clean for resource(glassfish) because offline did not complete within the expected time.
May 17 12:47:19 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13063 (pk-ercoss1) Agent is calling clean for resource(glassfish) because offline did not complete within the expected time.
May 17 12:47:20 pk-ercoss1 svc.startd[9]: [ID 748625 daemon.error] ericsson/eric_3pp/glassfish:default failed: transitioned to maintenance (see 'svcs -xv' for details)
May 17 12:47:21 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(14) Resource(glassfish) - clean completed successfully.

======================================================

hastatus output at present

========================================================

-- SYSTEM STATE
-- System               State                Frozen              

A  pk-ercoss1           RUNNING              0                    
A  pk-ercoss2           RUNNING              0                    

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State          

B  BkupLan         pk-ercoss1           Y          N               ONLINE         
B  BkupLan         pk-ercoss2           Y          N               ONLINE         
B  DDCMon          pk-ercoss1           Y          N               ONLINE         
B  DDCMon          pk-ercoss2           Y          N               PARTIAL        
B  Oss             pk-ercoss1           Y          N               ONLINE         
B  Oss             pk-ercoss2           Y          N               OFFLINE        
B  Ossfs           pk-ercoss1           Y          N               ONLINE         
B  Ossfs           pk-ercoss2           Y          N               OFFLINE        
B  PrivLan         pk-ercoss1           Y          N               ONLINE         
B  PrivLan         pk-ercoss2           Y          N               ONLINE         
B  PubLan          pk-ercoss1           Y          N               ONLINE         
B  PubLan          pk-ercoss2           Y          N               ONLINE         
B  StorLan         pk-ercoss1           Y          N               ONLINE         
B  StorLan         pk-ercoss2           Y          N               ONLINE         
B  Sybase1         pk-ercoss1           Y          N               OFFLINE        
B  Sybase1         pk-ercoss2           Y          N               ONLINE         

============================================================================

pk-ercoss1{root} # hagrp -resources PubLan
pub_mnic
pub_p
pk-ercoss1{root} # hares -display pub_mnic
#Resource    Attribute              System     Value
pub_mnic     Group                  global     PubLan
pub_mnic     Type                   global     MultiNICB
pub_mnic     AutoStart              global     1
pub_mnic     Critical               global     1
pub_mnic     Enabled                global     1
pub_mnic     LastOnline             global     pk-ercoss2
pub_mnic     MonitorOnly            global     0
pub_mnic     ResourceOwner          global     
pub_mnic     TriggerEvent           global     0
pub_mnic     ArgListValues          pk-ercoss1 UseMpathd        1       1       MpathdCommand   1       /usr/lib/inet/in.mpathd ConfigCheck     1       1       MpathdRestart   1       1       Device  4       oce0    0       oce9    1       NetworkHosts    1       10.207.1.254    LinkTestRatio   1       1       IgnoreLinkStatus        1       1       NetworkTimeout  1       100     OnlineTestRepeatCount   1       3       OfflineTestRepeatCount  1       3       NoBroadcast     1       0       DefaultRouter   1       0.0.0.0 Failback        1       0       GroupName       1       ""      Protocol        1       IPv4
pub_mnic     ArgListValues          pk-ercoss2 UseMpathd        1       1       MpathdCommand   1       /usr/lib/inet/in.mpathd ConfigCheck     1       1       MpathdRestart   1       1       Device  4       oce0    0       oce9    1       NetworkHosts    1       10.207.1.254    LinkTestRatio   1       1       IgnoreLinkStatus        1       1       NetworkTimeout  1       100     OnlineTestRepeatCount   1       3       OfflineTestRepeatCount  1       3       NoBroadcast     1       0       DefaultRouter   1       0.0.0.0 Failback        1       0       GroupName       1       ""      Protocol        1       IPv4
pub_mnic     ConfidenceLevel        pk-ercoss1 0
pub_mnic     ConfidenceLevel        pk-ercoss2 0
pub_mnic     ConfidenceMsg          pk-ercoss1
pub_mnic     ConfidenceMsg          pk-ercoss2
pub_mnic     Flags                  pk-ercoss1
pub_mnic     Flags                  pk-ercoss2
pub_mnic     IState                 pk-ercoss1 not waiting
pub_mnic     IState                 pk-ercoss2 not waiting
pub_mnic     MonitorMethod          pk-ercoss1 Traditional
pub_mnic     MonitorMethod          pk-ercoss2 Traditional
pub_mnic     Probed                 pk-ercoss1 1
pub_mnic     Probed                 pk-ercoss2 1
pub_mnic     Start                  pk-ercoss1 0
pub_mnic     Start                  pk-ercoss2 0
pub_mnic     State                  pk-ercoss1 ONLINE
pub_mnic     State                  pk-ercoss2 ONLINE
pub_mnic     ComputeStats           global     0
pub_mnic     ConfigCheck            global     1
pub_mnic     DefaultRouter          global     0.0.0.0
pub_mnic     Failback               global     0
pub_mnic     GroupName              global     
pub_mnic     IgnoreLinkStatus       global     1
pub_mnic     LinkTestRatio          global     1
pub_mnic     MpathdCommand          global     /usr/lib/inet/in.mpathd
pub_mnic     MpathdRestart          global     1
pub_mnic     NetworkHosts           global     10.207.1.254
pub_mnic     NetworkTimeout         global     100
pub_mnic     NoBroadcast            global     0
pub_mnic     OfflineTestRepeatCount global     3
pub_mnic     OnlineTestRepeatCount  global     3
pub_mnic     Protocol               global     IPv4
pub_mnic     TriggerResStateChange  global     0
pub_mnic     UseMpathd              global     1
pub_mnic     ContainerInfo          pk-ercoss1 Type             Name            Enabled
pub_mnic     ContainerInfo          pk-ercoss2 Type             Name            Enabled
pub_mnic     Device                 pk-ercoss1 oce0     0       oce9    1
pub_mnic     Device                 pk-ercoss2 oce0     0       oce9    1
pub_mnic     MonitorTimeStats       pk-ercoss1 Avg      0       TS      
pub_mnic     MonitorTimeStats       pk-ercoss2 Avg      0       TS      
pub_mnic     ResourceInfo           pk-ercoss1 State    Valid   Msg             TS      
pub_mnic     ResourceInfo           pk-ercoss2 State    Valid   Msg             TS      
pk-ercoss1{root} # hares -display pub_p
#Resource    Attribute             System     Value
pub_p        Group                 global     PubLan
pub_p        Type                  global     Phantom
pub_p        AutoStart             global     1
pub_p        Critical              global     1
pub_p        Enabled               global     1
pub_p        LastOnline            global     pk-ercoss1
pub_p        MonitorOnly           global     0
pub_p        ResourceOwner         global     
pub_p        TriggerEvent          global     0
pub_p        ArgListValues         pk-ercoss1 ""
pub_p        ArgListValues         pk-ercoss2 ""
pub_p        ConfidenceLevel       pk-ercoss1 100
pub_p        ConfidenceLevel       pk-ercoss2 100
pub_p        ConfidenceMsg         pk-ercoss1
pub_p        ConfidenceMsg         pk-ercoss2
pub_p        Flags                 pk-ercoss1
pub_p        Flags                 pk-ercoss2
pub_p        IState                pk-ercoss1 not waiting
pub_p        IState                pk-ercoss2 not waiting
pub_p        MonitorMethod         pk-ercoss1 Traditional
pub_p        MonitorMethod         pk-ercoss2 Traditional
pub_p        Probed                pk-ercoss1 1
pub_p        Probed                pk-ercoss2 1
pub_p        Start                 pk-ercoss1 1
pub_p        Start                 pk-ercoss2 1
pub_p        State                 pk-ercoss1 ONLINE
pub_p        State                 pk-ercoss2 ONLINE
pub_p        ComputeStats          global     0
pub_p        TriggerResStateChange global     0
pub_p        ContainerInfo         pk-ercoss1 Type              Name            Enabled
pub_p        ContainerInfo         pk-ercoss2 Type              Name            Enabled
pub_p        MonitorTimeStats      pk-ercoss1 Avg       0       TS      
pub_p        MonitorTimeStats      pk-ercoss2 Avg       0       TS      
pub_p        ResourceInfo          pk-ercoss1 State     Valid   Msg             TS      
pub_p        ResourceInfo          pk-ercoss2 State     Valid   Msg             TS
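
When digging into why the groups did not fail over, a few display commands are the usual starting point (a sketch using the group and resource names from the logs above):

hagrp -display Sybase1 -attribute AutoFailOver SystemList   # is failover enabled, and to where?
hares -display pub_mnic -attribute Critical                 # non-critical resources do not trigger failover
hagrp -display PubLan -attribute Parallel                   # PubLan holds only MultiNICB/Phantom resources, so it may be a parallel group
tail -200 /var/VRTSvcs/log/engine_A.log                     # the engine log records each failover decision

Note that pub_mnic faulted on both nodes at almost the same moment (in.mpathd lost all interfaces in the group on each system), and a group cannot fail over to a node where it is already faulted, which matches the "online or faulted on system" lines in the engine log above.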
 

Can LLT heartbeats communicate between NICs with different device names?

I need a solution

We have a 2-node VCS cluster; the heartbeat NICs are eth2 and eth3 on each node.

If eth2 on node1 is down and eth3 on node2 is down, does this mean both heartbeat links are down and the cluster is in a split-brain situation?

Can LLT heartbeats communicate between NIC eth2 and NIC eth3?

The VCS Installation Guide requires the two heartbeat links to be on different networks, so we should put eth2 of both nodes in one VLAN (VLAN1) and eth3 of both nodes in another VLAN (VLAN2). In that situation, heartbeats cannot pass between eth2 and eth3.

But in a production cluster system, we found that the four NICs (eth2 and eth3 on both nodes) are all in the same VLAN, which led me to post this discussion thread and ask:

If eth2 on node1 is down and eth3 on node2 is down, what will happen to the cluster (which is in active-standby mode)?
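
A sketch of how to see what LLT itself thinks of each link; lltstat reports per-node, per-link state, so it shows directly whether traffic is passing in your single-VLAN setup:

lltstat -nvv | more     # per-node status of every LLT link (UP/DOWN per link)
lltstat -c              # the LLT configuration currently in effect

I would not state definitively whether LLT will pair node1's eth3 with node2's eth2 across links, but lltstat -nvv answers it empirically: as long as each node still shows the peer UP on at least one link, the cluster is not in split brain.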

Thanks!

 

Unable to extract VRTS_SF_HA media

I need a solution

I recently downloaded the Veritas Cluster media for learning/testing purposes.

The issue is when I try to extract the media:

gtar -xvzf  VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz

It started to extract, but after extracting 865 MB of data it returned this error:

./dvd1-sol_sparc/readme_first.txt

gzip: stdin: invalid compressed data--crc error
/usr/sfw/bin/gtar: Child returned status 1
/usr/sfw/bin/gtar: Error is not recoverable: exiting now
bash-3.2# du -sh dvd1-sol_sparc/
 865M   dvd1-sol_sparc

then I tried

bash-3.2# gunzip -d VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz

gunzip: VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz: invalid compressed data--crc error

then

gzip -d VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz | tar xvf -

gzip: VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz: invalid compressed data--crc error
tar: blocksize = 0

I also tried WinRAR, 7-Zip, etc.
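
The CRC error usually points to a corrupted download rather than an extraction problem. A quick integrity-check sketch (the checksum to compare against is whatever the download page publishes, if one is provided):

gzip -t VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz         # test the compressed stream end to end
digest -a md5 VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz   # Solaris; compare with the published checksum
ls -l VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz           # compare the byte count with the stated download size

If either check fails, re-download the archive, ideally with a client that can resume and verify the transfer.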

 

 

Thanks in advance, gurus.
 

 

Owais Hyder.

Can a customer change the subject of VCS alert mail?

I need a solution

VCS 5.1SP1 on Red Hat.

The customer configured the notifier with the wizard and gets the following VCS alert mail:

 

From: VCS_ALert@nodename
Sent: Tuesday, July xx, 2012 XXXX
Subject: VCS Warning for Resource resourcename, Resource went online by itself
 
 
 
Event Time: XXX
Entity Name: resourcename
Entity Type: Resource
Entity Subtype: BBB
Entity State: Resource went online by itself
Traps Origin: Veritas_Cluster_Server
System Name: hostname
 Entities Container Name: AAA
Entities Container Type: Service Group
 
We would like to know whether the mail subject and sender can be modified by the customer.
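
The notifier's subject format is fixed, but VCS event triggers give full control over outgoing mail. A minimal sketch of a resstatechange trigger that sends its own message (the recipient address is a placeholder, the argument order is per the 5.1 bundled sample, and the trigger must be enabled with the TriggerResStateChange attribute):

#!/bin/sh
# /opt/VRTSvcs/bin/triggers/resstatechange
# args (5.1): system resource previous_state new_state
SYS=$1; RES=$2; PREV=$3; NEW=$4
echo "Resource $RES on $SYS changed from $PREV to $NEW" | \
    mailx -s "Custom subject: $RES is $NEW" ops@example.com   # ops@example.com is a placeholder

The From: address is set by the mail transport on the node, so changing the sender is a mailer configuration question rather than a VCS one.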

Change VCS LLT connection

I need a solution

Hello

  I need to change the NIC ports used for the VCS heartbeat. Normally I use vi to edit /etc/llttab and reboot the nodes.

Is that the correct way to change the LLT connection?
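
Editing /etc/llttab is the right idea, but a reboot is not strictly required; the stack can be cycled from the command line. A sketch for a two-node cluster (if fencing or other GAB clients are configured, stop them first):

hastop -local -force    # stop HAD but leave applications running (or hastop -all)
gabconfig -U            # unconfigure GAB
lltconfig -U            # unconfigure LLT
vi /etc/llttab          # point the link directives at the new NICs
lltconfig -c            # start LLT with the new configuration
gabconfig -c -n 2       # seed GAB for a two-node cluster
hastart                 # restart HAD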

Thank you.

VCS on Windows 2008R2: Blue Screen

I do not need a solution (just sharing information)

Hello,

I'm currently creating a VCS cluster (without SFW for disk management) and I get blue screens when I mount disk resources in the cluster. In VCS with the Windows LDM, you need to use DiskRes and Mount resources in the cluster.

When I bring the DiskRes resource online, all is OK (it works), but when I try to bring the Mount resource online, I get a blue screen: STOP 0xBADFFFFF.

I found a solution on this site: http://www.experts-exchange.com/OS/Microsoft_Operating_Systems/Server/Windows_Server_2008/Q_27781502.html. MPIO must be configured with the Failover policy, not the Round Robin policy. In Disk Administrator, click on the disk (not the partition) and choose Properties. Go to the MPIO tab and change the policy to Failover Only.

 

VCS cluster with VxVM: Solaris node not booting

I need a solution

Hi Support

We have a two-node VCS cluster. One node is down and unable to boot, so we brought the cluster up locally on the other node. The node that is not booting gives the following errors (I have included the last few lines only). The OS is Solaris 9 and the platform is an SF4900.

=============================================================

Rebooting with command: boot -s

Copyright 1983-2003 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.

hba0: QLogic QLA2300 Fibre Channel Host Adapter fcode version 2.00.05 01/29/03
hba0: Firmware v3.3.12 (ipx)
QLogic Fibre Channel Driver v4.20 Instance: 1
hba1: QLogic QLA2300 Fibre Channel Host Adapter fcode version 2.00.05 01/29/03
hba1: Firmware v3.3.12 (ipx)
Hardware watchdog enabled
VxVM sysboot INFO V-5-2-3390 Starting restore daemon...
VxVM sysboot INFO V-5-2-3409 starting in boot mode...
NOTICE: VxVM vxdmp V-5-0-34 added disk array DISKS, datype = Disk

VxVM vxconfigd ERROR V-5-1-0 Segmentation violation
/etc/rcS.d/S25vxvm-sysboot: egettxt: not found

VxVM sysboot NOTICE V-5-2-3388 Halting system...
syncing file systems... done
NOTICE:

==========================================================
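
The vxconfigd segmentation violation is what halts the boot, and since the failure happens in the rcS scripts even single-user boot stops there. One standard way to get the node up for diagnosis is to keep VxVM from starting; a sketch, applicable only if the root disk is not encapsulated (an encapsulated root needs a different recovery path), and the root slice device name below is a placeholder:

ok boot cdrom -s                                   # boot from media, since rcS halts on the local disk
mount /dev/dsk/c0t0d0s0 /mnt                       # mount the root slice (placeholder device)
touch /mnt/etc/vx/reconfig.d/state.d/install-db    # marker that tells VxVM startup scripts to skip vxconfigd
umount /mnt
reboot                                             # collect /var/adm/messages, then remove the marker file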
 

with thanks & regards

Arup

 

Veritas Cluster - engine_A.log

I need a solution

Getting the following error in engine_A.log:

 

2013/05/31 14:16:12 VCS INFO V-16-1-10307 Resource cvmvoldg5 (Owner: unknown, Group: lic_DG) is offline on dwlemm2b (Not initiated by VCS)

2013/05/31 14:16:14 VCS INFO V-16-6-15004 (dwlemm2b) hatrigger:Failed to send trigger for resfault; script doesn't exist

 

2013/05/31 14:16:15 VCS NOTICE V-16-10001-5510 (dwlemm2b) CFSMount:cfsmount5:offline:Attempting fuser TERM : Mount Point : /var/opt/sentinel
2013/05/31 14:16:18 VCS NOTICE V-16-10001-5510 (dwlemm2b) CFSMount:cfsmount2:offline:Attempting fuser TERM : Mount Point : /var/opt/BGw/ServerGroup1
2013/05/31 15:11:51 VCS INFO V-16-1-10307 Resource fmm (Owner: unknown, Group: FMMgrp) is offline on dwlemm2a (Not initiated by VCS)
 
and many more similar errors. Does anyone know the cause of any of these errors?
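
The V-16-6-15004 line simply means the resfault event fired but no trigger script is installed at the standard location; a sketch of how to confirm:

ls -l /opt/VRTSvcs/bin/triggers/      # resfault should be listed here if you want the trigger to run
# if you do not use a resfault trigger, the INFO message is harmless; otherwise
# copy a script into place, e.g. starting from /opt/VRTSvcs/bin/sample_triggers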

 

SQL Server Install on VCS

I need a solution

Hi All,

 

Can you please help me with a guide or a step-by-step document to install SQL Server 2005 and 2008 in a Veritas cluster environment? Since the start of my career I have worked only with MCS.

I am very new to this environment and want to know the install method and cluster configuration.

 

thanks ,

 

Shiv

VCS 6.0.1 Linux - how is CPU usage calculated?

I need a solution

Hello All,

Maybe this has a known answer already, but I tried to find out how the VCS 6.0.1 HostMonitor calculates CPU usage on Linux hosts and didn't find one.
Can somebody answer this question?
I would like to know where it collects the CPU data from and whether it is an average CPU usage. (It looks like the collected information is wrong.)
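
I don't know HostMonitor's exact implementation, but the usual source on Linux is the aggregate "cpu" line of /proc/stat, sampled twice, so the figure is an average across all CPUs over the sample interval. A minimal sketch of that method (busy = user + nice + system; iowait and irq time are ignored here for brevity, which is an assumption about what counts as busy):

#!/bin/bash
read -r _ u n s i _ < /proc/stat   # first sample: user nice system idle from the aggregate line
b1=$((u + n + s)); t1=$((b1 + i))
sleep 5
read -r _ u n s i _ < /proc/stat   # second sample
b2=$((u + n + s)); t2=$((b2 + i))
echo "CPU busy over 5s: $(( (b2 - b1) * 100 / (t2 - t1) ))%"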

Thank you very much.

Iv4n

 

Failover between two nodes

I need a solution

Hi

 

My service group is now online on node 1; the group type is failover, and the resources are mounted inside the group.

 

So if node 1 goes down for any reason, the resources should go to node 2. Is this high availability?
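
Yes, that is the failover behavior: if node 1 crashes, VCS brings the group online on node 2 automatically, assuming node 2 is in the group's SystemList and AutoFailOver is enabled. You can rehearse it without crashing anything using a controlled switch; the group and node names below are placeholders:

hagrp -switch mySG -to node2   # controlled move of the group to the other node
hastatus -sum                  # watch the group go offline on node 1 and online on node 2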

 

 

 

Regards

 

 


VCS Global Clustering - WAC Error V-16-1-10543 IpmServer::open Cannot create socket errno = 97

I need a solution

Hi,

 

I have a customer who has two VCS clusters running on RHEL 5.6 servers. These clusters are further protected by site failover using GCO (Global Cluster Option). All was working fine since installation, with remote cluster operations showing up on the local cluster and so on. But then this error started to appear in the wac_A.log file:

VCS WARNING V-16-1-10543 IpmServer::open Cannot create socket errno = 97

Since then, the cluster does not see the remote cluster's state, but it can ping it, as seen in the hastatus output below:

 

site-ab04# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  site-ab04       RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  ClusterService  site-ab04       Y          N               ONLINE
B  SG_commonsg     site-ab04       Y          N               ONLINE
B  SG_site-b04g3 site-ab04       Y          N               OFFLINE
B  SG_site-b04g4 site-ab04       Y          N               OFFLINE
B  SG_site-a04g0 site-ab04       Y          N               OFFLINE
B  SG_site-a04g1 site-ab04       Y          N               OFFLINE
B  SG_site-a04g2 site-ab04       Y          N               OFFLINE
B  vxfen           site-ab04       Y          N               ONLINE

-- WAN HEARTBEAT STATE
-- Heartbeat       To                   State

M  Icmp            site-b04c       ALIVE

-- REMOTE CLUSTER STATE
-- Cluster         State

N  site-b04c  INIT

 

Does any one have any ideas? Networking all seems to be in order.
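
One small step that may help is decoding what errno 97 means on this platform. A sketch; on Linux it maps to EAFNOSUPPORT (address family not supported), which would be consistent with wac trying to open a socket for an address family, such as IPv6, that is disabled on the host, though that interpretation is an assumption rather than a confirmed diagnosis:

python -c 'import errno, os; print errno.errorcode[97], os.strerror(97)'
# on RHEL this prints: EAFNOSUPPORT Address family not supported by protocol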

 

Thanks,

 

Rich

VCS Error V-16-10011-350

I need a solution

Environment

AIX= 6.1 TL8 SP2

HA/VCS = 6.0.1

Cluster Nodes = 2

 

We just updated from VCS 5.1 to 6.0.1 and receive the following error when we test failover (in either direction):

 

12:11:15               V-16-10011-350 (clustertest0) Application:lab_app:online:Execution failed with return value [110]

 

This is followed by:

 

12:11:15               V-16-10011-260 (clustertest0) Application:lab_app:online:Execution of start Program (/etc/clu_app.x) returned (1)

12:11::20              V-16-1-10298 Resource lab_app (Owner Unspecified, Group: LABSG) is online on clustertest0 (VCS initiated)

12:11::20              V-16-1-10447 Group LABSG is online on clustertest0

 

The application comes up and is running after failover, but we are wondering what is causing the error so we can remedy the issue.
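
The V-16-10011-260 line shows the configured StartProgram (/etc/clu_app.x) exiting with status 1; the agent logs an error whenever the start program returns non-zero, even though the subsequent monitor finds the application running, which matches what you see. A hypothetical wrapper sketch that reports success to VCS based on an independent liveness check instead (the wrapper path and process pattern are placeholders):

#!/bin/sh
# /etc/clu_app_wrapper.sh - hypothetical StartProgram wrapper
/etc/clu_app.x start                    # run the real start script, ignoring its exit code
sleep 5                                 # give the application time to come up
if ps -ef | grep -q '[c]lu_app'; then   # bracket trick keeps grep from matching itself
    exit 0                              # report success only if the process is actually running
fi
exit 1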

Storage Foundation Keyless issue

I need a solution

Hello

   I have installed SFHA 6.0 for Windows 2008 with the keyless option. However, after I applied the permanent key, the system still shows this error in the event viewer:

The keyless license option does not entitle this system to run Veritas Storage Foundation for Windows, Veritas Storage Foundation and High Availability for Windows, or Veritas Dynamic Multi-Pathing for Windows. The product is no longer in licensing compliance even though it is still functioning properly. Symantec requires that you perform any of the following tasks to resolve the license compliance violation: -Add this system as a managed host to a Veritas Operations Manager (VOM) Management Server. or -Add an appropriate and valid license key on this system using the Symantec product installer from Windows Add/Remove Programs.
 
For more information, click the following link: http://entsupport.symantec.com/umi/V-76-58642-7026

   I searched Google and found that the keyless option should be disabled.

But I found that the vxkeyless utility does not exist on the Windows server.

 

Thank you.

Issue with RegRep (Registry Replication Keys) resource in VCS 6.0 for SQL2008R2

I need a solution

Hello,

I'm working on a VCS 6.0.1 cluster under Windows 2008R2 with SQL2008R2. I cannot use the SQL wizard to configure service groups for SQL2008, because I have some issues with the shared disk (a call to Symantec support is in progress).

So I am trying to configure the SG manually, but I have an issue with the RegRep resource; I cannot find solid information on configuring it. At the moment my main.cf contains this:

    RegRep SQLInst1_SQL_RegRep (
        MountResName = SQLInst1_RegRep_Mount
        Keys = { "HKLM\\Software\\VERITAS\\" = "",
             "HKLM\\Software\\Microsoft\\Microsoft SQL Server" = "",
             "HKLM\\Software\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQLINST1" = "" }
        )

But I think this is not sufficient, because I have many problems with failover of the service groups.

Does someone have a cluster configured with the wizard? Can anyone share the complete definition of a service group for SQL2008? Thank you.

 

Note : When you use Wizard, you obtain this :

    RegRep SG_SQL03-RegRep-MSSQL (
        MountResName = SG_SQL03-Mount
        ReplicationDirectory = "\\RegRep\\SG_SQL03-RegRep-MSSQL"
        Keys = {
             "HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\MSSQLServer" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_MSSQLServer.reg",
             "HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\PROVIDERS" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_PROVIDERS.reg",
             "HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\Replication" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_Replication.reg",
             "HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\SQLServerAgent" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_SQLServerAgent.reg",
             "HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\SQLServerSCP" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_SQLServerSCP.reg" }
        ExcludeKeys = {
             "HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\MSSQLServer\\CurrentVersion" }
        )
 

vxfen module causes SLES 11 SP1 kernel panic

I need a solution

Hi all: 

I use three shared disks on iSCSI (target using LIO) as the I/O fencing disks in VCS 6.0 on SLES 11 SP1. After configuration, the kernel panics when starting vxfen. The log is here:

 

[   83.640266] BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
[   83.641248] IP: [<ffffffff81396113>] down_read+0x3/0x10
[   83.641933] PGD 37b62067 PUD 3c7b2067 PMD 0 
[   83.642512] Oops: 0002 [#1] SMP 
[   83.643016] last sysfs file: /sys/devices/platform/host2/iscsi_host/host2/initiatorname
[   83.644009] CPU 0 
[   83.644054] Modules linked in: vxfen(PN) dmpalua(PN) vxspec(PN) vxio(PN) vxdmp(PN) snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc gab(PN) ipv6 crc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi af_packet llt(PN) amf(PN) microcode fuse loop fdd(PN) exportfs vxportal(PN) vxfs(PN) dm_mod virtio_blk virtio_balloon virtio_net sg rtc_cmos rtc_core rtc_lib tpm_tis button tpm tpm_bios floppy i2c_piix4 virtio_pci virtio_ring pcspkr i2c_core virtio uhci_hcd ehci_hcd sd_mod crc_t10dif usbcore edd ext3 mbcache jbd fan processor ide_pci_generic piix ide_core ata_generic ata_piix libata scsi_mod thermal thermal_sys hwmon
[   83.644054] Supported: Yes, External
[   83.644054] Pid: 4670, comm: vxfen Tainted: P             2.6.32.12-0.7-default #1 Bochs
[   83.644054] RIP: 0010:[<ffffffff81396113>]  [<ffffffff81396113>] down_read+0x3/0x10
[   83.644054] RSP: 0018:ffff88002e831638  EFLAGS: 00010286
[   83.644054] RAX: 0000000000000060 RBX: 0000000000000000 RCX: ffff88003c4e2480
[   83.644054] RDX: 0000000000000001 RSI: 0000000000002000 RDI: 0000000000000060
[   83.644054] RBP: ffff88002c9aa000 R08: 0000000000000000 R09: ffff88003c4e2480
[   83.644054] R10: ffff88003d6da140 R11: 00000000000000d0 R12: 0000000000000060
[   83.644054] R13: ffff88002c9a8000 R14: 0000000000000000 R15: ffff88002c9a8001
[   83.644054] FS:  0000000000000000(0000) GS:ffff880006200000(0000) knlGS:0000000000000000
[   83.644054] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[   83.644054] CR2: 0000000000000060 CR3: 000000003c72a000 CR4: 00000000000006f0
[   83.644054] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   83.644054] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   83.644054] Process vxfen (pid: 4670, threadinfo ffff88002e830000, task ffff88002c8e8140)
[   83.644054] Stack:
[   83.644054]  ffffffff810325ab 0000000000000000 0000001000000000 ffff88003ca9d690
[   83.644054] <0> ffff88003c4e2480 000000013d0181c0 0000000000000002 0000000000000002
[   83.644054] <0> 0000000000000002 0000000000001010 0000000000000000 0000000000000000
[   83.644054] Call Trace:
[   83.644054]  [<ffffffff810325ab>] get_user_pages_fast+0x11b/0x1a0
[   83.644054]  [<ffffffff81127c27>] __bio_map_user_iov+0x167/0x2a0
[   83.644054]  [<ffffffff81127d69>] bio_map_user_iov+0x9/0x30
[   83.644054]  [<ffffffff81127dac>] bio_map_user+0x1c/0x30
[   83.644054]  [<ffffffff811bab91>] __blk_rq_map_user+0x111/0x140
[   83.644054]  [<ffffffff811bacc5>] blk_rq_map_user+0x105/0x190
[   83.644054]  [<ffffffff811bebe7>] sg_io+0x3c7/0x3e0
[   83.644054]  [<ffffffff811bf1ec>] scsi_cmd_ioctl+0x2ac/0x470
[   83.644054]  [<ffffffffa01b75b1>] sd_ioctl+0xa1/0x120 [sd_mod]
[   83.644054]  [<ffffffffa0ccaa93>] vxfen_ioctl_by_bdev+0xc3/0xd0 [vxfen]
[   83.644054]  [<ffffffffa0ccb6ac>] vxfen_ioc_kernel_scsi_ioctl+0xec/0x3c0 [vxfen]
[   83.644054]  [<ffffffffa0ccbe1f>] vxfen_lnx_pgr_in+0xff/0x380 [vxfen]
[   83.644054]  [<ffffffffa0ccc11a>] vxfen_plat_pgr_in+0x7a/0x1c0 [vxfen]
[   83.644054]  [<ffffffffa0cd26c3>] vxfen_readkeys+0xa3/0x380 [vxfen]
[   83.644054]  [<ffffffffa0cd3514>] vxfen_membreg+0x84/0xae0 [vxfen]
[   83.644054]  [<ffffffffa0cce6d6>] vxfen_preexist_split_brain_scsi3+0x96/0x2d0 [vxfen]
[   83.644054]  [<ffffffffa0ccf96d>] vxfen_reg_coord_disk+0x7d/0x660 [vxfen]
[   83.644054]  [<ffffffffa0ca5e0b>] vxfen_reg_coord_pt+0xfb/0x250 [vxfen]
[   83.644054]  [<ffffffffa0cb849c>] vxfen_handle_local_config_done+0x14c/0x8d0 [vxfen]
[   83.644054]  [<ffffffffa0cbad57>] vxfen_vrfsm_cback+0xad7/0x17b0 [vxfen]
[   83.644054]  [<ffffffffa0cd5b20>] vrfsm_step+0x1b0/0x3b0 [vxfen]
[   83.644054]  [<ffffffffa0cd7e1c>] vrfsm_recv_thread+0x32c/0x970 [vxfen]
[   83.644054]  [<ffffffffa0cd85b4>] vxplat_lx_thread_base+0xa4/0x100 [vxfen]
[   83.644054]  [<ffffffff81003fba>] child_rip+0xa/0x20
[   83.644054] Code: 48 85 f6 74 0f 48 89 e7 e8 5b 09 cd ff 85 c0 48 63 d0 7e 07 48 c7 c2 fc fd ff ff 48 83 c4 78 48 89 d0 5b 5d c3 00 00 00 48 89 f8 <3e> 48 ff 00 79 05 e8 52 fa e4 ff c3 90 48 89 f8 48 ba 01 00 00 
[   83.644054] RIP  [<ffffffff81396113>] down_read+0x3/0x10
[   83.644054]  RSP <ffff88002e831638>
[   83.644054] CR2: 0000000000000060
 
 
vcs1:~ # vxdisk  list
DEVICE       TYPE            DISK         GROUP        STATUS
aluadisk0_1  auto:cdsdisk    -            -            online
aluadisk0_2  auto:cdsdisk    -            -            online
aluadisk0_3  auto:cdsdisk    -            -            online
sda          auto:none       -            -            online invalid
vda          simple          vda          data_dg      online
vdb          simple          vdb          data_dg      online
 
vcs1:~ # cat /proc/partitions 
major minor  #blocks  name
 
   8        0    8388608 sda
   8        1    7333641 sda1
   8        2    1052257 sda2
 253        0   10485760 vda
 253       16    4194304 vdb
   8       16    1048576 sdb
   8       19    1046528 sdb3
   8       24    1046528 sdb8
   8       32    1048576 sdc
   8       35    1046528 sdc3
   8       40    1046528 sdc8
   8       48    1048576 sdd
   8       51    1046528 sdd3
   8       56    1046528 sdd8
 201        0    8388608 VxDMP1
 201        1    7333641 VxDMP1p1
 201        2    1052257 VxDMP1p2
 201       16    1048576 VxDMP2
 201       19    1046528 VxDMP2p3
 201       24    1046528 VxDMP2p8
 201       32    1048576 VxDMP3
 201       35    1046528 VxDMP3p3
 201       40    1046528 VxDMP3p8
 201       48    1048576 VxDMP4
 201       51    1046528 VxDMP4p3
 201       56    1046528 VxDMP4p8
 199     6000    8388608 VxVM6000
 199     6001     153600 VxVM6001
 
 
 
LIO Target server message:
 
 
[60529.780169] br0: port 3(vnet2) entered forwarding state
[60675.372132] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.372217] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.373337] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.373426] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.374304] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.374374] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
 
 

All tests have passed when run with:

vcs1:~ # vxfentsthdw -m 
 
Veritas vxfentsthdw version 6.0.000.000-GA Linux
 
 
The utility vxfentsthdw works on the two nodes of the cluster.
The utility verifies that the shared storage one intends to use is
configured to support I/O fencing.  It issues a series of vxfenadm
commands to setup SCSI-3 registrations on the disk, verifies the
registrations on the disk, and removes the registrations from the disk.
 
 
******** WARNING!!!!!!!! ********
 
THIS UTILITY WILL DESTROY THE DATA ON THE DISK!! 
 
Do you still want to continue : [y/n] (default: n) y
The logfile generated for vxfentsthdw is /var/VRTSvcs/log/vxfen/vxfentsthdw.log.9431
 
Enter the first node of the cluster:
vcs1
Enter the second node of the cluster:
vcs2
 
Enter the disk name to be checked for SCSI-3 PGR on node vcs1 in the format:
for dmp: /dev/vx/rdmp/sdx
for raw: /dev/sdx
Make sure it is the same disk as seen by nodes vcs1 and vcs2
/dev/sdb
 
Enter the disk name to be checked for SCSI-3 PGR on node vcs2 in the format:
for dmp: /dev/vx/rdmp/sdx
for raw: /dev/sdx
Make sure it is the same disk as seen by nodes vcs1 and vcs2
/dev/sdb
 
***************************************************************************
 
Testing vcs1 /dev/sdb vcs2 /dev/sdb
 
Evaluate the disk before testing  ........................ No Pre-existing keys
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs2 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs2 .................... Passed
Unregister keys on disk /dev/sdb from node vcs1 ........................ Passed
Verify registrations for disk /dev/sdb on node vcs2 .................... Passed
Unregister keys on disk /dev/sdb from node vcs2 ........................ Passed
Check to verify there are no keys from node vcs1 ....................... Passed
Check to verify there are no keys from node vcs2 ....................... Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Read from disk /dev/sdb on node vcs1 ................................... Passed
Write to disk /dev/sdb from node vcs1 .................................. Passed
Read from disk /dev/sdb on node vcs2 ................................... Passed
Write to disk /dev/sdb from node vcs2 .................................. Passed
Reserve disk /dev/sdb from node vcs1 ................................... Passed
Verify reservation for disk /dev/sdb on node vcs1 ...................... Passed
Read from disk /dev/sdb on node vcs1 ................................... Passed
Read from disk /dev/sdb on node vcs2 ................................... Passed
Write to disk /dev/sdb from node vcs1 .................................. Passed
Expect no writes for disk /dev/sdb on node vcs2 ........................ Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs2 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Verify registrations for disk /dev/sdb on node vcs2 .................... Passed
Write to disk /dev/sdb from node vcs1 .................................. Passed
Write to disk /dev/sdb from node vcs2 .................................. Passed
Preempt and abort key KeyA using key KeyB on node vcs2 ................. Passed
Test to see if I/O on node vcs1 terminated ............................. Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Preempt key KeyC using key KeyB on node vcs2 ........................... Passed
Test to see if I/O on node vcs1 terminated ............................. Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Verify registrations for disk /dev/sdb on node vcs2 .................... Passed
Verify reservation for disk /dev/sdb on node vcs1 ...................... Passed
Verify reservation for disk /dev/sdb on node vcs2 ...................... Passed
Remove key KeyB on node vcs2 ........................................... Passed
Check to verify there are no keys from node vcs1 ....................... Passed
Check to verify there are no keys from node vcs2 ....................... Passed
Check to verify there are no reservations on disk /dev/sdb from node vcs1  Passed
Check to verify there are no reservations on disk /dev/sdb from node vcs2  Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Clear PGR on node vcs1 ................................................. Passed
Check to verify there are no keys from node vcs1 ....................... Passed
 
ALL tests on the disk /dev/sdb have PASSED.
The disk is now ready to be configured for I/O Fencing on node vcs1.
 
ALL tests on the disk /dev/sdb have PASSED.
The disk is now ready to be configured for I/O Fencing on node vcs2.
 
Removing test keys and temporary files, if any...
 
 

 

 

 

 
