Channel: Symantec Connect - Storage and Clustering - Discussions

How can I add a virtual IP to a cluster service group?

I need a solution

How can I add a virtual IP to a service group in the cluster?
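
For reference, here is a minimal command-line sketch of adding an IP resource to an existing group; the group, resource, NIC, and address names (mySG, mySG_ip, eth0, 10.0.0.100) are placeholders, and on Solaris the device would be something like bge0:

haconf -makerw                             # open the cluster configuration for writing
hares -add mySG_ip IP mySG                 # add an IP-type resource to group mySG (placeholder names)
hares -modify mySG_ip Device eth0          # NIC that will host the virtual IP
hares -modify mySG_ip Address 10.0.0.100   # the virtual IP address itself
hares -modify mySG_ip NetMask 255.255.255.0
hares -modify mySG_ip Enabled 1
haconf -dump -makero                       # save and close the configuration
hares -online mySG_ip -sys node1           # bring the IP up on one node

If the group already has a NIC or MultiNICB resource, you would typically also link the IP resource to it (hares -link mySG_ip mySG_nic) so the address depends on a healthy interface.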

Thanks 

Regards


After a server reboot, "VCS RAC INFO V-10-1-15047 mmpl_reconfig_ioctl: dev_ioctl failed, vxfen May not be configured" shows in messages

I need a solution

Solaris 10, SF RAC 5.1SP1RP2

===========

 

 PKGINST:  VRTSdbac
      NAME:  Veritas Oracle Real Application Cluster Support Package by Symantec
  CATEGORY:  system
      ARCH:  sparc
   VERSION:  5.1
   BASEDIR:  /
    VENDOR:  Symantec Corporation
      DESC:  Veritas Oracle Real Application Cluster Support Package by Symantec
    PSTAMP:  5.1.102.100-5.1SP1RP2P1-2012-03-06_21:43:16
  INSTDATE:  Jul 31 2012 15:02
    STATUS:  completely installed
     FILES:      219 installed pathnames
                  33 shared pathnames
                   5 linked files
                  56 directories
                 135 executables
               10306 blocks used (approx)

 

after reboot:

 

 

Jan23 15:06:38 hostXX llt: [ID 860062 kern.notice] LLT INFO V-14-1-10024 link 1 (ethXXX) node 1 active
Jan23 15:06:42 hostXX gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port o gen   562e0e membership 01
Jan23 15:06:42 hostXX gab: [ID 316943 kern.notice] GAB INFO V-15-1-20036 Port a gen   562e0d membership 01
Jan23 15:06:42 hostXX vcsmm: [ID 357760 kern.notice] VCS RAC INFO V-10-1-15047 mmpl_reconfig_ioctl: dev_ioctl failed, vxfen May not be configured
...
 
 
=======
SF RAC starts OK; I just need to check whether this is an issue that is not fixed in this version.
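
In case it helps anyone hitting the same message, a quick sketch of confirming whether fencing is actually configured (vxfen registers with GAB on port b):

vxfenadm -d          # displays the fencing mode and current fencing membership
cat /etc/vxfenmode   # should show scsi3 (or disabled) as the configured mode
gabconfig -a         # port b membership indicates the vxfen module is up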

Resources state offline

I need a solution

Hi

 

After creating the resource and trying to bring it online, I got the message "Resource has not been probed on system". When I probe the resource with hares -probe <resource> -sys <system>, it reports success, but the resource is still offline. I am working from the VOM GUI.
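
For anyone checking the same thing from the command line, a minimal sequence (resource and system names are placeholders):

hares -probe myres -sys sysA            # ask the agent to monitor the resource once
hares -state myres                      # confirm the state VCS now reports per system
hares -online myres -sys sysA           # then attempt the online
tail -f /var/VRTSvcs/log/engine_A.log   # the engine log records why it stays offline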

 

Regards

 

 

8 processors, CPU usage 97%, GAB panics the system: is this normal?

I need a solution

SFHA 5.1SP1RP3 on AIX.

The customer's system has 8 processors; CPU usage reached 97% and GAB panicked the system. Is this normal?

From a common-sense standpoint, one processor may be busy, but why not use the other processors? Does 800% mean all the CPUs are used up?

 

========

2013/05/17 02:51:09 VCS CRITICAL V-16-1-50086 CPU usage on hostXXX is 97%  <<<<
2013/05/17 03:16:13 VCS INFO V-16-1-10196 Cluster logger started
===
 
 
=========errpt log============
  LABEL:          KERNEL_PANIC
IDENTIFIER:     225E3B63
 
Date/Time:       Mon May 17 03:07:16 2013
...
Node Id:         hostXXX
...
PANIC STRING
GAB: Port h halting system due to client process failure at [14:1019]
 
==========
000818 02:52:04 1c1cfee6  0 ... 2803 0 GAB WARNING V-15-1-20058 Port h process 10879486: heartbeat failed, killing process
 
000819 02:52:04 1c1cfee6  0 ... 2803 0 GAB INFO V-15-1-20059 Port h heartbeat interval 30000 msec. Statistics:
 
000820 02:52:04 1c1cfee6  0 ... 2803 0 GAB INFO V-15-1-20129 Port h: heartbeats in 0 ~ 6000 msec: 615408
 
000821 02:52:04 1c1cfee6  0 ... 2803 0 GAB INFO V-15-1-20129 Port h: heartbeats in 6000 ~ 12000 msec: 0
 
000822 02:52:04 1c1cfee6  0 ... 2803 0 GAB INFO V-15-1-20129 Port h: heartbeats in 12000 ~ 18000 msec: 0
 
000823 02:52:04 1c1cfee6  0 ... 2803 0 GAB INFO V-15-1-20129 Port h: heartbeats in 18000 ~ 24000 msec: 0
 
000824 02:52:04 1c1cfee6  0 ... 2803 0 GAB INFO V-15-1-20129 Port h: heartbeats in 24000 ~ 30000 msec: 0
 
000825 02:52:04 1c1cfee6  0 ... 2803 0 GAB INFO V-15-1-20041 Port h: client process failure: killing process
 
000826 02:52:19 1c1d04c2  0 ... 2803 0 GAB WARNING V-15-1-20035 Port h attempting to kill process due to client process failure
 
000827 02:52:34 1c1d0a9e  0 ... 2803 0 GAB WARNING V-15-1-20035 Port h attempting to kill process due to client process failure
 
000828 02:52:49 1c1d107a  0 ... 2803 0 GAB WARNING V-15-1-20035 Port h attempting to kill process due to client process failure
 
000829 02:53:19 1c1d1c2c  0 ... 2803 0 GAB WARNING V-15-1-20138 Port h isolated due to client process failure
 
000830 02:54:11 1c1d30be  0 ... 2803 0 GAB WARNING V-15-1-20139 Port h has been isolated and marked invalid
==========
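
The panic string shows GAB port h (the had daemon's port) failing its client heartbeat; it is not a CPU-count question as such. A sketch of what to check; VCS_GAB_TIMEOUT is the documented tunable for how long had may be unresponsive before GAB acts, and the vcsenv location is the usual one but should be verified for your release:

gabconfig -a                                   # current GAB port memberships
grep VCS_GAB_TIMEOUT /opt/VRTSvcs/bin/vcsenv   # heartbeat allowance for had, in msec (assumed location)
# the statistics above show a 30000 msec interval; raising the timeout only hides
# the CPU starvation, so find the busy process first (e.g. with topas/nmon on AIX)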

Need to find the root cause of a service failure in a Veritas cluster

I need a solution

Hello guys,

I need to find the root cause of the service failure in our Veritas cluster.

The service groups did not fail over to the other node.

Below are the logs; as far as I can see, it all started with the NICs failing and the IPMultiNICB resource going faulty.

If anyone can help me here, please do.

engine logs

====================================

2013/05/17 12:41:41 VCS INFO V-16-2-13075 (pk-ercoss1) Resource(ossfs_ip) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(1).
2013/05/17 12:41:42 VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss1
2013/05/17 12:41:42 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pub_p (Owner: Unspecified, Group: PubLan) on System pk-ercoss1
2013/05/17 12:41:42 VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss2
2013/05/17 12:41:42 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pub_p (Owner: Unspecified, Group: PubLan) on System pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-6-0 (pk-ercoss1) resfault:(resfault) Invoked with arg0=pk-ercoss1, arg1=pub_mnic, arg2=ONLINE
2013/05/17 12:41:43 VCS INFO V-16-6-0 (pk-ercoss2) resfault:(resfault) Invoked with arg0=pk-ercoss2, arg1=pub_mnic, arg2=ONLINE
2013/05/17 12:41:43 VCS INFO V-16-0 (pk-ercoss1) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss1 ,arg2=pub_mnic
2013/05/17 12:41:43 VCS INFO V-16-0 (pk-ercoss2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss2 ,arg2=pub_mnic
2013/05/17 12:41:43 VCS INFO V-16-6-15002 (pk-ercoss1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss1 pub_mnic ONLINE  successfully
2013/05/17 12:41:43 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss2 pub_mnic ONLINE  successfully
2013/05/17 12:41:43 VCS INFO V-16-1-10305 Resource pub_p (Owner: Unspecified, Group: PubLan) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:41:43 VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss2
2013/05/17 12:41:43 VCS NOTICE V-16-1-10446 Group PubLan is offline on system pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss2 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss1 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss1
2013/05/17 12:41:43 VCS NOTICE V-16-1-10235 Restart is set for group PubLan. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
2013/05/17 12:41:43 VCS INFO V-16-1-10305 Resource pub_p (Owner: Unspecified, Group: PubLan) is offline on pk-ercoss1 (VCS initiated)
2013/05/17 12:41:43 VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss1
2013/05/17 12:41:43 VCS NOTICE V-16-1-10446 Group PubLan is offline on system pk-ercoss1
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss2 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss2
2013/05/17 12:41:43 VCS INFO V-16-1-10493 Evaluating pk-ercoss1 as potential target node for group PubLan
2013/05/17 12:41:43 VCS INFO V-16-1-50010 Group PubLan is online or faulted on system pk-ercoss1
2013/05/17 12:41:43 VCS NOTICE V-16-1-10235 Restart is set for group PubLan. Group will be brought online if fault on persistent resource clears. If group is brought online anywhere else from AutoStartList or manually, then Restart will be reset
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline:(postoffline) Invoked with arg0=pk-ercoss2, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline:Executing /ericsson/core/cluster/scripts/postoffline.sh with arg0=pk-ercoss2, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline:(postoffline) Invoked with arg0=pk-ercoss1, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline:Executing /ericsson/core/cluster/scripts/postoffline.sh with arg0=pk-ercoss1, arg1=PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline.sh:PubLan:Nothing done
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss2) postoffline:Completed execution of /ericsson/core/cluster/scripts/postoffline.sh for group PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline.sh:PubLan:Nothing done
2013/05/17 12:41:44 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/postoffline pk-ercoss2 PubLan   successfully
2013/05/17 12:41:44 VCS INFO V-16-6-0 (pk-ercoss1) postoffline:Completed execution of /ericsson/core/cluster/scripts/postoffline.sh for group PubLan
2013/05/17 12:41:44 VCS INFO V-16-6-15002 (pk-ercoss1) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/postoffline pk-ercoss1 PubLan   successfully
2013/05/17 12:41:49 VCS INFO V-16-2-13075 (pk-ercoss1) Resource(snmp_ip) has reported unexpected OFFLINE 1 times, which is still within the ToleranceLimit(1).
2013/05/17 12:41:50 VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys pk-ercoss2
2013/05/17 12:41:50 VCS NOTICE V-16-1-10300 Initiating Offline of Resource stop_sybase (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:41:50 VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys pk-ercoss2
2013/05/17 12:41:50 VCS INFO V-16-6-0 (pk-ercoss2) resfault:(resfault) Invoked with arg0=pk-ercoss2, arg1=syb1_p1, arg2=ONLINE
2013/05/17 12:41:50 VCS INFO V-16-6-0 (pk-ercoss2) resfault:(resfault) Invoked with arg0=pk-ercoss2, arg1=ossfs_p1, arg2=ONLINE
2013/05/17 12:41:50 VCS INFO V-16-10001-88 (pk-ercoss2) Application:stop_sybase:offline:Executed [/ericsson/core/cluster/scripts/stop_sybase.sh stop] successfully.
2013/05/17 12:41:50 VCS INFO V-16-0 (pk-ercoss2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss2 ,arg2=syb1_p1
2013/05/17 12:41:50 VCS INFO V-16-0 (pk-ercoss2) resfault:(resfault.sh) Invoked with arg0=/ericsson/core/cluster/scripts/resfault.sh, arg1=pk-ercoss2 ,arg2=ossfs_p1
2013/05/17 12:41:50 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss2 syb1_p1 ONLINE  successfully
2013/05/17 12:41:50 VCS INFO V-16-6-15002 (pk-ercoss2) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/resfault pk-ercoss2 ossfs_p1 ONLINE  successfully
2013/05/17 12:41:53 VCS INFO V-16-1-10305 Resource stop_sybase (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:41:53 VCS NOTICE V-16-1-10300 Initiating Offline of Resource masterdataservice_BACKUP (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:00 VCS NOTICE V-16-20018-26 (pk-ercoss2) SybaseBk:masterdataservice_BACKUP:offline:Sybase Backup service masterdataservice_BACKUP has been stopped
2013/05/17 12:42:00 VCS INFO V-16-2-13716 (pk-ercoss2) Resource(masterdataservice_BACKUP): Output of the completed operation (offline)
==============================================
Password:
Backup Server: 3.48.1.1: The Backup Server will go down immediately.
Terminating sessions.
==============================================

2013/05/17 12:42:00 VCS WARNING V-16-20018-301 (pk-ercoss2) SybaseBk:masterdataservice_BACKUP:monitor:Open for backupserver failed, setting cookie to NULL
2013/05/17 12:42:00 VCS INFO V-16-1-10305 Resource masterdataservice_BACKUP (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:00 VCS NOTICE V-16-1-10300 Initiating Offline of Resource masterdataservice (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:02 VCS NOTICE V-16-20018-18 (pk-ercoss2) Sybase:masterdataservice:offline:Sybase service masterdataservice has been stopped
2013/05/17 12:42:03 VCS INFO V-16-2-13716 (pk-ercoss2) Resource(masterdataservice): Output of the completed operation (offline)
==============================================
Password:
Server SHUTDOWN by request.
ASE is terminating this process.
CT-LIBRARY error:
        ct_results(): network packet layer: internal net library error: Net-Library operation terminated due to disconnect
==============================================

2013/05/17 12:42:03 VCS WARNING V-16-20018-301 (pk-ercoss2) Sybase:masterdataservice:monitor:Open for dataserver failed, setting cookie to NULL
2013/05/17 12:42:03 VCS INFO V-16-1-10305 Resource masterdataservice (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syb1_ip (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS INFO V-16-1-10305 Resource syb1_ip (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sybmaster_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syblog_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource sybdata_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pmsyblog_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource pmsybdata_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource fmsyblog_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource fmsybdata_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource dbdumps_mount (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:03 VCS NOTICE V-16-1-10300 Initiating Offline of Resource syb1bak_ip (Owner: Unspecified, Group: Sybase1) on System pk-ercoss2
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource sybmaster_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource syblog_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource sybdata_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource pmsyblog_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)
2013/05/17 12:42:05 VCS INFO V-16-1-10305 Resource pmsybdata_mount (Owner: Unspecified, Group: Sybase1) is offline on pk-ercoss2 (VCS initiated)

===========================================================================================

message file

============================================================================================

May 17 12:41:37 pk-ercoss1 in.mpathd[6024]: [ID 594170 daemon.error] NIC failure detected on oce9 of group pub_mnic
May 17 12:41:37 pk-ercoss1 in.mpathd[6024]: [ID 832587 daemon.error] Successfully failed over from NIC oce9 to NIC oce0
May 17 12:41:38 pk-ercoss1 in.mpathd[6024]: [ID 168056 daemon.error] All Interfaces in group pub_mnic have failed
May 17 12:41:42 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss1
May 17 12:41:42 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource pub_mnic (Owner: Unspecified, Group: PubLan) is FAULTED (timed out) on sys pk-ercoss2
May 17 12:41:43 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss2
May 17 12:41:43 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group PubLan is faulted on system pk-ercoss1
May 17 12:41:50 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys pk-ercoss2
May 17 12:41:50 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys pk-ercoss2
May 17 12:41:56 pk-ercoss1 in.mpathd[6024]: [ID 299542 daemon.error] NIC repair detected on oce0 of group pub_mnic
May 17 12:41:56 pk-ercoss1 in.mpathd[6024]: [ID 620804 daemon.error] Successfully failed back to NIC oce0
May 17 12:41:56 pk-ercoss1 in.mpathd[6024]: [ID 237757 daemon.error] At least 1 interface (oce0) of group pub_mnic has repaired
May 17 12:41:57 pk-ercoss1 in.mpathd[6024]: [ID 299542 daemon.error] NIC repair detected on oce9 of group pub_mnic
May 17 12:41:57 pk-ercoss1 in.mpathd[6024]: [ID 620804 daemon.error] Successfully failed back to NIC oce9
May 17 12:42:10 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10205 Group Sybase1 is faulted on system pk-ercoss2
May 17 12:42:11 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(5) Agent is calling clean for resource(ossfs_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:11 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (pk-ercoss1) Agent is calling clean for resource(ossfs_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:11 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(5) Resource(ossfs_ip) - clean completed successfully.
May 17 12:42:12 pk-ercoss1 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,3a40@1c/pci103c,3245@0/sd@1,0 (sd9):
May 17 12:42:12 pk-ercoss1      drive offline
May 17 12:42:12 pk-ercoss1 vxdmp: [ID 480808 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 disabled path 30/0x240 belonging to the dmpnode 264/0x40 due to open failure
May 17 12:42:12 pk-ercoss1 vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 264/0x40
May 17 12:42:12 pk-ercoss1 vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x108/0x42
May 17 12:42:12 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
May 17 12:42:12 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 sybasedg: dg import with I/O fence enabled
May 17 12:42:15 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-10001-5004 IPMultiNICB:syb1_ip:online:Can not online. No interfaces available
May 17 12:42:15 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-10001-5004 (pk-ercoss1) IPMultiNICB:syb1_ip:online:Can not online. No interfaces available
May 17 12:42:19 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(5) Agent is calling clean for resource(snmp_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:19 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (pk-ercoss1) Agent is calling clean for resource(snmp_ip) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:42:19 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(5) Resource(snmp_ip) - clean completed successfully.
May 17 12:42:21 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource ossfs_p1 (Owner: Unspecified, Group: Ossfs) is FAULTED (timed out) on sys pk-ercoss1
May 17 12:42:21 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-1-10303 Resource syb1_p1 (Owner: Unspecified, Group: Sybase1) is FAULTED (timed out) on sys pk-ercoss1
May 17 12:42:50 pk-ercoss1 svc.startd[9]: [ID 652011 daemon.warning] svc:/ericsson/eric_3pp/activemq_oss_loggingbroker:default: Method "/ericsson/activemq/bin/activeMQ.sh stopActiveMqLogger" failed with exit status 1.
May 17 12:42:50 pk-ercoss1 svc.startd[9]: [ID 652011 daemon.warning] svc:/ericsson/eric_3pp/activemq:default: Method "/ericsson/activemq/bin/activeMQ.sh stopActiveMq" failed with exit status 1.
May 17 12:43:16 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13066 Thread(4) Agent is calling clean for resource(syb1_ip) because the resource is not up even after online completed.
May 17 12:43:16 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13066 (pk-ercoss1) Agent is calling clean for resource(syb1_ip) because the resource is not up even after online completed.
May 17 12:43:16 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(4) Resource(syb1_ip) - clean completed successfully.
May 17 12:43:16 pk-ercoss1 AgentFramework[5819]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13072 Thread(4) Resource(syb1_ip): Agent is retrying online (attempt number 1 of 1).
May 17 12:43:28 pk-ercoss1 vxdmp: [ID 238993 kern.notice] NOTICE: VxVM vxdmp 0 dmp_tur_temp_pgr: open failed: error = 6 dev=0x108/0x42
May 17 12:43:28 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 : dg import with I/O fence enabled
May 17 12:43:28 pk-ercoss1 vxvm:vxconfigd: [ID 702911 daemon.notice] V-5-1-11401 sybasedg: dg import with I/O fence enabled
May 17 12:43:30 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method or service exit timed out.  Killing contract 322855.
May 17 12:43:30 pk-ercoss1 svc.startd[9]: [ID 636263 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method "/etc/init.d/TBS stop" failed due to signal KILL.
May 17 12:44:31 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method or service exit timed out.  Killing contract 322861.
May 17 12:44:31 pk-ercoss1 svc.startd[9]: [ID 636263 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method "/etc/init.d/TBS stop" failed due to signal KILL.
May 17 12:44:49 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 Thread(13) Agent is calling clean for resource(tomcat) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:44:49 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13067 (pk-ercoss1) Agent is calling clean for resource(tomcat) because the resource became OFFLINE unexpectedly, on its own.
May 17 12:44:49 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(13) Resource(tomcat) - clean completed successfully.
May 17 12:44:49 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13073 Thread(13) Resource(tomcat) became OFFLINE unexpectedly on its own. Agent is restarting (attempt number 1 of 2) the resource.
May 17 12:44:49 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13073 (pk-ercoss1) Resource(tomcat) became OFFLINE unexpectedly on its own. Agent is restarting (attempt number 1 of 2) the resource.
May 17 12:45:32 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method or service exit timed out.  Killing contract 322865.
May 17 12:45:32 pk-ercoss1 svc.startd[9]: [ID 636263 daemon.warning] svc:/ericsson/eric_ep/TBS:default: Method "/etc/init.d/TBS stop" failed due to signal KILL.
May 17 12:45:32 pk-ercoss1 svc.startd[9]: [ID 748625 daemon.error] ericsson/eric_ep/TBS:default failed: transitioned to maintenance (see 'svcs -xv' for details)
May 17 12:46:18 pk-ercoss1 su: [ID 810491 auth.crit] 'su sybase' failed for sybase on /dev/???
May 17 12:47:19 pk-ercoss1 svc.startd[9]: [ID 122153 daemon.warning] svc:/ericsson/eric_3pp/glassfish:default: Method or service exit timed out.  Killing contract 540.
May 17 12:47:19 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13011 Thread(14) Resource(glassfish): offline procedure did not complete within the expected time.
May 17 12:47:19 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13063 Thread(14) Agent is calling clean for resource(glassfish) because offline did not complete within the expected time.
May 17 12:47:19 pk-ercoss1 Had[5742]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13063 (pk-ercoss1) Agent is calling clean for resource(glassfish) because offline did not complete within the expected time.
May 17 12:47:20 pk-ercoss1 svc.startd[9]: [ID 748625 daemon.error] ericsson/eric_3pp/glassfish:default failed: transitioned to maintenance (see 'svcs -xv' for details)
May 17 12:47:21 pk-ercoss1 AgentFramework[5813]: [ID 702911 daemon.notice] VCS ERROR V-16-2-13068 Thread(14) Resource(glassfish) - clean completed successfully.

======================================================

hastatus output at present

========================================================

-- SYSTEM STATE
-- System               State                Frozen              

A  pk-ercoss1           RUNNING              0                    
A  pk-ercoss2           RUNNING              0                    

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State          

B  BkupLan         pk-ercoss1           Y          N               ONLINE         
B  BkupLan         pk-ercoss2           Y          N               ONLINE         
B  DDCMon          pk-ercoss1           Y          N               ONLINE         
B  DDCMon          pk-ercoss2           Y          N               PARTIAL        
B  Oss             pk-ercoss1           Y          N               ONLINE         
B  Oss             pk-ercoss2           Y          N               OFFLINE        
B  Ossfs           pk-ercoss1           Y          N               ONLINE         
B  Ossfs           pk-ercoss2           Y          N               OFFLINE        
B  PrivLan         pk-ercoss1           Y          N               ONLINE         
B  PrivLan         pk-ercoss2           Y          N               ONLINE         
B  PubLan          pk-ercoss1           Y          N               ONLINE         
B  PubLan          pk-ercoss2           Y          N               ONLINE         
B  StorLan         pk-ercoss1           Y          N               ONLINE         
B  StorLan         pk-ercoss2           Y          N               ONLINE         
B  Sybase1         pk-ercoss1           Y          N               OFFLINE        
B  Sybase1         pk-ercoss2           Y          N               ONLINE         

============================================================================

pk-ercoss1{root} # hagrp -resources PubLan
pub_mnic
pub_p
pk-ercoss1{root} # hares -display pub_mnic
#Resource    Attribute              System     Value
pub_mnic     Group                  global     PubLan
pub_mnic     Type                   global     MultiNICB
pub_mnic     AutoStart              global     1
pub_mnic     Critical               global     1
pub_mnic     Enabled                global     1
pub_mnic     LastOnline             global     pk-ercoss2
pub_mnic     MonitorOnly            global     0
pub_mnic     ResourceOwner          global     
pub_mnic     TriggerEvent           global     0
pub_mnic     ArgListValues          pk-ercoss1 UseMpathd        1       1       MpathdCommand   1       /usr/lib/inet/in.mpathd ConfigCheck     1       1       MpathdRestart   1       1       Device  4       oce0    0       oce9    1       NetworkHosts    1       10.207.1.254    LinkTestRatio   1       1       IgnoreLinkStatus        1       1       NetworkTimeout  1       100     OnlineTestRepeatCount   1       3       OfflineTestRepeatCount  1       3       NoBroadcast     1       0       DefaultRouter   1       0.0.0.0 Failback        1       0       GroupName       1       ""      Protocol        1       IPv4
pub_mnic     ArgListValues          pk-ercoss2 UseMpathd        1       1       MpathdCommand   1       /usr/lib/inet/in.mpathd ConfigCheck     1       1       MpathdRestart   1       1       Device  4       oce0    0       oce9    1       NetworkHosts    1       10.207.1.254    LinkTestRatio   1       1       IgnoreLinkStatus        1       1       NetworkTimeout  1       100     OnlineTestRepeatCount   1       3       OfflineTestRepeatCount  1       3       NoBroadcast     1       0       DefaultRouter   1       0.0.0.0 Failback        1       0       GroupName       1       ""      Protocol        1       IPv4
pub_mnic     ConfidenceLevel        pk-ercoss1 0
pub_mnic     ConfidenceLevel        pk-ercoss2 0
pub_mnic     ConfidenceMsg          pk-ercoss1
pub_mnic     ConfidenceMsg          pk-ercoss2
pub_mnic     Flags                  pk-ercoss1
pub_mnic     Flags                  pk-ercoss2
pub_mnic     IState                 pk-ercoss1 not waiting
pub_mnic     IState                 pk-ercoss2 not waiting
pub_mnic     MonitorMethod          pk-ercoss1 Traditional
pub_mnic     MonitorMethod          pk-ercoss2 Traditional
pub_mnic     Probed                 pk-ercoss1 1
pub_mnic     Probed                 pk-ercoss2 1
pub_mnic     Start                  pk-ercoss1 0
pub_mnic     Start                  pk-ercoss2 0
pub_mnic     State                  pk-ercoss1 ONLINE
pub_mnic     State                  pk-ercoss2 ONLINE
pub_mnic     ComputeStats           global     0
pub_mnic     ConfigCheck            global     1
pub_mnic     DefaultRouter          global     0.0.0.0
pub_mnic     Failback               global     0
pub_mnic     GroupName              global     
pub_mnic     IgnoreLinkStatus       global     1
pub_mnic     LinkTestRatio          global     1
pub_mnic     MpathdCommand          global     /usr/lib/inet/in.mpathd
pub_mnic     MpathdRestart          global     1
pub_mnic     NetworkHosts           global     10.207.1.254
pub_mnic     NetworkTimeout         global     100
pub_mnic     NoBroadcast            global     0
pub_mnic     OfflineTestRepeatCount global     3
pub_mnic     OnlineTestRepeatCount  global     3
pub_mnic     Protocol               global     IPv4
pub_mnic     TriggerResStateChange  global     0
pub_mnic     UseMpathd              global     1
pub_mnic     ContainerInfo          pk-ercoss1 Type             Name            Enabled
pub_mnic     ContainerInfo          pk-ercoss2 Type             Name            Enabled
pub_mnic     Device                 pk-ercoss1 oce0     0       oce9    1
pub_mnic     Device                 pk-ercoss2 oce0     0       oce9    1
pub_mnic     MonitorTimeStats       pk-ercoss1 Avg      0       TS      
pub_mnic     MonitorTimeStats       pk-ercoss2 Avg      0       TS      
pub_mnic     ResourceInfo           pk-ercoss1 State    Valid   Msg             TS      
pub_mnic     ResourceInfo           pk-ercoss2 State    Valid   Msg             TS      
pk-ercoss1{root} # hares -display pub_p
#Resource    Attribute             System     Value
pub_p        Group                 global     PubLan
pub_p        Type                  global     Phantom
pub_p        AutoStart             global     1
pub_p        Critical              global     1
pub_p        Enabled               global     1
pub_p        LastOnline            global     pk-ercoss1
pub_p        MonitorOnly           global     0
pub_p        ResourceOwner         global     
pub_p        TriggerEvent          global     0
pub_p        ArgListValues         pk-ercoss1 ""
pub_p        ArgListValues         pk-ercoss2 ""
pub_p        ConfidenceLevel       pk-ercoss1 100
pub_p        ConfidenceLevel       pk-ercoss2 100
pub_p        ConfidenceMsg         pk-ercoss1
pub_p        ConfidenceMsg         pk-ercoss2
pub_p        Flags                 pk-ercoss1
pub_p        Flags                 pk-ercoss2
pub_p        IState                pk-ercoss1 not waiting
pub_p        IState                pk-ercoss2 not waiting
pub_p        MonitorMethod         pk-ercoss1 Traditional
pub_p        MonitorMethod         pk-ercoss2 Traditional
pub_p        Probed                pk-ercoss1 1
pub_p        Probed                pk-ercoss2 1
pub_p        Start                 pk-ercoss1 1
pub_p        Start                 pk-ercoss2 1
pub_p        State                 pk-ercoss1 ONLINE
pub_p        State                 pk-ercoss2 ONLINE
pub_p        ComputeStats          global     0
pub_p        TriggerResStateChange global     0
pub_p        ContainerInfo         pk-ercoss1 Type              Name            Enabled
pub_p        ContainerInfo         pk-ercoss2 Type              Name            Enabled
pub_p        MonitorTimeStats      pk-ercoss1 Avg       0       TS      
pub_p        MonitorTimeStats      pk-ercoss2 Avg       0       TS      
pub_p        ResourceInfo          pk-ercoss1 State     Valid   Msg             TS      
pub_p        ResourceInfo          pk-ercoss2 State     Valid   Msg             TS
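
When digging into why the groups did not fail over, a few display commands are the usual starting point (a sketch using the group and resource names from the logs above):

hagrp -display Sybase1 -attribute AutoFailOver SystemList   # is failover enabled, and to where?
hares -display pub_mnic -attribute Critical                 # non-critical resources do not trigger failover
hagrp -display PubLan -attribute Parallel                   # PubLan holds only MultiNICB/Phantom resources, so it may be a parallel group
tail -200 /var/VRTSvcs/log/engine_A.log                     # the engine log records each failover decision

Note that pub_mnic faulted on both nodes at almost the same moment (in.mpathd lost all interfaces in the group on each system), and a group cannot fail over to a node where it is already faulted, which matches the "online or faulted on system" lines in the engine log above.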
 

Can LLT heartbeats communicate between NICs with different device names?

I need a solution

We have a 2-node VCS cluster; the heartbeat NICs are eth2 and eth3 on each node.

If eth2 on node1 is down and eth3 on node2 is down, does this mean both heartbeat links are down and the cluster is in a split-brain situation?

Can LLT heartbeats communicate between NIC eth2 and NIC eth3?

The VCS Installation Guide requires the two heartbeat links to be on different networks, so we should put eth2 of both nodes in one VLAN (VLAN1) and eth3 of both nodes in another VLAN (VLAN2). In that situation, heartbeats cannot pass between eth2 and eth3.

But in a production cluster system, we found that the four NICs (eth2 and eth3 on both nodes) are all in the same VLAN, which led me to post this discussion thread and ask:

If eth2 on node1 is down and eth3 on node2 is down, what will happen to the cluster (which is in active-standby mode)?
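
A sketch of how to see what LLT itself thinks of each link; lltstat reports per-node, per-link state, so it shows directly whether traffic is passing in your single-VLAN setup:

lltstat -nvv | more     # per-node status of every LLT link (UP/DOWN per link)
lltstat -c              # the LLT configuration currently in effect

I would not state definitively whether LLT will pair node1's eth3 with node2's eth2 across links, but lltstat -nvv answers it empirically: as long as each node still shows the peer UP on at least one link, the cluster is not in split brain.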

Thanks!

 

Unable to extract VRTS_SF_HA media

I need a solution

I recently downloaded the Veritas Cluster media for learning/testing purposes.

The issue is when I try to extract the media:

gtar -xvzf  VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz

It started to extract, but after extracting 865 MB of data it returned this error:

./dvd1-sol_sparc/readme_first.txt

gzip: stdin: invalid compressed data--crc error
/usr/sfw/bin/gtar: Child returned status 1
/usr/sfw/bin/gtar: Error is not recoverable: exiting now
bash-3.2# du -sh dvd1-sol_sparc/
 865M   dvd1-sol_sparc

then I tried

bash-3.2# gunzip -d VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz

gunzip: VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz: invalid compressed data--crc error

then

gzip -d VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz | tar xvf -

gzip: VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz: invalid compressed data--crc error
tar: blocksize = 0

I also tried WinRAR, 7-Zip, etc.
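
The CRC error usually points to a corrupted download rather than an extraction problem. A quick integrity-check sketch (the checksum to compare against is whatever the download page publishes, if one is provided):

gzip -t VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz         # test the compressed stream end to end
digest -a md5 VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz   # Solaris; compare with the published checksum
ls -l VRTS_SF_HA_Solutions_6.0_Solaris_SPARC.tar.gz           # compare the byte count with the stated download size

If either check fails, re-download the archive, ideally with a client that can resume and verify the transfer.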

 

 

Thanks in advance, gurus.
 

 

Owais Hyder.

Can a customer change the subject of VCS alert mail?

I need a solution

VCS 5.1SP1 on Red Hat.

The customer configured the notifier with the wizard and gets the following VCS alert mail:

 

From: VCS_ALert@nodename
Sent: Tuesday, July xx, 2012 XXXX
Subject: VCS Warning for Resource resourcename, Resource went online by itself
 
 
 
Event Time: XXX
Entity Name: resourcename
Entity Type: Resource
Entity Subtype: BBB
Entity State: Resource went online by itself
Traps Origin: Veritas_Cluster_Server
System Name: hostname
 Entities Container Name: AAA
Entities Container Type: Service Group
 
We would like to know whether the mail subject and sender can be modified by the customer.
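
The notifier's subject format is fixed, but VCS event triggers give full control over outgoing mail. A minimal sketch of a resstatechange trigger that sends its own message (the recipient address is a placeholder, the argument order is per the 5.1 bundled sample, and the trigger must be enabled with the TriggerResStateChange attribute):

#!/bin/sh
# /opt/VRTSvcs/bin/triggers/resstatechange
# args (5.1): system resource previous_state new_state
SYS=$1; RES=$2; PREV=$3; NEW=$4
echo "Resource $RES on $SYS changed from $PREV to $NEW" | \
    mailx -s "Custom subject: $RES is $NEW" ops@example.com   # ops@example.com is a placeholder

The From: address is set by the mail transport on the node, so changing the sender is a mailer configuration question rather than a VCS one.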

Change VCS LLT connection

I need a solution

Hello

  I need to change the NIC ports used for the VCS heartbeat. Normally I use vi to edit /etc/llttab and reboot the nodes.

Is that the correct way to change the LLT connection?
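
Editing /etc/llttab is the right idea, but a reboot is not strictly required; the stack can be cycled from the command line. A sketch for a two-node cluster (if fencing or other GAB clients are configured, stop them first):

hastop -local -force    # stop HAD but leave applications running (or hastop -all)
gabconfig -U            # unconfigure GAB
lltconfig -U            # unconfigure LLT
vi /etc/llttab          # point the link directives at the new NICs
lltconfig -c            # start LLT with the new configuration
gabconfig -c -n 2       # seed GAB for a two-node cluster
hastart                 # restart HAD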

Thank you.

VCS on Windows 2008R2: Blue Screen

I do not need a solution (just sharing information)

Hello,

I'm currently creating a VCS cluster (without SFW for disk management) and I get blue screens when I mount disk resources in the cluster. In VCS with the Windows LDM, you need to use DiskRes and Mount resources in the cluster.

When I bring the DiskRes resource online, all is OK (it works), but when I try to bring the Mount resource online, I get a blue screen: STOP 0xBADFFFFF.

I found a solution on this site: http://www.experts-exchange.com/OS/Microsoft_Operating_Systems/Server/Windows_Server_2008/Q_27781502.html. MPIO must be configured with the Failover policy, not the Round Robin policy. In Disk Administrator, click on the disk (not the partition) and choose Properties. Go to the MPIO tab and change the policy to Failover Only.

 

VCS cluster with VxVM: Solaris node not booting

I need a solution

Hi Support

We have a two-node VCS cluster. One node is down and unable to boot, so we brought the cluster up locally on the other node. The node that is not booting gives the following errors (I have included the last few lines only). The OS is Solaris 9 and the platform is an SF4900.

=============================================================

Rebooting with command: boot -s

Copyright 1983-2003 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.

hba0: QLogic QLA2300 Fibre Channel Host Adapter fcode version 2.00.05 01/29/03
hba0: Firmware v3.3.12 (ipx)
QLogic Fibre Channel Driver v4.20 Instance: 1
hba1: QLogic QLA2300 Fibre Channel Host Adapter fcode version 2.00.05 01/29/03
hba1: Firmware v3.3.12 (ipx)
Hardware watchdog enabled
VxVM sysboot INFO V-5-2-3390 Starting restore daemon...
VxVM sysboot INFO V-5-2-3409 starting in boot mode...
NOTICE: VxVM vxdmp V-5-0-34 added disk array DISKS, datype = Disk

VxVM vxconfigd ERROR V-5-1-0 Segmentation violation
/etc/rcS.d/S25vxvm-sysboot: egettxt: not found

VxVM sysboot NOTICE V-5-2-3388 Halting system...
syncing file systems... done
NOTICE:

==========================================================
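
The vxconfigd segmentation violation is what halts the boot, and since the failure happens in the rcS scripts even single-user boot stops there. One standard way to get the node up for diagnosis is to keep VxVM from starting; a sketch, applicable only if the root disk is not encapsulated (an encapsulated root needs a different recovery path), and the root slice device name below is a placeholder:

ok boot cdrom -s                                   # boot from media, since rcS halts on the local disk
mount /dev/dsk/c0t0d0s0 /mnt                       # mount the root slice (placeholder device)
touch /mnt/etc/vx/reconfig.d/state.d/install-db    # marker that tells VxVM startup scripts to skip vxconfigd
umount /mnt
reboot                                             # collect /var/adm/messages, then remove the marker file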
 

with thanks & regards

Arup

 

Veritas Cluster - engine_A.log

I need a solution

Getting the following error in engine_A.log:

 

2013/05/31 14:16:12 VCS INFO V-16-1-10307 Resource cvmvoldg5 (Owner: unknown, Group: lic_DG) is offline on dwlemm2b (Not initiated by VCS)

2013/05/31 14:16:14 VCS INFO V-16-6-15004 (dwlemm2b) hatrigger:Failed to send trigger for resfault; script doesn't exist

 

2013/05/31 14:16:15 VCS NOTICE V-16-10001-5510 (dwlemm2b) CFSMount:cfsmount5:offline:Attempting fuser TERM : Mount Point : /var/opt/sentinel
2013/05/31 14:16:18 VCS NOTICE V-16-10001-5510 (dwlemm2b) CFSMount:cfsmount2:offline:Attempting fuser TERM : Mount Point : /var/opt/BGw/ServerGroup1
2013/05/31 15:11:51 VCS INFO V-16-1-10307 Resource fmm (Owner: unknown, Group: FMMgrp) is offline on dwlemm2a (Not initiated by VCS)
 
and many more similar errors. Does anyone know the cause of any of these errors?
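
The V-16-6-15004 line simply means the resfault event fired but no trigger script is installed at the standard location; a sketch of how to confirm:

ls -l /opt/VRTSvcs/bin/triggers/      # resfault should be listed here if you want the trigger to run
# if you do not use a resfault trigger, the INFO message is harmless; otherwise
# copy a script into place, e.g. starting from /opt/VRTSvcs/bin/sample_triggers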

 

SQL Server Install on VCS

I need a solution

Hi All,

 

Can you please help me with a guide or a step-by-step document to install SQL Server 2005 and 2008 in a Veritas cluster environment? Since the start of my career I have worked only with MCS.

I am very new to this environment and want to know the install method and cluster configuration.

 

thanks ,

 

Shiv

VCS 6.0.1 Linux - how is CPU usage calculated?

I need a solution

Hello All,

Maybe this has a known answer already, but I tried to find out how the VCS 6.0.1 HostMonitor calculates CPU usage on Linux hosts and didn't find one.
Can somebody answer this question?
I would like to know where it collects the CPU data from and whether it is an average CPU usage. (It looks like the collected information is wrong.)
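
I don't know HostMonitor's exact implementation, but the usual source on Linux is the aggregate "cpu" line of /proc/stat, sampled twice, so the figure is an average across all CPUs over the sample interval. A minimal sketch of that method (busy = user + nice + system; iowait and irq time are ignored here for brevity, which is an assumption about what counts as busy):

#!/bin/bash
read -r _ u n s i _ < /proc/stat   # first sample: user nice system idle from the aggregate line
b1=$((u + n + s)); t1=$((b1 + i))
sleep 5
read -r _ u n s i _ < /proc/stat   # second sample
b2=$((u + n + s)); t2=$((b2 + i))
echo "CPU busy over 5s: $(( (b2 - b1) * 100 / (t2 - t1) ))%"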

Thank you very much.

Iv4n

 

Failover between two nodes

I need a solution

Hi

 

My service group is now online on node 1; the group type is failover, and the resources are mounted inside the group.

 

So if node 1 goes down for any reason, the resources should go to node 2. Is this high availability?
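
Yes, that is the failover behavior: if node 1 crashes, VCS brings the group online on node 2 automatically, assuming node 2 is in the group's SystemList and AutoFailOver is enabled. You can rehearse it without crashing anything using a controlled switch; the group and node names below are placeholders:

hagrp -switch mySG -to node2   # controlled move of the group to the other node
hastatus -sum                  # watch the group go offline on node 1 and online on node 2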

 

 

 

Regards

 

 


VCS Global Clustering - WAC Error V-16-1-10543 IpmServer::open Cannot create socket errno = 97

I need a solution

Hi,

 

I have a customer who has two VCS clusters running on RHEL 5.6 servers. These clusters are further protected by site failover using GCO (Global Cluster Option). All was working fine since installation, with remote cluster operations showing up on the local cluster and so on. But then this error started to appear in the wac_A.log file:

VCS WARNING V-16-1-10543 IpmServer::open Cannot create socket errno = 97

Since then, the cluster does not see the remote cluster's state, but it can ping it, as seen in the hastatus output below:

 

site-ab04# hastatus -sum

-- SYSTEM STATE
-- System               State                Frozen

A  site-ab04       RUNNING              0

-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State

B  ClusterService  site-ab04       Y          N               ONLINE
B  SG_commonsg     site-ab04       Y          N               ONLINE
B  SG_site-b04g3 site-ab04       Y          N               OFFLINE
B  SG_site-b04g4 site-ab04       Y          N               OFFLINE
B  SG_site-a04g0 site-ab04       Y          N               OFFLINE
B  SG_site-a04g1 site-ab04       Y          N               OFFLINE
B  SG_site-a04g2 site-ab04       Y          N               OFFLINE
B  vxfen           site-ab04       Y          N               ONLINE

-- WAN HEARTBEAT STATE
-- Heartbeat       To                   State

M  Icmp            site-b04c       ALIVE

-- REMOTE CLUSTER STATE
-- Cluster         State

N  site-b04c  INIT

 

Does any one have any ideas? Networking all seems to be in order.
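
One small step that may help is decoding what errno 97 means on this platform. A sketch; on Linux it maps to EAFNOSUPPORT (address family not supported), which would be consistent with wac trying to open a socket for an address family, such as IPv6, that is disabled on the host, though that interpretation is an assumption rather than a confirmed diagnosis:

python -c 'import errno, os; print errno.errorcode[97], os.strerror(97)'
# on RHEL this prints: EAFNOSUPPORT Address family not supported by protocol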

 

Thanks,

 

Rich

VCS Error V-16-10011-350

I need a solution

Environment

AIX= 6.1 TL8 SP2

HA/VCS = 6.0.1

Cluster Nodes = 2

 

We just updated from VCS 5.1 to 6.0.1 and receive the following error when we test failover (in either direction):

 

12:11:15               V-16-10011-350 (clustertest0) Application:lab_app:online:Execution failed with return value [110]

 

This is followed by:

 

12:11:15               V-16-10011-260 (clustertest0) Application:lab_app:online:Execution of start Program (/etc/clu_app.x) returned (1)

12:11::20              V-16-1-10298 Resource lab_app (Owner Unspecified, Group: LABSG) is online on clustertest0 (VCS initiated)

12:11::20              V-16-1-10447 Group LABSG is online on clustertest0

 

The application comes up and is running after failover, but we are wondering what is causing the error so we can remedy the issue.
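
The V-16-10011-260 line shows the configured StartProgram (/etc/clu_app.x) exiting with status 1; the agent logs an error whenever the start program returns non-zero, even though the subsequent monitor finds the application running, which matches what you see. A hypothetical wrapper sketch that reports success to VCS based on an independent liveness check instead (the wrapper path and process pattern are placeholders):

#!/bin/sh
# /etc/clu_app_wrapper.sh - hypothetical StartProgram wrapper
/etc/clu_app.x start                    # run the real start script, ignoring its exit code
sleep 5                                 # give the application time to come up
if ps -ef | grep -q '[c]lu_app'; then   # bracket trick keeps grep from matching itself
    exit 0                              # report success only if the process is actually running
fi
exit 1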

Storage Foundation Keyless issue

I need a solution

Hello

   I have installed SFHA 6.0 for Windows 2008 with the keyless option. However, after I applied the permanent key, the system still shows this error in the event viewer:

The keyless license option does not entitle this system to run Veritas Storage Foundation for Windows, Veritas Storage Foundation and High Availability for Windows, or Veritas Dynamic Multi-Pathing for Windows. The product is no longer in licensing compliance even though it is still functioning properly. Symantec requires that you perform any of the following tasks to resolve the license compliance violation: -Add this system as a managed host to a Veritas Operations Manager (VOM) Management Server. or -Add an appropriate and valid license key on this system using the Symantec product installer from Windows Add/Remove Programs.
 
For more information, click the following link: http://entsupport.symantec.com/umi/V-76-58642-7026

   I searched Google and found that the keyless option should be disabled.

But I found that the vxkeyless utility does not exist on the Windows server.

 

Thank you.

Issue with RegRep (Registry Replication Keys) resource in VCS 6.0 for SQL2008R2

I need a solution

Hello,

I'm working on a VCS 6.0.1 cluster under Windows 2008R2 with SQL2008R2. I cannot use the SQL wizard to configure service groups for SQL2008, because I have some issues with the shared disk (a call to Symantec support is in progress).

So I am trying to configure the SG manually, but I have an issue with the RegRep resource; I cannot find solid information on configuring it. At the moment my main.cf contains this:

    RegRep SQLInst1_SQL_RegRep (
        MountResName = SQLInst1_RegRep_Mount
        Keys = { "HKLM\\Software\\VERITAS\\" = "",
             "HKLM\\Software\\Microsoft\\Microsoft SQL Server" = "",
             "HKLM\\Software\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQLINST1" = "" }
        )

But I think this is not sufficient, because I have many problems with failover of the service groups.

Does someone have a cluster configured with the wizard? Can anyone share the complete definition of a service group for SQL2008? Thank you.

 

Note : When you use Wizard, you obtain this :

    RegRep SG_SQL03-RegRep-MSSQL (
        MountResName = SG_SQL03-Mount
        ReplicationDirectory = "\\RegRep\\SG_SQL03-RegRep-MSSQL"
        Keys = {
             "HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\MSSQLServer" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_MSSQLServer.reg",
             "HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\PROVIDERS" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_PROVIDERS.reg",
             "HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\Replication" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_Replication.reg",
             "HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\SQLServerAgent" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_SQLServerAgent.reg",
             "HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\SQLServerSCP" = "SaveRestoreFile:SG_SQL03-RegRep-MSSQL_SQLServerSCP.reg" }
        ExcludeKeys = {
             "HKLM\\SOFTWARE\\Microsoft\\Microsoft SQL Server\\MSSQL10_50.SQL03\\MSSQLServer\\CurrentVersion" }
        )
 

vxfen module causes SLES 11 SP1 kernel panic

I need a solution

Hi all: 

I use three shared disks on iSCSI (target using LIO) as the I/O fencing disks in VCS 6.0 on SLES 11 SP1. After configuration, the kernel panics when starting vxfen. The log is here:

 

[   83.640266] BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
[   83.641248] IP: [<ffffffff81396113>] down_read+0x3/0x10
[   83.641933] PGD 37b62067 PUD 3c7b2067 PMD 0 
[   83.642512] Oops: 0002 [#1] SMP 
[   83.643016] last sysfs file: /sys/devices/platform/host2/iscsi_host/host2/initiatorname
[   83.644009] CPU 0 
[   83.644054] Modules linked in: vxfen(PN) dmpalua(PN) vxspec(PN) vxio(PN) vxdmp(PN) snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc gab(PN) ipv6 crc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi af_packet llt(PN) amf(PN) microcode fuse loop fdd(PN) exportfs vxportal(PN) vxfs(PN) dm_mod virtio_blk virtio_balloon virtio_net sg rtc_cmos rtc_core rtc_lib tpm_tis button tpm tpm_bios floppy i2c_piix4 virtio_pci virtio_ring pcspkr i2c_core virtio uhci_hcd ehci_hcd sd_mod crc_t10dif usbcore edd ext3 mbcache jbd fan processor ide_pci_generic piix ide_core ata_generic ata_piix libata scsi_mod thermal thermal_sys hwmon
[   83.644054] Supported: Yes, External
[   83.644054] Pid: 4670, comm: vxfen Tainted: P             2.6.32.12-0.7-default #1 Bochs
[   83.644054] RIP: 0010:[<ffffffff81396113>]  [<ffffffff81396113>] down_read+0x3/0x10
[   83.644054] RSP: 0018:ffff88002e831638  EFLAGS: 00010286
[   83.644054] RAX: 0000000000000060 RBX: 0000000000000000 RCX: ffff88003c4e2480
[   83.644054] RDX: 0000000000000001 RSI: 0000000000002000 RDI: 0000000000000060
[   83.644054] RBP: ffff88002c9aa000 R08: 0000000000000000 R09: ffff88003c4e2480
[   83.644054] R10: ffff88003d6da140 R11: 00000000000000d0 R12: 0000000000000060
[   83.644054] R13: ffff88002c9a8000 R14: 0000000000000000 R15: ffff88002c9a8001
[   83.644054] FS:  0000000000000000(0000) GS:ffff880006200000(0000) knlGS:0000000000000000
[   83.644054] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[   83.644054] CR2: 0000000000000060 CR3: 000000003c72a000 CR4: 00000000000006f0
[   83.644054] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   83.644054] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   83.644054] Process vxfen (pid: 4670, threadinfo ffff88002e830000, task ffff88002c8e8140)
[   83.644054] Stack:
[   83.644054]  ffffffff810325ab 0000000000000000 0000001000000000 ffff88003ca9d690
[   83.644054] <0> ffff88003c4e2480 000000013d0181c0 0000000000000002 0000000000000002
[   83.644054] <0> 0000000000000002 0000000000001010 0000000000000000 0000000000000000
[   83.644054] Call Trace:
[   83.644054]  [<ffffffff810325ab>] get_user_pages_fast+0x11b/0x1a0
[   83.644054]  [<ffffffff81127c27>] __bio_map_user_iov+0x167/0x2a0
[   83.644054]  [<ffffffff81127d69>] bio_map_user_iov+0x9/0x30
[   83.644054]  [<ffffffff81127dac>] bio_map_user+0x1c/0x30
[   83.644054]  [<ffffffff811bab91>] __blk_rq_map_user+0x111/0x140
[   83.644054]  [<ffffffff811bacc5>] blk_rq_map_user+0x105/0x190
[   83.644054]  [<ffffffff811bebe7>] sg_io+0x3c7/0x3e0
[   83.644054]  [<ffffffff811bf1ec>] scsi_cmd_ioctl+0x2ac/0x470
[   83.644054]  [<ffffffffa01b75b1>] sd_ioctl+0xa1/0x120 [sd_mod]
[   83.644054]  [<ffffffffa0ccaa93>] vxfen_ioctl_by_bdev+0xc3/0xd0 [vxfen]
[   83.644054]  [<ffffffffa0ccb6ac>] vxfen_ioc_kernel_scsi_ioctl+0xec/0x3c0 [vxfen]
[   83.644054]  [<ffffffffa0ccbe1f>] vxfen_lnx_pgr_in+0xff/0x380 [vxfen]
[   83.644054]  [<ffffffffa0ccc11a>] vxfen_plat_pgr_in+0x7a/0x1c0 [vxfen]
[   83.644054]  [<ffffffffa0cd26c3>] vxfen_readkeys+0xa3/0x380 [vxfen]
[   83.644054]  [<ffffffffa0cd3514>] vxfen_membreg+0x84/0xae0 [vxfen]
[   83.644054]  [<ffffffffa0cce6d6>] vxfen_preexist_split_brain_scsi3+0x96/0x2d0 [vxfen]
[   83.644054]  [<ffffffffa0ccf96d>] vxfen_reg_coord_disk+0x7d/0x660 [vxfen]
[   83.644054]  [<ffffffffa0ca5e0b>] vxfen_reg_coord_pt+0xfb/0x250 [vxfen]
[   83.644054]  [<ffffffffa0cb849c>] vxfen_handle_local_config_done+0x14c/0x8d0 [vxfen]
[   83.644054]  [<ffffffffa0cbad57>] vxfen_vrfsm_cback+0xad7/0x17b0 [vxfen]
[   83.644054]  [<ffffffffa0cd5b20>] vrfsm_step+0x1b0/0x3b0 [vxfen]
[   83.644054]  [<ffffffffa0cd7e1c>] vrfsm_recv_thread+0x32c/0x970 [vxfen]
[   83.644054]  [<ffffffffa0cd85b4>] vxplat_lx_thread_base+0xa4/0x100 [vxfen]
[   83.644054]  [<ffffffff81003fba>] child_rip+0xa/0x20
[   83.644054] Code: 48 85 f6 74 0f 48 89 e7 e8 5b 09 cd ff 85 c0 48 63 d0 7e 07 48 c7 c2 fc fd ff ff 48 83 c4 78 48 89 d0 5b 5d c3 00 00 00 48 89 f8 <3e> 48 ff 00 79 05 e8 52 fa e4 ff c3 90 48 89 f8 48 ba 01 00 00 
[   83.644054] RIP  [<ffffffff81396113>] down_read+0x3/0x10
[   83.644054]  RSP <ffff88002e831638>
[   83.644054] CR2: 0000000000000060
 
 
vcs1:~ # vxdisk  list
DEVICE       TYPE            DISK         GROUP        STATUS
aluadisk0_1  auto:cdsdisk    -            -            online
aluadisk0_2  auto:cdsdisk    -            -            online
aluadisk0_3  auto:cdsdisk    -            -            online
sda          auto:none       -            -            online invalid
vda          simple          vda          data_dg      online
vdb          simple          vdb          data_dg      online
 
vcs1:~ # cat /proc/partitions 
major minor  #blocks  name
 
   8        0    8388608 sda
   8        1    7333641 sda1
   8        2    1052257 sda2
 253        0   10485760 vda
 253       16    4194304 vdb
   8       16    1048576 sdb
   8       19    1046528 sdb3
   8       24    1046528 sdb8
   8       32    1048576 sdc
   8       35    1046528 sdc3
   8       40    1046528 sdc8
   8       48    1048576 sdd
   8       51    1046528 sdd3
   8       56    1046528 sdd8
 201        0    8388608 VxDMP1
 201        1    7333641 VxDMP1p1
 201        2    1052257 VxDMP1p2
 201       16    1048576 VxDMP2
 201       19    1046528 VxDMP2p3
 201       24    1046528 VxDMP2p8
 201       32    1048576 VxDMP3
 201       35    1046528 VxDMP3p3
 201       40    1046528 VxDMP3p8
 201       48    1048576 VxDMP4
 201       51    1046528 VxDMP4p3
 201       56    1046528 VxDMP4p8
 199     6000    8388608 VxVM6000
 199     6001     153600 VxVM6001
 
 
 
LIO Target server message:
 
 
[60529.780169] br0: port 3(vnet2) entered forwarding state
[60675.372132] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.372217] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.373337] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.373426] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.374304] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
[60675.374374] TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x24, sending CHECK_CONDITION.
 
 

All tests have passed when run with:

vcs1:~ # vxfentsthdw -m 
 
Veritas vxfentsthdw version 6.0.000.000-GA Linux
 
 
The utility vxfentsthdw works on the two nodes of the cluster.
The utility verifies that the shared storage one intends to use is
configured to support I/O fencing.  It issues a series of vxfenadm
commands to setup SCSI-3 registrations on the disk, verifies the
registrations on the disk, and removes the registrations from the disk.
 
 
******** WARNING!!!!!!!! ********
 
THIS UTILITY WILL DESTROY THE DATA ON THE DISK!! 
 
Do you still want to continue : [y/n] (default: n) y
The logfile generated for vxfentsthdw is /var/VRTSvcs/log/vxfen/vxfentsthdw.log.9431
 
Enter the first node of the cluster:
vcs1
Enter the second node of the cluster:
vcs2
 
Enter the disk name to be checked for SCSI-3 PGR on node vcs1 in the format:
for dmp: /dev/vx/rdmp/sdx
for raw: /dev/sdx
Make sure it is the same disk as seen by nodes vcs1 and vcs2
/dev/sdb
 
Enter the disk name to be checked for SCSI-3 PGR on node vcs2 in the format:
for dmp: /dev/vx/rdmp/sdx
for raw: /dev/sdx
Make sure it is the same disk as seen by nodes vcs1 and vcs2
/dev/sdb
 
***************************************************************************
 
Testing vcs1 /dev/sdb vcs2 /dev/sdb
 
Evaluate the disk before testing  ........................ No Pre-existing keys
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs2 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs2 .................... Passed
Unregister keys on disk /dev/sdb from node vcs1 ........................ Passed
Verify registrations for disk /dev/sdb on node vcs2 .................... Passed
Unregister keys on disk /dev/sdb from node vcs2 ........................ Passed
Check to verify there are no keys from node vcs1 ....................... Passed
Check to verify there are no keys from node vcs2 ....................... Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Read from disk /dev/sdb on node vcs1 ................................... Passed
Write to disk /dev/sdb from node vcs1 .................................. Passed
Read from disk /dev/sdb on node vcs2 ................................... Passed
Write to disk /dev/sdb from node vcs2 .................................. Passed
Reserve disk /dev/sdb from node vcs1 ................................... Passed
Verify reservation for disk /dev/sdb on node vcs1 ...................... Passed
Read from disk /dev/sdb on node vcs1 ................................... Passed
Read from disk /dev/sdb on node vcs2 ................................... Passed
Write to disk /dev/sdb from node vcs1 .................................. Passed
Expect no writes for disk /dev/sdb on node vcs2 ........................ Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs2 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Verify registrations for disk /dev/sdb on node vcs2 .................... Passed
Write to disk /dev/sdb from node vcs1 .................................. Passed
Write to disk /dev/sdb from node vcs2 .................................. Passed
Preempt and abort key KeyA using key KeyB on node vcs2 ................. Passed
Test to see if I/O on node vcs1 terminated ............................. Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Preempt key KeyC using key KeyB on node vcs2 ........................... Passed
Test to see if I/O on node vcs1 terminated ............................. Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Verify registrations for disk /dev/sdb on node vcs2 .................... Passed
Verify reservation for disk /dev/sdb on node vcs1 ...................... Passed
Verify reservation for disk /dev/sdb on node vcs2 ...................... Passed
Remove key KeyB on node vcs2 ........................................... Passed
Check to verify there are no keys from node vcs1 ....................... Passed
Check to verify there are no keys from node vcs2 ....................... Passed
Check to verify there are no reservations on disk /dev/sdb from node vcs1  Passed
Check to verify there are no reservations on disk /dev/sdb from node vcs2  Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
RegisterIgnoreKeys on disk /dev/sdb from node vcs1 ..................... Passed
Verify registrations for disk /dev/sdb on node vcs1 .................... Passed
Clear PGR on node vcs1 ................................................. Passed
Check to verify there are no keys from node vcs1 ....................... Passed
 
ALL tests on the disk /dev/sdb have PASSED.
The disk is now ready to be configured for I/O Fencing on node vcs1.
 
ALL tests on the disk /dev/sdb have PASSED.
The disk is now ready to be configured for I/O Fencing on node vcs2.
 
Removing test keys and temporary files, if any...
 
 

 

 

 

 
