Hi,
I am following the Veritas Cluster Server Administrator's Guide for Linux and trying to trigger the resnotoff script. From the documentation, my understanding is that if a resource faults and the clean command returns 1, resnotoff should be triggered.
To begin, my service group is in an ONLINE state:
[root@node1 ~]# hastatus -sum | grep test
B Grp_CS_c1_testservice node1 Y N ONLINE
B Grp_CS_c1_testservice node2 Y N ONLINE
I have CleanRetryLimit set to 1 and the clean script (CleanProgram) set to /bin/false to force it to return an error exit code:
Res_App_c1_fmmed1_testapplication ArgListValues node1 User 1 root StartProgram 1 "/usr/share/litp/vcs_lsb_start vmservice 5" StopProgram 1 "/usr/share/litp/vcs_lsb_stop vmservice 5" CleanProgram 1 /bin/false MonitorProgram 1 "/usr/share/litp/vcs_lsb_status vmservice" PidFiles 0 MonitorProcesses 0 EnvFile 1 "" UseSUDash 1 0 State 1 2 IState 1 0
Res_App_c1_fmmed1_testapplication ArgListValues node2 User 1 root StartProgram 1 "/usr/share/litp/vcs_lsb_start vmservice 5" StopProgram 1 "/usr/share/litp/vcs_lsb_stop vmservice 5" CleanProgram 1 /bin/false MonitorProgram 1 "/usr/share/litp/vcs_lsb_status vmservice" PidFiles 0 MonitorProcesses 0 EnvFile 1 "" UseSUDash 1 0 State 1 2 IState 1 0
Res_App_c1_fmmed1_testapplication CleanProgram global /bin/false
Res_App_c1_fmmed1_testapplication CleanRetryLimit global 1
The resnotoff trigger is enabled for this resource:
Res_App_c1_fmmed1_testapplication TriggersEnabled global RESNOTOFF
Now I manually kill the service Grp_CS_c1_testservice on node1 and see the following in /var/log/messages:
Jun 16 17:02:33 node1 AgentFramework[10323]: VCS ERROR V-16-2-13067 Thread(4147325808) Agent is calling clean for resource(Res_App_c1_fmmed1_testapplication) because the resource became OFFLINE unexpectedly, on its own.
Jun 16 17:02:33 node1 Had[9975]: VCS ERROR V-16-2-13067 (node1) Agent is calling clean for resource(Res_App_c1_fmmed1_testapplication) because the resource became OFFLINE unexpectedly, on its own.
Jun 16 17:02:34 node1 AgentFramework[10323]: VCS ERROR V-16-2-13069 Thread(4147325808) Resource(Res_App_c1_fmmed1_testapplication) - clean failed.
and in engine_A.log:
2015/06/16 17:02:33 VCS ERROR V-16-2-13067 (node1) Agent is calling clean for resource(Res_App_c1_fmmed1_testapplication) because the resource became OFFLINE unexpectedly, on its own.
2015/06/16 17:02:34 VCS INFO V-16-10031-504 (node1) Application:Res_App_c1_fmmed1_testapplication:clean:Executed /bin/false as user root
2015/06/16 17:02:35 VCS ERROR V-16-2-13069 (node1) Resource(Res_App_c1_fmmed1_testapplication) - clean failed.
2015/06/16 17:03:35 VCS ERROR V-16-1-50148 ADMIN_WAIT flag set for resource Res_App_c1_fmmed1_testapplication on system node1 with the reason 4
2015/06/16 17:03:35 VCS INFO V-16-10031-504 (node1) Application:Res_App_c1_fmmed1_testapplication:clean:Executed /bin/false as user root
From my understanding of the VCS Administrator's Guide section titled 'VCS behavior when an online resource faults', resnotoff should be triggered. However, it is not, and the resource goes to an ADMIN WAIT state:
group resource system message
--------------- -------------------- --------------- --------------------
Res_App_c1_fmmed1_testapplication node1 |ADMIN WAIT|
Is it possible to get resnotoff triggered for a cluster in this state, or do I need to use the resadminwait trigger instead (contrary to the documentation)?
Thanks,