Hi there,
I have a system where the cleanup script can fail or time out, and I want to run another script when that happens. I was wondering what the best way to do this would be.
In the Veritas Cluster Server Administrator's Guide for Linux I found the RESNOTOFF trigger.
From the documentation, my understanding is that this trigger is invoked in the following cases:
- A resource fails to go offline (offline initiated by VCS) and the cleanup fails.
- A resource goes offline unexpectedly and the cleanup fails.
I have tested this, and RESNOTOFF works in the first scenario but not in the second.
To test the second scenario I kill the service, and I then see the following message in engine_A.log:
VCS ERROR V-16-2-13067 (node1) Agent is calling clean for resource(service1) because the resource became OFFLINE unexpectedly, on its own.
When the cleanup fails I would expect the resource to become UNABLE TO OFFLINE. However, the resource is still reported as ONLINE:
# hares -state service1
#Resource    Attribute    System    Value
service1     State        node1     ONLINE
service1     State        node2     OFFLINE
So the resource stays ONLINE and VCS keeps re-running the cleanup command, which keeps failing, indefinitely.
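In case it helps anyone reproduce this, the state check can be scripted instead of read by eye. This is only a minimal sketch: the helper name `state_on` is made up for illustration, and it assumes the `hares -state` output has the four-column layout shown above (Resource, Attribute, System, Value) with a header line starting with `#`.

```shell
# Hypothetical helper: read `hares -state <resource>` output on stdin and
# print the State value reported for one system (first argument).
# Assumes four columns: Resource, Attribute, System, Value, and a header
# line that starts with '#'. Everything from the fourth column onward is
# printed, in case the value contains spaces.
state_on() {
    awk -v sys="$1" '
        !/^#/ && $2 == "State" && $3 == sys {
            $1 = $2 = $3 = ""      # blank the first three columns
            sub(/^ +/, "")         # drop the leading separators left behind
            print
        }'
}

# Usage on a cluster node (requires VCS to be installed):
#   hares -state service1 | state_on node1
```

Running that in a loop while the cleanup is failing would show whether the value reported for node1 ever changes from ONLINE.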
Do I need to configure something else to make RESNOTOFF work in this particular scenario?
Thanks,