We are running VCS 6.0.2 on RHEL 6.5. Our cluster configuration is as follows:
Heartbeat link: eth3, eth4
Low-priority heartbeat link: not enabled
Fencing: not enabled
Cluster contains 2 servers: jarry-crf1, jarry-crf2.
Service groups:
TestGrp1 contains a "FileOnOff" resource and is configured in Parallel mode.
TestGrp2 contains a "FileOnOff" resource, depends on TestGrp1, and is configured in Failover mode.
Test steps:
1. Bring TestGrp1 online on both servers, and bring TestGrp2 online on server "jarry-crf2".
2. Stop both heartbeat links on server "jarry-crf1" with the command "ifdown eth3; sleep 60; ifdown eth4".
3. Restore the heartbeat links with the command "ifup eth3; ifup eth4".
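For anyone who wants to reproduce this, the steps above can be sketched as a small shell script. The "hagrp -online" invocations are only an assumption about how step 1 can be carried out (the groups could equally be brought online another way), and the "run" helper and DRY_RUN variable are hypothetical names used for illustration. By default the script only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the test steps. With DRY_RUN=1 (the default here) each command
# is only printed; set DRY_RUN=0 to actually execute the commands.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: bring the groups online (assumed to be done with hagrp).
# hagrp -online returns before the group is fully online, so a real run
# may need to wait between these commands.
step1() {
    run hagrp -online TestGrp1 -sys jarry-crf1
    run hagrp -online TestGrp1 -sys jarry-crf2
    run hagrp -online TestGrp2 -sys jarry-crf2
}

# Step 2 (on jarry-crf1): take down both heartbeat links, 60 seconds apart.
step2() {
    run ifdown eth3
    run sleep 60
    run ifdown eth4
}

# Step 3 (on jarry-crf1): restore both heartbeat links.
step3() {
    run ifup eth3
    run ifup eth4
}

step1
step2
step3
```

Steps 2 and 3 are meant to be run on "jarry-crf1"; step 1 can be issued from either node.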
After these steps, we found that the "had" process had been restarted on server "jarry-crf2", and we found the following entries in engine_A.log:
2015/02/10 00:48:43 VCS NOTICE V-16-1-10433 Group TestGrp2 will not start automatically on System jarry-crf2 as the system is in restart mode.
2015/02/10 00:48:43 VCS NOTICE V-16-1-10433 Group TestGrp1 will not start automatically on System jarry-crf2 as the system is in restart mode.
2015/02/10 00:48:43 VCS NOTICE V-16-1-10445 Group TestGrp1 will not start automatically as atleast one system in the SystemList attribute of the group is in restart mode.
2015/02/10 00:48:47 VCS NOTICE V-16-1-10433 Group VCShmg will not start automatically on System jarry-crf2 as the system is in restart mode.
2015/02/10 00:48:47 VCS NOTICE V-16-1-10445 Group VCShmg will not start automatically as atleast one system in the SystemList attribute of the group is in restart mode.
My questions:
1. Why is the "had" process restarted after the heartbeat links are restored?
2. What does "restart mode" mean, and how can we make the service groups leave "restart mode" and start automatically?
Thanks in advance!