Just some quick questions. We use Veritas Clustering (VCS) to provide an HA solution for our database servers. We have set up VCS monitoring to try to connect to the database every 3 minutes. After 4 failed attempts, VCS kills the primary PID and fails over to the other side. The box itself is fine; VCS was simply unable to connect for whatever reason. The theory is that if the database is unresponsive for 12 minutes, it must be in a hung state. Right or wrong, there are many reasons a database might be unresponsive other than being hung.
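To make the current behaviour concrete, here is a rough Python sketch of the monitoring logic as I understand it. It is not VCS agent code: the function name and constants are made up for illustration, and I am assuming the 4 failures have to be consecutive and that the first probe runs one interval after the database stops responding.

```python
from typing import Iterable, Optional

PROBE_INTERVAL_MIN = 3  # minutes between connection attempts (our setting)
MAX_FAILURES = 4        # failed attempts before failover (our setting)


def minutes_until_failover(
    probe_results: Iterable[bool],
    interval_min: int = PROBE_INTERVAL_MIN,
    max_failures: int = MAX_FAILURES,
) -> Optional[int]:
    """Return the elapsed minutes at which failover would trigger.

    probe_results holds one boolean per probe: True if the connection
    succeeded, False if it failed. Returns None if the failure threshold
    is never reached. Assumes failures must be consecutive (hypothetical).
    """
    consecutive = 0
    for attempt, ok in enumerate(probe_results, start=1):
        # A successful probe resets the failure count.
        consecutive = 0 if ok else consecutive + 1
        if consecutive >= max_failures:
            return attempt * interval_min
    return None
```

With our settings, four failed probes in a row give `minutes_until_failover([False] * 4) == 12`, which is where the 12-minute figure comes from. Note that nothing in this logic distinguishes a hung database from one that is merely slow or refusing connections for some other reason.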
My question is this: should we be checking the host (hardware) rather than the database to decide whether to fail over? What are other companies that use VCS for HA doing to ensure that failovers do not happen for unnecessary or unintended reasons? Is VCS intended for checking application availability?
Jim