Environment
SFHA/DR
iSCSI SAN
Primary Site = two nodes
DR Site = one node
SFHA version = 5.0 MP4 RP1
OS = RHEL 5
DiskGroup Agent logs
2013/09/10 11:56:48 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4155296656)
2013/09/10 11:56:49 VCS ERROR V-16-2-13027 Thread(4152359824) Resource(DG) - monitor procedure did not complete within the expected time.
2013/09/10 11:58:48 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4152359824)
2013/09/10 12:00:48 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4155296656)
2013/09/10 12:02:48 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4152359824)
2013/09/10 12:02:49 VCS ERROR V-16-2-13210 Thread(4155296656) Agent is calling clean for resource(DG) because 4 successive invocations of the monitor procedure did not complete within the expected time.
2013/09/10 12:02:50 VCS ERROR V-16-2-13068 Thread(4155296656) Resource(DG) - clean completed successfully.
2013/09/10 12:03:50 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4155296656)
2013/09/10 12:03:51 VCS ERROR V-16-2-13077 Thread(4152359824) Agent is unable to offline resource(DG). Administrative intervention may be required.
2013/09/10 12:05:50 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4152359824)
2013/09/10 12:07:50 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4155296656)
2013/09/10 12:09:50 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4152359824)
2013/09/10 12:11:50 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4155296656)
2013/09/10 12:13:50 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4152359824)
2013/09/10 12:15:50 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4155296656)
2013/09/10 12:17:51 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4152359824)
2013/09/10 12:19:51 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4155296656)
2013/09/10 12:21:51 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4152359824)
2013/09/10 12:23:51 VCS WARNING V-16-2-13139 Thread(4156349328) Canceling thread (4155296656)
ERROR V-16-2-13210
https://sort.symantec.com/ecls/umi/V-16-2-13210
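Since the monitor procedure timed out four times in a row before clean was called, this is roughly what I am planning to check on the primary nodes (a sketch only; the 120-second value below is just an example I picked, not something from the TN):

# Check the current monitor timeout for the DiskGroup type
hatype -display DiskGroup | grep MonitorTimeout
# Open the cluster configuration for writing
haconf -makerw
# Raise the type-level monitor timeout while the storage issue is investigated (120s is only an example)
hatype -modify DiskGroup MonitorTimeout 120
# Save and close the configuration
haconf -dump -makero
# Check whether VxVM itself still sees the disk group and its disks
vxdg list
vxdisk -o alldgs list

Raising the timeout only hides the symptom if the iSCSI storage is slow to respond, so the disk side still needs to be checked (see the last section below).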
===========================================================================
Mount Agent logs
2013/09/10 12:12:53 VCS ERROR V-16-2-13064 Thread(4154968976) Agent is calling clean for resource(MOUNT1) because the resource is up even after offline completed.
2013/09/10 12:12:54 VCS ERROR V-16-2-13069 Thread(4154968976) Resource(MOUNT1) - clean failed.
2013/09/10 12:13:54 VCS ERROR V-16-2-13077 Thread(4154968976) Agent is unable to offline resource(MOUNT1). Administrative intervention may be required.
2013/09/10 12:14:00 VCS WARNING V-16-2-13102 Thread(4153916304) Resource (MOUNT1) received unexpected event info in state GoingOfflineWaiting
2013/09/10 12:15:28 VCS WARNING V-16-2-13102 Thread(4153916304) Resource (MOUNT1) received unexpected event info in state GoingOfflineWaiting
WARNING V-16-2-13102
https://sort.symantec.com/ecls/umi/V-16-2-13102
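For the Mount resource that would not offline, I expect the "administrative intervention" to look something like this (/data and <node> are only placeholders for the real mount point and system name):

# See whether the file system is still mounted and what is holding it open
mount | grep /data
fuser -vm /data
# Unmount manually once whatever is holding the mount point has been stopped
umount /data
# Clear the faulted resource so VCS can manage it again
hares -clear MOUNT1 -sys <node>

If umount still fails with "device is busy", that would again point at the underlying disks being unresponsive rather than at the agent itself.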
===========================================================================
My investigation
The above TN contains this extract: "Links 0 and 1 are both connected to the same switch, resulting in a cross-link scenario." Searching on that extract led me to another TN, http://www.symantec.com/business/support/index?page=content&id=HOWTO79920, which describes the issue as follows: it seems that only one network switch is being used between the cluster nodes, whereas Symantec recommends configuring two independent networks between the cluster nodes, with one network switch for each network, for failure protection.
So the problem appears to be that links 0 and 1 are both connected to the same switch, resulting in a cross-link scenario (see the checks below).
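To confirm the cross-link scenario, I am checking the heartbeat configuration and link status on each node (standard LLT/GAB checks, nothing environment-specific assumed):

# Show which NICs the private heartbeat links are defined on
cat /etc/llttab
# Show per-node, per-link LLT status
lltstat -nvv
# Confirm GAB port membership
gabconfig -a

If both link entries in /etc/llttab end up on NICs cabled to the same switch, that matches the cross-link description in the TN.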
===========================================================================
The above is one possible cause, but I also see a problem with the disks themselves, which may have left them unresponsive and in turn caused the Mount resource failure. See the attached RHEL messages log, and the additional checks I plan to run below:
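To check whether the disks went unresponsive at the OS and VxVM level, I am looking at something like the following (the grep pattern is only an example of what to search the messages log for):

# Look for SCSI / I/O errors around the time of the resource faults
grep -iE "scsi|i/o error" /var/log/messages
# Check the state of the disks and their paths as seen by VxVM/DMP
vxdisk -o alldgs list
vxdmpadm getsubpaths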