
Standby Cluster Server 2003 issues accessing replicated LUNs


Darren Bolton

Nov 24, 2008, 9:45:00 AM
Hello all, looking for a bit of advice as I seem to have hit a brick wall in
being able to get our setup working.

I have a four-node Exchange 2003 cluster at our main site running A/A/A/P
and connected to an EMC CX500 storage array. This cluster is running three
Exchange virtual servers, and each virtual server has two dedicated LUNs:
one for the database(s) and one for the logs.

At our recovery site I have a standby cluster in place which is also
connected to another EMC CX500 storage array.

Every night the LUNs containing the DBs and logs associated with the
production Exchange cluster are replicated to the CX500 storage array at the
remote site. This is done using EMC Replication Manager/SE.

The problem is that when we come to do a test recovery, the standby cluster
fails to see any volumes present on the replicated LUNs, although the disks
are visible via Disk Management and diskpart.
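
For reference, this is how I have been checking visibility from a command
prompt on one of the standby nodes (output omitted; the replicated volumes
show up in list volume but without drive letters):

diskpart
DISKPART> list disk
DISKPART> list volume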

Both clusters are set up identically, and the disk signatures have been
matched to what the cluster is expecting to see (on both production and standby).
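
I have been comparing against the signatures the Cluster Disk Driver has
recorded on each node, which live under the ClusDisk key:

rem dump the disk signatures this node's Cluster Disk Driver expects
reg query HKLM\SYSTEM\CurrentControlSet\Services\ClusDisk\Parameters\Signatures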

When I present the replicated LUNs to the standby cluster, they appear in
Disk Management but are not given the correct drive letter. When I try to
assign the correct letter, Disk Management returns the following error:

Logical Disk Manager
The operation did not complete. Check the System Event Log for more
information on the error. Close the Disk Management console, then restart
Disk Management before retrying the operation.

I've even tried using diskpart to assign a letter to the volume, but again
Windows will not give the volume a letter.
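
The diskpart attempt was along these lines (volume 3 is just an example; the
number comes from list volume, and J: is the letter the cluster expects):

diskpart
DISKPART> select volume 3
DISKPART> assign letter=J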

What is even stranger is that if I then reboot the host on the standby
cluster, when it returns the disk is still present and visible via Disk
Management, but the volume has disappeared as though it does not exist on the
disk.

To prove that the clones contain the correct volumes and data, I have mounted
them on a host that is not part of the cluster, and they appear straight away
with all data present.

To troubleshoot this further I have shut down all the nodes in the standby
cluster bar one. On the remaining node I have set the Cluster Disk Driver to
disabled, disabled the Cluster service, and then rebooted the node.
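
For the record, I disabled them like this before the reboot (and put the
start types back to their original values afterwards):

rem stop the Cluster Disk Driver loading at boot
sc config clusdisk start= disabled
rem stop the Cluster service starting at boot
sc config clussvc start= disabled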

When the node came back online without the above service and driver enabled,
the disks appeared correctly and their labels were readable, although not all
were assigned a drive letter. I was then able to assign the correct letters.
I then re-enabled the service and driver and rebooted the node, hoping that
this time, with the Cluster service enabled, all disks would be accessible.
I was left with the same issue.

Am I missing something really straightforward here? The disk signatures are
correct and how they should be; I've checked these over and over again.

Any advice on how to move forward would be really appreciated.

Any questions or clarifications please ask.

Darren Bolton

Nov 24, 2008, 11:23:02 AM
Would deleting the Signatures key from
HKLM\SYSTEM\CurrentControlSet\Services\ClusDisk\Parameters on each node in
the standby cluster, while the other nodes are offline and the Cluster Disk
Driver is disabled, achieve anything?

So that when the Cluster Disk Driver is re-enabled on the first node in the
cluster and it is brought back online, the signatures are redetected? Have I
got the wrong end of the stick?
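
Something like the following on each node is what I had in mind, backing the
key up first (untested, so treat it as a sketch):

rem back the key up before touching anything
reg export HKLM\SYSTEM\CurrentControlSet\Services\ClusDisk\Parameters\Signatures sigs-backup.reg
rem delete it so the signatures are redetected when the driver next starts
reg delete HKLM\SYSTEM\CurrentControlSet\Services\ClusDisk\Parameters\Signatures /f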

Jeff Hughes [MSFT]

Nov 25, 2008, 9:14:04 AM
Do NOT mess with the ClusDisk key. That's the key we use to implement the
'shared nothing' model, so that one node, and only one node, can access a disk
at a time. I would run this by EMC, as they have a lot of stuff they're doing
behind the scenes to get the disk replication to work. Cluster doesn't even
know there's any replication going on. Could be that the replication process
is doing something like failing to clear the 'readonly' flag. That's
something they could look into. Do the disks even come online in Cluster
Administrator at the backup site?
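
You can check and prod them from the command line on a backup-site node too;
the resource name below is taken from your log, so adjust it to whatever
Cluster Administrator shows:

rem list all resources and their current state
cluster resource
rem try to bring one of the replicated disks online
cluster resource "EVS1_DB - J:" /online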
--
Jeff Hughes, MCSE
Senior Support Escalation Engineer
Microsoft Enterprise Platforms Support (Server Core/Cluster)


"Darren Bolton" <Darren...@discussions.microsoft.com> wrote in message
news:9F6BC255-74D9-497C...@microsoft.com...

Darren Bolton

Nov 25, 2008, 9:21:02 AM
Thanks for the reply.

I think that I am possibly looking into it too much and therefore missing the
obvious. I've checked all the flags and they all appear to be correct.

One thing that I have noticed is different: the MPVolGuids that the standby
cluster is expecting on the disks are different from what is actually on the
disks being replicated, although the signatures are correct.
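
I spotted that by dumping the private properties of the disk resources
(resource name as shown in Cluster Administrator):

rem show the private properties (Signature, MPVolGuids, etc.) of a disk resource
cluster resource "EVS1_DB - J:" /priv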

I'm now working through deleting and recreating the disk resources on the
standby cluster (maybe I should have done this first... but again, I was
looking too deep).
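
Per disk it will be roughly the following; the group name and signature value
here are placeholders, and I'll double-check the exact property syntax first:

rem remove the old disk resource
cluster resource "EVS1_DB - J:" /delete
rem recreate it in the correct group
cluster resource "EVS1_DB - J:" /create /group:"EVS1 Group" /type:"Physical Disk"
rem point it at the replicated disk by signature (value taken from production)
cluster resource "EVS1_DB - J:" /priv Signature=0x12345678:DWORD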

Will report back how this goes.

Darren Bolton

Nov 25, 2008, 11:29:03 AM
Well, when trying to delete the disk resource and recreate it, the cluster
log now shows the following.

068:7a0.11/25[15:43:58.407](039231) ERR Physical Disk <EVS1_DB - J:\>: Get
Assigned Letter for \Device\Harddisk3\Partition1 returned status C000000D.
068:7a0.11/25[15:43:58.407](039231) WARN Physical Disk <EVS1_DB - J:\>:
ResourceControl: MountieVerify failed, error: 87
068:ad8.11/25[15:44:07.891](039235) WARN Physical Disk <EVS1_DB - J:\>:
[DiskArb] Assume ownership of the device.
068:ad8.11/25[15:44:55.906](039235) ERR Physical Disk <EVS1_DB - J:\>:
Online, volumes not ready. Error: 258.
068:ad8.11/25[15:44:55.906](039235) INFO Physical Disk <EVS1_DB - J:\>:
Online, returning final error 258 ResourceState 4 Valid 0
0c0:2cc.11/25[15:44:55.906](039235) WARN [FM] FmpHandleResourceTransition:
Resource Name = 214c916f-89ef-4d5f-90ca-10a8230d23ca [EVS1_DB - J:\] old
state=129 new state=4
0c0:2cc.11/25[15:44:55.906](039235) WARN [FM] FmpHandleResourceTransition:
Resource failed, post a work item
068:398.11/25[15:44:55.906](039236) WARN Physical Disk <EVS1_DB - J:\>:
[PnP] RemoveDisk: disk 18bfaac0 not found or previously removed
068:538.11/25[15:44:55.937](039236) INFO Physical Disk <EVS1_DB - J:\>:
DiskCleanup returning final error 0
068:af0.11/25[15:44:55.937](039238) WARN Physical Disk <EVS1_DB - J:\>:
[DiskArb] Assume ownership of the device.
068:af0.11/25[15:45:43.952](039240) ERR Physical Disk <EVS1_DB - J:\>:
Online, volumes not ready. Error: 258.
068:af0.11/25[15:45:43.952](039240) INFO Physical Disk <EVS1_DB - J:\>:
Online, returning final error 258 ResourceState 4 Valid 0
0c0:2cc.11/25[15:45:43.952](039240) WARN [FM] FmpHandleResourceTransition:
Resource Name = 214c916f-89ef-4d5f-90ca-10a8230d23ca [EVS1_DB - J:\] old
state=129 new state=4
0c0:2cc.11/25[15:45:43.952](039240) WARN [FM] FmpHandleResourceTransition:
Resource failed, post a work item
068:398.11/25[15:45:43.952](039241) WARN Physical Disk <EVS1_DB - J:\>:
[PnP] RemoveDisk: disk 18bfaac0 not found or previously removed
068:7a0.11/25[15:45:43.983](039241) INFO Physical Disk <EVS1_DB - J:\>:
DiskCleanup returning final error 0
068:208.11/25[15:45:43.983](039243) WARN Physical Disk <EVS1_DB - J:\>:
[DiskArb] Assume ownership of the device.
068:208.11/25[15:46:32.060](039245) ERR Physical Disk <EVS1_DB - J:\>:
Online, volumes not ready. Error: 258.
068:208.11/25[15:46:32.060](039245) INFO Physical Disk <EVS1_DB - J:\>:
Online, returning final error 258 ResourceState 4 Valid 0
0c0:2cc.11/25[15:46:32.060](039245) WARN [FM] FmpHandleResourceTransition:
Resource Name = 214c916f-89ef-4d5f-90ca-10a8230d23ca [EVS1_DB - J:\] old
state=129 new state=4
0c0:2cc.11/25[15:46:32.060](039245) WARN [FM] FmpHandleResourceTransition:
Resource failed, post a work item
068:398.11/25[15:46:32.060](039246) WARN Physical Disk <EVS1_DB - J:\>:
[PnP] RemoveDisk: disk 18bfaac0 not found or previously removed
068:7a0.11/25[15:46:32.091](039246) INFO Physical Disk <EVS1_DB - J:\>:
DiskCleanup returning final error 0
068:0dc.11/25[15:46:32.091](039248) WARN Physical Disk <EVS1_DB - J:\>:
[DiskArb] Assume ownership of the device.
068:0dc.11/25[15:47:20.091](039250) ERR Physical Disk <EVS1_DB - J:\>:
Online, volumes not ready. Error: 258.
068:0dc.11/25[15:47:20.091](039250) INFO Physical Disk <EVS1_DB - J:\>:
Online, returning final error 258 ResourceState 4 Valid 0
0c0:2cc.11/25[15:47:20.091](039250) WARN [FM] FmpHandleResourceTransition:
Resource Name = 214c916f-89ef-4d5f-90ca-10a8230d23ca [EVS1_DB - J:\] old
state=129 new state=4
0c0:2cc.11/25[15:47:20.091](039250) WARN [FM] FmpHandleResourceTransition:
Resource failed, post a work item
068:398.11/25[15:47:20.091](039252) WARN Physical Disk <EVS1_DB - J:\>:
[PnP] RemoveDisk: disk 18bfaac0 not found or previously removed
068:6d4.11/25[15:47:20.122](039252) INFO Physical Disk <EVS1_DB - J:\>:
DiskCleanup returning final error 0
0c0:2cc.11/25[15:47:20.122](039252) WARN [FM] Group failure for group
<4840203c-54ab-48ce-87b2-92e21926dddd>. Create thread to take offline and
move.
068:398.11/25[16:02:44.429](039465) WARN Physical Disk <EVS1_DB - J:\>:
[PnP] RemoveDisk: disk 18bfaac0 not found or previously removed
068:6d4.11/25[16:02:44.429](039465) INFO Physical Disk <EVS1_DB - J:\>:
DiskCleanup returning final error 0

John Fullbright

Dec 1, 2008, 5:26:01 AM
170: uh, it's in use. 258: what you tried timed out. I'd take this up with
EMC. My guess would be that the replication tool still has control of the
disk.
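
You can pull the text for those codes with net helpmsg:

rem 170 = the requested resource is in use; 258 = the wait operation timed out
net helpmsg 170
net helpmsg 258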


"Darren Bolton" <Darren...@discussions.microsoft.com> wrote in message

news:E77042D4-971B-485B...@microsoft.com...

Edwin vMierlo [MVP]

Dec 2, 2008, 4:55:44 AM
Agree with John F; do take this up with EMC.

However, the "volumes not ready. Error: 258" error is also related to "VSS
Volume Flags". If these are set, it is not a storage problem; it is VSS not
unsetting these flags. See http://support.microsoft.com/kb/886702

Also, I have seen VSP.sys as an upper filter causing this timeout. Remove
this driver and retry.
(Actually, any upper filter on the volume could cause this timeout.)
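
You can see what upper filters are registered on the volume class with reg
query; the GUID below is the standard storage volumes device class:

rem list upper filter drivers attached to the volume device class
reg query "HKLM\SYSTEM\CurrentControlSet\Control\Class\{71A27CDD-812A-11D0-BEC7-08002BE2092F}" /v UpperFilters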

rgds,
Edwin.


"John Fullbright" <fjohn@donotspamnetappdotcom> wrote in message
news:Ouxu285U...@TK2MSFTNGP03.phx.gbl...

0 new messages