Failover Cluster pausing node error 0x80071763 storage reporting healthy

Tech Guy 25 20 Reputation points
2026-02-14T13:10:35.98+00:00

I have a failover cluster running S2D with 4 nodes that will not let me pause a node and drain the roles. It generates error 0x80071763 on all 4 nodes. I have checked the storage and the physical drives, and they all report as OK. I have also rebuilt each of the virtual disks and moved the VMs to them, destroying the old virtual disks in the process. Cluster validation reports no issues either.

Nodes can be paused without draining roles, and they can be put into a down state by stopping the Cluster service, which also drains the roles from the node. Any ideas on how to resolve this and restore normal operation?

Windows for business | Windows Server | Storage high availability | Virtualization and Hyper-V

Answer accepted by question author
  1. Domic Vo 17,825 Reputation points Independent Advisor
    2026-02-14T13:49:34.4733333+00:00

    Hello,

    The error 0x80071763 you are seeing when attempting to pause and drain roles from nodes in your Storage Spaces Direct (S2D) cluster is a known issue tied to the way Cluster Shared Volumes and S2D handle role ownership transitions. The code itself translates to “The cluster resource cannot be moved to another node because a cluster resource dependency failed,” which means the drain operation is blocked at the storage layer even though validation and health checks report no issues.

    The fact that you can pause nodes without draining, and that stopping the cluster service does successfully drain roles, confirms that the cluster service itself is healthy but the orchestrated “drain roles” workflow is failing. This typically occurs when the Cluster service cannot reconcile ownership of S2D virtual disks during the drain sequence. Even though you rebuilt the virtual disks, the dependency chain between the CSVs and the roles may still be inconsistent.
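As a quick way to check that dependency chain, the built-in cluster cmdlets can list every resource and its dependency expression; a rebuilt virtual disk whose old identity is still referenced will show up as a stale entry here even when health checks pass. (This is a sketch; run it on any cluster node.)

```powershell
# List every cluster resource with its owner group and current state.
Get-ClusterResource |
    Select-Object Name, OwnerGroup, ResourceType, State

# Dump the dependency expression for each resource. Look for entries
# that still reference a deleted or rebuilt virtual disk.
Get-ClusterResource | ForEach-Object {
    [PSCustomObject]@{
        Resource   = $_.Name
        Dependency = (Get-ClusterResourceDependency -Resource $_.Name).DependencyExpression
    }
}
```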

    Microsoft guidance for this scenario is to first ensure all nodes are running the latest cumulative update and servicing stack for Windows Server, as there have been fixes in recent updates specifically for S2D drain behavior. Next, check the cluster logs (Get-ClusterLog -UseLocalTime -Destination C:\ClusterLogs) immediately after a failed drain attempt. Look for entries referencing MoveGroup or DrainRole failures; these usually point to a specific CSV or resource that is blocking the move.
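A minimal sketch of that log collection and search, assuming the `C:\ClusterLogs` destination mentioned above:

```powershell
# Generate cluster logs from every node, written in local time to C:\ClusterLogs.
New-Item -Path C:\ClusterLogs -ItemType Directory -Force | Out-Null
Get-ClusterLog -UseLocalTime -Destination C:\ClusterLogs

# Scan the logs for MoveGroup / DrainRole failures and the error code
# from the failed drain attempt (0x80071763 = decimal 5987).
Select-String -Path C:\ClusterLogs\*.log -Pattern 'MoveGroup|DrainRole|0x80071763|5987' |
    Select-Object -First 50
```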

    If the logs show that the CSVs are healthy but the drain still fails, the recommended workaround is to use Suspend-ClusterNode -Drain with the -ForceDrain parameter. This bypasses the dependency check and forces role migration. Microsoft has documented that in certain S2D builds, the standard drain fails with 0x80071763 but forced drain succeeds.
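The drain and forced-drain sequence looks like this ("Node1" is a placeholder for the node being serviced):

```powershell
# Attempt the standard drain first; if it fails with 0x80071763,
# retry with -ForceDrain, which bypasses the dependency check.
Suspend-ClusterNode -Name "Node1" -Drain -Wait

# Workaround when the standard drain is blocked:
Suspend-ClusterNode -Name "Node1" -Drain -ForceDrain -Wait

# After maintenance, bring the node back and fail roles back to it:
Resume-ClusterNode -Name "Node1" -Failback Immediate
```

Note that `-ForceDrain` will proceed even if some roles cannot be moved gracefully, so check the Roles view afterwards to confirm where workloads landed.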

    If even forced drain fails, then the issue is likely a bug in the cluster resource DLL for S2D. In that case, the only supported resolution is to open a case with Microsoft support, as they can provide hotfixes or confirm whether your build is affected by a known regression.

    In short, the error is not hardware but a cluster dependency issue in S2D. Update all nodes fully, review the cluster logs for the blocking resource, and use Suspend-ClusterNode -Drain -ForceDrain as a workaround. If the problem persists, escalate to Microsoft support for a fix.

I hope you've found something useful here. If this helps you get more insight into the issue, please consider accepting the answer. Should you have more questions, feel free to leave a message. Have a nice day!

    Domic Vo.


3 additional answers

Sort by: Most helpful
  1. Tech Guy 25 20 Reputation points
    2026-02-17T18:15:31.9366667+00:00

To provide a final update on this issue and the resolution for future reference: I backed out update KB5075899 on all nodes and rebuilt one of the virtual disks. It seems the update impacted the syncing between nodes and broke the virtual disk. After rebuilding and moving the VMs to this new disk, the systems have been stable. The PowerShell commands never did indicate that this disk was in fact problematic; it took trial and error to determine that this disk was preventing the pausing and draining of the nodes.
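For anyone repeating this fix, the update can be backed out one node at a time with the standard update tooling; this is a sketch using the KB number from the post (uninstall, reboot, then let storage resync before moving to the next node):

```powershell
# Uninstall the suspect cumulative update on this node (reboot required).
wusa.exe /uninstall /kb:5075899 /quiet /norestart

# After the reboot, confirm the update is gone:
Get-HotFix | Where-Object HotFixId -eq 'KB5075899'

# Wait for storage jobs to finish resyncing before servicing the next node:
Get-StorageJob
```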


  2. Tech Guy 25 20 Reputation points
    2026-02-14T17:07:44.1433333+00:00

    Thank you for the prompt replies. Both commands come back with no results.

    Checking for the owner node on all cluster resources it lists out all 4 nodes by hostname.

Turning off the Health Service does not allow the drain to complete; it gave me an error that the service wasn't running.


  3. Tech Guy 25 20 Reputation points
    2026-02-14T16:47:04.4833333+00:00

The nodes are all updated with the latest patches that were released this week. The logs mention the Health Service and RcmAgent wanting to veto the drain, and error code 5987. I also see a warning that says "s_ApiOpenResourceEx: Resource 152220d3-ff45-421c-bc2d-df94117a292a not found, status = 5007".
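For reference, that "not found" warning points at a stale resource reference; the components still citing the orphaned GUID can be found by searching freshly generated cluster logs for it (paths here are illustrative):

```powershell
# Regenerate cluster logs, then search them for the orphaned resource GUID
# to see which component (e.g. Health Service, RcmAgent) still references it.
Get-ClusterLog -UseLocalTime -Destination C:\ClusterLogs
Select-String -Path C:\ClusterLogs\*.log -Pattern '152220d3-ff45-421c-bc2d-df94117a292a' |
    Select-Object Filename, LineNumber, Line
```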

ForceDrain says one or more roles were not moved, yet the Roles tab is empty and nothing shows as attached to that node.
