This is a follow-on post to my original dynamic quorum post, which can be found here. I have had a lot of scenario-based questions about dynamic quorum and its behaviour, so I have done some extensive testing to explain a few scenarios in detail.
Scenario 1: Explaining dynamic quorum using a five node cluster and gracefully shutting down nodes in a sequential manner.
In this example I will turn off the cluster service on the cluster nodes one at a time. The quorum model has been set to node majority.
All nodes in the cluster are up and running.
Notice the IDs of the nodes. These IDs correspond to the order in which the nodes joined the cluster, so they do not necessarily match the numbers in the node names. The same IDs appear under the HKLM\Cluster\Nodes registry key.
I stop nodes 2, 3 and 4 sequentially (using Failover Cluster Manager to stop the cluster service). You can see below that the nodes are down but, more importantly, my cluster is up and running. To maintain quorum a single vote is required, and it is coming from node 5. Through testing I have found that the running node with the highest ID is the one casting the dynamic vote.
To test this I started nodes 2 and 3 and stopped nodes 5 and 1. If my theory is correct, the dynamic vote will be cast by node 2, as it has the highest ID of the two running nodes. This can be seen in the next screenshot.
Now for the last man standing: if I stop node 2, the dynamic weight should be transferred to node 3. You can see from the screenshot below that this is true. My cluster remains up and running because dynamic quorum was able to recalculate quorum throughout my testing, i.e. quorum was maintained as the cluster service was turned off on some nodes and turned on on others.
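The pattern seen in this scenario can be sketched in a few lines of Python. This is only a toy model of what the tests above showed, not how the cluster service is implemented. The function names and the example IDs are made up for illustration, and the rule that the running node with the highest ID holds the dynamic vote is an observation from testing, not a documented guarantee.

```python
def vote_holder(running_ids):
    """Return the ID of the node expected to hold the dynamic vote.

    Assumption (taken from the tests above, not from documentation):
    the running node with the highest ID casts the vote.
    """
    if not running_ids:
        raise ValueError("no nodes are running")
    return max(running_ids)


def stop_node(running_ids, node_id):
    """Model a graceful stop: return the set of nodes still running."""
    if node_id not in running_ids:
        raise ValueError(f"node ID {node_id} is not running")
    return set(running_ids) - {node_id}


# Hypothetical IDs for the last two running nodes; real IDs follow join order.
running = {2, 4}
first_holder = vote_holder(running)          # the higher ID holds the vote
running = stop_node(running, first_holder)   # graceful stop, vote is handed over
last_holder = vote_holder(running)           # the remaining node now holds it
```

Because the stop is graceful, the model simply recalculates the vote holder from the nodes that are left, which mirrors the hand-over seen in the screenshots.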
This is all well and good and shows how dynamic quorum works, but in the real world we do not always have the luxury of gracefully stopping the cluster service. In the next set of tests I will simulate failures as opposed to graceful shutdowns.
Scenario 2: Explaining dynamic quorum using a five node cluster and simulating failures.
In this example I will turn off the cluster nodes at the Hyper-V level. I will turn off (not shut down) each node one at a time and show you what happens. The quorum model has been set to node majority.
All nodes up, running and voting to achieve quorum.
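As a reminder of the arithmetic behind node majority, here is a minimal sketch. It assumes the usual rule that quorum needs more than half of the nodes that currently have a vote. The function names are made up for illustration and are not part of any cluster API.

```python
def votes_needed(voting_nodes):
    """Smallest number of votes that is more than half of the voting nodes."""
    if voting_nodes < 1:
        raise ValueError("at least one voting node is required")
    return voting_nodes // 2 + 1


def has_quorum(votes_present, voting_nodes):
    """True if the votes still present form a majority of the voting nodes."""
    return votes_present >= votes_needed(voting_nodes)


five_node_majority = votes_needed(5)     # votes needed with all five nodes voting
after_one_failure = has_quorum(4, 5)     # one node lost out of five voters
after_three_at_once = has_quorum(2, 5)   # three lost before any recalculation
```

With five voters, losing one node still leaves a majority, whereas losing three at the same moment would not. Dynamic quorum matters because the set of voting nodes is recalculated after each change, so the cluster can survive losing nodes one at a time.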
Turning off nodes 2 and 3 one by one.
Critical events have been logged by the cluster, and you can see the results below from the remaining nodes.
Dynamic quorum voting looks like this:
Whichever node I turn off next, the node with the highest ID that is still up will hold the dynamic vote to maintain the cluster.
Turn off node 1. Of the remaining nodes, the one with the highest ID is node 4, so that should be the one casting the vote to maintain quorum.
As you can see the above is correct. Node 4 is the one casting the vote.
Now for the last man standing scenario. For the cluster to survive when I turn off node 4, the dynamic weight needs to shift to node 5, which would then be the only node in the cluster up and running. However, this does not happen when a cluster node goes down unexpectedly.
Conclusion
After a lot of testing I have come to the following conclusions:
- Dynamic quorum will work for an x-node cluster all the way down to a single node if cluster nodes are shut down gracefully in a sequential manner, allowing the cluster to maintain quorum and then recalculate dynamic quorum.
- If the nodes in the cluster are turned off (i.e. power unplugged), the cluster will try to maintain quorum. If it can, it will then calculate the new dynamic quorum value. This will work all the way down to the last two nodes of the cluster.
- Once you are down to the last two nodes and you turn off the node with the dynamic weight, the cluster will go down. See the screenshot below. This happens because the cluster has no chance to sync with the node that was turned off, as there is no more communication with it, so it cannot transfer the vote across.
- Once you are down to the last two nodes and you turn off the node which is not voting, the cluster will remain up. The screenshot below shows this.
So in the real world, the last man standing scenario will work if the last node online is the node which has the dynamic vote. Transfer of the vote is not possible if the voting node is turned off. It is only possible if there is a graceful shutdown.
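The conclusions above can be summarised as a small Python sketch. It is a simplified model of my test results rather than the cluster's real algorithm. It assumes nodes fail one at a time, that the running node with the highest ID holds the dynamic vote, and that the function name and example IDs are purely illustrative.

```python
def survives_power_off(running_ids, failed_id):
    """Toy model: does the cluster stay up after one node is turned off abruptly?"""
    running = set(running_ids)
    if failed_id not in running:
        raise ValueError(f"node ID {failed_id} is not running")
    if len(running) == 1:
        # The last node has been turned off, so nothing is left to run the cluster.
        return False
    if len(running) == 2:
        # The vote cannot be transferred from a node that vanished without warning,
        # so the cluster only survives if the node that failed was NOT the voter.
        return failed_id != max(running)
    # With three or more nodes, losing one still leaves a majority, and the
    # cluster can then recalculate the dynamic quorum.
    return True


# Hypothetical IDs, chosen only for the example.
three_nodes = survives_power_off({1, 3, 5}, 5)    # still a majority
voter_lost = survives_power_off({3, 5}, 5)        # vote holder turned off
non_voter_lost = survives_power_off({3, 5}, 3)    # non-voting node turned off
```

The only branch that differs from the graceful case is the two-node one: there the outcome depends entirely on which of the two nodes is lost.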
If you find yourself in a scenario where you are on the last two nodes of the cluster, it may be worthwhile failing over services to the node with the dynamic vote, as that will be the node that remains up if the other node goes down.
I hope that helps with some of the scenario-based questions about dynamic quorum.
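As a rough illustration of that recommendation, the sketch below works out which services would need to move when only two nodes are left. The service names, node IDs and function name are hypothetical. It again assumes the running node with the highest ID holds the dynamic vote, which you should confirm on your own cluster before acting on it.

```python
def failover_plan(placement, running_ids):
    """Return (vote holder ID, sorted names of services hosted elsewhere).

    placement maps a service name to the ID of the node currently hosting it.
    """
    running = set(running_ids)
    if len(running) != 2:
        raise ValueError("this sketch only covers a cluster down to two nodes")
    holder = max(running)  # assumption: highest running ID holds the vote
    to_move = sorted(name for name, node in placement.items() if node != holder)
    return holder, to_move


# Hypothetical services and node IDs, for illustration only.
placement = {"FileShare": 2, "SQL": 4, "PrintSpooler": 2}
holder, to_move = failover_plan(placement, {2, 4})
```

Here the services hosted on the node without the vote are the ones worth failing over, since that node cannot keep the cluster up on its own if its partner is lost.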
Aeval
Premier Field Engineer - Failover Clustering & Virtualisation