Differences in Cluster Traffic with different workloads

Hi all,

Last week I was at a customer site and had the chance to take some screenshots from two clusters that show how much heartbeat traffic a cluster can produce.

1. A two-node cluster with four virtual machines. Here the heartbeat runs at 500 Kbit/s on the network, so pretty low.

Traffic01

 

2. A six-node cluster with 215 virtual machines. Without any external action, the cluster heartbeat produces continuous traffic of around 60 Mbit/s. When I started some live migrations, the heartbeat traffic at times jumped up to 150 Mbit/s.

Traffic02

 

So my conclusion: the heartbeat traffic multiplies with the load and the number of nodes within the cluster.
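To put the two measurements side by side, here is a small Python sketch that breaks the idle heartbeat traffic down per node and per virtual machine. It only uses the figures from the two screenshots; treating 500 Kbit/s as 0.5 Mbit/s and the names `clusters` and `per_unit` are my own choices for illustration. With just two data points this is a rough comparison, not a traffic model.

```python
# Idle heartbeat traffic from the two clusters described above.
# 500 Kbit/s is treated as 0.5 Mbit/s (assumption: decimal units).
clusters = {
    "two-node": {"nodes": 2, "vms": 4, "idle_mbit": 0.5},
    "six-node": {"nodes": 6, "vms": 215, "idle_mbit": 60.0},
}


def per_unit(cluster):
    """Return idle heartbeat traffic per node and per VM in Mbit/s."""
    return (
        cluster["idle_mbit"] / cluster["nodes"],
        cluster["idle_mbit"] / cluster["vms"],
    )


for name, cluster in clusters.items():
    per_node, per_vm = per_unit(cluster)
    print(f"{name}: {per_node:.2f} Mbit/s per node, {per_vm:.3f} Mbit/s per VM")

# Ratios between the two clusters.
growth = clusters["six-node"]["idle_mbit"] / clusters["two-node"]["idle_mbit"]
node_ratio = clusters["six-node"]["nodes"] / clusters["two-node"]["nodes"]
print(f"nodes grew {node_ratio:.0f}x, idle heartbeat traffic grew {growth:.0f}x")
```

The number of nodes only triples, but the idle traffic grows far more than that, which fits the observation that the load on the cluster matters as well as the node count.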

 

 

Why you should have a network for cluster heartbeat only!

One thing I often see in my day-to-day work is that customers forget to set up a dedicated cluster network and instead run the heartbeat over other networks such as live migration or management.

In today's blog post I want to explain why you should use a separate cluster network and what you need to configure to get it running.

First, how does a cluster heartbeat work? You can think of it like your own heartbeat. Every cluster node sends a heartbeat every second and asks the other nodes for their status. If 5 heartbeats fail within 10 seconds, the cluster removes the host and migrates its workloads.
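The rule described above can be sketched in a few lines of Python. This is only a toy model of the idea (one heartbeat per second, remove the node after 5 misses within 10 seconds), not how the cluster service is actually implemented. The class `HeartbeatMonitor` and the helper `first_removal` are hypothetical names made up for this example.

```python
from collections import deque


class HeartbeatMonitor:
    """Toy model: flag a node once too many heartbeats fail within a sliding window."""

    def __init__(self, max_missed=5, window_seconds=10):
        self.max_missed = max_missed          # 5 failed heartbeats ...
        self.window_seconds = window_seconds  # ... within 10 seconds
        self.missed = deque()                 # seconds at which a heartbeat failed

    def record(self, second, received):
        """Record one heartbeat result; return True if the node should be removed."""
        if not received:
            self.missed.append(second)
        # Forget misses that have fallen out of the sliding window.
        while self.missed and self.missed[0] <= second - self.window_seconds:
            self.missed.popleft()
        return len(self.missed) >= self.max_missed


def first_removal(pattern):
    """Return the second at which the node would be removed, or None."""
    monitor = HeartbeatMonitor()
    for second, received in enumerate(pattern):
        if monitor.record(second, received):
            return second
    return None


healthy = [True] * 30                    # every heartbeat arrives
blocked = [True] * 3 + [False] * 27      # link is blocked from second 3 onwards
flaky = [s % 3 != 0 for s in range(30)]  # every third heartbeat is lost

for name, pattern in (("healthy", healthy), ("blocked", blocked), ("flaky", flaky)):
    print(name, first_removal(pattern))
```

In this model an occasional lost heartbeat is tolerated, but once the link stays blocked the node is gone within a few seconds.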

So what happens if you put the cluster heartbeat on, for example, the live migration network? When a live migration starts, the migration traffic crowds out the heartbeat, the cluster heartbeat fails, and both your live migration and the cluster node fail.

OK, now some MVPs and IT pros say you can use other networks as fallback heartbeat networks. Yes, you can have fallbacks, BUT the cluster will try 3 times to get the heartbeat through and then switch to the other network. Normally the heartbeat will fail there too.

In your own interest, you should use a dedicated cluster network.

Now let's look at the options you have for creating a cluster network.

1. You can use a physical NIC team for your cluster network

Cluster01

2. You can share a NIC team, for example with management, by adding an additional VLAN tag on the team

Cluster02

3. For Hyper-V you can create an additional virtual NIC for cluster traffic

Cluster03

 

After you have created your cluster network, you need to take a few more steps to guarantee bandwidth for the cluster heartbeat.

1. Enable QoS (Quality of Service) on your network for the cluster network

2. Configure the network connection binding and cluster communication priority as described in my last blog post, "How to configure cluster traffic priority on a windows server"

3. On a Hyper-V host or with Virtual Machine Manager, you need to set a minimum bandwidth for the cluster network interface. I normally use a minimum of 5 to 10%.

PowerShell for Hyper-V Host

In VMM you use the Hyper-V port profile “Cluster Workload”.
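To get a feeling for what a minimum of 5 to 10% means in practice, here is a quick Python calculation. The link speeds of 1 and 10 Gbit/s are assumptions for illustration and not taken from the customer environment. The 150 Mbit/s peak is the heartbeat traffic I saw on the six-node cluster during live migrations. The sketch also assumes the reservation is a plain percentage of the link speed, which is a simplification.

```python
def reserved_mbit(link_mbit, percent):
    """Bandwidth guaranteed by a minimum-bandwidth reservation, in Mbit/s."""
    return link_mbit * percent / 100


PEAK_HEARTBEAT_MBIT = 150  # peak seen on the six-node cluster during live migrations

for link_mbit in (1_000, 10_000):  # assumed link speeds: 1 Gbit/s and 10 Gbit/s
    for percent in (5, 10):
        reserve = reserved_mbit(link_mbit, percent)
        verdict = "covers" if reserve >= PEAK_HEARTBEAT_MBIT else "does not cover"
        print(
            f"{percent}% of {link_mbit} Mbit/s = {reserve:.0f} Mbit/s, "
            f"{verdict} the {PEAK_HEARTBEAT_MBIT} Mbit/s peak"
        )
```

Under these assumptions, 5 to 10% is comfortable on a 10 Gbit/s link but tight on 1 Gbit/s for a cluster of that size, so it is worth checking your own heartbeat traffic before you pick a value.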

 

So that should do the trick. 🙂