Book Tip: Introduction to Windows Server Failover Clustering

Hi everybody,

Symon Perriman, Vice President at 5nine Software, has published a book with some useful best practices for configuring Windows Server Failover Clusters. 🙂

Great stuff and a good read.

You can download it here.

Where to find logs from cluster role movement?

Hi everybody,

As you know, it is sometimes necessary to get more information about where and when roles within a cluster were successfully moved from one node to another.

Examples are Hyper-V live migrations or role moves on cluster servers that host mixed roles.

When errors or warnings occur during role migrations, you can see the events directly in Failover Cluster Manager.


But what if I want to see successful role moves? Where do I find them?

Those messages are a bit hidden in the event log structure of a Windows Server.

  1. Navigate to Applications and Services Logs\Microsoft\Windows\FailoverClustering\Operational
  2. There you will find several events, depending on whether you are on the node that sent the role or the node that received it.
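
The same log can also be read from PowerShell. The following is a minimal sketch, assuming the channel name `Microsoft-Windows-FailoverClustering/Operational` and an arbitrary 24-hour look-back window. It does not filter on specific event IDs, so the role-move entries still have to be picked out of the output.

```powershell
# Channel behind "Applications and Services Logs\Microsoft\Windows\FailoverClustering\Operational".
$logName = 'Microsoft-Windows-FailoverClustering/Operational'

# Assumption for this sketch: only the last 24 hours are of interest.
$since = (Get-Date).AddHours(-24)

# Read the events from the local node, oldest first.
Get-WinEvent -FilterHashtable @{ LogName = $logName; StartTime = $since } |
    Sort-Object TimeCreated |
    Format-List TimeCreated, MachineName, Id, LevelDisplayName, Message
```

Run it on both the sending and the receiving node (or add `-ComputerName` to `Get-WinEvent`), because each side logs its own part of the move.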

Hyper-V|W2k12R2|4x1GB|2xFC

Hyper-V cluster network configuration with the following parameters:

The following configuration leverages 4x 1 Gbit/s Ethernet and 2x Fibre Channel connections. The storage can be connected via Fibre Channel with MPIO. The configuration uses physical switches plus a software-defined / converged network for Hyper-V.


 Pros and cons of this solution

Pro
– High bandwidth for VMs
– Good bandwidth for storage
– Redundant (fault tolerant)
– Can be used in switch-independent or LACP (with stacked switches) teaming mode
– Fibre Channel is the most common SAN technology

Con
– Limited bandwidth for live migration
– A lot of technologies involved

 Switches

Switch name | Bandwidth | Switch type
1GBE SW01 | 1 Gbit/s | physical, stacked or independent
1GBE SW02 | 1 Gbit/s | physical, stacked or independent
FC SW01 | 4/8 Gbit/s FC | physical, stacked or independent
FC SW02 | 4/8 Gbit/s FC | physical, stacked or independent
SoftSW01 | 1 Gbit/s | software defined / converged
SoftSW02 | 1 Gbit/s | software defined / converged

 Necessary networks

Network name | VLAN | IP network (IPv4) | Connected to switch
Management | 100 | 10.11.100.0/24 | SoftSW01
Cluster | 101 | 10.11.101.0/24 | SoftSW01
Livemigration | 450 | 10.11.45.0/24 | SoftSW01
Virtual Machines | 200 – x | 10.11.x.x/x | SoftSW02

 Possible rear view of the server

[Image: rear view of the server]


 Schematic representation

[Image: schematic representation]

Switch port configuration

[Image: switch port configuration]

Bandwidth configuration vNICs

vNIC | Min. bandwidth weight | PowerShell command
Management | 20% |
Cluster | 10% |
Livemigration | 40% |
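
The weights in the table could be applied to the management OS vNICs roughly as follows. This is a sketch under assumptions, not a command taken from the original table: the vNICs are assumed to be named like the networks above, and the virtual switch is assumed to have been created with `-MinimumBandwidthMode Weight`.

```powershell
# vNIC names are assumed to match the network names used in this post.
$weights = @{
    'Management'    = 20
    'Cluster'       = 10
    'Livemigration' = 40
}

foreach ($name in $weights.Keys) {
    # -ManagementOS targets host vNICs rather than VM network adapters.
    Set-VMNetworkAdapter -ManagementOS -Name $name -MinimumBandwidthWeight $weights[$name]
}

# Show the resulting weights for verification.
Get-VMNetworkAdapter -ManagementOS |
    Select-Object Name, @{ Name = 'MinWeight'; Expression = { $_.BandwidthSetting.MinimumBandwidthWeight } }
```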

QoS configuration switch

Network name | Priority
Management | medium
Cluster | high
Livemigration | medium
VMs | depending on VM workload

 

Small scripts to configure live migration on Hyper-V hosts & clusters

Hi everybody,

Based on blog posts by John Savill and Ben Armstrong, I developed two small scripts that configure live migration on Hyper-V hosts and clusters and save some configuration time. I will not provide my whole script, but I think the two snippets will help you too 🙂

Run the first one on all Hyper-V hosts.
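
The host-side settings might look like the sketch below. It is not the author's script. Kerberos authentication, compression, two simultaneous migrations, and the subnet (taken from the Livemigration network in the post above) are all assumptions chosen for illustration.

```powershell
# Assumptions for this sketch: Kerberos authentication (needs constrained delegation
# in Active Directory), compression, and a dedicated live migration subnet.
$migrationSubnet = '10.11.45.0/24'

# Allow incoming and outgoing live migrations on this host.
Enable-VMMigration

# Authentication, transport option, and number of simultaneous migrations.
Set-VMHost -VirtualMachineMigrationAuthenticationType Kerberos `
           -VirtualMachineMigrationPerformanceOption Compression `
           -MaximumVirtualMachineMigrations 2 `
           -UseAnyNetworkForMigration $false

# Restrict live migration traffic to the dedicated subnet.
Add-VMMigrationNetwork -Subnet $migrationSubnet
```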

The next one you run on the cluster itself. You only need to run one, depending on what suits you best.
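
A minimal sketch of two cluster-side variants follows. It is not the author's script, and it assumes the cluster network is named "Livemigration". It sets the `MigrationExcludeNetworks` or `MigrationNetworkOrder` parameter of the "Virtual Machine" resource type. The `$useExcludeList` switch is a hypothetical helper that picks one variant.

```powershell
Import-Module FailoverClusters

# Assumption: the cluster network used for live migration is named "Livemigration".
$lmNetworkName  = 'Livemigration'
# Hypothetical switch for this sketch: $true = variant 1, $false = variant 2.
$useExcludeList = $true

$vmType = Get-ClusterResourceType -Name 'Virtual Machine'

if ($useExcludeList) {
    # Variant 1: exclude every cluster network except the live migration network.
    $excludeIds = (Get-ClusterNetwork | Where-Object { $_.Name -ne $lmNetworkName }).Id
    $vmType | Set-ClusterParameter -Name MigrationExcludeNetworks -Value ($excludeIds -join ';')
}
else {
    # Variant 2: keep all networks allowed, but put the live migration network first.
    $ordered = @(Get-ClusterNetwork | Where-Object { $_.Name -eq $lmNetworkName }) +
               @(Get-ClusterNetwork | Where-Object { $_.Name -ne $lmNetworkName })
    $vmType | Set-ClusterParameter -Name MigrationNetworkOrder -Value ($ordered.Id -join ';')
}
```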

 

 

Cluster Manager, Server Manager & Hyper-V Console not starting

This week I had a very strange issue with a Hyper-V Cluster managed by Virtual Machine Manager.

Different cluster nodes failed completely at random, and I wasn’t able to start Failover Cluster Manager on one of the cluster nodes. On the affected node itself, I wasn’t able to open Hyper-V Manager or Server Manager.

After a lot of research I found a post from the Windows Server core team that pointed me to the solution.

Unable to launch Cluster Failover Manager on any node of a 2012/2012R2 Cluster

When Failover Cluster Manager is opened to manage a Cluster, it will contact all the nodes and retrieve Cluster configuration information using WMI calls. If any one of the nodes in the Cluster does not have the cluster namespace “root\mscluster” in WMI, Failover Cluster Manager will fail and give one of the below errors:

[Image: first error dialog]

Or,

[Image: second error dialog]

Unfortunately, it does not give any indication of which node is missing the WMI namespace.  One of the ways you can check to see which one has it missing is to run the below command on each node of the Cluster.

It can be a bit tedious and time consuming if you have quite a few nodes, say like 64 of them.  The below script can be run on one of the nodes that will connect to all the other nodes and check to see if the namespace is present.  If it is, it will succeed.  If the namespace does not exist, it will fail.
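
The script from the quoted post is not reproduced here. As a stand-in, the following minimal sketch performs the same kind of check. It assumes the FailoverClusters module is available and that remote WMI access to the nodes is allowed, and it looks for the `MSCluster` child namespace under `root` instead of querying a cluster class.

```powershell
Import-Module FailoverClusters

foreach ($node in Get-ClusterNode) {
    # __NAMESPACE instances in "root" list the child namespaces on that machine.
    $ns = Get-WmiObject -ComputerName $node.Name -Namespace 'root' -Class '__NAMESPACE' `
              -Filter "Name='MSCluster'" -ErrorAction SilentlyContinue

    if ($ns) {
        Write-Host "$($node.Name): root\MSCluster present" -ForegroundColor Green
    }
    else {
        # An unreachable node lands here as well, so check connectivity before recompiling.
        Write-Host "$($node.Name): root\MSCluster missing or node unreachable" -ForegroundColor Red
    }
}
```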


In the below example, you can see that one of the nodes failed.

To correct the problem, you would need to run the below from an administrative command prompt on the “failed” node(s).

cd c:\windows\system32\wbem
mofcomp.exe cluswmi.mof

Once the Cluster WMI has been added back, you can successfully open Failover Cluster Management.  There is no restart of the machine or the Cluster Service needed.

Quote: Microsoft Ask the Core Team Blog

In my case it wasn’t that easy to fix, because the server vendor (Fujitsu, for those interested) implemented the WMI provider directly in its BMC via an agent. While recompiling the cluster WMI classes, all of the server's network interfaces and the BMC failed.

So my fix:

  1. Shut down the server
  2. Disconnect it from power
  3. Start it
  4. Check the cluster (everything fine)
  5. Uninstall the (fucking) agent

Since then it has worked.