Azure Stack RTM PoC Deployment stops @ Step 60.120.121 – deploy identity provider

Hello Community,

Some of you may have encountered the following issue during the deployment of the Azure Stack RTM PoC.

Let's look at the field configuration:

  1. One server HP DL360 G8
  2. NIC Type 1GBE Intel i360 (HP OEM Label)
  3. Two public IPv4 addresses published directly to the host, with the host configured as an exposed host in the border gateway firewalls
  4. No Firewall Rules for that host on the gateways
  5. Switchports for that host configured as Trunk/Uplink ports with VLAN tagging enabled
  6. We use Azure AD for Authentication

In my case, the important point is the port trunk and the VLAN tagging.

Normally, VLAN tagging is not an issue because the deployment toolkit should set the tag automatically during deployment for all required VMs and for the host system.

In my case, and during many test and validation deployments, that didn’t happen. After I start the deployment, a new virtual switch is deployed and a virtual NIC named “deployment” is configured for the host. Afterwards the deployment starts. Around 3 hours later, the deployment stops at step 60.120.121 because it cannot connect to the identity provider.

What is the reason for the failure?

First, you should know that the Azure Stack deployment switches between the host and the BGPNAT VM for internet communication. Almost all traffic runs through the NAT VM, but in this step the host communicates directly with the internet.

So what happened? After creating the “deployment” NIC for the host, the deployment tool did not set the VLAN tag on that virtual NIC. That breaks the network communication for the host; the VMs are not affected, because the VLAN is set correctly on the NAT VM.

What is the Workaround?

  1. Start the deployment and configure it as normal
  2. Let the deployment run into the failure
  3. Open a new PowerShell with admin permissions (Run as Administrator)
  4. Type in the command that sets the VLAN ID on the host’s “deployment” vNIC (see the sketch below the list)
  5. Rerun the deployment from the installation folder (see the sketch below the list)
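
A minimal sketch of those two steps, assuming the host vNIC is still named "deployment", using a placeholder VLAN ID of 10, and assuming the PoC installer script is InstallAzureStackPOC.ps1 with a -Rerun switch (verify the exact script name and switch in your installation folder):

    # Set the missing VLAN tag on the host vNIC "deployment" (VLAN ID 10 is a placeholder, use your own)
    Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "deployment" -Access -VlanId 10

    # Rerun the deployment from the installation folder (script name and switch assumed, check locally)
    cd C:\AzureStackInstaller          # placeholder for your installation folder
    .\InstallAzureStackPOC.ps1 -Rerun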

Afterwards the deployment runs smoothly.

 

Please be aware that after the installation the VLAN ID is removed again, so you need to set it one more time.

Hyper-V|W2k12R2|4x1GB|2xFC

Hyper-V cluster network configuration with the following parameters:

The following configuration leverages 4x 1GB Ethernet and 2x Fibre Channel connections. The storage can be connected via Fibre Channel with MPIO. The configuration combines physical networking with a software-defined / converged network for Hyper-V.


 Pros and cons of that solution

Pros:
– High bandwidth for VMs
– Good bandwidth for storage
– Fault redundant
– Can be used in switch independent or LACP (with stacked switches) teaming mode
– Fibre Channel is the most common SAN technology

Cons:
– Limited bandwidth for live migration
– A lot of technologies involved

 Switches

Switch name   Bandwidth        Switch type
1GBE SW01     1 GBit/s         physical, stacked or independent
1GBE SW02     1 GBit/s         physical, stacked or independent
FC SW01       4/8 GBit/s FC    physical, stacked or independent
FC SW02       4/8 GBit/s FC    physical, stacked or independent
SoftSW01      1 GBit/s         software defined / converged
SoftSW02      1 GBit/s         software defined / converged

 Necessary Networks

Network name       VLAN      IP network (IPv4)   Connected to switch
Management         100       10.11.100.0/24      SoftSW01
Cluster            101       10.11.101.0/24      SoftSW01
Livemigration      450       10.11.45.0/24       SoftSW01
Virtual Machines   200 – x   10.11.x.x/x         SoftSW02
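
A minimal sketch of creating the first three networks from the table as host vNICs on the converged switch, assuming a virtual switch named SoftSW01 already exists:

    # Host vNICs on the converged switch SoftSW01 (switch assumed to exist)
    Add-VMNetworkAdapter -ManagementOS -SwitchName "SoftSW01" -Name "Management"
    Add-VMNetworkAdapter -ManagementOS -SwitchName "SoftSW01" -Name "Cluster"
    Add-VMNetworkAdapter -ManagementOS -SwitchName "SoftSW01" -Name "Livemigration"

    # VLAN tags taken from the table above
    Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "Management" -Access -VlanId 100
    Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "Cluster" -Access -VlanId 101
    Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "Livemigration" -Access -VlanId 450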

 Possible rear view of the server

 Schematic representation

 Switch Port Configuration

Bandwidth Configuration vNICs

vNIC            Min. bandwidth weight
Management      20%
Cluster         10%
Livemigration   40%
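
A minimal sketch of setting these weights, assuming the converged switch was created with -MinimumBandwidthMode Weight and the vNIC names match the table:

    # Minimum bandwidth weights per host vNIC, matching the table above
    Set-VMNetworkAdapter -ManagementOS -Name "Management"    -MinimumBandwidthWeight 20
    Set-VMNetworkAdapter -ManagementOS -Name "Cluster"       -MinimumBandwidthWeight 10
    Set-VMNetworkAdapter -ManagementOS -Name "Livemigration" -MinimumBandwidthWeight 40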

QoS Configuration Switch

Network name    Priority
Management      medium
Cluster         high
Livemigration   medium
VMs             depending on VM workload

 

Hyper-V|W2k12R2|8x1GB|2x10GB

Hyper-V cluster network configuration with the following parameters:

The following configuration leverages 8x 1GB Ethernet NICs / LOM (LAN on Motherboard) ports and 2x 10GB Ethernet NICs. The storage can be connected via iSCSI with MPIO or via SMB 3.x without RDMA. The configuration combines physical networking with a software-defined / converged network for Hyper-V.
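
As a rough sketch (adapter, team and switch names are assumptions), the two 10GB NICs could be teamed and turned into a converged switch like this:

    # Team the two 10GbE NICs (adapter names are placeholders, check Get-NetAdapter for yours)
    New-NetLbfoTeam -Name "Team-10GbE" -TeamMembers "10GbE-1","10GbE-2" -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic

    # Converged virtual switch on top of the team, with weight-based minimum bandwidth
    New-VMSwitch -Name "SoftSW01" -NetAdapterName "Team-10GbE" -MinimumBandwidthMode Weight -AllowManagementOS $false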


 Pros and cons of that solution

Pros:
– Good bandwidth for VMs
– Good bandwidth for storage
– Separated NICs for live migration, cluster and management
– Fully fault redundant
– Can be used in switch independent or LACP (with stacked switches) teaming mode
– Only one hardware technology is used

Cons:
– The network becomes a limit with a large number of VMs
– The combination of hardware-defined and software-defined networking is sometimes hard to understand

 Switches

Switch name   Bandwidth    Switch type
1GBE SW01     1 GBit/s     physical, stacked or independent
1GBE SW02     1 GBit/s     physical, stacked or independent
10GBE SW01    10 GBit/s    physical, stacked or independent
10GBE SW02    10 GBit/s    physical, stacked or independent
SoftSW01      10 GBit/s    software defined / converged
SoftSW02      10 GBit/s    software defined / converged

 Necessary Networks

Network name           VLAN      IP network (IPv4)               Connected to switch
Management             10        10.11.10.0/24                   SoftSW01
Cluster                11        10.11.11.0/24                   SoftSW01
Livemigration          45        10.11.45.0/24                   1GBE SW01 / 1GBE SW02
With iSCSI – Storage   40        10.11.40.0/24                   10GBE SW01 / 10GBE SW02
With SMB – Storage     50 / 51   10.11.50.0/24 / 10.11.51.0/24   10GBE SW01 / 10GBE SW02
Virtual Machines       200 – x   10.11.x.x/x                     1GBE SW01 / 1GBE SW02
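
For the iSCSI variant, a rough sketch of bringing up the storage connection with MPIO; the portal address is a placeholder in the 10.11.40.0/24 storage network and a single target is assumed:

    # Install and enable MPIO for iSCSI devices (one-time per host, a reboot may be required)
    Install-WindowsFeature -Name Multipath-IO
    Enable-MSDSMAutomaticClaim -BusType iSCSI

    # Register the target portal on the storage VLAN and connect with multipath and persistence
    New-IscsiTargetPortal -TargetPortalAddress "10.11.40.10"
    $target = Get-IscsiTarget          # assumes a single target is returned
    Connect-IscsiTarget -NodeAddress $target.NodeAddress -IsMultipathEnabled $true -IsPersistent $true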

 Possible rear view of the server

 Schematic representation

 Switch Port Configuration

Bandwidth Configuration vNICs

vNIC         Min. bandwidth weight
Management   10%
Cluster      5%

QoS Configuration Switch

Network name    Priority
Management      medium
Cluster         high
Storage         high
Livemigration   medium
VMs             depending on VM workload

 

Why you should have a network for cluster heartbeat only!

One topic I often see in my day-to-day work is that customers forget to use a dedicated cluster network and instead run cluster traffic on other networks like live migration or management.

With my blog post today I want to explain why you should use a separate cluster network and what you should configure to get it running.

First, how does a cluster heartbeat work? You can think of it like your own heartbeat. Every cluster node sends a heartbeat every second and asks the other nodes for their status. If 5 heartbeats fail within 10 seconds, the cluster will remove the host and migrate its workloads.
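
You can check those settings on your cluster and relax them if needed; a small sketch (the threshold value is only an example):

    # Show the configured heartbeat interval (delay, in ms) and the tolerated number of missed heartbeats
    Get-Cluster | Format-List *SubnetDelay*, *SubnetThreshold*

    # Example: tolerate 10 missed heartbeats before a node is removed (value is an assumption for illustration)
    (Get-Cluster).SameSubnetThreshold = 10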

So what happens if you put the cluster heartbeat on, for example, the live migration network? When a live migration starts and saturates the link, the cluster heartbeat will fail, and both your live migration and your cluster node will fail.

OK, now some MVPs and IT pros will say that you can use other networks as fallback heartbeat networks. Yes, you can have fallbacks, BUT the cluster will try 3 times to get the heartbeat through and only then change to the other network. Normally the heartbeat will fail there too.

In your own interest, you should use a dedicated cluster network.

Now let's look at the options you have to create a cluster network.

1. You can use a physical NIC team for your cluster network.

2. You can share a NIC team by adding an additional VLAN-tagged team interface, for example together with management (see the sketch below the list).

3. On Hyper-V you can create an additional virtual NIC for cluster traffic (see the sketch below the list).
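
For options 2 and 3, a minimal sketch (team name, switch name and VLAN ID are assumptions):

    # Option 2: additional VLAN-tagged team interface on an existing NIC team
    Add-NetLbfoTeamNic -Team "Team1" -VlanID 101

    # Option 3: additional host vNIC for cluster traffic on an existing Hyper-V converged switch
    Add-VMNetworkAdapter -ManagementOS -SwitchName "SoftSW01" -Name "Cluster"
    Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "Cluster" -Access -VlanId 101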

 

After you have created your cluster network, you need to do a few more steps to guarantee bandwidth for the cluster heartbeat.

1. Enable QoS (Quality of Service) on your network for the cluster network.

2. Configure the network connection binding and the cluster communication priority as described in my last blog post How to configure cluster traffic priority on a Windows Server.

3. On a Hyper-V host, or with Virtual Machine Manager, set a minimum bandwidth for the cluster network interface. I normally use a minimum of 5 to 10%.

PowerShell for Hyper-V Host
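
A minimal sketch, assuming the cluster vNIC is named "Cluster" and the virtual switch was created with weight-based minimum bandwidth:

    # Reserve a minimum share of bandwidth for the cluster vNIC (5% as suggested above)
    Set-VMNetworkAdapter -ManagementOS -Name "Cluster" -MinimumBandwidthWeight 5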

In VMM you use the Hyper-V port profile “Cluster Workload”.

 

So that should do the trick. 🙂

How to configure cluster traffic priority on a Windows Server

While writing my current cluster network series, I noticed a few points that people often miss when configuring a Microsoft cluster via Failover Cluster Manager.

One thing is that they do not prioritize the cluster networks against each other and do not change the routing interface.

The following task must be done on every cluster server. We change the connection settings so that our routed traffic goes over the management interface first.

1. Navigate to your network adapter properties and open the advanced settings in the menu bar.

2. In the next menu, move your management interface, which holds your gateway, to the highest position.

 

So that's all for the routing part. For the next part you connect to a cluster node; you only need to do that operation once per cluster.

1. Check your cluster networks. All networks should be up and running.

2. Now you need to open and change the network metrics. The lowest metric means that the cluster network has the highest priority. I recommend giving the cluster heartbeat traffic the highest priority, because if that traffic fails, your node will go down within the cluster.

I set that configuration on a Scale-Out File Server, so my traffic will be prioritized as follows:

high -> Cluster – Storage 01 – Storage 02 – Management -> low

So that means I need to run the following script:
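
A sketch of what that script could look like, using the network names above; the metric values are assumptions that simply preserve the order:

    # Lower metric = higher priority; the values only need to keep the order above
    (Get-ClusterNetwork "Cluster").Metric    = 100
    (Get-ClusterNetwork "Storage 01").Metric = 200
    (Get-ClusterNetwork "Storage 02").Metric = 300
    (Get-ClusterNetwork "Management").Metric = 1000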

You can check the result with:
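
For example, something like this:

    # List the cluster networks sorted by metric to verify the priority order
    Get-ClusterNetwork | Sort-Object Metric | Format-Table Name, Metric, AutoMetric, Role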

The result should list your cluster networks with the metrics you just set, in the order above.

That's all. These are only small changes, but they greatly improve the stability of your clusters.