Welcome back to my series on our journey of testing Nutanix and Mellanox. Parts 1 and 2 of the series focused on Nutanix AHV networking and integration with Mellanox, so in Part 3 we're going to shift and look at the AHV configuration for getting Data Protection going and performing a failover test.
As part of our lab test, we will be replicating VMs between our two Nutanix clusters to see how AHV VMs can be migrated between clusters and restored as needed. For the purposes of our scenario, we will be using the following networks:
Prism Network Name
Setting up Data Protection within Prism is very easy to accomplish, with only a few steps to get through. Let's take a look at those steps below:
- Configure Remote Site settings on each Cluster
- Remote Site Name
- Capabilities (DR or Backup)
- IP Address and Port
- Configure Mappings for Network and vStore
- Mappings are 1:1 for each entity
- Networks will be chosen from a dropdown (in our testing, the same name appears to be required on both sides)
- Create Protection Domain (AsyncDR)
- Entities (VMs)
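If you ever want to script the first of these steps, remote-site creation can also be driven through Prism's REST API. The sketch below only builds the request body; the field names, the capability string, and the default replication port of 2020 are assumptions on my part, so verify them against the REST API Explorer in your Prism build before using anything like this.

```python
import json

# Hypothetical helper: build a request body for creating a Remote Site.
# Field names and the default port (2020) are assumptions -- check the
# Prism REST API Explorer for the exact schema on your cluster.
def build_remote_site_payload(name, ip, port=2020, capability="DISASTER_RECOVERY"):
    if not name:
        raise ValueError("remote site name is required")
    return {
        "name": name,                     # Remote Site Name
        "capabilities": [capability],     # DR or Backup
        "remote_ip_ports": {ip: port},    # IP address and replication port
    }

# The cluster IP below is a hypothetical example address.
payload = build_remote_site_payload("DR_NTNX_POC", "10.0.212.50")
print(json.dumps(payload, indent=2))
```

The same payload-building pattern applies to the mapping and Protection Domain steps further down.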
Let’s take a deeper dive into each of these areas.
Remote Site Settings
Configuring a Remote site is extremely simple in Prism. From the Data Protection dashboard, click the button to create a Remote Site.
You have the option to replicate to AWS or Azure for Data Protection, but for the purpose of this article we are focusing on sending to another Nutanix cluster.
Enter the remote site name, leave capabilities as Disaster Recovery, and enter the IP Address and port of the DR cluster.
On the Settings screen, set a default bandwidth throttle if needed, as well as the option to compress on the wire. For this lab, since we have ample bandwidth, we will not be performing any throttling or compression. We then choose our network mapping (from the table above) and map the vStore settings for each cluster.
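The 1:1 constraint on network and vStore mappings is easy to model. This is purely an illustration of the rule, not Nutanix code, and the network and container names are hypothetical placeholders:

```python
# Each source entity must map to exactly one destination, and no
# destination may be reused -- i.e. the mapping must be one-to-one (1:1).
def validate_mappings(mapping):
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("mappings must be 1:1 (a destination is reused)")
    return True

# Hypothetical names; in our testing the network names matched on both sides.
network_map = {"VM-Network": "VM-Network"}
vstore_map = {"default-container": "default-container"}
print(validate_mappings(network_map) and validate_mappings(vstore_map))
```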
Once entered on both sides, we can validate that the Remote Sites have peered by viewing the summary details. The cluster value for the remote site should match the last portion of the CLUSTER ID and the CLUSTER INCARNATION ID.
Create Protection Domain
Enter a name for the Protection Domain, something unique to the group of VMs being protected.
On the Entities screen, add the VMs that should be protected by this Domain, and either use the Entity name or create a new Consistency Group name, and click the Protect Selected Entities button to add them to the group.
On the Schedule screen, configure one or more schedules for this Protection Domain. This can be done by minutes/hours/days, or weekly/monthly. Note that the minutes option has a minimum value of 60 minutes.
In our case, we will be configuring the retention policy to keep 2 snapshots locally, with 2 replicated to our remote cluster.
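To make the schedule and retention behavior concrete, here is a small model of what we configured: a 60-minute minimum interval, keep 2 snapshots locally, and replicate 2 to the remote cluster. The function and field names are our own shorthand, not a Nutanix API:

```python
# Prism enforces a 60-minute minimum for the minutes-based schedule option.
MIN_INTERVAL_MINUTES = 60

def make_schedule(interval_minutes, local_retention=2, remote_retention=2):
    if interval_minutes < MIN_INTERVAL_MINUTES:
        raise ValueError(f"minimum interval is {MIN_INTERVAL_MINUTES} minutes")
    return {"every_minutes": interval_minutes,
            "keep_local": local_retention,
            "keep_remote": remote_retention}

def prune(snapshots, keep):
    """Retention: keep only the newest `keep` snapshots (oldest-first list)."""
    return snapshots[-keep:]

sched = make_schedule(60)
local = prune(["snap1", "snap2", "snap3"], sched["keep_local"])
print(local)  # ['snap2', 'snap3']
```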
Once configured, we can see the initial snapshot taken, as well as replications occurring under the Replications tab on the Production cluster.
We can also validate the replication of the snapshot on our DR cluster by looking at the Replication status for incoming replication, and by also checking the Local Snapshot tab.
Now that we have the VMs associated with a Protection Domain, we can test failover from one Nutanix cluster to the other. We can see from the screenshot below that our 3 VMs are powered on and reachable on the network.
To initiate a planned failover, highlight the Protection Domain on the AsyncDR tab and click the Migrate option.
Since Nutanix supports a 1:1 and 1:Many replication scheme, you are provided with the option to select a Remote site to migrate the Protection Domain to. We’ll select our DR_NTNX_POC Remote site option, and start the migration.
Once the migration begins, a new snapshot will be taken and replicated over to the Remote site.
The VMs will be powered off and unregistered from the Production cluster, then migrated and registered on the DR cluster. By design, the VMs are not powered on once migrated, so any other failover activities can take place before the machines come online.
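The Migrate sequence we just described can be summarized as a simple ordered list of steps. This is a model of the documented behavior (final snapshot, power off, unregister, register at the remote site, and deliberately leave the VMs powered off), not actual Nutanix code, and the VM names are hypothetical:

```python
# Model of the planned-failover (Migrate) sequence for a Protection Domain.
def migrate_protection_domain(pd_vms, remote_site):
    steps = [f"snapshot PD and replicate to {remote_site}"]
    for vm in pd_vms:
        steps.append(f"power off {vm}")
        steps.append(f"unregister {vm} from source cluster")
        steps.append(f"register {vm} on {remote_site}")
        # Note: by design the VM is NOT powered on here, so other
        # failover activities can happen before the machines come online.
    return steps

steps = migrate_protection_domain(["vm1", "vm2", "vm3"], "DR_NTNX_POC")
print(len(steps))  # 10
```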
One thing that remains a manual process with AHV: unless you're using DHCP, a stretched Layer 2 network, or some sort of overlay technology such as Cisco OTV, there is no method to change the IP address of an AHV VM when the networks differ between sites. In our case, we were going from a 10.0.12.0/24 network to a 10.0.212.0/24 network. So the AHV design of not powering on the VMs is a good one here, as it allows us to power on each VM, update its IP address, and make it accessible on the network.
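Since we kept the host portion of each address the same and only moved between /24s, working out the new addresses is mechanical. Here is a small sketch using Python's standard `ipaddress` module; the example VM IP is hypothetical:

```python
import ipaddress

# Keep the host portion of a VM's address and move it from the
# production /24 to the DR /24 (the two subnets from our lab).
def readdress(ip, src_net="10.0.12.0/24", dst_net="10.0.212.0/24"):
    src = ipaddress.ip_network(src_net)
    dst = ipaddress.ip_network(dst_net)
    addr = ipaddress.ip_address(ip)
    if addr not in src:
        raise ValueError(f"{ip} is not in {src_net}")
    host = int(addr) - int(src.network_address)
    return str(ipaddress.ip_address(int(dst.network_address) + host))

print(readdress("10.0.12.45"))  # 10.0.212.45
```

The actual change still has to be made inside each guest OS by hand; this only tells you what the new address should be.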
Reversing this operation is as simple as clicking the Migrate button on the AsyncDR tab again, and migrating the VMs back to the alternate cluster. The same process will occur, with a snapshot being taken based on our Retention policies, and replicated to the alternate cluster. We can see the Replicated Snapshot on the Production cluster that was replicated from the DR cluster below.
And that pretty much covers using Data Protection to replicate VMs between Nutanix clusters. Had we gone into a true disaster situation where the Production cluster was lost, rather than using the Migrate option we would use the Activate option from the DR cluster. This would bring the VMs online from the last snapshot, after which we could clean up the Production cluster and restart replication.
I hope you found this article useful. I'm finding that the more I use AHV, the more I like it, and I find it can fill a segment of our customers who don't need or want to pay for all the VMware vSphere features, yet need functionality that will still allow them to run the business without worry of failure.
Thanks for reading! I'm trying to decide what the next entry in the Nutanix/Mellanox series will be; if you have any suggestions, let me know!
Nutanix and Mellanox Series:
- Nutanix and Mellanox – A Journey
- Part 1: Lab Setup and Networking
- Part 2: Nutanix Network Configuration
- Part 3: Acropolis Data Protection Configuration
- Part 4: Prism Central Deployment and Configuration