Using Veeam Replication to Move Datacenters: Lessons Learned

Monday, April 30, 2018


I recently participated in a small datacenter move that involved migrating a few dozen VMs from a legacy EqualLogic system to some shiny new Hyper-Converged Infrastructure (HCI). For a variety of reasons (budget, licensing, timeframe, criticality), we decided to use Veeam Backup & Replication, specifically its replication features, to replicate VMs from the old environment to the new one. In our situation, we were not changing IPs or anything, but we did not want to connect the new vCenter infrastructure running on the HCI cluster with the old legacy environment. This was really just a lift-and-shift type of move. To control downtime, the plan was to use replication jobs to seed the VMs and keep up with delta changes until the final cutover, so that even for large multi-terabyte VMs the outage window could be managed and minimized.

While doing this migration, I jotted down some general notes and lessons learned so that your next migration, and mine, will go that much smoother.

Verify all systems are in working condition – If something (virtual machine, OS or application) is broken before the migration, it will most likely be just as broken, or more so, after the migration. Verify the working condition of applications and VMs BEFORE migrating so you are not chasing ghosts AFTER the migration.

Take the time to spring clean the storage before the migration. Make sure everything you are migrating actually needs to be migrated. See if you can get rid of those old ISOs, images, unused dev boxes and abandoned POCs. You will not only save time on the actual data migration but also give the new environment a nice clean baseline to operate from. Garbage in, garbage out, as they say, so try to limit the intake as much as possible.

This one might seem super obvious, but make sure you have enough storage space on the target side of the replication. In the case of Nutanix clusters, we made sure we had compression and deduplication licensed, configured and ready to go before sending the data across, so that the data could be reduced inline rather than having to spend cycles crunching it down after the fact.
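A quick capacity check can be sketched in a few lines of Python. This is only an illustration: the VM sizes, the free-space figure and the 20% headroom are made-up assumptions, and the check deliberately ignores any savings from compression or deduplication so that it errs on the conservative side.

```python
def fits_on_target(vm_sizes_gb, target_free_gb, headroom=0.2):
    """Return (fits, required_gb) for a list of VM sizes in GB.

    headroom is an assumed safety margin (20% by default) added on top of
    the raw VM sizes; compression/dedup savings are intentionally ignored.
    """
    required_gb = sum(vm_sizes_gb) * (1 + headroom)
    return required_gb <= target_free_gb, required_gb


# Hypothetical numbers for illustration only.
sizes_gb = [500, 2048, 120]
fits, required_gb = fits_on_target(sizes_gb, target_free_gb=4000)
print(f"required: {required_gb:.0f} GB, fits: {fits}")
```

If the check fails, that is a good prompt to go back to the spring cleaning step before buying more disk.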

Verify all your support contacts – This includes not just vendor contract numbers but also internal support phone numbers and after-hours access methods. Calling someone who can help resolve your issue in a few minutes is often better than spending hours trying to unravel it yourself.

For phased, staged or multipart migrations, organization is key. My personal preference is to work off a spreadsheet that lists all the VMs and their pertinent information. This includes IP addresses, sizes, DNS names and any other information needed to rebuild the VMX or NIC configurations. I also typically use columns or color-coded rows to show migration status (Seeded/Completed/Failed).
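If the spreadsheet is exported to CSV, a short script can summarize where the migration stands. The sketch below assumes a hypothetical layout with name, ip, dns, size_gb and status columns; the sample rows are invented, so adjust the column names to match your own sheet.

```python
import csv
import io
from collections import Counter

# Hypothetical export of the tracking spreadsheet (sample data only).
SAMPLE = """name,ip,dns,size_gb,status
app01,10.0.0.11,app01.example.local,200,Seeded
db01,10.0.0.12,db01.example.local,2048,Completed
web01,10.0.0.13,web01.example.local,80,Failed
"""


def status_summary(csv_text):
    """Count VMs per migration status from CSV text with a 'status' column."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["status"] for row in rows)


summary = status_summary(SAMPLE)
for status in ("Seeded", "Completed", "Failed"):
    print(status, summary[status])
```

To run it against a real export, read the file contents and pass them to status_summary instead of SAMPLE.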

Make sure you know the local Administrator passwords for all the systems. During a migration, domain membership can become stale or unavailable, and the fastest way to revive a system is to log in locally and rejoin the domain. Good luck if you don’t have that local administrator password. Hacking and cracking passwords are options, but you don’t want to do that more than once or twice unless you plan on sleeping in the datacenter. (Spoiler alert: it’s time consuming.)

For Veeam specifically, remember that digests do not transfer from job to job. If you start your replication with one job and a few days later decide you need to recreate the job (maybe to split up the VMs), Veeam will have to walk the VMDKs again. The replica created earlier can be reused (via replica mapping), but the disks still have to be read in full to rebuild the digests, and that can be time consuming for large virtual machines.

Be sure to hold or cancel any normal backup jobs that might contend for resources on machines you are trying to replicate.  This is especially important when timing and deadlines are critical.  Weigh the risks associated with the backups being cancelled or held against the cost of the contention in terms of timing of your migration.

Make sure your Veeam server and backup proxies have enough cores and resources. The recommendation is to keep concurrent tasks at 1:1 with the cores. If your Veeam proxy has too few cores, large VMs can hold up other replication jobs that could otherwise have been running in parallel.
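To get a rough feel for the effect of proxy sizing, here is a back-of-the-envelope sketch. It assumes one concurrent task per core (the 1:1 ratio above) and, as a simplifying assumption of mine, one task per virtual disk; the disk and core counts are made up.

```python
import math


def scheduling_waves(total_disks, proxy_cores):
    """Estimate how many 'waves' of tasks are needed to process all disks.

    Assumes one task per virtual disk and one concurrent task per core.
    Real scheduling is more dynamic; this only shows the order of magnitude.
    """
    if proxy_cores < 1:
        raise ValueError("proxy_cores must be at least 1")
    return math.ceil(total_disks / proxy_cores)


# Hypothetical example: 12 disks to replicate.
for cores in (4, 8):
    print(cores, "cores ->", scheduling_waves(12, cores), "waves")
```

It is a crude model, but it makes the point: with too few cores, one multi-terabyte disk can occupy a slot while everything else queues behind it.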

Don’t make really big replication jobs. The way planned failover works, if all your VMs are part of one large job, failover will occur one VM at a time and not in parallel. Setting up individual jobs per VM, or at most per application group, makes the most sense and will give you the most flexibility. For operations at scale, use PowerShell to create all the jobs programmatically.
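Before scripting the job creation itself, it helps to work out which VMs belong in which job. The sketch below only builds that plan, one job per application group with standalone VMs getting their own job; it does not call any Veeam API. The VM names, the group labels and the "Repl-" prefix are all hypothetical.

```python
from collections import defaultdict


def plan_jobs(vms, prefix="Repl"):
    """Group (vm_name, app_group) pairs into a job-name -> [vm names] plan.

    VMs with no application group (None or empty) get a job of their own.
    """
    jobs = defaultdict(list)
    for name, app_group in vms:
        key = app_group if app_group else name
        jobs[f"{prefix}-{key}"].append(name)
    return dict(jobs)


# Hypothetical inventory for illustration only.
inventory = [("web01", "shop"), ("db01", "shop"), ("dc01", None)]
plan = plan_jobs(inventory)
for job_name, members in plan.items():
    print(job_name, members)
```

The resulting plan can then be fed to whatever tooling you use to actually create the replication jobs.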

This list is by no means complete, so please feel free to add your favorite tips or gotchas in the comments.
