VMs present challenges under the best circumstances.
And within the best environments.
While provisioning, licensing and general management come to mind, they certainly aren’t new problems. Rather, it’s the VM sprawl, hyperscale lock-in and hypervisor inaccessibility that are like gremlins itching to make an appearance.
If you’ve been keeping up, then you’re aware of all the good when it comes to working with cloud-based VMs.
So it’s only fair that we dissect a bit of the bad and the ugly.
So Why Are VMs In The Cloud Problematic? (The Bad)
Let’s discuss what additional problems Cloud VMs present:
VM Sprawl
VM sprawl became a thing when we realised that we could create more servers, as we weren’t limited to physical machines.
The only thing that kept engineering armies in check was the limited nature of the underlying hypervisor infrastructure – which would scream at the engineers every so often, letting them know they needed to review and clean up.
The cloud opens the gates completely. There is no limit to how much you can build. With theoretically unlimited resources like memory, CPU and storage, VM Sprawl becomes an amplified issue.
This not only creates a messy environment, but also carries some severe cost implications.
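As a rough illustration of the review-and-clean-up habit the old hypervisor limits used to force, here is a minimal sketch that flags VMs that look abandoned. Everything in it is an assumption made for the example: the `VM` fields, the 30-day and 5% thresholds, and the inventory itself are invented, and a real check would pull this data from your provider's inventory and monitoring tools.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class VM:
    name: str
    owner: str              # empty string means nobody has claimed the VM
    last_login: date        # last login seen on the machine
    avg_cpu_percent: float  # average CPU use over the review window
    monthly_cost: float     # what the VM costs per month


def find_sprawl_candidates(vms, today, max_idle_days=30, cpu_floor=5.0):
    """Return VMs that are unowned, or both idle and barely used."""
    candidates = []
    for vm in vms:
        idle_days = (today - vm.last_login).days
        unowned = vm.owner == ""
        idle = idle_days > max_idle_days and vm.avg_cpu_percent < cpu_floor
        if unowned or idle:
            candidates.append(vm)
    return candidates


# Hypothetical inventory, invented purely for illustration.
inventory = [
    VM("web-01", "team-a", date(2024, 5, 28), 41.0, 120.0),
    VM("test-07", "", date(2024, 5, 30), 12.0, 60.0),
    VM("poc-old", "team-b", date(2024, 1, 15), 1.2, 95.0),
]

flagged = find_sprawl_candidates(inventory, today=date(2024, 6, 1))
flagged_names = [vm.name for vm in flagged]
wasted_per_month = sum(vm.monthly_cost for vm in flagged)
print(flagged_names, wasted_per_month)
```

The thresholds matter less than running a review like this on a schedule, since the cloud will never scream at you the way a full hypervisor did.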
Lack of Planning
A secondary symptom of the unlimited nature of cloud-based VMs is the lack of planning it creates.
Previously, the systems being requested were rigorously architected because physical resources were limited. Requests were often accompanied by a laundry list of boxes that needed to be ticked before a virtual server could be built: licensing, network capacity, lifecycle management, backup and retention periods, and so on.
With the cloud, these are all of lesser concern, which has made diligent engineers more relaxed in their approach to building VMs.
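One way to keep that discipline is to make the old checklist an explicit gate before anything gets built. The sketch below is only an assumption about how such a gate could look: the check names come from the list above, while the request format and the `missing_checks` helper are hypothetical.

```python
# Checks taken from the pre-cloud checklist described above.
REQUIRED_CHECKS = (
    "licensing",
    "network_capacity",
    "lifecycle_management",
    "backup_and_retention",
)


def missing_checks(request):
    """Return the required checks that have not been signed off yet."""
    signed_off = request.get("signed_off", set())
    return [check for check in REQUIRED_CHECKS if check not in signed_off]


# Hypothetical build request, invented for illustration.
request = {
    "name": "reporting-db",
    "signed_off": {"licensing", "network_capacity"},
}

outstanding = missing_checks(request)
ready_to_build = not outstanding
print(outstanding, ready_to_build)
```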
Customisations and Images Are More Difficult in the Cloud
Previously, images could be captured from a reference machine and become the standard build. While this is still achievable in the cloud, additional steps and effort are needed to maintain the lifecycle of these images – leading to increased ad-hoc engineering efforts that leave the infrastructure vulnerable to misaligned machines.
Although this is less of a concern – as engineering teams are normally thorough – it is something that I’ve observed closely across client journeys.
Hyperscale Lock-In
The last ‘gotcha’ to provisioning VMs in the cloud is the fact that hyperscalers often want you to use pre-configured machine specifications or templates.
The reason behind this is actually quite interesting.
It is a side effect of the military-like orchestration hyperscalers apply to their hardware management. These specifications are carefully calculated to squeeze the best performance out of a VM for the tier it is built on.
The problems arise when software or workload requirements do not align with those specifications. The build decision then often takes the VM to the next tier in configuration. That isn’t typically a technical problem, but in a consumption model it does impact cost, and it means the design is often thrown out of the window.
If you design your systems well, you ideally want to make sure you hit that performance sweet spot. But we are, more often than not, forced to over-allocate resources to keep the VM aligned to the excellent orchestration offered by the hyperscaler.
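To make the over-allocation concrete, here is a small sketch of what happens when a workload falls between two fixed tiers. The tier names, sizes and prices are hypothetical and do not match any real hyperscaler catalogue; the point is only the gap between what the workload needs and what the next tier up forces you to buy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    vcpus: int
    memory_gib: int
    hourly_cost: float


# Hypothetical catalogue, invented for illustration.
TIERS = [
    Tier("small", 2, 8, 0.10),
    Tier("medium", 4, 16, 0.20),
    Tier("large", 8, 32, 0.40),
]


def smallest_fitting_tier(vcpus_needed, memory_needed_gib, tiers=TIERS):
    """Pick the cheapest tier that satisfies both CPU and memory needs."""
    fitting = [
        t for t in tiers
        if t.vcpus >= vcpus_needed and t.memory_gib >= memory_needed_gib
    ]
    if not fitting:
        raise ValueError("no tier satisfies the requirement")
    return min(fitting, key=lambda t: t.hourly_cost)


# Assumed workload: 4 vCPUs is enough, but it needs 20 GiB of memory.
vcpus_needed, memory_needed_gib = 4, 20
tier = smallest_fitting_tier(vcpus_needed, memory_needed_gib)
unused_vcpus = tier.vcpus - vcpus_needed
unused_memory_gib = tier.memory_gib - memory_needed_gib
print(tier.name, unused_vcpus, unused_memory_gib)
```

In this made-up catalogue, 4 GiB of extra memory pushes the workload from "medium" to "large", so you pay for four vCPUs and 12 GiB that the design never asked for.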
So Why Are VMs In The Cloud A Bad Choice? (The Ugly)
I’m not even going to try and lie to myself here. Most of the technical audience reading this really only care about this section… I get it, and yes, I could have started with this, but then my 1966 Western reference wouldn’t have worked and we can’t have that now can we?
What are some of the things that would (and should) make you avoid VMs in the cloud? There are some potential showstoppers – and they need to be considered when looking to use the cloud for IaaS.
Heavy Costs
VMs are the most expensive way to use the cloud. The costs can be prohibitive and can escalate quickly, especially given the bad habits that form due to the unlimited nature of the cloud.
It’s just far too easy to run up a VM bill when you have less to worry about.
No Access To The Underlying Hypervisor
You do not, as a cloud client, have any access to the underlying hypervisor. So you cannot tweak anything that is not an option in the cloud console or command-line interface (CLI).
No matter how hard you engineer, your options are given to you. This also means no keyboard, video and mouse (KVM) console access in the cloud. That was a safety net used quite heavily in the golden age of virtualisation – especially after notorious Patch Tuesdays. So if a VM bricks in the cloud, you have no idea where it failed – or only a very limited view.
As tech evolves, I am sure this will become an option. As of right now, the closest you can get is a KVM snapshot and having to pore over instance logs to discover what went wrong.
This lack of insight means you typically have to go N + 1 or 2 or 3 to make sure you have continuity. Although this isn’t a bad option, the costs become ugly and the management involved increases.
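The cost side of that redundancy is simple arithmetic, sketched below. The hourly rate, the 730 hours per month and the baseline of two instances are assumptions picked to keep the numbers round, not real pricing.

```python
def monthly_cost(base_instances, spares, hourly_rate, hours_per_month=730):
    """Cost of running N base instances plus k always-on spares."""
    return (base_instances + spares) * hourly_rate * hours_per_month


# Assumed figures: two instances carry the load, each at 0.50 per hour.
base_instances = 2
hourly_rate = 0.50

costs = {
    spares: monthly_cost(base_instances, spares, hourly_rate)
    for spares in (0, 1, 2, 3)
}
for spares, cost in costs.items():
    print(f"N+{spares}: {cost:.2f} per month")
```

With these numbers, N+1 adds half again to the bill and N+3 more than doubles it, before counting the extra patching and monitoring each spare brings.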
VMs Are Just That: VMs
The last reason I do not love VMs in the cloud is that they are still just VMs. Nothing truly changes from on-prem, except that you have mitigated infrastructure risk and given yourself access to unlimited hardware.
The same old problems that all operating systems inherently have get transferred to the cloud. Your engineering teams? They become an army of OS support experts.
This isn’t harnessing the power of the cloud for true modernisation, and that should always be a consideration, because organisational change takes time.
It all comes down to planning and knowing your environment.