A famous line from the 1975 film Jaws sees a terrified Chief Brody backing into the cabin, uttering “You’re gonna need a bigger boat”.
This is because he had just seen the size of the shark they were hunting.
Allow me to ruin Jaws for you. It could’ve been a simple, 30-minute movie if they had just equipped the boats more appropriately to help them catch the shark.
Yeah. I said it.
But what exactly does that have to do with scaling your technology stack? Let me elaborate.
What Exactly Is Scaling?
Scaling, in dictionary terms, can refer to the removal of scales, or even the scraping of tartar from teeth. But in tech terms, it speaks to a system’s ability to grow according to the workloads it needs to handle.
This usually falls into two buckets: vertical scaling and horizontal scaling.
Vertical scaling (also referred to as scaling up) implies that we get a bigger processing mechanism to handle a workload. We do this by upgrading memory, adding more CPU, increasing disk space, adding network interfaces, and so on, essentially upgrading the single system.
Horizontal scaling implies that we add additional processing points of similar size and specification and distribute the load amongst them. This spreads the load, minimising lag, downtime and crashes. It follows the basic tenet of “many hands make light work”.
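To make the contrast concrete, here is a minimal sketch in Python. The request rate and per-node capacity are invented numbers purely for illustration, not a sizing guide.

```python
# Hypothetical example: a service that needs to handle 10,000 requests/sec,
# where one "standard" node comfortably handles 1,200 requests/sec.
import math

required_rps = 10_000
node_rps = 1_200

# Vertical scaling: keep one node and keep upgrading it until it copes.
# We need a single machine roughly this many times the size of a standard node.
vertical_factor = math.ceil(required_rps / node_rps)
print(f"Vertical: one node, ~{vertical_factor}x the CPU/memory/IO of a standard node")

# Horizontal scaling: keep the standard node size and add more of them,
# plus one spare so a single failure doesn't take out the capacity (N+1).
horizontal_nodes = math.ceil(required_rps / node_rps) + 1
print(f"Horizontal: {horizontal_nodes} standard nodes behind a load balancer")
```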
A Brief Background on Vertical Scaling
Vertical scaling speaks to a more monolithic approach to systems design. Although an evolution of the mainframe, it still relies on throwing the whole workload at a single processing point.
This was popularized by hardware vendors boxing a lot more hardware into smaller, expandable packages. But as with everything, there is an eventual bottleneck, which in the case of vertical scaling was most often the motherboard.
There is only so much silicon you can throw at a problem until you are eventually wasting precious silicon and coming nowhere near solving the problem.
Is vertical scaling still a thing? Simply put, yes.
Financial institutions still run mainframes and utility companies still run data warehouses. Any privately owned and managed infrastructure footprint is a prime candidate for vertical scaling, especially in the golden age of virtualization, which brought a massive resurgence in vertically scaling the underlying infrastructure to unlock additional guest processing.
The ability to add more (up to a point) can also make financial sense when the usable lifetime of the hardware can be extended by a facelift here and there.
Is vertical scaling ideal? Succinctly put, no.
A Brief Background on Horizontal Scaling
Horizontal scaling started to make sense when virtualization became reliably productionized. Organizations were not using all of their hardware most of the time, yet made the mistake of always buying more, much to the frustration of the people writing the cheques.
Using the available hardware to create more instances, rather than massive single instances, made a lot of sense and was a far more efficient use of the hardware that was already there.
This required a significant amount of good design and network planning, but it opened up a lot of opportunities such as allowing for distributed system design to really take root. It also allowed for maintenance without downtime at reduced cost and complexity.
This meant that the infrastructure could grow according to business requirements: start off small and expand as needed, or, in theory, cater to peaks and troughs.
In a fight between the two, horizontal scaling just made sense.
Horizontal Scaling Isn’t All Smooth Sailing
That headline is terrible, I used a blatant rhyming opportunity and coupled it very loosely to the minor nautical theme at the beginning of the document. I couldn’t resist.
Although we’ve established that horizontal scaling is the clear winner, it was not always the go-to method of systems design.
There are numerous reasons for that, but the one I am going to pick on today is that the infrastructure needed to back this was either on-premise and managed by the company, or hosted by a third party with no real vested interest in the success of the system.
To help illustrate, here is a scenario.
Imagine I want to add 5 more virtual machines to an on-premise system to handle an expected surge in demand over a peak period. These are the things I need to consider (a rough capacity check, like the sketch after this list, helps with the first two):
- Do I have enough Compute (CPU and Memory) to handle this?
- Does it break the redundancy model on my hypervisor stack?
- Do I have sufficient software licenses to handle this?
- Do I have sufficient capacity to protect the data?
- Can my network manage this backend capability?
- How much engineering time do I require to implement and maintain this?
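For the first two questions, a back-of-the-envelope check is often enough to know whether the conversation is even worth having. Here is a minimal sketch; every figure in it (hosts, cores, memory, current usage) is made up for illustration.

```python
# Hypothetical on-premise capacity check for the first two questions above:
# do we have the compute, and do we keep N+1 redundancy on the hypervisor stack?

hosts = 4                 # hypervisor hosts in the cluster
cores_per_host = 32
ram_gb_per_host = 384

used_cores = 96           # already allocated across the cluster
used_ram_gb = 1_100

new_vms = 5
vm_cores = 4
vm_ram_gb = 12

total_cores = hosts * cores_per_host
total_ram = hosts * ram_gb_per_host

# N+1: the cluster must still carry the full load with one host down.
n1_cores = (hosts - 1) * cores_per_host
n1_ram = (hosts - 1) * ram_gb_per_host

needed_cores = used_cores + new_vms * vm_cores
needed_ram = used_ram_gb + new_vms * vm_ram_gb

print(f"Cores after change: {needed_cores}/{total_cores} (N+1 limit {n1_cores})")
print(f"RAM after change: {needed_ram}/{total_ram} GB (N+1 limit {n1_ram} GB)")
print("Fits within N+1:", needed_cores <= n1_cores and needed_ram <= n1_ram)
```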
Assuming the answers to all of the above were in favour of completing the exercise, it was done and then, more often than not, left in place. That’s because of the sheer amount of effort it took to get it right.
Now, imagine a forward-thinking technology professional saying, “the peak is over, let’s tear it down and rebuild it when we need to”. This would result in a burning at the stake, followed by a brief trial for witchcraft.
But ask yourself, why not tear it down? Why sweat the underlying hardware because it was so much effort to build in the first place?
If left in place, eventually it would be torn down, because capacity would be needed for some other requirement.
And by the next peak, more hardware would need to be ordered so that it could be built again. Now it seems the premature pyrotechnics of the passionate professional were potentially uncalled for. But we have gone and vertically scaled the stack just so we can horizontally scale the solution anyway.
I have made this mistake. And in so doing, have blamed businesses for not investing up-front in hardware, or not planning their sales efforts appropriately. Or just not caring enough about tech teams, altogether. All of which was rubbish.
I was just comfortable and felt they were making unnecessary work.
Inflexibility Breaks A Business
Nothing philosophical here. Just the hard facts that made me realize that there needs to be a better way.
Why shouldn’t the business be able to ask for capacity whenever they want to run a sale or when the nature of the business sees these sorts of spikes multiple times a year?
Why can a business not confidently define their own peak periods by running sales cycles based on their understanding of the market and their clients?
In my folly, the answer would be, “We can do it, but it is going to double your infrastructure costs”, which has protected me from a lot of angry people responsible for growing the business.
A business needs infrastructure to sell their product. But that doesn’t mean they want to sell their product in order to buy more infrastructure.
Not understanding this means that the business becomes hampered by the cost of scaling, coupled with the lack of flexibility. The business becomes routine driven, which does not lend itself to innovation.
The ability to scale infrastructure up and down as the business requires is the definition of flexibility.
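As a sketch of what that flexibility can look like, here is a toy example that maps a business-defined sales calendar onto a desired instance count. The dates, instance counts and the desired_instances helper are all hypothetical; in practice you would hand this decision to your provider’s autoscaling or scheduled scaling features rather than roll your own.

```python
from datetime import date

# Hypothetical sales calendar, defined by the business rather than the tech team.
sale_windows = [
    (date(2024, 6, 1), date(2024, 6, 7)),      # mid-year sale
    (date(2024, 11, 25), date(2024, 12, 2)),   # end-of-year sale
]

BASELINE_INSTANCES = 2    # light, always-on footprint
PEAK_INSTANCES = 10       # what we scale out to for a sale

def desired_instances(today: date) -> int:
    """Return how many instances we want running on a given day."""
    in_sale = any(start <= today <= end for start, end in sale_windows)
    return PEAK_INSTANCES if in_sale else BASELINE_INSTANCES

print(desired_instances(date(2024, 11, 28)))  # 10 during the sale
print(desired_instances(date(2024, 3, 14)))   # 2 for the rest of the year
```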
Is There A Happy Medium?
The playing field has changed, even for monolithic workloads. The ability to use a hyperscaler to abstract away the six pain points mentioned above means that scale-out is now not only feasible, it’s the only logical approach.
By adopting and migrating to a cloud provider or hyperscaler, the business is now able to define their peak cycles without worrying about system availability and cost.
I must apologise for using that word so openly. Cost.
The word cost does, however, bring some interesting considerations to scaling.
But first, remember: vertical bad, horizontal good.
How does vertical scaling play into cost in the cloud?
It doesn’t, directly. Rather, it’s our understanding of vertical scaling on-premise that we can leverage to our advantage when it comes to cloud cost.
Comparing on-premise infrastructure planning to cloud planning is unfair, for a number of reasons. Primarily, the hardware used by hyperscalers is well defined and orchestrated to squeeze every last IOP out of it, and it is refreshed far more regularly than most organizations can achieve.
On-premise, I would have a VM running 4 vCPUs and 12GB of memory, with an OS disk sitting on hybrid storage. That will have to do for the next 5 years, because of depreciation cycles. To scale that, I would need to replicate it however many times it takes to reach the desired effect.
Assuming that all the pain points mentioned are of no concern, it’s not an issue. That’s because the hardware is paid for, or being paid for, regardless. As a result, an engineer seldom goes and looks at the performance across the board. We never really put effort into establishing baselines for Peak vs. Non-Peak performance.
Assuming you would like to replicate this in the cloud, these VMs would run up a significant bill every month.
But understanding our vertical scaling requirements and applying that to the horizontal approach means we don’t build a 4 vCPU, 12GB beast. We build a much lighter version that we know can multiply at a moment’s notice, so the cost gains from our non-peak usage help manage the infrastructure costs for the peak demands.
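To put rough numbers on that idea, here is an illustrative sketch comparing a lift-and-shift of the on-premise spec with a lighter, scale-out footprint. The hourly rates, instance counts and peak hours are assumptions for the sake of the arithmetic, not quotes from any provider.

```python
# Purely illustrative cost arithmetic; the hourly rates and hours are made up.
HOURS_PER_MONTH = 730

# Lift-and-shift: replicate the on-premise spec (4 vCPU / 12 GB) and leave
# every copy running 24/7, exactly as we would have on our own hardware.
big_vm_hourly = 0.20
big_vm_count = 6
lift_and_shift = big_vm_count * big_vm_hourly * HOURS_PER_MONTH

# Right-sized: a lighter instance type that multiplies only when needed.
small_vm_hourly = 0.05
baseline_count = 2            # non-peak footprint, always on
peak_count = 10               # scaled-out footprint during peaks
peak_hours = 120              # roughly how long the peaks last each month

right_sized = (
    baseline_count * small_vm_hourly * (HOURS_PER_MONTH - peak_hours)
    + peak_count * small_vm_hourly * peak_hours
)

print(f"Lift-and-shift: ${lift_and_shift:,.0f}/month")
print(f"Right-sized:    ${right_sized:,.0f}/month")
```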
“My Business Isn’t Cloud Ready”
This is a common phrase we hear. Either the technology stack is too old, the processes are not mature enough, or the cloud is too expensive.
All of these are valid reactions to a fundamental shift in technology, but all are manageable concerns with the right planning and forethought.
And this does not require three budget cycles’ worth of planning. This is the part that brings us back around to scaling, and to understanding how our on-premise environments actually scale.
My most successful cloud migrations all begin with an intimate understanding of just how much horsepower we actually need. The reason cloud planning becomes a shorter cycle is that the performance baseline is often the last 5% of your environment that you do not know.
Spending the time to understand how you would need to scale, based on what you are actually using, is 5% of the time spent now, but I estimate it covers 40% of your cloud planning.
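As a sketch of what that baseline exercise can look like, here is a toy example that takes CPU utilisation samples from monitoring and works out peak vs. non-peak percentiles. The sample values are invented; the point is the comparison, not the numbers.

```python
import statistics

# Hypothetical CPU utilisation samples (%) pulled from monitoring, split into
# the windows the business considers peak and non-peak.
peak_samples = [62, 71, 68, 75, 80, 66, 73]
non_peak_samples = [12, 9, 15, 11, 8, 14, 10]

def p95(samples):
    """A rough 95th percentile; good enough for sizing conversations."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, round(0.95 * (len(ordered) - 1)))
    return ordered[index]

print(f"Peak p95 CPU:     {p95(peak_samples)}%")
print(f"Non-peak p95 CPU: {p95(non_peak_samples)}%")
print(f"Non-peak average: {statistics.mean(non_peak_samples):.1f}%")
# If non-peak utilisation sits in the low teens on a 4 vCPU VM, the cloud
# baseline almost certainly does not need 4 vCPUs per instance.
```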
Clients are often shocked when we tell them their environments are vastly over-spec’d, and they’re pleasantly surprised to see the proposed cloud spend, even when the worst case is presented.
Provided the workload numbers are observed and as accurate as can be obtained, there is no risk of cardiac arrest at a cloud bill.
Do We Need A Bigger Boat?
To summarise the above, no.
Even for monolithic workloads, horizontal scaling will give the best results. All of this is pointless reading, however, unless you have a full understanding of what your system’s current processing capabilities are.
This will make for a well-defined on-premise experience, as well as a right-sized cloud migration and an environment that does not break the bank but gives you all the flexibility you need.