Imagine a scenario where you can set up your cloud platform and forget about it. You know exactly how many cloud resources your business will need for the foreseeable future. The applications you support will not encounter any significant changes in usage, day to day, week to week. All is calm. Everything is taken care of.
In the real world, this rarely happens. Cloud usage can swing wildly, especially if you are supporting consumer applications that are popular on particular days or in particular seasons. What you need is an autoscaling capability that lets you expand or contract your resources on a dynamic basis. The question is, how do you handle the scaling function? Do you build it right into the app – creating a so-called elastic application? Or should you let the cloud platform do the work?
This is an important issue for companies that are relying on the cloud to help them not only manage peak usage situations but also grow their businesses over time. Companies want to have agile systems, keep costs down and avoid having to assign staff to manually turn knobs on a regular basis. Autoscaling can move them in the direction they want to go.
While there are advantages to building flexibility right into the app itself, the more common decision people make is to leverage the autoscaling services available in the cloud infrastructure. AWS, Google and Microsoft Azure have all made autoscaling an integral part of their cloud platforms.
How Autoscaling Works
Here is an example of autoscaling in action. Say you have a service that handles payments of cell phone bills. Because people tend to procrastinate, the cell phone company fields the bulk of its payment traffic in the last three days of every month. If the company has an on-premises application, it could port that app to the cloud without making any code changes, and still get the benefits of autoscaling the platform.
The platform provider’s autoscaling function watches the number of requests coming into the website, and when the traffic gets to a certain level, determined by the cell phone company, the platform will automatically provision more copies of that front-end system to handle the load.
The app itself has no clue this is happening; it is just consuming the resources. The platform provider’s router routes the traffic and balances the load across all those servers. When the load starts dropping off, the platform autoscales back down again. It sees the inbound traffic is sliding off, and the app uses only the services it needs based on the new set of demands.
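The scale-up and scale-down decision described above can be sketched in a few lines. This is a minimal illustration, not any provider's actual algorithm: the function name, the per-instance capacity figure and the instance limits are all assumptions chosen for the example.

```python
import math


def desired_instances(requests_per_sec: float,
                      capacity_per_instance: float,
                      min_instances: int = 1,
                      max_instances: int = 20) -> int:
    """Return how many front-end copies the current load calls for.

    All parameters are hypothetical: capacity_per_instance is the
    request rate one copy is assumed to handle comfortably, and the
    min/max limits stand in for what the company would configure.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Round up so a partial instance's worth of traffic still gets a server.
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    # Clamp to the configured floor and ceiling.
    return max(min_instances, min(max_instances, needed))


# Month-end payment rush, then traffic sliding off again.
month_end = desired_instances(950, capacity_per_instance=100)
quiet_day = desired_instances(40, capacity_per_instance=100)
```

Because the same function is evaluated as traffic rises and as it falls, the instance count follows the load in both directions, which is the behavior the app never has to know about.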
The challenge with equipping your platform with an autoscaling function is that you can get false positives. When traffic comes in, the system reacts. But what if the traffic is being generated by somebody launching a distributed denial-of-service (DDoS) attack? Your platform will spin up resources to handle inbound traffic even when it is traffic you do not want to see.
The Benefits of Elastic Apps
Creating an elastic app does a better job of capturing such information. The app itself knows what is real. It can determine whether the traffic is being generated by real users and whether it is important to the business.
For example, consider a scenario in a bond trading environment. The autoscaling platform would not know if a firm has a high number of bond trade settlements coming in behind the scenes. There might be some indication around the size of the volume, but the infrastructure is not going to be able to determine whether it is reacting to legitimate trades. Elastic applications can be configured to react to certain kinds of traffic, and authorize the right level of resources at the right time.
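To make the contrast concrete, here is a rough sketch of how an elastic application might count only the traffic it considers legitimate before asking for more resources. The `Request` fields, the `"settlement"` label and the authentication flag are hypothetical stand-ins for whatever signals a real application would have.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    source: str          # hypothetical caller identifier
    authenticated: bool  # did the caller pass the app's own checks?
    kind: str            # e.g. "settlement", "quote", "unknown"


def legitimate_load(requests, business_kinds=frozenset({"settlement"})) -> int:
    """Count requests the application itself considers real business traffic.

    Raw volume is all the infrastructure can see; the app can also check
    who sent a request and what kind of work it represents.
    """
    return sum(1 for r in requests
               if r.authenticated and r.kind in business_kinds)


traffic = [
    Request("desk-a", True, "settlement"),
    Request("desk-b", True, "settlement"),
    Request("unknown-host", False, "unknown"),   # ignored: not authenticated
    Request("desk-a", True, "quote"),            # ignored: not a settlement
]
real = legitimate_load(traffic)
```

A scaling decision driven by `real` rather than `len(traffic)` would not react to the unauthenticated request, which is the kind of judgment the infrastructure layer cannot make on volume alone.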
The challenge is, elastic applications are hard to build. They are growing in popularity, but most companies are still at the stage where they can act more effectively by implementing autoscaling functions at the infrastructure level.
Another option is to pursue a hybrid approach – to have the autoscaling of apps and the infrastructure working in concert with each other. Companies could start by autoscaling their platforms, master that, start building elastic applications and eventually get to the point where all new apps are being built with autoscaling in mind.
Communication Is Key
What kinds of pitfalls can you face when you set up an autoscaling function in your platform? For one, you can get into a situation where you are overengineering. If you provision for large amounts of resources, you have to be realistic about whether you are going to get the payback you are looking for. Everybody wants to do autoscaling, but they often want roast beef when they are on a bologna budget. It has to make sense for the business.
This underscores the importance of having the infrastructure team, the application teams and the business units working closely with each other. If there is a turf war, you will have problems. Lack of communication creates situations where the infrastructure group configures the platform, the other groups want something different and nothing gets done.
Define Autoscaling Rules
Everyone wants a platform to be flexible. But how do you determine how and when to scale up and scale down?
Basically, you need to define your autoscaling rules based on application demands. As an infrastructure lead for your company, you will need to have detailed conversations with app teams to understand their elasticity needs. Here are a few key questions to ask:
- How dramatic is the inbound traffic likely to be over time for the application?
- What are the user requirements for the app that will drive elasticity at the infrastructure level?
- What are the economics around building an autoscaling platform behind the app?
- What changes need to be made to help the autoscaling of the app be successful?
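One way the answers to questions like these might be recorded is as a small rule definition with separate scale-up and scale-down thresholds. The sketch below is illustrative only: `ScalingRule`, its field names and the numbers used are assumptions, not a particular cloud provider's configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingRule:
    scale_up_above: float    # load per instance that triggers growth
    scale_down_below: float  # load per instance that triggers shrinking
    min_instances: int       # floor agreed with the app team
    max_instances: int       # ceiling set by the budget
    step: int = 1            # instances added or removed per decision


def next_instance_count(rule: ScalingRule, current: int,
                        load_per_instance: float) -> int:
    """Apply the rule once and return the new instance count."""
    if load_per_instance > rule.scale_up_above:
        return min(rule.max_instances, current + rule.step)
    if load_per_instance < rule.scale_down_below:
        return max(rule.min_instances, current - rule.step)
    # Between the two thresholds nothing changes, which avoids flapping.
    return current


rule = ScalingRule(scale_up_above=80, scale_down_below=30,
                   min_instances=2, max_instances=10)
grown = next_instance_count(rule, current=4, load_per_instance=95)
steady = next_instance_count(rule, current=4, load_per_instance=50)
```

Keeping a gap between the two thresholds is a deliberate choice in this sketch: it stops the platform from adding and removing servers on every small fluctuation.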
Once you get the answers to these questions, you can plot your course. But you are not done. You cannot set it and forget it.
You have to tune your autoscaling platform continually, which means understanding how your infrastructure is responding to application demands. The application team therefore needs visibility into the user experience. If the app is slow, suffers from high latency or cannot get the storage it needs in a timely fashion, the autoscaling functions are not scaling up well enough, and in a customer service application that translates into poor marks from users. The converse also matters: if the app does not scale down fast enough, you will incur significantly higher costs, because you will be paying for resources that are running but not doing any work. The goal is a balance between knowing when to turn resources up and when to turn them down.
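Both sides of that balance can be measured from usage history. The following sketch assumes hourly samples of how many instances were provisioned and how many were actually needed; the function name, the sample data and the hourly rate are made up for illustration.

```python
def scaling_report(provisioned, needed, hourly_rate):
    """Summarize over- and under-provisioning from hourly samples.

    provisioned[i] and needed[i] are instance counts for hour i.
    Returns (idle_cost, shortfall_hours): money spent on instances doing
    no work, and instance-hours where users likely felt the slowdown.
    """
    if len(provisioned) != len(needed):
        raise ValueError("provisioned and needed must have the same length")
    idle_hours = sum(max(0, p - n) for p, n in zip(provisioned, needed))
    shortfall_hours = sum(max(0, n - p) for p, n in zip(provisioned, needed))
    return idle_hours * hourly_rate, shortfall_hours


# Hypothetical four-hour window: slow to scale down, then briefly short.
idle_cost, shortfall = scaling_report(
    provisioned=[10, 10, 10, 4],
    needed=[10, 6, 3, 5],
    hourly_rate=0.5,
)
```

A rising idle cost suggests the scale-down rules are too slow, while a rising shortfall suggests the scale-up rules are too cautious.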
To a certain extent, autoscaling could be considered table stakes for using the cloud. If you have variable workloads, you need to use the resources available to you to manage them. Done right, autoscaling can help your business take advantage of opportunities and operate in the most efficient manner possible.