Depending on which sources you read, you may have expected your cloud program to generate IT cost reductions of 40 to 50 percent, and been disappointed to achieve far less. Gartner reports that organizations save an average of 16 percent by moving to the public cloud, but higher cost reductions are possible. Achieving them, however, requires your organization to adhere to best practices around cost optimization and to carefully control its cloud spend. In a perfect world, cost governance and optimization strategies would be implemented in tandem as your cloud program expands; in reality, the required changes in behaviors and policies often do not keep pace with technology adoption. Lax cost governance can lead to unpleasant surprises when your cloud bill continues to increase after you expected a cost-savings opportunity.
The public cloud offers fast, near-infinite infrastructure scalability, which means that without the correct guardrails, public cloud spend can easily spiral out of control. In some cases, unexpected cost overruns got bad enough to be cited as the reason for roughly 10 percent of workload repatriations away from the public cloud. Fortunately, you have an arsenal of options at your disposal to bring cloud spend in check, ranging from adjusting the services you consume and implementing new financial controls to adapting your application architectures to take advantage of cloud capabilities.
Reducing Current Spend: Changing How Services Are Consumed
One of the most obvious ways to reduce current cloud spend is to reduce the amount you consume by promoting efficiency. This does not mean sacrificing performance or capabilities, but rather matching your consumption more closely to your needs so that you are not paying for idle resources. The average utilization rate for hardware in a data center is around 20 percent, which means organizations are provisioning (and paying for) copious amounts of unused overhead capacity. Instead of letting this expensive habit follow you to the cloud, change your organization’s behavior to take advantage of the flexibility of the cloud, and rightsize your computing instances to appropriate levels. You may have already gone through a rightsizing exercise when you migrated, but this is not a one-and-done activity. Set target minimum and maximum utilization rates, and continually monitor and rightsize your environment without fear of needing to provision today for the next three or five years of use. As your organization matures, automated scaling and provisioning strategies can also dramatically increase infrastructure utilization rates, driving more efficient spend.
A popular tweet from the handle Deadprogrammer said it well: the cloud is not about paying for what you use, it is about paying for what you forgot to turn off. We therefore recommend regular scans of your environments for orphaned instances and unused capacity. Despite your best efforts, the freedom of the cloud will inevitably lead to some services getting turned on and subsequently forgotten while they quietly add to your monthly bill. Our best practices include checking for stopped compute instances, unattached storage volumes, and unused load balancers and NAT gateways, and rotating your logs after their retention period has expired.
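The orphaned-resource scan described above can be sketched as a simple filter over an infrastructure inventory. The sketch below assumes volume records shaped like the output of AWS's DescribeVolumes API; the field names and the $0.10/GB-month rate are illustrative assumptions, not actual pricing.

```python
# Illustrative sketch: flag unattached block-storage volumes in an
# inventory and estimate what they quietly add to the monthly bill.
# Field names mimic AWS DescribeVolumes output; the rate is assumed.

def find_orphaned_volumes(volumes, gb_month_rate=0.10):
    """Return (volume_id, estimated monthly cost) for unattached volumes."""
    orphans = []
    for vol in volumes:
        if vol["State"] == "available":  # "available" means not attached
            orphans.append((vol["VolumeId"],
                            round(vol["Size"] * gb_month_rate, 2)))
    return orphans

inventory = [
    {"VolumeId": "vol-aaa", "State": "in-use", "Size": 100},
    {"VolumeId": "vol-bbb", "State": "available", "Size": 500},  # forgotten
]
print(find_orphaned_volumes(inventory))  # → [('vol-bbb', 50.0)]
```

In practice, the inventory would come from your cloud provider's API and the scan would run on a schedule, with results routed to the owning team via your tagging standard.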
Another common way to reduce cloud consumption, and thereby lower your cloud bill, is to manage the uptime of your environments more efficiently. This is one of the new behaviors uniquely enabled by the cloud, which entails spinning down lower environments when they are not being actively used. When combined with automation, environments in the cloud can be provisioned and spun up within minutes rather than the months it might take in an on-premises environment. This means you can deprovision idle resources and avoid paying for them without negatively impacting productivity when you need them again. To implement this type of behavior, start by categorizing your applications and environments based upon criteria such as which ones are under active development, the locations of the development teams, the number of expected releases and hours of active use per week or per month. Once this is done, you will see that things like user acceptance testing (UAT) environments are prime candidates for being turned off, since they may only be used for a short time before each monthly release. Similarly, development environments are likely to be used only during regular business hours and could be spun down overnight and on the weekends.
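The savings from spinning down lower environments outside working hours are easy to quantify. The arithmetic below assumes an illustrative hourly rate and a 12-hour, 5-day business schedule; both numbers are assumptions for the example, not cloud prices.

```python
# Illustrative sketch: cost of a development environment running 24/7
# versus only during assumed business hours (12 h x 5 days). The $0.50
# hourly rate is a made-up figure for the example.

HOURS_PER_WEEK = 24 * 7  # 168

def weekly_cost(hourly_rate, hours_up=HOURS_PER_WEEK):
    return hourly_rate * hours_up

always_on = weekly_cost(0.50)                # 168 h -> $84.00
business_hours = weekly_cost(0.50, 12 * 5)   # 60 h  -> $30.00
print(f"weekly savings: ${always_on - business_hours:.2f}")
```

Even this conservative schedule cuts the environment's run time, and therefore its cost, by roughly 64 percent, which is why scheduled shutdown automation pays for itself quickly.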
The cost reductions you can achieve by managing the uptime of environments are enabled by the on-demand, or pay-as-you-go, payment models available through the cloud providers. With this type of flexible pricing, you are billed per second for the compute resources you consume. However, there are other opportunities to reduce your cost by utilizing other payment methods. Most notably, the Reserved Instance (RI) payment model, available in AWS and Azure, allows your organization to reserve compute resources for one or three years and realize significant cost savings compared to operating those same resources continually in a pay-as-you-go scenario. Whichever cloud platform your organization uses, a payment model strategy is key to driving down costs.
Payment model strategies may vary per cloud, but CTP has developed general best practices that apply across platforms. One notable strategy is to pre-purchase capacity up to a level your organization is certain it will use. This typically means all production-level environments will be running on reserved instances, and that you should stagger purchasing new reservations across each quarter to account for variability in demand. This allows you to adapt to variations in your usage patterns, and also gets you out of the mindset of needing to overprovision to allow for the distant future. This is also why we recommend sticking with one-year instead of three-year reservations. While the three-year reservations may provide better cost reduction options, they require the same inflexible capacity forecasting seen on-premises.
Production environments are not the only cases that can benefit from reservations. Every instance (regardless of environment) has a break-even point in hours of use: below it, the on-demand payment model is cheaper; above it, the long-term reservation is.
If you are expecting to use a non-production instance for fewer hours than the calculated break-even point, it makes more financial sense to use an on-demand payment model. Conversely, if you expect to have an instance running for more hours than the calculated break-even point, it makes more financial sense to purchase a reservation.
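The break-even comparison reduces to simple arithmetic. The prices below are illustrative assumptions (a $0.10/hour on-demand rate and a $525.60 all-upfront one-year reservation), not published rates for any instance type.

```python
# Illustrative sketch: hours of use at which an all-upfront one-year
# reservation becomes cheaper than paying on-demand. Prices are assumed
# for the example, not taken from any provider's price sheet.

def break_even_hours(on_demand_hourly, reservation_total):
    """Hours of uptime at which the reservation starts paying off."""
    return reservation_total / on_demand_hourly

hours = break_even_hours(on_demand_hourly=0.10, reservation_total=525.60)
print(round(hours))  # 5256 h; a year has 8760 h, so ~60% uptime breaks even
```

Run this calculation per instance type: an environment expected to be up less than the break-even hours should stay on-demand, while anything above it is a reservation candidate.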
There are cost-reduction opportunities related to your billing and account structure that you may be benefiting from without even knowing it. For example, there are settings, usually enabled by default, which allow your reservations to be shared across accounts. This means that if you make a reservation in one account, and do not end up needing it for the entire term, the balance of that discount can be applied to another account. We recommend ensuring that reservation sharing is enabled across all your accounts, so that any excess reservations in one account do not go to waste.
Your account structure can also help your organization recognize volume discounts offered by cloud providers. Consider S3 on AWS, which has a tiered pricing structure that gets cheaper per unit the more you consume. To determine your monthly bill, AWS will aggregate all of the consumption of S3 under an account, determine what level of volume discount is applicable and then apply that discount proportionally across subscriptions within the account. If you have numerous independent accounts, they may never consume enough S3 to earn these volume discounts, so we recommend making sure that all your accounts tie back to one master billing account.
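The effect of consolidating accounts under one billing hierarchy can be demonstrated with a tiered-pricing calculation. The tier boundaries and per-GB rates below are illustrative assumptions in the spirit of S3's tiered structure, not AWS's actual price sheet.

```python
# Illustrative sketch: why pooling consumption under one billing account
# unlocks volume discounts. Tier sizes and $/GB rates are assumptions.

TIERS = [(50_000, 0.023), (450_000, 0.022), (float("inf"), 0.021)]  # (GB, $/GB)

def monthly_storage_cost(gb):
    """Walk the pricing tiers, charging each slice at its tier's rate."""
    cost, remaining = 0.0, gb
    for tier_size, rate in TIERS:
        used = min(remaining, tier_size)
        cost += used * rate
        remaining -= used
        if remaining <= 0:
            break
    return cost

# Three accounts storing 40,000 GB each, billed separately vs pooled:
separate = 3 * monthly_storage_cost(40_000)
pooled = monthly_storage_cost(120_000)
print(round(separate, 2), round(pooled, 2))
```

Billed separately, each account sits entirely in the most expensive tier; pooled, the combined consumption reaches the cheaper tiers and the total bill drops, with no change in what was actually stored.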
Controlling Costs Moving Forward: Policies and Controls
Getting the expected cost reductions and return on investment from your cloud program requires more than one-time corrections. Financial control policies that reflect the new world of cloud need to replace old static-infrastructure behaviors. Many of the common patterns for controlling costs in the cloud are predicated on having a robust, actively enforced naming and tagging standard. In a world where infrastructure can come and go by the minute, it is imperative to take the time to develop a comprehensive tagging schema and to use tools to automatically enforce and monitor adherence. Tagging enables all of the downstream controls that follow and helps ensure the success of your cloud program.
With proper naming and tagging, you will be able to pinpoint which users or groups consumed which services (showback), and thereby have the option to charge them their portion of the cloud bill each month (chargeback). This increased level of cost transparency helps inform teams that may have had no idea of how much they were consuming and can directly incent them to be more cost conscious with their provisioning, deprovisioning and consumption habits. In fact, some companies have found success by gamifying cost reduction through friendly competitions among development teams driving to a better bottom line. While not all cloud services can be tagged, our best practice is to implement showback to the application teams for all services which can be tagged, before considering if chargeback is right for your organization.
While showback and chargeback could also be implemented on-premises to promote the deprovisioning of resources, that hardware will still exist in the data center and the cost must still be carried by the enterprise. The ability to scale the number (and size) of environments down (and up) and immediately recognize cost benefits, is a new opportunity uniquely enabled by the cloud, and you should take advantage of it.
Showback and chargeback are great ways to provide feedback and course correct month-over-month, but budgeting and alerting offer a more proactive way to control consumption before the bill arrives at month’s end. Cloud providers have tools and systems in place to enable setting soft quotas for development teams that, when reached, trigger alarms and send notifications to identified stakeholders. Services such as AWS Budgets and Azure Consumption’s Budget API will utilize the tags on your resources to enable alerts like, “Let me know if my team reaches 50 percent of our budget within the first 10 days of the month.” These types of guardrails were not necessary in an on-premises world where the only resources you could use were those in your data center. But in the cloud, they represent a new behavior that your organization needs to adopt in order to keep your costs in check.
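The guardrail logic behind such an alert is straightforward to express. The sketch below is an illustrative stand-in for what services like AWS Budgets evaluate for you; the threshold, the early-days window, and the message format are all assumptions for the example.

```python
# Illustrative sketch of a budget guardrail: alert when a team has spent
# a threshold fraction of its monthly budget early in the month.
# Thresholds and message wording are assumptions, not a real API.

def check_budget(spend_to_date, monthly_budget, day_of_month,
                 threshold=0.5, early_days=10):
    """Return an alert string when spend crosses the threshold early."""
    if day_of_month <= early_days and spend_to_date >= threshold * monthly_budget:
        return (f"ALERT: ${spend_to_date:.0f} of ${monthly_budget:.0f} "
                f"spent by day {day_of_month}")
    return None

print(check_budget(spend_to_date=5200, monthly_budget=10000, day_of_month=8))
```

In a managed service, the "spend to date" figure is aggregated from your tagged resources, which is another reason the tagging standard discussed above has to come first.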
Rearchitecting for Cloud: Cost-Conscious Changes at the Application Level
So far, we have focused primarily on how to reduce and control your costs by adopting new behaviors and becoming savvy about how you consume different services. But there are also enormous opportunities to reduce ongoing cloud costs by rethinking your application architectures. In addition to performance and feature enhancements, refactoring your applications to take full advantage of cloud native services can lead to a lower cost of operation. Consider whether or not your applications could benefit from the following options:
Autoscaling:
The unparalleled elasticity of the cloud allows for rapid horizontal scaling, which effectively performs the rightsizing described above in real time. This type of near-instantaneous scaling is ideal for applications with decoupled architectures that experience large fluctuations in demand. Enabling autoscaling prevents you from overprovisioning for the peaks and can trim the number of servers to as few as one, reducing costs when there is minimal load on the application.
Low-impact Disaster Recovery:
Traditional disaster recovery (DR) architectures can be inherently expensive, requiring a full duplicate set of production resources sitting idly in a secondary data center. Depending on the Recovery Point Objective (RPO) and Recovery Time Objective (RTO) requirements of an application, a DR solution can be architected in the cloud with a much smaller (and cheaper) footprint. For example, a pilot light architecture requires only that an application’s data and some minimal services be replicated into another region, allowing for a much smaller footprint and lower cost than your traditional DR architecture.
PaaS vs. IaaS:
Often the easiest and fastest way to migrate your portfolio to the cloud is to utilize a lift-and-shift approach which involves rehosting your workloads on cloud infrastructure (IaaS). However, this may not be the most cost-effective solution in the long run. For applications which are appropriately decoupled, replatforming to utilize platform-as-a-service (PaaS) services removes some of the burden of management, which can lead to lower operating complexity and lower operating costs.
Intelligent Storage Selection:
Storage options in the cloud go far beyond what is available in the average on-premises data center. Your organization may have started off by moving everything to Amazon S3 or Azure Premium Storage, but there are dozens, if not hundreds, of storage options and configurations available. Some may not only better meet your requirements but could also ultimately be cheaper to operate. Taking advantage of cold storage for your archives, only using premium storage for production environments, choosing the appropriate replication levels and setting lifecycle rules for data retention can all help reduce your monthly costs.
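The cost impact of lifecycle rules that age data into colder tiers can be estimated with a simple model. The tier names and per-GB monthly rates below are illustrative assumptions, loosely inspired by the spread between standard, infrequent-access, and archive storage classes.

```python
# Illustrative sketch: monthly storage cost when everything stays in the
# hot tier versus when lifecycle rules age data into colder tiers.
# Tier names and $/GB-month rates are assumptions for the example.

RATES = {"standard": 0.023, "infrequent": 0.0125, "archive": 0.004}

def monthly_cost(allocation):
    """allocation: mapping of tier name -> GB stored in that tier."""
    return sum(gb * RATES[tier] for tier, gb in allocation.items())

everything_hot = monthly_cost({"standard": 10_000})
tiered = monthly_cost({"standard": 2_000,
                       "infrequent": 3_000,
                       "archive": 5_000})
print(round(everything_hot, 2), round(tiered, 2))
```

For a dataset where most bytes are rarely read, tiering cuts the bill by more than half in this model, without deleting anything; lifecycle rules make the transition automatic.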
Serverless Architectures:
Implementing serverless architectures can help organizations further reduce their cloud spend. Compute costs are typically the largest contributor to your cloud bill and primarily come from the use of VM instances. While these are billed per second of uptime, they are rarely utilized at 100 percent. Serverless services are billed in 100-millisecond increments and run only for the length of time required to complete their function. A serverless application can sit idle for free and incur expenses only for the milliseconds it is needed, potentially resulting in drastic savings. As an added bonus, AWS Lambda and Azure Functions each offer 1,000,000 free requests per month and up to 400,000 GB-s of free compute time!
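A back-of-the-envelope model shows how far the free tier stretches. It uses the free-tier allowances mentioned above (1,000,000 requests and 400,000 GB-s) with per-unit rates in the neighborhood of AWS Lambda's published pricing; treat the rates and the workload figures as illustrative assumptions.

```python
# Illustrative sketch: estimated monthly serverless compute bill after
# the free tier. Per-unit rates approximate AWS Lambda's published
# pricing but should be treated as assumptions for this example.

FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = 400_000
REQUEST_RATE = 0.20 / 1_000_000   # $ per request beyond the free tier
GB_SECOND_RATE = 0.0000166667     # $ per GB-s beyond the free tier

def monthly_serverless_cost(requests, avg_ms, memory_gb):
    """Requests/month, average duration in ms, and memory in GB."""
    gb_seconds = requests * (avg_ms / 1000) * memory_gb
    billable_requests = max(0, requests - FREE_REQUESTS)
    billable_gb_s = max(0, gb_seconds - FREE_GB_SECONDS)
    return billable_requests * REQUEST_RATE + billable_gb_s * GB_SECOND_RATE

# 3M requests/month at 120 ms average with 512 MB of memory:
print(round(monthly_serverless_cost(3_000_000, 120, 0.5), 2))
```

In this model, a function handling three million requests a month costs well under a dollar, because the compute time stays inside the free GB-s allowance; the same workload on an always-on VM would be billed for every second of uptime.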
Cloud has introduced a new paradigm to IT, and requires new ways of thinking at all levels. Achieving the cost reductions you expect from your cloud program will not happen automatically. Your organization must adapt its behaviors and processes. Following these best practices will start you on your journey to optimizing your costs, and, through continual application, these practices will keep your cloud bill in check.
Continuously monitor, manage and optimize costs in the cloud with our Continuous Cost Control managed solution.
IDC – Fostering Business and Organizational Transformation to Generate Business Value with Amazon Web Services, 2018; CTP Research 2010–2019
Amazon Web Services – Cloud Computing, Server Utilization, & the Environment, 2015
CIO Magazine – Virtualization: Bump Up Your Server Utilization for Big Enterprise IT Gains, 2011