“Serverless architecture” refers to applications that depend significantly on third-party services (“back end as a service,” or BaaS) or on custom code run in ephemeral containers (“function as a service,” or FaaS). Currently, the best-known serverless offering is AWS Lambda, but Microsoft and Google are both hot on Amazon Web Services’ heels in the serverless market.
By moving much of an application’s behavior to the front end, serverless architectures remove the need for the traditional “always-on” server system sitting behind an application. Although this is not the place for a 101 article on serverless computing, plenty of introductory material on the topic is available elsewhere.
We hear quite a bit about how serverless computing frees developers from their reliance upon physical or virtual machines in the public IaaS cloud. But what is the impact for IT Ops, which must integrate serverless apps with the rest of IT operations? How does it affect IT operations management, IT service management, data center orchestration, operational intelligence data, and your cloud management strategy? Here are the issues, and how IT Ops should adapt.
What’s the Big Rush on Serverless?
First, we need to consider the growth of serverless computing, which translates into urgency for IT Ops. The graphic below depicts the growth of the “serverless” term and the growth of AWS Lambda in the last few years. Lambda grew faster than the “serverless” buzzword, and this is indicative of an adoption pattern around a product, which became a group of products, which became the buzzword “serverless.”
This growth means that IT Ops, especially Cloud Ops, must change its processes and toolsets in response to the adoption of new technology, in this case, serverless.
So, what’s different? Consider the following table:
Cost and Usage Monitoring
For traditional systems, those still found in data centers, monitoring showback and chargeback is not a priority for IT Ops. These are typically fixed costs that have been allocated to the departments that own them.
The use of the public cloud changed all of that. Indeed, the cost of cloud usage is typically billed by the number of virtual server instances you spin up, the time that they operate, and some variable costs around those items, such as database and network usage.
However, traditional public IaaS cloud servers are easy to monitor for cost and usage. You have a choice between native tools that are provided by the particular cloud vendor and non-native tools provided by a third party. Many of these third-party tools can even span traditional systems and those that are public cloud-based. This provides IT Ops with the ability to holistically orchestrate cost-monitoring operations.
Serverless tosses a new monkey wrench into the IT Ops mix: the need to monitor cost and usage at the application level rather than the server level. You pay only for the back-end resources you actually consume, as defined by the functions your application invokes, and those costs are directly linked to usage.
In other words, if an application is transactional in nature, such as one that records the sale of a product, a transaction carried out through a serverless event call may cost 0.002 cents. IT Ops must, therefore, understand exactly what the application is doing and monitor its usage, typically with native monitoring tools. Compare this to simple server usage monitoring, where a single server may be running many applications or database operations, all opaque to IT and Cloud Ops.
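To make the application-level billing model concrete, here is a minimal sketch of estimating the cost of one transaction from its function invocations. The pricing constants and the memory/duration figures are illustrative assumptions, not current provider rates.

```python
# Sketch: estimating per-transaction serverless cost at the application level.
# Pricing figures below are illustrative assumptions, not actual AWS rates.
PRICE_PER_REQUEST = 0.20 / 1_000_000  # USD per invocation, assumed
PRICE_PER_GB_SECOND = 0.0000166667    # USD per GB-second, assumed

def transaction_cost(invocations, memory_gb, duration_s):
    """Estimate the cost of one transaction made up of several function calls."""
    request_cost = invocations * PRICE_PER_REQUEST
    compute_cost = invocations * memory_gb * duration_s * PRICE_PER_GB_SECOND
    return request_cost + compute_cost

# A sale recorded via 3 function calls at 128 MB, averaging 200 ms each:
cost = transaction_cost(invocations=3, memory_gb=0.125, duration_s=0.2)
```

The point for IT Ops is that this arithmetic only works if you know, per application, how many functions fire, at what memory size, and for how long; none of that is visible from server-level metrics.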
So IT Ops needs to become application-aware, rather than just server-aware. This could mean having more IT Ops staff who understand applications and databases.
Resource Governance
Traditional governance mostly consisted of rules and policies that were passive in nature. In other words, the rules were not enforced by automated governance tools. In the new IaaS clouds, however, whether or not you’re using serverless technology, limits need to be placed upon service usage.
Resource governance, including tools that support resource governance, makes it easy to place limits on who can use cloud services and resources (including storage, databases, compute, etc.) and when and how much they can use them. Simply set the policies, and the governance tools ensure that cloud users will not go out of bounds. This is both complex and server/service-oriented.
When considering serverless, the governance approach (as related to IT Ops) is even more complex than that of server-oriented, cloud-based systems. Instead of dealing with just resources, such as limitations on server instances, IT Ops must be concerned with what each serverless function is programmed to do and when limitations should be placed upon the execution of a serverless function.
Here you need to depend upon the monitoring and management tools that are cloud-native, meaning that your ability to provide Ops governance is limited to the capabilities of the governance tools within the cloud-based, serverless computing system you leverage.
Performance
When considering performance related to serverless and IT Ops, the best way to describe the change you’ll see is “much more dynamic.” Traditional server-oriented IaaS clouds take what we can call an “elastic-static” approach: you can increase performance by provisioning more servers, but that is a reactive way of dealing with performance issues. Performance monitoring and analytics tools help somewhat, yet for most enterprises human intervention is still required.
What changes with serverless is that the functions that define the behavior of a serverless application can be described as “elastic-dynamic.” What’s dynamic? You need to understand what the applications will need in terms of back-end resources, including how serverless applications behave under load.
Since back-end resources are dynamically assigned to a function, results are fairly consistent for a single invocation of a serverless function. A thousand functions called at once, however, may show very different performance characteristics. Tens of thousands of functions firing simultaneously can degrade performance under increasing load, even though the task of allocating enough resources for good application performance is pushed onto the serverless cloud provider. The results could be far more random than IT Ops and users would like.
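The single-invocation-versus-burst contrast above can be illustrated with a toy model: invocations that exceed the provider’s pool of warm containers pay a cold-start penalty, so latency spreads out under load. All numbers here are assumptions chosen for illustration, not measured provider behavior.

```python
# Toy model (assumed figures) of serverless latency variance under load:
# invocations beyond the warm-container pool incur a cold-start penalty.
import random

WARM_CAPACITY = 100    # assumed number of pre-warmed containers
BASE_LATENCY_MS = 20   # assumed warm execution time
COLD_START_MS = 400    # assumed cold-start penalty

def burst_latencies(concurrent_invocations, seed=0):
    """Return simulated per-invocation latencies for one burst."""
    rng = random.Random(seed)
    latencies = []
    for i in range(concurrent_invocations):
        cold = i >= WARM_CAPACITY          # overflow invocations start cold
        jitter = rng.uniform(0, 10)        # small run-to-run variation
        latencies.append(BASE_LATENCY_MS + jitter + (COLD_START_MS if cold else 0))
    return latencies

single = burst_latencies(1)        # one call: consistently fast
burst = burst_latencies(10_000)    # big burst: bimodal, much more random
```

A single call lands in a narrow band, while the large burst mixes warm and cold invocations, which is exactly the randomness the paragraph above warns about.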
Business Continuity and Disaster Recovery
When it comes to serverless cloud computing, business continuity and disaster recovery (BC/DR) are perhaps the most neglected parts of the serverless computing movement. Why? The cloud provider has made assurances that by just being in the cloud, BC/DR is built in. But IT Ops has questions here, and for good reason.
In the traditional systems world, we leveraged physical replications of systems, typically hot or cold standbys, that could be activated at times when the primary systems went down. Moving to the public IaaS clouds, we had the concept of virtual replication or leveraging other systems that are typically physically diverse, to host the hot or cold standbys. While you could leverage an inter-cloud solution, meaning two different public cloud computing brands (e.g., Microsoft and AWS), most enterprises are doing virtual replication intra-cloud but using different regions.
With serverless computing, there is really no platform analog outside the public cloud you leverage that can provide a hot standby. Indeed, if you adopt a specific brand of serverless computing, the functions are not portable from cloud to cloud. Thus, you’re limited to the tools and processes that the serverless cloud provider gives you to support BC/DR.
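Because functions are not portable across clouds, what remains practical is intra-cloud failover: deploying the same function to a second region and falling back to it on failure. The sketch below assumes a hypothetical `invoke_in_region` callable standing in for a region-scoped cloud SDK client; the exception type and region names are also assumptions.

```python
# Sketch: intra-cloud BC/DR for a serverless function via a fallback region.
# `invoke_in_region` is a hypothetical stand-in for a real, region-scoped
# cloud SDK invocation; the error type and region names are assumed.
class RegionUnavailable(Exception):
    pass

def invoke_with_failover(invoke_in_region, payload,
                         regions=("us-east-1", "us-west-2")):
    """Try each region the function is deployed to, in order."""
    last_error = None
    for region in regions:
        try:
            return invoke_in_region(region, payload)
        except RegionUnavailable as err:
            last_error = err        # note the failure, try the next region
    raise RegionUnavailable(f"all regions failed: {last_error}")
```

Note that this only works if IT Ops has deployed identical function code and configuration to every listed region, which is exactly the process discipline the provider’s tooling does not enforce for you.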
Provider tooling covers simple IT Ops needs, but it offers few options to support BC/DR. Many serverless deployments, and the types of serverless applications being built and deployed, have no clear BC/DR requirements defined. Indeed, if BC/DR is a priority for your IT Ops team right now, you may want to steer clear of serverless cloud computing systems until more options arrive on the market.
It’s All About the Application
Deployment and operations of serverless computing systems have just started to take off. Best practices in serverless development have yet to be formed for most enterprises, and IT Ops is just at the exploration stage right now.
What we do know: Serverless computing, as related to IT Ops, will be all about the application, not about the resources that the applications leverage. IT Ops has focused on physical or virtual servers within data centers, and now within IaaS clouds. While the transition has been from traditional systems to public clouds, the approaches to IT Ops are really very much alike.
Serverless is different. Resources are dynamically allocated along with the functions, or serverless applications. IT Ops must focus on the applications, not on the servers they run on. That change alone could mean more complexity and cost for IT Ops. It’s time to increase your budgets, before serverless gains any more traction.