Governance Practices: “IT governance is a set of practices designed to ensure alignment between the business vision and IT strategy, which facilitates the oversight of these enterprise demands through consistent management, cohesive policies, guidance, processes and decision rights for a given area of responsibility. IT management is primarily concerned with the effective, stable and optimal execution of technology operations, in support of any existing governance strategies, and is tactically focused.”
Without a solid set of governance practices, an IT organization would be wandering around in the wilderness. It would have no formal program to take stakeholders’ interests into account and no structure to ensure that IT is supporting ongoing business strategy.
This is especially true for IT organizations operating in the cloud. Issues that are simple to manage on-premises become more complex when IT leaders no longer have physical, hands-on access to equipment, and the pace of change increases dramatically as you shift to cloud-oriented deployment practices. To take advantage of what the cloud has to offer – automation, scale, velocity, agility – organizations need to create governance policies specifically suited to the cloud.
But how? Can on-premises IT governance frameworks be tweaked and molded to work in the cloud? Or do organizations need to scrap their whole governance structures and start over? And how do you get a cloud governance project up and running? These are important questions for cloud operators to consider long before they begin moving workloads.
Embedding Governance in Systems
One thing to remember when you develop a governance plan is that it is not just about the systems themselves – it is also about people. The technology is relatively straightforward, but organizations tend to ignore the people and process issues, which are not. When you take manual tasks out of a process, the governance decisions those tasks carried need to be embedded in your systems. Companies need to think about how governance, and the work people do, will be handled differently across a long list of operational management functions, including change management, configuration management, asset management, the service desk, and logging and monitoring.
Governance is still required in cloud environments, but there are places you can streamline and automate tasks. Some areas, such as governance in security, can be adapted fairly seamlessly from the policies in your on-premises environments. In other areas where it is tactically different, you need to create new policies and procedures. Below, we outline some ways to start building your cloud governance strategy.
Change Management
Change management is one of the easier functions to adapt, yet it is a common bottleneck. The overall process stays the same in the cloud, but you need to streamline it wherever possible, for example by finding decisions that can be standardized and automated. Because changes in the cloud can be made much faster, look for ways to use automation to speed up approvals while still managing risk.
Most companies have a robust process to govern change, which includes a change approval board that sets a commonly accepted throughput and frequency (establishing what counts as too slow or too infrequent). Rapid cycle time between the business need and production deployment drives the most value out of the cloud. (This is your time to value.) As automated development and deployment streamline delivery, change governance quickly becomes the bottleneck on this road to value.
A common place to accelerate change management is anywhere you can adopt a preauthorized, lower risk, “standard” class of change within the process, which would flow through automatically. If you do not have an existing standard change process with automatic approvals for certain changes, you need to develop that out of the gate. Once that is adopted, establish a regular optimization review of your change process. Drive to minimize the changes that require a trip to the change review board by standardizing as much as possible over time. This is key to driving down your cycle time and increasing your velocity.
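The preauthorized "standard" class of change described above could be sketched as a simple routing rule. This is only an illustration under stated assumptions: the change types in `STANDARD_CHANGE_TYPES`, the `ChangeRequest` fields and the rollback-plan condition are hypothetical, and a real list would come from your change review board.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical pre-authorized, lower-risk change types (assumption).
STANDARD_CHANGE_TYPES = {"scale_out", "rotate_certificate", "redeploy_gold_image"}


@dataclass(frozen=True)
class ChangeRequest:
    change_type: str
    has_rollback_plan: bool


def route_change(request: ChangeRequest) -> str:
    """Return "auto-approve" for standard changes, otherwise "review-board"."""
    is_standard = request.change_type in STANDARD_CHANGE_TYPES
    if is_standard and request.has_rollback_plan:
        return "auto-approve"
    return "review-board"


def standard_change_ratio(requests: list[ChangeRequest]) -> float:
    """Share of changes that skip the board; one input for the regular review."""
    if not requests:
        return 0.0
    auto = sum(1 for r in requests if route_change(r) == "auto-approve")
    return auto / len(requests)
```

Tracking a ratio like this over time is one way to see whether standardization is actually reducing trips to the change review board.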
Configuration Management
Just as in traditional IT management, you need to build secure hosts in the cloud. In fact, host-based security is paramount, because cloud security is workload-centric rather than environment-centric. The biggest change for configuration governance in a cloud context is a singular focus on removing manual review and configuration steps. The goal is to develop “gold images” and “gold configurations,” and drive toward immutable infrastructure.
To start, you will need to pay attention to your initial configuration for your host images, and minimize the drift of that configuration over time, just like you should on-premises. In on-premises environments, you would modify an initial build through both patch and change management processes, depending on the update needed. But in the cloud, with new host deployment measured in minutes, it is more efficient to throw away a host and replace it with an updated version of the “gold image.”
Configuration management in the cloud is essential, because the more you automate and “shift left” before execution, the better. To extract the most value out of the cloud, you will want to streamline configuration management workflows and embed automation as early as you can. In the cloud, humans do not touch environments, and you do not change configurations on the fly. If that initial system image is wrong, you replace it with a new image that has all the capabilities you need. While it is important to think through the changes this approach will require, you can use your current processes initially. Getting to this nirvana takes time, but it is essential to understand from the outset which changes configuration and patch management will require.
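As a minimal sketch of the "replace, do not patch" idea, the snippet below compares each host's configuration against a gold configuration and marks drifted hosts for replacement. The configuration keys and host names are hypothetical, and a real pipeline would read this data from your provider or configuration tooling rather than from in-memory dictionaries.

```python
import hashlib
import json


def fingerprint(config: dict) -> str:
    """Hash a configuration in a key-order-independent way."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_hosts(gold_config: dict, hosts: dict) -> dict:
    """Map each host name to "keep" (matches gold) or "replace" (has drifted)."""
    gold = fingerprint(gold_config)
    return {
        name: "keep" if fingerprint(config) == gold else "replace"
        for name, config in hosts.items()
    }


# Hypothetical example data (assumption, not from any real environment).
gold = {"image": "gold-v2", "ssh_root_login": False}
fleet = {
    "web-1": {"ssh_root_login": False, "image": "gold-v2"},
    "web-2": {"image": "gold-v2", "ssh_root_login": True},  # changed by hand
}
plan = plan_hosts(gold, fleet)
```

Note that the plan never contains a "patch in place" outcome: a drifted host is simply rebuilt from the gold image.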
Asset Management
Everyone claims to have a strategy for managing their IT assets, but most companies do not execute their strategies well. In the cloud, it is critical to have a structured program with a simple identification scheme that can handle short-lived resources. If you do not have those elements, address that gap as you move to the cloud.
On-premises assets can live long lives. You buy a server and it can take weeks to months to acquire and deploy it. That device may live in the data center for one to five years or longer. Assets in the cloud get recycled much faster: you can spin a server up, use it and spin it back down in 10 minutes. This short lifespan may not get captured by legacy asset management processes. You cannot manage what you cannot see, and in the cloud, you cannot validate your asset management accuracy by walking around and taking a manual inventory. So, it is important to understand early in the process what you have, and to identify what is essential to track in your environment.
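One way to keep short-lived resources visible is to build the inventory from lifecycle events rather than from periodic scans. The sketch below assumes a hypothetical event feed of `(timestamp, action, resource_id)` tuples with the actions `"created"` and `"terminated"`; real providers expose this kind of data in their own formats.

```python
from datetime import datetime, timedelta


def build_inventory(events):
    """Fold lifecycle events into one record per resource."""
    inventory = {}
    for timestamp, action, resource_id in sorted(events, key=lambda e: e[0]):
        record = inventory.setdefault(
            resource_id, {"created": None, "terminated": None}
        )
        if action in record:  # ignore actions this sketch does not track
            record[action] = timestamp
    return inventory


def lifetime(record):
    """Return how long a resource lived, or None if it is still running."""
    if record["created"] is None or record["terminated"] is None:
        return None
    return record["terminated"] - record["created"]


# Hypothetical events: one server lives for ten minutes, one is still up.
events = [
    (datetime(2024, 1, 1, 12, 0), "created", "srv-a"),
    (datetime(2024, 1, 1, 12, 5), "created", "srv-b"),
    (datetime(2024, 1, 1, 12, 10), "terminated", "srv-a"),
]
inventory = build_inventory(events)
```

A nightly scan would never have seen `srv-a`, but the event-based inventory still records that it existed and for how long.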
If you do not have a robust asset management process in place, you can start a new one in the cloud. Begin with the “big rocks”: focus on a top-down inventory of the enterprise, starting with the first mover application and its environments. As you transition more and more workloads to the cloud, it will be essential to know where they are.
Regardless of your asset management maturity, you will need a simplified, focused tagging strategy. It is critical to keep this simple to start. Initially, track only key elements, such as Resource Name, Application, Environment, Business Unit, Owner, Security Classification, Retention.
Design your tagging strategy for ease of integration with your existing management/governance process, including any existing asset management process. The tagging strategy will evolve as your cloud adoption matures, but keeping it simple to start helps avoid unnecessary complications from legacy asset tracking decisions.
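A simple tagging strategy is easy to check automatically. The sketch below uses the key elements listed above as required tag names; the exact spelling of the keys and the example values are assumptions, and each cloud provider has its own rules for tag names.

```python
# Required tag keys, taken from the starter list above.
REQUIRED_TAGS = (
    "Resource Name",
    "Application",
    "Environment",
    "Business Unit",
    "Owner",
    "Security Classification",
    "Retention",
)


def missing_tags(tags: dict) -> list:
    """Return required tag keys that are absent or blank on a resource."""
    missing = []
    for key in REQUIRED_TAGS:
        value = tags.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing


# Hypothetical resource with no owner and a blank retention value.
example = {
    "Resource Name": "web-1",
    "Application": "storefront",
    "Environment": "prod",
    "Business Unit": "retail",
    "Security Classification": "internal",
    "Retention": " ",
}
problems = missing_tags(example)
```

A check like this could run at deployment time so that untagged resources never reach the environment in the first place.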
Service Desk
For the service desk, the overall processes stay the same in terms of request tracking and escalation. But you will need to anticipate how to integrate new alert data from the cloud estate, as well as handle changes to how some tasks are executed.
With event, incident and problem management, the first thing you have to do is integrate your cloud environment's alerting flow into your service management toolset, whether that is ServiceNow or another platform. The human processes initially stay the same: an event gets raised and goes into the service desk; if there is an incident, it gets resolved. While the processes stay the same, how you resolve the issue changes. This will be an incremental change at first, but increasing adoption of automation will broaden the range of issues resolved without manual intervention. There may be whole classes of incidents and requests, such as resource scaling or the removal of temporary resources, that can be handled through automatic rules in the environment. Historically, if you had an outage, human intervention would resolve the issue. Now, you can have issues raised and closed automatically by adopting the architectural and operational principles of auto scaling, self-healing and immutable infrastructure.
While you can reuse a great deal of your existing service desk capabilities, you will need to plan for the alert integration from your cloud estate, and begin to think through the classes of issues that can be streamlined and addressed through automation.
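The split between issues closed by automation and issues routed to people can be sketched as a lookup from alert class to remediation. The alert types, remediation names and ticket fields below are all hypothetical; they do not correspond to ServiceNow's or any other platform's API.

```python
# Hypothetical alert classes that automation is trusted to resolve (assumption).
AUTO_REMEDIATIONS = {
    "high_cpu": "scale_out",
    "temp_resource_expired": "delete_resource",
    "unhealthy_host": "replace_host",
}


def handle_alert(alert: dict) -> dict:
    """Return a ticket record: closed by automation or opened for the service desk."""
    action = AUTO_REMEDIATIONS.get(alert.get("type"))
    if action is not None:
        return {"status": "closed", "resolution": action, "assignee": "automation"}
    return {"status": "open", "resolution": None, "assignee": "service-desk"}


auto_ticket = handle_alert({"type": "high_cpu", "resource": "web-1"})
manual_ticket = handle_alert({"type": "data_corruption", "resource": "db-1"})
```

Either way a ticket record is produced, so the service desk keeps a full history even for issues no human ever touched.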
Logging and Monitoring
While existing logging and monitoring strategies can easily be modified to support data inflow from the cloud, there are a few key differences to consider. First, while you can often reuse existing log aggregation or security information and event management (SIEM) platforms, you will need to integrate them with cloud provider data flows.
Next, you will need to make sure your team understands the key differences between on-premises and cloud architectures. Training IT and security ops personnel on cloud architecture helps ensure they are responding to real signals and not noise based on legacy architectural assumptions.
Finally, you should begin to consider the storage needs for your logging data. Key questions to consider are where the logging “source of truth” is (the cloud estate or on-premises), and how that location may change as your adoption of cloud evolves. You will have access to near-infinite storage, but should not store what you do not need: your budgets are not infinite! Cloud architectures can allow you to easily create additional enclaves of logging data for various purposes (troubleshooting, forensics, R&D, archiving, etc.). While this flexibility is valuable, defining your access and usage patterns will help you manage potential egress charges. Furthermore, developing a data retention strategy based on your regulatory or business requirements for cloud estate logs is important from the outset. Keep it simple to begin with, and leverage cloud provider tools.
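A simple retention strategy can start as a table of retention periods per logging purpose. The periods below are placeholders chosen purely for illustration, not recommendations; the real values depend on your regulatory and business requirements, and cloud providers offer their own lifecycle tooling for enforcing them.

```python
from datetime import date, timedelta

# Placeholder retention periods in days (assumption, not guidance).
RETENTION_DAYS = {
    "troubleshooting": 30,
    "forensics": 365,
    "archiving": 2555,
}


def is_expired(purpose: str, written_on: date, today: date) -> bool:
    """Return True when a log written on `written_on` has outlived its retention."""
    keep_for = timedelta(days=RETENTION_DAYS[purpose])
    return today - written_on > keep_for


# The same 45-day-old log is expired for one purpose but not for another.
old_for_troubleshooting = is_expired("troubleshooting", date(2024, 1, 1), date(2024, 2, 15))
old_for_forensics = is_expired("forensics", date(2024, 1, 1), date(2024, 2, 15))
```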
On one level, governance is governance: it is about keeping track of what you have and the processes you follow. But governance in the cloud cannot be just a carbon copy of governance on-premises. It has to be constructed with cloud needs in mind, because it is going to drive the future of your cloud operations.