It’s 3 a.m. and your e-commerce site is down. You don’t know about it until you wake up at 6 a.m. (OK, 7 a.m.). It turns out to be an issue in your cloud service, and you spend a few hours fixing the remote servers. As with power-cycling the Wi-Fi router at home, things return to normal afterward.
All in all, you were down for six hours. No biggie, except that you lost half a million dollars in revenue, and the cost to your reputation is estimated at $5 million.
When people talk about such situations, what I hear is that they lacked cloud management. Systems fail, both cloud-based and on-premises; that’s a given. What also should be a given, but often is not, is having the ability to correct issues or prevent them entirely. For cloud systems, that takes a good understanding of cloud-management best practices and tools.
Cloud management is not only about provisioning and deprovisioning servers. It’s about monitoring key data points such as the server, database, application, network, middleware, and usually custom aspects for which you must create your own monitoring capabilities.
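To make the idea of monitoring several layers at once concrete, here is a minimal Python sketch. It is an illustration only: the `Check` class, the metric names, the thresholds, and the hard-coded readings are all assumptions made up for this example, and a real setup would pull its readings from a monitoring tool rather than from lambdas.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Check:
    layer: str                  # e.g. "server", "database", "custom"
    name: str                   # metric name (hypothetical)
    read: Callable[[], float]   # returns the current reading
    threshold: float            # a reading above this counts as a breach


def evaluate(checks: List[Check]) -> List[str]:
    """Return 'layer/name' for every check whose reading exceeds its threshold."""
    return [f"{c.layer}/{c.name}" for c in checks if c.read() > c.threshold]


# Hard-coded readings stand in for real data sources (assumption).
checks = [
    Check("server", "cpu_percent", lambda: 42.0, 90.0),
    Check("database", "replication_lag_s", lambda: 12.0, 5.0),
    Check("application", "error_rate_percent", lambda: 0.3, 1.0),
    # A custom, business-specific metric that you would have to build yourself.
    Check("custom", "abandoned_carts_per_min", lambda: 80.0, 50.0),
]

breaches = evaluate(checks)
print(breaches)
```

The point of the sketch is that the custom check sits in the same list as the infrastructure checks, so anything you build yourself is evaluated alongside the standard data points.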
Moreover, cloud management is about fixing things automatically. In that e-commerce example, a cloud-management system could have kicked off a bunch of corrective actions, such as bouncing the server, database, and application. The total outage could have been 10 minutes, not six hours. Also, in many cases, a cloud-management system could have automatically switched to a redundant system while the primary servers were being corrected, leading to zero downtime.
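The automatic-correction flow described above can be sketched in a few lines of Python. This is a simulation under stated assumptions, not how any particular product works: `remediate`, the restart functions, and the `state` dictionary are hypothetical stand-ins for whatever your cloud-management tool would actually call.

```python
from typing import Callable, List, Tuple


def remediate(
    is_healthy: Callable[[], bool],
    actions: List[Tuple[str, Callable[[], None]]],
    fail_over: Callable[[], None],
) -> List[str]:
    """Fail over to a standby, then try corrective actions until one works."""
    log: List[str] = []
    if is_healthy():
        return log                    # nothing to do
    fail_over()                       # keep serving traffic while we repair
    log.append("failover")
    for name, action in actions:
        action()
        log.append(name)
        if is_healthy():
            log.append("recovered")
            return log
    log.append("page_on_call")        # automation exhausted, escalate to a human
    return log


# Simulated environment (assumption): restarting the database fixes the outage.
state = {"healthy": False, "serving_from": "primary"}


def restart_server() -> None:
    pass                              # has no effect in this simulation


def restart_database() -> None:
    state["healthy"] = True


def restart_application() -> None:
    pass


def switch_to_standby() -> None:
    state["serving_from"] = "standby"


log = remediate(
    lambda: state["healthy"],
    [
        ("restart_server", restart_server),
        ("restart_database", restart_database),
        ("restart_application", restart_application),
    ],
    switch_to_standby,
)
print(log)
```

In this simulated run the loop stops as soon as the health check passes, so the application restart is never attempted. If no action helps, the sketch falls back to paging a person, which is roughly where the 3 a.m. scenario started.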
The specific cloud-management tools you need vary greatly from organization to organization. But there are also general-purpose tools that are no-brainers for everyone, such as Amazon CloudWatch, Librato, and Datadog (the latter two are cloud-agnostic). And the list is getting longer. Start with the general-purpose tools, then figure out what specialty tools you need for your environment.
It takes time and money to set up these cloud-management systems. But as you get more dependent on the cloud, the ability to keep things running and making money will be paramount. Make the investment in cloud management.