Companies making a platform choice for Big Data implementations must ask themselves: what value propositions do public cloud providers offer?
Unfortunately, many companies lack a clear picture of the growth, innovation, and service offerings that the three main public cloud providers (Amazon Web Services, Google Cloud Platform, and Microsoft Azure) have at the ready. After leading several cloud-based Big Data implementations, I’ve identified my top six reasons why companies are increasingly selecting the public cloud for Big Data projects.
1. Infrastructure Independence
Companies looking for scalable Big Data solutions, free of infrastructure constraints, often try to get there by building large-scale, expensive IT infrastructure (hardware, software, storage, etc.) in-house. However, with continuously evolving business needs and the massive explosion of data, it is becoming increasingly difficult for companies to keep up. Expanding infrastructure in a traditional data center to meet business needs often takes weeks or months. Public cloud, on the other hand, offers virtually unlimited capacity that can be provisioned and scaled up or down on demand, at low cost, with high availability, and without upfront capital expenditure.
2. High Availability, Scalability and Performance
Companies want and need computational Big Data capabilities that are fast, highly scalable, performant, and cost-effective. Many on-premise Big Data applications suffer performance degradation when running complex queries against vast amounts of data (e.g., petabytes). As a workaround, companies over-provision infrastructure and hire external resources to tune the environments on a regular basis; in extreme situations, they restrict end users’ access to the environment altogether. What they may not realize, however, is that the public cloud vendors run optimized Big Data services on proprietary networks connecting all of their global data centers. This results in high availability, high throughput, and low-latency response times.
3. Stronger Security
Whether staying in-house or utilizing a public cloud provider, companies cannot afford to compromise the security controls, data privacy standards, and compliance policies associated with Big Data applications. Not surprisingly, public cloud providers are continuously investing in security-focused services and employing more and more cloud security experts. For many applications, public cloud offers better and stronger security than traditional data centers. Companies relying on in-house security solutions alone struggle to keep pace with evolving technology and standards.
4. Reduced Licensing Costs
Companies that want to deliver cost-effective Big Data solutions, without incurring expensive license fees, are moving from traditional proprietary technology stacks to open source software (OSS). Whether you are looking to redesign existing Big Data applications or build new cloud-native ones, public cloud, in combination with modern open source technologies, offers the best balance of multi-tenant infrastructure, affordable licensing, innovation, interoperability, support, and risk management.
5. Agility Enablement
Companies are looking for platforms where they can spend more time analyzing data and less time setting up and maintaining complex infrastructure. Public cloud offers fully managed PaaS (Platform as a Service) options that allow you to develop and run cloud-native applications. These options leverage industry best practices around data architecture, data governance, and roadmap development for incremental delivery of Big Data projects. For example, Azure SQL Data Warehouse, Google BigQuery, and Amazon Redshift are true PaaS offerings that can handle and analyze large amounts of data at high speed with minimal administration for performance and scale.
6. Future Growth Preparation
Companies are looking to implement solutions flexible enough to accommodate both current and future business needs. Public cloud providers are continuously innovating and adding new features to support use cases across the entire data life cycle (data connectors -> data ingestion -> data tier -> processing/visualization -> etc.).
Some of the typical use cases are:
- Data Migration from Legacy Data Warehouse – You are looking to migrate an existing data warehouse to the cloud but want to expedite the migration process.
- Data Profiling of Source Systems – You want to create an up-to-date profile of your data in source systems, for the purpose of understanding trends and detecting anomalies in data content and data quality.
- Data Consolidation from Multiple Sources into One Single Source – You have disparate raw data and need to consolidate it into a single analytical solution.
- Master Data Management across Traditional and Cloud Applications – You want to integrate master data management with cloud application deployments.
- Data Streaming to Personalize Content Based on Consumer Needs – You want to build a state-of-the-art data pipeline to interactively analyze streaming data and generate personalized experiences for users across all channels.
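To make the data profiling use case above concrete, here is a minimal sketch in plain Python using only the standard library. The function names and the z-score anomaly rule are illustrative assumptions for this post, not the API of any particular cloud service:

```python
import statistics

def profile_column(values):
    """Build a simple profile of one column: row count, null rate,
    distinct count, and basic stats for numeric values.
    (Illustrative sketch, not a specific cloud service's API.)"""
    total = len(values)
    nulls = sum(1 for v in values if v is None)
    non_null = [v for v in values if v is not None]
    profile = {
        "count": total,
        "null_rate": nulls / total if total else 0.0,
        "distinct": len(set(non_null)),
    }
    numeric = [v for v in non_null if isinstance(v, (int, float))]
    if numeric:
        profile["mean"] = statistics.mean(numeric)
        profile["stdev"] = statistics.pstdev(numeric)
    return profile

def flag_anomalies(values, z_threshold=3.0):
    """Flag numeric values more than z_threshold population standard
    deviations from the mean -- a simple data-quality check."""
    numeric = [v for v in values if isinstance(v, (int, float))]
    if len(numeric) < 2:
        return []
    mean = statistics.mean(numeric)
    stdev = statistics.pstdev(numeric)
    if stdev == 0:
        return []
    return [v for v in numeric if abs(v - mean) / stdev > z_threshold]

if __name__ == "__main__":
    # 500 is a likely data-entry error in an "age" column
    ages = [34, 36, 35, None, 33, 37, 500]
    print(profile_column(ages))
    print(flag_anomalies(ages, z_threshold=2.0))  # -> [500]
```

In practice, a managed service would compute profiles like these at scale across entire source systems; the point of the sketch is simply the shape of the output (null rates, distinct counts, outliers) that lets you spot trends and anomalies in data content and quality.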
It is clear that infrastructure independence, high availability, scalability, performance, security, cost, agility, and future growth are key factors to consider when evaluating platform options for Big Data implementations. There are many choices in the market, some focused more on on-premise deployments and others on cloud or hybrid approaches.
In the near term, I expect to see even more players enter the market. However, as a cloud technologist who has seen these six business drivers play out repeatedly, I strongly recommend that you evaluate public cloud if your company is looking for a Big Data solution that best balances current and future business requirements.