What is AI?
This is a simple question with a not-so-simple answer. Like most buzzwords, AI has seen many varied attempts at definition, and we think they all add nuanced color to help us better understand the topic. We’ll start our definition by adding another buzzword into the mix, Big Data, and do a bit of compare-and-contrast to set the background.
We like to think of Big Data as the overarching concept that encompasses Artificial Intelligence. First, Big Data implies working with data that is, well… big! Most of us have seen the stats around the growth of data across the planet:
- In the past year, we’ve generated more data than in all the years of human history
- The rate of growth in data generated is exponential
- As more sensors come online, machines are generating a greater and greater portion of the overall data
The bottom line is that we’re faced with an ever-increasing amount of data, and our ability to effectively store and process it is failing to keep up. This is the world of Big Data, and it runs the gamut of collecting, storing, cleaning, organizing, enriching, analyzing and visualizing all this information.
AI is intelligence exhibited by machines, and is therefore sometimes referred to as Machine Intelligence (MI). You can contrast this with the natural intelligence that is exhibited by humans or other organisms. Intelligence can be looked at from a factual standpoint, and facts can be represented by simple “if-then” statements. Collect enough of those and you can make a machine look fairly intelligent, as in the sketch below.
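To make that concrete, here is a minimal sketch of rule-based “intelligence” in Python. The thermostat scenario and its thresholds are hypothetical, invented purely for illustration:

```python
# A hand-coded set of if-then rules. The thermostat scenario and the
# temperature thresholds are hypothetical, purely for illustration.

def thermostat_action(temp_f: float) -> str:
    """Pick an action from simple hard-coded facts about temperature."""
    if temp_f < 65:
        return "turn heat on"
    elif temp_f > 75:
        return "turn cooling on"
    else:
        return "do nothing"

print(thermostat_action(60))  # -> turn heat on
print(thermostat_action(72))  # -> do nothing
```

Each rule is a hand-coded fact; the machine looks smart only because a human anticipated every case.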
But another aspect of intelligence is learning: the ability to acquire new or modify existing factual knowledge, based on new information. This is where it starts to get interesting. We break down the techniques that enable machines to learn into two groups — Machine Learning (ML) and Deep Learning (DL).
We look at ML as computational statistics, or using data to create mathematical models that are useful for making predictions. Part of a data set is used to see which models seem to fit best, while the remaining data is used to test the predictive capability of the model. Once the fit is deemed adequate, new data can be analyzed with the model, and the results can be reasonably acted upon. Great applications for this approach include predictive maintenance and detection of security anomalies.
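As a rough illustration of that fit-then-test workflow, here is a minimal sketch using Python and scikit-learn. The synthetic “sensor readings” and the maintenance label are entirely made up for the example:

```python
# A minimal sketch of the fit-then-test workflow, using scikit-learn.
# The synthetic "sensor readings" and maintenance label are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))            # e.g., vibration, temperature, load
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # 1 = "needs maintenance" (made up)

# Hold part of the data back to test the model's predictive capability.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LogisticRegression().fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))

# Once the fit is deemed adequate, new data can be analyzed with the model.
new_reading = [[0.4, -1.2, 0.7]]
print("prediction for new reading:", model.predict(new_reading))
```

The held-out score is what tells us whether the fit is adequate before we trust the model on data it has never seen.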
DL, or hierarchical learning, is a subset of ML that mimics the function of the human neocortex. Our cortex is essentially a pattern-matching engine that takes our sensory input and, as parts are recognized, groups them into higher-level concepts until recognition occurs. We then act accordingly.
Think of looking at a coffee cup. First we recognize the circle of the opening, then the straight lines of the sides, the curve of the bottom and the semicircle of the handle. As these parts pass through the hierarchies of our cortex, they eventually come together as what we recognize as a coffee cup and we act accordingly. Depending on the circumstance, we may then pick up the cup and take a sip of nice, hot coffee!
We now have computational techniques that mimic this process and, based on the data they are fed, can learn to recognize inputs and take appropriate action, just as we humans do. The main difference is that humans have limited capabilities: our five basic senses, and finite storage and processing power. Machines, on the other hand, can take in many types of input and use virtually unlimited compute and storage capacity to analyze the data.
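As a toy sketch of that layered idea, consider a tiny convolutional network in PyTorch. The shapes and layer sizes are illustrative only, not a real coffee-cup recognizer:

```python
# A toy PyTorch sketch of hierarchical pattern matching: early layers react
# to simple patterns (edges, curves), later layers combine them into
# higher-level concepts. Shapes and layer sizes are illustrative only.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),   # low level: edges, curves
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),  # mid level: parts (rim, handle)
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 7 * 7, 2),                    # high level: "cup" vs "not cup"
)

image = torch.randn(1, 1, 28, 28)  # a fake 28x28 grayscale image
logits = model(image)
print(logits.shape)  # torch.Size([1, 2])
```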
The best-known showcase of the state of the art in DL comes from Google’s DeepMind. The combination of DeepMind’s algorithms and Google’s vast compute and data resources has been making great progress toward solving difficult problems, such as image and speech recognition. Its AlphaGo program has now handily beaten the Go world champion, and its successor, AlphaGo Zero, went further still, surpassing AlphaGo entirely through self-play.
Elements of AI
Now that we’ve provided a breakdown of the larger topic, let’s take a look at some of the parts that need to be considered in tackling AI.
The large volumes of data we referred to earlier first need to be captured before we can do anything with them. This is where data ingestion comes into play. Think of data sources like social media streams, corporate transaction systems and sensor data (aka the Internet of Things). This data, whether in the form of files, transactions or streams, is often pulled in and stored, and the public cloud provides an attractive repository, with virtually unlimited storage capacity and relatively low cost.
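As a minimal sketch of what that landing step can look like, here is a local batch file being pushed to cloud object storage (AWS S3 via boto3). The file name, bucket and configured credentials are all assumptions for illustration:

```python
# A minimal ingestion sketch: landing a local batch file in cloud object
# storage (AWS S3 via boto3). The bucket, file names and the configured
# credentials are assumptions for illustration only.
import boto3

s3 = boto3.client("s3")
s3.upload_file(
    "sensor_batch.csv",       # hypothetical local file
    "my-ingest-bucket",       # hypothetical bucket
    "raw/sensor_batch.csv",   # object key in the "raw" landing zone
)
```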
We use the term data munging to encompass a few concepts that generally account for 80-90% of the overall effort involved in AI (see the sketch after this list). These include:
- ETL (extract, transform, load), to get the data into a common format
- Cleansing or removing incomplete or corrupt data
- Deduplication, to remove duplicate data that might be pulled in from different sources
- Enrichment, to add in third-party data that may provide a more complete data set to analyze
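Here is that sketch: a minimal pandas illustration of the four steps above. The files, column names and lookup table are hypothetical:

```python
# A minimal pandas sketch of the munging steps above. The files, column
# names and lookup table are hypothetical, for illustration only.
import pandas as pd

# ETL: extract from two sources and load into a common format.
crm = pd.read_csv("crm_export.csv")
web = pd.read_json("web_events.json")
df = pd.concat([crm, web], ignore_index=True)

# Cleansing: remove incomplete or corrupt rows.
df = df.dropna(subset=["customer_id", "event_time"])

# Deduplication: the same record may arrive from both sources.
df = df.drop_duplicates(subset=["customer_id", "event_time"])

# Enrichment: join in third-party data for a more complete picture.
regions = pd.read_csv("zip_to_region.csv")
df = df.merge(regions, on="zip_code", how="left")
```

Real pipelines involve far more steps, but most of them reduce to variations on these four operations.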
Much of this process can be automated, but there is no magic way to avoid the still laborious job of getting all your data ready for the data scientists to start analyzing.
Once your data has been ingested and munged into a usable state, you can begin to apply the computational techniques of Machine Learning and Deep Learning. This is not an exact science and generally involves quite a bit of trial and error. It is also important to factor in a healthy dose of applicable domain knowledge. For example, if you’re looking at marketing data, you’d better be working with someone who understands the type of marketing you’re doing. Or if you’re looking to improve predictive maintenance for industrial machinery, you’d better include someone who knows the ins and outs of how those machines tick.
When you combine these experimental analytics with the knowledge of someone who understands the underlying processes generating the data, you have a good chance of achieving results that can reasonably be relied upon. The general goal is to make business decisions that are equal to or better than those that can be made by humans.
Data visualization is the last and often overlooked part of the process and involves examining outputs from the prior stages. The best results often involve equal parts of computer science, art and neuroscience, to understand the intricacies of how we perceive visual input. The old adage “a picture is worth a thousand words” certainly comes into play here.
We need to point out two things. First, we mentioned outputs from prior stages. These techniques and technologies are not only valuable for the final output of your analytics. They can also be incredibly useful in getting through the difficult process of data munging. Remember, we are talking about Big Data here, and that generally entails more columns and rows of data than we can wrap our minds around. Techniques to visualize data sets help us clearly see errors, patterns and anomalies that our eyes would otherwise be unable to discern from the raw data itself.
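For example, a quick histogram makes corrupt values jump out in a way a raw dump of rows never would. The synthetic temperature readings here are invented for illustration:

```python
# A quick sketch of spotting corrupt values visually. The synthetic
# temperature readings and the outliers are invented for illustration.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
readings = np.concatenate([
    rng.normal(20.0, 0.5, 1000),   # plausible sensor values
    [95.0, -40.0, 95.0],           # corrupt outliers buried in the raw rows
])

plt.hist(readings, bins=60)
plt.xlabel("temperature (C)")
plt.ylabel("count")
plt.show()  # the outliers jump out immediately, unlike in a raw dump
```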
Second, data visualization is sometimes thought of only in terms of outputs for humans. More and more, we see our applications composed of chains of processes connected via APIs. The output of one stage needs to be presented in a manner that can be understood as input to the next stage. Thinking of data visualization from this perspective allows us to go beyond what our human eyes can perceive, and address the superhuman capacity of what machines can see.
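As a simple sketch of this machine-to-machine presentation, one stage can emit a structured JSON summary that the next process parses and acts on. The field names and the alert threshold here are hypothetical:

```python
# A sketch of "visualization for machines": one stage emits a structured
# summary that the next stage consumes via an API. The field names and
# the 0.9 alert threshold are hypothetical.
import json

stage_output = {
    "stage": "anomaly_detection",
    "records_processed": 125000,
    "anomalies": [
        {"sensor_id": "pump-07", "score": 0.97},
        {"sensor_id": "pump-12", "score": 0.84},
    ],
}

# Serialize for the next process in the chain...
payload = json.dumps(stage_output)

# ...which parses it and acts on what it "sees".
alerts = [a for a in json.loads(payload)["anomalies"] if a["score"] > 0.9]
print(alerts)  # [{'sensor_id': 'pump-07', 'score': 0.97}]
```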
When we look at all the moving parts, it becomes easier to understand why large enterprises often struggle to apply AI to their business processes. The three areas below outline some of the biggest challenges.
Organizational Structure
Organizational structure tends to be the single biggest impediment to progress for enterprises. As organizations grow into the thousands, tens of thousands or even hundreds of thousands of employees, they naturally form organizational structures and processes to maintain some semblance of order. These structures and processes often create inertia, with silos and bureaucracy becoming impediments to innovation. This inertia can easily quash efforts to apply AI to business processes, simply because of the number of data inputs, the overall amount of data and the additional data manipulations AI requires.
Incidentally, concepts such as the Cloud Business Office (CBO), which CTP advocates, can help bridge silos by providing representation to every business function. We have seen this facilitate agility, innovation and appropriate organizational change.
Public Cloud Capabilities
As we mentioned earlier, AI can be significantly facilitated by services in the public cloud that provide speed, scale and cost advantages. However, since the shift to the public cloud is relatively new to many large enterprises, we find that they often do not have the skills and experience to move their workloads while still managing critical considerations, such as security, compliance and costs.
Without the guidance of people who have repeatedly helped large enterprises set themselves up properly in the cloud, corporate efforts are frequently delayed or fail outright; we have seen this countless times. This definitely lies in the realm of “you don’t know what you don’t know,” so do not shy away from getting guidance from those who have done this before, many times over!
Getting the Right People
This third challenge may well be the most difficult to deal with, as there are a few factors at play.
First, AI involves numerous skill sets. As we saw when looking at all the moving parts, many different skills are needed: cloud services, data storage, data transformation, third-party data sources, mathematics, statistics, programming, graphics and more. Enterprises often seek the mythical “Data Scientist Unicorn” who has all these skills, but such folks are few and far between. It is often more productive to define the specific skills needed for your particular effort, and then build a multidisciplinary team of people who individually have expertise in one or two of them.
Second, these disciplines are relatively new, so it is difficult to find experienced personnel to work in AI. Given the speed at which we have begun to generate and accumulate data, our educational system has only just started producing formally educated data scientists who have exposure to the many skills required. Those people are inherently inexperienced in large-enterprise culture, so we often have to turn to people who are self-taught. This group is also inherently inexperienced, as significant AI efforts have only recently been undertaken.
Third, because of the complexity and the lack of AI experience within the enterprise, it is often difficult to determine whether candidates actually have the skills needed. There are plenty of resumes these days with the term “data scientist” on them. But does your hiring manager know enough about a given discipline to tell whether a candidate actually has the required skill, or is faking it?
We’ve repeatedly seen examples of enterprises struggling to get the right candidates hired. More often than not, we see this as the biggest impediment to moving AI initiatives forward with the desired velocity.
We’ve now seen how loaded the topic of AI is, and how it can take a bit of dissection to be able to compile a useful working definition. We’ve also broken down many of the moving parts to provide a high level overview of what’s needed. Lastly, we’ve looked at some of the major challenges large enterprises face in getting their AI initiatives off the ground. As the topic continues to garner lots of press attention, are you ready to get going?