**Summary**

In today's data-rich corporate environment, it's easy to underestimate the danger of relying only on intuition and boardroom statistical tools (i.e. spreadsheets) to make strategic decisions. While there are many tools available to data scientists, Bayesian inference is a particularly helpful Machine Learning technique for estimating probability in business decision-making processes.

**Data-driven Decision Making**

Large companies collect a massive amount of data from their websites, mobile apps, customer surveys, supply-chain operations, products and marketing efforts. Today, "Big Data" is everywhere.

The power of Big Data is that it enables decision makers to approach business decisions scientifically as opposed to relying solely on intuition. While human intuition will always play a vital role in the decision-making process, that intuition should be informed by facts. Striking this balance is something of an art form. It must take into account:

- the way our brain works
- the way spreadsheets and common analytics fail

**How the Brain Works**

Our brains are great at pattern recognition but weak at probabilistic reasoning. It is well documented that humans are prone to use (often incorrect) stereotypes instead of realistic assessments.

Take mammography screening as an example:

- The chance of developing breast cancer is about 0.8%
- Mammograms detect breast cancer 90% of the time if the patient already has it
- The chance of a false positive (a positive result for a patient without cancer) is 7%

The big question: What is the chance of having cancer given a positive test result?

Did you guess somewhere around 90%, or maybe closer to 80%? That is far off. But if that was your guess, don't feel bad: studies show that 9 out of 10 doctors get the answer wrong. This is a little scary, if you ask me! Why is answering this seemingly simple question so difficult? The math behind it is a little more involved than one would think. In the Machine Learning toolset we have Bayes' formula for conditional probability:

$$
P(C|R) = \frac{P(R|C)\,P(C)}{P(R)}
$$

Where,

- $P(C|R)$ is the Probability of having the (cancer) Condition given a positive Result
- $P(R|C)$ is the Probability of receiving the (positive) Result if you have the (cancer) Condition
- $P(C)$ is the Probability of having the (cancer) Condition in general
- $P(R)$ is the Probability of receiving the (positive) Result in general

In addition, we will use the following notation:

- nC means not having the given Condition
- nR means not receiving a positive Result

To apply Bayes formula to our mammogram scenario:
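Written out for this scenario, and assuming that P(R) is expanded over the two cases C and nC with the law of total probability (the natural route here, since the 7% false-positive rate is P(R|nC)), the formula becomes:

$$
P(C|R) = \frac{P(R|C)\,P(C)}{P(R|C)\,P(C) + P(R|nC)\,P(nC)} = \frac{0.90 \times 0.008}{0.90 \times 0.008 + 0.07 \times 0.992}
$$

Here P(nC) = 1 - P(C) = 0.992.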

Using a few lines of Python, we can easily do the math.

First, let's define a function that will display results formatted as percentages with one decimal place.
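The original listing is not reproduced here, so the following is only a minimal sketch of such a helper. The name `as_percent` is hypothetical.

```python
def as_percent(value: float) -> str:
    """Format a probability (0.0 to 1.0) as a percentage with one decimal place."""
    return f"{value:.1%}"


print(as_percent(0.008))  # prints "0.8%"
```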

Let's calculate the "numerator" (top part) of the formula:
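As a sketch, with illustrative variable names that are not taken from the original listing, the numerator is the detection rate multiplied by the base rate of the condition:

```python
# Inputs taken from the mammography figures quoted in this article.
p_c = 0.008         # P(C): chance of having breast cancer
p_r_given_c = 0.90  # P(R|C): chance of a positive result given cancer

# Numerator of Bayes' formula: P(R|C) * P(C)
numerator = p_r_given_c * p_c
print(f"{numerator:.2%}")  # prints "0.72%"
```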

Next, let's calculate the "denominator" (bottom part) of the formula:
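A possible sketch of this step, assuming the denominator P(R) is obtained by adding the true positives and the false positives (variable names are illustrative):

```python
p_c = 0.008          # P(C): chance of having breast cancer
p_r_given_c = 0.90   # P(R|C): chance of a positive result given cancer
p_r_given_nc = 0.07  # P(R|nC): chance of a false positive

p_nc = 1 - p_c       # P(nC): chance of not having breast cancer

# Denominator of Bayes' formula: P(R) = P(R|C)*P(C) + P(R|nC)*P(nC)
p_r = p_r_given_c * p_c + p_r_given_nc * p_nc
print(f"{p_r:.1%}")  # prints "7.7%"
```

In other words, under these figures only about 7.7% of all screenings come back positive, and most of those positives are false alarms.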

Finally, we can get the answer to our question:
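Putting the pieces together, a self-contained sketch might look like this. The function name `posterior` and its parameter names are hypothetical, chosen only for readability.

```python
def posterior(prior: float, sensitivity: float, false_positive_rate: float) -> float:
    """Return P(C|R) from Bayes' formula, with P(R) expanded over C and nC."""
    numerator = sensitivity * prior
    denominator = numerator + false_positive_rate * (1 - prior)
    return numerator / denominator


p_c_given_r = posterior(prior=0.008, sensitivity=0.90, false_positive_rate=0.07)
print(f"{p_c_given_r:.1%}")  # prints "9.4%"
```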

This means the chance of having cancer if you receive a positive test result is ONLY 9.4%.

Unless you are a trained data scientist, you probably would have guessed it to be about 80-90%.

Our minds are great at recognizing patterns. We can usually avoid danger quickly when the margin is large, but we are not good at being intuitively precise.

Why is this important in the boardroom? Well, most companies operate on razor-thin profit margins, and a few percentage points can mean all the difference in the world. While high-level decisions may be easily made based on intuition and experience, precise decision-making skills matter most when it comes to making less obvious choices. For example, data can inform a decision about whether to invest in a new product feature by predicting how a small change can make a big financial impact over time.

**How Spreadsheets Work**

In addition to our brains not being designed for precise probabilistic computations, there are also problems with the way data is commonly presented in spreadsheets. There are several well-documented statistical fallacies, among them Simpson's Paradox (Wall Street Journal, Dec. 2, 2009). These examples show that the way statistical data is interpreted may have a catastrophic impact on business decision making.

One of the examples, presented by @Bayesia, has to do with a major car manufacturer's Super Bowl advertising campaign.

In the example, the car manufacturer surveyed dealership customers buying their cars and tabulated the data. The results showed that customers who had seen the Super Bowl advertisement spent less money at the dealership than the customers who had not. The spreadsheet data seemed to show clearly that the multi-million dollar Super Bowl campaign had failed!

Bayesian AI was applied to the same case study, and it told a different story. Without going into the details of how the data was diagrammed and interpreted, the bottom line is that after using Bayesian Network tools and adjusting for gender and shopping habits (customers' research vs impulse buying), the analysis showed that the advertisement was actually a success.
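The survey data itself is not given here, so the figures below are invented purely to illustrate how such a reversal can happen. The segment names, customer counts, and spending amounts are all assumptions, not the manufacturer's numbers.

```python
# Hypothetical (customer count, average spend in $1000s) per segment and ad exposure.
# All numbers are made up for illustration; they are NOT the survey data.
data = {
    "impulse buyers": {"saw_ad": (80, 21.0), "no_ad": (20, 20.0)},
    "researchers": {"saw_ad": (20, 41.0), "no_ad": (80, 40.0)},
}


def overall_average(exposure: str) -> float:
    """Pooled average spend across segments, weighted by customer count."""
    total_spend = sum(n * avg for n, avg in (seg[exposure] for seg in data.values()))
    total_count = sum(seg[exposure][0] for seg in data.values())
    return total_spend / total_count


# Pooled view (what a flat spreadsheet shows): ad viewers appear to spend less.
print(overall_average("saw_ad"), overall_average("no_ad"))  # prints "25.0 36.0"

# Stratified view: within every segment, ad viewers spend more.
for name, seg in data.items():
    print(name, seg["saw_ad"][1] > seg["no_ad"][1])  # True for both segments
```

The pooled averages mislead because, in this toy data, most ad viewers belong to the lower-spending segment. Comparing like with like inside each segment removes that distortion.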

Such mistakes matter when making big-ticket decisions. Of course, a trained data scientist or statistician could investigate appropriately to find the flaws in the way the data is interpreted, but unfortunately, boardroom PowerPoint presentations often miss the fine details mathematicians would otherwise consider. The problem compounds when considering hundreds, or even tens of thousands, of factors, as is often the case.

**About "Deep Learning"**

Thanks to the enormous computing power that new GPU processors afford us, we can enjoy the thrill of playing 3D video games, and we are equally empowered to quickly run massive amounts of computations.

Some of the newest GPU machines have as many as 30,000 parallel processing cores and are able to perform several 10^12 (tera) math operations per second. This bonanza of computing power sparked a renaissance of Machine Learning studies, from which "Deep Learning" was born. Behind the scenes, as is usually the case, things are not as glorious as the "Deep Learning" name implies: massive matrices of data are being adjusted over and over until the outcome is the number with the least amount of error.
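To make "adjusted over and over" concrete, here is a toy sketch of that loop using gradient descent on a single weight. Real Deep Learning systems adjust millions of weights on GPUs; the data, learning rate, and step count below are arbitrary choices for illustration.

```python
# Toy data following y = 2 * x; the loop should recover a weight close to 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0               # initial guess for the weight
learning_rate = 0.01  # arbitrary small step size

for _ in range(200):
    # Gradient of the mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad  # nudge w in the direction that reduces the error

print(round(w, 3))  # prints 2.0
```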

As with many disruptive innovations, Big Data is only as good as its practical application. How we use the data features we want to analyze is of utmost importance.

It usually takes about 80% of a data scientist's total effort to understand and prepare the data; the remaining 20% goes to processing it with one of the Machine Learning techniques and formatting the results.

Machine Learning techniques that are available today are the closest thing we have to superpowers, especially when we process data that has dozens, hundreds or even thousands of criteria to be considered in making the decision. It is not, however, a plug-and-play solution as article titles about "unsupervised learning" would suggest.

**Combined Solution: Bayesian Networks**

Bayesian Networks combine the power of various Machine Learning algorithms with a graph of features connected by directed causal links. The causal relationships are determined by human intuition (expert knowledge). The weights for each feature are still recalculated using Machine Learning algorithms, but some features are recognized as not being under our control. Take, for example, the day of the week: you cannot have more Fridays and weekends when you sell camping gear, and you cannot cancel winter in the ice cream business.
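As a rough sketch of the idea, consider a tiny network in which a weekend (not under our control) and a promotion (under our control) both influence sales. Every probability below is invented for illustration, and the names are hypothetical. To evaluate a decision, we average over the variable we cannot change.

```python
# Hypothetical numbers for a camping-gear shop; none come from real data.
P_WEEKEND = 2 / 7  # uncontrollable: share of days that fall on a weekend

# P(high sales | weekend, promotion): a made-up conditional probability table.
P_HIGH_SALES = {
    (True, True): 0.8,
    (True, False): 0.6,
    (False, True): 0.4,
    (False, False): 0.2,
}


def p_high_sales(promotion: bool) -> float:
    """Average over the day type we cannot control, for a chosen promotion setting."""
    return sum(
        (P_WEEKEND if weekend else 1 - P_WEEKEND) * P_HIGH_SALES[(weekend, promotion)]
        for weekend in (True, False)
    )


print(round(p_high_sales(True), 3), round(p_high_sales(False), 3))  # prints "0.514 0.314"
```

The comparison between the two outputs is what informs the decision: the weekend effect is accounted for, but only the promotion is treated as a lever.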

**Wellness Measurement Study**

Let's take a look at designing the Wellness Measurement Study.

Example Assumptions:

- Young people tend to have a higher metabolic rate (and eat more)
- The amount of food people consume depends on gender (men tend to eat more than women)
- The type of food people consume influences metabolic rate, the amount of food needed, and weight
- A person's weight alone is not a good indicator of health, as there are healthy, lean, dense-muscle athletes
- Weight change alone is no indication of health without consideration of gender, age, and fat percentage
- Ambient temperature, climate, and geographic location influence metabolic rate
- Stress level can greatly influence weight change and wellness in general

A Bayesian Network diagram gives a simplified representation of some of these causal relationships; in a real implementation there could be thousands of data points.

We can identify types of features:

- blue: variables which we can adjust
- red: confounding variables which introduce bias
- gray: variables that we could change but, for this study, treat as confounding
- green: results, or dependent variables
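One way to hold such a network in code is as a dictionary of directed edges plus a role for each feature. The edges below are read off the example assumptions listed earlier. The color assigned to each feature is only an assumption made for illustration, since it depends on the study design, and all identifiers are hypothetical.

```python
# Directed edges (cause -> effects) based on the example assumptions in this article.
EDGES = {
    "age": ["metabolic_rate", "food_amount"],
    "gender": ["food_amount"],
    "food_type": ["metabolic_rate", "food_amount", "weight"],
    "ambient_temperature": ["metabolic_rate"],
    "stress": ["weight", "wellness"],
}

# Role (color) of each feature. This assignment is a guess, for illustration only.
ROLES = {
    "food_type": "blue",
    "food_amount": "blue",
    "age": "red",
    "gender": "red",
    "ambient_temperature": "gray",
    "stress": "gray",
    "metabolic_rate": "green",
    "weight": "green",
    "wellness": "green",
}


def parents(node: str) -> list[str]:
    """Return the direct causes of a node, sorted by name."""
    return sorted(cause for cause, effects in EDGES.items() if node in effects)


adjustable = sorted(name for name, role in ROLES.items() if role == "blue")

print(parents("metabolic_rate"))  # prints ['age', 'ambient_temperature', 'food_type']
print(adjustable)                 # prints ['food_amount', 'food_type']
```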

We can intuitively understand how the measured features interact with each other and how they may have a critical influence on the results.

A Bayesian Network, when correctly designed, would predict how much we should change the adjustable variables (blue) to positively affect the results (green). This example is simple and intuitive, but when it comes to corporate decision making, there are thousands of variables and even more confounding criteria to consider. Moreover, the number of relationships between them grows exponentially. The choice is between making intuitive decisions (and risking inaccuracy) and turning to state-of-the-art Machine Learning techniques.

**Conclusion**

I hope you have gained a new appreciation for combining human knowledge (expert systems) with Machine Learning techniques to design systems that are much more accurate than the results that human intuition or "data processing" can deliver alone.

To find out more about AI and how it can improve your decision-making processes, connect with our team today at 312-561-9000 or Services@ProductiveEdge.com