Predictive Analytics 101: A Beginner’s Guide

Picture this: You are about to present the most recent data analysis project to company executives. Your dataset has the potential to influence new marketing campaigns, develop RFP material, and spur new sales. It is sitting on the cloud for easy access and interpretation. You even have a dashboard with visualizations that perfectly illustrate the dataset’s immense power. You are going to make waves.

Five minutes into the slides, an exec interrupts you: “How will this data change in the future?” Before you can answer, another exec says, “How do we know this dashboard is really telling us everything?”

Taken aback, you stop to think. The data you are showing these execs is accurate; your QA team was knee-deep in testing for months. But can you really say if and how this data will change? The dataset and dashboard are just snapshots in time. No one can predict the future.

But what if you could get close? Modern brands need more than just point-in-time reporting. They need to mitigate future risk, increase sales and customer satisfaction, and streamline their processes. To do so, companies in countless industries are turning to predictive analytics. Harnessing the power of predictive analytics means understanding its current applications, its intersection with the cloud, and the science behind it.

What is predictive analytics?

Predictive analytics is the practice of aggregating and analyzing historical data to anticipate future outcomes. Aggregating multiple datasets connects the dots between different departments, business processes, and types of data (structured vs. unstructured).

However, simply aggregating various data points does not necessarily indicate future behavior. Predictive analytics leverages statistical techniques like data modeling, machine learning, and even artificial intelligence to uncover patterns in big data.

While these patterns cannot predict exactly what will happen in the future, predictive analytics can identify trends, herald disruptive industry changes, and allow for more data-driven decision making.


Practical applications of predictive analytics.

Any field that captures data is a candidate for predictive analysis. Everything from enhancing cybersecurity to developing more targeted marketing to strengthening actuarial performance is fair game.

Predictive analytics in healthcare.

Healthcare is a primary use case for predictive analytics. A major issue in healthcare is the difficulty of predicting patient risk. Actuarial teams need to set appropriate insurance rates and prepare governmental reimbursement requests for members with various health issues.

Due to this need, health insurance agencies were some of the first companies to adopt big data practices. Actuaries use predictive analytics to determine things like a patient’s predisposition for developing a worsened condition, or a patient’s likelihood of participating in sponsored wellness activities.

Predictive analytics allows health insurance companies to examine patterns of risk among patients of similar age, with similar conditions, and with similar social determinants of health. Armed with this information, health insurance companies are able to make more informed financial and ethical decisions.

Predictive analytics in finance.

Lending, a key function of the financial services industry, has been revolutionized by predictive analytics. Before a bank gives out a loan, it wants to make sure that the customer is trustworthy. Ultimately, the bank wants its money back. So how do underwriters gauge that trust?

Until several years ago, underwriters would judge an applicant based on past performance and personal hunches. Underwriters would review the applicant’s history and debt-to-income ratio to arrive at a convoluted interest rate. As new financial laws emerged, lenders had to develop a more statistically relevant method for underwriting.

The lending industry underwent a revolution when third-party predictive analytics models like VantageScore and FICO Score became available. These models allowed lenders to calculate accurate risk-based interest pricing, and limited subjective bias. Instead of basing interest rates on a few outdated metrics, the VantageScore and FICO Score models are based on the performance of millions of borrowers with similar spending tendencies.

Three examples of predictive analytics in the real world.

While hypothetical use cases are interesting, what about real-world applications of predictive analytics?

1. Improving patient care.

CenterLight is a managed care organization in New York with 13 facilities offering services for the disabled, elderly, and chronically ill. For years, CenterLight used a homegrown system to govern their data.

As you can imagine, this homegrown system soon became obsolete. Ever-changing compliance guidelines and varied options for patient treatment made it challenging to track patient progress and manage patient care.

CenterLight leveraged predictive analytics on their data warehouse, which integrated data from their Customer Relationship Management system (Salesforce), eCHAMP (their homegrown system), and other claims and provider databases. As a result, the Business Intelligence Team detected patterns in nurse behavior that encouraged member retention and better prepared patients for their assessments.

Predictive analytics helped CenterLight simultaneously save time and money, while effectively managing their members’ care.


2. Engaging users with the right recommendations.

Lenovo, a personal technology company that produces PCs and smartphones, serves customers in more than 160 countries. Faced with standing out in a highly competitive industry, Lenovo realized offering innovative products was not enough. They needed to create new categories of products to enhance the customer experience.

In order to be highly effective, Lenovo set a goal to understand customer needs by using data sets outlining their expectations, behaviors, and preferences. To do so, the company developed a channel-agnostic and real-time predictive analytics practice that involved acquiring data from a variety of touch points. This predictive analytics model helped Lenovo improve the customer experience and achieve an 11% increase in revenue per retail unit.

3. Establishing a 360 view of the customer.

Air France-KLM is a world leader in its three main business lines: passenger transportation, cargo transportation, and aeronautics maintenance. With 90 million annual customers and 2.5 million monthly unique web visitors, Air France-KLM treats data management as a key priority for maintaining customer satisfaction.

Leveraging their data, Air France-KLM developed a 360 customer approach based on predictive analytics. From providing call center agents with a complete customer history to sending targeted promotional offers and launching customer service bots, the company created an exceptional customer experience by anticipating needs. In fact, Air France-KLM went so far as to identify customers’ main stress factors and create a proactive plan of action to mitigate any potential issues.

How predictive analytics works.

Predictive analytics seems like magic, but it stems from statistical science. At its core, predictive modeling assigns a weight or score to particular variables present in a large dataset. These scores are then combined to calculate the probability of a certain event occurring in the future.
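
To make the idea of weights and probabilities concrete, here is a minimal sketch in Python. It assumes a logistic function is used to turn the combined score into a probability, which is one common choice and not the only one. The variable names, weights, and baseline are invented for illustration; a real model would learn them from historical data.

```python
import math

# Hypothetical weights, for illustration only. A real model would learn
# these from historical data instead of having them set by hand.
WEIGHTS = {
    "missed_last_assessment": 1.2,
    "filed_complaint": 0.8,
    "has_assigned_nurse": -1.5,
}
BASELINE = -1.0  # assumed score when none of the variables is present


def event_probability(record):
    """Add up the weights of the variables present, then map the score to 0..1."""
    score = BASELINE + sum(w for name, w in WEIGHTS.items() if record.get(name))
    return 1.0 / (1.0 + math.exp(-score))  # logistic function


member = {"missed_last_assessment": True, "has_assigned_nurse": True}
print(round(event_probability(member), 3))
```

A positive weight pushes the probability of the event up, and a negative weight pulls it down.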

There are two main statistical modeling approaches used in predictive analytics: classification models and regression models.

Classification models.

Classification models are typically binary. For example, you might be interested in CenterLight member enrollment. A classification model will tell you if a member is likely to either stay with CenterLight or disenroll from CenterLight in a given timeframe, based on certain criteria.
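
As a rough illustration, a binary classifier can be pictured as a function that takes a member's attributes and returns one of two labels. The sketch below uses a few hand-written yes/no rules; the attribute names and thresholds are hypothetical, and a real model would derive its rules from data.

```python
def predict_enrollment(months_since_last_visit, open_complaints):
    """Return "stay" or "disenroll" from a few hand-written yes/no rules.

    The thresholds are invented for illustration, not taken from real data.
    """
    if months_since_last_visit <= 6:
        return "stay"  # recently engaged members are assumed likely to stay
    if open_complaints == 0:
        return "stay"  # disengaged, but with no unresolved complaints
    return "disenroll"  # disengaged and has unresolved complaints


print(predict_enrollment(months_since_last_visit=2, open_complaints=1))
print(predict_enrollment(months_since_last_visit=9, open_complaints=2))
```

Whatever the inputs, the output is always one of the two classes, which is what makes the model binary.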

Regression models.

Regression models are less black and white. Instead of a 0 or 1, regression models will predict an actual number. Sticking with the CenterLight example, let’s say a member has a BMI of 29. A regression model might predict that the member’s BMI could drop 3 points in the next year with a consistent, healthy diet.
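
A minimal sketch of that kind of numeric prediction, assuming a straight-line trend fitted by ordinary least squares. The BMI readings are invented for illustration and chosen so the arithmetic is easy to follow.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: months on a diet plan and the BMI measured at each point.
months = [0, 3, 6, 9]
bmi = [29.0, 28.3, 27.4, 26.8]

slope, intercept = fit_line(months, bmi)
predicted = intercept + slope * 12  # extrapolate to the 12-month mark
print(round(slope, 2), round(predicted, 1))  # -0.25 26.0
```

With this made-up data the fitted trend is a drop of 0.25 BMI points per month, so the model predicts a BMI of about 26 after a year, a 3-point drop from 29.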

Three techniques for predictive analytics: decision trees, regression, and neural networks.

There are several techniques data scientists use to construct classification and regression models, namely decision trees, regression, and neural networks.

  1. Decision trees visually represent a path of choices. Each branch of the decision tree is a possible decision between two or more options, whereas each leaf is a classification (a yes or no). Decision trees are one of the more attractive techniques for modeling because they can handle missing values and are simple to comprehend.
  2. Regression is another popular modeling tool. As discussed earlier, regression is used with continuous data as opposed to binary data. Different data questions require different applications of regression. For instance, simple linear regression is used if only one independent variable can be ascribed to an outcome. If more than one independent variable has an effect on an outcome, multiple regression is the appropriate choice. Logistic regression is a further form of regression that does not follow the same convention as linear and multiple regression. Unlike the other two regression models, logistic regression is used when the dependent variable is binary. Returning to the CenterLight example, a logistic regression could be used to answer: how do the odds of a member having a heart attack (binary variable) change with each additional BMI point (continuous variable)?
  3. Neural networks represent the final, most complicated technique. This method is increasingly in demand because perfectly linear relationships are rare in nature. Neural networks pass inputs through layers of interconnected nodes, which lets them recognize more sophisticated, nonlinear patterns.
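
The logistic regression question in the list above (how the odds change with each additional BMI point) can be sketched in a few lines of Python. The intercept and coefficient are hypothetical values chosen for illustration, not estimates from any real dataset.

```python
import math

# Hypothetical fitted parameters, for illustration only.
INTERCEPT = -6.0  # assumed log-odds of a heart attack at BMI 0
BMI_COEF = 0.08   # assumed change in log-odds per additional BMI point


def probability(bmi):
    """Logistic regression: turn the linear score into a probability."""
    return 1.0 / (1.0 + math.exp(-(INTERCEPT + BMI_COEF * bmi)))


def odds(bmi):
    """Odds are p / (1 - p), not the probability itself."""
    p = probability(bmi)
    return p / (1.0 - p)


# Each extra BMI point multiplies the odds by exp(BMI_COEF),
# regardless of the starting BMI.
print(odds(30) / odds(29))
print(math.exp(BMI_COEF))
```

Both printed values agree up to floating-point rounding: under these assumed parameters, each additional BMI point multiplies the odds by exp(0.08), an increase of roughly 8%.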

Although these statistical methods are not new, they are being more widely accepted and used. This can be attributed to the rise in popularity of the cloud.

Big data, the cloud, and the future of predictive analytics.

Before the cloud, predictive analytics seemed impossible. Computers did not have the capacity to house petabytes of data, let alone have enough processing power to run labyrinthian data models. The cloud offers companies a way to compile and combine multiple, huge data sets and easily scale their models.

There are many emerging, cloud-based predictive analytics products. In the future, the cloud will enable companies to build their own machine learning models. By teaching a computer to find patterns in the data, the cloud eliminates manual work and allows for greater interpretation and extrapolation.

The cloud also allows for more customization and flexibility. With the advent of Internet of Things on the cloud, predictive analytics tools could get even more granular in their assessment of people’s everyday habits.


Modern predictive analytics software and tools.

Now that companies are able to readily retrieve large datasets from the cloud, there is ample room for big data analysis, and there are many cloud-based predictive analytics software options on the market. While it is vital to have a team of experts to interpret data models, software is necessary to reduce the time needed to collect, clean, and analyze data. Predictive analytics software can digest both stored and real-time data, and assist in appropriate formatting.

In addition, most cloud-based predictive analytics software integrates well with ERP systems, digital analytics software, and business intelligence platforms that most companies already have. Business intelligence teams can also use predictive analytics software to demonstrate the value of predictive analytics in visual form via dashboards.

Talend is an example of big data software that is universally applicable. Since Talend is an open-source integration platform, it is versatile enough to assist in data preparation, data management, and cloud integration. As companies mature and develop their predictive analytics practice, the first task will be to migrate their data to the cloud.

Ready to get started? Try Talend Data Fabric today to start transforming your company’s data.

Last Updated: August 8th, 2019