Using AI and ML to Predict Top Customers and Outperform the Competition
In today’s data-rich environments, turning raw information into actionable insight remains a constant challenge. One recent project focused on building a machine learning pipeline to predict customer purchasing outcomes. This post outlines the approach, lessons learned, and outcomes.
Defining the Right Predictive Objective
The initial goal was to predict the percentile group into which each customer would fall based on sales: bottom 20%, second 20%, and so on. This multiclass classification, while detailed, achieved prediction accuracy of only around 62-63%. That is well above the 20% expected from random guessing across five equal groups, but it was not high enough to have confidence in the outcome. The objective was therefore redefined: instead of predicting five groups, the model would predict only whether or not a customer would fall into the top 20% of purchasers. Simplifying the prediction target to two groups increased accuracy to around 75%.
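To make the two targets concrete, here is a minimal sketch of how the five-group label and the simplified top-20% flag could be derived from sales totals. The sales figures and the function names are illustrative assumptions, not the project's actual data or code, and it uses only the Python standard library.

```python
from statistics import quantiles

def quintile_labels(sales):
    """Label each value 1..5, where 5 is the top 20% by sales."""
    # Four cut points that divide the values into five equal-sized groups.
    cuts = quantiles(sales, n=5, method="inclusive")
    # A value's label is 1 plus the number of cut points it exceeds.
    return [1 + sum(value > cut for cut in cuts) for value in sales]

def top20_flags(sales):
    """Binary target: True when the customer is in the top 20%."""
    return [label == 5 for label in quintile_labels(sales)]

# Illustrative sales totals for ten customers (made-up numbers).
sales = [120.0, 560.0, 75.0, 980.0, 310.0, 45.0, 2200.0, 640.0, 150.0, 400.0]

labels = quintile_labels(sales)  # original five-class target
flags = top20_flags(sales)       # simplified two-class target
```

The five-class target asks the model to separate every 20% band from its neighbours, while the binary flag only asks whether a customer clears the top cut point.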
Extensive Data Preparation
As with most machine learning projects, the majority of effort was allocated to data preparation. Sales history was consolidated and enriched with fields such as customer classification, supplier, and product groupings to capture customers' distinct purchasing behaviours.
A modified version of the RFM (Recency, Frequency, and Monetary) model, together with other purchasing habits, was used to build meaningful customer profiles. Two techniques for grouping these values were tried: K-Means clustering and Percentile Binning. K-Means presented challenges due to the distribution of the data. Because the data was segmented into a limited number of groups with many datapoints each, the resulting distribution did not meet the requirements of the machine learning technology. As a result, Percentile Binning using fixed 20% intervals was adopted. The percentile boundaries were pegged to the training dataset to maintain consistency across different prediction runs.
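As a rough illustration of pegging the percentiles to the training dataset, the sketch below computes the 20% boundaries once from training values and then reuses them for new customers. The function names and sample values are assumptions made for this example; the same pattern would apply to each RFM field in turn.

```python
from bisect import bisect_left
from statistics import quantiles

def fit_bin_edges(training_values, n_bins=5):
    """Compute percentile boundaries from the training data only."""
    return quantiles(training_values, n=n_bins, method="inclusive")

def assign_bin(value, edges):
    """Return a bin from 1 (lowest) to len(edges) + 1 (highest)."""
    # bisect_left counts how many boundaries are strictly below the value.
    return bisect_left(edges, value) + 1

# Illustrative monetary totals from the training dataset (made-up numbers).
training_monetary = [120.0, 560.0, 75.0, 980.0, 310.0,
                     45.0, 2200.0, 640.0, 150.0, 400.0]
edges = fit_bin_edges(training_monetary)

# New customers are binned with the stored boundaries, not their own.
new_monetary = [90.0, 500.0, 5000.0]
new_bins = [assign_bin(value, edges) for value in new_monetary]
```

Because the boundaries are fixed at training time, a given sales value lands in the same bin on every prediction run, regardless of which other customers happen to be in that batch.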
Customer data was broken into two groups. The first was the training dataset: customers who had completed the buying journey and whose outcome was known. The second was the prediction dataset: customers who had not yet completed the journey and whose outcome we wanted to predict. The training dataset was used to select the best-performing algorithm and estimate its probable accuracy. The model trained on that data then generated predictions for the prediction dataset, and those predictions were surfaced in a business intelligence application so that sales representatives could understand these customers' behaviours.
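The split described above might look something like the following sketch. The `Customer` fields, the `journey_complete` flag, and the holdout fraction are all hypothetical; in the actual project, algorithm selection and accuracy estimation were handled by the AutoML tooling rather than hand-written code.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Customer:
    customer_id: str
    journey_complete: bool        # hypothetical flag: is the outcome known?
    top20: Optional[bool] = None  # known only when the journey is complete

def split_by_outcome(customers: List[Customer]) -> Tuple[List[Customer], List[Customer]]:
    """Separate customers with known outcomes from those still to be predicted."""
    training = [c for c in customers if c.journey_complete]
    prediction = [c for c in customers if not c.journey_complete]
    return training, prediction

def holdout_split(training: List[Customer], holdout_fraction: float = 0.25,
                  seed: int = 42) -> Tuple[List[Customer], List[Customer]]:
    """Reserve part of the training data to estimate accuracy."""
    shuffled = training[:]
    random.Random(seed).shuffle(shuffled)
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]

# Illustrative customers: six with known outcomes, two still in progress.
customers = [Customer(f"C{i}", True, i % 5 == 0) for i in range(6)]
customers += [Customer("C6", False), Customer("C7", False)]

training, prediction = split_by_outcome(customers)
fit_set, holdout_set = holdout_split(training)
```

Holding back a slice of the known-outcome customers is one common way to estimate how accurate a model is likely to be before it is applied to customers whose outcome is still unknown.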
Process Automation for Scalability
Automations were created to streamline the generation of prediction datasets and make the process repeatable so that more predictions could be made in the future.
Predictions can be generated at various points throughout business processes, depending on organizational needs. For example, customer data may be processed in batches to support sales campaigns or recurring initiatives. In batch mode, predictions for many customers are generated simultaneously, enabling group-level comparisons and forward-looking performance analysis.
Alternatively, predictions can be triggered in real time, for instance, when a customer reaches a specific milestone or behavioral threshold. In these cases, predictions are generated instantly and delivered to the appropriate stakeholders to support timely decision-making.
This deployment flexibility enables seamless integration with Business Intelligence tools and ERP platforms through API-driven workflows. Whether operating in batch or real-time mode, organizations can embed predictive insights directly into daily operations.
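The difference between the batch and real-time modes can be sketched in a few lines. Everything here is a stand-in: `toy_scorer` is a placeholder rather than a trained model, and the `order_count` field and threshold are hypothetical. In practice the scorer would be a call to whatever prediction API the deployed model exposes.

```python
from typing import Callable, Dict, Optional

Features = Dict[str, float]
Scorer = Callable[[Features], float]  # returns a top-20% likelihood in [0, 1]

def score_batch(customers: Dict[str, Features], scorer: Scorer) -> Dict[str, float]:
    """Batch mode: score every customer in one pass."""
    return {cid: scorer(features) for cid, features in customers.items()}

def score_on_milestone(features: Features, scorer: Scorer,
                       order_threshold: int = 3) -> Optional[float]:
    """Real-time mode: score only once the customer reaches a milestone."""
    if features.get("order_count", 0) >= order_threshold:
        return scorer(features)
    return None  # milestone not reached, so no prediction is triggered

def toy_scorer(features: Features) -> float:
    # Placeholder logic for illustration only, not a trained model.
    return min(1.0, features.get("order_count", 0) / 10)

batch = {"A": {"order_count": 5}, "B": {"order_count": 1}}
batch_scores = score_batch(batch, toy_scorer)

triggered = score_on_milestone(batch["A"], toy_scorer)
not_triggered = score_on_milestone(batch["B"], toy_scorer)
```

Keeping the scorer behind a single callable means the same model can serve a scheduled batch job and an event-driven trigger without any change to the scoring logic.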
Tools Used
The project leveraged a combination of tools to support each stage of the process. Qlik AutoML was used for machine learning, while data preparation was handled through a mix of Qlik Sense ETL processes and custom Python scripts. The resulting customer predictions were visualized in Qlik Sense, enabling end users to easily explore and understand the outcomes.
Key Takeaways
This project highlights how organizations can unlock real business value by applying practical machine learning tools to everyday sales data. With machine learning, it’s possible to identify high-value customers, prioritize sales outreach, and make smarter, faster decisions without needing a team of data scientists.
The key to success wasn’t deep technical complexity; it was asking the right business question, preparing meaningful data, and automating the heavy lifting. By focusing on predicting which customers are most likely to buy, teams gain a competitive edge in targeting efforts where they matter most.
Predictive analytics doesn’t have to be complicated. With the right tools and approach, it becomes a powerful driver of revenue growth and customer engagement.
If you are interested in learning more about how BizXcel can help apply AI and ML to your data, contact us at https://www.bizxcel.com/contact