To do any sort of business planning, you need forecasting. Our reason for existing is to help our customers drive success from their data, so, around two years ago we began to research adding automated forecasting to our cloud analytics platform.
This did however present a huge challenge. Take the typical lifecycle of a forecasting project. It’s an iterative process of refining a model, generating a forecast and evaluating the results to understand how to further refine the model. This cycle is repeated until you are happy with the results.
Evaluating the results and refining the model are usually manual tasks. They need an analyst with a good understanding of the model and the business or process that generates the data. That’s a pretty specific skillset.
That’s fine if you are creating one forecast for one organisation. We, however, want to scale our business rapidly, taking on lots of different customers. It’s not practical for us to recruit large numbers of highly skilled analysts in a short space of time. So, we need to automate as much of the process as we can.
We looked at a number of solutions. An obvious first choice, as we already use AWS, was the Amazon AI platform SageMaker, which includes DeepAR. DeepAR is a ML forecasting model based on a recurrent neural network with long-short term memory cells allowing it to model trends and seasonality. Importantly for us, the platform offers automatic model tuning. So, you run your data through the model numerous times, and it automatically selects hyperparameters to improve the forecast.
We also looked at Facebook Prophet. Unlike DeepAR, it’s a statistical model, similar to the generalised additive model (GAM). It copes particularly well with trends, seasonality and holidays, each being the additive components for the model. It’s also open source, and available in Python or R, so was easy for us to integrate into our platform.
Finally, we also considered Amazon Forecast. As a fully managed service on AWS, you upload your data and it uses automated machine learning (AutoML) to choose and refine an appropriate model. It includes a growing number of algorithms, including ARIMA, DeepAR+ (an enhancement of DeepAR), Prophet, ETS and NPTS.
We began the project using real world data from one of our existing customers, a jewellery retailer. We chose them primarily because they were enthusiastic to take part, but also their data was particularly challenging to forecast. The nature of their sales means the data is noisy, outliers are often high value sales that occur randomly, yet account for a large proportion of the value so cannot be discounted. The sales are also highly seasonal, with the 5 weeks leading up to Christmas accounting for as much as 50% of the sales.
We trained the models on three years data, from 2015-2017, then tested the resulting forecasts against the real-world results for 2018. For comparison we had the manual forecasts the company had used at the time, but unfortunately this was only detailed enough for the fairly rudimentary measure of the total difference over 12 months.
On total error over 12 months all the methods we tested significantly outperformed the manual forecast. The best results were generated by Prophet, with a total error of under 300k (out of annual sales over 100 million). This compared to over 15 million error with the manual forecast. However, using more meaningful measures (RMSE and MAE) DeepAR gave us the most accurate results.
In the 9 months since we conducted this research, we’ve integrated automated forecasting into our platform. We initially chose to use Prophet, even though DeepAR was arguably more accurate. Our reasoning was that Prophet was significantly easier to configure, and less resource hungry to train. Given the relatively small increase in accuracy DeepAR gave us, the ease at which we could scale the solution with Prophet won the day. In more recent months, with the added complication of forecasting during the Covid-19 crisis, we've begun using a combination of Prophet and LightGMB, with some success.
It has been fascinating to see how quickly this field has developed in the last year. The proliferation of available models, if anything, has gathered pace. We’ve continued to test different algorithms and techniques, and the level of automation and accuracy of our forecasting increases almost daily. We are already much closer to the goal of completely automating forecasting for industry than we thought possible 12 months ago. Who knows where we’ll be in another 12 months!