Sequential Prediction

Suppose that we observe data sequentially, and at each point in time, we must make a prediction about the next data point, based on all the previous data. This problem occurs everywhere – for example, electricity companies routinely use predictions of the electricity supplied by wind, solar, and conventional sources to decide whether or not to generate extra electricity; and Google wants to predict whether or not you’ll click on an ad based on your previous click behaviour.

Classical approaches to such problems are based on a mathematical model of the domain. But in practice, there are often several such models available, none of which is even close to perfect (some may be better in summer, others in winter, and so on). Sometimes the predictions of human experts are also available. Our Machine Learning group focuses on new ways of combining the predictions of such models and experts. It turns out to be possible to do this in a way guaranteed to predict at least as well as the best candidate model or expert, and in many cases even substantially better. The combined prediction algorithm is very robust, and automatically adapts to constantly changing sources.
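A classic algorithm behind guarantees of this kind is the exponential-weights ("Hedge") forecaster: each candidate model or expert is assigned a weight that decays exponentially in its accumulated loss, and the combined prediction is the weighted average. The sketch below is illustrative only – it is not the group's exact method, and the learning rate `eta`, the squared-error loss, and the toy data are all arbitrary choices for demonstration.

```python
import math

def hedge_weights(cum_losses, eta=0.5):
    """Exponential-weights update: down-weight experts by accumulated loss."""
    raw = [math.exp(-eta * L) for L in cum_losses]
    total = sum(raw)
    return [x / total for x in raw]

def combine(predictions, weights):
    """Weighted-average prediction across experts."""
    return sum(p * w for p, w in zip(predictions, weights))

# Two hypothetical experts; expert 0 is consistently closer to the data.
cum_loss = [0.0, 0.0]
data = [1.0, 1.0, 1.0, 1.0]
for y in data:
    preds = [0.9, 0.2]                  # each expert's prediction for the next point
    w = hedge_weights(cum_loss)
    y_hat = combine(preds, w)           # the combined prediction for this round
    for i, p in enumerate(preds):       # squared-error loss per expert
        cum_loss[i] += (y - p) ** 2

print(hedge_weights(cum_loss))          # almost all weight on the better expert
```

The regret bound for this scheme says the combined forecaster's cumulative loss exceeds that of the best single expert by only a term logarithmic in the number of experts, which is what makes the combination "at least as good as the best candidate" in the long run.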

The technologies we use are exceptionally useful for cases such as electricity-demand prediction – for example, Electricité de France uses a method that is very similar to the SafeBayesian method we developed at CWI. More generally, whenever there is a set of methods available, some of which are reasonable but none of which is very good, the SafeBayesian method may boost performance. The methodology is also useful for a variety of other problems such as predicting ad-clicking behaviour and even learning to play games.

We have developed new ways of combining predictions made by arbitrary given predictors, with strong guarantees on the performance of the combined predictions. The algorithms are fast, and are especially good at switching ‘right on time’ from one predictor to another when the nature of the data changes over time.
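One standard way to achieve this kind of switching is the Fixed Share algorithm of Herbster and Warmuth: after the usual exponential-weights loss update, a small fraction of the total weight is redistributed uniformly over all experts, so no expert's weight ever collapses to zero and the algorithm can recover quickly when a different expert becomes best. The sketch below illustrates that mechanism only – it is not the group's own algorithm, and the parameters `eta` and `alpha` are arbitrary illustrative values.

```python
import math

def fixed_share_update(weights, losses, eta=0.5, alpha=0.05):
    """One round of Fixed Share: exponential-weights loss update,
    then mix a fraction alpha of the weight uniformly across experts
    so the forecaster can switch when the best expert changes."""
    n = len(weights)
    v = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(v)
    v = [x / total for x in v]
    return [(1 - alpha) * x + alpha / n for x in v]

# Toy scenario: expert 0 is best for 5 rounds, then expert 1 takes over.
w = [0.5, 0.5]
for t in range(10):
    losses = [0.0, 1.0] if t < 5 else [1.0, 0.0]
    w = fixed_share_update(w, losses)

print(w)  # weight has shifted back toward expert 1
```

Without the share step (`alpha = 0`), the losing expert's weight would shrink exponentially and the forecaster would take far longer to switch back; the uniform mixing caps how small any weight can get.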

Contact person: Peter Grünwald
Research group: Machine Learning (ML)