Chasing StatsForecast AutoARIMA Residuals in Two Lines of Code

Microprediction
3 min read · Oct 11, 2022

This post shows you how to apply the excellent StatsForecast auto-ARIMA methodology to a univariate time series with one line of code, then hunt down model residuals with another one-liner. It also shows you how you can do something similar for any pair of primary and secondary models, provided they satisfy the “skater” convention.

David Taffet photography

As this description suggests, it will be a short post!

Straight to the Code

pip install timemachines
pip install sktime
pip install statsforecast

from timemachines.skaters.sk.sfautoarimahypocratic import sf_autoarima_hypocratic as f

We already have a “skater” f that can sequentially process one data point at a time. There’s more explanation of skaters in the README. But let’s just try it out.

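import numpy as np

y = np.cumsum(np.random.randn(100))  # simulated random walk
s = {}                               # skater state, empty on the first call
x = list()
for yi in y:
    # k=3 asks for forecasts one, two and three steps ahead
    xi, x_std, s = f(y=yi, s=s, k=3, e=1000)
    x.append(xi)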

That’s all folks. You can read the example here.
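If you'd like a feel for the calling pattern without installing anything, here is a minimal sketch of a function that follows the same shape as the loop above: it takes one observation and a state dictionary, and returns k forecasts, k scale estimates, and the updated state. The name ema_skater and its alpha parameter are hypothetical and are not part of timemachines; the forecasting logic is a plain exponential moving average chosen only for brevity.

```python
import math
import random


def ema_skater(y, s, k, e=None, alpha=0.2):
    """Hypothetical toy skater: an exponential moving average.

    Mirrors the call pattern used in this post (y, s, k, e in; x, x_std, s out).
    It is an illustration only, not code from the timemachines package.
    """
    if not s:
        # First call: initialise the state from the first observation
        s = {"level": y, "var": 0.0, "n": 0}
    else:
        err = y - s["level"]  # error of the previous one-step forecast
        s["var"] = (1 - alpha) * s["var"] + alpha * err * err
        s["level"] = s["level"] + alpha * err
    s["n"] += 1
    x = [s["level"]] * k                 # flat forecast for horizons 1..k
    x_std = [math.sqrt(s["var"])] * k    # crude error scale, same for each horizon
    return x, x_std, s


# Feed a simulated random walk through the toy skater, one point at a time
random.seed(0)
y, level = [], 0.0
for _ in range(100):
    level += random.gauss(0, 1)
    y.append(level)

s = {}
forecasts = []
for yi in y:
    x, x_std, s = ema_skater(y=yi, s=s, k=3)
    forecasts.append(x)
```

The point of the pattern is that all memory lives in s, so the caller never has to manage model objects or refit on a growing history.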

Residual Chasing

Perhaps, though, you are wondering what sfautoarimahypocratic actually does. To briefly unpack, this is merely a use of quickly_moving_hypocratic_residual_factory applied to sf_autoarima, the one-line online version of StatsForecast AutoARIMA so graciously incorporated into the sktime package.

The function will do the following:

  1. Send StatsForecast AutoARIMA a new data point.
  2. Update the record of residuals and “chase them” using the more general residual_chaser_factory, with a rapidly moving average as the secondary model.
  3. Return a forecast that is based on the AutoARIMA model, but is sometimes adjusted up or down when the secondary model is very confident that the primary model’s forecast is going to be wrong in one direction or the other.
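The three steps can be sketched in plain Python. This is a toy under loud assumptions, not the timemachines implementation: the primary model here is a naive last-value forecast standing in for AutoARIMA, the secondary model is a fast exponential moving average of the primary's one-step residuals, and "very confident" is taken to mean that the average residual exceeds a multiple of its own running standard deviation. The names chase_step, alpha and threshold are hypothetical.

```python
import random


def chase_step(y, s, alpha=0.3, threshold=2.0):
    """One step of a toy residual chaser (illustrative only).

    Returns the primary one-step forecast, the adjusted forecast, and the state.
    """
    if not s:
        s = {"prev_primary": None, "r_mean": 0.0, "r_var": 0.0, "n": 0}

    # Steps 1 and 2: score the previous primary forecast and update residual stats
    if s["prev_primary"] is not None:
        r = y - s["prev_primary"]
        if s["n"] == 0:
            s["r_mean"], s["r_var"] = r, 0.0
        else:
            dev = r - s["r_mean"]
            s["r_mean"] += alpha * dev  # fast moving average of residuals
            s["r_var"] = (1 - alpha) * (s["r_var"] + alpha * dev * dev)
        s["n"] += 1

    # Toy primary model: predict that the next value equals the latest one
    primary = y
    s["prev_primary"] = primary

    # Step 3: adjust only when the secondary model looks confident about the sign
    confident = s["n"] >= 5 and abs(s["r_mean"]) > threshold * s["r_var"] ** 0.5
    adjusted = primary + s["r_mean"] if confident else primary
    return primary, adjusted, s


# A trending series: the naive primary model is wrong in the same direction every step
random.seed(1)
y = [1.0 * t + random.gauss(0, 0.1) for t in range(60)]

s, prev = {}, None
primary_err, chased_err = [], []
for yi in y:
    if prev is not None:
        primary_err.append(abs(yi - prev[0]))
        chased_err.append(abs(yi - prev[1]))
    p, a, s = chase_step(yi, s)
    prev = (p, a)
```

On a series like this one the naive forecast misses in the same direction each time, so after a short warm-up the adjusted forecast should track the data much more closely. When residuals show no persistent sign, the adjustment simply stays switched off and the primary forecast passes through untouched.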

Pretty simple, eh? As to whether it works, we’ll find out soon enough, because this model will be automatically assessed using live real-world data. Of course, for your own application, your mileage may vary.

Modification is easy. You can swap out the primary or secondary model for any of a hundred time-series models in the same package. File an issue if you have trouble. If you poke into the code you’ll find a little residuals helper, and you can generalize further, should you be so inclined.

Want More StatsForecast Models, or Combinations, to Receive Elo Ratings?

The statsforecast package is a terrific contribution to the open-source community from Nixtla. It provides a number of optimized algorithms that are considerably faster than their predecessors, so the inclusion in timemachines and the Elo ratings thus created is overdue.

If there are other algorithms from StatsForecast that you think should be included, consider a pull request similar to sfautoarima.py, and perhaps peruse the guide for contributing batch-style models. It would probably make sense to mention this in the microprediction Slack and ask questions as needed.
