# gluon-ts evaluations

This folder collects evaluations of forecasting models. The goal is to make reproducibility and comparison easier by versioning the code that produces the datasets together with the model and evaluation code.

Note that the evaluations are not "optimal": the models are trained with default parameters on all datasets, and no additional features associated with the datasets are used.
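
For readers unfamiliar with the metric names used in the tables below, here is a minimal sketch of how they are commonly defined, written for a single series in plain Python. The function names are illustrative and are not the gluon-ts API; the exact aggregation across series used by the evaluation code may differ.

```python
import math


def nd(target, forecast):
    """Normalized deviation: sum of absolute errors over sum of absolute targets."""
    abs_error = sum(abs(y - f) for y, f in zip(target, forecast))
    return abs_error / sum(abs(y) for y in target)


def rmse(target, forecast):
    """Root mean squared error of a point forecast."""
    squared = [(y - f) ** 2 for y, f in zip(target, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def weighted_quantile_loss(target, quantile_forecast, q):
    """Quantile (pinball) loss at level q, scaled by 2 and normalized by sum |target|."""
    loss = sum(
        2.0 * abs((y - f) * ((1.0 if y <= f else 0.0) - q))
        for y, f in zip(target, quantile_forecast)
    )
    return loss / sum(abs(y) for y in target)


def mean_weighted_quantile_loss(target, quantile_forecasts):
    """Average of the weighted quantile loss over all quantile levels.

    quantile_forecasts maps a quantile level to the forecast at that level.
    """
    losses = [
        weighted_quantile_loss(target, f, q) for q, f in quantile_forecasts.items()
    ]
    return sum(losses) / len(losses)


# Toy data, made up for illustration only.
target = [10.0, 12.0, 8.0, 11.0]
point = [9.0, 13.0, 8.0, 12.0]

# A point forecast reused at every quantile level 0.1, ..., 0.9.
levels = [i / 10 for i in range(1, 10)]
same_at_all_levels = {q: point for q in levels}

print(nd(target, point))
print(rmse(target, point))
print(mean_weighted_quantile_loss(target, same_at_all_levels))
```

Under these definitions, a point forecast reused at every quantile level from 0.1 to 0.9 gives a mean weighted quantile loss equal to ND, which would be consistent with the identical `SeasonalNaivePredictor` rows in the first two tables.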

## mean_wQuantileLoss

estimator | electricity | exchange_rate | m4_daily | m4_hourly | m4_monthly | m4_quarterly | m4_weekly | m4_yearly | solar-energy | traffic
---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ----
DeepAREstimator | 0.050 | 0.023 | 0.025 | 0.033 | 0.115 | 0.087 | 0.048 | 0.128 | 0.398 | 0.126
MQCNNEstimator | 0.083 | 0.016 | 0.027 | 0.065 | 0.124 | 0.089 | 0.059 | 0.122 | 0.551 | 0.272
MQRNNEstimator | 0.197 | 0.004 | 0.222 | 0.298 | 0.209 | 0.326 | 0.104 | 0.328 | 0.164 | 0.087
NPTSPredictor | 0.062 | 0.021 | 0.145 | 0.048 | 0.233 | 0.255 | 0.296 | 0.355 | 0.826 | 0.180
RForecastPredictor_arima |  | 0.008 | 0.024 | 0.040 |  | 0.080 | 0.050 | 0.124 | 1.153 |
RForecastPredictor_ets | 0.121 | 0.008 | 0.023 | 0.043 | 0.099 | 0.079 | 0.051 | 0.126 | 1.778 | 0.373
SeasonalNaivePredictor | 0.070 | 0.011 | 0.028 | 0.048 | 0.146 | 0.119 | 0.063 | 0.161 | 1.000 | 0.251
SimpleFeedForwardEstimator | 0.062 | 0.009 | 0.023 | 0.044 | 0.116 | 0.088 | 0.051 | 0.132 | 0.435 | 0.212
TransformerEstimator | 0.066 | 0.009 | 0.027 | 0.035 | 0.136 | 0.105 | 0.083 | 0.160 | 0.432 | 0.132

## ND

estimator | electricity | exchange_rate | m4_daily | m4_hourly | m4_monthly | m4_quarterly | m4_weekly | m4_yearly | solar-energy | traffic
---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ----
DeepAREstimator | 0.061 | 0.029 | 0.030 | 0.042 | 0.125 | 0.102 | 0.060 | 0.152 | 0.490 | 0.150
MQCNNEstimator | 0.102 | 0.019 | 0.032 | 0.086 | 0.132 | 0.103 | 0.065 | 0.146 | 0.666 | 0.310
MQRNNEstimator | 0.639 | 0.015 | 0.662 | 0.906 | 0.660 | 0.981 | 0.345 | 0.987 | 0.702 | 0.334
NPTSPredictor | 0.080 | 0.025 | 0.191 | 0.063 | 0.293 | 0.334 | 0.387 | 0.442 | 1.031 | 0.225
RForecastPredictor_arima |  | 0.009 | 0.029 | 0.053 |  | 0.097 | 0.060 | 0.148 | 1.150 |
RForecastPredictor_ets | 0.150 | 0.010 | 0.027 | 0.054 | 0.120 | 0.095 | 0.061 | 0.149 | 1.364 | 0.385
SeasonalNaivePredictor | 0.070 | 0.011 | 0.028 | 0.048 | 0.146 | 0.119 | 0.063 | 0.161 | 1.000 | 0.251
SimpleFeedForwardEstimator | 0.075 | 0.012 | 0.028 | 0.055 | 0.126 | 0.104 | 0.060 | 0.158 | 0.520 | 0.251
TransformerEstimator | 0.082 | 0.011 | 0.032 | 0.043 | 0.150 | 0.128 | 0.098 | 0.193 | 0.534 | 0.159

## RMSE

estimator | electricity | exchange_rate | m4_daily | m4_hourly | m4_monthly | m4_quarterly | m4_weekly | m4_yearly | solar-energy | traffic
---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ----
DeepAREstimator | 1177.808 | 0.033 | 635.905 | 1344.878 | 1405.169 | 1405.009 | 613.102 | 1865.383 | 31.510 | 0.025
MQCNNEstimator |  |  |  |  |  |  |  |  |  |
MQRNNEstimator |  |  |  |  |  |  |  |  |  |
NPTSPredictor | 1679.833 | 0.033 | 2207.532 | 2871.974 | 2613.715 | 3251.401 | 3621.983 | 4211.343 | 53.450 | 0.031
RForecastPredictor_arima |  | 0.011 | 641.476 | 2285.035 |  | 1436.552 | 644.820 | 2065.602 | 58.934 |
RForecastPredictor_ets | 3195.747 | 0.012 | 602.283 | 2158.406 | 1413.275 | 1374.529 | 659.644 | 2066.347 | 65.986 | 0.039
SeasonalNaivePredictor | 1139.925 | 0.013 | 705.425 | 1901.146 | 1628.794 | 1577.303 | 673.443 | 2016.458 | 62.518 | 0.037
SimpleFeedForwardEstimator | 1285.875 | 0.014 | 677.479 | 2323.024 | 1420.506 | 1453.103 | 672.740 | 1982.234 | 37.251 | 0.034
TransformerEstimator | 2059.355 | 0.014 | 664.324 | 1575.836 | 1512.068 | 1494.884 | 848.616 | 2010.936 | 35.152 | 0.026

# FAQ

## Can I add evaluations of a model? Are there conditions to add evaluations?
The accuracy numbers should be obtained with code checked into gluon-ts so that other researchers can reproduce the reported results.
This can include models written in Python, but also wrappers (see for instance `RForecastPredictor`, which wraps Hyndman's R forecast package) and code that is not completely polished (unpolished code will be put under a `contribution` folder).

Models are also run with default parameters: the only parameters they should require are the time frequency of the data and the number of prediction steps. All other parameters should be fixed or adapted automatically to the data.
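
As an illustration of this constraint, the sketch below builds a seasonal naive forecaster from only a frequency and a prediction length. The `SEASON_LENGTH` mapping and `make_seasonal_naive` are hypothetical names chosen for this example, not part of gluon-ts; the point is that the season length is derived from the frequency rather than tuned by the user.

```python
# Hypothetical mapping from a frequency string to a season length.
# This is an assumption made for the example, not a gluon-ts API.
SEASON_LENGTH = {"H": 24, "D": 7, "W": 52, "M": 12, "Q": 4, "Y": 1}


def make_seasonal_naive(freq, prediction_length):
    """Return a forecaster configured only by frequency and horizon."""
    season = SEASON_LENGTH[freq]  # derived from the data, not a user-tuned setting

    def predict(history):
        # With less than one full season observed, repeat the last value.
        if len(history) < season:
            return [history[-1]] * prediction_length
        # Otherwise repeat the last observed season over the horizon.
        last_season = history[-season:]
        return [last_season[i % season] for i in range(prediction_length)]

    return predict


predict = make_seasonal_naive(freq="Q", prediction_length=6)
forecast = predict([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
print(forecast)  # [5.0, 6.0, 7.0, 8.0, 5.0, 6.0]
```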

If there is sufficient demand, we could also collect results produced outside of gluon-ts, but such results would then not be reproducible from this repository.


## How can I add evaluations of my model?
Run `generate_evaluations.py`, which saves the evaluation results in `evaluations`. The results can then be visualised with `show_results.py` (which generates the tables above).
You can then open a pull request with your model, adding or updating the evaluation files.
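
To give an idea of what rendering such tables involves, here is a minimal sketch that turns per-(estimator, dataset) results into a table in the same layout as the ones above. The record layout (dictionaries with `estimator`, `dataset` and metric keys) and the function `to_markdown` are assumptions made for this example; they do not describe the actual file format or the implementation of `show_results.py`.

```python
def to_markdown(results, metric):
    """Render one metric as a pipe-separated table, leaving missing cells empty."""
    datasets = sorted({r["dataset"] for r in results})
    estimators = sorted({r["estimator"] for r in results})
    values = {
        (r["estimator"], r["dataset"]): r[metric] for r in results if metric in r
    }

    lines = [
        " | ".join(["estimator"] + datasets),
        " | ".join(["----"] * (len(datasets) + 1)),
    ]
    for est in estimators:
        cells = [
            f"{values[(est, d)]:.3f}" if (est, d) in values else ""
            for d in datasets
        ]
        lines.append(" | ".join([est] + cells))
    return "\n".join(lines)


# Made-up records for illustration; these are not real evaluation results.
results = [
    {"estimator": "ModelA", "dataset": "d1", "ND": 0.0614},
    {"estimator": "ModelA", "dataset": "d2", "ND": 0.5},
    {"estimator": "ModelB", "dataset": "d2", "ND": 0.25},
]
table = to_markdown(results, "ND")
print(table)
```

A combination that was not evaluated (here `ModelB` on `d1`) is rendered as an empty cell, as in the tables above.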


## How do you enforce that each number is valid?
We do not enforce that the results are actually produced by the code (one could, for instance, report arbitrarily low numbers).
However, every result is versioned through git together with the code that produced it, and anyone can check it by rerunning the corresponding evaluation. We might also generate these tables automatically at some point.


## Can I add another dataset?
We are happy to include other datasets.
To add another dataset, you have to include the downloading and processing code in 
`gluonts.dataset.repository.datasets.py`.