prophet/notebooks/diagnostics.ipynb

{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"block_hidden": true
},
"outputs": [],
"source": [
"%load_ext rpy2.ipython\n",
"%matplotlib inline\n",
"from fbprophet import Prophet\n",
"import pandas as pd\n",
"from matplotlib import pyplot as plt\n",
"import logging\n",
"logging.getLogger('fbprophet').setLevel(logging.ERROR)\n",
"import warnings\n",
"warnings.filterwarnings(\"ignore\")\n",
"df = pd.read_csv('../examples/example_wp_log_peyton_manning.csv')\n",
"m = Prophet()\n",
"m.fit(df)\n",
"future = m.make_future_dataframe(periods=366)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"block_hidden": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Loading required package: Rcpp\n",
"\n",
"WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Loading required package: rlang\n",
"\n",
"WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.\n",
"\n"
]
}
],
"source": [
"%%R\n",
"library(prophet)\n",
"df <- read.csv('../examples/example_wp_log_peyton_manning.csv')\n",
"m <- prophet(df)\n",
"future <- make_future_dataframe(m, periods=366)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Prophet includes functionality for time series cross validation to measure forecast error using historical data. This is done by selecting cutoff points in the history, and for each of them fitting the model using data only up to that cutoff point. We can then compare the forecasted values to the actual values. This figure illustrates a simulated historical forecast on the Peyton Manning dataset, where the model was fit to a initial history of 5 years, and a forecast was made on a one year horizon."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"input_hidden": true
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAmEAAAF3CAYAAADtkpxQAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzsvXl8FFW6//+pru6ETVAjCEiAQR1AiCYQlnYUW1FQFMVvZlxmrlFB44YjozOOjD/vhNEhXrxKZpBRgsCldUZHDQqoiLK0gmkJYQdxnxgUwhJE1vRSdX5/dE6lqrp6S3qp7n7e9+Ud0kvVqa5T53zO8zzneQTGGANBEARBEASRVCypbgBBEARBEEQ2QiKMIAiCIAgiBZAIIwiCIAiCSAEkwgiCIAiCIFIAiTCCIAiCIIgUQCKMIAiCIAgiBZAIIwiCIAiCSAEkwgiCIAiCIFIAiTCCIAiCIIgUQCKMIAiCIAgiBVhT3YBoOOuss9C/f/9UN4MgCCKjOHbsmObv0047LUUtST7ZfO1E4qmvr8ehQ4cifi4tRFj//v1RV1eX6mYQBEFkFC6XS/O3w+FISTtSQTZfO5F4iouLo/ocuSMJgiAIgiBSAIkwgiAIgiCIFEAijCAIgiAIIgUkTIRNnjwZPXr0wNChQ5XXnnjiCVx44YUoLCzEuHHjsHfv3kSdniAIgiAIwtQkTITdcccdeP/99zWv/eEPf8D27duxdetWXHfddfjLX/6SqNMTBEEQBEGYmoSJsDFjxuDMM8/UvNa1a1fl3ydOnIAgCIk6PUEQBEEQhKlJeoqKxx9/HE6nE926dcPatWtDfq6qqgpVVVUAgIMHDyareQRBEEQWQCkpCDOQ9MD8v/71r9izZw9+85vf4Pnnnw/5ubKyMtTV1aGurg7du3dPYgsJgiAIgiAST8p2R/7mN79BdXV1qk5PEARBEASRUpIqwr766ivl30uXLsWgQYOSeXqCIAiCIAjTkLCYsFtvvRUulwuHDh1Cnz59MGPGDLz33nv44osvYLFY0K9fP7z44ouJOj1BEARBRIXb7YbL5YLD4YDdbk91c4gsImEi7NVXXw16bcqUKYk6HUEQBEHEjNvtxtixY+H1epGTk4PVq1eTECOSBmXMJwiCILKOL774Al988QXWrFmDfv36QZIkeL3eoMLeBJFISIQRBEEQWce+ffuwb98+9O3bF3369IEoisjJyaHUFURSSXqeMIIgCIIwC/n5+SgtLcUVV1xBMWFE0iERRhAEQWQ1+fn5uO2221LdDCILIXckQRAEQRBECiARRhAEocLtdqOiogJutzvVTSEIIsMhdyRBEEQLlK6AIIhkQpYwgiCIFlwuF7xeL6UrIAgiKZAIIwiCaMHhcCAnJ4fSFRAEkRTIHUkQBNGC3W7H6tWrqYQNQRBJgUQYQRCECrvdTuKLIIikQO5IgiAIgiCIFEAijCAIgiAIIgWQCCMIgiAIgkgBJMIIgiAIgiBSAAXmEwRBEFnH8OHDU90EgiARRhAEQWQfp512WqqbQBDkjiQIgiAIgkgFJMIIgiAIgiBSAIkwgiAIgiCIFEAijCAIgiAIIgVQYD5BEASRdezdu1fzd+/evVPUEiKbIRFGEDqqqqpQVVUFACgrK0NZWVlM39+7dy+uv/56AMB1112H8vJyAMCXX34Jl8sFAHA4HPj5z3+u+d7EiROxb98+9OrVC8uXL4+53eXl5XjnnXcAAMuWLaNJJQNoaGjAa6+9htraWuzfvx+CIKB79+4oKirCDTfcgIKCgpiOt3fvXqWPxCNFw7/+9S+8+eabaGxshNfrRZcuXZQ+Hu49M/Dll19q/qbnhUgFJMIIIkl88cUXirjr1atXkAgjCDXLli3D008/Da/Xq3n9u+++w3fffYcff/wRzz77bEzH3Ldvn2aB0Z4+WFNTg+eeey7m9wiCaIVEGEHEmd69e6Ouri7m77XF+k
VkJhs3bsRTTz0FWZYhCAImT56MkpISnHHGGdi3bx9Wr16NhoaGlLbx888/V/5dXl6Oa6+9FoIgRHyPIIhWSIQRRBSoXX0LFy7EG2+8gfXr10MQBBQXF+OPf/wj8vLyABi7I8vKyrB582bleDNmzMCMGTMAAH/+858xceJEQ3fkl19+ifnz5+Orr77C4cOH4fF40K1bN1x00UW48847ccEFFyTzZyCSxPPPPw9ZlgEAt9xyC+677z7lvb59++LOO++EJEkAgOLiYgDAsGHDFCuX0evqPgwE3O7Hjh0DEOinEydOBAB8/PHHeO2117B7926cOnUKeXl5GDVqFO666y7FZcf7Kqe8vBzl5eUYNmwY9u3bF/I9dfsIgiARRhAx89BDDymTFwCsWbMGx48fxz/+8Y+4n6u+vh5r167VvHb48GGsXbsWbrcbL7/8Mn72s5/F/bxE6jh8+DB27dql/H3bbbcZfk4Uxbife9GiRZg7d67mtf3792PZsmVwuVx46aWXMGDAgLiflyDigdvthsvlgsPhgN1uT3VzooJEGEHESO/evTFr1ixIkoS77roLhw8fRm1tLQ4dOoSzzjrL8DtVVVVYvnx5kPUrEoMGDcLzzz+P888/H127doXP58OKFStQUVGB5uZmLFmyBI888khcr49ILWorUufOndGjR4+4HLe8vBwTJ07EPffcAyA4JqypqQkvvvgigEBJn2effRYDBw6E0+nEggULcPToUTz77LOYO3culi9frtnAMm/ePE2gf7j3CCIRuN1ujB07Fl6vFzk5OVi9enVaCDHKExYDbrcbFRUVcLvdqW4KkULuvfdenHPOOejbty8KCwuV19WTZ7zIy8tDbW0t7r33XjgcDowZMwYVFRXK+999913cz0lkJ7t27VJcnNdeey2GDRuGzp0745577sHpp58OAKirqwvaKEAQZsDlcsHr9UKSJHi9XlPtxA0HWcKiJF1VNhF/+vXrp/y7Y8eOyr8TMTk99thjYUV/c3Nz3M9JpJZevXop/z5x4gQOHjyI7t27x3QMLqZi4fjx48q/e/bsqfzbYrGgR48eOHLkCCRJwk8//RRzewgi0TgcDuTk5ChztMPhSHWToiJhlrDJkyejR48eGDp0qPLaH/7wBwwaNAgXXnghbrzxRhw5ciRRp4876aqyifhjtbauXWLZ8RXr7rCjR48qAuzMM8/E66+/jtraWrz22msxHYdIL84880wMGTJE+fvll182/BwXWjabDYB2EfDDDz8YfidcHzzttNOUfzc2Nir/lmUZBw4cABCIQ+vWrVukSyCIpGO327F69Wo8+eSTaWUkSZgIu+OOO/D+++9rXrvqqquwc+dObN++HT//+c81bhWzw1W2KIpppbIJ86CevL755puI1gqr1apMmlarFV26dMGRI0fwwgsvJLSdROp54IEHYLEEhufXXnsNVVVVOHjwIPx+PxoaGrBw4UI89dRTAFotZ19//TX27dsHv98fso+o++B//vMf+Hw+5e8hQ4Yowf7vvfcetm7dihMnTmD+/PnKgnnEiBHIycmJ/wUTRByw2+2YPn162ggwIIHuyDFjxqC+vl7z2rhx45R/jx49Gm+++WaiTh93uMpOt50XhHkYOHAgbDYbfD4fXnnlFbzyyisAQme379SpE0aMGIHa2locOHAAEyZMABBIUUBkNiNHjsSf/vQnPP300/D7/ZpAd85ll10GALj66qtRVVWF5uZmTJo0SSPe9eTn5+P000/HkSNH8OGHH2LJkiUAgIcffhgDBw7Evffei7lz5+Lo0aO46667NN/t2rUrHn744QRcberYs2cP6uvr0b9//1Q3hchSUhaYv3DhQlxzzTWpOn2bSEeVTZiHHj16YMaMGRgwYEDU1oSnnnoK48aNQ9euXdGlSxdMmDAhrSzIRNuZNGkSXnvtNfzqV79C3759kZubi44dO6Jfv3644YYbcMcddwAIeB1+/etfo3v37rDZbCgqKsLChQ
sNj5mTk4OKigoMHjwYHTp0CHr/zjvvxHPPPYcRI0agS5cuEEURPXr0wPXXX49XXnklo9JT7NmzB06nE2vXroXT6aQ
2017-09-02 17:53:38 +00:00
"text/plain": [
"<Figure size 720x432 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from fbprophet.diagnostics import cross_validation\n",
"df_cv = cross_validation(\n",
" m, '365 days', initial='1825 days', period='365 days')\n",
"cutoff = df_cv['cutoff'].unique()[0]\n",
"df_cv = df_cv[df_cv['cutoff'].values == cutoff]\n",
"\n",
"fig = plt.figure(facecolor='w', figsize=(10, 6))\n",
"ax = fig.add_subplot(111)\n",
"ax.plot(m.history['ds'].values, m.history['y'], 'k.')\n",
"ax.plot(df_cv['ds'].values, df_cv['yhat'], ls='-', c='#0072B2')\n",
"ax.fill_between(df_cv['ds'].values, df_cv['yhat_lower'],\n",
" df_cv['yhat_upper'], color='#0072B2',\n",
" alpha=0.2)\n",
"ax.axvline(x=pd.to_datetime(cutoff), c='gray', lw=4, alpha=0.5)\n",
"ax.set_ylabel('y')\n",
"ax.set_xlabel('ds')\n",
"ax.text(x=pd.to_datetime('2010-01-01'),y=12, s='Initial', color='black',\n",
" fontsize=16, fontweight='bold', alpha=0.8)\n",
"ax.text(x=pd.to_datetime('2012-08-01'),y=12, s='Cutoff', color='black',\n",
" fontsize=16, fontweight='bold', alpha=0.8)\n",
"ax.axvline(x=pd.to_datetime(cutoff) + pd.Timedelta('365 days'), c='gray', lw=4,\n",
" alpha=0.5, ls='--')\n",
"ax.text(x=pd.to_datetime('2013-01-01'),y=6, s='Horizon', color='black',\n",
" fontsize=16, fontweight='bold', alpha=0.8);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[The Prophet paper](https://peerj.com/preprints/3190.pdf) gives further description of simulated historical forecasts.\n",
"\n",
"This cross validation procedure can be done automatically for a range of historical cutoffs using the `cross_validation` function. We specify the forecast horizon (`horizon`), and then optionally the size of the initial training period (`initial`) and the spacing between cutoff dates (`period`). By default, the initial training period is set to three times the horizon, and cutoffs are made every half a horizon.\n",
"\n",
"The output of `cross_validation` is a dataframe with the true values `y` and the out-of-sample forecast values `yhat`, at each simulated forecast date and for each cutoff date. In particular, a forecast is made for every observed point between `cutoff` and `cutoff + horizon`. This dataframe can then be used to compute error measures of `yhat` vs. `y`.\n",
"\n",
"Here we do cross-validation to assess prediction performance on a horizon of 365 days, starting with 730 days of training data in the first cutoff and then making predictions every 180 days. On this 8 year time series, this corresponds to 11 total forecasts."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"output_hidden": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Making 11 forecasts with cutoffs between 2010-02-15 and 2015-01-20\n",
"\n"
]
},
{
"data": {
"text/plain": [
" ds y yhat yhat_lower yhat_upper cutoff\n",
"1 2010-02-16 8.242493 8.954992 8.423614 9.496403 2010-02-15\n",
"2 2010-02-17 8.008033 8.721365 8.226481 9.219106 2010-02-15\n",
"3 2010-02-18 8.045268 8.605072 8.103985 9.104483 2010-02-15\n",
"4 2010-02-19 7.928766 8.526855 8.023088 9.042035 2010-02-15\n",
"5 2010-02-20 7.745003 8.268741 7.757920 8.779416 2010-02-15\n",
"6 2010-02-21 7.866339 8.599935 8.084956 9.060284 2010-02-15\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%R\n",
"df.cv <- cross_validation(m, initial = 730, period = 180, horizon = 365, units = 'days')\n",
"head(df.cv)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>ds</th>\n",
" <th>yhat</th>\n",
" <th>yhat_lower</th>\n",
" <th>yhat_upper</th>\n",
" <th>y</th>\n",
" <th>cutoff</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2010-02-16</td>\n",
" <td>8.956572</td>\n",
" <td>8.460049</td>\n",
" <td>9.460400</td>\n",
" <td>8.242493</td>\n",
" <td>2010-02-15</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2010-02-17</td>\n",
" <td>8.723004</td>\n",
" <td>8.200557</td>\n",
" <td>9.236561</td>\n",
" <td>8.008033</td>\n",
" <td>2010-02-15</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2010-02-18</td>\n",
" <td>8.606823</td>\n",
" <td>8.070835</td>\n",
" <td>9.123754</td>\n",
" <td>8.045268</td>\n",
" <td>2010-02-15</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>2010-02-19</td>\n",
" <td>8.528688</td>\n",
" <td>8.034782</td>\n",
" <td>9.042712</td>\n",
" <td>7.928766</td>\n",
" <td>2010-02-15</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>2010-02-20</td>\n",
" <td>8.270706</td>\n",
" <td>7.754891</td>\n",
" <td>8.739012</td>\n",
" <td>7.745003</td>\n",
" <td>2010-02-15</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" ds yhat yhat_lower yhat_upper y cutoff\n",
"0 2010-02-16 8.956572 8.460049 9.460400 8.242493 2010-02-15\n",
"1 2010-02-17 8.723004 8.200557 9.236561 8.008033 2010-02-15\n",
"2 2010-02-18 8.606823 8.070835 9.123754 8.045268 2010-02-15\n",
"3 2010-02-19 8.528688 8.034782 9.042712 7.928766 2010-02-15\n",
"4 2010-02-20 8.270706 7.754891 8.739012 7.745003 2010-02-15"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from fbprophet.diagnostics import cross_validation\n",
"df_cv = cross_validation(m, initial='730 days', period='180 days', horizon = '365 days')\n",
"df_cv.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In R, the argument `units` must be a type accepted by `as.difftime`, which is weeks or shorter. In Python, the string for `initial`, `period`, and `horizon` should be in the format used by Pandas Timedelta, which accepts units of days or shorter.\n",
"\n",
"The `performance_metrics` utility can be used to compute some useful statistics of the prediction performance (`yhat`, `yhat_lower`, and `yhat_upper` compared to `y`), as a function of the distance from the cutoff (how far into the future the prediction was). The statistics computed are mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), mean absolute percent error (MAPE), and coverage of the `yhat_lower` and `yhat_upper` estimates. These are computed on a rolling window of the predictions in `df_cv` after sorting by horizon (`ds` minus `cutoff`). By default 10% of the predictions will be included in each window, but this can be changed with the `rolling_window` argument."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"output_hidden": true
},
"outputs": [
{
"data": {
"text/plain": [
" horizon mse rmse mae mape coverage\n",
"1 37 days 0.4971086 0.7050593 0.5075009 0.05882459 0.6765646\n",
"2 38 days 0.5029463 0.7091870 0.5125229 0.05940706 0.6765646\n",
"3 39 days 0.5252677 0.7247535 0.5186555 0.06001158 0.6751942\n",
"4 40 days 0.5326181 0.7298069 0.5215775 0.06032500 0.6788488\n",
"5 41 days 0.5401377 0.7349406 0.5226521 0.06041353 0.6838739\n",
"6 42 days 0.5438937 0.7374915 0.5230473 0.06043453 0.6891275\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%R\n",
"df.p <- performance_metrics(df.cv)\n",
"head(df.p)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>horizon</th>\n",
" <th>mse</th>\n",
" <th>rmse</th>\n",
" <th>mae</th>\n",
" <th>mape</th>\n",
" <th>coverage</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>37 days</td>\n",
" <td>0.495378</td>\n",
" <td>0.703831</td>\n",
" <td>0.505713</td>\n",
" <td>0.058593</td>\n",
" <td>0.680448</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>38 days</td>\n",
" <td>0.501134</td>\n",
" <td>0.707908</td>\n",
" <td>0.510680</td>\n",
" <td>0.059169</td>\n",
" <td>0.679077</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>39 days</td>\n",
" <td>0.523334</td>\n",
" <td>0.723418</td>\n",
" <td>0.516755</td>\n",
" <td>0.059766</td>\n",
" <td>0.677707</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>40 days</td>\n",
" <td>0.530625</td>\n",
" <td>0.728440</td>\n",
" <td>0.519645</td>\n",
" <td>0.060075</td>\n",
" <td>0.678849</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>41 days</td>\n",
" <td>0.538117</td>\n",
" <td>0.733565</td>\n",
" <td>0.520663</td>\n",
" <td>0.060156</td>\n",
" <td>0.686386</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" horizon mse rmse mae mape coverage\n",
"0 37 days 0.495378 0.703831 0.505713 0.058593 0.680448\n",
"1 38 days 0.501134 0.707908 0.510680 0.059169 0.679077\n",
"2 39 days 0.523334 0.723418 0.516755 0.059766 0.677707\n",
"3 40 days 0.530625 0.728440 0.519645 0.060075 0.678849\n",
"4 41 days 0.538117 0.733565 0.520663 0.060156 0.686386"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from fbprophet.diagnostics import performance_metrics\n",
"df_p = performance_metrics(df_cv)\n",
"df_p.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Cross validation performance metrics can be visualized with `plot_cross_validation_metric`, here shown for MAPE. Dots show the absolute percent error for each prediction in `df_cv`. The blue line shows the MAPE, where the mean is taken over a rolling window of the dots. We see for this forecast that errors around 5% are typical for predictions one month into the future, and that errors increase up to around 11% for predictions that are a year out."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"output_hidden": true
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAtAAAAGwCAIAAAAPKcUMAAAACXBIWXMAAAsSAAALEgHS3X78AAAgAElEQVR4nOydWYwr2V3/Ty3ey/vS7W73cu+d7rvOhEACo0kUJARZxAt5gweQ5iEPiIkIQkIkEPIAUiTmYQRBJCJCQigCCQUhEBL5Jw8wQSiDFCFImGTmzs3ce3u323uV91r+Dz/N4aTs9nV3u2x3+/t5cru9nCpXnfM9v1VyHIcBAAAAAHiJPO8BAAAAAOD6A8EBAAAAAM+B4AAAAACA50BwAAAAAMBz1HkPwE2v1zNNc7qf6ff7+/3+dD9zwZEkSZIk27bnPZCZoqqqbdvLdtSKoliWNe9RzBRFUSRJmvpEseAs4Q8tSZLP58PsvYBEIpELvGvhBEe/3+/1elP8QFmWI5FIo9GY4mcuPoqi+Hy+brc774HMlHg83uv1lm16ikQinU5n3qOYKaFQSFXVZTvqcDjc7XaXKq9QluVQKLRss7eqqoqiTHcdnDoXExxwqQAAAADAcyA4AAAAAOA5EBwAAAAA8BwIDgAAAAB4DgQHAAAAADwHggMAAAAAngPBAQAAAADPgeAAAAAAgOdAcAAAAADAcyA4AAAAAOA5EBwAAAAA8BwIDgAAAAB4DgQHAAAAADwHggMAAAAAnrNw7ekBWECOj4/L5TJjbHt7OxqNzns4AABw9YCFA4Bn0Gg0SG0wxp48eeI4znzHAwAAVxEIDgCewWAwEP+0bXteIwEAgKsLBAcAz0DTNP44FospijLHwQAAwBUFMRwAPINgMHjr1q1Go6EoSjqdnvdwAADgSgLBAcCzCYfD4XB43qMAAIArDFwqAAAAAPAcCA4AAAAAeA4EBwAAAAA8B4IDAAAAAJ4DwQEAAAAAz4HgAAAAAIDnQHAAAAAAwHMgOAAAAADgORAcAAAAAPAcCA4AAAAAeA4EBwAAAAA8B4IDAAAAAJ4DwQEAAAAAz4HgAAAAAIDnQHAAAAAAwHMgOAAAAADgORAcAAAAAPAcCA4AAAAAeA4EBwAAAAA8B4IDAAAAAJ4DwQEAAAAAz4HgAAAAAIDnQHAAAAAAwHMgOAAAAADgOaqnn25Z1muvvdZqtTY3N19++WV60nGcv/zLvyyVStFo9JVXXpEkydMxAAAAAGDueGvheOONNwqFwhe+8IWTk5ODgwP+ZCgU+tznPveTP/mTxWLR0wEAAAAAYBHw1sLx6NGju3fvMsZu3rz56NGjQqHAGHvrrbdkWf7TP/3Te/fura6u8lf+x3/8B2Psp3/6pzc3N6c4BrKghEKhKX7m4iPLsqIoy2Y9kmXZ7/crijLvgcwUVVWX7fL2+XyyLC/hUTPGHMeZ90BmhyRJkiQt2w8tv8e8BzJ9vBUchmFkMhnGWDqdNgyDnmy1Wq1W6+WXX/7KV76SzWbf9773Mcba7fbh4SFjrNPpTHfBoEV32RYhulGX8KiX7ZDZUh710l7e13IRGsNyzt6yLF/Xy9tbwRGJRCqVys2bNyuVSi6XoyfD4fBLL72Uy+U+8pGPvPPOOyQ4XnjhhRdeeIExpus6lyZTQZblQCAw3c9cfBRF8fl83W533gOZKfF4vNPp9Pv9eQ9kpkQikVarNe9RzJRQKKSq6rLd1OFwuNPpLJWFg2yWy/ZDq6qqKEqv15v3QMYRDAYv8C5v9fLOzs7jx48ZY0+fPt3Z2eFPvvPOO4yxd999d2VlxdMBAAAAAGAR8FZwvPjii0dHR6+++urq6mqhUHj48OGf/dmfvfjii48fP/7c5z5XrVZfeuklTwcAAAAAgEVAWjQDna7r0zUlybKcSqXK5fIUP3PxgUtlijiOY9v2wrpUl9alouv6vAcyU5bTpZJMJiuVyrwHMlOuhEuFojPPi7cxHABcdWq1GmV0JxKJQqGwbIk/AAAwLZYr5hmAc+E4Dq8fU6/Xm83mfMcDAABXFwgOAM7EsizxT9M05zUSAAC46kBwAH
AmqqrG43H+ZywWm+NgAADgSoMYDgDGsbGxEYvFTNOMx+NU6hEAAMAFgOAAYBySJCUSiXmPAgAArjxwqQAAAADAcyA4AAAAAOA5EBwAAAAA8BwIDgAAAAB4DgQHAAAAADwHggMAAAAAngPBAQAAAADPgeAAAAAAgOdAcAAAAADAcyA4AAAAAOA5EBwAAAAA8BwIDgAAAAB4DgQHAAAAADwHggMAAAAAngPBAQAAAADPgeAAAAAAgOdAcAAAAADAcyA4AAAAAOA5EBwAAAAA8BwIDgAAAAB4DgQHAAAAADwHggMAAAAAngPBAQAAAADPgeAAAAAAgOdAcAAAAADAcyA4AAAAAOA5EBwAAAAA8BwIDgAAAAB4DgQHAAAAADwHggMAAAAAngPBAQAAAADPgeAAAAAAgOdAcAAAAADAcyA4AAAAAOA5EBwAAAAA8BwIDgAAAAB4DgQHAAAAADwHggMAAAAAngPBAQAAAADPgeAAAAAAgOdAcAAAAADAcyA4AAAAAOA5EBwAAAAA8Bx13gNwo6qqqk5zVJIkMcYikcgUP3PxkSRJlmVFUeY9kJmiKEowGPT5fPMeyEzx+XzLdnmrqirL8rIdtc/no9lseZAkSZKkZfuhZVmWJGm66+CCsHCHZJpmr9eb4gfKshwMBlut1hQ/c/FRFMXn83W73XkPZKaoqtrtdvv9/rwHMlMikciyXd6hUEhV1WU76nA43Ol0HMeZ90BmhyzLgUBg2X5oVVUVRZnuOjh1QqHQBd4FlwoAAAAAPAeCAwAAAACeA8EBAAAAAM+B4AAAAACA50BwAAAAAMBzIDgAAAAA4DkLlxYLwOJQq9Xa7XYoFEomk8tWAgEAAKYLBAcAoymXy8fHx/TYNM1cLjff8QAAwJUGLhUARiOWG2q323McCQAAXAMgOAAYjd/v54+vZZlhAACYJRAcAIwmm83G43HGWDweX1lZmfdwAADgaoN9GwCjUVV1c3Nz3qMAAIBrAiwcAAAAAPAcCA4AAAAAeA4EBwAAAAA8B4IDAAAAAJ4DwQEAAAAAz4HgAAAAAIDnQHAAAAAAwHMgOAAAAADgORAcAAAAAPAcCA4AAAAAeA4EBwAAAAA8B4IDAAAAAJ4DwQEAAAAAz4HgAAAAAIDnQHAAAAAAwHMgOAAAAADgORAcAAAAAPAcCA4AAAAAeA4EBwAAAAA8B4IDAAAAAJ4DwQEAAAAAz4HgAAAAAIDnQHAAAAAAwHMgOAAAAADgORAcAAAAAPAcCA4AAAAAeA4EBwAAAAA8B4IDAAAAAJ4DwQEAAAAAz4HgAAAAAIDnQHAAAAAAwHMgOAAAAADgORAcAAAAAPAcCA4AAAAAeA4EBwAAAAA8B4IDAAAAAJ4DwQEAAAAAz1E9/XTLsl577bVWq7W5ufnyyy+7/qvrejQa9XQAAAAAAFgEvLVwvPHGG4VC4Qtf+MLJycnBwYH4r3/913/96le/6um3AwAAAGBB8FZwPHr06ObNm4yxmzdvPnr0iD9fLBZff/11T78aAAAAAIuDty4VwzAymQxjLJ1OG4ZBT9q2/Vd/9Ve/+qu/+o//+I/8ld/+9rfJ4PGpT33qpZdemvpIEonE1D9zkZEkSZKkYDA474HMFEVRIpFIOBye90BmiizLPp9v3qOYKbIsS5K0bDe1LMt+v3/eo5g1S/hDS5LEGAuFQvMeyPTxVnBEIpFKpXLz5s1KpZLL5ejJr3/96x/96Edd0Rt379799Kc/zRjL5/OtVmuKY5AkKRaLTfczFx9FURRF6ff78x7ITIlEIv1+fzAYzHsgMyUYDHa73XmPYqYEAgFFUdrt9rwHMlOCwWCv13McZ94DmR2yLGuatoSztyzLCz6PxePxC7zLW8Gxs7Pz+PHjD37wg0+fPv3Qhz5ET7bb7X/+53/u9XqHh4f/8i//8olPfIIxls1ms9ksY0zX9V6vN8UxyLLMGFvwH2/q2LbNlu+oHccxTXPZjtrv9y/bIauqKknSsh21z+
cbDAbLJjjYUs5jiqJcy6P2VnC8+OKLX/rSl1599dXV1dVCofDw4cNvfvObr7zyCmOsVCp97WtfI7UBAAAAgOuNtGh
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%R -w 10 -h 6 -u in\n",
"plot_cross_validation_metric(df.cv, metric = 'mape')"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAmQAAAF3CAYAAAALu1cUAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzsnX1sHMd5/7/3QvLuSN4dJeqNpExZpiiLciOblmLFLiy5gSBERtW6RVM3QREkcNUXAQka9CV9MxohRQIEcJFAQVs1bdLWTdyiQWIDSWQotiUFhi3VpmxXoSWStkiLR1Hkibw78o5H3tvvD/1mPLe3e7d3vOUuqe8HCGJSy93ZmdmZ7zzPM8+4CoVCAYQQQgghxDbcdheAEEIIIeROh4KMEEIIIcRmKMgIIYQQQmyGgowQQgghxGYoyAghhBBCbIaCjBBCCCHEZijICCGEEEJshoKMEEIIIcRmKMgIIYQQQmyGgowQQgghxGa8dhegWtrb27Ft2zbL7p9MJtHc3GzZ/VcrrJdSWCf6sF70Yb2UwjrRh/Wiz2qtl9HRUUSj0YrXrTpBtm3bNrzxxhuW3f/s2bM4ePCgZfdfrbBeSmGd6MN60Yf1UgrrRB/Wiz6rtV727t1r6jq6LAkhhBBCbIaCjBBCCCHEZijICCGEEEJshoKMEEIIIcRmKMgIIYQQQmyGgowQQgghxGYoyAghhBBCbIaCjBBCCCHEZijICCGEEEJshoKMEEIIIcRmKMgIcTjxeBxjY2OIx+N2F4UQQohFrLqzLAm5k4jH4zh37hzy+TzcbjcOHDiAUChkd7EIIYTUGVrICHEwsVgM+Xwe4XAY+XwesVjM7iIRQgixAAoyQhxMOByG2+1GLBaD2+1GOBy2u0iEEEIsgC5LQhxMKBTCgQMHEIvFEA6H6a4khJA1CgUZIQ4nFApRiBFCyBqHLktCCCGEEJuhICOEEEIIsRkKMkIIIYQQm6EgI4QQQgixGQoyQgghhBCboSAjhBBCCLEZCjJCCCGEEJuhICOEEEIIsRkKMkIIIYQQm6EgI4QQQgixGQoyQgghhBCboSAjhBBCCLEZCjJCCCGEEJuhICOEEEIIsRkKMkIIIYQQm6EgI4QQQgixGQoyQgghhBCboSAjhBBCCLEZSwXZ6dOnsXPnTvT09OBrX/ua7jX//d//jb6+PuzevRuf+tSnrCwOIYQQQogj8Vp141wuh+PHj+PMmTPo6urCvn37cPToUfT19clrhoeH8dWvfhWvvvoq2traMDU1ZVVxCCGEEEIci2UWsosXL6Knpwfbt29HY2MjnnzySTz//PNF1/zzP/8zjh8/jra2NgDAxo0brSoOIYQQQohjsUyQRSIRbN26Vf7c1dWFSCRSdM3Q0BCGhobwyCOPYP/+/Th9+rRVxSGEEEIIcSyWuSzNkM1mMTw8jLNnz2J8fByPPvoo/u///g/hcLjoulOnTuHUqVMAgPHxcZw9e9ayMs3Pz1t6/9UK66UU1ok+rBd9WC+lsE70Yb3os9brxTJB1tnZievXr8ufx8fH0dnZWXRNV1cXHnroITQ0NODuu+9Gb28vhoeHsW/fvqLrjh07hmPHjgEA9u7di4MHD1pVbJw9e9bS+69WWC+lsE70Yb3ow3ophXWiD+tFn7VeL5a5LPft24fh4WFcu3YNS0tLeO6553D06NGia379139dqt1oNIqhoSFs377dqiIRQgghhDgSywSZ1+vFyZMncfjwYezatQuf/OQnsXv3bjz99NN44YUXAACHDx/G+vXr0dfXh8ceewxf//rXsX79equKRAghhBDiSCyNITty5AiOHDlS9LsTJ07I/3a5XHjmmWfwzDPPWFkMQlYl8XgcsVgM4XAYoVDI7uIQQgixEFuD+gkh+sTjcZw7dw75fB5utxsHDhygKCOEkDUMj04ixIHEYjHk83mEw2Hk83nEYjG7i0QIIcRCKMgIcS
DhcBhutxuxWAxut7skFQwhhJC1BV2WhDiQUCiEAwcOMIaMEELuECjICHEooVCIQowQQu4Q6LIkhBBCCLEZCjJCCCGEEJuhICOEEEIIsRkKMkIIIYQQm6EgI4QQQgixGQoyQgghhBCboSAjhBBCCLEZCjJCCCGEEJuhICOEEEIIsRkKMkIIIYQQm6EgI4QQQgixGQoyQgghhBCboSAjhBBCCLEZCjJCCCGEEJuhICOEEEIIsRkKMkIIIYQQm6EgI4QQQgixGQoyQgghhBCboSAjhBBCCLEZCjJCCCGEEJuhICOEEEIIsRkKMkIIIYQQm6EgI4QQQgixGQoyQgghhBCboSAjhBBCCLEZCjJCCCGEEJuhICOEEEIIsRkKMkIIIYQQm6EgI4QQQgixGQoyQgghhBCboSAjhBBCCLEZCjJCCCGEEJuxVJCdPn0aO3fuRE9PD772ta+V/Pt3v/tdbNiwAffffz/uv/9+fPvb37ayOIQQQgghjsRr1Y1zuRyOHz+OM2fOoKurC/v27cPRo0fR19dXdN1v//Zv4+TJk1YVgxBCCCHE8VhmIbt48SJ6enqwfft2NDY24sknn8Tzzz9v1eMIIYQQQlYtllnIIpEItm7dKn/u6urChQsXSq77wQ9+gPPnz6O3txd///d/X/Q3glOnTuHUqVMAgPHxcZw9e9aqYmN+ft7S+69WWC+lsE70Yb3ow3ophXWiD+tFn7VeL5YJMjP86q/+Kn7nd34HTU1N+Kd/+id85jOfwcsvv1xy3bFjx3Ds2DEAwN69e3Hw4EHLynT27FlL779aYb2UwjrRh/WiD+ulFNaJPqwXfdZ6vVjmsuzs7MT169flz+Pj4+js7Cy6Zv369WhqagIAPPXUU3jzzTetKg4hhBBCiGOxTJDt27cPw8PDuHbtGpaWlvDcc8/h6NGjRdfcuHFD/vcLL7yAXbt2WVUcQgghhBDHYpnL0uv14uTJkzh8+DByuRw+97nPYffu3Xj66aexd+9eHD16FN/85jfxwgsvwOv1Yt26dfjud79rVXEIIYQQQhyLpTFkR44cwZEjR4p+d+LECfnfX/3qV/HVr37VyiIQQgghhDgeZuonhBBCCLEZCjJCHEg8HsfY2Bji8bjdRSGEELIC2Jr2ghBSSjwex7lz55DP5+F2u3HgwAGEQiG7i0UIIcRCaCEjxGHEYjHk83mEw2Hk83nEYjG7i0QIIcRiKMgIcRjhcBhutxuxWAxutxvhcNjuIhFCCLEYuiwJcRihUAgHDhxALBZDOBymu5IQQu4AKMgIcSChUIhCjBBC7iDosiSEEEIIsRkKMkIIIYQQm6EgI4QQQgixGQoyQgghhBCboSAjhBBCCLEZCjJCCCGEEJuhICOEEEIIsRkKMkIIIYQQm6EgI4QQQgixGQoyQgghhBCboSAjhBBCCLEZCjJCCCGEEJuhICOEEEIIsRkKMkIIIYQQm6EgI4QQQgixGQoyQgghhBCboSAjhBBCCLEZCjJCCCGEEJuhICOEEEIIsRkKMkIIIYQQm6EgI4QQQgixGQoyQgghhBCboSAjhBBCCLEZCjJCCCGEEJuhICOEEEIIsRkKMkIIIYQQm6EgI4QQQgixGQoyQgghhBCboSAjhBBCCLEZSwXZ6dOnsXPnTvT09OBrX/ua4XU/+MEP4HK58MYbb1hZHEIIIYQQR2KZIMvlcjh+/Dh++tOfYnBwEN///vcxODhYct3c3By+8Y1v4KGHHrKqKIQQQgghjsYyQXbx4kX09PRg+/btaGxsxJNPPonnn3++5Lq/+Zu/wZ//+Z/D5/NZVRRCCCGEEEdjmSCLRCLYunWr/LmrqwuRSKTomoGBAVy/fh2PP/64VcUghBBCCHE8XrsenM/n8cUvfhHf/e53K1576tQpnDp1CgAwPj6Os2fPWlau+fl5S++/WmG9lMI60Yf1og/rpRTWiT6sF33Wer1YJsg6Oztx/fp1+fP4+Dg6Ozvlz3Nzc7h8+TIOHjwIAJ
icnMTRo0fxwgsvYO/evUX3OnbsGI4dOwYA2Lt3r/wbKzh79qyl91+tsF5KYZ3ow3rRh/VSCutEH9aLPmu9XixzWe7
"text/plain": [
"<Figure size 720x432 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from fbprophet.plot import plot_cross_validation_metric\n",
"fig = plot_cross_validation_metric(df_cv, metric='mape')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The size of the rolling window in the figure can be changed with the optional argument `rolling_window`, which specifies the proportion of forecasts to use in each rolling window. The default is 0.1, corresponding to 10% of rows from `df_cv` included in each window; increasing this will lead to a smoother average curve in the figure.\n",
"\n",
"The `initial` period should be long enough to capture all of the components of the model, in particular seasonalities and extra regressors: at least a year for yearly seasonality, at least a week for weekly seasonality, etc."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 1
}