Add example of auto parameter selection to notebook diagnostics (#1464)

* Add example of dask distributed parameter tuning to notebook diagnostics

* Rearrange description blocks and correct typos

* Add more informative explanations and tips in docs for tuning examples

* increase output df column width to see results

* add instructions for multiprocess and threaded

* reset index of hyperparameter tuning results df

* fix typos

* remove blank code cell

* grammar correction

* Add subsection

* Split different parameter optimisation examples
Ryan Nazareth 2020-05-07 18:04:22 +01:00 committed by GitHub
parent 9168dcf11d
commit caaaee8d34

@@ -6,7 +6,18 @@
"metadata": {
"block_hidden": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/ryannazareth/anaconda3/envs/fbprophet/lib/python3.7/site-packages/rpy2/robjects/pandas2ri.py:17: FutureWarning: pandas.core.index is deprecated and will be removed in a future version. The public classes are available in the top-level namespace.\n",
" from pandas.core.index import Index as PandasIndex\n",
"/Users/ryannazareth/Documents/Python_sprints/prophet/python/fbprophet/diagnostics.py:10: TqdmExperimentalWarning: Using `tqdm.autonotebook.tqdm` in notebook mode. Use `tqdm.tqdm` instead to force console mode (e.g. in jupyter console)\n",
" from tqdm.autonotebook import tqdm\n"
]
}
],
"source": [
"%load_ext rpy2.ipython\n",
"%matplotlib inline\n",
@@ -468,17 +479,343 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The size of the rolling window in the figure can be changed with the optional argument `rolling_window`, which specifies the proportion of forecasts to use in each rolling window. The default is 0.1, corresponding to 10% of rows from `df_cv` included in each window; increasing this will lead to a smoother average curve in the figure. The `initial` period should be long enough to capture all of the components of the model, in particular seasonalities and extra regressors: at least a year for yearly seasonality, at least a week for weekly seasonality, etc.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Hyperparameter Optimisation\n",
"\n",
"Auto parameter tuning can also be carried out by evaluating the parameter combinations in serial and using the built-in parallelization over cutoffs. An example implementation with multiprocessing in Python is shown below, using a grid of six combinations of the `changepoint_prior_scale` and `changepoint_range` parameters. The function `create_param_combinations` creates a dataframe of parameter combinations, which is then iterated over serially, calling `single_cv_run` for each combination with the `parallel` keyword to parallelize over cutoffs. The best parameter combination is selected by the lowest `rmse` score, but another performance metric can be used depending on the use case.\n",
"\n",
"As an alternative for creating parameter combinations, one could also use the `ParameterGrid` class in `sklearn.model_selection`. This would need to be installed and imported separately if required, as it is not included with Prophet."
]
},
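The grid-expansion step described above can be illustrated on its own, independently of Prophet. This is a minimal sketch: it builds the same six-combination grid as a list of dicts with `itertools.product`; applying `sklearn.model_selection.ParameterGrid` to the same `param_grid` would yield equivalent dicts.

```python
import itertools

# The same grid used in the tuning cell below.
param_grid = {
    'changepoint_prior_scale': [0.05, 0.5, 5],
    'changepoint_range': [0.8, 0.9],
}

# Expand the grid into one dict per combination, ready to pass as
# Prophet(**param_dict); sklearn's ParameterGrid produces the same dicts.
keys = list(param_grid)
all_params = [dict(zip(keys, values))
              for values in itertools.product(*param_grid.values())]

print(len(all_params))   # 6 combinations
print(all_params[0])     # {'changepoint_prior_scale': 0.05, 'changepoint_range': 0.8}
```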
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
" The best param combination is {'changepoint_prior_scale': 0.05, 'changepoint_range': 0.8}\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>horizon</th>\n",
" <th>rmse</th>\n",
" <th>mape</th>\n",
" <th>params</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>200 days</td>\n",
" <td>0.450030</td>\n",
" <td>0.034958</td>\n",
" <td>{'changepoint_prior_scale': 0.05, 'changepoint_range': 0.8}</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>200 days</td>\n",
" <td>0.453755</td>\n",
" <td>0.035471</td>\n",
" <td>{'changepoint_prior_scale': 0.05, 'changepoint_range': 0.9}</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>200 days</td>\n",
" <td>0.456887</td>\n",
" <td>0.035469</td>\n",
" <td>{'changepoint_prior_scale': 0.5, 'changepoint_range': 0.8}</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>200 days</td>\n",
" <td>0.490453</td>\n",
" <td>0.039134</td>\n",
" <td>{'changepoint_prior_scale': 0.5, 'changepoint_range': 0.9}</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>200 days</td>\n",
" <td>0.463969</td>\n",
" <td>0.036428</td>\n",
" <td>{'changepoint_prior_scale': 5.0, 'changepoint_range': 0.8}</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>200 days</td>\n",
" <td>0.512077</td>\n",
" <td>0.040488</td>\n",
" <td>{'changepoint_prior_scale': 5.0, 'changepoint_range': 0.9}</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" horizon rmse mape \\\n",
"0 200 days 0.450030 0.034958 \n",
"1 200 days 0.453755 0.035471 \n",
"2 200 days 0.456887 0.035469 \n",
"3 200 days 0.490453 0.039134 \n",
"4 200 days 0.463969 0.036428 \n",
"5 200 days 0.512077 0.040488 \n",
"\n",
" params \n",
"0 {'changepoint_prior_scale': 0.05, 'changepoint_range': 0.8} \n",
"1 {'changepoint_prior_scale': 0.05, 'changepoint_range': 0.9} \n",
"2 {'changepoint_prior_scale': 0.5, 'changepoint_range': 0.8} \n",
"3 {'changepoint_prior_scale': 0.5, 'changepoint_range': 0.9} \n",
"4 {'changepoint_prior_scale': 5.0, 'changepoint_range': 0.8} \n",
"5 {'changepoint_prior_scale': 5.0, 'changepoint_range': 0.9} "
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from fbprophet.diagnostics import cross_validation, performance_metrics\n",
"import itertools\n",
"\n",
"def create_param_combinations(**param_dict):\n",
" param_iter = itertools.product(*param_dict.values())\n",
"    params = list(param_iter)\n",
" params_df = pd.DataFrame(params, columns=list(param_dict.keys()))\n",
" return params_df\n",
"\n",
"def single_cv_run(history_df, metrics, param_dict, parallel):\n",
" m = Prophet(**param_dict)\n",
" m.fit(history_df)\n",
" df_cv = cross_validation(m, initial='2600 days', period='100 days', horizon = '200 days', parallel=parallel)\n",
" df_p = performance_metrics(df_cv, rolling_window=1)\n",
" df_p['params'] = str(param_dict)\n",
" df_p = df_p.loc[:, metrics]\n",
" return df_p\n",
"\n",
"\n",
"pd.set_option('display.max_colwidth', None)\n",
"param_grid = { \n",
" 'changepoint_prior_scale': [0.05, 0.5, 5],\n",
" 'changepoint_range': [0.8, 0.9],\n",
" }\n",
"metrics = ['horizon', 'rmse', 'mape', 'params'] \n",
"results = []\n",
"\n",
"\n",
"params_df = create_param_combinations(**param_grid)\n",
"for param in params_df.values:\n",
" param_dict = dict(zip(params_df.keys(), param))\n",
" cv_df = single_cv_run(df, metrics, param_dict, parallel=\"processes\")\n",
" results.append(cv_df)\n",
"results_df = pd.concat(results).reset_index(drop=True)\n",
"best_param = results_df.loc[results_df['rmse'] == min(results_df['rmse']), ['params']]\n",
"print(f'\\n The best param combination is {best_param.values[0][0]}')\n",
"results_df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alternatively, when the number of parameter combinations is large and the user has access to many cores or a cluster, it can be beneficial to parallelize over parameter values instead. In the example below, parameter combinations are evaluated in parallel using `dask.distributed.Client`. The helper function `parallelize_param_combinations` parallelizes the calls to `single_cv_run` for each parameter combination, and the cutoffs in `cross_validation` are then evaluated serially. To switch to other parallel modes in this example, import the `concurrent.futures` module and set `pool=concurrent.futures.ThreadPoolExecutor()` for execution in threads or `pool=concurrent.futures.ProcessPoolExecutor()` for multiprocessing."
]
},
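The pool-switching pattern just described can be sketched without Prophet itself. The function `toy_cv_run` below is a hypothetical stand-in (a real run would call `single_cv_run`); the point is only how `submit`/`result` on a `concurrent.futures` executor distributes one call per parameter combination.

```python
from concurrent.futures import ThreadPoolExecutor

def toy_cv_run(param_dict):
    # Stand-in for single_cv_run: pretend the rmse equals the prior scale.
    return {'params': str(param_dict),
            'rmse': param_dict['changepoint_prior_scale']}

params = [{'changepoint_prior_scale': s, 'changepoint_range': r}
          for s in (0.05, 0.5, 5) for r in (0.8, 0.9)]

with ThreadPoolExecutor() as pool:
    futures = [pool.submit(toy_cv_run, p) for p in params]
    results = [f.result() for f in futures]

best = min(results, key=lambda r: r['rmse'])
print(best['params'])  # {'changepoint_prior_scale': 0.05, 'changepoint_range': 0.8}
```

Swapping in `ProcessPoolExecutor` requires only that the submitted function and its arguments be picklable, which is one reason the notebook passes a plain `param_dict` rather than a fitted model.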
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
" The best param combination is {'changepoint_prior_scale': 0.05, 'changepoint_range': 0.8}\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>horizon</th>\n",
" <th>rmse</th>\n",
" <th>mape</th>\n",
" <th>params</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>200 days</td>\n",
" <td>0.450030</td>\n",
" <td>0.034958</td>\n",
" <td>{'changepoint_prior_scale': 0.05, 'changepoint_range': 0.8}</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>200 days</td>\n",
" <td>0.453755</td>\n",
" <td>0.035471</td>\n",
" <td>{'changepoint_prior_scale': 0.05, 'changepoint_range': 0.9}</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>200 days</td>\n",
" <td>0.456887</td>\n",
" <td>0.035469</td>\n",
" <td>{'changepoint_prior_scale': 0.5, 'changepoint_range': 0.8}</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>200 days</td>\n",
" <td>0.490453</td>\n",
" <td>0.039134</td>\n",
" <td>{'changepoint_prior_scale': 0.5, 'changepoint_range': 0.9}</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>200 days</td>\n",
" <td>0.463969</td>\n",
" <td>0.036428</td>\n",
" <td>{'changepoint_prior_scale': 5.0, 'changepoint_range': 0.8}</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>200 days</td>\n",
" <td>0.512077</td>\n",
" <td>0.040488</td>\n",
" <td>{'changepoint_prior_scale': 5.0, 'changepoint_range': 0.9}</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" horizon rmse mape \\\n",
"0 200 days 0.450030 0.034958 \n",
"1 200 days 0.453755 0.035471 \n",
"2 200 days 0.456887 0.035469 \n",
"3 200 days 0.490453 0.039134 \n",
"4 200 days 0.463969 0.036428 \n",
"5 200 days 0.512077 0.040488 \n",
"\n",
" params \n",
"0 {'changepoint_prior_scale': 0.05, 'changepoint_range': 0.8} \n",
"1 {'changepoint_prior_scale': 0.05, 'changepoint_range': 0.9} \n",
"2 {'changepoint_prior_scale': 0.5, 'changepoint_range': 0.8} \n",
"3 {'changepoint_prior_scale': 0.5, 'changepoint_range': 0.9} \n",
"4 {'changepoint_prior_scale': 5.0, 'changepoint_range': 0.8} \n",
"5 {'changepoint_prior_scale': 5.0, 'changepoint_range': 0.9} "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from dask.distributed import Client\n",
"import functools\n",
"from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor\n",
"\n",
"\n",
"def parallelize_param_combinations(history_df, params_df, single_cv_callable, pool):\n",
" results = []\n",
" for param in params_df.values:\n",
" param_dict = dict(zip(params_df.keys(), param))\n",
"    if isinstance(pool, (ThreadPoolExecutor, ProcessPoolExecutor)):\n",
" future = pool.submit(single_cv_callable, history_df, param_dict=param_dict)\n",
" results.append(future.result())\n",
" elif isinstance(pool, Client):\n",
" remote_df = pool.scatter(history_df)\n",
" future = pool.submit(single_cv_callable, remote_df, param_dict=param_dict)\n",
" results.append(future)\n",
" if isinstance(pool, Client):\n",
" results = pool.gather(results)\n",
" results_df = pd.concat(results).reset_index(drop=True)\n",
" \n",
" return results_df\n",
"\n",
"\n",
"single_cv_callable = functools.partial(single_cv_run, metrics=metrics, parallel=None)\n",
"\n",
"pool = Client()\n",
"results_df = parallelize_param_combinations(df, params_df, single_cv_callable, pool=pool)\n",
"best_param = results_df.loc[results_df['rmse'] == min(results_df['rmse']), ['params']]\n",
"print(f'\\n The best param combination is {best_param.values[0][0]}')\n",
"results_df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Recommended Hyperparameter Ranges"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the examples above, we have used recommended initial settings of `changepoint_prior_scale: [0.05, 0.5, 5]` and \n",
"`changepoint_range: [0.8, 0.9]`. Alternatively, a random search could be used to sweep over a range of values, e.g. `np.random.uniform(0.05, 5, 3)`. Other parameters such as `seasonality_prior_scale`, `holidays_prior_scale` and `seasonality_mode` could also be optimised. For the seasonality and holiday prior scales, recommended starting values are `[0.1, 1, 10]`, and it is better to space candidate values on a log scale, e.g. by sampling exponents with `np.random.uniform(-1, 1, 5)` and using `10 ** exponent`."
]
}
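The log-scale sampling suggested above can be sketched as follows; this is a minimal illustration assuming NumPy, and the seed is arbitrary, included only to make the example reproducible.

```python
import numpy as np

rng = np.random.default_rng(42)

# Sample exponents uniformly in [-1, 1] and exponentiate, so the five
# candidate prior scales are spread evenly on a log scale within [0.1, 10].
candidates = 10 ** rng.uniform(-1, 1, 5)

print(candidates)
print(all(0.1 <= c <= 10 for c in candidates))  # True
```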
],
"metadata": {
"kernelspec": {
"display_name": "Python (fbprophet)",
"language": "python",
"name": "fbprophet"
},
"language_info": {
"codemirror_mode": {
@@ -490,7 +827,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,