Update site with documentation updates

Ben Letham 2018-05-30 22:43:52 -07:00
parent f5b4249180
commit af2824fbd0
87 changed files with 603 additions and 250 deletions

View file

@ -4,7 +4,8 @@
- id: quick_start
- id: saturating_forecasts
- id: trend_changepoints
- id: seasonality_and_holiday_effects
- id: seasonality,_holiday_effects,_and_regressors
- id: multiplicative_seasonality
- id: uncertainty_intervals
- id: outliers
- id: non-daily_data

View file

@ -5,7 +5,7 @@ title: "Getting Help and Contributing"
permalink: /docs/contributing.html
---
Prophet has an non-fixed release cycle but we will be making bugfixes in response to user feedback and adding features. Its current state is Beta (v0.3), we expect no obvious bugs. Please let us know if you encounter a bug by [filing an issue](https://github.com/facebook/prophet/issues). Github issues is also the right place to ask questions about using Prophet.
Prophet has a non-fixed release cycle, but we will be making bugfixes in response to user feedback and adding features. Its current state is Beta (v0.3), and we expect no obvious bugs. Please let us know if you encounter a bug by [filing an issue](https://github.com/facebook/prophet/issues). GitHub issues are also the right place to ask questions about using Prophet.
We appreciate all contributions. If you are planning to contribute back bug-fixes, please do so without any further discussion.

View file

@ -14,23 +14,38 @@ Prophet includes functionality for time series cross validation to measure forec
This cross validation procedure can be done automatically for a range of historical cutoffs using the `cross_validation` function. We specify the forecast horizon (`horizon`), and then optionally the size of the initial training period (`initial`) and the spacing between cutoff dates (`period`). By default, the initial training period is set to three times the horizon, and cutoffs are made every half a horizon.
The output of `cross_validation` is a dataframe with the true values `y` and the out-of-sample forecast values `yhat`, at each simulated forecast date and for each cutoff date. This dataframe can then be used to compute error measures of `yhat` vs. `y`.
The output of `cross_validation` is a dataframe with the true values `y` and the out-of-sample forecast values `yhat`, at each simulated forecast date and for each cutoff date. In particular, a forecast is made for every observed point between `cutoff` and `cutoff + horizon`. This dataframe can then be used to compute error measures of `yhat` vs. `y`.
Here we do cross-validation to assess prediction performance on a horizon of 365 days, starting with 730 days of training data in the first cutoff and then making predictions every 180 days. On this 8-year time series, this corresponds to 11 total forecasts.
```R
# R
df.cv <- cross_validation(m, horizon = 730, units = 'days')
df.cv <- cross_validation(m, initial = 730, period = 180, horizon = 365, units = 'days')
head(df.cv)
```
```python
# Python
from fbprophet.diagnostics import cross_validation
df_cv = cross_validation(m, horizon = '730 days')
df_cv = cross_validation(m, initial='730 days', period='180 days', horizon='365 days')
df_cv.head()
```
<div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
@ -46,51 +61,164 @@ df_cv.head()
<tbody>
<tr>
<th>0</th>
<td>2014-01-21</td>
<td>9.439510</td>
<td>8.799215</td>
<td>10.080240</td>
<td>10.542574</td>
<td>2014-01-20</td>
<td>2010-02-16</td>
<td>8.957184</td>
<td>8.438130</td>
<td>9.431683</td>
<td>8.242493</td>
<td>2010-02-15</td>
</tr>
<tr>
<th>1</th>
<td>2014-01-22</td>
<td>9.267086</td>
<td>8.645900</td>
<td>9.882225</td>
<td>10.004283</td>
<td>2014-01-20</td>
<td>2010-02-17</td>
<td>8.723619</td>
<td>8.228941</td>
<td>9.225985</td>
<td>8.008033</td>
<td>2010-02-15</td>
</tr>
<tr>
<th>2</th>
<td>2014-01-23</td>
<td>9.263447</td>
<td>8.628803</td>
<td>9.852847</td>
<td>9.732818</td>
<td>2014-01-20</td>
<td>2010-02-18</td>
<td>8.607378</td>
<td>8.086717</td>
<td>9.125563</td>
<td>8.045268</td>
<td>2010-02-15</td>
</tr>
<tr>
<th>3</th>
<td>2014-01-24</td>
<td>9.277452</td>
<td>8.693226</td>
<td>9.897891</td>
<td>9.866460</td>
<td>2014-01-20</td>
<td>2010-02-19</td>
<td>8.529250</td>
<td>8.053584</td>
<td>9.056437</td>
<td>7.928766</td>
<td>2010-02-15</td>
</tr>
<tr>
<th>4</th>
<td>2014-01-25</td>
<td>9.087565</td>
<td>8.447306</td>
<td>9.728898</td>
<td>9.370927</td>
<td>2014-01-20</td>
<td>2010-02-20</td>
<td>8.271228</td>
<td>7.748368</td>
<td>8.756539</td>
<td>7.745003</td>
<td>2010-02-15</td>
</tr>
</tbody>
</table>
</div>
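The cutoff schedule described above can be sketched in plain Python. This is a simplified illustration, not the library's implementation: the function name `sketch_cutoffs` is hypothetical, and the first and last dates of the series are assumed values chosen to be consistent with an 8-year history and the `2010-02-15` cutoff shown in the table. Cutoffs are placed every `period` days, working backwards from the last date that still leaves a full `horizon` of observed data, and stopping once fewer than `initial` days of training data would remain.

```python
from datetime import date, timedelta


def sketch_cutoffs(first_day, last_day, initial, period, horizon):
    """Return cutoff dates, oldest first (simplified sketch, not Prophet's code)."""
    cutoffs = []
    # The most recent cutoff must leave a full horizon of observed data after it.
    cutoff = last_day - horizon
    # Step backwards while at least `initial` of history precedes the cutoff.
    while cutoff >= first_day + initial:
        cutoffs.append(cutoff)
        cutoff -= period
    return sorted(cutoffs)


# Assumed date range for the example series (hypothetical values).
cutoffs = sketch_cutoffs(
    first_day=date(2007, 12, 10),
    last_day=date(2016, 1, 20),
    initial=timedelta(days=730),
    period=timedelta(days=180),
    horizon=timedelta(days=365),
)
print(len(cutoffs), cutoffs[0], cutoffs[-1])
```

Under these assumed dates the sketch yields 11 cutoffs, the earliest being 2010-02-15, which matches the count quoted in the text.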
The `performance_metrics` utility can be used to compute some useful statistics of the prediction performance (`yhat`, `yhat_lower`, and `yhat_upper` compared to `y`), as a function of the distance from the cutoff (how far into the future the prediction was). The statistics computed are mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), mean absolute percent error (MAPE), and coverage of the `yhat_lower` and `yhat_upper` estimates. These are computed on a rolling window of the predictions in `df_cv` after sorting by horizon (`ds` minus `cutoff`). By default 10% of the predictions will be included in each window, but this can be changed with the `rolling_window` argument.
```R
# R
df.p <- performance_metrics(df.cv)
head(df.p)
```
```python
# Python
from fbprophet.diagnostics import performance_metrics
df_p = performance_metrics(df_cv)
df_p.head()
```
<div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>horizon</th>
<th>mse</th>
<th>rmse</th>
<th>mae</th>
<th>mape</th>
<th>coverage</th>
</tr>
</thead>
<tbody>
<tr>
<th>3297</th>
<td>37 days</td>
<td>0.481970</td>
<td>0.694241</td>
<td>0.502930</td>
<td>0.058371</td>
<td>0.673367</td>
</tr>
<tr>
<th>35</th>
<td>37 days</td>
<td>0.480991</td>
<td>0.693535</td>
<td>0.502007</td>
<td>0.058262</td>
<td>0.675879</td>
</tr>
<tr>
<th>2207</th>
<td>37 days</td>
<td>0.480936</td>
<td>0.693496</td>
<td>0.501928</td>
<td>0.058257</td>
<td>0.675879</td>
</tr>
<tr>
<th>2934</th>
<td>37 days</td>
<td>0.481455</td>
<td>0.693870</td>
<td>0.502999</td>
<td>0.058393</td>
<td>0.675879</td>
</tr>
<tr>
<th>393</th>
<td>37 days</td>
<td>0.483990</td>
<td>0.695694</td>
<td>0.503418</td>
<td>0.058494</td>
<td>0.675879</td>
</tr>
</tbody>
</table>
</div>
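As a rough illustration of what these statistics measure, the sketch below computes them by hand for the five rows of `df_cv` shown earlier. It assumes the columns in that table are ordered `ds`, `yhat`, `yhat_lower`, `yhat_upper`, `y`, `cutoff`, and it uses a single window containing all five rows instead of a rolling window over the full dataframe, so the numbers will not match the `performance_metrics` output above.

```python
import math

# (yhat, yhat_lower, yhat_upper, y) for the five df_cv rows shown earlier,
# assuming that column order.
rows = [
    (8.957184, 8.438130, 9.431683, 8.242493),
    (8.723619, 8.228941, 9.225985, 8.008033),
    (8.607378, 8.086717, 9.125563, 8.045268),
    (8.529250, 8.053584, 9.056437, 7.928766),
    (8.271228, 7.748368, 8.756539, 7.745003),
]

n = len(rows)
errors = [y - yhat for yhat, _, _, y in rows]

mse = sum(e ** 2 for e in errors) / n            # mean squared error
rmse = math.sqrt(mse)                            # root mean squared error
mae = sum(abs(e) for e in errors) / n            # mean absolute error
mape = sum(abs(e / y) for e, (_, _, _, y) in zip(errors, rows)) / n
# Fraction of true values that fall inside [yhat_lower, yhat_upper].
coverage = sum(lo <= y <= hi for _, lo, hi, y in rows) / n

print(mse, rmse, mae, mape, coverage)
```

In these five rows every `y` falls below `yhat_lower`, so the coverage for this tiny window is zero. A window this small says little about the model; the rolling window over all cutoffs is what makes the reported statistics stable.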
Cross validation performance metrics can be visualized with `plot_cross_validation_metric`, here shown for MAPE. Dots show the absolute percent error for each prediction in `df_cv`. The blue line shows the MAPE, where the mean is taken over a rolling window of the dots. We see for this forecast that errors around 5% are typical for predictions one month into the future, and that errors increase up to around 11% for predictions that are a year out.
```R
# R
plot_cross_validation_metric(df.cv, metric = 'mape')
```
```python
# Python
from fbprophet.plot import plot_cross_validation_metric
fig = plot_cross_validation_metric(df_cv, metric='mape')
```
![png](/prophet/static/diagnostics_files/diagnostics_12_0.png)
The size of the rolling window in the figure can be changed with the optional argument `rolling_window`, which specifies the proportion of forecasts to use in each rolling window. The default is 0.1, corresponding to 10% of rows from `df_cv` included in each window; increasing this will lead to a smoother average curve in the figure.
When using cross validation on a model with extra regressors, the cross validation will exit with an error if the extra regressor is constant in the simulated history. The `initial` period should be long enough for the extra regressor to take on multiple values. Similarly, the initial period should be long enough to capture any seasonalities that are included in the model: at least a year for yearly seasonality, at least a week for weekly seasonality, etc.

View file

@ -0,0 +1,81 @@
---
layout: docs
docid: "multiplicative_seasonality"
title: "Multiplicative Seasonality"
permalink: /docs/multiplicative_seasonality.html
---
By default Prophet fits additive seasonalities, meaning the effect of the seasonality is added to the trend to get the forecast. This time series of the number of air passengers is an example of when additive seasonality does not work:
```R
# R
df <- read.csv('../examples/example_air_passengers.csv')
m <- prophet(df)
future <- make_future_dataframe(m, 50, freq = 'm')
forecast <- predict(m, future)
plot(m, forecast)
```
```python
# Python
df = pd.read_csv('../examples/example_air_passengers.csv')
m = Prophet()
m.fit(df)
future = m.make_future_dataframe(50, freq='MS')
forecast = m.predict(future)
fig = m.plot(forecast)
```
![png](/prophet/static/multiplicative_seasonality_files/multiplicative_seasonality_4_0.png)
This time series has a clear yearly cycle, but the seasonality in the forecast is too large at the start of the time series and too small at the end. In this time series, the seasonality is not a constant additive factor as assumed by Prophet; rather, it grows with the trend. This is multiplicative seasonality.
Prophet can model multiplicative seasonality by setting `seasonality_mode='multiplicative'` in the input arguments:
```R
# R
m <- prophet(df, seasonality.mode = 'multiplicative')
forecast <- predict(m, future)
plot(m, forecast)
```
```python
# Python
m = Prophet(seasonality_mode='multiplicative')
m.fit(df)
forecast = m.predict(future)
fig = m.plot(forecast)
```
![png](/prophet/static/multiplicative_seasonality_files/multiplicative_seasonality_7_0.png)
The components figure will now show the seasonality as a percent of the trend:
```R
# R
prophet_plot_components(m, forecast)
```
```python
# Python
fig = m.plot_components(forecast)
```
![png](/prophet/static/multiplicative_seasonality_files/multiplicative_seasonality_10_0.png)
With `seasonality_mode='multiplicative'`, holiday effects will also be modeled as multiplicative. Any added seasonalities or extra regressors will by default use whatever `seasonality_mode` is set to, but can be overridden by specifying `mode='additive'` or `mode='multiplicative'` as an argument when adding the seasonality or regressor.
For example, this block sets the built-in seasonalities to multiplicative, but includes an additive quarterly seasonality and an additive regressor:
```R
# R
m <- prophet(seasonality.mode = 'multiplicative')
m <- add_seasonality(m, 'quarterly', period = 91.25, fourier.order = 8, mode = 'additive')
m <- add_regressor(m, 'regressor', mode = 'additive')
```
```python
# Python
m = Prophet(seasonality_mode='multiplicative')
m.add_seasonality('quarterly', period=91.25, fourier_order=8, mode='additive')
m.add_regressor('regressor', mode='additive')
```
Additive and multiplicative extra regressors will show up in separate panels on the components plot.
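The difference between the two modes can be shown with a few numbers. The sketch below is only an illustration: the trend values and effect sizes are made up, and it assumes the common formulation in which an additive effect is added to the trend and a multiplicative effect scales it as `trend * (1 + effect)`.

```python
# Hypothetical trend values and a seasonal effect at one point in the cycle.
trend = [100.0, 200.0, 400.0]

additive_effect = 10.0        # same absolute bump regardless of the trend level
multiplicative_effect = 0.10  # bump expressed as a fraction of the trend

additive = [t + additive_effect for t in trend]
multiplicative = [t * (1 + multiplicative_effect) for t in trend]

# Size of the seasonal bump under each mode.
additive_bump = [a - t for a, t in zip(additive, trend)]
multiplicative_bump = [m - t for m, t in zip(multiplicative, trend)]

print(additive_bump)
print(multiplicative_bump)
```

The additive bump stays the same size as the trend grows, while the multiplicative bump grows in proportion to the trend, which is the behaviour seen in the air passengers series.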

View file

@ -6,7 +6,7 @@ permalink: /docs/non-daily_data.html
---
## Sub-daily data
Prophet can make forecasts for time series with sub-daily observations by passing in a dataframe with timestamps in the `ds` column. When sub-daily data are used, daily seasonality will automatically be fit. Here we fit Prophet to data with 5-minute resolution (daily temperatures at Yosemite):
Prophet can make forecasts for time series with sub-daily observations by passing in a dataframe with timestamps in the `ds` column. The format of the timestamps should be YYYY-MM-DD HH:MM:SS - see the example csv [here](https://github.com/facebook/prophet/blob/master/examples/example_yosemite_temps.csv). When sub-daily data are used, daily seasonality will automatically be fit. Here we fit Prophet to data with 5-minute resolution (daily temperatures at Yosemite):
```R
# R
@ -14,7 +14,7 @@ df <- read.csv('../examples/example_yosemite_temps.csv')
m <- prophet(df, changepoint.prior.scale=0.01)
future <- make_future_dataframe(m, periods = 300, freq = 60 * 60)
fcst <- predict(m, future)
plot(m, fcst);
plot(m, fcst)
```
```python
# Python
@ -22,7 +22,7 @@ df = pd.read_csv('../examples/example_yosemite_temps.csv')
m = Prophet(changepoint_prior_scale=0.01).fit(df)
future = m.make_future_dataframe(periods=300, freq='H')
fcst = m.predict(future)
m.plot(fcst);
fig = m.plot(fcst)
```
![png](/prophet/static/non-daily_data_files/non-daily_data_4_0.png)
@ -36,12 +36,62 @@ prophet_plot_components(m, fcst)
```
```python
# Python
m.plot_components(fcst);
fig = m.plot_components(fcst)
```
![png](/prophet/static/non-daily_data_files/non-daily_data_7_0.png)
## Data with regular gaps
Suppose the dataset above only had observations from 12a to 6a:
```R
# R
df2 <- df %>%
mutate(ds = as.POSIXct(ds, tz="GMT")) %>%
filter(as.numeric(format(ds, "%H")) < 6)
m <- prophet(df2)
future <- make_future_dataframe(m, periods = 300, freq = 60 * 60)
fcst <- predict(m, future)
plot(m, fcst)
```
```python
# Python
df2 = df.copy()
df2['ds'] = pd.to_datetime(df2['ds'])
df2 = df2[df2['ds'].dt.hour < 6]
m = Prophet().fit(df2)
future = m.make_future_dataframe(periods=300, freq='H')
fcst = m.predict(future)
fig = m.plot(fcst)
```
![png](/prophet/static/non-daily_data_files/non-daily_data_10_0.png)
The forecast seems quite poor, with much larger fluctuations in the future than were seen in the history. The issue here is that we have fit a daily cycle to a time series that only has data for part of the day (12a to 6a). The daily seasonality is thus unconstrained for the remainder of the day and is not estimated well. The solution is to only make predictions for the time windows for which there are historical data. Here, that means to limit the `future` dataframe to have times from 12a to 6a:
```R
# R
future2 <- future %>%
filter(as.numeric(format(ds, "%H")) < 6)
fcst <- predict(m, future2)
plot(m, fcst)
```
```python
# Python
future2 = future.copy()
future2 = future2[future2['ds'].dt.hour < 6]
fcst = m.predict(future2)
fig = m.plot(fcst)
```
![png](/prophet/static/non-daily_data_files/non-daily_data_13_0.png)
The same principle applies to other datasets with regular gaps in the data. For example, if the history contains only weekdays, then predictions should only be made for weekdays since the weekly seasonality will not be well estimated for the weekends.
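For the weekday case, the filtering step might look like the following sketch. It uses only the standard library and a hypothetical start date so that it runs on its own.

```python
from datetime import date, timedelta

# Seven consecutive candidate forecast dates (hypothetical start date).
start = date(2018, 6, 1)
future_dates = [start + timedelta(days=i) for i in range(7)]

# date.weekday() returns 0 for Monday through 6 for Sunday,
# so values below 5 are weekdays.
weekday_dates = [d for d in future_dates if d.weekday() < 5]

print(weekday_dates)
```

With a pandas `future` dataframe, the analogous filter would be `future[future['ds'].dt.dayofweek < 5]`, mirroring the `.dt.hour` filter used above.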
## Monthly data
You can use Prophet to fit monthly data. However, the underlying model is continuous-time, which means that you can get strange results if you fit the model to monthly data and then ask for daily forecasts. Here we forecast US retail sales volume for the next 10 years:
@ -49,24 +99,42 @@ You can use Prophet to fit monthly data. However, the underlying model is contin
```R
# R
df <- read.csv('../examples/example_retail_sales.csv')
m <- prophet(df)
m <- prophet(df, seasonality.mode = 'multiplicative')
future <- make_future_dataframe(m, periods = 3652)
fcst <- predict(m, future)
plot(m, fcst);
plot(m, fcst)
```
```python
# Python
df = pd.read_csv('../examples/example_retail_sales.csv')
m = Prophet().fit(df)
m = Prophet(seasonality_mode='multiplicative').fit(df)
future = m.make_future_dataframe(periods=3652)
fcst = m.predict(future)
m.plot(fcst);
fig = m.plot(fcst)
```
![png](/prophet/static/non-daily_data_files/non-daily_data_10_0.png)
![png](/prophet/static/non-daily_data_files/non-daily_data_16_0.png)
The forecast here seems very noisy. What's happening is that this particular data set only provides monthly data. When we fit the yearly seasonality, it only has data for the first of each month and the seasonality components for the remaining days are unidentifiable and overfit. When you are fitting Prophet to monthly data, only make monthly forecasts, which can be done by passing the frequency into make_future_dataframe:
This is the same issue from above where the dataset has regular gaps. When we fit the yearly seasonality, it only has data for the first of each month and the seasonality components for the remaining days are unidentifiable and overfit. This can be clearly seen by doing MCMC to see uncertainty in the seasonality:
```R
# R
m <- prophet(df, seasonality.mode = 'multiplicative', mcmc.samples = 300)
fcst <- predict(m, future)
prophet_plot_components(m, fcst)
```
```python
# Python
m = Prophet(seasonality_mode='multiplicative', mcmc_samples=300).fit(df)
fcst = m.predict(future)
fig = m.plot_components(fcst)
```
![png](/prophet/static/non-daily_data_files/non-daily_data_19_0.png)
The seasonality has low uncertainty at the start of each month where there are data points, but has very high posterior variance in between. When fitting Prophet to monthly data, only make monthly forecasts, which can be done by passing the frequency into `make_future_dataframe`:
```R
# R
@ -78,8 +146,8 @@ plot(m, fcst)
# Python
future = m.make_future_dataframe(periods=120, freq='M')
fcst = m.predict(future)
m.plot(fcst);
fig = m.plot(fcst)
```
![png](/prophet/static/non-daily_data_files/non-daily_data_13_0.png)
![png](/prophet/static/non-daily_data_files/non-daily_data_22_0.png)

View file

@ -8,22 +8,20 @@ There are two main ways that outliers can affect Prophet forecasts. Here we make
```R
# R
df <- read.csv('../examples/example_wp_R_outliers1.csv')
df$y <- log(df$y)
df <- read.csv('../examples/example_wp_log_R_outliers1.csv')
m <- prophet(df)
future <- make_future_dataframe(m, periods = 1096)
forecast <- predict(m, future)
plot(m, forecast);
plot(m, forecast)
```
```python
# Python
df = pd.read_csv('../examples/example_wp_R_outliers1.csv')
df['y'] = np.log(df['y'])
df = pd.read_csv('../examples/example_wp_log_R_outliers1.csv')
m = Prophet()
m.fit(df)
future = m.make_future_dataframe(periods=1096)
forecast = m.predict(future)
m.plot(forecast);
fig = m.plot(forecast)
```
![png](/prophet/static/outliers_files/outliers_4_0.png)
@ -40,13 +38,13 @@ outliers <- (as.Date(df$ds) > as.Date('2010-01-01')
df$y[outliers] = NA
m <- prophet(df)
forecast <- predict(m, future)
plot(m, forecast);
plot(m, forecast)
```
```python
# Python
df.loc[(df['ds'] > '2010-01-01') & (df['ds'] < '2011-01-01'), 'y'] = None
model = Prophet().fit(df)
model.plot(model.predict(future));
fig = model.plot(model.predict(future))
```
![png](/prophet/static/outliers_files/outliers_7_0.png)
@ -56,22 +54,20 @@ In the above example the outliers messed up the uncertainty estimation but did n
```R
# R
df <- read.csv('../examples/example_wp_R_outliers2.csv')
df$y = log(df$y)
df <- read.csv('../examples/example_wp_log_R_outliers2.csv')
m <- prophet(df)
future <- make_future_dataframe(m, periods = 1096)
forecast <- predict(m, future)
plot(m, forecast);
plot(m, forecast)
```
```python
# Python
df = pd.read_csv('../examples/example_wp_R_outliers2.csv')
df['y'] = np.log(df['y'])
df = pd.read_csv('../examples/example_wp_log_R_outliers2.csv')
m = Prophet()
m.fit(df)
future = m.make_future_dataframe(periods=1096)
forecast = m.predict(future)
m.plot(forecast);
fig = m.plot(forecast)
```
![png](/prophet/static/outliers_files/outliers_10_0.png)
@ -86,13 +82,13 @@ outliers <- (as.Date(df$ds) > as.Date('2015-06-01')
df$y[outliers] = NA
m <- prophet(df)
forecast <- predict(m, future)
plot(m, forecast);
plot(m, forecast)
```
```python
# Python
df.loc[(df['ds'] > '2015-06-01') & (df['ds'] < '2015-06-30'), 'y'] = None
m = Prophet().fit(df)
m.plot(m.predict(future));
fig = m.plot(m.predict(future))
```
![png](/prophet/static/outliers_files/outliers_13_0.png)

View file

@ -8,22 +8,20 @@ permalink: /docs/quick_start.html
Prophet follows the `sklearn` model API. We create an instance of the `Prophet` class and then call its `fit` and `predict` methods.
The input to Prophet is always a dataframe with two columns: `ds` and `y`. The `ds` (datestamp) column must contain a date or datetime (either is fine). The `y` column must be numeric, and represents the measurement we wish to forecast.
The input to Prophet is always a dataframe with two columns: `ds` and `y`. The `ds` (datestamp) column should be of a format expected by Pandas, ideally YYYY-MM-DD for a date or YYYY-MM-DD HH:MM:SS for a timestamp. The `y` column must be numeric, and represents the measurement we wish to forecast.
As an example, let's look at a time series of daily page views for the Wikipedia page for [Peyton Manning](https://en.wikipedia.org/wiki/Peyton_Manning). We scraped this data using the [Wikipediatrend](https://cran.r-project.org/web/packages/wikipediatrend/vignettes/using-wikipediatrend.html) package in R. Peyton Manning provides a nice example because it illustrates some of Prophet's features, like multiple seasonality, changing growth rates, and the ability to model special days (such as Manning's playoff and superbowl appearances). The CSV is available [here](https://github.com/facebook/prophet/blob/master/examples/example_wp_peyton_manning.csv).
As an example, let's look at a time series of the log daily page views for the Wikipedia page for [Peyton Manning](https://en.wikipedia.org/wiki/Peyton_Manning). We scraped this data using the [Wikipediatrend](https://cran.r-project.org/web/packages/wikipediatrend/vignettes/using-wikipediatrend.html) package in R. Peyton Manning provides a nice example because it illustrates some of Prophet's features, like multiple seasonality, changing growth rates, and the ability to model special days (such as Manning's playoff and superbowl appearances). The CSV is available [here](https://github.com/facebook/prophet/blob/master/examples/example_wp_log_peyton_manning.csv).
First we'll import the data and log-transform the y variable.
First we'll import the data:
```python
# Python
import pandas as pd
import numpy as np
from fbprophet import Prophet
```
```python
# Python
df = pd.read_csv('../examples/example_wp_peyton_manning.csv')
df['y'] = np.log(df['y'])
df = pd.read_csv('../examples/example_wp_log_peyton_manning.csv')
df.head()
```
@ -88,7 +86,7 @@ We fit the model by instantiating a new `Prophet` object. Any settings to the f
```python
# Python
m = Prophet()
m.fit(df);
m.fit(df)
```
Predictions are then made on a dataframe with a column `ds` containing the dates for which a prediction is to be made. You can get a suitable dataframe that extends into the future a specified number of days using the helper method `Prophet.make_future_dataframe`. By default it will also include the dates from the history, so we will see the model fit as well.
@ -186,37 +184,37 @@ forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()
<tr>
<th>3265</th>
<td>2017-01-15</td>
<td>8.206753</td>
<td>7.432279</td>
<td>8.937916</td>
<td>8.199274</td>
<td>7.489884</td>
<td>8.969065</td>
</tr>
<tr>
<th>3266</th>
<td>2017-01-16</td>
<td>8.531766</td>
<td>7.791623</td>
<td>9.279194</td>
<td>8.524244</td>
<td>7.790682</td>
<td>9.266504</td>
</tr>
<tr>
<th>3267</th>
<td>2017-01-17</td>
<td>8.319156</td>
<td>7.601640</td>
<td>9.077195</td>
<td>8.311615</td>
<td>7.553025</td>
<td>9.049803</td>
</tr>
<tr>
<th>3268</th>
<td>2017-01-18</td>
<td>8.151772</td>
<td>7.436613</td>
<td>8.895926</td>
<td>8.144232</td>
<td>7.428174</td>
<td>8.864747</td>
</tr>
<tr>
<th>3269</th>
<td>2017-01-19</td>
<td>8.163690</td>
<td>7.472291</td>
<td>8.926861</td>
<td>8.156091</td>
<td>7.395160</td>
<td>8.883232</td>
</tr>
</tbody>
</table>
@ -244,7 +242,7 @@ fig2 = m.plot_components(forecast)
![png](/prophet/static/quick_start_files/quick_start_14_0.png)
More details about the options available for each method are available in the docstrings, for example, via `help(Prophet)` or `help(Prophet.fit)`.
More details about the options available for each method are available in the docstrings, for example, via `help(Prophet)` or `help(Prophet.fit)`. The [R reference manual](https://cran.r-project.org/web/packages/prophet/prophet.pdf) on CRAN provides a concise list of all of the available functions, each of which has a Python equivalent.
## R API
@ -253,14 +251,12 @@ In R, we use the normal model fitting API. We provide a `prophet` function that
```R
# R
library(prophet)
library(dplyr)
```
First we read in the data and create the outcome variable. As in the Python API, this is a dataframe with columns `ds` and `y`, containing the date and numeric value respectively. As above, we use here the log number of views to Petyon Manning's Wikipedia page, available [here](https://github.com/facebook/prophet/blob/master/examples/example_wp_peyton_manning.csv).
First we read in the data and create the outcome variable. As in the Python API, this is a dataframe with columns `ds` and `y`, containing the date and numeric value respectively. The `ds` column should be YYYY-MM-DD for a date, or YYYY-MM-DD HH:MM:SS for a timestamp. As above, we use here the log number of views to Peyton Manning's Wikipedia page, available [here](https://github.com/facebook/prophet/blob/master/examples/example_wp_log_peyton_manning.csv).
```R
# R
df <- read.csv('../examples/example_wp_peyton_manning.csv') %>%
mutate(y = log(y))
df <- read.csv('../examples/example_wp_log_peyton_manning.csv')
```
We call the `prophet` function to fit the model. The first argument is the historical dataframe. Additional arguments control how Prophet fits the data and are described in later pages of this documentation.
@ -295,12 +291,12 @@ tail(forecast[c('ds', 'yhat', 'yhat_lower', 'yhat_upper')])
```
ds yhat yhat_lower yhat_upper
3265 2017-01-14 7.825609 7.183818 8.488012
3266 2017-01-15 8.207400 7.478778 8.951113
3267 2017-01-16 8.532394 7.826360 9.240482
3268 2017-01-17 8.319785 7.596815 9.042505
3269 2017-01-18 8.152424 7.440858 8.874581
3270 2017-01-19 8.164327 7.419148 8.882906
3265 2017-01-14 7.824163 7.127881 8.609668
3266 2017-01-15 8.205942 7.452071 8.904387
3267 2017-01-16 8.530942 7.742400 9.300974
3268 2017-01-17 8.318327 7.606534 9.071184
3269 2017-01-18 8.150948 7.440224 8.902922
3270 2017-01-19 8.162839 7.385953 8.890669
@ -324,4 +320,6 @@ prophet_plot_components(m, forecast)
![png](/prophet/static/quick_start_files/quick_start_29_0.png)
An interactive plot of the forecast using Dygraphs can be made with the command `dyplot.prophet(m, forecast)`.
More details about the options available for each method are available in the docstrings, for example, via `?prophet` or `?fit.prophet`. This documentation is also available in the [reference manual](https://cran.r-project.org/web/packages/prophet/prophet.pdf) on CRAN.

View file

@ -10,40 +10,37 @@ By default, Prophet uses a linear model for its forecast. When forecasting growt
Prophet allows you to make forecasts using a [logistic growth](https://en.wikipedia.org/wiki/Logistic_function) trend model, with a specified carrying capacity. We illustrate this with the log number of page visits to the [R (programming language)](https://en.wikipedia.org/wiki/R_%28programming_language%29) page on Wikipedia:
```python
# Python
df = pd.read_csv('../examples/example_wp_R.csv')
import numpy as np
df['y'] = np.log(df['y'])
```
```R
# R
df <- read.csv('../examples/example_wp_R.csv')
df$y <- log(df$y)
df <- read.csv('../examples/example_wp_log_R.csv')
```
```python
# Python
df = pd.read_csv('../examples/example_wp_log_R.csv')
```
We must specify the carrying capacity in a column `cap`. Here we will assume a particular value, but this would usually be set using data or expertise about the market size.
```python
# Python
df['cap'] = 8.5
```
```R
# R
df$cap <- 8.5
```
```python
# Python
df['cap'] = 8.5
```
The important things to note are that `cap` must be specified for every row in the dataframe, and that it does not have to be constant. If the market size is growing, then `cap` can be an increasing sequence.
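To see what a carrying capacity does to a trend, here is a minimal sketch of the textbook logistic curve from the page linked above. It is not Prophet's fitted trend: the growth rate `k` and midpoint `m` are made-up illustrative parameters, and only the value `8.5` for the capacity comes from this example.

```python
import math


def logistic_trend(t, cap, k, m):
    """Textbook logistic curve that saturates at `cap`.

    k (growth rate) and m (midpoint) are illustrative parameters.
    """
    return cap / (1.0 + math.exp(-k * (t - m)))


cap = 8.5  # carrying capacity used in this example
values = [logistic_trend(t, cap, k=0.5, m=10.0) for t in range(0, 41, 5)]
print([round(v, 3) for v in values])
```

The curve passes through half the capacity at the midpoint and then flattens out as it approaches `cap` without exceeding it.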
We then fit the model as before, except pass in an additional argument to specify logistic growth:
```R
# R
m <- prophet(df, growth = 'logistic')
```
```python
# Python
m = Prophet(growth='logistic')
m.fit(df)
```
```R
# R
m <- prophet(df, growth = 'logistic')
```
We make a dataframe for future predictions as before, except we must also specify the capacity in the future. Here we keep capacity constant at the same value as in the history, and forecast 3 years into the future:
```R
@ -51,14 +48,14 @@ We make a dataframe for future predictions as before, except we must also specif
future <- make_future_dataframe(m, periods = 1826)
future$cap <- 8.5
fcst <- predict(m, future)
plot(m, fcst);
plot(m, fcst)
```
```python
# Python
future = m.make_future_dataframe(periods=1826)
future['cap'] = 8.5
fcst = m.predict(future)
m.plot(fcst);
fig = m.plot(fcst)
```
![png](/prophet/static/saturating_forecasts_files/saturating_forecasts_13_0.png)
@ -91,7 +88,7 @@ future['floor'] = 1.5
m = Prophet(growth='logistic')
m.fit(df)
fcst = m.predict(future)
m.plot(fcst);
fig = m.plot(fcst)
```
![png](/prophet/static/saturating_forecasts_files/saturating_forecasts_16_0.png)

View file

@ -1,36 +1,9 @@
---
layout: docs
docid: "seasonality_and_holiday_effects"
title: "Seasonality And Holiday Effects"
permalink: /docs/seasonality_and_holiday_effects.html
docid: "seasonality,_holiday_effects,_and_regressors"
title: "Seasonality, Holiday Effects, And Regressors"
permalink: /docs/seasonality,_holiday_effects,_and_regressors.html
---
### Specifying Seasonalities
Prophet will by default fit weekly and yearly seasonalities, if the time series is more than two cycles long. It will also fit daily seasonality for a sub-daily time series. You can add other seasonalities (monthly, quarterly, hourly) using the `add_seasonality` method (Python) or function (R).
The inputs to this function are a name, the period of the seasonality in days, and the number of Fourier terms for the seasonality. Increasing the number of Fourier terms allows the seasonality to fit faster changing cycles, but can also lead to overfitting: $N$ Fourier terms corresponds to $2N$ variables used for modeling the cycle. For reference, by default Prophet uses 3 terms for weekly seasonality and 10 for yearly seasonality. An optional input to `add_seasonality` is the prior scale for that seasonal component - this is discussed below.
As an example, here we fit the Peyton Manning data from the Quickstart, but replace the weekly seasonality with monthly seasonality. The monthly seasonality then will appear in the components plot:
```R
# R
m <- prophet(weekly.seasonality=FALSE)
m <- add_seasonality(m, name='monthly', period=30.5, fourier.order=5)
m <- fit.prophet(m, df)
forecast <- predict(m, future)
prophet_plot_components(m, forecast)
```
```python
# Python
m = Prophet(weekly_seasonality=False)
m.add_seasonality(name='monthly', period=30.5, fourier_order=5)
forecast = m.fit(df).predict(future)
m.plot_components(forecast);
```
![png](/prophet/static/seasonality_and_holiday_effects_files/seasonality_and_holiday_effects_4_0.png)
### Modeling Holidays and Special Events
If you have holidays or other recurring events that you'd like to model, you must create a dataframe for them. It has two columns (`holiday` and `ds`) and a row for each occurrence of the holiday. It must include all occurrences of the holiday, both in the past (back as far as the historical data go) and in the future (out as far as the forecast is being made). If they won't repeat in the future, Prophet will model them and then not include them in the forecast.
@ -38,26 +11,6 @@ You can also include columns `lower_window` and `upper_window` which extend the
Here we create a dataframe that includes the dates of all of Peyton Manning's playoff appearances:
```python
# Python
playoffs = pd.DataFrame({
'holiday': 'playoff',
'ds': pd.to_datetime(['2008-01-13', '2009-01-03', '2010-01-16',
'2010-01-24', '2010-02-07', '2011-01-08',
'2013-01-12', '2014-01-12', '2014-01-19',
'2014-02-02', '2015-01-11', '2016-01-17',
'2016-01-24', '2016-02-07']),
'lower_window': 0,
'upper_window': 1,
})
superbowls = pd.DataFrame({
'holiday': 'superbowl',
'ds': pd.to_datetime(['2010-02-07', '2014-02-02', '2016-02-07']),
'lower_window': 0,
'upper_window': 1,
})
holidays = pd.concat((playoffs, superbowls))
```
```R
# R
library(dplyr)
@ -79,20 +32,40 @@ superbowls <- data_frame(
)
holidays <- bind_rows(playoffs, superbowls)
```
```python
# Python
playoffs = pd.DataFrame({
'holiday': 'playoff',
'ds': pd.to_datetime(['2008-01-13', '2009-01-03', '2010-01-16',
'2010-01-24', '2010-02-07', '2011-01-08',
'2013-01-12', '2014-01-12', '2014-01-19',
'2014-02-02', '2015-01-11', '2016-01-17',
'2016-01-24', '2016-02-07']),
'lower_window': 0,
'upper_window': 1,
})
superbowls = pd.DataFrame({
'holiday': 'superbowl',
'ds': pd.to_datetime(['2010-02-07', '2014-02-02', '2016-02-07']),
'lower_window': 0,
'upper_window': 1,
})
holidays = pd.concat((playoffs, superbowls))
```
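The window columns can be read as a range of day offsets around each holiday date. The following is a minimal sketch of that reading, not Prophet's own code: the helper name `expand_holiday` is hypothetical, and it assumes `lower_window` is zero or negative (days before) and `upper_window` is zero or positive (days after).

```python
# Python
from datetime import datetime, timedelta


def expand_holiday(ds, lower_window=0, upper_window=0):
    """List the (date, offset) pairs covered by one holiday occurrence."""
    day = datetime.strptime(ds, '%Y-%m-%d').date()
    # Offsets run from lower_window up to and including upper_window.
    return [(day + timedelta(days=offset), offset)
            for offset in range(lower_window, upper_window + 1)]


# One superbowl row with lower_window=0 and upper_window=1
covered = expand_holiday('2014-02-02', lower_window=0, upper_window=1)
for day, offset in covered:
    print(day.isoformat(), offset)
```

With `upper_window=1`, each listed date therefore also covers the following day, which is why the forecast table further down has rows for both the game day and the day after.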
Above we have included the superbowl days as both playoff games and superbowl games. This means that the superbowl effect will be an additional additive bonus on top of the playoff effect.
Once the table is created, holiday effects are included in the forecast by passing them in with the `holidays` argument. Here we do it with the Peyton Manning data from the Quickstart:
```python
# Python
m = Prophet(holidays=holidays)
forecast = m.fit(df).predict(future)
```
```R
# R
m <- prophet(df, holidays = holidays)
forecast <- predict(m, future)
```
```python
# Python
m = Prophet(holidays=holidays)
forecast = m.fit(df).predict(future)
```
The holiday effect can be seen in the `forecast` dataframe:
```R
@ -111,6 +84,19 @@ forecast[(forecast['playoff'] + forecast['superbowl']).abs() > 0][
<div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
@ -124,62 +110,62 @@ forecast[(forecast['playoff'] + forecast['superbowl']).abs() > 0][
<tr>
<th>2190</th>
<td>2014-02-02</td>
<td>1.226679</td>
<td>1.192500</td>
<td>1.229999</td>
<td>1.176410</td>
</tr>
<tr>
<th>2191</th>
<td>2014-02-03</td>
<td>1.911294</td>
<td>1.373781</td>
<td>1.900543</td>
<td>1.486962</td>
</tr>
<tr>
<th>2532</th>
<td>2015-01-11</td>
<td>1.226679</td>
<td>1.229999</td>
<td>0.000000</td>
</tr>
<tr>
<th>2533</th>
<td>2015-01-12</td>
<td>1.911294</td>
<td>1.900543</td>
<td>0.000000</td>
</tr>
<tr>
<th>2901</th>
<td>2016-01-17</td>
<td>1.226679</td>
<td>1.229999</td>
<td>0.000000</td>
</tr>
<tr>
<th>2902</th>
<td>2016-01-18</td>
<td>1.911294</td>
<td>1.900543</td>
<td>0.000000</td>
</tr>
<tr>
<th>2908</th>
<td>2016-01-24</td>
<td>1.226679</td>
<td>1.229999</td>
<td>0.000000</td>
</tr>
<tr>
<th>2909</th>
<td>2016-01-25</td>
<td>1.911294</td>
<td>1.900543</td>
<td>0.000000</td>
</tr>
<tr>
<th>2922</th>
<td>2016-02-07</td>
<td>1.226679</td>
<td>1.192500</td>
<td>1.229999</td>
<td>1.176410</td>
</tr>
<tr>
<th>2923</th>
<td>2016-02-08</td>
<td>1.911294</td>
<td>1.373781</td>
<td>1.900543</td>
<td>1.486962</td>
</tr>
</tbody>
</table>
@ -189,19 +175,84 @@ forecast[(forecast['playoff'] + forecast['superbowl']).abs() > 0][
The holiday effects will also show up in the components plot, where we see that there is a spike on the days around playoff appearances, with an especially large spike for the superbowl:
```python
# Python
m.plot_components(forecast);
```
```R
# R
prophet_plot_components(m, forecast);
prophet_plot_components(m, forecast)
```
```python
# Python
fig = m.plot_components(forecast)
```
![png](/prophet/static/seasonality_and_holiday_effects_files/seasonality_and_holiday_effects_16_0.png)
![png](/prophet/static/seasonality,_holiday_effects,_and_regressors_files/seasonality,_holiday_effects,_and_regressors_13_0.png)
Individual holidays can be plotted using the `plot_forecast_component` method (Python) or function (R). For example, `m.plot_forecast_component(forecast, 'superbowl')` in Python and `plot_forecast_component(forecast, 'superbowl')` in R to plot just the superbowl holiday component.
Individual holidays can be plotted using the `plot_forecast_component` function (imported from `fbprophet.plot` in Python) like `plot_forecast_component(forecast, 'superbowl')` to plot just the superbowl holiday component.
### Fourier Order for Seasonalities
Seasonalities are estimated using a partial Fourier sum. See [the paper](https://peerj.com/preprints/3190/) for complete details, and [this figure on Wikipedia](https://en.wikipedia.org/wiki/Fourier_series#/media/File:Fourier_Series.svg) for an illustration of how a partial Fourier sum can approximate an arbitrary periodic signal. The number of terms in the partial sum (the order) is a parameter that determines how quickly the seasonality can change. To illustrate this, consider the Peyton Manning data from the Quickstart. The default Fourier order for yearly seasonality is 10, which produces this fit:
```R
# R
m <- prophet(df)
prophet:::plot_yearly(m)
```
```python
# Python
from fbprophet.plot import plot_yearly
m = Prophet().fit(df)
a = plot_yearly(m)
```
![png](/prophet/static/seasonality,_holiday_effects,_and_regressors_files/seasonality,_holiday_effects,_and_regressors_17_0.png)
The default values are often appropriate, but they can be increased when the seasonality needs to fit higher-frequency changes and can generally be less smooth. The Fourier order can be specified for each built-in seasonality when instantiating the model; here it is increased to 20:
```R
# R
m <- prophet(df, yearly.seasonality = 20)
prophet:::plot_yearly(m)
```
```python
# Python
from fbprophet.plot import plot_yearly
m = Prophet(yearly_seasonality=20).fit(df)
a = plot_yearly(m)
```
![png](/prophet/static/seasonality,_holiday_effects,_and_regressors_files/seasonality,_holiday_effects,_and_regressors_20_0.png)
Increasing the number of Fourier terms allows the seasonality to fit faster-changing cycles, but can also lead to overfitting: N Fourier terms correspond to 2N variables used for modeling the cycle.
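The 2N count comes from each Fourier term contributing one sine and one cosine column. As a minimal sketch (the helper name `fourier_features` is hypothetical and this is not Prophet's internal code), the features for a single time point can be built like this:

```python
# Python
import math


def fourier_features(t_days, period, order):
    """Return the 2 * order Fourier features for one time point (in days)."""
    features = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t_days / period
        # Each term n adds one sine and one cosine variable.
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


yearly = fourier_features(t_days=100.0, period=365.25, order=10)
weekly = fourier_features(t_days=100.0, period=7.0, order=3)
print(len(yearly), len(weekly))  # prints: 20 6
```

Raising the yearly order from 10 to 20 therefore doubles the number of fitted seasonal coefficients from 20 to 40, which is where the extra flexibility and the extra overfitting risk both come from.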
### Specifying Custom Seasonalities
Prophet will by default fit weekly and yearly seasonalities, if the time series is more than two cycles long. It will also fit daily seasonality for a sub-daily time series. You can add other seasonalities (monthly, quarterly, hourly) using the `add_seasonality` method (Python) or function (R).
The inputs to this function are a name, the period of the seasonality in days, and the Fourier order for the seasonality. For reference, by default Prophet uses a Fourier order of 3 for weekly seasonality and 10 for yearly seasonality. An optional input to `add_seasonality` is the prior scale for that seasonal component - this is discussed below.
As an example, here we fit the Peyton Manning data from the Quickstart, but replace the weekly seasonality with monthly seasonality. The monthly seasonality then will appear in the components plot:
```R
# R
m <- prophet(weekly.seasonality=FALSE)
m <- add_seasonality(m, name='monthly', period=30.5, fourier.order=5)
m <- fit.prophet(m, df)
forecast <- predict(m, future)
prophet_plot_components(m, forecast)
```
```python
# Python
m = Prophet(weekly_seasonality=False)
m.add_seasonality(name='monthly', period=30.5, fourier_order=5)
forecast = m.fit(df).predict(future)
fig = m.plot_components(forecast)
```
![png](/prophet/static/seasonality,_holiday_effects,_and_regressors_files/seasonality,_holiday_effects,_and_regressors_23_0.png)
### Prior scale for holidays and seasonality
If you find that the holidays are overfitting, you can adjust their prior scale to smooth them using the parameter `holidays_prior_scale`. By default this parameter is 10, which provides very little regularization. Reducing this parameter dampens holiday effects:
@ -226,6 +277,19 @@ forecast[(forecast['playoff'] + forecast['superbowl']).abs() > 0][
<div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
@ -239,62 +303,62 @@ forecast[(forecast['playoff'] + forecast['superbowl']).abs() > 0][
<tr>
<th>2190</th>
<td>2014-02-02</td>
<td>1.200631</td>
<td>0.957093</td>
<td>1.205344</td>
<td>0.963327</td>
</tr>
<tr>
<th>2191</th>
<td>2014-02-03</td>
<td>1.841906</td>
<td>0.979777</td>
<td>1.851992</td>
<td>0.991010</td>
</tr>
<tr>
<th>2532</th>
<td>2015-01-11</td>
<td>1.200631</td>
<td>1.205344</td>
<td>0.000000</td>
</tr>
<tr>
<th>2533</th>
<td>2015-01-12</td>
<td>1.841906</td>
<td>1.851992</td>
<td>0.000000</td>
</tr>
<tr>
<th>2901</th>
<td>2016-01-17</td>
<td>1.200631</td>
<td>1.205344</td>
<td>0.000000</td>
</tr>
<tr>
<th>2902</th>
<td>2016-01-18</td>
<td>1.841906</td>
<td>1.851992</td>
<td>0.000000</td>
</tr>
<tr>
<th>2908</th>
<td>2016-01-24</td>
<td>1.200631</td>
<td>1.205344</td>
<td>0.000000</td>
</tr>
<tr>
<th>2909</th>
<td>2016-01-25</td>
<td>1.841906</td>
<td>1.851992</td>
<td>0.000000</td>
</tr>
<tr>
<th>2922</th>
<td>2016-02-07</td>
<td>1.200631</td>
<td>0.957093</td>
<td>1.205344</td>
<td>0.963327</td>
</tr>
<tr>
<th>2923</th>
<td>2016-02-08</td>
<td>1.841906</td>
<td>0.979777</td>
<td>1.851992</td>
<td>0.991010</td>
</tr>
</tbody>
</table>
@ -306,18 +370,19 @@ The magnitude of the holiday effect has been reduced compared to before, especia
Prior scales can be set separately for individual holidays by including a column `prior_scale` in the holidays dataframe. Prior scales for individual seasonalities can be passed as an argument to `add_seasonality`. For instance, the prior scale for just weekly seasonality can be set using:
```python
# Python
m = Prophet()
m.add_seasonality(
name='weekly', period=7, fourier_order=3, prior_scale=0.1);
```
```R
# R
m <- prophet()
m <- add_seasonality(
m, name='weekly', period=7, fourier.order=3, prior.scale=0.1)
```
```python
# Python
m = Prophet()
m.add_seasonality(
name='weekly', period=7, fourier_order=3, prior_scale=0.1)
```
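To see why a smaller prior scale dampens an effect, consider a toy model outside Prophet. This is a sketch under stated assumptions rather than Prophet's fitting code: it assumes a single effect with a Normal(0, `prior_scale`^2) prior and Gaussian noise with a known standard deviation `noise_sd`, and the helper name `map_effect` is hypothetical.

```python
# Python
def map_effect(x, y, prior_scale, noise_sd=1.0):
    """MAP estimate of beta in y = beta * x + noise with a Normal(0, prior_scale^2) prior."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    # The prior adds a ridge-style penalty that grows as prior_scale shrinks.
    return sxy / (sxx + (noise_sd / prior_scale) ** 2)


# Three occurrences of a holiday indicator and the residual observed on each
indicator = [1.0, 1.0, 1.0]
residual = [1.2, 1.3, 1.1]

loose = map_effect(indicator, residual, prior_scale=10.0)
tight = map_effect(indicator, residual, prior_scale=0.05)
print(loose, tight)
```

With the loose prior the estimate stays close to the average residual, while the tight prior pulls it toward zero. The same qualitative behavior is what the reduced holiday effects in the table above show.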
### Additional regressors
Additional regressors can be added to the linear part of the model using the `add_regressor` method or function. A column with the regressor value will need to be present in both the fitting and prediction dataframes. For example, we can add an additional effect on Sundays during the NFL season. On the components plot, this effect will show up in the 'extra_regressors' plot:
@ -356,12 +421,16 @@ m.fit(df)
future['nfl_sunday'] = future['ds'].apply(nfl_sunday)
forecast = m.predict(future)
m.plot_components(forecast);
fig = m.plot_components(forecast)
```
![png](/prophet/static/seasonality_and_holiday_effects_files/seasonality_and_holiday_effects_26_0.png)
![png](/prophet/static/seasonality,_holiday_effects,_and_regressors_files/seasonality,_holiday_effects,_and_regressors_32_0.png)
NFL Sundays could also have been handled using the "holidays" interface described above, by creating a list of past and future NFL Sundays. The `add_regressor` function provides a more general interface for defining extra linear regressors, and in particular does not require that the regressor be a binary indicator. Another time series could be used as a regressor, although its future values would have to be known. The regressor cannot be constant in the training data; fitting will exit with an error if it is.
The `add_regressor` function has optional arguments for specifying the prior scale (holiday prior scale is used by default) and whether or not the regressor is standardized - see the docstring with `help(Prophet.add_regressor)` in Python and `?add_regressor` in R.
The `add_regressor` function has optional arguments for specifying the prior scale (holiday prior scale is used by default) and whether or not the regressor is standardized - see the docstring with `help(Prophet.add_regressor)` in Python and `?add_regressor` in R. Note that regressors must be added prior to model fitting.
The extra regressor must be known for both the history and for future dates. It thus must either be something that has known future values (such as `nfl_sunday`), or something that has separately been forecasted elsewhere. Prophet will also raise an error if the regressor is constant throughout the history, since there is nothing to fit from it.
Extra regressors are put in the linear component of the model, so the underlying model is that the time series depends on the extra regressor as either an additive or multiplicative factor (see the next section for multiplicativity).
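The definition of `nfl_sunday` falls outside the lines shown here. A minimal stand-in, assuming the NFL season runs from September through January (an assumption made for illustration, not taken from the text above), could look like this:

```python
# Python
from datetime import datetime


def nfl_sunday(ds):
    """Return 1 if the date is a Sunday in the assumed NFL season, else 0."""
    day = datetime.strptime(str(ds)[:10], '%Y-%m-%d')
    # weekday() == 6 is Sunday; the season is assumed to be September to January.
    in_season = day.month > 8 or day.month < 2
    return 1 if day.weekday() == 6 and in_season else 0


print(nfl_sunday('2016-01-17'), nfl_sunday('2016-01-18'), nfl_sunday('2016-07-03'))
```

Because the indicator depends only on the calendar, it can be computed for future dates as easily as for the history, which is what makes it usable as an extra regressor.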

View file

@ -19,7 +19,23 @@ Even though we have a lot of places where the rate can possibly change, because
![png](/prophet/static/trend_changepoints_files/trend_changepoints_6_0.png)
The number of potential changepoints can be set using the argument `n_changepoints`, but this is better tuned by adjusting the regularization.
The number of potential changepoints can be set using the argument `n_changepoints`, but this is better tuned by adjusting the regularization. The locations of the significant changepoints can be visualized with:
```R
# R
plot(m, forecast) + add_changepoints_to_plot(m)
```
```python
# Python
from fbprophet.plot import add_changepoints_to_plot
fig = m.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), m, forecast)
```
![png](/prophet/static/trend_changepoints_files/trend_changepoints_9_0.png)
By default changepoints are only inferred for the first 80% of the time series in order to have plenty of runway for projecting the trend forward and to avoid overfitting fluctuations at the end of the time series. This default works in many situations but not all, and can be changed using the `changepoint_range` argument. For example, `m = Prophet(changepoint_range=0.9)` in Python or `m <- prophet(changepoint.range = 0.9)` in R will place potential changepoints in the first 90% of the time series.
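The placement of potential changepoints can be sketched as an evenly spaced grid over the leading part of the history. This is a simplified illustration, not Prophet's implementation: the helper name `candidate_changepoint_indexes` is hypothetical, and the default of 25 changepoints is an assumption that does not appear in the text above.

```python
# Python
def candidate_changepoint_indexes(n_obs, n_changepoints=25, changepoint_range=0.8):
    """Evenly spaced row indexes inside the first changepoint_range of the history."""
    hist_size = int(n_obs * changepoint_range)
    # Never ask for more changepoints than there are usable rows.
    n_changepoints = min(n_changepoints, hist_size - 1)
    step = (hist_size - 1) / float(n_changepoints)
    # Skip index 0: a slope change at the first observation is not meaningful.
    return [int(round(i * step)) for i in range(1, n_changepoints + 1)]


default_grid = candidate_changepoint_indexes(1000)
wider_grid = candidate_changepoint_indexes(1000, changepoint_range=0.9)
print(max(default_grid), max(wider_grid))
```

Raising `changepoint_range` moves the last candidate closer to the end of the series, which lets the trend react to recent changes at the cost of less runway for projecting it forward.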
### Adjusting trend flexibility
If the trend changes are being overfit (too much flexibility) or underfit (not enough flexibility), you can adjust the strength of the sparse prior using the input argument `changepoint_prior_scale`. By default, this parameter is set to 0.05. Increasing it will make the trend *more* flexible:
@ -28,16 +44,16 @@ If the trend changes are being overfit (too much flexibility) or underfit (not e
# R
m <- prophet(df, changepoint.prior.scale = 0.5)
forecast <- predict(m, future)
plot(m, forecast);
plot(m, forecast)
```
```python
# Python
m = Prophet(changepoint_prior_scale=0.5)
forecast = m.fit(df).predict(future)
m.plot(forecast);
fig = m.plot(forecast)
```
![png](/prophet/static/trend_changepoints_files/trend_changepoints_10_0.png)
![png](/prophet/static/trend_changepoints_files/trend_changepoints_13_0.png)
Decreasing it will make the trend *less* flexible:
@ -46,34 +62,34 @@ Decreasing it will make the trend *less* flexible:
# R
m <- prophet(df, changepoint.prior.scale = 0.001)
forecast <- predict(m, future)
plot(m, forecast);
plot(m, forecast)
```
```python
# Python
m = Prophet(changepoint_prior_scale=0.001)
forecast = m.fit(df).predict(future)
m.plot(forecast);
fig = m.plot(forecast)
```
![png](/prophet/static/trend_changepoints_files/trend_changepoints_13_0.png)
![png](/prophet/static/trend_changepoints_files/trend_changepoints_16_0.png)
### Specifying the locations of the changepoints
If you wish, rather than using automatic changepoint detection you can manually specify the locations of potential changepoints with the `changepoints` argument.
If you wish, rather than using automatic changepoint detection you can manually specify the locations of potential changepoints with the `changepoints` argument. Slope changes will then be allowed only at these points, with the same sparse regularization as before. One could, for instance, create a grid of points as is done automatically, but then augment that grid with some specific dates that are known to be likely to have changes. As another example, the changepoints could be entirely limited to a small set of dates, as is done here:
```R
# R
m <- prophet(df, changepoints = c('2014-01-01'))
forecast <- predict(m, future)
plot(m, forecast);
plot(m, forecast)
```
```python
# Python
m = Prophet(changepoints=['2014-01-01'])
forecast = m.fit(df).predict(future)
m.plot(forecast);
fig = m.plot(forecast)
```
![png](/prophet/static/trend_changepoints_files/trend_changepoints_17_0.png)
![png](/prophet/static/trend_changepoints_files/trend_changepoints_20_0.png)

View file

@ -15,39 +15,39 @@ One property of this way of measuring uncertainty is that allowing higher flexib
The width of the uncertainty intervals (by default 80%) can be set using the parameter `interval_width`:
```python
# Python
forecast = Prophet(interval_width=0.95).fit(df).predict(future)
```
```R
# R
m <- prophet(df, interval.width = 0.95)
forecast <- predict(m, future)
```
```python
# Python
forecast = Prophet(interval_width=0.95).fit(df).predict(future)
```
Again, these intervals assume that the future will see the same frequency and magnitude of rate changes as the past. This assumption is probably not true, so you should not expect to get accurate coverage on these uncertainty intervals.
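As a rough sketch of what `interval_width` means, assume the interval is read off as central quantiles of a set of simulated forecast values. The helper name `interval_bounds` and the nearest-rank quantile rule are assumptions made for illustration, not Prophet's code.

```python
# Python
def interval_bounds(samples, interval_width=0.8):
    """Central interval of the given width, using a nearest-rank rule."""
    lower_p = (1.0 - interval_width) / 2.0
    upper_p = (1.0 + interval_width) / 2.0
    ordered = sorted(samples)
    last = len(ordered) - 1
    lower = ordered[int(round(lower_p * last))]
    upper = ordered[int(round(upper_p * last))]
    return lower, upper


# Stand-in for simulated forecast values at a single future date
samples = [float(i) for i in range(101)]
narrow = interval_bounds(samples, interval_width=0.8)
wide = interval_bounds(samples, interval_width=0.95)
print(narrow, wide)
```

An 80% interval leaves 10% of the simulated values on each side, and widening it to 95% leaves 2.5% on each side, so the wider interval always contains the narrower one.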
### Uncertainty in seasonality
By default Prophet will only return uncertainty in the trend and observation noise. To get uncertainty in seasonality, you must do full Bayesian sampling. This is done using the parameter `mcmc.samples` (which defaults to 0). We do this here for the Peyton Manning data from the Quickstart:
By default Prophet will only return uncertainty in the trend and observation noise. To get uncertainty in seasonality, you must do full Bayesian sampling. This is done using the parameter `mcmc.samples` (which defaults to 0). We do this here for the first six months of the Peyton Manning data from the Quickstart:
```python
# Python
m = Prophet(mcmc_samples=300)
forecast = m.fit(df).predict(future)
```
```R
# R
m <- prophet(df, mcmc.samples = 300)
forecast <- predict(m, future)
```
This replaces the typical MAP estimation with MCMC sampling, and takes much longer - think 10 minutes instead of 10 seconds. If you do full sampling, then you will see the uncertainty in seasonal components when you plot them:
```python
# Python
m.plot_components(forecast);
m = Prophet(mcmc_samples=300)
forecast = m.fit(df).predict(future)
```
This replaces the typical MAP estimation with MCMC sampling, and can take much longer depending on how many observations there are - expect several minutes instead of several seconds. If you do full sampling, then you will see the uncertainty in seasonal components when you plot them:
```R
# R
prophet_plot_components(m, forecast);
prophet_plot_components(m, forecast)
```
```python
# Python
fig = m.plot_components(forecast)
```
![png](/prophet/static/uncertainty_intervals_files/uncertainty_intervals_10_0.png)

View file

@ -4,7 +4,7 @@ title: Prophet
id: home
---
Prophet is a procedure for forecasting time series data. It is based on an additive model where non-linear trends are fit with yearly and weekly seasonality, plus holidays. It works best with daily periodicity data with at least one year of historical data. Prophet is robust to missing data, shifts in the trend, and large outliers.
Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.
Prophet is [open source software](https://code.facebook.com/projects/) released by Facebook's [Core Data Science team](https://research.fb.com/category/data-science/). It is available for download on [CRAN](https://cran.r-project.org/package=prophet) and [PyPI](https://pypi.python.org/pypi/fbprophet/).

View file

@ -506,7 +506,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The seasonality has low uncertainty at the start of each month where there are data points, but has very high posterior variance in between. When fitting Prophet to monthly data, only make monthly forecasts, which can be done by passing the frequency into make_future_dataframe:"
"The seasonality has low uncertainty at the start of each month where there are data points, but has very high posterior variance in between. When fitting Prophet to monthly data, only make monthly forecasts, which can be done by passing the frequency into `make_future_dataframe`:"
]
},
{
@ -570,7 +570,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.14+"
"version": "2.7.13"
}
},
"nbformat": 4,

View file

@ -394,7 +394,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"More details about the options available for each method are available in the docstrings, for example, via `help(Prophet)` or `help(Prophet.fit)`."
"More details about the options available for each method are available in the docstrings, for example, via `help(Prophet)` or `help(Prophet.fit)`. The [R reference manual](https://cran.r-project.org/web/packages/prophet/prophet.pdf) on CRAN provides a concise list of all of the available functions, each of which has a Python equivalent."
]
},
{
@ -612,7 +612,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.14+"
"version": "2.7.13"
}
},
"nbformat": 4,

View file

@ -357,9 +357,7 @@
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"output_hidden": true
},
"metadata": {},
"outputs": [
{
"data": {
@ -380,7 +378,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Individual holidays can be plotted using the `plot_forecast_component` method (Python) or function (R). For example, `m.plot_forecast_component(forecast, 'superbowl')` in Python and `plot_forecast_component(forecast, 'superbowl')` in R to plot just the superbowl holiday component."
"Individual holidays can be plotted using the `plot_forecast_component` function (imported from `fbprophet.plot` in Python) like `plot_forecast_component(forecast, 'superbowl')` to plot just the superbowl holiday component."
]
},
{
@ -511,7 +509,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Increasing the number of Fourier terms allows the seasonality to fit faster changing cycles, but can also lead to overfitting: $N$ Fourier terms corresponds to $2N$ variables used for modeling the cycle\n",
"Increasing the number of Fourier terms allows the seasonality to fit faster changing cycles, but can also lead to overfitting: N Fourier terms corresponds to 2N variables used for modeling the cycle\n",
"\n",
"### Specifying Custom Seasonalities\n",
"\n",
@ -801,6 +799,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Additional regressors\n",
"Additional regressors can be added to the linear part of the model using the `add_regressor` method or function. A column with the regressor value will need to be present in both the fitting and prediction dataframes. For example, we can add an additional effect on Sundays during the NFL season. On the components plot, this effect will show up in the 'extra_regressors' plot:"
]
@ -915,7 +914,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.14+"
"version": "2.7.13"
}
},
"nbformat": 4,