ARIMA & Prophet
Classical and modern approaches to time series forecasting. Predict future values from historical trends, seasonality, and patterns.
Time Series Basics
A time series has three main components:
Trend
Long-term increase or decrease. Example: global temperatures rising over decades.
Seasonality
Repeating patterns at fixed intervals. Example: ice cream sales peaking every summer.
Noise (Residual)
Random fluctuations that can't be explained by trend or seasonality.
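To make the three components concrete, here is a minimal sketch that builds a synthetic series as trend + seasonality + noise and then pulls the pieces apart again with a centered moving average. The slope, amplitude, noise level, and 7-step period are made-up values chosen for illustration, and this is a simple classical decomposition, not the method any particular library uses.

```python
import numpy as np

rng = np.random.default_rng(0)
n, period = 365 * 3, 7          # three years of daily data, weekly cycle (assumed values)
t = np.arange(n)

trend = 0.05 * t                                     # long-term increase
seasonality = 2.0 * np.sin(2 * np.pi * t / period)   # repeats every `period` steps
noise = rng.normal(scale=0.5, size=n)                # unexplained fluctuations
y = trend + seasonality + noise

# Estimate the trend with a moving average spanning one full period,
# so the seasonal pattern averages out inside each window.
kernel = np.ones(period) / period
trend_hat = np.convolve(y, kernel, mode="valid")     # length n - period + 1
half = period // 2
detrended = y[half : n - half] - trend_hat           # aligned to window centers

# Estimate seasonality as the mean detrended value at each position in the cycle.
positions = t[half : n - half] % period
seasonal_hat = np.array([detrended[positions == k].mean() for k in range(period)])

# Whatever is left after removing trend and seasonality is the residual.
residual = detrended - seasonal_hat[positions]
print("seasonal estimate per weekday:", np.round(seasonal_hat, 2))
print("residual std:", round(float(residual.std()), 2))
```

The moving-average window must cover exactly one seasonal cycle; a shorter or longer window leaks seasonality into the trend estimate.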
ARIMA: AutoRegressive Integrated Moving Average
ARIMA combines three components into one model, defined by three parameters (p, d, q):
ARIMA(p, d, q):
AR (AutoRegressive, p): Regress the current value on its own p past values
y_t = c + φ₁y_{t-1} + φ₂y_{t-2} + ... + φ_py_{t-p} + ε_t
I (Integrated, d): Difference the series d times to make it stationary
y'_t = y_t - y_{t-1} (d=1: first difference)
MA (Moving Average, q): Regress the current value on the past q forecast errors
y_t = c + ε_t + θ₁ε_{t-1} + θ₂ε_{t-2} + ... + θ_qε_{t-q}
p = number of lag observations (AR terms)
d = number of times differenced (to remove trend)
q = number of lagged forecast errors (MA terms)
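One way to get a feel for p and q is to simulate a pure AR(1) and a pure MA(1) process and compare their sample autocorrelations. The sketch below uses only NumPy; the coefficients 0.7 and 0.6 and the helper `sample_acf` are illustrative choices, not part of any library.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)])

rng = np.random.default_rng(1)
n = 20_000
eps = rng.normal(size=n)

# AR(1): y_t = phi * y_{t-1} + eps_t   (p = 1)
phi = 0.7
ar = np.zeros(n)
for i in range(1, n):
    ar[i] = phi * ar[i - 1] + eps[i]

# MA(1): y_t = eps_t + theta * eps_{t-1}   (q = 1)
theta = 0.6
ma = eps.copy()
ma[1:] += theta * eps[:-1]

acf_ar = sample_acf(ar, 4)
acf_ma = sample_acf(ma, 4)
print("AR(1) ACF:", np.round(acf_ar, 2))  # decays gradually, roughly like phi**k
print("MA(1) ACF:", np.round(acf_ma, 2))  # close to zero beyond lag q = 1
```

An autocorrelation that cuts off sharply after lag q suggests MA terms, while a slow geometric decay suggests AR terms. This is the usual heuristic behind reading ACF plots when choosing an order by hand.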
Stationarity
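The AR and MA parts of ARIMA assume a stationary series (constant mean, variance, and autocorrelation over time); the differencing step (d) exists to get the data into that form. The ADF (Augmented Dickey-Fuller) test helps decide whether differencing is needed. Its null hypothesis is that the series has a unit root (is non-stationary), so a small p-value is evidence of stationarity: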
from statsmodels.tsa.stattools import adfuller
import pandas as pd
import numpy as np
# Generate sample data with trend
np.random.seed(42)
dates = pd.date_range('2020-01-01', periods=200, freq='D')
data = np.cumsum(np.random.randn(200)) + np.arange(200) * 0.1  # random walk + linear trend
ts = pd.Series(data, index=dates)
# ADF test
result = adfuller(ts)
print(f"ADF Statistic: {result[0]:.4f}")
print(f"p-value: {result[1]:.4f}")
# p-value > 0.05 → cannot reject a unit root: treat as NOT stationary, need differencing (d > 0)
# After differencing:
ts_diff = ts.diff().dropna()
result = adfuller(ts_diff)
print(f"After diff - p-value: {result[1]:.4f}")
# p-value < 0.05 → stationary, use d=1
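Differencing changes what the model predicts: with d=1 the model works on changes, and those changes have to be accumulated back onto the last observed level to get a forecast on the original scale. ARIMA implementations do this internally; the NumPy sketch below only illustrates the bookkeeping. The "forecast" of the differences is just their historical mean, a stand-in for a real model's output.

```python
import numpy as np

rng = np.random.default_rng(2)
y = np.cumsum(rng.normal(size=200)) + 0.1 * np.arange(200)   # random walk + linear trend

# d = 1: first difference, y'_t = y_t - y_{t-1}
dy = np.diff(y)

# Assumption for illustration: forecast the next 5 differences by their mean.
dy_forecast = np.full(5, dy.mean())

# Integrate back to the original scale: start from the last observed level
# and accumulate the forecast differences.
y_forecast = y[-1] + np.cumsum(dy_forecast)

# Sanity check: accumulating the in-sample differences recovers the series.
reconstructed = np.concatenate(([y[0]], y[0] + np.cumsum(dy)))
print("round trip ok:", np.allclose(reconstructed, y))
print("forecast on original scale:", np.round(y_forecast, 2))
```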
ARIMA Code
from statsmodels.tsa.arima.model import ARIMA
import matplotlib.pyplot as plt
# Fit ARIMA(2,1,2)
model = ARIMA(ts, order=(2, 1, 2))
fitted = model.fit()
print(fitted.summary())
# Forecast next 30 days
forecast = fitted.forecast(steps=30)
# Plot
plt.figure(figsize=(12, 5))
plt.plot(ts, label='Actual')
plt.plot(forecast.index, forecast.values, 'r--', label='Forecast')
plt.legend()
plt.title('ARIMA Forecast')
plt.show()
# Auto ARIMA (searches over p, d, q and keeps the best-scoring model, AIC by default)
# pip install pmdarima
from pmdarima import auto_arima
auto_model = auto_arima(ts, seasonal=False, trace=True)
print(f"Best order: {auto_model.order}")
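The idea behind automated order selection can be sketched without any forecasting library: fit several candidate orders, score each with an information criterion, and keep the lowest score. The example below is a deliberately simplified stand-in that only searches over AR orders and fits them by ordinary least squares. It is not how `auto_arima` is implemented; the helpers `fit_ar_ols` and `aic` are hypothetical names introduced here.

```python
import numpy as np

def fit_ar_ols(y, p, max_p):
    """Least-squares fit of an AR(p) model with intercept.

    The first max_p observations are held back for every p so that all
    candidate orders are scored on the same targets.
    """
    target = y[max_p:]
    cols = [np.ones(len(target))] + [y[max_p - k : len(y) - k] for k in range(1, p + 1)]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    rss = float(np.sum((target - X @ coef) ** 2))
    return coef, rss, len(target)

def aic(rss, n_obs, n_params):
    """Gaussian AIC up to an additive constant."""
    return n_obs * np.log(rss / n_obs) + 2 * n_params

# Simulate an AR(2) series so the true order is known.
rng = np.random.default_rng(3)
n = 2000
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.5 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()

max_p = 5
scores = {}
for p in range(0, max_p + 1):
    _, rss, n_obs = fit_ar_ols(y, p, max_p)
    scores[p] = aic(rss, n_obs, p + 1)   # p lag coefficients + intercept

best_p = min(scores, key=scores.get)
print("AIC by order:", {p: round(float(s), 1) for p, s in scores.items()})
print("Selected p:", best_p)
```

The penalty term `2 * n_params` is what stops the search from always preferring the largest model: extra lags must reduce the residual error by enough to pay for themselves.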
SARIMA: Seasonal ARIMA
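Extends ARIMA with seasonal components. SARIMA(p,d,q)(P,D,Q,s), where P, D, Q are the seasonal counterparts of p, d, q and s is the number of observations per seasonal cycle (12 for monthly data with a yearly cycle, 7 for daily data with a weekly cycle):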
# SARIMA with a weekly cycle: ts is daily, so s=7
# (use s=12 for monthly data with a yearly cycle)
from statsmodels.tsa.statespace.sarimax import SARIMAX
model = SARIMAX(ts, order=(1,1,1), seasonal_order=(1,1,1,7))
fitted = model.fit()
forecast = fitted.forecast(steps=30)
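The seasonal differencing term (D) works like ordinary differencing but subtracts the value from one full cycle earlier, y_t - y_{t-s}. A small NumPy sketch shows why this removes a repeating pattern. The weekly pattern values and the helper name `seasonal_difference` are made up for illustration.

```python
import numpy as np

def seasonal_difference(y, s):
    """Return y_t - y_{t-s}; the first s observations are lost."""
    y = np.asarray(y, dtype=float)
    return y[s:] - y[:-s]

rng = np.random.default_rng(4)
s = 7                                   # weekly cycle in daily data
t = np.arange(140)
weekly_pattern = np.array([0.0, 1.0, 3.0, 2.0, 0.5, -2.0, -4.5])   # made-up shape
y = 10 + weekly_pattern[t % s] + rng.normal(scale=0.3, size=t.size)

dy_s = seasonal_difference(y, s)

# The repeating pattern cancels, so the spread drops to roughly the noise level.
print("std before:", round(float(y.std()), 2))
print("std after: ", round(float(dy_s.std()), 2))
```

Each seasonal difference costs s observations at the start of the series, which is one reason D is rarely set above 1.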
Facebook Prophet
Developed at Facebook (now Meta) for business forecasting. Designed to be easy to use, tolerate missing data and outliers, and model the effect of holidays and events that you supply.
Prophet's model is additive by default: y(t) = trend(t) + seasonality(t) + holidays(t) + error(t). Each component is fitted as a separate term, so its contribution can be inspected on its own, which makes the model interpretable.
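The additive structure can be illustrated with plain least squares: put a trend column and a few Fourier (sine/cosine) columns for the seasonal cycle into one design matrix, fit once, and read each component off its own block of coefficients. This is a rough sketch of the idea only. Prophet's actual trend is piecewise with changepoints and its fitting procedure is different; the helper `fourier_features` and all numeric values are assumptions made for this example.

```python
import numpy as np

def fourier_features(t, period, order):
    """Columns sin(2*pi*k*t/period), cos(2*pi*k*t/period) for k = 1..order."""
    cols = []
    for k in range(1, order + 1):
        angle = 2 * np.pi * k * t / period
        cols.extend([np.sin(angle), np.cos(angle)])
    return np.column_stack(cols)

rng = np.random.default_rng(5)
t = np.arange(400, dtype=float)
y = 5.0 + 0.02 * t + 1.5 * np.sin(2 * np.pi * t / 7) + rng.normal(scale=0.3, size=t.size)

# Design matrix: [intercept, linear trend, weekly Fourier terms]
X = np.column_stack([np.ones_like(t), t, fourier_features(t, period=7, order=3)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Because the model is additive, each component is just its own columns times its own coefficients.
trend_part = X[:, :2] @ beta[:2]
seasonal_part = X[:, 2:] @ beta[2:]
residual = y - trend_part - seasonal_part

print("estimated slope:", round(float(beta[1]), 4))
print("residual std:", round(float(residual.std()), 3))
```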
Prophet Code
# pip install prophet
from prophet import Prophet
import pandas as pd
# Prophet requires columns named 'ds' (date) and 'y' (value)
df = pd.DataFrame({'ds': dates, 'y': data})
# Fit
model = Prophet(
    yearly_seasonality=True,   # only meaningful with at least a full year of history
    weekly_seasonality=True,
    daily_seasonality=False,
    changepoint_prior_scale=0.05  # Flexibility of trend changes (higher = more flexible)
)
model.fit(df)
# Forecast
future = model.make_future_dataframe(periods=30) # 30 days ahead
forecast = model.predict(future)
# Plot
model.plot(forecast)
model.plot_components(forecast) # Shows trend + seasonality separately
# Add holidays (list every occurrence that falls in the history or the forecast horizon)
holidays = pd.DataFrame({
    'holiday': 'black_friday',
    'ds': pd.to_datetime(['2020-11-27', '2021-11-26', '2022-11-25']),
    'lower_window': -1,  # also cover 1 day before
    'upper_window': 1    # also cover 1 day after
})
model = Prophet(holidays=holidays)
model.fit(df)
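To see which calendar days a holiday definition covers, the window can be expanded by hand. The helper `expand_holiday_window` below is a hypothetical illustration written with the standard library only; it is not part of Prophet's API.

```python
from datetime import date, timedelta

def expand_holiday_window(days, lower_window, upper_window):
    """Return every date covered by [day + lower_window, day + upper_window]."""
    covered = []
    for day in days:
        for offset in range(lower_window, upper_window + 1):
            covered.append(day + timedelta(days=offset))
    return covered

black_fridays = [date(2020, 11, 27), date(2021, 11, 26), date(2022, 11, 25)]
affected = expand_holiday_window(black_fridays, lower_window=-1, upper_window=1)

# Three dates per holiday: the day before, the day itself, and the day after.
for d in affected:
    print(d.isoformat())
```

With `lower_window=-1` and `upper_window=1`, each Black Friday contributes three dates, so Thanksgiving Thursday and the following Saturday are treated as part of the event.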
ARIMA vs Prophet
| Feature | ARIMA | Prophet |
| --- | --- | --- |
| Ease of use | Needs parameter tuning (p,d,q) | Works out of the box |
| Missing data | Expects a regular time grid; gaps usually need imputing | Handles gaps without imputation |
| Multiple seasonality | One seasonal period (via SARIMA) | Multiple (daily, weekly, yearly) |
| Holidays/events | No built-in support | Built-in holiday effects |
| Interpretability | Mathematical (coefficients) | Visual (component plots) |
| Best for | Short-term forecasts of series that are stationary after differencing | Business forecasting with trends, seasonality, and holidays |
Neither ARIMA nor Prophet can predict sudden events (pandemics, crashes). They work best when the future resembles the past.