Machine Learning for Time Series Forecasting
1. Why Use ML for Time Series?
Unlike statistical models (e.g., ARIMA), machine learning:
- Doesn't assume linearity or stationarity
- Captures non-linear patterns and interactions
- Scales better for multivariate problems
2. Train/Test Split in Time Series
Random split is invalid for time series!
- Time series data is sequential
- You must preserve temporal order
Solution: Time-based split or Walk-Forward Validation
Walk-Forward Validation Process:
- Train on T1, predict T2
- Update training set with T2, predict T3
- Repeat...
Code Example:
from sklearn.model_selection import TimeSeriesSplit
tscv = TimeSeriesSplit(n_splits=5)
for train_idx, test_idx in tscv.split(X):
    X_train, X_test = X.iloc[train_idx], X.iloc[test_idx]
    y_train, y_test = y.iloc[train_idx], y.iloc[test_idx]
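If you just need a single chronological holdout rather than cross-validation, a minimal sketch (assuming X and y are a pandas DataFrame/Series already sorted by time):

split_point = int(len(X) * 0.8)   # first 80% for training, last 20% for testing
X_train, X_test = X.iloc[:split_point], X.iloc[split_point:]
y_train, y_test = y.iloc[:split_point], y.iloc[split_point:]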
3. Feature Engineering for Time Series
Key features:
- Lag features: past values (e.g., y(t-1), y(t-2))
- Rolling stats: rolling mean, std
- Date/time: month, day, hour, weekday, etc.
Lag Features Example:
for lag in range(1, 4):
    df[f'lag_{lag}'] = df['value'].shift(lag)
Rolling Stats Example:
df['rolling_mean'] = df['value'].rolling(3).mean()
df['rolling_std'] = df['value'].rolling(3).std()
Time Features:
df['month'] = df.index.month
df['dayofweek'] = df.index.dayofweek
4. ML Models for Forecasting
Let's build a regression model to predict the next time point.
Target:
\[
\hat{y}_t = f(y_{t-1}, y_{t-2}, \ldots, \text{date features})
\]
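Before fitting any model, here is a minimal sketch of assembling the feature matrix and target from a DataFrame that already contains the lag and date features built in Section 3 (df, 'value', and the column names are assumed from those earlier snippets):

df = df.dropna()   # the first rows lack a full lag history
feature_cols = ['lag_1', 'lag_2', 'lag_3', 'month', 'dayofweek']
X = df[feature_cols]
y = df['value']    # target: the value at time t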
a. Linear Regression
Code:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
rmse = mean_squared_error(y_test, y_pred) ** 0.5  # RMSE; avoids the squared=False argument, removed in newer scikit-learn
Pros:
- Fast and interpretable
- Performs well on linear trends
b. Random Forest
A tree-based ensemble model that captures non-linear patterns.
Code:
from sklearn.ensemble import RandomForestRegressor
model = RandomForestRegressor(n_estimators=100)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
Tips:
- Handles non-linearities and interactions
- Feature importance is interpretable
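To see which lags or calendar features the forest actually relies on, a quick sketch using the fitted model's feature_importances_ attribute (assumes X_train is a pandas DataFrame):

import pandas as pd

importances = pd.Series(model.feature_importances_, index=X_train.columns)
print(importances.sort_values(ascending=False))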
c. XGBoost
Gradient-boosted trees: a powerful and widely used approach for tabular forecasting.
Code:
import xgboost as xgb
model = xgb.XGBRegressor()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
Advantages:
- Handles missing values
- Fast and accurate
- Scalable
5. Multi-Step Forecasting
Two Approaches:
- Recursive Forecasting
  - Predict the next step
  - Feed the prediction back into the model to predict the following step
- Direct Forecasting (see the sketch after the recursive example)
  - Train one model per step (e.g., t+1, t+2)
Recursive Example:
import numpy as np

def recursive_forecast(model, history, steps=5, n_lags=3):
    # history: 1-D array of past values, at least n_lags long;
    # assumes the model was trained on the n_lags lag features only
    preds = []
    for _ in range(steps):
        input_data = history[-n_lags:].reshape(1, -1)  # last n_lags values as one sample
        pred = model.predict(input_data)[0]
        preds.append(pred)
        history = np.append(history, pred)             # feed the prediction back in
    return preds
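For comparison, a minimal sketch of the direct approach: one model per horizon, each trained on a target shifted h steps into the future (the helper name and the choice of LinearRegression are illustrative, not prescribed above; X and y are the pandas feature matrix and target from earlier):

from sklearn.linear_model import LinearRegression

def fit_direct_models(X, y, horizons=(1, 2, 3)):
    # One model per forecast horizon h, predicting y shifted h steps ahead
    models = {}
    for h in horizons:
        target = y.shift(-h).dropna()          # y at time t+h, aligned to features at t
        models[h] = LinearRegression().fit(X.loc[target.index], target)
    return models

# Usage: forecast each horizon from the most recent feature row
# models = fit_direct_models(X_train, y_train)
# forecasts = {h: m.predict(X.iloc[[-1]])[0] for h, m in models.items()}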
6. Evaluation Metrics
Use these to assess accuracy:
- RMSE (Root Mean Squared Error):
\[
RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
\]
- MAE (Mean Absolute Error):
\[
MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|
\]
- MAPE (Mean Absolute Percentage Error):
\[
MAPE = \frac{100}{n} \sum_{i=1}^{n} \left|\frac{y_i - \hat{y}_i}{y_i}\right|
\]
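A minimal sketch of computing all three with scikit-learn and NumPy (note that recent scikit-learn versions also provide mean_absolute_percentage_error, which returns a fraction rather than a percentage):

import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

rmse = np.sqrt(mean_squared_error(y_test, y_pred))
mae = mean_absolute_error(y_test, y_pred)
mape = np.mean(np.abs((y_test - y_pred) / y_test)) * 100   # undefined if y_test contains zeros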
Additional Tips
- Scale your features (e.g., with MinMaxScaler or StandardScaler); see the pipeline sketch after this list
- Always validate on future (not random) data
- Start with linear, move to trees, test neural nets later
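For the scaling tip, a minimal sketch using a scikit-learn Pipeline so the scaler is fit on training data only (the LinearRegression here is just an illustrative choice of model):

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression

pipeline = make_pipeline(StandardScaler(), LinearRegression())
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)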
Tools
- scikit-learn: regression models, pipelines
- xgboost: fast boosted trees
- lightgbm: efficient for large datasets
- optuna: hyperparameter optimization
Final Thoughts
| Aspect | Classical Time Series (e.g., ARIMA) |
|---|---|
| Lag-based features | Yes |
| Handles multivariate | Limited |
| Assumes stationarity | Yes |
| Can model non-linearity | No |