TimeSeriesSplit — no peeking at the future

All-Pro

Prompt

Re-evaluate the pipeline from Challenge 4 with TimeSeriesSplit instead of default KFold. The data should be sorted by (season, week) before splitting. Report mean R² ± std across 5 sequential folds.

Compare to your Challenge 4 result. Is the score better, worse, or about the same? Why does that make sense?

Expected output

KFold        R² = 0.0XX ± 0.0XX  (from Challenge 4)
TimeSeries   R² = 0.0XX ± 0.0XX

Plus a 2-sentence interpretation.

Hint

from sklearn.model_selection import TimeSeriesSplit, cross_val_score

df = df.sort_values(['season', 'week']).reset_index(drop=True)
X, y = df[features], df['receiving_yards']

cv = TimeSeriesSplit(n_splits=5)
scores = cross_val_score(pipe, X, y, cv=cv, scoring='r2')
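
If it is unclear why the sort matters, it may help to look at the row indices each splitter hands out. TimeSeriesSplit cuts by row position, not by any date column, so the order of the rows is the only notion of "time" it has. The sketch below uses a made-up 12-row array (not the challenge data; `X_demo`, `ts_folds`, and `kf_folds` are illustrative names) and prints the train/test indices for both splitters.

```python
import numpy as np
from sklearn.model_selection import KFold, TimeSeriesSplit

# Toy stand-in for 12 rows already sorted by (season, week).
# Assumption: row position is the only ordering the splitters see.
X_demo = np.arange(12).reshape(-1, 1)

# TimeSeriesSplit: each training window ends before its test block begins.
ts_folds = list(TimeSeriesSplit(n_splits=5).split(X_demo))
print("TimeSeriesSplit")
for i, (train_idx, test_idx) in enumerate(ts_folds):
    print(f"  fold {i}: train={train_idx.tolist()} test={test_idx.tolist()}")

# Default (unshuffled) KFold: training rows can come after the test rows.
kf_folds = list(KFold(n_splits=5).split(X_demo))
print("KFold")
for i, (train_idx, test_idx) in enumerate(kf_folds):
    print(f"  fold {i}: train={train_idx.tolist()} test={test_idx.tolist()}")
```

In every TimeSeriesSplit fold the largest training index is smaller than the smallest test index, and the training window grows from fold to fold. With KFold, most folds train on rows that sit after the test block, which is the "peeking" the title warns about.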

Solution

# (same imports / data prep as Challenge 4)
from sklearn.model_selection import TimeSeriesSplit

df = df.sort_values(['season', 'week']).reset_index(drop=True)
X = df[['prev_yards', 'prev_targets', 'position_group']]
y = df['receiving_yards']

preprocess = ColumnTransformer([...])      # same as Challenge 4
pipe = Pipeline([('prep', preprocess), ('model', LinearRegression())])

cv = TimeSeriesSplit(n_splits=5)
scores = cross_val_score(pipe, X, y, cv=cv, scoring='r2')
print(f"TimeSeries R² = {scores.mean():.3f} ± {scores.std():.3f}")

The TimeSeries score will usually be a touch lower than the KFold score, for two reasons. First, each fold trains only on rows that come before its test block, so the early folds fit on much less data. Second, football changes from year to year (rule tweaks, scheme shifts), so patterns learned from past seasons transfer imperfectly to later ones. KFold hides both effects by letting the model train on games played after the ones it is scored on, which tends to inflate the score. The TimeSeries number is closer to what you’d actually get deploying the model on next week’s games.
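
To produce both lines of the expected output in one run, one option is to loop over the two splitters. The sketch below is self-contained, so it runs on synthetic data rather than the Challenge 4 dataset: the `toy` frame, its coefficients, and the `toy_prep` preprocessing (scaling plus one-hot encoding) are all assumptions made for illustration, not the actual Challenge 4 setup. The synthetic data has no year-to-year drift, so do not read anything into which score comes out higher here.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, TimeSeriesSplit, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the challenge data (assumption: made-up values).
rng = np.random.default_rng(0)
n = 600
toy = pd.DataFrame({
    "season": rng.integers(2019, 2024, size=n),
    "week": rng.integers(1, 19, size=n),
    "prev_yards": rng.normal(50, 25, size=n),
    "prev_targets": rng.normal(6, 3, size=n),
    "position_group": rng.choice(["WR", "TE", "RB"], size=n),
})
toy["receiving_yards"] = (
    0.3 * toy["prev_yards"] + 2.0 * toy["prev_targets"] + rng.normal(0, 30, size=n)
)

# Sort chronologically so row order matches time order.
toy = toy.sort_values(["season", "week"]).reset_index(drop=True)
X_toy = toy[["prev_yards", "prev_targets", "position_group"]]
y_toy = toy["receiving_yards"]

# Hypothetical preprocessing; swap in the Challenge 4 ColumnTransformer.
toy_prep = ColumnTransformer([
    ("num", StandardScaler(), ["prev_yards", "prev_targets"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["position_group"]),
])
toy_pipe = Pipeline([("prep", toy_prep), ("model", LinearRegression())])

# Score the same pipeline under both cross-validation schemes.
splitters = [
    ("KFold", KFold(n_splits=5)),
    ("TimeSeries", TimeSeriesSplit(n_splits=5)),
]
results = {}
for name, splitter in splitters:
    fold_scores = cross_val_score(toy_pipe, X_toy, y_toy, cv=splitter, scoring="r2")
    results[name] = fold_scores
    print(f"{name:<12} R² = {fold_scores.mean():.3f} ± {fold_scores.std():.3f}")
```

On the real data, replace `toy`, `X_toy`, `y_toy`, and `toy_pipe` with the objects from Challenge 4; the loop itself stays the same.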