Economic Time-Series Forecasting on AWS SageMaker

Overview

This project forecasts U.S. Consumer Price Index (CPI) values using a multi-model approach, ranging from classical statistical methods to gradient boosting. The primary objective was to evaluate model accuracy under a rigorous time-series validation framework and demonstrate production-level deployment on AWS SageMaker.

A key decision mid-project was pivoting away from LSTM after testing showed deep learning struggled with the small monthly sample size and non-stationary trend of CPI data — a deliberate architectural choice in favor of a more robust tree-based approach.

XGBoost CPI Forecast

Results

Performance evaluated on a 24-month holdout set (Dec 2023 – Nov 2025):

Model	MAE	RMSE
XGBoost (Winner)	1.83	2.0
Prophet	7.09	7.15
ARIMA (1,1,1)	8.04	9.29

Data

Source: Federal Reserve Economic Data (FRED)
Series: CPIAUCSL (Monthly)
Evaluation window: 24-month holdout

Technical Approach

Applied first-order differencing to remove trend bias — model predicts monthly delta rather than absolute CPI values
Engineered a 12-month sliding window of lag features to capture seasonal momentum and annual cycles
Benchmarked ARIMA and Prophet as statistical baselines before training XGBoost
Serialized the final model into model.tar.gz with a custom inference handler for SageMaker deployment
Deployment script intentionally separated from the notebook to avoid unnecessary cloud costs

Tech Stack

Python XGBoost ARIMA Prophet AWS SageMaker Boto3 Pandas Scikit-learn FRED API MLOps

View Full Repo on GitHub · Back to Portfolio