🟢 Time Series Analysis with Exponential Smoothing – Forecasting Monthly Sunspot Numbers

stemaway · November 5, 2024, 9:38pm

Objective

Create a forecasting model to predict monthly sunspot numbers using basic time series analysis techniques.

Learning Outcomes

By completing this project, you will:

Learn to visualize time series data
Understand seasonal decomposition
Implement simple exponential smoothing
Evaluate forecasting models

Prerequisites

Basic Python skills
Basic statistics understanding
Familiarity with pandas and matplotlib

Dataset Details

Monthly Sunspot Numbers Dataset: Link
Size: ~30KB
Monthly counts of sunspots from 1749 to 1983
Features: Month, Sunspots (monthly sunspot count)

Tools Required

# Core libraries
pip install pandas numpy matplotlib

# Statistical modeling and evaluation metrics
pip install statsmodels scikit-learn

Project Structure

sunspot_forecasting/
│
├── data/
│   └── monthly-sunspots.csv
│
├── src/
│   ├── data_processing.py
│   ├── model.py
│   └── evaluation.py
│
└── notebooks/
    ├── data_exploration.ipynb
    ├── time_series_decomposition.ipynb
    └── model_implementation.ipynb

Steps and Tasks

1. Data Acquisition and Exploration

Load and visualize the sunspot data.

import pandas as pd
import matplotlib.pyplot as plt

# Load data (adjust the path, e.g. '../data/monthly-sunspots.csv', to match the project structure)
df = pd.read_csv('monthly-sunspots.csv', parse_dates=['Month'], index_col='Month')

# Plot time series
plt.figure(figsize=(12, 6))
plt.plot(df['Sunspots'])
plt.title('Monthly Sunspot Numbers')
plt.xlabel('Year')
plt.ylabel('Sunspot Count')
plt.show()
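
Before modeling, it can help to give the series an explicit monthly frequency and check for gaps, since time series models generally expect evenly spaced observations. The snippet below is a minimal sketch on synthetic stand-in data, assuming the dates fall on month starts (as they do when a 'YYYY-MM' column is parsed); the helper name `prepare_monthly_series` is hypothetical and not part of the original project.

```python
import pandas as pd

def prepare_monthly_series(series):
    """Sort by date, set a month-start frequency, and count missing values."""
    series = series.sort_index()
    monthly = series.asfreq('MS')  # months absent from the index become NaN
    n_missing = int(monthly.isna().sum())
    return monthly, n_missing

# Synthetic stand-in for df['Sunspots'] (assumption: real data looks similar)
idx = pd.date_range('1749-01-01', periods=36, freq='MS')
demo = pd.Series(range(36), index=idx, dtype=float)

monthly, n_missing = prepare_monthly_series(demo)
print(monthly.index.freqstr, n_missing)

# A centered 12-month rolling mean smooths month-to-month noise for plotting
smoothed = monthly.rolling(window=12, center=True).mean()
```

With the real data, the same call would be `prepare_monthly_series(df['Sunspots'])`, and `smoothed` can be plotted on top of the raw series to make the long cycles easier to see.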

2. Time Series Decomposition

Analyze the components of the time series.

from statsmodels.tsa.seasonal import seasonal_decompose

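# Decompose the time series
# period=12 assumes a yearly pattern; the solar cycle is roughly 11 years
# (~132 months), so it is worth trying period=132 as well
decomposition = seasonal_decompose(df['Sunspots'], model='additive', period=12)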
decomposition.plot()
plt.show()
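
To make the additive model concrete (observed = trend + seasonal + residual), here is a minimal pure-Python sketch of classical additive decomposition on synthetic data. It illustrates the idea only and is not statsmodels' implementation; the function names `centered_moving_average` and `additive_decompose` are made up for this example.

```python
def centered_moving_average(values, period):
    """Centered moving average; end points get half weight when period is even."""
    n = len(values)
    half = period // 2
    trend = [None] * n  # undefined near the edges
    for t in range(half, n - half):
        window = values[t - half:t + half + 1]
        if period % 2 == 0:
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend

def additive_decompose(values, period):
    """Split values into trend, seasonal, and residual components."""
    trend = centered_moving_average(values, period)
    detrended = [y - tr if tr is not None else None
                 for y, tr in zip(values, trend)]
    # Average the detrended values at each position in the cycle
    means = []
    for pos in range(period):
        vals = [d for i, d in enumerate(detrended)
                if d is not None and i % period == pos]
        means.append(sum(vals) / len(vals))
    # Center the seasonal pattern so it sums to zero over one cycle
    offset = sum(means) / period
    pattern_hat = [m - offset for m in means]
    seasonal = [pattern_hat[i % period] for i in range(len(values))]
    residual = [y - tr - s if tr is not None else None
                for y, tr, s in zip(values, trend, seasonal)]
    return trend, seasonal, residual

# Synthetic series: linear trend plus a repeating 12-step pattern
pattern = [3, 1, -2, -4, -1, 0, 2, 4, 1, -1, -2, -1]
series = [10 + 0.5 * t + pattern[t % 12] for t in range(48)]
trend, seasonal, residual = additive_decompose(series, 12)
```

On this noise-free series the recovered seasonal component matches `pattern` and the residuals are essentially zero. Real sunspot data will leave much larger residuals, especially if the chosen period does not match the actual cycle length.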

3. Forecasting with Simple Exponential Smoothing

Implement and evaluate a simple forecasting model.

from statsmodels.tsa.holtwinters import SimpleExpSmoothing

# Split data
train = df['Sunspots'][:-24]
test = df['Sunspots'][-24:]

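# Fit the model (by default, fit() optimizes the smoothing level alpha)
model = SimpleExpSmoothing(train).fit()

# Forecast: simple exponential smoothing gives a flat line at the last smoothed level
forecast = model.forecast(len(test))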

# Plot forecast vs actual
plt.figure(figsize=(12, 6))
plt.plot(test.index, test, label='Actual')
plt.plot(test.index, forecast, label='Forecast', color='red')
plt.legend()
plt.show()
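
Simple exponential smoothing keeps a single running "level" and updates it as level = alpha * y + (1 - alpha) * level. The sketch below writes that recurrence out by hand, assuming a fixed alpha and the first observation as the initial level; statsmodels can estimate these values instead, so its numbers will generally differ. The function names are hypothetical.

```python
def simple_exp_smoothing(values, alpha, initial_level=None):
    """Return one-step-ahead fitted values and the final smoothed level."""
    level = values[0] if initial_level is None else initial_level
    fitted = []
    for y in values:
        fitted.append(level)  # forecast made before observing y
        level = alpha * y + (1 - alpha) * level  # blend new value with old level
    return fitted, level

def forecast_flat(level, horizon):
    """Every future step gets the same value: the last smoothed level."""
    return [level] * horizon

values = [10.0, 12.0, 14.0]
fitted, level = simple_exp_smoothing(values, alpha=0.5)
print(fitted)                   # [10.0, 10.0, 11.0]
print(forecast_flat(level, 3))  # [12.5, 12.5, 12.5]
```

Because the forecast is constant over the horizon, the red line in the plot is horizontal and cannot follow the rise and fall of the sunspot cycle across the 24 test months.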

4. Model Evaluation

Assess the forecasting accuracy.

from sklearn.metrics import mean_squared_error

mse = mean_squared_error(test, forecast)
print(f'Mean Squared Error: {mse:.2f}')
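
MSE is in squared units, so it is often easier to read alongside RMSE and MAE, which are on the same scale as the sunspot counts, and next to a naive baseline that simply repeats the last training value. The following is a minimal pure-Python sketch with toy numbers; the function names are illustrative and the values are not results from the sunspot data.

```python
import math

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(mse(actual, predicted))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def naive_forecast(train_values, horizon):
    """Baseline: repeat the last observed training value."""
    return [train_values[-1]] * horizon

# Toy numbers for illustration only
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]
baseline = naive_forecast([1.0, 2.0, 4.0], len(actual))

print(f'Model RMSE: {rmse(actual, predicted):.2f}, MAE: {mae(actual, predicted):.2f}')
print(f'Naive RMSE: {rmse(actual, baseline):.2f}')
```

With the real split, `list(train)`, `list(test)`, and `list(forecast)` can be passed to these functions. A model that does not beat the naive baseline is not adding much.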