Time Series Analysis with Exponential Smoothing – Forecasting Monthly Sunspot Numbers
Objective
Create a forecasting model to predict monthly sunspot numbers using basic time series analysis techniques.
Learning Outcomes
By completing this project, you will:
- Learn to visualize time series data
- Understand seasonal decomposition
- Implement simple exponential smoothing
- Evaluate forecasting models
Prerequisites
- Basic Python skills
- Basic understanding of statistics
- Familiarity with pandas and matplotlib
Dataset Details
- Monthly Sunspot Numbers Dataset (monthly-sunspots.csv)
- Size: ~30KB
- Monthly counts of sunspots from 1749 to 1983
- Columns: Month (date), Sunspots (monthly sunspot count)
Tools Required
# Core libraries
pip install pandas numpy matplotlib
# Statistical modeling and evaluation metrics
pip install statsmodels scikit-learn
Project Structure
sunspot_forecasting/
│
├── data/
│ └── monthly-sunspots.csv
│
├── src/
│ ├── data_processing.py
│ ├── model.py
│ └── evaluation.py
│
└── notebooks/
    ├── data_exploration.ipynb
    ├── time_series_decomposition.ipynb
    └── model_implementation.ipynb
Steps and Tasks
1. Data Acquisition and Exploration
Load and visualize the sunspot data.
import pandas as pd
import matplotlib.pyplot as plt
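# Load data; the Month column becomes a DatetimeIndex
# (adjust the path if the file lives in data/ as in the project structure)
df = pd.read_csv('monthly-sunspots.csv', parse_dates=['Month'], index_col='Month')
# Declare the month-start frequency so statsmodels does not have to infer it
df = df.asfreq('MS')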
# Plot time series
plt.figure(figsize=(12, 6))
plt.plot(df['Sunspots'])
plt.title('Monthly Sunspot Numbers')
plt.xlabel('Year')
plt.ylabel('Sunspot Count')
plt.show()
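Before modelling, it can help to confirm that the index really is a regular monthly series with no gaps. The following is a minimal sketch on a tiny made-up sample (the values are illustrative, not taken from the dataset); the column names Month and Sunspots follow the loading code above.

```python
import io
import pandas as pd

# Tiny made-up sample in the same layout as monthly-sunspots.csv (illustrative values only)
sample_csv = io.StringIO(
    "Month,Sunspots\n"
    "1749-01,10.0\n"
    "1749-02,12.5\n"
    "1749-03,11.0\n"
    "1749-05,14.0\n"  # 1749-04 is deliberately left out
)
sample = pd.read_csv(sample_csv, parse_dates=["Month"], index_col="Month")

# Reindex to an explicit month-start frequency; any missing month shows up as NaN
regular = sample.asfreq("MS")
n_missing = int(regular["Sunspots"].isna().sum())

print(regular)
print(f"Missing months: {n_missing}")
```

Running the same two steps (asfreq, then counting NaN values) on the full DataFrame shows whether any months need to be filled before fitting a model.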
2. Time Series Decomposition
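Split the series into trend, seasonal, and residual components. Sunspot activity follows the roughly 11-year solar cycle rather than an annual pattern, so the period is set to about 132 months instead of 12.
from statsmodels.tsa.seasonal import seasonal_decompose
# Additive decomposition: observed = trend + seasonal + residual
# period=132 approximates the ~11-year solar cycle in months
decomposition = seasonal_decompose(df['Sunspots'], model='additive', period=132)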
decomposition.plot()
plt.show()
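To see what an additive decomposition does under the hood, here is a simplified sketch of the classical approach: estimate the trend with a centered moving average, average the detrended values at each position in the cycle to get the seasonal effect, and treat what is left as the residual. The function name additive_decompose and the synthetic series are assumptions made for illustration; this is not the statsmodels implementation.

```python
import numpy as np

def additive_decompose(values, period):
    """Simplified classical additive decomposition (illustrative only)."""
    x = np.asarray(values, dtype=float)
    n = len(x)
    # Centered moving average; an even period needs half-weights at both ends
    if period % 2 == 0:
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    trend = np.full(n, np.nan)  # edges stay NaN because the window does not fit there
    trend[half:n - half] = np.convolve(x, weights, mode="valid")
    # Seasonal effect: average detrended value at each position in the cycle
    detrended = x - trend
    effects = np.array([np.nanmean(detrended[i::period]) for i in range(period)])
    effects -= effects.mean()  # center so the effects sum to zero over one cycle
    seasonal = np.tile(effects, n // period + 1)[:n]
    resid = x - trend - seasonal
    return trend, seasonal, resid

# Synthetic, noise-free series (made up): linear trend plus a 12-step cycle
t = np.arange(120)
y = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12)
trend, seasonal, resid = additive_decompose(y, period=12)

print(np.nanmax(np.abs(resid)))  # close to zero for this noise-free series
```

On real data the residual will not vanish; a large or structured residual suggests that the chosen period or the additive form does not describe the series well.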
3. Forecasting with Simple Exponential Smoothing
Hold out the last 24 months as a test set, fit simple exponential smoothing on the remaining data, and forecast the held-out period.
from statsmodels.tsa.holtwinters import SimpleExpSmoothing
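# Hold out the last 24 months as a test set
train = df['Sunspots'].iloc[:-24]
test = df['Sunspots'].iloc[-24:]
# Fit the model; with no smoothing_level given, alpha is estimated from the training data
model = SimpleExpSmoothing(train).fit()
# Forecast one value per test month (SES forecasts are flat: every step equals the last smoothed level)
forecast = model.forecast(len(test))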
# Plot forecast vs actual
plt.figure(figsize=(12, 6))
plt.plot(test.index, test, label='Actual')
plt.plot(test.index, forecast, label='Forecast', color='red')
plt.legend()
plt.show()
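Simple exponential smoothing keeps a single smoothed level that is updated as level = alpha * observation + (1 - alpha) * previous level, and every future forecast equals the final level. The sketch below spells out that recursion in plain Python. The function name simple_exp_smoothing, the fixed alpha, and the choice to start the level at the first observation are assumptions for illustration; statsmodels estimates these values rather than fixing them.

```python
def simple_exp_smoothing(values, alpha, horizon):
    """Hand-rolled SES: returns a flat forecast of length `horizon`."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = values[0]  # assumption: start the level at the first observation
    for y in values[1:]:
        # Blend the new observation with the previous level
        level = alpha * y + (1 - alpha) * level
    # SES has no trend or seasonal term, so every step ahead repeats the last level
    return [level] * horizon

history = [10.0, 20.0, 30.0]
flat_forecast = simple_exp_smoothing(history, alpha=0.5, horizon=3)
print(flat_forecast)  # [22.5, 22.5, 22.5]
```

A larger alpha makes the level react faster to recent observations, while a smaller alpha averages over a longer history. Because the forecast is a constant, SES cannot follow the rise and fall of a cyclic series over a 24-month horizon.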
4. Model Evaluation
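Measure forecast accuracy on the held-out test set.
from sklearn.metrics import mean_squared_error
# MSE is in squared units; its square root (RMSE) is in sunspot counts
mse = mean_squared_error(test, forecast)
print(f'Mean Squared Error: {mse:.2f}')
print(f'Root Mean Squared Error: {mse ** 0.5:.2f}')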
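A single MSE value is hard to interpret on its own, so it can be useful to compute a few error metrics side by side. The sketch below does this with NumPy on made-up numbers; the helper name forecast_errors is hypothetical and not part of any library.

```python
import numpy as np

def forecast_errors(actual, predicted):
    """Return MSE, RMSE, and MAE for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted
    mse = float(np.mean(err ** 2))
    return {
        "mse": mse,                         # squared units, penalizes large misses
        "rmse": mse ** 0.5,                 # same units as the data
        "mae": float(np.mean(np.abs(err))), # average absolute miss
    }

# Made-up values: a flat forecast of 5.0 against three observations
actual = [3.0, 5.0, 7.0]
flat = [5.0, 5.0, 5.0]
scores = forecast_errors(actual, flat)
print(scores)
```

Applying the same function to a naive baseline, for example repeating the last training value across the test period, gives a reference point for judging whether the smoothing model adds anything.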