Time Series Forecasting with ARIMA – Predicting Daily Minimum Temperatures
Objective
Build a simple time series forecasting model to predict daily minimum temperatures. Learn fundamental time series concepts and basic statistical forecasting techniques.
Learning Outcomes
By completing this project, you will:
- Understand time series analysis basics
- Implement simple statistical forecasting models
- Use essential time series libraries and tools
- Visualize and interpret time series data
- Evaluate model performance with basic metrics
Prerequisites
- Basic Python programming skills
- Basic statistics knowledge
- Familiarity with pandas and NumPy
- Basic understanding of plotting with matplotlib
Dataset Details
- Daily Minimum Temperatures Dataset: Link
- Size: ~20KB
- 10 years of daily temperatures from Melbourne (1981–1990)
- Columns: Date, Temp (daily minimum temperature in °C)
Tools Required
# Core libraries
pip install pandas numpy matplotlib
# Statistical modeling
pip install statsmodels
# Evaluation metrics (used in step 4)
pip install scikit-learn
Project Structure
temperature_forecasting/
│
├── data/
│ └── daily-min-temperatures.csv
│
├── src/
│ ├── data_processing.py
│ ├── model.py
│ └── evaluation.py
│
└── notebooks/
├── data_exploration.ipynb
├── model_implementation.ipynb
└── model_evaluation.ipynb
Steps and Tasks
1. Data Acquisition and Exploration
Load the dataset with the Date column parsed as the index, then plot the full series to inspect its trend and seasonality.
import pandas as pd
import matplotlib.pyplot as plt
# Load data (adjust the path if the CSV lives elsewhere, e.g. in data/)
df = pd.read_csv('daily-min-temperatures.csv', parse_dates=['Date'], index_col='Date')
# Plot time series
plt.figure(figsize=(12, 6))
plt.plot(df['Temp'])
plt.title('Daily Minimum Temperatures in Melbourne')
plt.xlabel('Date')
plt.ylabel('Temperature (°C)')
plt.show()
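Before modeling, it can help to check whether the daily index has gaps, since many time series tools assume evenly spaced observations. The sketch below is a minimal illustration: `find_missing_dates` is a hypothetical helper, not part of pandas, and the four-day sample is synthetic rather than taken from the CSV.

```python
import pandas as pd


def find_missing_dates(series: pd.Series) -> pd.DatetimeIndex:
    """Return the calendar days between the first and last timestamp
    that do not appear in the series index."""
    full_range = pd.date_range(series.index.min(), series.index.max(), freq="D")
    return full_range.difference(series.index)


# Synthetic four-day sample with 1981-01-03 deliberately left out
dates = pd.to_datetime(["1981-01-01", "1981-01-02", "1981-01-04", "1981-01-05"])
demo = pd.Series([20.7, 17.9, 14.6, 15.8], index=dates, name="Temp")

missing = find_missing_dates(demo)
print(f"Missing days: {len(missing)}")
print(f"Missing values: {demo.isna().sum()}")
```

On the real data, the same check would be `find_missing_dates(df['Temp'])`, assuming `df` was loaded as shown above.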
2. Basic Statistical Model Implementation
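Compute a 365-day moving average, a simple smoother that averages out the annual cycle and exposes the long-term level of the series. The first 364 values are NaN because the window is not yet full.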
# Calculate moving average
df['Moving_Average'] = df['Temp'].rolling(window=365).mean()
# Plot moving average
plt.figure(figsize=(12, 6))
plt.plot(df['Temp'], label='Original')
plt.plot(df['Moving_Average'], label='365-Day Moving Average', color='red')
plt.legend()
plt.show()
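The rolling mean above includes the current day in each window, so it describes the past rather than predicting the future. One way to turn it into a one-step-ahead forecast is to shift it by one day, so that the prediction for day t uses only data up to day t-1. The following is a minimal sketch on synthetic numbers; `moving_average_forecast` is a hypothetical helper name, and the window of 3 is chosen only to keep the example readable.

```python
import pandas as pd


def moving_average_forecast(series: pd.Series, window: int) -> pd.Series:
    """One-step-ahead forecast: the prediction for day t is the mean of
    the `window` observations that end on day t-1."""
    return series.rolling(window=window).mean().shift(1)


# Synthetic series, not the temperature data
demo = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
forecast = moving_average_forecast(demo, window=3)
print(forecast)
```

The first `window` entries of the result are NaN, because no complete window of earlier observations exists for them.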
3. Forecasting with ARIMA
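Fit an ARIMA model on 1981–1989 and forecast every day of 1990. In order=(5,1,0), the three values are the number of autoregressive lags (p=5), the degree of differencing (d=1), and the number of moving-average terms (q=0).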
from statsmodels.tsa.arima.model import ARIMA
# Split data: train on 1981-1989, hold out 1990 for testing
train = df.loc[:'1989', 'Temp']
test = df.loc['1990':, 'Temp']
# Fit ARIMA model
model = ARIMA(train, order=(5,1,0))
model_fit = model.fit()
# Forecast
forecast = model_fit.forecast(steps=len(test))
# Plot forecast vs actual
plt.figure(figsize=(12, 6))
plt.plot(test.index, test, label='Actual')
plt.plot(test.index, forecast, label='Forecast', color='red')
plt.legend()
plt.show()
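The middle value of the ARIMA order, d=1, means the model is fitted to day-to-day changes rather than to the temperatures themselves. The sketch below illustrates that differencing step with plain pandas, and shows that cumulative sums of the differences recover the original values. The five temperatures are illustrative sample values; statsmodels performs this step internally, so this is only a way to see what the parameter does.

```python
import numpy as np
import pandas as pd

# Illustrative temperatures in °C
temps = pd.Series([20.7, 17.9, 18.8, 14.6, 15.8])

# First difference: change from one day to the next (what d=1 models)
diffs = temps.diff().dropna()

# Invert the differencing: start value plus the running sum of changes
rebuilt = temps.iloc[0] + diffs.cumsum()

print(diffs.round(1).tolist())
print(rebuilt.round(1).tolist())
```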
4. Model Evaluation
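Measure forecast accuracy on the 1990 hold-out set with the mean absolute error (MAE), the average absolute difference between forecast and actual temperatures in °C.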
from sklearn.metrics import mean_absolute_error
mae = mean_absolute_error(test, forecast)
print(f'Mean Absolute Error: {mae:.2f}°C')
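An MAE figure is easier to interpret next to a second metric and a simple baseline. The sketch below computes MAE and RMSE with NumPy and compares a forecast against a naive baseline that repeats the last training value. All numbers are made up for illustration, and `mae`, `rmse`, and `last_train_value` are hypothetical names rather than part of any library.

```python
import numpy as np


def mae(actual, predicted) -> float:
    """Mean absolute error: average size of the errors."""
    errors = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(errors)))


def rmse(actual, predicted) -> float:
    """Root mean squared error: penalizes large errors more than MAE."""
    errors = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(errors ** 2)))


# Made-up values in °C, not results from the temperature dataset
actual = np.array([10.0, 12.0, 14.0])
predicted = np.array([11.0, 12.0, 12.0])

# Naive baseline: repeat the last training observation for every step
last_train_value = 9.0
baseline = np.full(actual.shape, last_train_value)

model_mae = mae(actual, predicted)
model_rmse = rmse(actual, predicted)
baseline_mae = mae(actual, baseline)

print(f"Model MAE: {model_mae:.2f}°C, RMSE: {model_rmse:.2f}°C")
print(f"Baseline MAE: {baseline_mae:.2f}°C")
```

With the project's variables, the equivalent calls would be `mae(test, forecast)` and `mae(test, np.full(len(test), train.iloc[-1]))`. A model that does not beat the baseline is a sign that its order or training setup needs another look.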