🟢 Image Classification with CNNs: A Foundational Project

stemaway · October 27, 2024, 1:33pm

Objective

The main goal of this project is to build a basic image classification model using Convolutional Neural Networks (CNNs). You’ll work with a standard dataset to understand the fundamentals of image processing, CNN architecture, and model evaluation. This project serves as a stepping stone to more advanced computer vision projects.

Learning Outcomes

By completing this project, you will:

Understand the basics of image classification and its applications
Learn how to preprocess and augment image data
Build and train a simple CNN from scratch using Keras and TensorFlow
Implement data visualization techniques for image data
Evaluate model performance using accuracy and loss metrics
Gain foundational skills to progress to intermediate-level computer vision projects

Prerequisites and Theoretical Foundations

1. Python Programming Basics

Variables, data types, and control structures
Functions and modules
Basic object-oriented programming

Python prerequisites code examples:

# Variables and data types
number = 10
text = "Hello, World!"

# Control structures
for i in range(5):
    print(f"Iteration {i}")

# Functions
def add(a, b):
    return a + b

# Classes
class Greeting:
    def __init__(self, name):
        self.name = name
    
    def say_hello(self):
        return f"Hello, {self.name}!"

2. Basic Mathematics

Algebra (equations, functions)
Basic understanding of matrices and vectors
Elementary statistics (mean, standard deviation)

Mathematical concepts with code:

import numpy as np

# Mean and standard deviation
data = [1, 2, 3, 4, 5]
mean = np.mean(data)
std_dev = np.std(data)

# Matrix operations
matrix_a = np.array([[1, 2], [3, 4]])
matrix_b = np.array([[5, 6], [7, 8]])
matrix_sum = matrix_a + matrix_b
matrix_product = np.dot(matrix_a, matrix_b)

3. Introduction to Machine Learning

Understanding supervised learning
Concepts of training and testing datasets
Basic idea of model fitting and prediction

ML concepts with code:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Sample data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([5, 7, 9, 11, 13])

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Model training
model = LinearRegression()
model.fit(X_train, y_train)

# Prediction
predictions = model.predict(X_test)

4. Basic Concepts in Deep Learning

Understanding neural networks
Activation functions (ReLU, sigmoid)
Concept of epochs, batches, and iterations
Loss functions and optimization

Deep learning concepts:

Neural Networks Basics
- Neuron: Basic unit that computes a weighted sum of its inputs and applies an activation function.
- Layers: Input, hidden, and output layers.
- Forward Pass: Calculating outputs from inputs through the network.

Activation Functions
- ReLU (Rectified Linear Unit): \( f(x) = \max(0, x) \)
- Sigmoid: \( f(x) = \frac{1}{1 + e^{-x}} \)

Training Process
- Loss Function: Measures the difference between predicted and actual values.
  - Example: Mean Squared Error (MSE)
- Optimizer: Algorithm to adjust weights to minimize loss.
  - Example: Stochastic Gradient Descent (SGD)

Key Terms
- Epoch: One complete pass through the training dataset.
- Batch Size: Number of samples processed before updating the model.
- Iteration: One update of the model’s parameters, performed after each batch.
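
The definitions above can be sketched in a few lines of NumPy. This is only an illustration of the formulas, not how Keras implements them internally; the sample inputs and the 60,000-sample, batch-size-128 figures are assumptions chosen for the example.

```python
import math
import numpy as np

def relu(x):
    """ReLU: element-wise max(0, x)."""
    return np.maximum(0, x)

def sigmoid(x):
    """Sigmoid: squashes any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

x = np.array([-2.0, 0.0, 3.0])
relu_out = relu(x)        # negative inputs become 0
sigmoid_out = sigmoid(x)  # sigmoid(0) is 0.5

loss = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # (0 + 0 + 4) / 3

# Iterations per epoch = number of batches needed to cover the dataset once
num_samples, batch_size = 60_000, 128  # assumed values for illustration
iterations_per_epoch = math.ceil(num_samples / batch_size)

print(relu_out, sigmoid_out, loss, iterations_per_epoch)
```

With these assumed numbers, one epoch takes 469 iterations because the last batch is only partially filled.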

Skills Gained

Building and training basic convolutional neural networks
Preprocessing and augmenting image data
Using TensorFlow and Keras for deep learning tasks
Evaluating model performance using standard metrics
Visualizing images and training results using Matplotlib
Preparing for more advanced computer vision projects

Tools Required

Python 3.x
Jupyter Notebook or an IDE like Visual Studio Code

Libraries:

pip install "tensorflow==2.*"
pip install matplotlib
pip install numpy
pip install pandas

Steps and Tasks

1. Data Acquisition and Exploration

We’ll use the MNIST dataset, a collection of 70,000 grayscale images (28×28 pixels each) of handwritten digits (0-9), split into 60,000 training images and 10,000 test images.

Load the Dataset:

from tensorflow.keras.datasets import mnist

# Load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Check the shape
print(f"Training data shape: {X_train.shape}, Training labels shape: {y_train.shape}")
print(f"Test data shape: {X_test.shape}, Test labels shape: {y_test.shape}")

Data Exploration:

Data exploration code:

import matplotlib.pyplot as plt

# Visualize some examples from the dataset
def plot_sample_images(X, y, num_samples=5):
    plt.figure(figsize=(10, 2))
    for i in range(num_samples):
        plt.subplot(1, num_samples, i+1)
        plt.imshow(X[i], cmap='gray')
        plt.title(f"Label: {y[i]}")
        plt.axis('off')
    plt.show()

plot_sample_images(X_train, y_train)
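
Beyond looking at a few images, it can be useful to check how evenly the classes are represented. The sketch below counts samples per digit with `np.bincount`. The `labels` array here is a randomly generated, hypothetical stand-in so the snippet runs on its own; with the real data you would pass the integer `y_train` (before one-hot encoding) instead.

```python
import numpy as np

# Hypothetical stand-in labels; with the real data, use y_train instead
rng = np.random.default_rng(seed=0)
labels = rng.integers(low=0, high=10, size=1000)

# Count how many samples belong to each digit class (0-9)
class_counts = np.bincount(labels, minlength=10)

for digit, count in enumerate(class_counts):
    print(f"Digit {digit}: {count} samples ({count / len(labels):.1%})")
```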

2. Data Preprocessing

Prepare the data for training.

Tasks:

Normalize the pixel values
Reshape the data to include the channel dimension
Convert labels to one-hot encoding

Implementation:

import numpy as np
from tensorflow.keras.utils import to_categorical

# Normalize pixel values
X_train = X_train / 255.0
X_test = X_test / 255.0

# Reshape data
X_train = X_train.reshape(-1, 28, 28, 1)  # -1 lets NumPy infer the number of samples
X_test = X_test.reshape(-1, 28, 28, 1)

# One-hot encode labels
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

print(f"Updated training data shape: {X_train.shape}")
print(f"Updated training labels shape: {y_train.shape}")
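
To see what each preprocessing step does to the arrays, here is a small NumPy-only sketch on a made-up mini-batch. The `images` and `labels` values are hypothetical, and the identity-matrix indexing is an illustrative way to get the same kind of one-hot rows that `to_categorical` produces for integer labels.

```python
import numpy as np

# Hypothetical mini-batch: 4 grayscale "images" of 28x28 with uint8 pixels
rng = np.random.default_rng(seed=0)
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
labels = np.array([3, 0, 9, 3])

# Normalize: map pixel values from [0, 255] to [0.0, 1.0]
images_norm = images.astype("float32") / 255.0

# Reshape: add a trailing channel axis so each image is (28, 28, 1)
images_norm = images_norm.reshape(-1, 28, 28, 1)

# One-hot encode: row i of the identity matrix is the encoding of class i
one_hot = np.eye(10, dtype="float32")[labels]

print(images_norm.shape, images_norm.min(), images_norm.max())
print(one_hot[0])  # label 3: 1.0 at index 3, 0.0 elsewhere
```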

3. Building the CNN Model

Create a simple CNN model using Keras.

Tasks:

Define the model architecture
Compile the model with an appropriate loss function and optimizer
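
Before defining the architecture, it may help to see what a convolution and a max-pooling step compute on a tiny array. The following is a deliberately slow, loop-based NumPy sketch for a single channel and a single filter; the helper names `conv2d_valid` and `max_pool2d` and the toy image and kernel are hypothetical, and real Keras layers handle batches, multiple channels, and learned kernels.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding.

    Like Keras' Conv2D, this is cross-correlation: the kernel is not flipped.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fit are dropped."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

# Toy 6x6 "image" whose brightness grows by 1 per column and 6 per row
image = np.arange(36, dtype=float).reshape(6, 6)
# Simple filter that responds to left-to-right brightness increases
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

features = np.maximum(0, conv2d_valid(image, kernel))  # convolution + ReLU
pooled = max_pool2d(features)

print(features.shape)  # a 3x3 kernel shrinks 6x6 to 4x4
print(pooled.shape)    # 2x2 pooling halves that to 2x2
```

The same shape arithmetic applies to the 28×28 MNIST inputs: a 3×3 kernel without padding gives 26×26, and 2×2 pooling gives 13×13.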

Implementation:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Define the model
model = Sequential()

# Convolutional layer
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))

# Pooling layer
model.add(MaxPooling2D(pool_size=(2, 2)))

# Flatten layer
model.add(Flatten())

# Fully connected layer
model.add(Dense(128, activation='relu'))

# Output layer
model.add(Dense(10, activation='softmax'))

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Model summary
model.summary()
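
The output shapes and parameter counts for this architecture can be worked out by hand, which is a handy check on your understanding of each layer. The arithmetic below is a sketch that assumes the Keras defaults used above (no padding, stride 1, one bias per filter or unit); the variable names are illustrative.

```python
# Layer-by-layer output shapes and parameter counts for the model above

# Conv2D: 32 filters of 3x3 over 1 input channel, no padding, stride 1
conv_out = (28 - 3 + 1, 28 - 3 + 1, 32)              # (26, 26, 32)
conv_params = (3 * 3 * 1 + 1) * 32                   # weights + 1 bias per filter

# MaxPooling2D: 2x2 windows halve height and width; no trainable parameters
pool_out = (conv_out[0] // 2, conv_out[1] // 2, 32)  # (13, 13, 32)

# Flatten: collapse the feature maps into one vector
flat_size = pool_out[0] * pool_out[1] * pool_out[2]  # 5408

# Dense layers: (inputs + 1 bias) * units
dense1_params = (flat_size + 1) * 128
dense2_params = (128 + 1) * 10

total_params = conv_params + dense1_params + dense2_params
print(f"Total trainable parameters: {total_params:,}")
```

Almost all of the parameters sit in the first dense layer, which is one reason deeper CNNs add more convolution and pooling stages before flattening.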

4. Training the Model

Train the model on the training data.

Tasks:

Fit the model
Monitor training and validation accuracy and loss

Implementation:

# Train the model
history = model.fit(
    X_train, y_train,
    epochs=10,
    batch_size=128,
    validation_split=0.1
)
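
To connect these arguments to the epoch, batch, and iteration terminology, the sketch below works out how many samples are actually trained on and how many weight updates happen. It assumes the 60,000-sample MNIST training set and relies on the documented Keras behavior that `validation_split` holds out the last fraction of the data before shuffling.

```python
import math

num_train = 60_000       # size of the MNIST training set
validation_split = 0.1
batch_size = 128
epochs = 10

# The last 10% of the samples are held out for validation
num_val = int(num_train * validation_split)
num_fit = num_train - num_val

# One iteration (weight update) per batch; the final batch may be smaller
steps_per_epoch = math.ceil(num_fit / batch_size)
total_updates = steps_per_epoch * epochs

print(f"{num_fit} training samples, {num_val} validation samples")
print(f"{steps_per_epoch} iterations per epoch, {total_updates} in total")
```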

5. Evaluating the Model

Assess the model’s performance on the test data.

Implementation:

# Evaluate the model
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.4f}")
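
The two numbers returned by `model.evaluate` correspond to the loss and metric chosen in `model.compile`. As a rough illustration of what they measure, here is a NumPy sketch of categorical cross-entropy and accuracy on a hypothetical set of three predictions; the clipping constant `eps` is an assumption for numerical safety, and Keras' own implementation may differ in details.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -sum(y_true * log(y_prob)) over the samples."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

def accuracy(y_true, y_prob):
    """Fraction of samples whose highest-probability class matches the label."""
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_prob, axis=1)))

# Hypothetical one-hot labels and predicted probabilities (3 samples, 3 classes)
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.5, 0.2]])

loss = categorical_crossentropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)
print(f"loss={loss:.4f}, accuracy={acc:.4f}")
```

The third sample is misclassified and also receives a low probability for its true class, so it lowers the accuracy and dominates the loss.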

6. Visualizing Training Results

Plot the training and validation accuracy and loss over epochs.

Visualization code:

# Plot accuracy
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Val Accuracy')
plt.title('Accuracy over epochs')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

# Plot loss
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Val Loss')
plt.title('Loss over epochs')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.show()
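
The same curves can also be inspected numerically, for example to find the epoch where validation loss was lowest and whether it got worse afterwards (a common sign of overfitting). The values in `history_dict` below are hypothetical; with a real run you would read them from `history.history`.

```python
# Hypothetical per-epoch values; with a real run, use history.history instead
history_dict = {
    "loss":     [0.30, 0.10, 0.06, 0.04, 0.03],
    "val_loss": [0.12, 0.08, 0.07, 0.08, 0.09],
}

val_loss = history_dict["val_loss"]

# Epoch (1-based) with the lowest validation loss
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

# Validation loss rising after its minimum while training loss keeps
# falling is a typical sign of overfitting
worsened_after_best = (
    best_epoch < len(val_loss) and val_loss[-1] > val_loss[best_epoch - 1]
)

print(f"Lowest validation loss at epoch {best_epoch}")
print(f"Validation loss worsened afterwards: {worsened_after_best}")
```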

7. Making Predictions

Use the trained model to make predictions on data it has not been trained on (here, the test set).

Implementation:

# Predict on test data
predictions = model.predict(X_test)

# Convert predictions to class labels
predicted_classes = np.argmax(predictions, axis=1)
true_classes = np.argmax(y_test, axis=1)

# Display some predictions
def display_predictions(X, y_true, y_pred, num_samples=5):
    plt.figure(figsize=(10, 2))
    for i in range(num_samples):
        plt.subplot(1, num_samples, i+1)
        plt.imshow(X[i].reshape(28, 28), cmap='gray')
        plt.title(f"True: {y_true[i]}\nPred: {y_pred[i]}")
        plt.axis('off')
    plt.show()

display_predictions(X_test, true_classes, predicted_classes)
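
Looking at the first few predictions rarely shows mistakes, because most test digits are classified correctly. A confusion matrix and the indices of the misclassified samples are more informative. The sketch below uses NumPy only; the `confusion_matrix` helper and the demo arrays are hypothetical, and with real results you would pass `true_classes` and `predicted_classes`.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=10):
    """Entry [i, j] counts samples of true class i predicted as class j."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(matrix, (y_true, y_pred), 1)
    return matrix

# Hypothetical labels; with the real data, use true_classes and predicted_classes
y_true_demo = np.array([0, 1, 2, 2, 9, 9, 9])
y_pred_demo = np.array([0, 1, 2, 7, 9, 4, 9])

cm = confusion_matrix(y_true_demo, y_pred_demo)

# Indices of misclassified samples, useful for plotting the failures
wrong_idx = np.flatnonzero(y_true_demo != y_pred_demo)

print(f"Correct predictions (diagonal sum): {np.trace(cm)}")
print(f"Misclassified sample indices: {wrong_idx}")
```

The off-diagonal entries show which digits get confused with which, and `wrong_idx` can be used to select the images to display.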

8. Saving and Loading the Model

Learn how to save your trained model for future use.

Implementation:

# Save the model (HDF5 format; recent Keras versions recommend the native .keras format)
model.save('mnist_cnn_model.h5')

# Load the model
from tensorflow.keras.models import load_model
loaded_model = load_model('mnist_cnn_model.h5')

# Verify loaded model performance
loss, accuracy = loaded_model.evaluate(X_test, y_test)
print(f"Loaded model accuracy: {accuracy:.4f}")

9. Experimenting with Model Improvements

Explore ways to improve the model’s performance.

Suggestions:

Add more convolutional and pooling layers
Experiment with different activation functions
Adjust the number of epochs and batch size
Implement data augmentation using ImageDataGenerator

Data augmentation example:

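# Note: ImageDataGenerator is deprecated in recent TensorFlow releases
# in favor of Keras preprocessing layers, but it still works in TF 2.x.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Define data generator with augmentation
datagen = ImageDataGenerator(
    rotation_range=10,
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1
)

# datagen.fit() is only needed for featurewise statistics
# (e.g. featurewise_center), which are not used here, so it is omitted.

# Hold out the last 10% of the training data for validation
# so that the test set stays unseen until the final evaluation
num_val = len(X_train) // 10
X_tr, y_tr = X_train[:-num_val], y_train[:-num_val]
X_val, y_val = X_train[-num_val:], y_train[-num_val:]

# Continue training the existing model on augmented batches
history_aug = model.fit(
    datagen.flow(X_tr, y_tr, batch_size=128),
    epochs=10,
    validation_data=(X_val, y_val)
)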
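
To get a feel for what a shift augmentation does to a single image, here is a simplified NumPy stand-in. It is not how ImageDataGenerator works internally (that class takes shift ranges as fractions of the image size and uses its own fill mode); the `random_shift` helper, the whole-pixel shifts, and the zero fill are assumptions made for illustration.

```python
import numpy as np

def random_shift(image, max_shift, rng):
    """Shift a 2D image by up to `max_shift` pixels per axis, filling gaps with zeros."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    h, w = image.shape
    shifted = np.zeros_like(image)
    # Copy the region that stays inside the frame to its shifted position
    src = image[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    shifted[max(0, dy):max(0, dy) + src.shape[0],
            max(0, dx):max(0, dx) + src.shape[1]] = src
    return shifted

rng = np.random.default_rng(seed=0)
image = np.zeros((28, 28), dtype="float32")
image[10:18, 10:18] = 1.0  # toy "digit": a bright square in the middle

augmented = random_shift(image, max_shift=2, rng=rng)

print(augmented.shape, augmented.sum())
```

Because the square sits well inside the frame, a shift of at most 2 pixels moves it without cutting anything off, so the label stays valid while the pixel positions change.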

10. Conclusion

In this project, you built a simple image classification model using a convolutional neural network. You learned how to preprocess image data, define a CNN architecture, train and evaluate the model, and visualize the results. This foundational project provides the essential skills needed to progress to intermediate-level computer vision projects, where you’ll work with more complex datasets and advanced techniques.

Next Steps and Improvements

To further enhance your skills and prepare for intermediate projects, consider the following:

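Work with Different Datasets: Try other datasets like Fashion-MNIST (grayscale images of clothing items) or CIFAR-10 (32×32 color images of everyday objects). Both have 10 classes like MNIST but are noticeably harder to classify.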
Explore Advanced Architectures: Learn about deeper networks like VGG, ResNet, or MobileNet.
Implement Transfer Learning: Use pre-trained models and fine-tune them for your specific tasks.
Hyperparameter Tuning: Experiment with different optimizers, learning rates, and regularization techniques.
Model Deployment: Learn how to deploy your model using Flask or TensorFlow Serving.