Creating AI-Generated Art with Style Transfer
Objective
Develop an application that generates artistic images by applying the style of one image to the content of another using Neural Style Transfer. This project focuses on utilizing existing models to create AI-generated art, exploring the creative applications of generative AI without building models from scratch.
Learning Outcomes
By completing this project, you will:
- Understand the concept of Neural Style Transfer.
- Learn to use pre-trained models for image processing tasks.
- Develop skills in image manipulation and visualization.
- Build an application or tool that allows users to generate art.
- Explore the intersection of AI and creativity.
Prerequisites and Theoretical Foundations
1. Basic Python Programming
- Libraries: Familiarity with PyTorch (used throughout this project) or a comparable framework such as TensorFlow.
- Image Processing: Understanding of image arrays and transformations.
2. Understanding of Convolutional Neural Networks (CNNs)
- Pre-trained Models: Knowledge of models like VGG19.
- Feature Extraction: Using CNNs to extract content and style representations.
3. Experience with Jupyter Notebooks or IDEs
- Visualization: Ability to display images and results within notebooks.
Tools Required
- Programming Language: Python 3.7+
- Libraries:
- PyTorch: pip install torch torchvision
- Matplotlib: For visualization (pip install matplotlib)
- Pillow: For image handling (pip install pillow)
- Environment:
- Jupyter Notebook or any Python IDE.
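Before starting, it can help to confirm that the libraries listed above are importable in the active environment. The following is a minimal sketch using only the standard library; the helper name missing_packages and the REQUIRED list are illustrative assumptions, not part of the project files. Note that Pillow is imported under the name PIL.

```python
import importlib.util

# Import names of the packages this project relies on (Pillow imports as "PIL")
REQUIRED = ["torch", "torchvision", "matplotlib", "PIL"]

def missing_packages(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages found.")
```

If anything is reported missing, install it with the corresponding pip command from the list above.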
Project Structure
neural_style_transfer/
│
├── style_transfer.py
├── content_image.jpg
├── style_image.jpg
└── output/
    └── generated_image.jpg
Steps and Tasks
1. Setting Up the Environment
Tasks:
- Install Required Libraries.
- Prepare Content and Style Images.
Implementation:
# Imports
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.transforms as transforms
import torchvision.models as models
from PIL import Image
import matplotlib.pyplot as plt
2. Loading and Preprocessing Images
Tasks:
- Define Image Transformations.
- Load Images and Apply Transforms.
- Ensure Images are the Same Size.
Implementation:
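# Image loading function
def load_image(img_path, max_size=400, shape=None):
    """Load an image as a 1 x 3 x H x W float tensor with values in [0, 1]."""
    image = Image.open(img_path).convert('RGB')
    if shape is not None:
        # Force an exact (height, width), e.g. to match the content image
        new_size = tuple(shape)
    else:
        # Cap the longer edge at max_size while preserving the aspect ratio
        width, height = image.size
        scale = min(1.0, max_size / max(width, height))
        new_size = (max(1, round(height * scale)), max(1, round(width * scale)))
    transform = transforms.Compose([
        transforms.Resize(new_size),
        transforms.ToTensor(),
    ])
    # Add a batch dimension
    image = transform(image).unsqueeze(0)
    return image

content_img = load_image('content_image.jpg')
# Resize the style image to the content image's height and width
style_img = load_image('style_image.jpg', shape=content_img.shape[-2:])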
3. Defining the Model and Layers
Tasks:
- Use a Pre-trained CNN (e.g., VGG19).
- Identify Layers for Content and Style Extraction.
- Build the Model for Style Transfer.
Implementation:
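# Load pre-trained VGG19 model (convolutional feature extractor only)
# torchvision >= 0.13 uses the `weights` argument; on older versions use pretrained=True
vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features
# Freeze model parameters
for param in vgg.parameters():
    param.requires_grad_(False)
# Move model to device and switch to evaluation mode
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
vgg.to(device).eval()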
# Define content and style layers
content_layers = ['conv4_2']
style_layers = ['conv1_1', 'conv2_1', 'conv3_1', 'conv4_1', 'conv5_1']
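The layer names above follow the common block_position naming for VGG (conv4_2 is the second convolution in the fourth block), whereas torchvision exposes vgg19().features as a sequence indexed by integer position. The sketch below derives the index for each name from the standard VGG19 layer configuration, assuming the non-batch-norm variant in which every convolution is followed by a ReLU and each block ends with a max-pool. The names VGG19_CFG and vgg19_conv_names are illustrative, not torchvision APIs.

```python
# Standard VGG19 configuration: numbers are conv output channels, "M" is a max-pool
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]

def vgg19_conv_names(cfg=VGG19_CFG):
    """Map each conv layer's position in the Sequential to a name like 'conv4_2'."""
    names = {}
    index = 0          # position inside the Sequential
    block, conv = 1, 1
    for entry in cfg:
        if entry == "M":
            index += 1             # the pooling layer occupies one slot
            block, conv = block + 1, 1
        else:
            names[str(index)] = f"conv{block}_{conv}"
            index += 2             # one slot for the conv, one for its ReLU
            conv += 1
    return names

layer_map = vgg19_conv_names()
wanted = ['conv4_2', 'conv1_1', 'conv2_1', 'conv3_1', 'conv4_1', 'conv5_1']
selected = {idx: name for idx, name in layer_map.items() if name in wanted}
print(selected)
# {'0': 'conv1_1', '5': 'conv2_1', '10': 'conv3_1', '19': 'conv4_1', '21': 'conv4_2', '28': 'conv5_1'}
```

The string keys match the keys of model._modules, so a dictionary like selected can be used to translate positions into layer names during feature extraction.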
4. Extracting Features
Tasks:
- Define Functions to Get Content and Style Features.
- Compute Gram Matrix for Style Representation.
Implementation:
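# Map positions in torchvision's VGG19 `features` to conventional layer names
layer_names = {'0': 'conv1_1', '5': 'conv2_1', '10': 'conv3_1',
               '19': 'conv4_1', '21': 'conv4_2', '28': 'conv5_1'}

def get_features(image, model, layers):
    features = {}
    x = image.to(device)
    for index, layer in model._modules.items():
        x = layer(x)
        layer_name = layer_names.get(index)
        if layer_name in layers:
            features[layer_name] = x
    return features

def gram_matrix(tensor):
    # Flatten each of the d feature maps, then take all pairwise inner products
    # (assumes a batch size of 1)
    _, d, h, w = tensor.size()
    tensor = tensor.view(d, h * w)
    gram = torch.mm(tensor, tensor.t())
    return gram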
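To see why the Gram matrix works as a style representation, a small worked example may help. The sketch below uses plain Python lists instead of tensors (an assumption made to keep it dependency-free): each inner list stands for one flattened feature map. Because every entry is an inner product summed over all spatial positions, rearranging the positions leaves the result unchanged, so the matrix records which features co-occur rather than where they occur.

```python
def gram(features):
    """features: list of channels, each a flat list of activations."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in features]
            for ci in features]

# Two channels, three spatial positions
features = [[1, 2, 3],
            [4, 5, 6]]
# The same activations with the spatial positions permuted identically in both channels
shuffled = [[3, 1, 2],
            [6, 4, 5]]

print(gram(features))                     # [[14, 32], [32, 77]]
print(gram(features) == gram(shuffled))   # True
```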
5. Defining Loss Functions and Optimizer
Tasks:
- Compute Content Loss.
- Compute Style Loss.
- Set Up Optimizer for the Generated Image.
Implementation:
# Get features
content_features = get_features(content_img, vgg, content_layers + style_layers)
style_features = get_features(style_img, vgg, content_layers + style_layers)
# Calculate gram matrices for style features
style_grams = {layer: gram_matrix(style_features[layer]) for layer in style_layers}
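# Create a target image: move it to the device first, then enable gradients,
# so it remains a leaf tensor that the optimizer can update
target = content_img.clone().to(device).requires_grad_(True)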
# Define optimizer
optimizer = optim.Adam([target], lr=0.003)
6. Running the Style Transfer
Tasks:
- Iteratively Update the Target Image.
- Compute Total Loss (content + style).
- Display and Save the Output Image.
Implementation:
style_weights = {'conv1_1': 1.0, 'conv2_1': 0.75,
                 'conv3_1': 0.5, 'conv4_1': 0.25,
                 'conv5_1': 0.1}
content_weight = 1  # alpha
style_weight = 1e6  # beta
for step in range(1, 1001):
    # Features of the current target image
    target_features = get_features(target, vgg, content_layers + style_layers)
    # Content loss: mean squared difference at the content layer
    content_loss = torch.mean((target_features['conv4_2'] - content_features['conv4_2']) ** 2)
    # Style loss: weighted mean squared difference between Gram matrices
    style_loss = 0
    for layer in style_layers:
        target_gram = gram_matrix(target_features[layer])
        style_gram = style_grams[layer]
        layer_style_loss = style_weights[layer] * torch.mean((target_gram - style_gram) ** 2)
        style_loss += layer_style_loss
    total_loss = content_weight * content_loss + style_weight * style_loss
    # Update the target image
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
    if step % 100 == 0:
        print(f"Step {step}, Total Loss: {total_loss.item()}")
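The objective minimized by the loop above can be written compactly. The notation here is introduced for illustration and does not appear in the code: t, c and s denote the target, content and style images, F^l(x) is the layer-l feature map of image x flattened to a d × (h·w) matrix, w_l are the entries of style_weights, and "mean" averages over all elements.

```latex
\begin{aligned}
G^{l}(x) &= F^{l}(x)\,F^{l}(x)^{\top} \\
\mathcal{L}_{\text{content}} &= \operatorname{mean}\!\left[\left(F^{\text{conv4\_2}}(t) - F^{\text{conv4\_2}}(c)\right)^{2}\right] \\
\mathcal{L}_{\text{style}} &= \sum_{l \,\in\, \text{style layers}} w_{l}\,\operatorname{mean}\!\left[\left(G^{l}(t) - G^{l}(s)\right)^{2}\right] \\
\mathcal{L}_{\text{total}} &= \alpha\,\mathcal{L}_{\text{content}} + \beta\,\mathcal{L}_{\text{style}},
\qquad \alpha = 1,\quad \beta = 10^{6}
\end{aligned}
```

Raising β relative to α pushes the result toward the style image's textures, while lowering it preserves more of the content image's structure.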
7. Displaying and Saving the Result
Tasks:
- Convert Tensor to Image.
- Display Using Matplotlib.
- Save the Generated Image.
Implementation:
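import os

# Convert a tensor to a PIL image
def im_convert(tensor):
    image = tensor.to('cpu').clone().detach()
    image = image.squeeze(0)
    # Optimization can push pixel values outside [0, 1]; clamp before conversion
    image = image.clamp(0, 1)
    image = transforms.ToPILImage()(image)
    return image

final_image = im_convert(target)
# Make sure the output directory exists before saving
os.makedirs('output', exist_ok=True)
final_image.save('output/generated_image.jpg')
plt.imshow(final_image)
plt.axis('off')
plt.show()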
Further Enhancements
- Build a User Interface:
- Create a web app using Streamlit or Gradio for users to upload images.
- Experiment with Different Models:
- Use alternative architectures or pre-trained models for different styles.
- Optimize Performance:
- Implement techniques to reduce computation time.
- Explore Real-Time Style Transfer:
- Investigate methods for faster inference suitable for video processing.
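As one possible way to reduce computation time, the optimization can stop once the loss has stopped improving instead of always running a fixed number of iterations. The following is a sketch under stated assumptions, not a tuned recipe: the function name should_stop and the parameters patience and min_rel_improvement are hypothetical and not part of the project code.

```python
def should_stop(losses, patience=3, min_rel_improvement=1e-3):
    """Return True when the best of the last `patience` recorded losses improves on
    the best earlier loss by less than `min_rel_improvement` (relative)."""
    if len(losses) <= patience:
        return False  # not enough history yet
    best_before = min(losses[:-patience])
    recent_best = min(losses[-patience:])
    return (best_before - recent_best) < min_rel_improvement * abs(best_before)

# Loss still dropping quickly: keep going
print(should_stop([100, 50, 25, 12, 6, 3]))               # False
# Loss has plateaued over the last three checks: stop
print(should_stop([100, 50, 25, 24.99, 24.99, 24.99]))    # True
```

In the training loop, the recorded values could be total_loss.item() appended at each logging interval, followed by a break when should_stop returns True.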
Conclusion
In this project, you have:
- Applied neural style transfer to generate artistic images.
- Used pre-trained models to leverage existing technology.
- Developed an application that bridges AI and creativity.
- Explored the practical usage of generative AI without extensive model training.
This project showcases how generative AI technologies can be used in creative applications, focusing on usage and user experience.