Welcome to our deep dive into binary image classification using PyTorch! In this tutorial, we’ll explore how to build and train a convolutional neural network (CNN) to distinguish between images of dogs and cats. We’ll adopt a transfer learning approach, leveraging a pre-trained network to boost our model’s performance. Our dataset for this adventure is the freely available ‘Dogs vs. Cats’ dataset from Kaggle.
Setting Up the Environment
Before we begin, ensure you have PyTorch installed. If not, you can install it via pip:
pip install torch torchvision
Also, download the ‘Dogs vs. Cats’ dataset from Kaggle and extract it into a folder named data.
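The loading code later in this tutorial reads data/train and data/val with one subfolder per class. Here is a minimal sketch of how a flat folder of images could be reorganized into that layout. It assumes the extracted files are named like cat.0.jpg and dog.0.jpg; the function name split_flat_folder, the 20% validation fraction, and the kaggle_raw/train path in the usage comment are hypothetical choices for illustration, not part of the dataset or of PyTorch.

```python
import random
import shutil
from pathlib import Path


def split_flat_folder(src_dir, dst_dir, val_fraction=0.2, seed=0):
    """Copy files named '<class>.<id>.jpg' into dst_dir/{train,val}/<class>/.

    Returns a dict mapping (split, class) to the number of files copied.
    """
    src, dst = Path(src_dir), Path(dst_dir)

    # Group files by the class prefix in the file name (assumed naming scheme)
    by_class = {}
    for path in sorted(src.glob("*.jpg")):
        label = path.name.split(".")[0]
        by_class.setdefault(label, []).append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for label, paths in by_class.items():
        rng.shuffle(paths)
        n_val = int(len(paths) * val_fraction)
        for i, path in enumerate(paths):
            split = "val" if i < n_val else "train"
            target = dst / split / label
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target / path.name)
            counts[(split, label)] = counts.get((split, label), 0) + 1
    return counts


# Example usage (hypothetical source path):
# split_flat_folder("kaggle_raw/train", "data")
```

Keeping the raw download in a separate folder from data avoids mixing the flat files with the class subfolders.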
Data Preparation
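First, let’s prepare our dataset. The code below expects the images to be already split into training and validation sets under data/train and data/val, each with one subfolder per class (for example cat and dog), which is the layout ImageFolder reads. We then apply the preprocessing the pre-trained network expects: resize, center crop to 224×224, and normalization with the ImageNet channel means and standard deviations.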
import os
import torch
from torchvision import datasets, transforms
# Define transformations
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
# Load datasets
train_dataset = datasets.ImageFolder(root='data/train', transform=transform)
val_dataset = datasets.ImageFolder(root='data/val', transform=transform)
# Data loaders
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=32, shuffle=False)
Transfer Learning with a Pre-trained Model
We’ll use a pre-trained model from torchvision’s models. Let’s choose ResNet-18 for its balance between performance and computational efficiency.
from torchvision import models
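# Load ResNet-18 with ImageNet weights
# (torchvision >= 0.13; older releases use pretrained=True instead)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
# Freeze parameters so only the new final layer is trained
for param in model.parameters():
    param.requires_grad = False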
# Modify the final layer for binary classification
num_ftrs = model.fc.in_features
model.fc = torch.nn.Linear(num_ftrs, 2)
Training the Model
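Now, let’s train our model. We’ll use the cross-entropy loss, the Adam optimizer applied only to the new final layer, and a step scheduler that multiplies the learning rate by 0.1 every 7 epochs.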
import torch.optim as optim
from torch.optim import lr_scheduler
criterion = torch.nn.CrossEntropyLoss()
optimizer = optim.Adam(model.fc.parameters(), lr=0.001)
scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)
def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    for epoch in range(num_epochs):
        # Training phase
        model.train()
        running_loss = 0.0
        running_corrects = 0
        for inputs, labels in train_loader:
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            # Accumulate the summed loss and the number of correct predictions
            running_loss += loss.item() * inputs.size(0)
            _, preds = torch.max(outputs, 1)
            running_corrects += torch.sum(preds == labels)
        # Step the learning-rate schedule once per epoch
        scheduler.step()
        epoch_loss = running_loss / len(train_dataset)
        epoch_acc = running_corrects.double() / len(train_dataset)
        print(f'Epoch {epoch}/{num_epochs - 1}, Loss: {epoch_loss:.4f}, Acc: {epoch_acc:.4f}')
# Train the model
train_model(model, criterion, optimizer, scheduler)
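To see what the scheduler does over the 25 epochs, here is a small standalone sketch that records the learning rate at the start of each epoch. With step_size=7 and gamma=0.1, the rate should follow 0.001 × 0.1^(epoch // 7). The dummy parameter and the demo_ names are assumptions for illustration; no model or data is involved.

```python
import torch
from torch.optim import lr_scheduler

# A single dummy parameter stands in for model.fc.parameters()
dummy_param = torch.nn.Parameter(torch.zeros(1))
demo_optimizer = torch.optim.Adam([dummy_param], lr=0.001)
demo_scheduler = lr_scheduler.StepLR(demo_optimizer, step_size=7, gamma=0.1)

lrs = []
for epoch in range(25):
    # Learning rate in effect during this epoch
    lrs.append(demo_optimizer.param_groups[0]["lr"])
    demo_optimizer.step()   # no gradients were computed, so nothing is updated
    demo_scheduler.step()   # advance the schedule once per epoch

for epoch in (0, 6, 7, 14, 21):
    print(epoch, f"{lrs[epoch]:.0e}")
```

The rate drops at epochs 7, 14, and 21, so the last few epochs run with a learning rate a thousand times smaller than the first.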
Evaluating the Model
After training, let’s evaluate the model on the validation set.
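def evaluate_model(model, val_loader):
    # Switch to evaluation mode (fixes batch-norm statistics)
    model.eval()
    running_corrects = 0
    # No gradients are needed for evaluation
    with torch.no_grad():
        for inputs, labels in val_loader:
            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)
            running_corrects += torch.sum(preds == labels)
    # Divide by the size of the dataset behind the loader that was passed in
    acc = running_corrects.double() / len(val_loader.dataset)
    print(f'Validation Acc: {acc:.4f}')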
# Evaluate the model
evaluate_model(model, val_loader)
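Overall accuracy can hide a model that does well on one class and poorly on the other. As a rough sketch, per-class accuracy can be computed from the same predictions and labels; the helper name per_class_accuracy and the synthetic tensors below are assumptions standing in for values collected over val_loader.

```python
import torch


def per_class_accuracy(preds, labels, num_classes=2):
    """Return the fraction of correct predictions for each class index."""
    accs = []
    for c in range(num_classes):
        mask = labels == c                      # samples whose true class is c
        n = mask.sum().item()
        correct = (preds[mask] == c).sum().item()
        accs.append(correct / n if n else float("nan"))
    return accs


# Synthetic example: 4 samples of class 0 followed by 4 samples of class 1
demo_labels = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1])
demo_preds = torch.tensor([0, 0, 0, 1, 1, 1, 0, 0])
print(per_class_accuracy(demo_preds, demo_labels))  # [0.75, 0.5]
```

In this synthetic case the overall accuracy is 5/8, while the per-class view shows the second class is right only half the time.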
Inference on New Images
Finally, let’s use our trained model to classify new images.
from PIL import Image
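def predict_image(image_path, model, transform):
    # Convert to RGB so grayscale or RGBA files match the 3-channel input
    image = Image.open(image_path).convert('RGB')
    # Add a batch dimension: (3, 224, 224) -> (1, 3, 224, 224)
    image = transform(image).unsqueeze(0)
    model.eval()
    with torch.no_grad():
        outputs = model(image)
        _, preds = torch.max(outputs, 1)
    # ImageFolder assigns indices alphabetically by folder name,
    # so this assumes the cat folder is index 0 and the dog folder is index 1
    return 'Cat' if preds[0] == 0 else 'Dog'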
# Example usage
image_path = 'path_to_your_image.jpg'
print(predict_image(image_path, model, transform))
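The function above returns only a label. If you also want a confidence score, the two output logits can be turned into probabilities with a softmax. This is a minimal sketch on a hand-written logit tensor; the helper name logits_to_prediction is hypothetical, and the class order again assumes index 0 is cat and index 1 is dog.

```python
import torch


def logits_to_prediction(logits, class_names=("Cat", "Dog")):
    """Map a (1, 2) logit tensor to a (label, probability) pair."""
    probs = torch.softmax(logits, dim=1)       # each row sums to 1
    conf, idx = torch.max(probs, dim=1)
    return class_names[idx.item()], conf.item()


# Hand-written logits standing in for model(image)
label, confidence = logits_to_prediction(torch.tensor([[2.0, 0.0]]))
print(label, round(confidence, 3))  # Cat 0.881
```

Keep in mind that a softmax score is not a calibrated probability, so treat it as a rough indication rather than a guarantee.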
Congratulations! You’ve successfully trained a convolutional neural network for binary image classification using PyTorch and transfer learning. This tutorial is just the beginning of your journey in deep learning. Explore further by tweaking the model, experimenting with different pre-trained networks, and adjusting hyperparameters.