
CNN Image Classification

Using Convolutional Neural Networks to classify images from the CIFAR-10 dataset (10 classes, 32x32 RGB).

What is a CNN?

A Convolutional Neural Network exploits the spatial structure of images by using convolutional layers to extract patterns like edges, textures, and shapes. Unlike fully connected networks, which flatten the image into a 1D vector before the first layer, CNNs preserve and learn from the 2D spatial relationships between pixels.
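To make the idea of a filter extracting an edge concrete, here is a minimal NumPy sketch of a single convolution pass. It is not how Keras implements Conv2D internally; the helper name `conv2d_valid`, the toy 5x5 image, and the hand-picked vertical-edge kernel are all assumptions made for illustration. In a real CNN the kernel values are learned during training rather than chosen by hand.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding.

    This is cross-correlation (no kernel flip), which is what deep
    learning frameworks conventionally call "convolution".
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # Elementwise product of the patch and the kernel, then sum
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Toy grayscale image: dark on the left (0), bright on the right (1)
image = np.array([[0, 0, 0, 1, 1]] * 5, dtype=float)

# Illustrative vertical-edge kernel: responds where brightness rises left to right
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

response = conv2d_valid(image, kernel)
print(response)  # each row is [0, 3, 3]: zero in the flat region, 3 where the edge is
```

The output is 3x3 because a 3x3 kernel fits into a 5x5 image at only 3 positions per axis. The same shrinkage applies to each Conv2D layer in the architecture described below.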

CNNs are the standard for image tasks: object recognition, medical imaging, face recognition, and satellite imagery.

CNN Architecture Components

Layer-by-Layer Breakdown
  1. Conv2D(32, 3x3, relu) -- 32 filters scan 3x3 patches, detect basic features (edges, curves)
  2. MaxPooling2D(2x2) -- Reduces spatial size by half, keeps strongest features
  3. Conv2D(64, 3x3, relu) -- 64 filters detect more complex patterns (combinations of edges)
  4. MaxPooling2D(2x2) -- Further reduces dimensions
  5. Conv2D(64, 3x3, relu) -- Learns even higher-level features
  6. Flatten() -- Converts 3D feature maps to 1D vector for Dense layers
  7. Dense(64, relu) -- Combines features for classification
  8. Dense(10, softmax) -- Output probabilities for 10 classes
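The breakdown above can be checked with a little arithmetic. The sketch below traces the feature-map shape and parameter count through each layer, assuming the Keras defaults of no padding and stride 1 for Conv2D, and non-overlapping windows for MaxPooling2D. The helper names are hypothetical and exist only for this example.

```python
def conv_out(size, kernel=3):
    """Side length after an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Side length after non-overlapping pooling (remainder is dropped)."""
    return size // pool

def conv_params(in_ch, out_ch, kernel=3):
    """Weights per filter (kernel * kernel * in_ch) plus one bias, times filters."""
    return (kernel * kernel * in_ch + 1) * out_ch

def dense_params(in_units, out_units):
    """One weight per input plus one bias, for each output unit."""
    return (in_units + 1) * out_units

size, channels = 32, 3          # CIFAR-10 input: 32x32 RGB
shapes = []
total_params = 0

# (filters, followed_by_pooling) for the three Conv2D layers
for filters, pooled in [(32, True), (64, True), (64, False)]:
    total_params += conv_params(channels, filters)
    size, channels = conv_out(size), filters
    shapes.append(("Conv2D", (size, size, channels)))
    if pooled:
        size = pool_out(size)
        shapes.append(("MaxPooling2D", (size, size, channels)))

flat = size * size * channels   # length of the vector produced by Flatten()
total_params += dense_params(flat, 64) + dense_params(64, 10)

for name, shape in shapes:
    print(f"{name:<14}{shape}")
print("Flatten       ", flat)          # 1024
print("Total params  ", total_params)  # 122570
```

The shapes go 30x30x32, 15x15x32, 13x13x64, 6x6x64, 4x4x64, so Flatten() hands a 1024-element vector to the first Dense layer. Over half of the parameters sit in that first Dense layer rather than in the convolutions.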

Key Concepts
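Two of the operations named in the layer breakdown, max pooling and softmax, are small enough to sketch directly in NumPy. This is an illustrative reimplementation under simplifying assumptions (a single 2D feature map, a single vector of scores), not the Keras code. The function names and toy values are made up for the example.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 block."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    trimmed = feature_map[:h2 * 2, :w2 * 2]   # drop an odd last row/column
    # Split into (block_row, row_in_block, block_col, col_in_block)
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 1],
                 [0, 0, 5, 6],
                 [1, 2, 7, 8]], dtype=float)

pooled = max_pool_2x2(fmap)
print(pooled)   # [[4. 2.]
                #  [2. 8.]]

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs, probs.sum())   # largest score gets the largest probability
```

Pooling halves each spatial dimension while keeping the strongest activation in each neighborhood. Softmax preserves the ranking of the scores and makes them comparable as class probabilities.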

Code: Load CIFAR-10 Data

import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
import numpy as np

# Load the dataset
(x_train, y_train), (x_test, y_test) = datasets.cifar10.load_data()

# Normalize pixel values to [0,1]
x_train = x_train / 255.0
x_test = x_test / 255.0

# Class names for CIFAR-10
class_names = ['airplane','automobile','bird','cat','deer',
               'dog','frog','horse','ship','truck']

print("Training images shape:", x_train.shape)  # (50000, 32, 32, 3)
print("Test images shape:", x_test.shape)        # (10000, 32, 32, 3)

# Visualize first 5 training images
plt.figure(figsize=(5,2))
for i in range(5):
    plt.subplot(1,5,i+1)
    plt.imshow(x_train[i])
    # y_train has shape (50000, 1), so index the single label explicitly
    plt.title(class_names[y_train[i][0]])
    plt.axis('off')
plt.show()

Code: Build CNN Model

# Build CNN Model
model = models.Sequential()

# 1st Convolution + Pooling
model.add(layers.Conv2D(32, (3,3), activation='relu', input_shape=(32,32,3)))
model.add(layers.MaxPooling2D((2,2)))

# 2nd Convolution + Pooling
model.add(layers.Conv2D(64, (3,3), activation='relu'))
model.add(layers.MaxPooling2D((2,2)))

# 3rd Convolution
model.add(layers.Conv2D(64, (3,3), activation='relu'))

# Flatten + Dense layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

Code: Train and Evaluate

# Compile and train
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(x_train, y_train, epochs=10,
                    validation_data=(x_test, y_test))

# Evaluate
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.4f}")

# Visualize Predictions
predictions = model.predict(x_test[:5])

plt.figure(figsize=(6,1))
for i in range(5):
    plt.subplot(1,5,i+1)
    plt.imshow(x_test[i])
    plt.title(f"{class_names[np.argmax(predictions[i])]}")
    plt.axis('off')
plt.show()
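A single accuracy number does not show which classes the model mixes up. One possible follow-up, sketched here in plain NumPy with made-up toy labels, is a confusion matrix and per-class accuracy. The function names and the three-class example data are assumptions for illustration, not output from the model above.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t, p] counts samples whose true class is t and predicted class is p."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)   # accumulate counts, including repeated pairs
    return cm

def per_class_accuracy(cm):
    """Fraction of each true class that was predicted correctly."""
    totals = cm.sum(axis=1)
    return np.divide(np.diag(cm), totals,
                     out=np.zeros(len(cm)), where=totals > 0)

# Toy labels for three classes (illustrative only)
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc = per_class_accuracy(cm)
print(cm)
print(acc)   # [0.5 1.  0.5]
```

With the trained model, the inputs could be obtained as `y_true = y_test.flatten()` and `y_pred = np.argmax(model.predict(x_test), axis=1)`, with `num_classes=10`.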

When to Use CNNs

Good For | Not Ideal For
Image classification | Tabular / structured data
Object detection | Sequential / time-series data
Medical imaging | Small datasets without augmentation
Face recognition | Tasks where spatial structure is irrelevant

CIFAR-10 images are only 32x32 pixels, so fine detail is limited. A basic CNN like the one above typically reaches about 72% test accuracy on this dataset. Adding data augmentation, dropout, batch normalization, or a deeper architecture such as ResNet can push accuracy above 90%.
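As one example of the improvements mentioned, data augmentation can be sketched in NumPy as a random horizontal flip plus a random crop from a zero-padded image. This is a hand-rolled illustration rather than the Keras preprocessing API. The function name `augment_batch`, the 4-pixel padding, and the 50% flip probability are illustrative assumptions, and the random batch stands in for normalized training images.

```python
import numpy as np

def augment_batch(images, rng, pad=4):
    """Randomly shift and mirror each image; output shape matches the input.

    images: array of shape (n, height, width, channels)
    rng:    a numpy.random.Generator, passed in for reproducibility
    """
    n, h, w, _ = images.shape
    # Zero-pad height and width only, leaving batch and channel axes alone
    padded = np.pad(images, ((0, 0), (pad, pad), (pad, pad), (0, 0)),
                    mode="constant")
    out = np.empty_like(images)
    for i in range(n):
        # Pick a random h x w window inside the padded image
        top = rng.integers(0, 2 * pad + 1)
        left = rng.integers(0, 2 * pad + 1)
        crop = padded[i, top:top + h, left:left + w, :]
        # Mirror left-right about half of the time
        if rng.random() < 0.5:
            crop = crop[:, ::-1, :]
        out[i] = crop
    return out

rng = np.random.default_rng(0)
batch = rng.random((8, 32, 32, 3))   # stand-in for normalized CIFAR-10 images
augmented = augment_batch(batch, rng)
print(augmented.shape)               # (8, 32, 32, 3)
```

Each epoch then sees slightly different versions of the same images, which tends to reduce overfitting on a dataset of this size. Labels are unchanged, since a shifted or mirrored ship is still a ship.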
