Generative Adversarial Networks
Two neural networks competing against each other to generate realistic synthetic data.
What are GANs?
Introduced by Ian Goodfellow and colleagues in 2014, GANs consist of two neural networks trained simultaneously in a game-theoretic framework. One network (the Generator) creates fake data, and the other (the Discriminator) tries to distinguish real data from fake. Through this competition, the Generator learns to produce increasingly realistic outputs.
Think of it as a counterfeiter (Generator) trying to make fake money, and a detective (Discriminator) trying to catch the fakes. Both get better over time.
Architecture
Generator (G)
Takes random noise (a latent vector z) as input and transforms it into synthetic data (images, text, etc.). Goal: fool the Discriminator.
Discriminator (D)
Takes both real and generated data as input and outputs a probability of the data being real. Goal: correctly identify fakes.
How Training Works
Training Loop
- Sample random noise z from a distribution (usually Gaussian)
- Generator creates fake data: G(z)
- Discriminator evaluates both real data and G(z)
- Update Discriminator — reward it for correctly classifying real vs fake
- Update Generator — reward it when the Discriminator is fooled
- Repeat until equilibrium (Discriminator can't tell the difference)
Loss Functions
Discriminator Loss:
L_D = -[log(D(x_real)) + log(1 - D(G(z)))]
→ Minimizing L_D pushes D to classify real data as real and fake data as fake
Generator Loss:
L_G = -log(D(G(z)))
→ Minimizing L_G pushes G to make D score fakes as real (the non-saturating form, which gives stronger gradients early in training than maximizing log(1 - D(G(z))))
Combined (Minimax Game):
min_G max_D V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))]
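To make the formulas concrete, here is a quick numeric check (a sketch, not from the original text) of both losses evaluated on hypothetical Discriminator outputs:

```python
import numpy as np

def d_loss(d_real, d_fake):
    # L_D = -[log(D(x_real)) + log(1 - D(G(z)))]
    return -(np.log(d_real) + np.log(1.0 - d_fake))

def g_loss(d_fake):
    # L_G = -log(D(G(z))), the non-saturating generator loss
    return -np.log(d_fake)

# A confident, correct Discriminator (real -> 0.9, fake -> 0.1) has a low loss,
print(d_loss(0.9, 0.1))  # ~0.21
# while the Generator's loss is high whenever its fakes are being detected.
print(g_loss(0.1))       # ~2.30
```

As D(G(z)) approaches 1 (the Discriminator is fooled), g_loss falls toward 0.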
Code Implementation (Simple GAN with Keras)
import tensorflow as tf
from tensorflow.keras import layers

# --- Generator ---
def build_generator(latent_dim=100):
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(latent_dim,)),
        layers.Dense(256),
        layers.LeakyReLU(0.2),
        layers.BatchNormalization(),
        layers.Dense(512),
        layers.LeakyReLU(0.2),
        layers.BatchNormalization(),
        layers.Dense(784, activation='tanh'),  # 28x28 = 784 for MNIST
        layers.Reshape((28, 28, 1))
    ])
    return model

# --- Discriminator ---
def build_discriminator():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        layers.Flatten(),
        layers.Dense(512),
        layers.LeakyReLU(0.2),
        layers.Dropout(0.3),
        layers.Dense(256),
        layers.LeakyReLU(0.2),
        layers.Dropout(0.3),
        layers.Dense(1, activation='sigmoid')  # Probability the input is real
    ])
    return model

# --- Training Step ---
latent_dim = 100
generator = build_generator(latent_dim)
discriminator = build_discriminator()

cross_entropy = tf.keras.losses.BinaryCrossentropy()
gen_optimizer = tf.keras.optimizers.Adam(1e-4)
disc_optimizer = tf.keras.optimizers.Adam(1e-4)

def train_step(real_images, batch_size=64):
    # Train Discriminator: real images labeled 1, generated images labeled 0
    noise = tf.random.normal([batch_size, latent_dim])
    with tf.GradientTape() as disc_tape:
        fake_images = generator(noise, training=True)
        real_output = discriminator(real_images, training=True)
        fake_output = discriminator(fake_images, training=True)
        d_loss = cross_entropy(tf.ones_like(real_output), real_output) + \
                 cross_entropy(tf.zeros_like(fake_output), fake_output)
    disc_grads = disc_tape.gradient(d_loss, discriminator.trainable_variables)
    disc_optimizer.apply_gradients(zip(disc_grads, discriminator.trainable_variables))

    # Train Generator: reward it when the Discriminator is fooled
    noise = tf.random.normal([batch_size, latent_dim])
    with tf.GradientTape() as gen_tape:
        generated = generator(noise, training=True)
        fake_output = discriminator(generated, training=True)
        g_loss = cross_entropy(tf.ones_like(fake_output), fake_output)
    gen_grads = gen_tape.gradient(g_loss, generator.trainable_variables)
    gen_optimizer.apply_gradients(zip(gen_grads, generator.trainable_variables))

    return d_loss, g_loss
Common GAN Variants
| Variant | Key Idea | Use Case |
|---|---|---|
| DCGAN | Uses convolutional layers instead of fully connected | Image generation |
| WGAN | Wasserstein distance for more stable training | Reduces mode collapse |
| Conditional GAN | Generator takes class labels as additional input | Generate specific classes |
| CycleGAN | Unpaired image-to-image translation | Horse→Zebra, Photo→Monet |
| StyleGAN | Style-based generator with progressive growing | Photorealistic faces |
| Pix2Pix | Paired image-to-image translation | Sketch→Photo, Map→Satellite |
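To illustrate the Conditional GAN row, here is a minimal Keras sketch (layer sizes and the embedding width are assumptions, not a reference implementation): the class label is embedded and concatenated with the noise vector, so the trained model can be asked for a specific class at sampling time.

```python
import tensorflow as tf
from tensorflow.keras import layers

latent_dim, num_classes = 100, 10

# Two inputs: the usual noise vector plus an integer class label
noise_in = tf.keras.Input(shape=(latent_dim,))
label_in = tf.keras.Input(shape=(), dtype="int32")

# Embed the label and concatenate it with the noise
label_vec = layers.Embedding(num_classes, 50)(label_in)
x = layers.Concatenate()([noise_in, label_vec])
x = layers.Dense(256)(x)
x = layers.LeakyReLU(0.2)(x)
x = layers.Dense(784, activation="tanh")(x)
img_out = layers.Reshape((28, 28, 1))(x)

cond_generator = tf.keras.Model([noise_in, label_in], img_out)

# Ask for a batch of images of class "7"
z = tf.random.normal([16, latent_dim])
labels = tf.fill([16], 7)
imgs = cond_generator([z, labels])
```

The Discriminator receives the same label alongside the image, so both networks learn label-conditioned behavior.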
Training Challenges
Mode Collapse
Generator learns to produce only a few types of output instead of diverse samples. Fix: use WGAN, minibatch discrimination, or unrolled GANs.
Training Instability
Discriminator gets too strong or too weak. The two networks must stay balanced. Fix: learning rate scheduling, spectral normalization.
Vanishing Gradients
When the Discriminator is near-perfect, D(G(z)) ≈ 0 and the Generator's gradient vanishes. Fix: use Wasserstein loss or feature matching.
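The Wasserstein fix swaps the cross-entropy losses for score differences. A minimal NumPy sketch (function names are my own): the critic outputs unbounded scores rather than probabilities, so its gradient does not saturate even when it separates real from fake confidently.

```python
import numpy as np

def critic_loss(real_scores, fake_scores):
    # Critic maximizes (real - fake), i.e. minimizes (fake - real)
    return np.mean(fake_scores) - np.mean(real_scores)

def wgan_generator_loss(fake_scores):
    # Generator maximizes the critic's score on its fakes
    return -np.mean(fake_scores)
```

A constraint on the critic (weight clipping in the original WGAN, a gradient penalty in WGAN-GP) is still required to keep it approximately 1-Lipschitz.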
Evaluation
Hard to measure quality objectively. Common metrics: FID (Fréchet Inception Distance), which compares feature statistics of real and generated samples, and IS (Inception Score).
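FID can be sketched directly from its formula. The version below is a simplified stand-in (real FID extracts features with a pretrained Inception-v3 network; here arbitrary feature vectors are assumed) comparing the mean and covariance of two feature sets:

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_gen):
    # FID = ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2 (C_r C_g)^(1/2))
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    c_r = np.cov(feats_real, rowvar=False)
    c_g = np.cov(feats_gen, rowvar=False)
    covmean = sqrtm(c_r @ c_g)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard tiny imaginary parts from sqrtm
    return np.sum((mu_r - mu_g) ** 2) + np.trace(c_r + c_g - 2 * covmean)
```

Identical feature sets give an FID of (numerically) zero; larger values mean the generated distribution drifts further from the real one.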
Real-World Applications
- Image synthesis — Generate photorealistic faces, art, product images
- Data augmentation — Create synthetic training data for rare classes
- Super-resolution — Upscale low-resolution images (SRGAN)
- Drug discovery — Generate novel molecular structures
- Video generation — Predict future frames, create deepfakes
- Text-to-image — Generate images from text descriptions
GANs are computationally expensive and can be tricky to train. Start with DCGAN on MNIST before attempting complex architectures.
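As a starting point for that suggestion, a DCGAN-style generator for 28x28 MNIST might look like this sketch (layer sizes and kernel choices are assumptions): transposed convolutions upsample a projected latent vector, replacing the fully connected stack shown earlier.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_dcgan_generator(latent_dim=100):
    return tf.keras.Sequential([
        tf.keras.Input(shape=(latent_dim,)),
        layers.Dense(7 * 7 * 128),          # project and reshape to 7x7 feature maps
        layers.LeakyReLU(0.2),
        layers.Reshape((7, 7, 128)),
        layers.Conv2DTranspose(64, 4, strides=2, padding='same'),  # -> 14x14
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),
        layers.Conv2DTranspose(1, 4, strides=2, padding='same',
                               activation='tanh')                  # -> 28x28x1
    ])

g = build_dcgan_generator()
out = g(tf.random.normal([2, 100]))  # shape (2, 28, 28, 1)
```

The matching Discriminator mirrors this with strided Conv2D layers in place of the dense stack.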