Generative Adversarial Networks
Two neural networks competing against each other to generate realistic synthetic data.
What are GANs?
Introduced by Ian Goodfellow and colleagues in 2014, GANs consist of two neural networks trained simultaneously in a game-theoretic framework. One network (the Generator) creates fake data, and the other (the Discriminator) tries to distinguish real data from fake. Through this competition, the Generator learns to produce increasingly realistic outputs.
Think of it as a counterfeiter (Generator) trying to make fake money, and a detective (Discriminator) trying to catch the fakes. Both get better over time.
Architecture
Generator (G)
Takes random noise (a latent vector z) as input and transforms it into synthetic data (images, text, etc.). Goal: fool the Discriminator.
Discriminator (D)
Takes both real and generated data as input and outputs a probability of the data being real. Goal: correctly identify fakes.
How Training Works
Training Loop
- Sample random noise z from a distribution (usually Gaussian)
- Generator creates fake data: G(z)
- Discriminator evaluates both real data and G(z)
- Update Discriminator — reward it for correctly classifying real vs fake
- Update Generator — reward it when the Discriminator is fooled
- Repeat until equilibrium (Discriminator can't tell the difference)
Loss Functions
Discriminator Loss:
L_D = -[log(D(x_real)) + log(1 - D(G(z)))]
→ Minimizing L_D pushes D to classify real data as real and fake data as fake
Generator Loss:
L_G = -log(D(G(z)))
→ Minimizing L_G pushes G to make D score fakes as real (the non-saturating form, which gives stronger gradients early in training than maximizing log(1 - D(G(z))))
Combined (Minimax Game):
min_G max_D V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))]
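To make the formulas concrete, here is a quick numeric check (a sketch, not from the original text) of both losses evaluated on hypothetical Discriminator outputs:

```python
import numpy as np

def d_loss(d_real, d_fake):
    # L_D = -[log(D(x_real)) + log(1 - D(G(z)))]
    return -(np.log(d_real) + np.log(1.0 - d_fake))

def g_loss(d_fake):
    # L_G = -log(D(G(z))), the non-saturating generator loss
    return -np.log(d_fake)

# A confident, correct Discriminator (real -> 0.9, fake -> 0.1) has a low loss,
print(d_loss(0.9, 0.1))  # ~0.21
# while the Generator's loss is high whenever its fakes are being detected.
print(g_loss(0.1))       # ~2.30
```

As D(G(z)) approaches 1 (the Discriminator is fooled), g_loss falls toward 0.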
Code Implementation (Simple GAN with Keras)
import tensorflow as tf
from tensorflow.keras import layers

# --- Generator ---
def build_generator(latent_dim=100):
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(latent_dim,)),
        layers.Dense(256),
        layers.LeakyReLU(0.2),
        layers.BatchNormalization(),
        layers.Dense(512),
        layers.LeakyReLU(0.2),
        layers.BatchNormalization(),
        layers.Dense(784, activation='tanh'),  # 28x28 = 784 for MNIST
        layers.Reshape((28, 28, 1))
    ])
    return model

# --- Discriminator ---
def build_discriminator():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        layers.Flatten(),
        layers.Dense(512),
        layers.LeakyReLU(0.2),
        layers.Dropout(0.3),
        layers.Dense(256),
        layers.LeakyReLU(0.2),
        layers.Dropout(0.3),
        layers.Dense(1, activation='sigmoid')  # Probability the input is real
    ])
    return model

# --- Training Step ---
latent_dim = 100
generator = build_generator(latent_dim)
discriminator = build_discriminator()

cross_entropy = tf.keras.losses.BinaryCrossentropy()
gen_optimizer = tf.keras.optimizers.Adam(1e-4)
disc_optimizer = tf.keras.optimizers.Adam(1e-4)

def train_step(real_images, batch_size=64):
    # Train Discriminator: real images labeled 1, generated images labeled 0
    noise = tf.random.normal([batch_size, latent_dim])
    with tf.GradientTape() as disc_tape:
        fake_images = generator(noise, training=True)
        real_output = discriminator(real_images, training=True)
        fake_output = discriminator(fake_images, training=True)
        d_loss = cross_entropy(tf.ones_like(real_output), real_output) + \
                 cross_entropy(tf.zeros_like(fake_output), fake_output)
    disc_grads = disc_tape.gradient(d_loss, discriminator.trainable_variables)
    disc_optimizer.apply_gradients(zip(disc_grads, discriminator.trainable_variables))

    # Train Generator: reward it when the Discriminator is fooled
    noise = tf.random.normal([batch_size, latent_dim])
    with tf.GradientTape() as gen_tape:
        generated = generator(noise, training=True)
        fake_output = discriminator(generated, training=True)
        g_loss = cross_entropy(tf.ones_like(fake_output), fake_output)
    gen_grads = gen_tape.gradient(g_loss, generator.trainable_variables)
    gen_optimizer.apply_gradients(zip(gen_grads, generator.trainable_variables))

    return d_loss, g_loss
Common GAN Variants
| Variant | Key Idea | Use Case |
|---|---|---|
| DCGAN | Uses convolutional layers instead of fully connected | Image generation |
| WGAN | Wasserstein distance for more stable training | Reduces mode collapse |
| Conditional GAN | Generator takes class labels as additional input | Generate specific classes |
| CycleGAN | Unpaired image-to-image translation | Horse→Zebra, Photo→Monet |
| StyleGAN | Style-based generator with progressive growing | Photorealistic faces |
| Pix2Pix | Paired image-to-image translation | Sketch→Photo, Map→Satellite |
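To illustrate the Conditional GAN row, here is a minimal Keras sketch (layer sizes and the embedding width are assumptions, not a reference implementation): the class label is embedded and concatenated with the noise vector, so the trained model can be asked for a specific class at sampling time.

```python
import tensorflow as tf
from tensorflow.keras import layers

latent_dim, num_classes = 100, 10

# Two inputs: the usual noise vector plus an integer class label
noise_in = tf.keras.Input(shape=(latent_dim,))
label_in = tf.keras.Input(shape=(), dtype="int32")

# Embed the label and concatenate it with the noise
label_vec = layers.Embedding(num_classes, 50)(label_in)
x = layers.Concatenate()([noise_in, label_vec])
x = layers.Dense(256)(x)
x = layers.LeakyReLU(0.2)(x)
x = layers.Dense(784, activation="tanh")(x)
img_out = layers.Reshape((28, 28, 1))(x)

cond_generator = tf.keras.Model([noise_in, label_in], img_out)

# Ask for a batch of images of class "7"
z = tf.random.normal([16, latent_dim])
labels = tf.fill([16], 7)
imgs = cond_generator([z, labels])
```

The Discriminator receives the same label alongside the image, so both networks learn label-conditioned behavior.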
Training Challenges
Mode Collapse
Generator learns to produce only a few types of output instead of diverse samples. Fix: use WGAN, minibatch discrimination, or unrolled GANs.
Training Instability
Discriminator gets too strong or too weak. The two networks must stay balanced. Fix: learning rate scheduling, spectral normalization.
Vanishing Gradients
When the Discriminator is near-perfect, D(G(z)) ≈ 0 and the Generator's gradient vanishes. Fix: use Wasserstein loss or feature matching.
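The Wasserstein fix swaps the cross-entropy losses for score differences. A minimal NumPy sketch (function names are my own): the critic outputs unbounded scores rather than probabilities, so its gradient does not saturate even when it separates real from fake confidently.

```python
import numpy as np

def critic_loss(real_scores, fake_scores):
    # Critic maximizes (real - fake), i.e. minimizes (fake - real)
    return np.mean(fake_scores) - np.mean(real_scores)

def wgan_generator_loss(fake_scores):
    # Generator maximizes the critic's score on its fakes
    return -np.mean(fake_scores)
```

A constraint on the critic (weight clipping in the original WGAN, a gradient penalty in WGAN-GP) is still required to keep it approximately 1-Lipschitz.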
Evaluation
Hard to measure quality objectively. Common metrics: FID (Fréchet Inception Distance), which compares feature statistics of real and generated samples, and IS (Inception Score).
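FID can be sketched directly from its formula. The version below is a simplified stand-in (real FID extracts features with a pretrained Inception-v3 network; here arbitrary feature vectors are assumed) comparing the mean and covariance of two feature sets:

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_gen):
    # FID = ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2 (C_r C_g)^(1/2))
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    c_r = np.cov(feats_real, rowvar=False)
    c_g = np.cov(feats_gen, rowvar=False)
    covmean = sqrtm(c_r @ c_g)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard tiny imaginary parts from sqrtm
    return np.sum((mu_r - mu_g) ** 2) + np.trace(c_r + c_g - 2 * covmean)
```

Identical feature sets give an FID of (numerically) zero; larger values mean the generated distribution drifts further from the real one.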
Real-World Applications
- Image synthesis — Generate photorealistic faces, art, product images
- Data augmentation — Create synthetic training data for rare classes
- Super-resolution — Upscale low-resolution images (SRGAN)
- Drug discovery — Generate novel molecular structures
- Video generation — Predict future frames, create deepfakes
- Text-to-image — Generate images from text descriptions
GANs are computationally expensive and can be tricky to train. Start with DCGAN on MNIST before attempting complex architectures.
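As a starting point for that suggestion, a DCGAN-style generator for 28x28 MNIST might look like this sketch (layer sizes and kernel choices are assumptions): transposed convolutions upsample a projected latent vector, replacing the fully connected stack shown earlier.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_dcgan_generator(latent_dim=100):
    return tf.keras.Sequential([
        tf.keras.Input(shape=(latent_dim,)),
        layers.Dense(7 * 7 * 128),          # project and reshape to 7x7 feature maps
        layers.LeakyReLU(0.2),
        layers.Reshape((7, 7, 128)),
        layers.Conv2DTranspose(64, 4, strides=2, padding='same'),  # -> 14x14
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),
        layers.Conv2DTranspose(1, 4, strides=2, padding='same',
                               activation='tanh')                  # -> 28x28x1
    ])

g = build_dcgan_generator()
out = g(tf.random.normal([2, 100]))  # shape (2, 28, 28, 1)
```

The matching Discriminator mirrors this with strided Conv2D layers in place of the dense stack.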