
Neural Network Basics

Building and training an Artificial Neural Network (ANN) for handwritten digit classification using MNIST.

What is an ANN?

An Artificial Neural Network is a stack of layers: an input layer, one or more hidden layers, and an output layer. Each layer contains neurons that apply weights, biases, and an activation function to transform inputs into outputs. This is the foundation of all deep learning.
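To make this concrete, here is a minimal NumPy sketch of what a single dense layer computes (random illustrative weights, not a trained model): each neuron takes a weighted sum of the inputs, adds a bias, and passes the result through an activation function.

```python
import numpy as np

def relu(z):
    # ReLU activation: max(0, z) element-wise
    return np.maximum(0, z)

rng = np.random.default_rng(0)
x = rng.random(4)                 # 4 input values
W = rng.standard_normal((4, 3))   # weights: 4 inputs -> 3 neurons
b = np.zeros(3)                   # one bias per neuron

out = relu(x @ W + b)             # each neuron: weighted sum + bias + activation
print(out.shape)                  # (3,) -- one output per neuron
```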

This notebook builds a simple feedforward ANN that recognizes handwritten digits (0-9) from the MNIST dataset with ~98% accuracy.

How It Works

Architecture: MNIST Digit Classifier
  1. Flatten(28, 28) -- Convert 28x28 pixel image into a 1D vector of 784 values
  2. Dense(128, relu) -- First hidden layer: 128 neurons learn features
  3. Dense(64, relu) -- Second hidden layer: 64 neurons learn higher-level features
  4. Dense(10, softmax) -- Output layer: 10 neurons (one per digit), probabilities sum to 1
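A quick back-of-the-envelope check of the model size, computed from the layer sizes above: each Dense layer stores (inputs × neurons) weights plus one bias per neuron.

```python
# Parameters per Dense layer: (inputs * neurons) + neurons (biases)
layers = [(784, 128), (128, 64), (64, 10)]
params = [n_in * n_out + n_out for n_in, n_out in layers]
print(params)       # [100480, 8256, 650]
print(sum(params))  # 109386 trainable parameters in total
```

This matches what `model.summary()` reports in Keras for the architecture above.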

Key Components

Code: Load and Prepare Data

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.optimizers import Adam
import matplotlib.pyplot as plt
import numpy as np

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Normalize pixel values (0-255 -> 0-1)
x_train, x_test = x_train / 255.0, x_test / 255.0

# Display a sample image
plt.imshow(x_train[2], cmap='gray')
plt.title(f"Label: {y_train[2]}")
plt.show()

Code: Build the Model

# Define the ANN model
model = Sequential([
    Flatten(input_shape=(28, 28)),   # Convert 2D image to 1D vector
    Dense(128, activation='relu'),   # First hidden layer with 128 neurons
    Dense(64, activation='relu'),    # Second hidden layer with 64 neurons
    Dense(10, activation='softmax')  # Output layer for 10 classes (0-9)
])

Code: Compile and Train

model.compile(optimizer=Adam(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(x_train, y_train, epochs=10,
                    validation_data=(x_test, y_test))
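Note that `sparse_categorical_crossentropy` expects integer labels (which is what MNIST provides) rather than one-hot vectors. A minimal NumPy illustration of the loss itself, using a hypothetical softmax output: it is simply the negative log of the probability the model assigns to the true class.

```python
import numpy as np

y_true = 3                                   # integer label, as in y_train
probs = np.full(10, 0.02)                    # hypothetical softmax output
probs[3] = 0.82                              # probabilities sum to 1

# Cross-entropy = -log(probability assigned to the true class)
loss = -np.log(probs[y_true])
print(loss)  # ~0.1985: low loss because the model is confident and correct
```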

Code: Evaluate and Predict

test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test Accuracy: {test_acc:.2f}")

predictions = model.predict(x_test)
predicted_label = np.argmax(predictions[0])  # Get the most probable digit

# Show the predicted and actual label
plt.imshow(x_test[0], cmap='gray')
plt.title(f"Predicted: {predicted_label}, Actual: {y_test[0]}")
plt.show()

Code: Plot Training History

plt.plot(history.history['accuracy'], label='train_acc')
plt.plot(history.history['val_accuracy'], label='val_acc')
plt.title('Accuracy over epochs')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

When to Use ANNs

Good For                           | Not Ideal For
Tabular/structured data            | Images (use CNNs instead)
Simple classification/regression   | Sequential data (use RNN/LSTM)
Quick baseline deep learning model | Very small datasets (use classical ML)

ANNs with Dense layers treat each pixel independently -- they ignore spatial structure. For image tasks, CNNs are significantly better. ANNs work well on MNIST because the digits are centered and simple.
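A small NumPy sketch of why Dense layers ignore pixel order (illustrative random weights): if every image is scrambled with one fixed pixel permutation, and the first layer's weight rows are permuted the same way, the layer's output is unchanged -- the network never "sees" the 2D layout.

```python
import numpy as np

rng = np.random.default_rng(42)
perm = rng.permutation(784)          # one fixed pixel shuffle

img = rng.random((28, 28))           # stand-in for an MNIST image
flat = img.reshape(784)
shuffled = flat[perm]                # scrambled pixels

W = rng.standard_normal((784, 16))   # illustrative first-layer weights
out_original = flat @ W
out_shuffled = shuffled @ W[perm]    # permute weight rows the same way

print(np.allclose(out_original, out_shuffled))  # True: identical outputs
```

A CNN, by contrast, would be badly hurt by such a shuffle, because its convolutions exploit exactly the local spatial structure that the permutation destroys.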

Tags: ANN, MNIST, Keras, Softmax, ReLU, Adam