Multi-Layer Perceptron (MLP)
A fully connected neural network for regression -- predicting house prices from tabular features using scikit-learn's MLPRegressor.
What is MLP?
A Multilayer Perceptron is a type of artificial neural network with multiple layers of fully connected neurons. It consists of an input layer, one or more hidden layers, and an output layer. MLP is one of the most fundamental deep learning models for supervised learning.
MLP works well when the relationship between features and target is non-linear and cannot be captured by a linear model such as Linear Regression. For classification, use a softmax/sigmoid output; for regression, use a linear (identity) output.
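To see the difference concretely, here is a small sketch (on made-up 1-D data, not the housing example below) where the target depends quadratically on the feature: a linear model can explain almost none of the variance, while an MLP fits the curve.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

# Synthetic data with a purely non-linear relationship: y = x^2
X = np.linspace(-3, 3, 100).reshape(-1, 1)
y = X.ravel() ** 2

# Best linear fit on symmetric quadratic data is roughly a flat line -> R^2 near 0
lin = LinearRegression().fit(X, y)
print(f"Linear R^2: {lin.score(X, y):.2f}")

# An MLP approximates the curve with piecewise-linear ReLU segments
mlp = MLPRegressor(hidden_layer_sizes=(32,), solver='lbfgs',
                   max_iter=5000, random_state=0).fit(X, y)
print(f"MLP R^2: {mlp.score(X, y):.2f}")
```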
How It Works
Architecture for This Example
- Input Layer -- 3 features: Area_sqft, Bedrooms, Location_Index
- Hidden Layer 1 -- 64 neurons with ReLU activation
- Hidden Layer 2 -- 32 neurons with ReLU activation
- Output Layer -- 1 neuron, linear activation (regression output)
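Each fully connected layer has one weight per input-neuron pair plus one bias per neuron, so the parameter count of the 3 → 64 → 32 → 1 architecture above can be computed directly:

```python
# Trainable parameters per layer: (inputs x neurons) weights + one bias per neuron
layers = [3, 64, 32, 1]  # input, hidden 1, hidden 2, output
params = sum(n_in * n_out + n_out for n_in, n_out in zip(layers, layers[1:]))
print(params)  # (3*64+64) + (64*32+32) + (32*1+1) = 2369
```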
Code: Prepare Data
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error, r2_score
data = {
"Area_sqft": [1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, 5500],
"Bedrooms": [2, 3, 3, 4, 4, 5, 5, 6, 6, 7],
"Location_Index": [1, 2, 1, 2, 3, 1, 3, 2, 3, 1], # Encoded location values
"Price_Lakhs": [50, 70, 80, 110, 150, 130, 180, 200, 250, 270] # Target
}
df = pd.DataFrame(data)
X = df.drop(columns=["Price_Lakhs"]) # Features
y = df["Price_Lakhs"] # Target
# Train-Test Split (80% Train, 20% Test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
Code: Scale and Train
# Standardization (Neural Networks need scaled data)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
mlp_regressor = MLPRegressor(
hidden_layer_sizes=(64, 32), # Two hidden layers (64 & 32 neurons)
activation='relu', # ReLU activation function
solver='adam', # Adam optimizer
learning_rate_init=0.001, # Initial learning rate (the default; 0.1 is usually too large and can diverge)
max_iter=10000, # Maximum number of training iterations
random_state=42
)
mlp_regressor.fit(X_train_scaled, y_train)
Code: Predict and Evaluate
y_pred = mlp_regressor.predict(X_test_scaled)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Mean Absolute Error (MAE): {mae:.2f}")
print(f"R-squared Score: {r2:.2f}")
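Beyond MAE and R², it is worth checking whether training actually converged. A fitted MLPRegressor exposes `n_iter_` (epochs actually run) and, for the 'adam' and 'sgd' solvers, `loss_curve_` (per-epoch training loss). A self-contained sketch on synthetic data:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic regression data standing in for the housing example
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

mlp = MLPRegressor(hidden_layer_sizes=(64, 32), solver='adam',
                   max_iter=10000, random_state=42).fit(X, y)

# If n_iter_ equals max_iter, the model may have stopped before converging;
# a flat loss_curve_ tail suggests training has plateaued.
print(f"Stopped after {mlp.n_iter_} epochs, final loss {mlp.loss_curve_[-1]:.4f}")
```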
Always scale your features before training an MLP. Neural networks are sensitive to feature scales. StandardScaler (zero mean, unit variance) is the most common choice. Fit the scaler on training data only, then transform both train and test.
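One way to make the fit-on-train-only rule hard to get wrong is a scikit-learn Pipeline: the scaler is fitted inside `fit` and reused inside `predict`/`score`, so test statistics can never leak into scaling. A minimal sketch, using made-up area/price data rather than the dataset above:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

# Hypothetical data: price roughly proportional to area in sqft
rng = np.random.default_rng(42)
X = rng.uniform(1000, 5000, size=(100, 1))
y = 0.05 * X.ravel() + rng.normal(scale=5, size=100)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# StandardScaler is fitted on X_train only (inside .fit), then applied
# automatically when scoring on X_test
model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(64, 32),
                                   max_iter=5000, random_state=42))
model.fit(X_train, y_train)
print(f"Test R^2: {model.score(X_test, y_test):.2f}")
```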
Key Parameters
- hidden_layer_sizes: Tuple defining the number of neurons in each hidden layer, e.g., (64, 32)
- activation: 'relu' (default), 'tanh', 'logistic' (sigmoid)
- solver: 'adam' (default, best for most cases), 'sgd', 'lbfgs' (good for small datasets)
- learning_rate_init: Initial learning rate (default 0.001)
- max_iter: Maximum training iterations
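These parameters are rarely obvious up front, so a common approach is to search a small grid with cross-validation. A sketch with a hypothetical grid over two of the parameters above (synthetic data; the `mlp__` prefix routes parameters to the pipeline step named "mlp"):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

# Synthetic non-linear regression data
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
y = np.sin(X[:, 0]) + X[:, 1] ** 2

pipe = Pipeline([("scale", StandardScaler()),
                 ("mlp", MLPRegressor(max_iter=3000, random_state=0))])

# Small illustrative grid; real searches would cover more values
grid = {"mlp__hidden_layer_sizes": [(32,), (64, 32)],
        "mlp__learning_rate_init": [0.001, 0.01]}

search = GridSearchCV(pipe, grid, cv=3, scoring="r2").fit(X, y)
print(search.best_params_)
```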
When to Use MLP
| Good For | Not Ideal For |
| --- | --- |
| Non-linear regression/classification | Image data (use CNNs) |
| Tabular/structured data | Sequential data (use RNN/LSTM) |
| Medium-sized datasets | Very large datasets (use deep learning frameworks) |
| Quick prototyping with scikit-learn | When interpretability is required |