
Support Vector Machine (SVM)

A powerful algorithm that finds the optimal hyperplane to separate data into classes with maximum margin.

What is SVM?

SVM is a supervised learning algorithm that finds the hyperplane (decision boundary) that best separates data points into different classes. It maximizes the margin between the closest points of each class (called support vectors) and the boundary.

SVM works by finding the widest possible "street" between two classes. The edges of the street are defined by the support vectors -- the data points closest to the boundary.
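A minimal sketch of this idea, using scikit-learn's SVC on a tiny made-up 2-D dataset: after fitting, the support vectors (the points that define the edges of the "street") can be inspected directly.

```python
import numpy as np
from sklearn.svm import SVC

# Tiny illustrative dataset: two well-separated clusters
X = np.array([[1, 2], [2, 3], [3, 3], [6, 5], [7, 8], [8, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

# Linear kernel: the decision boundary is a straight line
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The fitted model exposes the support vectors that define the margin
print("Support vectors:\n", clf.support_vectors_)
print("Support vectors per class:", clf.n_support_)
```

Only the support vectors determine the boundary; the remaining points could move (without crossing the margin) and the fitted model would not change.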

Key Concepts

Kernel Types

Linear

For linearly separable data. Fastest. Use when data can be separated by a straight line/plane.

RBF (Radial Basis Function)

Default kernel. Maps data to infinite-dimensional space. Works well for most non-linear problems.

Polynomial

Maps data using polynomial features. Good for data with polynomial relationships.

Sigmoid

Similar to a neural network activation. Rarely used in practice.
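A hedged sketch comparing the four kernels on a synthetic non-linear dataset (`make_moons`). The exact accuracies depend on the random seed and noise level; the point is simply that the RBF kernel typically handles this curved boundary better than the linear one.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Two interleaving half-moons: not linearly separable
X, y = make_moons(n_samples=300, noise=0.2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

scores = {}
for kernel in ["linear", "rbf", "poly", "sigmoid"]:
    # Scaling inside a pipeline so the test set is transformed consistently
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    model.fit(X_train, y_train)
    scores[kernel] = model.score(X_test, y_test)
    print(f"{kernel:>8}: {scores[kernel]:.3f}")
```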

SVR (Support Vector Regression)

For regression tasks, SVR fits a hyperplane surrounded by an epsilon-insensitive tube: points that fall inside the tube contribute no loss, so only points outside it influence the fit.

Code: SVR for Rent Prediction

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Dataset
data = {
    "Size_sqft": [500, 700, 900, 1100, 1500, 1800, 2100],
    "Bedrooms": [1, 1, 2, 2, 3, 3, 4],
    "Rent": [12, 15, 20, 25, 30, 35, 50],  # Rent in thousands
}
df = pd.DataFrame(data)

# Features and target
X = df[["Size_sqft", "Bedrooms"]]
y = df["Rent"]

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Feature scaling (critical for SVM)
scaler_X = StandardScaler()
scaler_y = StandardScaler()
X_train_scaled = scaler_X.fit_transform(X_train)
X_test_scaled = scaler_X.transform(X_test)
y_train_scaled = scaler_y.fit_transform(y_train.values.reshape(-1, 1)).flatten()

# Train SVR with RBF kernel
svr = SVR(kernel="rbf", C=100, epsilon=0.01)
svr.fit(X_train_scaled, y_train_scaled)

# Predictions (inverse transform to original scale)
y_pred_scaled = svr.predict(X_test_scaled)
y_pred = scaler_y.inverse_transform(y_pred_scaled.reshape(-1, 1)).flatten()

# Evaluate
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MAE: {mae:.2f}")
print(f"MSE: {mse:.2f}")
print(f"R2 Score: {r2:.2f}")

SVM is very sensitive to feature scales, so always apply StandardScaler before training. For SVR, scale the target variable as well, and remember to inverse-transform predictions back to the original units before evaluating.
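A sketch of a less error-prone way to handle this: a scikit-learn Pipeline scales the features, and TransformedTargetRegressor scales and un-scales the target automatically, so no manual inverse_transform calls are needed. The data is the same rent toy set used above.

```python
import pandas as pd
from sklearn.compose import TransformedTargetRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

df = pd.DataFrame({
    "Size_sqft": [500, 700, 900, 1100, 1500, 1800, 2100],
    "Bedrooms": [1, 1, 2, 2, 3, 3, 4],
    "Rent": [12, 15, 20, 25, 30, 35, 50],
})

# Feature scaling lives in the pipeline; target scaling in the wrapper
model = TransformedTargetRegressor(
    regressor=make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100, epsilon=0.01)),
    transformer=StandardScaler(),
)
model.fit(df[["Size_sqft", "Bedrooms"]], df["Rent"])

# Predictions come back already in the original rent units
print(model.predict(df[["Size_sqft", "Bedrooms"]]))
```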

Key Parameters

Parameter     | Description
C             | Regularization strength. Higher C = less tolerance for misclassification (risk of overfitting)
kernel        | "linear", "rbf", "poly", "sigmoid". Default is "rbf"
epsilon (SVR) | Width of the epsilon-insensitive tube. Larger = more tolerance for errors
gamma         | Controls how far the influence of a single training example reaches. "scale" (the scikit-learn default) and "auto" are common choices
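In practice, C and gamma are usually tuned together rather than set by hand. A hedged sketch with GridSearchCV on a synthetic classification dataset; the grid values below are illustrative, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=5, random_state=42)

# Scaling inside the pipeline keeps cross-validation leakage-free
pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC(kernel="rbf"))])

grid = GridSearchCV(
    pipe,
    param_grid={
        "svc__C": [0.1, 1, 10, 100],
        "svc__gamma": ["scale", 0.01, 0.1, 1],
    },
    cv=5,
)
grid.fit(X, y)

print("Best params:", grid.best_params_)
print(f"Best CV accuracy: {grid.best_score_:.3f}")
```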

When to Use SVM

Good For                               | Not Ideal For
High-dimensional data (text, genomics) | Very large datasets (slow training)
Clear margin of separation             | Noisy data with overlapping classes
Binary and multi-class classification  | Probability estimates out of the box (workaround: SVC with probability=True)
Small to medium datasets               | When interpretability is important
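A minimal sketch of the probability workaround mentioned above: passing probability=True makes SVC fit an extra calibration step (Platt scaling), which slows training and can occasionally disagree slightly with predict(), but then exposes predict_proba.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# probability=True enables predict_proba via internal cross-validated calibration
clf = SVC(kernel="rbf", probability=True, random_state=0)
clf.fit(X, y)

proba = clf.predict_proba(X[:3])
print(proba)  # one row per sample; each row sums to 1
```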

Tags: Classification, Regression, Supervised, Kernel, Margin