Hyperparameter Tuning
Finding the best settings for your ML model using systematic search methods.
What are Hyperparameters?
Hyperparameters are settings you choose before training a model. They control how the model learns, but are not learned from data. Getting them right can dramatically improve model performance.
Parameters (Learned)
Learned from data during training. Examples: coefficients in Linear Regression, weights in Neural Networks.
Hyperparameters (Set by You)
Set before training. Examples: K in KNN, max_depth in Decision Trees, C in SVM, learning_rate in boosting.
Default hyperparameters rarely give the best performance on a specific dataset. Tuning them properly can be the difference between a mediocre model and a great one.
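The distinction shows up directly in scikit-learn's API: hyperparameters are constructor arguments you choose up front, while learned parameters appear as trailing-underscore attributes only after fit(). A minimal sketch using LinearRegression on tiny made-up data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data that lies exactly on the line y = 2x
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])

# Hyperparameter: set by you, before training (a constructor argument)
model = LinearRegression(fit_intercept=True)

# Parameters: learned from the data during fit()
model.fit(X, y)
print("Learned coefficient:", model.coef_)     # close to [2.0]
print("Learned intercept:", model.intercept_)  # close to 0.0
```

You never set `coef_` or `intercept_` yourself; the optimizer finds them. Tuning is only about the constructor arguments.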
Tuning Methods
1. Manual Tuning
Try different values by hand and compare results. Quick for simple models but tedious for many parameters.
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
# Try different max_depth values manually
for depth in [1, 2, 3, 5, 10, None]:
    model = DecisionTreeClassifier(max_depth=depth, random_state=42)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    print(f"max_depth={depth}, Accuracy={accuracy_score(y_test, y_pred):.2f}")
2. Grid Search (GridSearchCV)
Tries every possible combination of hyperparameters in a defined grid. Exhaustive but slow.
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier
param_grid = {
    "max_depth": [1, 2, 3, None],
    "min_samples_split": [2, 3, 4]
}
# GridSearch with 3-fold cross-validation.
# random_state is fixed on the estimator, not tuned -- it only controls
# reproducibility and is not a real hyperparameter.
grid = GridSearchCV(DecisionTreeClassifier(random_state=42), param_grid, cv=3)
grid.fit(X_train, y_train)
print("Best Params:", grid.best_params_)
print("Best Score:", grid.best_score_)
Grid Search tests all combinations. With max_depth=[1,2,3,None] and min_samples_split=[2,3,4], that is 4 x 3 = 12 models trained, each with 3-fold CV = 36 total fits.
3. Random Search (RandomizedSearchCV)
Samples a fixed number of random combinations instead of trying all of them. Much faster for large search spaces, often with results of similar quality.
from sklearn.model_selection import RandomizedSearchCV
param_dist = {
    "max_depth": [1, 2, 3, None],
    "min_samples_split": [2, 3, 4, 5, 6]
}
random_search = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_distributions=param_dist,
    n_iter=5,  # only try 5 random combinations
    cv=3,
    random_state=42
)
random_search.fit(X_train, y_train)
print("Best Params:", random_search.best_params_)
print("Best Score:", random_search.best_score_)
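A strength of RandomizedSearchCV not shown above: param_distributions also accepts scipy distributions, so each trial samples a fresh value from a range instead of picking from a fixed list. A sketch on synthetic data (the ranges and dataset here are illustrative, not from the example above):

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic classification data, for illustration only
X, y = make_classification(n_samples=200, n_features=5, random_state=42)

# Distributions are sampled once per trial, never enumerated up front
param_dist = {
    "max_depth": randint(1, 10),          # integers 1..9 (high end exclusive)
    "min_samples_split": randint(2, 20),  # integers 2..19
}

search = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_distributions=param_dist,
    n_iter=10,
    cv=3,
    random_state=42,
)
search.fit(X, y)
print("Best Params:", search.best_params_)
print("Best CV Score:", round(search.best_score_, 3))
```

This is why random search scales to large spaces: the cost depends on n_iter, not on the size of the grid.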
4. Bayesian Optimization
Builds a probabilistic model of past trial results to guess which hyperparameter regions are likely to perform best, then focuses the search there. The most sample-efficient method when each model is expensive to train.
import optuna
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def objective(trial):
    max_depth = trial.suggest_categorical("max_depth", [1, 2, 3, None])
    min_samples_split = trial.suggest_int("min_samples_split", 2, 6)
    model = DecisionTreeClassifier(
        max_depth=max_depth,
        min_samples_split=min_samples_split,
        random_state=42
    )
    # Score with cross-validation on the training set, not the test set,
    # so the test set is not leaked into the tuning process
    return cross_val_score(model, X_train, y_train, cv=3).mean()
# Run optimization
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=10)
print("Best Params:", study.best_params)
print("Best Score:", study.best_value)
Comparison of Methods
| Method | Speed | Coverage | Best For |
|---|---|---|---|
| Manual | Fast (few combos) | Low | Quick experiments, simple models |
| Grid Search | Slow | Complete | Small search space, need exact best |
| Random Search | Fast | Partial | Large search space, good-enough results |
| Bayesian (Optuna) | Efficient | Smart | Expensive models, deep learning |
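To make the Speed column concrete, the number of candidates each search actually evaluates can be read back from cv_results_. A sketch on synthetic data (the parameter ranges are chosen for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=42)
params = {"max_depth": [1, 2, 3, 4, 5], "min_samples_split": [2, 4, 6, 8]}

# Grid search tries every combination: 5 x 4 = 20 candidates (60 fits with cv=3)
grid = GridSearchCV(DecisionTreeClassifier(random_state=42), params, cv=3)
grid.fit(X, y)
print("Grid candidates:", len(grid.cv_results_["params"]))

# Random search tries only n_iter combinations: 8 candidates (24 fits with cv=3)
rand = RandomizedSearchCV(DecisionTreeClassifier(random_state=42), params,
                          n_iter=8, cv=3, random_state=42)
rand.fit(X, y)
print("Random candidates:", len(rand.cv_results_["params"]))
```

Same search space, less than half the work for random search; the trade-off is that it may miss the exact best combination.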
Setup Code for Examples Above
All examples above use this dataset:
import pandas as pd
from sklearn.model_selection import train_test_split
# Small illustrative dataset, made large enough that 3-fold CV works on the training split
data = {
    "Age": [22, 25, 30, 35, 40, 50, 55, 28, 33, 45, 48, 60],
    "Income_LPA": [8, 10, 15, 12, 25, 20, 35, 9, 14, 22, 18, 40],
    "Buys_House": [0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1]
}
df = pd.DataFrame(data)
X = df[["Age", "Income_LPA"]]
y = df["Buys_House"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)