# Introduction to Hyperparameter Tuning in Machine Learning

## Strive for continuous improvement, instead of perfection. - Kim Collins.

Data scientists and machine learning engineers often wonder which model and which parameters will produce the best accuracy once data preprocessing is done. Hyperparameter tuning is the process of finding the best parameters for a model. In this article, you will learn several methods of hyperparameter tuning and how to implement them. A working understanding of Python and data science libraries like NumPy will be useful for following along.

### Hyperparameter Tuning using Cross Validation Score Object

`cross_val_score` is a function in `sklearn.model_selection` that lets you tune your model manually. It takes in the algorithm with its parameters, the independent features (X), and the dependent feature (y) of your data. It then performs cross-validation with the given input and returns n accuracy scores, where n is the number of splits in cross-validation. See the example below.
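To make the "n scores for n splits" idea concrete, here is a minimal sketch that computes the per-split accuracies by hand and compares them with `cross_val_score`. The synthetic data from `make_classification` and the names `X_demo`, `y_demo`, and `manual_scores` are assumptions for illustration, not part of the BMI example. It also assumes that, for a classifier with an integer `cv`, scikit-learn splits the data with an unshuffled `StratifiedKFold`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the BMI dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=4, random_state=0)

# One accuracy score per split, computed by hand
manual_scores = []
for train_idx, test_idx in StratifiedKFold(n_splits=5).split(X_demo, y_demo):
    fold_model = DecisionTreeClassifier(max_depth=3, random_state=0)
    fold_model.fit(X_demo[train_idx], y_demo[train_idx])
    manual_scores.append(fold_model.score(X_demo[test_idx], y_demo[test_idx]))

# The same procedure in a single call
model = DecisionTreeClassifier(max_depth=3, random_state=0)
scores = cross_val_score(model, X_demo, y_demo, cv=5)

print(len(scores))                          # one score per split
print(np.allclose(manual_scores, scores))   # compare the two approaches
```

The fixed `random_state` on the tree is only there so that both approaches build identical trees and can be compared.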

#### Code Example

Import the Pandas and NumPy libraries to begin with.

```
import pandas as pd
import numpy as np
```

The BMI index dataset from Kaggle will be used in this article. You can download it here. The goal of this dataset is to predict a person's Body Mass Index (BMI) based on gender, height, and weight. In the real world, these features may be insufficient, but for this tutorial, they will suffice.

First, read the dataset and print out the first five rows to see how the data is structured.

```
bmi_data = pd.read_csv('bmi.csv')
bmi_data.head()
```

#### Output

Next, replace the Male and Female values in the Gender column with 0 and 1, respectively, as most machine learning algorithms only work on numerical data. Then display the first five rows again using `bmi_data.head()`.

```
bmi_data = bmi_data.replace(to_replace = ['Male', 'Female'], value = [0,1])
bmi_data.head()
```

#### Output

Next, separate the independent and dependent features before training a model. Dependent (y) features are those you want to predict and are *dependent* on other variables for their values, while your independent (X) features are those used to predict the values of the dependent feature(s).

`Index` is the dependent variable in the dataset, while Gender, Height, and Weight are the independent variables. You can perform this separation using the Pandas `.iloc[]` indexer.

```
X = bmi_data.iloc[:,0:3] # Independent variable
y = bmi_data.iloc[:,3] #Dependent Variable
```
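Positional slicing with `.iloc[]` depends on the column order in the file. As a hedged alternative, the sketch below selects the same columns by name. The tiny `demo` frame is made up for illustration, and its column names are assumed to match those of the dataset.

```python
import pandas as pd

# Tiny made-up frame; column names are assumed to match the dataset
demo = pd.DataFrame({
    'Gender': [0, 1, 0],
    'Height': [174, 189, 185],
    'Weight': [96, 87, 110],
    'Index': [4, 2, 4],
})

X_by_position = demo.iloc[:, 0:3]                   # first three columns
X_by_label = demo[['Gender', 'Height', 'Weight']]   # same columns, by name
y_by_label = demo['Index']                          # the target column

print(X_by_position.equals(X_by_label))
print(list(X_by_label.columns), y_by_label.name)
```

Selecting by label keeps working if the columns are ever reordered, at the cost of hard-coding the names.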

Now you are ready to tune your model.

Import the `cross_val_score` function from `sklearn.model_selection`.

```
from sklearn.model_selection import cross_val_score
```

Next, import the Decision Tree Classifier algorithm for training, testing, and building the model.

```
from sklearn.tree import DecisionTreeClassifier
```

Next, input the algorithm, its parameters, the independent and dependent features, and the number of splits in cross-validation(cv).

The `min_samples_split` and `max_depth` parameters of the Decision Tree Classifier will be varied to find which value(s) give better accuracy.

```
cross_val = cross_val_score(
    DecisionTreeClassifier(min_samples_split=3, max_depth=3),
    X, y, cv=5)
```

Use NumPy's `np.mean()` function to display the average of the results.

```
print(cross_val, np.mean(cross_val))
```

#### Output

The mean accuracy was 0.654 for a `min_samples_split` of 3 and a `max_depth` of 3.

Run the same code with a `min_samples_split` of 5 and a `max_depth` of 7 to check whether there is any improvement.

```
cross_val_ = cross_val_score(DecisionTreeClassifier(min_samples_split = 5, max_depth = 7), X, y, cv = 5)
print(cross_val_, np.mean(cross_val_))
```

#### Output

The accuracy went up from an average of 0.654 to an average score of 0.8. Quite impressive.

You could keep choosing different values for different parameters to see which produces the right fit, but that would take a lot of time. Using a `for` loop makes things more complex because you would have to write another nested `for` statement for every additional parameter you choose to add. Luckily, `sklearn.model_selection` comes with a class called `GridSearchCV`, which allows you to select as many parameters and values as you want, helping you tune your model with a few lines of code.
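For comparison, this is roughly what the manual loop looks like. It is a sketch on synthetic data (`X_demo`, `y_demo`, and the value lists are assumptions), using `itertools.product` in place of nested `for` statements.

```python
from itertools import product

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the BMI dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=4, random_state=0)

depths = [3, 7]
splits = [3, 5]

# Mean cross-validated accuracy for every (max_depth, min_samples_split) pair
manual_results = {}
for depth, split in product(depths, splits):
    clf = DecisionTreeClassifier(max_depth=depth, min_samples_split=split,
                                 random_state=0)
    manual_results[(depth, split)] = np.mean(
        cross_val_score(clf, X_demo, y_demo, cv=5))

best_combo = max(manual_results, key=manual_results.get)
print(manual_results)
print(best_combo)
```

Every extra parameter adds another list and another loop variable, and the bookkeeping for the best combination is yours to maintain. This is the work a grid search takes over.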

### Hyperparameter Tuning using GridSearchCV

To get started, import `GridSearchCV` from `sklearn.model_selection`.

```
from sklearn.model_selection import GridSearchCV
```

GridSearchCV takes in the model you want to use on your data, followed by a dictionary containing your parameters and their values, then the number of cross-validation splits you want to perform.

Next, create an instance of the GridSearchCV object and input the values stated above to it.

```
clf_decision_tree = GridSearchCV(DecisionTreeClassifier(), {
    'max_depth': [1, 3, 5, 7, 9],
    'min_samples_split': [2, 3, 4],
}, cv=5, return_train_score=False)
```

The `max_depth` parameter takes in 5 values, while the `min_samples_split` parameter takes in 3. Because GridSearchCV evaluates every combination of these values in one go, it is more convenient than calling `cross_val_score` repeatedly.
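To see how much work this grid implies, the sketch below counts the combinations with scikit-learn's `ParameterGrid` helper. The names `grid`, `n_candidates`, and `n_fits` are introduced here for illustration only.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {
    'max_depth': [1, 3, 5, 7, 9],
    'min_samples_split': [2, 3, 4],
}

grid = ParameterGrid(param_grid)
n_candidates = len(grid)       # 5 values * 3 values
n_fits = n_candidates * 5      # each candidate is fitted once per cv split

print(n_candidates, n_fits)    # 15 75
```

With the default `refit=True`, GridSearchCV also fits the best candidate once more on the full data, so the count above excludes that final fit.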

Next, fit the data using the `.fit()` method and display the result using the `.cv_results_` attribute.

```
clf_decision_tree.fit(X,y)
clf_decision_tree.cv_results_
```

#### Output

Pass the result into a DataFrame for better visualization using the Pandas `pd.DataFrame()` constructor.

```
grid_results = pd.DataFrame(clf_decision_tree.cv_results_)
grid_results
```

#### Output

The `.cv_results_` attribute returns values such as the mean test score, the parameters, the individual accuracy of each split performed, and so on. You can view all the columns it returns using the `.columns` attribute. Not all these values are necessary, just the parameters and their mean test scores. To get these, select the `params` and `mean_test_score` columns.

```
grid_results[['params', 'mean_test_score']]
```

#### Output

The result above shows that the best accuracy was achieved when `max_depth` was set to 9 and `min_samples_split` was set to 2.
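Rather than reading the winner off the table, you can ask the fitted search object directly. The sketch below uses the `best_params_` and `best_score_` attributes and the `rank_test_score` column of `cv_results_`. It runs on synthetic data, so `X_demo`, `y_demo`, and `search` are assumptions and the numbers will not match the BMI results.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the BMI dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=4, random_state=0)

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    {'max_depth': [1, 3, 5], 'min_samples_split': [2, 3, 4]},
    cv=5,
)
search.fit(X_demo, y_demo)

# Best combination and its mean cross-validated accuracy
print(search.best_params_)
print(search.best_score_)

# The same information from the results table, sorted by rank
results = pd.DataFrame(search.cv_results_)
top_row = results.sort_values('rank_test_score').iloc[0]
print(top_row['params'], top_row['mean_test_score'])
```

When several combinations tie for the top score, the sorted table and `best_params_` may show different but equally good candidates.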

GridSearchCV is a great tool when you are confident that the best accuracy can be found within a small set of values. Its limitation is that it becomes impractical for a wide range of values. For example, setting `min_samples_split` to take every number from 2 to 100 will be computationally expensive. To solve this problem, RandomizedSearchCV is employed whenever hyperparameter tuning is performed over a wide range of values.

### Hyperparameter Tuning using RandomizedSearchCV

RandomizedSearchCV works like GridSearchCV except that it tunes the model on n randomly chosen parameter combinations, where n is the number of iterations you choose.
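The sampling step can be illustrated on its own with scikit-learn's `ParameterSampler` helper. This is a sketch of the idea, not a claim about every internal detail of RandomizedSearchCV. The `random_state=0` argument is an assumption added so the draw is repeatable.

```python
from sklearn.model_selection import ParameterSampler

param_values = {
    'max_depth': [1, 3, 5],
    'min_samples_split': [2, 3, 4],
}

# Draw 2 of the 9 possible combinations at random
sampled = list(ParameterSampler(param_values, n_iter=2, random_state=0))
for candidate in sampled:
    print(candidate)
```

Only the sampled combinations are cross-validated, which is why the cost depends on `n_iter` rather than on the size of the full grid.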

Get started by importing `RandomizedSearchCV` from `sklearn.model_selection`.

```
from sklearn.model_selection import RandomizedSearchCV
```

Next, create an instance of RandomizedSearchCV and pass in parameters similar to those of GridSearchCV. The number of iterations, `n_iter`, will be set to two. You can use any value of your choice.

```
random_cv = RandomizedSearchCV(DecisionTreeClassifier(), {
    'max_depth': [1, 3, 5],
    'min_samples_split': [2, 3, 4],
}, cv=5, n_iter=2)
```

Next, fit your data to `random_cv` and display the results in a DataFrame for easy visualization.

```
random_cv.fit(X, y)
random_cv_df = pd.DataFrame(random_cv.cv_results_)
random_cv_df
```

#### Output

Select the `params` and `mean_test_score` columns to get a clearer picture of the data.

```
random_cv_df[['params', 'mean_test_score']]
```

#### Output

The two sampled combinations used `max_depth` values of 2 and 9, with 3 as one of the `min_samples_split` values, and gave accuracies of 0.58 and 0.838.

Rerun the code.

#### Output

The second run sampled `max_depth` values of 3 and 9 and `min_samples_split` values of 4 and 2, giving accuracies of 0.58 and 0.834. The results do not differ much, but you get the idea.
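Because the combinations are drawn at random, each run can sample different values. If you need repeatable runs, one option is to fix `random_state`. The sketch below also uses `scipy.stats.randint` to describe a wide integer range instead of listing every value. The synthetic data, the `run_search` helper, and the chosen ranges are all assumptions for illustration.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the BMI dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=4, random_state=0)


def run_search(seed):
    """Hypothetical helper: run one randomized search and return the sampled params."""
    search = RandomizedSearchCV(
        DecisionTreeClassifier(random_state=0),
        {
            'max_depth': randint(1, 10),            # integers 1..9
            'min_samples_split': randint(2, 101),   # integers 2..100
        },
        n_iter=5, cv=5, random_state=seed,
    )
    search.fit(X_demo, y_demo)
    return search.cv_results_['params']


first = run_search(seed=42)
second = run_search(seed=42)
print(first == second)   # same seed, same sampled combinations
```

Here only 5 combinations are tried out of roughly 900 possible ones, which is the trade-off randomized search makes for wide ranges.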

### Choosing the best model with the right Parameters

Finding the appropriate model is as important as finding the best parameters. In this article, GridSearchCV is used to find the best model together with its best parameters, though RandomizedSearchCV would work as well.

DecisionTreeClassifier, RandomForestClassifier, and AdaBoostClassifier are the three algorithms to be tested on this dataset, along with two of their parameters.

Import the algorithm libraries to get started. You can skip importing DecisionTreeClassifier as it has been done above.

```
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
```

Next, create a list containing instances of the algorithms and the parameters to be tested embedded in a dictionary.

```
best_model = [
    {
        'model': DecisionTreeClassifier(),
        'parameters': {
            'min_samples_split': [1.0, 3, 5],
            'max_depth': [2, 4, 6]
        }
    },
    {
        'model': AdaBoostClassifier(),
        'parameters': {
            # Note: 'SAMME.R' is deprecated in recent scikit-learn releases
            'algorithm': ['SAMME', 'SAMME.R'],
            'learning_rate': [1, 2, 3]
        }
    },
    {
        'model': RandomForestClassifier(),
        'parameters': {
            'min_samples_split': [1.0, 3, 5],
            'class_weight': ['balanced', 'balanced_subsample']
        }
    }
]
```

To learn more about the parameters of each algorithm, see the scikit-learn documentation.

Next, go through the list and input the model and its parameters to GridSearchCV for tuning of the model. Also, append the best parameters and scores to a list.

```
best_model_params = []
best_model_scores = []
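for model in best_model:
    # Run a grid search for this model and its parameter grid
    clf_best = GridSearchCV(model['model'], model['parameters'], cv=5, return_train_score=False)
    clf_best.fit(X, y)
    # Keep the best parameters and the best mean accuracy found
    best_model_params.append(clf_best.best_params_)
    best_model_scores.append(clf_best.best_score_)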
```

Next, print the `best_model_params` and `best_model_scores` lists.

```
print(best_model_params)
print('')
print(best_model_scores)
```

#### Output

RandomForestClassifier returned the best accuracy with `class_weight` set to `balanced` and `min_samples_split` set to 5.
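Two parallel lists can be hard to read once more models are added. As a hedged variation on the loop above, the sketch below keeps the model name next to its score and parameters in a single DataFrame. It uses synthetic data and only two models, and the names `candidates`, `rows`, and `summary` are assumptions for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the BMI dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=4, random_state=0)

# Model name -> (estimator, parameter grid)
candidates = {
    'decision_tree': (DecisionTreeClassifier(random_state=0),
                      {'max_depth': [2, 4, 6]}),
    'random_forest': (RandomForestClassifier(n_estimators=20, random_state=0),
                      {'min_samples_split': [3, 5]}),
}

rows = []
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=5)
    search.fit(X_demo, y_demo)
    rows.append({'model': name,
                 'best_score': search.best_score_,
                 'best_params': search.best_params_})

# Best model first
summary = pd.DataFrame(rows).sort_values('best_score', ascending=False)
print(summary)
```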

### Conclusion

Hyperparameter tuning matters when building a model because it helps you find the parameters that produce the best accuracy. It can also give you an edge in competitions on platforms like Kaggle and Zindi, placing you higher on the leaderboard.

Moving forward, you can read on other preprocessing methods such as

I hope you learned the basics of Hyperparameter Tuning and how to implement it on data. Tuning your parameters will enable you to get the best out of your model.

If you have any questions, please don't hesitate to reach me on Twitter: @ee_ephraim.