
Grid Search CV in Sklearn

Hyperparameters are the settings of a machine learning model that govern how its parameters are learned. Selecting an ideal set of hyperparameters for a learning algorithm is known as hyperparameter tuning. Hyperparameters can be configured manually or automatically, and numerous algorithms exist for this tuning. One such method, GridSearchCV, is covered in this blog.

What are Hyperparameters?

Model parameters are fitted by training a model on existing data. Another class of parameters, however, cannot be learned directly from this routine training. These parameters express "higher-level" characteristics of the model, such as its complexity or learning rate, and are called hyperparameters. Hyperparameters are typically set before training starts.
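For instance, in scikit-learn's SVC the regularization strength C and the kernel type are hyperparameters chosen before fitting, while the support vectors are parameters learned from the data. A minimal sketch of the distinction (the specific values here are arbitrary):

# hyperparameters: chosen by the user before training
from sklearn import svm, datasets

data = datasets.load_iris()
model = svm.SVC(C=1.0, kernel='rbf')

# parameters: learned from the data during training
model.fit(data.data, data.target)
print(model.support_vectors_.shape)  # the fitted support vectors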

What is Grid Search CV?

Grid Search evaluates the model's performance for every possible combination of the hyperparameters and their candidate values, then selects the combination that performs best. With many hyperparameters involved, this exhaustive processing becomes time-consuming and expensive. GridSearchCV combines Grid Search with cross-validation: while each candidate model is trained, cross-validation checks its results against portions of the dataset held out of training. A short sketch of the cross-validation step on its own follows the figure below.

Grid Search across two hyperparameters (source: Wikipedia)
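To see the cross-validation half in isolation, scikit-learn's cross_val_score evaluates a single hyperparameter combination across several train/validation splits. A minimal sketch, assuming the default 5-fold strategy and an arbitrary choice of kernel and C:

from sklearn import svm, datasets
from sklearn.model_selection import cross_val_score

data = datasets.load_iris()
svc = svm.SVC(kernel='linear', C=5)

# score one hyperparameter combination on 5 train/validation splits
scores = cross_val_score(svc, data.data, data.target, cv=5)
print(scores.mean())  # average validation accuracy across the folds

GridSearchCV essentially repeats this procedure for every combination in the grid and keeps the best one.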

How Does Grid Search Work?

In its most basic form, grid search is a brute-force method for estimating hyperparameters. Say you have k hyperparameters and the i-th one has c_i candidate values. Grid search simply takes the Cartesian product of these candidates and evaluates all c_1 × c_2 × ... × c_k combinations. Although this may appear highly inefficient, the combinations are independent of one another, so the search can be sped up with parallel processing.
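To make the Cartesian product concrete, scikit-learn's ParameterGrid enumerates exactly the combinations that GridSearchCV will try. A short sketch using the same grid as the example below:

from sklearn.model_selection import ParameterGrid

# two hyperparameters with two candidates each -> 2 x 2 = 4 combinations
grid = {'kernel': ['linear', 'rbf'], 'C': [5, 20]}
for combo in ParameterGrid(grid):
    print(combo)
# prints {'C': 5, 'kernel': 'linear'}, {'C': 5, 'kernel': 'rbf'},
#        {'C': 20, 'kernel': 'linear'}, {'C': 20, 'kernel': 'rbf'}

Because the candidates are independent, GridSearchCV can evaluate them in parallel via its n_jobs argument (for example, n_jobs=-1 uses all available cores).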

Implementing Grid Search in Sklearn

# importing the libraries and classes
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV

# loading the iris dataset
data = datasets.load_iris()

# specifying the hyperparameter grid to search
parameters = {'kernel': ('linear', 'rbf'), 'C': [5, 20]}

# creating the model
svc = svm.SVC()

# creating the GridSearchCV instance
clf = GridSearchCV(svc, parameters)

# fitting the model (prints the fitted GridSearchCV object)
print(clf.fit(data.data, data.target))

# printing the keys of the cross-validation results
print(clf.cv_results_.keys())

Output

GridSearchCV(estimator=SVC(),
             param_grid={'C': [5, 20], 'kernel': ('linear', 'rbf')})

dict_keys(['mean_fit_time', 'std_fit_time', 'mean_score_time', 'std_score_time', 'param_C', 'param_kernel', 'params', 'split0_test_score', 'split1_test_score', 'split2_test_score', 'split3_test_score', 'split4_test_score', 'mean_test_score', 'std_test_score', 'rank_test_score'])
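After fitting, the winning combination and its score are exposed as attributes of the GridSearchCV object. A short follow-up to the example above (the exact values printed depend on the run):

# inspecting the best combination found by the search
print(clf.best_params_)  # e.g. {'C': 5, 'kernel': 'linear'}
print(clf.best_score_)   # mean cross-validated score of that combination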

Conclusion

This article discussed how to tune hyperparameters using GridSearchCV and walked through its implementation. In machine learning, hyperparameters are parameters that the user defines directly to regulate the learning process, and tuning them means determining the values for which the model learns best. Sklearn offers the model_selection module, which lets us create Grid Search instances and apply them for our purposes.

About the author

Simran Kaur

Simran works as a technical writer. She earned her MS in Computer Science in Silicon Valley, the well-known CS hub, and is also an editor of the website. She enjoys writing about any tech topic, including programming, algorithms, cloud, data science, and AI. Travelling, sketching, and gardening are the hobbies that interest her.