Hyperparameters are the configuration variables of a machine learning model that govern how its parameters are learned. Hyperparameter tuning is the problem of selecting a set of optimal hyperparameters for a learning algorithm. Hyperparameters can be configured manually or automatically, and numerous algorithms exist for this tuning. One such method, GridSearchCV, is covered in this blog.
What are Hyperparameters?
We can fit model parameters by training a model on existing data, but another class of parameters cannot be learned directly through this routine training. These represent the model’s “higher-level” characteristics, such as its complexity or learning rate, and they are called hyperparameters. Hyperparameters are typically set before training starts.
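For instance, in scikit-learn’s SVC, the regularization strength C and the kernel are hyperparameters supplied by the user up front, while attributes such as the support vectors are learned from the data. A minimal sketch of the distinction:

from sklearn import svm, datasets

iris = datasets.load_iris()

# hyperparameters: chosen by the user before training
model = svm.SVC(C=1.0, kernel='rbf')

# model parameters: learned from the data during training
model.fit(iris.data, iris.target)
print(model.support_vectors_.shape)  # learned by fit(), not set by the user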
What is Grid Search CV?
Grid Search evaluates the model’s performance for every possible combination of the hyperparameters and their candidate values, and selects the combination that performs best. With many hyperparameters involved, this becomes time-consuming and computationally expensive. GridSearchCV adds cross-validation on top of Grid Search: instead of judging each combination on a single train/test split, the training data is split into several folds, and each combination is scored on held-out folds, which gives a more reliable estimate of its performance.
Grid Search across two hyperparameters (source: Wikipedia)
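To see the cross-validation step on its own, here is a minimal sketch using sklearn’s cross_val_score, which scores a single model across k folds (k=5 here, matching GridSearchCV’s default; the hyperparameter values are illustrative):

from sklearn import svm, datasets
from sklearn.model_selection import cross_val_score

iris = datasets.load_iris()
svc = svm.SVC(kernel='linear', C=5)

# score the model on 5 different train/validation splits of the data
scores = cross_val_score(svc, iris.data, iris.target, cv=5)
print(scores.mean())  # average validation accuracy across the folds

GridSearchCV simply repeats this scoring for every hyperparameter combination in the grid.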
How Does Grid Search Work?
In its most basic form, grid search is a brute-force method for estimating hyperparameters. Suppose you have k hyperparameters and the i-th one has c_i candidate values. Grid search evaluates the Cartesian product of these candidate sets, i.e. c_1 × c_2 × ... × c_k combinations in total. Although this may appear highly inefficient, the combinations are independent of one another, so grid search can be sped up using parallel processing.
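To make the Cartesian product concrete, here is a small sketch using sklearn’s ParameterGrid, which enumerates exactly the combinations GridSearchCV will try (the grid values are illustrative):

from sklearn.model_selection import ParameterGrid

# 2 kernels x 2 values of C = 4 combinations to evaluate
grid = {'kernel': ['linear', 'rbf'], 'C': [5, 20]}
for params in ParameterGrid(grid):
    print(params)
# {'C': 5, 'kernel': 'linear'}
# {'C': 5, 'kernel': 'rbf'}
# ... one line per combination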
Implementing Grid Search in Sklearn
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV
# loading the iris dataset
data = datasets.load_iris()
# defining the grid of hyperparameter values to search
parameters = {'kernel':('linear', 'rbf'), 'C':[5, 20]}
# creating the model
svc = svm.SVC()
# creating a GridSearchCV instance (5-fold cross-validation by default)
clf = GridSearchCV(svc, parameters)
# fitting runs the search over all combinations; printing shows the configured instance
print(clf.fit(data.data, data.target))
# cv_results_ holds the detailed cross-validation results
print(clf.cv_results_.keys())
Output
GridSearchCV(estimator=SVC(),
             param_grid={'C': [5, 20], 'kernel': ('linear', 'rbf')})
dict_keys(['mean_fit_time', 'std_fit_time', 'mean_score_time', 'std_score_time', 'param_C', 'param_kernel', 'params', 'split0_test_score', 'split1_test_score', 'split2_test_score', 'split3_test_score', 'split4_test_score', 'mean_test_score', 'std_test_score', 'rank_test_score'])
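To extract the winning combination after the search, the fitted instance exposes best_params_ and best_score_, a short sketch continuing the example above:

# best hyperparameter combination found and its mean CV score
print(clf.best_params_)  # e.g. {'C': 5, 'kernel': 'linear'}
print(clf.best_score_)   # mean cross-validated accuracy of that combination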
Conclusion
This article discussed how to tune hyperparameters with GridSearchCV and walked through its implementation. In machine learning, hyperparameters are parameters that the user defines directly to control the learning process, and tuning them means finding the values that let the model learn best. Sklearn’s model_selection module provides the GridSearchCV class, which enables us to create grid search instances and use them for this purpose.