What is Ensemble Learning?
Ensemble learning improves machine learning results by combining several models, which generally yields better predictive performance than any single model could achieve on its own. Ensemble approaches blend multiple machine learning models into one predictive model to reduce variance (bagging), reduce bias (boosting), or improve predictions by learning how to combine the models’ outputs (stacking).
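As a concrete illustration, scikit-learn ships one class per flavor. The base estimators and parameters below are illustrative choices, a minimal sketch rather than a prescribed setup:

# Minimal sketch of the three ensemble flavors in scikit-learn.
# The base estimators and parameters are illustrative choices.
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier, StackingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

# Bagging: trains each tree on a bootstrap sample to reduce variance.
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=10)

# Boosting: fits learners sequentially, reweighting misclassified samples to reduce bias.
boosting = AdaBoostClassifier(n_estimators=50)

# Stacking: a final estimator learns how to combine the base models' predictions.
stacking = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier()), ("lr", LogisticRegression())],
    final_estimator=LogisticRegression(),
)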
What is Adaboost?
Adaboost is an example of ensemble learning, in which many learners are combined to build a more effective learning algorithm. Adaboost works by choosing a base algorithm (such as decision trees) and iteratively improving it by paying extra attention to the incorrectly classified samples in the training set. Initially, every training example is given the same weight. At each iteration, the base algorithm is fit to the training set, and the weights of the misclassified examples are increased. This process repeats “n” times, each time applying the base learner to the training set with the updated weights. The final model is a weighted combination of the “n” learners, where learners with lower error receive larger weights.
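To make the reweighting loop concrete, here is a minimal from-scratch sketch of classic binary Adaboost, assuming labels in {-1, +1} and decision stumps as the base learner. The adaboost_sketch helper is hypothetical, written for illustration, and is not the sklearn implementation used later in this article:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_sketch(X, y, n_rounds=50):
    # Illustrative sketch of classic binary Adaboost; y must be in {-1, +1}.
    n = len(y)
    w = np.full(n, 1.0 / n)                # start with uniform weights
    learners, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)   # fit base learner on weighted data
        pred = stump.predict(X)
        err = np.sum(w * (pred != y)) / np.sum(w)        # weighted error rate
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))  # learner's vote weight
        w *= np.exp(-alpha * y * pred)     # increase weights of misclassified samples
        w /= w.sum()                       # renormalize to a distribution
        learners.append(stump)
        alphas.append(alpha)
    # Final model: sign of the weighted sum of the n learners' votes.
    return lambda X_new: np.sign(
        sum(a * l.predict(X_new) for a, l in zip(alphas, learners))
    )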
Why Do We Use Adaboost?
Because Adaboost fits one weak learner at a time rather than optimizing all of its parameters simultaneously, it tends to be relatively resistant to overfitting. Applying Adaboost can raise the accuracy of weak classifiers. Beyond binary classification, Adaboost is also used for text and image classification problems, and it is frequently employed on challenging machine learning tasks.
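To see the weak-to-strong effect concretely, one can compare a lone decision stump against a boosted ensemble of stumps. This is a small sketch on an arbitrary synthetic dataset, separate from the tutorial’s main example below:

# Sketch: compare a single weak learner against a boosted ensemble.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stump = DecisionTreeClassifier(max_depth=1).fit(X_tr, y_tr)      # one weak learner
boosted = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

print("Single stump accuracy:", stump.score(X_te, y_te))
print("AdaBoost accuracy:", boosted.score(X_te, y_te))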
Implementing Adaboost in sklearn
from sklearn.ensemble import AdaBoostClassifier
from sklearn.datasets import make_classification
Creating the dataset
X, y = make_classification(n_samples=500, n_features=4, n_informative=2, n_redundant=0, random_state=0, shuffle=False)
print("Feature data is", X)
print("Label data is", y)
Output
[ 1.34699113 1.48361993 -0.04932407 0.2390336 ]
[-0.54639809 -1.1629494 -1.00033035 1.67398571]
...
[ 0.8903941 1.08980087 -1.53292105 -1.71197016]
[ 0.73135482 1.25041511 0.04613506 -0.95837448]
[ 0.26852399 1.70213738 -0.08081161 -0.70385904]]
Label data is [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
Creating the model and making predictions
# Create the AdaBoost classifier (the instantiation parameters here are typical choices)
clf = AdaBoostClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
print("Output Label is", clf.predict([[1.5, 1, 0.5, -0.5]]))
print("Classification score is", clf.score(X, y))
Output
Classification score is 0.94
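Note that this 0.94 is training accuracy, since score() is evaluated on the same data the model was fit on. For a less optimistic estimate, one could cross-validate the same model; a brief sketch reusing the X, y, and clf defined above:

from sklearn.model_selection import cross_val_score

# Held-out estimate of accuracy; reuses X, y, and clf from above.
scores = cross_val_score(clf, X, y, cv=5)
print("Cross-validated accuracy:", scores.mean())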
Conclusion
We discussed the Adaboost algorithm in machine learning, including ensemble learning, its advantages, and its implementation in sklearn. It is a helpful algorithm because it uses a set of models, rather than a single one, to decide the output, converting weak learners into a strong learner. Sklearn provides its Adaboost implementation in the “ensemble” module, where we can pass custom parameters to the model.