
Fully Customizable Machine Learning Algorithms with Shogun on Linux

Machine learning has seen a wide variety of toolkits and software modules emerge that help users apply complex machine learning and data analytics models to their data. These models extract insights that the raw data would not reveal on its own. The advantage of such modules and frameworks is that the user can run complicated inference algorithms with little effort and without an in-depth understanding of the models beforehand. This significantly speeds up model implementation and inference, delivering quick results for what would otherwise be a time-consuming task.

One such module is Shogun. Usable from Python, C++, Octave, Java, R, and more, Shogun offers some unique use cases and fine-grained control when it comes to implementing specific algorithms. Most modules try to implement some version of the most common algorithms as a one-stop solution, so that users rely on that one module for all of their machine learning needs. Shogun, on the other hand, offers not only the commonly used algorithms but also comprehensive large-scale kernel methods and fully customizable Support Vector Machines (SVMs). By combining the common algorithms with these customizations, it lets users tailor an algorithm to unique tasks in ways that other modules may not support.

Today, Shogun is used by scientists, researchers, students, and hobbyists alike. By making the toolkit easy to access, its developers have simplified model implementation, customization, and inference. This ease of use is helping Shogun become a well-adopted toolkit that gives users of every level of programming expertise the algorithm implementations they need.


Follow this step-by-step guide to install the Shogun toolkit on your Linux machine.

1. We start the installation process by first adding the Shogun repository to the Linux system by running the following command in the terminal:

$ sudo add-apt-repository ppa:shogun-toolbox/stable


2. We now update the repository information by running the following command in the terminal:

$ sudo apt-get update


3. We can now proceed to install Shogun using the terminal command:

$ sudo apt-get install libshogun18


Note: To install the Python 2 bindings, run the following command in the terminal:

$ sudo apt-get install python-shogun


4. Shogun can also be installed directly with pip, the Python package manager. Run the following command:

$ pip install shogun
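
After installing, it can be useful to confirm that the Python bindings are visible to the interpreter you intend to use. The following is a minimal sketch that relies only on the standard library. The import name `shogun` is assumed from the package names above, and the helper `is_module_available` is a name made up for this illustration.

```python
import importlib.util


def is_module_available(module_name):
    """Return True if this interpreter can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# "shogun" is assumed to be the import name of the Python bindings.
if is_module_available("shogun"):
    print("Shogun Python bindings found.")
else:
    print("Shogun Python bindings not found for this interpreter.")
```

If the check reports that the module is missing even though the installation succeeded, the bindings were probably installed for a different Python interpreter than the one running the check.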


User Guide

What differentiates Shogun from other packages is its ability to provide very specific solutions to complex use cases. For example, the decision tree and random forest classifiers in some commonly used frameworks rely on the Gini index to decide how to split the data at each node of a tree. Shogun can instead use the Chi-squared Automatic Interaction Detector (CHAID) to create these splits. CHAID is an alternative to the Gini impurity method, and depending on the use case, it sometimes produces better results.
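
To make the difference between the two criteria concrete, here is a small sketch in plain Python that scores the same candidate split both ways. It is not Shogun code: the function names and the example counts are made up for illustration, and real CHAID goes further than this (it merges similar categories and compares adjusted p-values rather than the raw statistic).

```python
def gini_impurity(class_counts):
    """Gini impurity of one node: 1 minus the sum of squared class proportions."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


def weighted_gini(table):
    """Average Gini impurity of the branches, weighted by branch size.

    `table` has one row per branch and one column per class.
    """
    total = sum(sum(row) for row in table)
    return sum(sum(row) / total * gini_impurity(row) for row in table)


def chi_squared(table):
    """Pearson chi-squared statistic of a branches-by-classes count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical split of 20 samples into two branches, with counts per class.
split = [[8, 2],   # branch 1: 8 samples of class A, 2 of class B
         [3, 7]]   # branch 2: 3 samples of class A, 7 of class B

print("weighted Gini impurity:", round(weighted_gini(split), 4))
print("chi-squared statistic:", round(chi_squared(split), 4))
```

A Gini-based tree prefers the split with the lowest weighted impurity, whereas a chi-squared-based tree prefers the split whose branches are most strongly associated with the class labels.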

For example, to build a classifier that predicts whether an outdoor game will take place on a given day based on a range of different features, we can create a CHAIDTree and provide it with the type of the data and the features that it should examine when making those splits.

ourClassifier = CHAIDTree(type_of_data, features, output_classes)

Once trained, the classifier uses the CHAID splitting methodology to build a tree, which it then uses to generate its predictions.
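
The idea behind a single CHAID-style split can be sketched without Shogun. The example below uses made-up data for the outdoor game scenario and picks the feature whose values are most strongly associated with the outcome. All names and records here are hypothetical, and the sketch compares raw chi-squared statistics, which is only reasonable because both features have the same number of categories. A full CHAID implementation compares adjusted p-values and also merges similar categories.

```python
from collections import Counter


def chi_squared_for_feature(records, feature, target):
    """Pearson chi-squared statistic between one categorical feature and the target."""
    observed = Counter((r[feature], r[target]) for r in records)
    feature_totals = Counter(r[feature] for r in records)
    target_totals = Counter(r[target] for r in records)
    n = len(records)
    statistic = 0.0
    for value, value_count in feature_totals.items():
        for label, label_count in target_totals.items():
            expected = value_count * label_count / n
            statistic += (observed[(value, label)] - expected) ** 2 / expected
    return statistic


def best_split_feature(records, features, target):
    """Return the feature with the largest chi-squared statistic against the target."""
    return max(features, key=lambda f: chi_squared_for_feature(records, f, target))


# Made-up toy data: was the outdoor game played on a given day?
days = [
    {"outlook": "sunny", "windy": "no", "played": "yes"},
    {"outlook": "sunny", "windy": "yes", "played": "yes"},
    {"outlook": "sunny", "windy": "no", "played": "yes"},
    {"outlook": "sunny", "windy": "yes", "played": "yes"},
    {"outlook": "rainy", "windy": "no", "played": "no"},
    {"outlook": "rainy", "windy": "yes", "played": "no"},
    {"outlook": "rainy", "windy": "no", "played": "no"},
    {"outlook": "rainy", "windy": "yes", "played": "yes"},
]

print("split on:", best_split_feature(days, ["outlook", "windy"], "played"))
```

In this toy data the outlook separates played and unplayed days far better than the wind does, so it is the feature chosen for the first split.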


Shogun provides its users with a range of algorithms that are commonly used in machine learning. These can be used to gain insights from data whose patterns would otherwise be difficult to interpret. Where it differs from other modules is in the specific implementations and customization options it offers for kernel methods. With its targeted implementations of similarity and dissimilarity measures, it can achieve results that sometimes outshine the competition. Which approach works best ultimately depends on the nature of the task at hand.

Used by people from all walks of STEM life, Shogun is becoming a staple in the world of machine learning by providing researchers, students, and scientists with unique solutions to problems that would otherwise require more effort to solve.

About the author

Zeeman Memon

Hi there! I'm a Software Engineer who loves to write about tech. You can reach out to me on LinkedIn.