Matplotlib violin plot

Matplotlib is a plotting library for Python. It provides an object-oriented API for embedding plots into applications and is mainly used to create 2D plots from array data. A violin plot is closely related to a box plot, but it also shows the probability density of the data at different values.

Like standard box plots, violin plots can include a marker for the median of the data and an indication of the quartiles. On top of these summary statistics, a violin plot draws a kernel density estimate of the data. Violin plots, like box plots, are used to compare the distribution of a variable across several “classes.” In this article, we discuss how to create violin plots in Matplotlib.
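To make the comparison with box plots concrete, here is a minimal sketch that draws the same two samples as a box plot and as a violin plot. The sample names (bimodal, unimodal) and their parameters are made up for illustration; the bimodal sample is chosen because its two clusters are hard to see in a box plot but show up as two bulges in the violin.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical bimodal sample: two clusters centred near 20 and 80.
bimodal = np.concatenate([rng.normal(20, 5, 500), rng.normal(80, 5, 500)])
# Hypothetical unimodal sample centred near 50 with a wide spread.
unimodal = rng.normal(50, 30, 1000)
samples = [bimodal, unimodal]

# Two axes side by side that share the same y scale.
fig, (ax_box, ax_violin) = plt.subplots(1, 2, sharey=True, figsize=(8, 4))

box_parts = ax_box.boxplot(samples)
ax_box.set_title("box plot")

violin_parts = ax_violin.violinplot(samples, showmedians=True)
ax_violin.set_title("violin plot")

# Render off-screen; in an interactive session, call plt.show() instead.
fig.canvas.draw()
```

In the resulting figure, the box for the bimodal sample looks like a single wide box, while the violin narrows in the middle where there is little data.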

Visualize violin plots using the Matplotlib library:

The matplotlib.pyplot.violinplot() function draws a violin plot for each column of a 2D dataset or for each array in a sequence of datasets. Each filled area extends to cover the full range of the sample, with optional lines at the mean, the median, the minimum, the maximum, and user-specified quantiles. In the example below, five randomly generated datasets are created with NumPy.

Each dataset has 2000 entries, but the means and standard deviations differ. A violin plot is then drawn from these datasets.

import matplotlib.pyplot as plt
import numpy as np
np.random.seed(3)
data_1 = np.random.normal(20, 50, 2000)
data_2 = np.random.normal(60, 10, 2000)
data_3 = np.random.normal(40, 70, 2000)
data_4 = np.random.normal(30, 80, 2000)
data_5 = np.random.normal(0, 10, 2000)
data = [data_1, data_2, data_3, data_4, data_5]
fig, ax = plt.subplots()
ax.violinplot(data, showmedians=True)
plt.show()

We start by importing the required libraries: matplotlib.pyplot as plt and NumPy as np. Matplotlib is used to draw the plots. We then call np.random.seed() so that the random data is reproducible. Next, we define the data for the violin plots: five variables, each storing one dataset.

Each dataset is generated with the np.random.normal() function, and a new variable, data, holds all five datasets in a list. Calling plt.subplots() creates a new figure together with an axes object. To draw the violin plots, we use the ax.violinplot() function and pass True for its showmedians argument. Finally, we display the plot with the plt.show() function.
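The ax.violinplot() call also returns a dictionary of the artists it created, which can be useful for checking or customizing the plot. The following is a small sketch, assuming the same five datasets as above; the variable names parts, median_segments, and median_values are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(3)
# Same (mean, standard deviation) pairs as in the example above.
params = [(20, 50), (60, 10), (40, 70), (30, 80), (0, 10)]
data = [np.random.normal(mean, std, 2000) for mean, std in params]

fig, ax = plt.subplots()
parts = ax.violinplot(data, showmedians=True)

# The keys name the groups of artists that were drawn.
print(sorted(parts))

# "bodies" holds one filled shape per dataset.
n_violins = len(parts["bodies"])

# "cmedians" is a collection of short horizontal line segments,
# one per violin; the y coordinate of each segment is the median.
median_segments = parts["cmedians"].get_segments()
median_values = [segment[0][1] for segment in median_segments]
print(median_values)
```

The extracted values should agree with np.median() applied to each dataset, which is a quick way to confirm what the median markers represent.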

Add grid lines to the violin plots:

Matplotlib’s ax.violinplot() method is used to make a violin plot. showmeans and showmedians are two additional arguments that can be passed. The following program creates a violin plot with four “violins” from randomly generated datasets.

import matplotlib.pyplot as plt
import numpy as np
data_1 = np.random.normal(10, 12, 300)
data_2 = np.random.normal(10, 15, 300)
data_3 = np.random.normal(10, 22, 300)
data_4 = np.random.normal(10, 20, 300)
data = [data_1, data_2, data_3, data_4]
fig, ax = plt.subplots()
ax.violinplot(data, showmeans=True, showmedians=False)
ax.set_title('violin graph')
ax.set_xlabel('X')
ax.set_ylabel('Y')
xticklabels = ['first plot', 'second plot', 'third plot', 'fourth plot']
ax.set_xticks([1, 2, 3, 4])
ax.set_xticklabels(xticklabels)
ax.yaxis.grid(True)
plt.show()

We import the matplotlib.pyplot and NumPy libraries. In the next step, we create four datasets and store them in separate variables. We then declare a list containing these four datasets and call the plt.subplots() method.

Next, the ax.violinplot() method is called with showmeans set to True and showmedians set to False. We add a title to the plot with the set_title() function. Similarly, we use the set_xlabel() and set_ylabel() functions to set the labels of both axes. The xticklabels variable holds a list of tick labels.

We set the tick positions for the four plots with set_xticks() and place the labels on the X-axis with set_xticklabels(). Before calling plt.show() to display the plot, we add horizontal grid lines with the ax.yaxis.grid() method, passing True to it.
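The mean and median markers can also be shown together, and the violins can be restyled through the artists that violinplot() returns. The sketch below is one possible variation. The seed, the colors, and the quartile markers are assumptions added for illustration, and the quantiles argument requires a reasonably recent Matplotlib release.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(0)  # illustrative seed, not part of the original example
data = [np.random.normal(10, std, 300) for std in (12, 15, 22, 20)]
positions = [1, 2, 3, 4]  # x positions of the four violins

fig, ax = plt.subplots()
parts = ax.violinplot(
    data,
    positions=positions,
    showmeans=True,
    showmedians=True,
    quantiles=[[0.25, 0.75]] * len(data),  # quartile markers for every violin
)

# Restyle the filled bodies (illustrative colors).
for body in parts["bodies"]:
    body.set_facecolor("tab:green")
    body.set_edgecolor("black")
    body.set_alpha(0.5)

# Draw the mean lines in a different color to tell them apart from the medians.
parts["cmeans"].set_color("red")

# Put the tick labels exactly under the violins and add horizontal grid lines.
ax.set_xticks(positions)
ax.set_xticklabels(["first plot", "second plot", "third plot", "fourth plot"])
ax.yaxis.grid(True)

# Render off-screen; in an interactive session, call plt.show() instead.
fig.canvas.draw()
```

Passing the same list to positions and set_xticks() keeps each label centered under its violin.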

Visualize vertical violin plots:

Here we are going to take three random data sets to create violin plots.

import matplotlib.pyplot as plt
import numpy as np
np.random.seed(50)
data_1 = np.random.normal(300, 20, 300)
data_2 = np.random.normal(40, 70, 300)
data_3 = np.random.normal(10, 30, 300)
data_list = [data_1, data_2, data_3,]
fig = plt.figure()
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])  # [left, bottom, width, height] as fractions of the figure
bp = ax.violinplot(data_list)
ax.xaxis.grid(True)
plt.show()

At the beginning of the code, we import the libraries matplotlib.pyplot as plt and NumPy as np. We randomly generate three datasets with the NumPy module and then combine them into a list.

Next, we call the plt.figure() function to create a figure. To position the axes within the figure, we use the fig.add_axes() function, which takes a rectangle [left, bottom, width, height] given as fractions of the figure size. We then draw the violin plot with the ax.violinplot() method. To add grid lines along the x-axis, we pass True to the ax.xaxis.grid() function. We finish the code by calling the plt.show() function.
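Because fig.add_axes() interprets its rectangle in figure-relative units, values between 0 and 1 keep the axes inside the visible figure area. The following sketch shows one way to check where the axes end up; the rectangle values and the figure size are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(50)
data_list = [
    np.random.normal(300, 20, 300),
    np.random.normal(40, 70, 300),
    np.random.normal(10, 30, 300),
]

fig = plt.figure(figsize=(6, 4))

# [left, bottom, width, height], each a fraction of the figure size.
rect = [0.1, 0.15, 0.8, 0.75]
ax = fig.add_axes(rect)

ax.violinplot(data_list)
ax.xaxis.grid(True)  # grid lines at the x ticks, drawn as vertical lines

# The axes report the same rectangle back as (x0, y0, width, height).
bounds = ax.get_position().bounds
print(bounds)
```

Here the axes start 10% in from the left edge and 15% up from the bottom edge, and they cover 80% of the figure width and 75% of its height.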

Visualize horizontal violin plots:

By setting the ‘vert’ argument, we can create a horizontal violin plot, as shown in the example below.

import matplotlib.pyplot as plt
import numpy as np
np.random.seed(5)
data_1 = np.random.normal(30, 30, 3000)
data_2 = np.random.normal(80, 20, 3000)
data_3 = np.random.normal(10, 40, 3000)
data_4 = np.random.normal(20, 60, 300)
data_5 = np.random.normal(70, 50, 3000)
data_6 = np.random.normal(50, 10, 3000)
d = [data_1, data_2, data_3, data_4, data_5, data_6]
fig, ax = plt.subplots()
ax.violinplot(d, vert=False, showmedians=True)
plt.show()

First, we import the libraries needed for creating violin plots. We then call np.random.seed() from the NumPy library and generate random datasets for the violin plots, storing each one in its own variable. Next, we create a list that contains all of these datasets. We call plt.subplots() to create a figure and an axes object. To draw the violin plots, we call the violinplot() method with the list of datasets as its first argument. We also pass the ‘vert’ argument to this function; its value here is False, which means the violins are drawn horizontally. Finally, we display the plot with the plt.show() function.
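In a horizontal violin plot, the violins are placed along the y-axis, so labels go on the y ticks and the median markers become vertical lines. The sketch below illustrates this with three datasets; the group names are hypothetical. It uses vert=False as in the example above. Recent Matplotlib releases also document an orientation parameter for the same purpose, so it is worth checking the documentation for the installed version.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(5)
# Hypothetical group names mapped to randomly generated datasets.
groups = {
    "group A": np.random.normal(30, 30, 3000),
    "group B": np.random.normal(80, 20, 3000),
    "group C": np.random.normal(10, 40, 3000),
}
positions = list(range(1, len(groups) + 1))  # y positions 1, 2, 3

fig, ax = plt.subplots()
parts = ax.violinplot(
    list(groups.values()),
    positions=positions,
    vert=False,
    showmedians=True,
)

# With horizontal violins, the group labels belong on the y-axis.
ax.set_yticks(positions)
ax.set_yticklabels(list(groups))
ax.set_xlabel("value")
ax.xaxis.grid(True)

# Each median marker is a vertical segment whose x coordinate is the median.
median_x = [segment[0][0] for segment in parts["cmedians"].get_segments()]
print(median_x)
```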

Conclusion:

In this tutorial, we discussed Matplotlib violin plots. Using the ‘vert’ argument, we can create these plots in both vertical and horizontal orientations. We also added grid lines to the violin plots. These plots can be configured to show the median and mean values. A violin plot can be more informative than a simple box plot: while a box plot displays only summary statistics such as the quartiles, a violin plot displays the entire distribution of the data.

About the author

Kalsoom Bibi

Hello, I am a freelance writer and usually write about Linux and other technology-related topics.