
Matplotlib 2d histogram

A two-dimensional histogram is used to examine the relationship between two attributes when there is a large number of data points. It is closely related to a one-dimensional histogram, except that one attribute is binned along the x-axis and the other along the y-axis. Instead of counting the values that fall in each interval, it counts the number of (x, y) pairs that fall in each rectangular cell and indicates that count with color intensity.
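As a rough illustration of that counting step, the sketch below uses np.histogram2d() on six hand-picked points. The point values and the two-bins-per-axis layout are assumptions chosen only to keep the result easy to check by eye.

```python
import numpy as np

# Six (x, y) points; the values are made up for illustration.
x = np.array([0.1, 0.2, 0.8, 0.9, 0.9, 0.4])
y = np.array([0.1, 0.3, 0.7, 0.9, 0.8, 0.6])

# Two bins per axis over [0, 1]: bin edges at 0, 0.5 and 1.
edges = [0.0, 0.5, 1.0]
counts, x_edges, y_edges = np.histogram2d(x, y, bins=[edges, edges])

# counts[i, j] is the number of points in x-bin i and y-bin j.
print(counts)
# [[2. 1.]
#  [0. 3.]]
```

The first index of the result follows x and the second follows y, so the 3 in the corner means three points have both coordinates above 0.5.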

It is effective when a dataset holds a vast quantity of points, because it shows the regions where the points are dense. This helps to avoid an over-plotted scatter graph. Let’s discuss the Matplotlib two-dimensional histogram in detail:

Customize 2D Histogram:

We will utilize the Matplotlib library’s built-in function matplotlib.pyplot.hist2d() to draw and modify 2D histograms. In this step, we examine how to customize a two-dimensional histogram by choosing the bin edges ourselves:

import matplotlib.pyplot as plt
import numpy as np
import random
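# x data: 700000 samples from a standard normal distribution
a = np.random.normal(size = 700000)
# y data: depends on a, plus normal noise scaled by 6
b = a * 5 + 6 * np.random.normal(size = 700000)
a_min = np.min(a)
a_max = np.max(a)
b_min = np.min(b)
b_max = np.max(b)
# bin edges: 60 edges give 59 bins along x, 30 edges give 29 bins along y
a_bins = np.linspace(a_min, a_max, 60)
b_bins = np.linspace(b_min, b_max, 30)
fig, ax = plt.subplots(figsize =(9, 4))
plt.hist2d(a, b, bins =[a_bins, b_bins])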
ax.set_xlabel('X')
ax.set_ylabel('Y')
plt.tight_layout()
plt.show()

At the start of the code, we import the libraries: matplotlib.pyplot is a plotting library for making visualizations in Python. We can use it on the web as well as in desktop applications and various graphical user interfaces. The second library, NumPy, provides arrays and numeric functions that we use to create the data.

Last but not least is random, a built-in Python module for generating random numbers; this example imports it but draws its data from np.random instead. We initialize the ‘a’ variable for the x-axis with 700000 samples from a standard normal distribution. Then we build the ‘b’ variable for the y-axis by multiplying ‘a’ by 5 and adding normally distributed noise scaled by 6. That is how we get our x-axis and y-axis data, and because ‘b’ depends on ‘a’, the two are correlated.

Further, we create two new variables, ‘a_min’ and ‘a_max’. The functions np.min() and np.max() return the smallest and largest element of an array, respectively, and we pass the x-axis data to them. The same thing is done for the y-axis data.

Next, np.linspace() creates the bin edges: 60 evenly spaced values between the minimum and maximum of the x-axis data, and 30 between the minimum and maximum of the y-axis data. Because these values are edges, the histogram has 59 bins along x and 29 bins along y. We set the size of the figure by providing the ‘figsize’ parameter to the subplots() function. The function ‘plt.hist2d’ is utilized to make a 2D histogram plot.

We pass the data together with the x-axis and y-axis bin edges as its arguments. Then we label the x and y axes with X and Y. The function plt.tight_layout() is called to adjust the padding around the plot. In the end, we show the graph by using the plt.show() method.
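To see how the bin edges translate into bins, the following sketch inspects the values that hist2d() returns. The seeded generator, the smaller sample size, and the non-interactive "Agg" backend are assumptions made so the snippet runs quickly and without a display; they are not part of the example above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so no window is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed for repeatability
a = rng.normal(size=10_000)
b = a * 5 + 6 * rng.normal(size=10_000)

# 60 edges along x and 30 edges along y.
a_edges = np.linspace(a.min(), a.max(), 60)
b_edges = np.linspace(b.min(), b.max(), 30)

fig, ax = plt.subplots(figsize=(9, 4))
# hist2d returns the counts, both edge arrays, and the drawn image.
counts, x_edges, y_edges, image = ax.hist2d(a, b, bins=[a_edges, b_edges])

print(counts.shape)       # (59, 29): one bin fewer than edges on each axis
print(int(counts.sum()))  # 10000: every point lands in some bin
```

If a fixed number of bins is wanted rather than fixed edges, passing integers such as bins=[60, 30] lets Matplotlib compute the edges from the data range.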

Customizing the color scale and inserting the color bar:

Here, we’ll change the colormap and add a color bar to a two-dimensional histogram using the following method:

import matplotlib.pyplot as plt
import numpy as np
import random
a = np.random.normal(size = 600000)
b = a * 5 + 7 * np.random.normal(size = 600000)
a_min = np.min(a)
a_max = np.max(a)
b_min = np.min(b)
b_max = np.max(b)
a_bins = np.linspace(a_min, a_max, 70)
b_bins = np.linspace(b_min, b_max, 30)
fig, ax = plt.subplots(figsize =(9, 4))
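# cmap selects the colormap used to color the bins
plt.hist2d(a, b, bins =[a_bins, b_bins], cmap = plt.cm.nipy_spectral)
plt.title("Figure")
# the color bar shows which color corresponds to which count
plt.colorbar()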
ax.set_xlabel('X')
ax.set_ylabel('Y')
plt.tight_layout()
plt.show()

Here, we import our modules matplotlib.pyplot, NumPy, and random for plotting the graphs, for numeric values, and for random numbers. In the next step, we again initialize two variables for the x-axis and y-axis, each holding 600000 random values.

The y-axis data is computed from the x-axis data plus random noise, so the two are correlated. With the help of the np.min() and np.max() functions, we get the minimum and maximum array elements for both the x-axis and y-axis. We create the figure and a single set of axes with the help of the plt.subplots() function.

We pass the size of the figure to this function. Then we draw the 2D histogram by calling the plt.hist2d() function, passing plt.cm.nipy_spectral as the ‘cmap’ argument to choose the colormap. We title the plot with the help of the plt.title() function and call plt.colorbar() to add a color bar that maps each color to a count. Further, we set the labels of the x and y axes. We call the plt.show() function to display the plot.
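Beyond picking a colormap, the color scale itself can be changed. The sketch below is one possible variation, not the method used above: it passes a LogNorm so that sparse and dense bins are both visible, and attaches the color bar explicitly to the image that hist2d() returns. The seed, sample size, bin counts, label text, and "Agg" backend are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so no window is needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import LogNorm

rng = np.random.default_rng(0)  # seeded generator, assumed for repeatability
a = rng.normal(size=50_000)
b = a * 5 + 7 * rng.normal(size=50_000)

fig, ax = plt.subplots(figsize=(9, 4))
# bins=[70, 30] asks for 70 bins along x and 30 along y.
# LogNorm maps counts to colors on a logarithmic scale.
counts, x_edges, y_edges, image = ax.hist2d(
    a, b, bins=[70, 30], cmap="nipy_spectral", norm=LogNorm()
)
# Tie the color bar to the histogram image and label it.
cbar = fig.colorbar(image, ax=ax, label="Points per bin")
ax.set_title("Figure")
ax.set_xlabel("X")
ax.set_ylabel("Y")
fig.tight_layout()
```

Calling plt.show() afterwards (with an interactive backend) or fig.savefig() would display or store the result.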

Update the datasets:

The next example again passes explicit bin edges through the bins parameter, so we control the number of bins on the X and Y axes manually. This time, however, some data points are removed before plotting, and the effect of filtering the dataset will be seen here:

import matplotlib.pyplot as plt
import numpy as np
import random
a = np.random.normal(size = 600000)
b = a * 6 + 7 * np.random.normal(size = 600000)
a_min = np.min(a)
a_max = np.max(a)
b_min = np.min(b)
b_max = np.max(b)
a_bins = np.linspace(a_min, a_max, 40)
b_bins = np.linspace(b_min, b_max, 30)
data1 = np.c_[a, b]
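for i in range(20000):
    # random.randint() includes both endpoints, so the last valid index is 599999
    x_idx = random.randint(0, 599999)
    # mark the y value of this row with a sentinel
    data1[x_idx, 1] = -9999
# keep only the rows that were not marked
data2 = data1[data1[:, 1]!=-9999]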
fig, ax = plt.subplots(figsize =(9, 6))
plt.hist2d(data2[:, 0], data2[:, 1], bins =[a_bins, b_bins])
plt.title("Figure")
ax.set_xlabel('X')
ax.set_ylabel('Y')
plt.tight_layout()
plt.show()

In this case, we import the libraries matplotlib.pyplot, NumPy, and random. Then we initialize the x and y-axis data with the ‘a’ and ‘b’ variables, respectively. Both are filled with 600000 values drawn with np.random.normal().

After that, we create the bin edges for both the x and y-axis with np.linspace(). In addition to this, we make a two-column array by merging ‘a’ and ‘b’ with np.c_. We utilize a for loop that runs 20000 times. In each iteration, random.randint() returns a random row index, and we overwrite the y value of that row with the sentinel -9999. After the loop ends, we keep only the rows whose y value is not -9999, which removes the marked points. Then we adjust the size of the graph.

So we provide the ‘figsize’ parameter to the function plt.subplots(). We draw a 2D histogram from the filtered data by calling the function plt.hist2d(). Further, we set the title of the figure and the labels of the axes. At the end of the code, we display the graph using the plt.show() function.
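The same kind of filtering can also be written without a sentinel value. The sketch below is an alternative under stated assumptions, not the approach used above: it draws the indices with NumPy's rng.integers(), whose upper bound is exclusive, and removes the rows with a boolean mask. The names keep, drop_idx, and filtered are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)  # seeded generator, assumed for repeatability
n = 600_000
a = rng.normal(size=n)
b = a * 6 + 7 * rng.normal(size=n)
data = np.column_stack([a, b])  # two columns: x and y

# 20000 random row indices in [0, n); duplicates are possible.
drop_idx = rng.integers(0, n, size=20_000)

# Start with every row kept, then switch off the selected rows.
keep = np.ones(n, dtype=bool)
keep[drop_idx] = False
filtered = data[keep]

# Slightly fewer than 20000 rows are removed when indices repeat.
print(n - len(filtered))
```

The two columns filtered[:, 0] and filtered[:, 1] can then be passed to plt.hist2d() in the same way as before.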

Use matplotlib hexbin method:

To build two-dimensional histograms, we can also utilize the hexbin() method, which groups the points into hexagonal cells instead of rectangles. So we’ll go over how to use the Matplotlib hexbin technique in this illustration:

import matplotlib.pyplot as plt
import numpy as np
import random
a = np.random.normal(size = 700000)
b = a * 6 + 8 * np.random.normal(size = 700000)  
fig, ax = plt.subplots(figsize =(8, 6))
plt.title("Figure")
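# gridsize sets the number of hexagons along the x-axis;
# in hexbin(), bins only controls how the counts are mapped to colors
plt.hexbin(a, b, gridsize = 60)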
ax.set_xlabel('X')
ax.set_ylabel('Y')
plt.tight_layout()
plt.show()

The last example imports the same libraries for drawing graphs, numeric values, and random numbers. Next, we initialize the x and y-axis data and fill them with random values with the help of np.random.normal().

In addition to this, we draw the figure by applying the plt.hexbin() method, which counts the points that fall in each hexagonal cell. We title the plot with the help of the plt.title() method. Further, we add labels to both axes. In the end, we show the graph after adjusting the layout.
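One detail that is easy to miss is that hexbin() takes its resolution from the gridsize parameter, while its bins parameter only affects how counts are turned into colors. The sketch below shows gridsize together with a color bar; the seed, sample size, colormap, label text, and "Agg" backend are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so no window is needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import PolyCollection

rng = np.random.default_rng(0)  # seeded generator, assumed for repeatability
n = 50_000
a = rng.normal(size=n)
b = a * 6 + 8 * rng.normal(size=n)

fig, ax = plt.subplots(figsize=(8, 6))
# gridsize=60 requests 60 hexagons across the x-direction.
hexes = ax.hexbin(a, b, gridsize=60, cmap="viridis")
fig.colorbar(hexes, ax=ax, label="Points per hexagon")
ax.set_title("Figure")
ax.set_xlabel("X")
ax.set_ylabel("Y")
fig.tight_layout()

# hexbin returns a PolyCollection; its array holds one value per hexagon.
counts = hexes.get_array()
print(isinstance(hexes, PolyCollection))
```

A larger gridsize gives smaller hexagons and a finer picture, at the cost of noisier counts in sparse regions.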

Conclusion:

In this article, we have covered how to use the matplotlib.pyplot.hist2d() function to create 2D histograms. We changed the colormap of a 2D histogram and added a color bar to it. We observed the effect on the histogram after filtering the dataset. The hexbin() method can also be used to draw a two-dimensional histogram.

About the author

Kalsoom Bibi

Hello, I am a freelance writer and usually write for Linux and other technology related content