Python NumPy histogram() tutorial

A histogram is a mapping of intervals to frequencies. It is used to approximate the probability density function of a variable, and it is usually drawn as a bar graph. Many options are available in Python for building and plotting histograms. The NumPy library is useful for scientific and mathematical operations, and one of its important features is the histogram() function. This function computes the frequency distribution of a data set; it does not draw anything itself, so a plotting library such as matplotlib is needed to display the result. When a histogram is drawn, the class intervals (bins) appear as vertical rectangles whose heights represent the frequencies. Knowledge of creating NumPy arrays is necessary to understand the examples shown in this tutorial.

Syntax:

numpy.histogram(input_array, bins=10, range=None, normed=None, weights=None, density=None)

This function accepts up to six arguments in the signature shown above and returns the computed histogram of a set of data. The purposes of these arguments are explained below.

  • input_array: It is a mandatory argument that contains the input data (named a in the NumPy documentation). The histogram is computed over the flattened array.
  • bins: It is an optional argument that can take an integer, a sequence of numbers, or a string. An integer defines the number of equal-width bins in the given range (10 by default). A sequence defines the bin edges, including the rightmost edge; it must increase monotonically, and it allows non-uniform bin widths. In newer NumPy versions, a string value such as 'auto' can be used to name a method that chooses the bins automatically.
  • range: It is an optional argument that defines the lower and upper limits of the bins as a (lower, upper) pair. The default range is the minimum and maximum of the input data. Values outside the range are ignored. The first element of the range must be less than or equal to the second element.
  • normed: It is a deprecated optional argument that was an older equivalent of density. It could produce incorrect results for unequal bin widths, and it has been removed from recent NumPy releases (1.24 and later), so use density instead.
  • weights: It is an optional argument that defines an array of weights with the same shape as the input array. Each value adds its weight, instead of 1, to the count of its bin.
  • density: It is an optional argument that takes a Boolean value. If this argument’s value is False (the default behavior), the number of samples in each bin is returned; if it is True, the values of the probability density function are returned, normalized so that the integral over the range is 1.
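The optional arguments described above can be combined as in the following minimal sketch. The sample data and the weight values are made up for illustration only and do not come from the examples in this tutorial.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration
data = np.array([1, 2, 2, 3, 5, 8, 9])

# bins as an integer: 4 equal-width bins between data.min() and data.max()
counts_int, edges_int = np.histogram(data, bins=4)

# range fixes the outer limits of the bins to 0 and 10
counts_rng, edges_rng = np.histogram(data, bins=5, range=(0, 10))

# bins as a string: NumPy chooses the bins with a named estimator
counts_auto, edges_auto = np.histogram(data, bins="auto")

# weights: each sample adds its weight to its bin instead of adding 1
weights = np.array([0.5, 1.0, 1.0, 2.0, 1.0, 1.0, 0.5])
counts_w, edges_w = np.histogram(data, bins=5, range=(0, 10), weights=weights)

print(counts_int, edges_int)
print(counts_rng, edges_rng)
print(counts_auto, edges_auto)
print(counts_w)
```

Note that the weighted result is a floating-point array, because the weights are not integers.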

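This function returns two arrays. The first is the hist array, which contains the histogram values (counts or densities) for each bin. The second is the bin edges array, which contains the boundaries of the bins and therefore has one more element than the hist array.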
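The relationship between the two returned arrays can be checked with a short sketch. The input values here are hypothetical; the names hist and bin_edges follow the naming used in the NumPy documentation.

```python
import numpy as np

data = [1, 2, 2, 3, 4]  # hypothetical sample values
hist, bin_edges = np.histogram(data, bins=[1, 2, 3, 4])

# There is always one more edge than there are bins
print(len(hist), len(bin_edges))

# Every bin is half-open [left, right) except the last one, which also
# includes its right edge, so the value 4 is counted in the final bin
for left, right, count in zip(bin_edges[:-1], bin_edges[1:], hist):
    print(f"{left} to {right}: {count}")
```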

Example 1: Print the histogram array

The following example shows the use of the histogram() function with a one-dimensional input and a bins argument given as a sequence of edges. A list of 5 integers is used as the input, and a list of 5 increasing values is used as the bin edges, which defines 4 bins. The histogram array and the bin array are printed together as a tuple.

# Import NumPy library
import numpy as np
# Call histogram() function that returns histogram data
np_array = np.histogram([10, 3, 8, 9, 7], bins=[2, 4, 6, 8, 10])
# Print the histogram output
print("The output of histogram is : \n", np_array)

Output:

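The script prints the tuple returned by histogram(). The first array, [1, 0, 1, 3], holds the number of values in each bin, and the second array, [2, 4, 6, 8, 10], holds the bin edges. The last bin includes its right edge, so 8, 9, and 10 are all counted in the bin from 8 to 10.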

Example 2: Print the histogram and bin arrays

The following example shows how the histogram array and the bin array can be created by using the histogram() function. A NumPy array is created by using the arange() function in the script. Next, the histogram() function is called, and its two return values are stored separately as the histogram array and the bin array.

# Import NumPy library
import numpy as np

# Create NumPy array using arange()
np_array = np.arange(90)
# Create histogram data
hist_array, bin_array = np.histogram(np_array, bins=[0, 10,  25, 45, 70, 100])

# Print histogram array
print("The data of the histogram array is: ", hist_array)
# Print bin array
print("The data of the bin array is: ",  bin_array)

Output:

The script prints the histogram array [10 15 20 25 20] and the bin array [0 10 25 45 70 100]. The counts add up to 90, the number of values created by arange(90).
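When the bin edges are given explicitly, values outside the outermost edges are not counted. The sketch below illustrates this with the data from Example 2; the shortened edge list is a hypothetical variation added for comparison.

```python
import numpy as np

np_array = np.arange(90)

# Same edges as Example 2: every value from 0 to 89 falls into some bin
full_hist, _ = np.histogram(np_array, bins=[0, 10, 25, 45, 70, 100])

# Hypothetical variation: without the last edge, values above 70 fall outside
short_hist, _ = np.histogram(np_array, bins=[0, 10, 25, 45, 70])

print(full_hist.sum())   # number of values counted with the full edges
print(short_hist.sum())  # smaller, because values greater than 70 are ignored
```

Comparing the sum of the histogram array with the size of the input is a quick way to notice that some data fell outside the bins.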

Example 3: Print the histogram and bin arrays based on density argument

The following example shows the use of the density argument of the histogram() function. A NumPy array of 20 numbers is created by using the arange() function. The histogram() function is called first with density set to False, which returns the counts, and then with density set to True, which returns the probability density values. The bins argument is omitted, so the default of 10 equal-width bins is used.

# Import NumPy library
import numpy as np
# Create a NumPy array of 20 sequential numbers
np_array = np.arange(20)
# Calculate the histogram data with false density
hist_array, bin_array = np.histogram(np_array, density=False)
print("The histogram output by setting density to False: \n", hist_array)
print("The output of bin array : \n", bin_array)
# Calculate the histogram data with true density
hist_array, bin_array = np.histogram(np_array, density=True)
print("\nThe histogram output by setting density to True: \n", hist_array)
print("The output of bin array : \n", bin_array)

Output:

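With density set to False, each of the 10 bins contains 2 values. With density set to True, each bin has a value of approximately 0.0526, which is 2 / (20 × 1.9), because there are 20 values and each bin is 1.9 wide. The bin array is the same in both calls and runs from 0 to 19 in steps of 1.9.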
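The meaning of the density values can be verified with a small sketch that reuses the array from Example 3. The names widths and manual are introduced here for illustration.

```python
import numpy as np

np_array = np.arange(20)
count_hist, bin_array = np.histogram(np_array, density=False)
density_hist, _ = np.histogram(np_array, density=True)

# Width of each bin, taken from consecutive bin edges
widths = np.diff(bin_array)

# Density times width, summed over all bins, is the total area under the histogram
print(np.sum(density_hist * widths))

# The density values can be reproduced from the raw counts
manual = count_hist / (count_hist.sum() * widths)
print(np.allclose(manual, density_hist))
```

The total area is 1 (up to floating-point rounding), which is what distinguishes a density from a plain count. The density values themselves do not have to sum to 1 unless every bin has a width of 1.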

Example 4: Draw a bar chart using histogram data

The matplotlib library must be installed before executing this example’s script, because NumPy only computes the histogram data and does not draw it. hist_array and bin_array are created by using the histogram() function. These arrays are then passed to the bar() function of the matplotlib library to create the bar chart.

# import necessary libraries
import matplotlib.pyplot as plt
import numpy as np

# Create histogram dataset
hist_array, bin_array = np.histogram([4, 10, 3, 13, 8, 9, 7], bins=[2, 4, 6, 8, 10, 12, 14])

# Set some configurations for the chart
plt.figure(figsize=[10, 5])
plt.xlim(min(bin_array), max(bin_array))
plt.grid(axis='y', alpha=0.75)
plt.xlabel('Edge Values', fontsize=20)
plt.ylabel('Histogram Values', fontsize=20)
plt.title('Histogram Chart', fontsize=25)

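# Create the chart, drawing each bar from the left edge across the full width of its bin
plt.bar(bin_array[:-1], hist_array, width=np.diff(bin_array), align='edge', color='blue', edgecolor='black')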
# Display the chart
plt.show()

Output:

The script opens a window that shows a bar chart with one bar per bin. The bar for the bin from 8 to 10 has a height of 2, and all other bars have a height of 1.
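A bar chart needs a position and a width for each bar, and both can be derived from the bin array. The following sketch, which uses the same data as Example 4, computes the bin widths and bin centers with NumPy only; the names widths and centers are chosen here for illustration.

```python
import numpy as np

hist_array, bin_array = np.histogram([4, 10, 3, 13, 8, 9, 7], bins=[2, 4, 6, 8, 10, 12, 14])

# Width of each bin, from consecutive edges
widths = np.diff(bin_array)

# Midpoint of each bin, halfway between its left and right edge
centers = (bin_array[:-1] + bin_array[1:]) / 2

print(centers)
print(widths)

# These could then be passed to matplotlib, for example:
# plt.bar(centers, hist_array, width=widths)
```

Placing the bars at the bin centers with the true bin widths keeps each bar inside the interval it represents, which also works when the bins are not equally wide.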

Conclusion:

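This tutorial has explained the histogram() function with several simple examples, covering explicit bin edges, the two returned arrays, the density argument, and plotting the result with matplotlib. These examples should help readers understand the purpose of the function and apply it properly in their own scripts.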

About the author

Fahmida Yesmin

I am a trainer of web programming courses. I like to write articles and tutorials on various IT topics. I have a YouTube channel, Tutorials4u Help, where many types of tutorials based on Ubuntu, Windows, Word, Excel, WordPress, Magento, Laravel, etc. are published.