
NumPy percentile

The percentile function in NumPy calculates the q-th percentile of the data in an array along a specified axis.

A percentile refers to the value below which a specified percentage of data falls.
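To make the definition concrete, the sketch below reproduces the calculation by hand using linear interpolation, which is the default method in np.percentile. The helper name manual_percentile is hypothetical and exists only for illustration; it assumes a non-empty 1D input and a single q between 0 and 100.

```python
import numpy as np

def manual_percentile(values, q):
    """Hypothetical helper: q-th percentile using linear interpolation."""
    data = np.sort(np.asarray(values, dtype=float))
    # Fractional position of the percentile within the sorted data
    pos = (q / 100) * (len(data) - 1)
    lower = int(np.floor(pos))
    upper = int(np.ceil(pos))
    # Interpolate between the two neighbouring data points
    return data[lower] + (pos - lower) * (data[upper] - data[lower])

sample = [10, 14, 7, 4, 3, 2, 8, 1]
manual_result = manual_percentile(sample, 25)
numpy_result = np.percentile(sample, 25)
print(manual_result, numpy_result)
```

Both values come out as 2.75: the 25th percentile sits three quarters of the way between the second and third sorted values (2 and 3).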

Let us understand how we can use the percentile function in NumPy.

Function Syntax

The function syntax is shown below:

numpy.percentile(a, q, axis=None, out=None, overwrite_input=False, method='linear', keepdims=False, *, interpolation=None)

Let us look at the function's parameters.

Function Parameters

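  1. a – the input array (or array-like object) whose percentile we need to calculate.
  2. q – the percentile, or sequence of percentiles, to compute. Each value must be between 0 and 100 (inclusive).
  3. axis – the axis or axes along which to compute the percentile. The default, None, computes it over the flattened array.
  4. out – an optional output array in which to place the result. It must have the same shape as the expected output.
  5. overwrite_input – if True, allows the function to modify the input array during intermediate calculations to save memory. The contents of the input array are undefined afterward.
  6. method – the method used to estimate the percentile when it falls between two data points. The default is 'linear'; check the docs for the other accepted values. This parameter replaces the deprecated interpolation parameter.
  7. keepdims – if True, the reduced axes are kept in the result as dimensions of size one, so the result broadcasts correctly against the input array.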
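The short sketch below illustrates two of the less obvious parameters, keepdims and method. It assumes NumPy 1.22 or later, where the method keyword is available; the array values are arbitrary sample data.

```python
import numpy as np

arr = np.array([[10, 14, 7, 4], [3, 2, 8, 1]])

# Without keepdims, the reduced axis disappears from the result
reduced = np.percentile(arr, 50, axis=1)
print(reduced.shape)  # (2,)

# keepdims=True keeps the reduced axis as a dimension of size one
kept = np.percentile(arr, 50, axis=1, keepdims=True)
print(kept.shape)  # (2, 1)

# method controls the estimate when the percentile falls between two points
linear_value = np.percentile(arr, 50, method="linear")  # interpolates
lower_value = np.percentile(arr, 50, method="lower")    # takes the lower neighbour
print(linear_value, lower_value)
```

For this data the 50th percentile falls between 4 and 7, so 'linear' gives 5.5 while 'lower' gives 4.0.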

Function Return Value

The percentile() function returns a scalar or an array with the percentile values along the specified axis. If q is a single percentile and axis is None, the result is a scalar.
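As a quick illustration of how the shape of the return value depends on q and axis, consider the sketch below. The variable names are chosen for this example only.

```python
import numpy as np

arr = np.array([[10, 14, 7, 4], [3, 2, 8, 1]])

# Single q over the whole array: a scalar
scalar_result = np.percentile(arr, 50)
# Single q along axis 0: one value per column
per_column = np.percentile(arr, 50, axis=0)
# Several percentiles along axis 0: one row per percentile
multi_q = np.percentile(arr, [25, 50, 75], axis=0)

print(np.ndim(scalar_result))  # 0
print(per_column.shape)        # (4,)
print(multi_q.shape)           # (3, 4)
```

When q is a sequence, the first axis of the result corresponds to the requested percentiles.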

Example #1

Take the example provided in the code below:

# import numpy
import numpy as np
# create 1D array
arr_1d = np.array([10,14,7,4,3,2,8,1])
# calculate 25th percentile
print(f"25th percentile: {np.percentile(a=arr_1d, q=25)}")
# calculate 50th percentile
print(f"50th percentile: {np.percentile(a=arr_1d, q=50)}")
# 100th percentile
print(f"100th percentile: {np.percentile(a=arr_1d, q=100)}")

The code above uses the percentile function to calculate the 25th, 50th, and 100th percentiles of a one-dimensional array.

Since the input is 1D, the function will return a scalar value as shown in the output below:

25th percentile: 2.75
50th percentile: 5.5
100th percentile: 14.0

Example #2

Consider the code below, which calculates the percentiles of a 2D array along axis 0.

# import numpy
import numpy as np
# 2D array
arr_2d = np.array([[10, 14, 7, 4], [3, 2, 8, 1]])
# calculate 25th percentile
print(f"25th percentile: {np.percentile(a=arr_2d, q=25, axis=0)}")
# calculate 50th percentile
print(f"50th percentile: {np.percentile(a=arr_2d, q=50, axis=0)}")
# 100th percentile
print(f"100th percentile: {np.percentile(a=arr_2d, q=100, axis=0)}")

In the example above, we calculate the 25th, 50th, and 100th percentiles of a 2D array along axis 0, that is, down each column. The resulting output is shown below:

25th percentile: [4.75 5.   7.25 1.75]
50th percentile: [6.5 8.  7.5 2.5]
100th percentile: [10. 14.  8.  4.]
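One way to sanity-check the 50th percentile row is to compare it with the median, since the two coincide under the default method. The sketch below assumes the same sample array as the example above.

```python
import numpy as np

arr_2d = np.array([[10, 14, 7, 4], [3, 2, 8, 1]])

# The 50th percentile along axis 0 should match the column-wise median
p50 = np.percentile(arr_2d, 50, axis=0)
median = np.median(arr_2d, axis=0)
print(np.allclose(p50, median))
```

The comparison uses np.allclose rather than exact equality to stay robust against floating-point rounding.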

Example #3

The code below demonstrates the percentile function with a 2D array along various axes.

# import numpy
import numpy as np
# 2D array
arr_2d = np.array([[10, 14, 7, 4], [3, 2, 8, 1]])
# calculate 25th percentile
print(f"25th percentile (axis=1): {np.percentile(a=arr_2d, q=25, axis=1)}")
# calculate 50th percentile
print(f"50th percentile (axis=-1): {np.percentile(a=arr_2d, q=50, axis=-1)}")
# 100th percentile
print(f"100th percentile (axis=None): {np.percentile(a=arr_2d, q=100, axis=None)}")

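The code above calculates the percentiles of a 2D array along various axes. For a 2D array, axis=1 and axis=-1 both refer to the last axis, so the function returns one value per row.

NOTE: Setting the axis to None flattens the array before calculating the percentile, so the result is a single value, as shown in the output below: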

25th percentile (axis=1): [6.25 1.75]
50th percentile (axis=-1): [8.5 2.5]
100th percentile (axis=None): 14.0
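The flattening behaviour of axis=None can be checked directly by flattening the array yourself and comparing the results. This is a minimal sketch using the same sample data as the example.

```python
import numpy as np

arr_2d = np.array([[10, 14, 7, 4], [3, 2, 8, 1]])

# Flatten explicitly, then compute the percentile of the 1D result
flattened = np.percentile(arr_2d.flatten(), 50)
# Let the function flatten the input via axis=None
no_axis = np.percentile(arr_2d, 50, axis=None)

print(flattened == no_axis)
```

Both calls operate on the same eight values, so they return the same scalar.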

Conclusion

This article covered how to use the percentile function in NumPy. Try tweaking the function’s parameters to better understand how it behaves under various conditions.

Thanks for reading!!

About the author

John Otieno

My name is John and I am a fellow geek like you. I am passionate about all things computers, from hardware and operating systems to programming. My dream is to share my knowledge with the world and help out fellow geeks. Follow my content by subscribing to the LinuxHint mailing list.