Python Statistics Standard Deviation

Standard deviation is a key measure of how dispersed a data set is. A smaller standard deviation means the values cluster closely around the mean, while a larger one means they are more spread out. It is computed as the square root of the variance. Although variance and standard deviation both measure dispersion, the standard deviation is used more often because it is expressed in the same units as the data.

Standard deviation is used, for example, in statistical testing and data visualization. This article demonstrates several ways to compute the standard deviation of a set of data in Python.
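As a small illustration of this idea, the sketch below compares two made-up lists that share the same mean but differ in spread. The data and the names `tight` and `wide` are hypothetical; `mean`, `pstdev`, and `pvariance` come from Python's standard `statistics` module.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical sample data: both lists have the same mean but different spread
tight = [9, 10, 11]
wide = [0, 10, 20]

for name, values in (("tight", tight), ("wide", wide)):
    print(name, "mean:", mean(values), "std dev:", pstdev(values))

# Variance is in squared units; its square root (the standard deviation)
# is back in the units of the data
print(pvariance(wide), pstdev(wide) ** 2)
```

The list whose values sit closer to the mean reports the smaller standard deviation, and squaring the standard deviation recovers the variance up to floating-point rounding.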

Example no 1:
To compute the standard deviation, we can either write our own function or use the built-in methods of pandas or NumPy. Let's start with a pure Python version that computes the standard deviation without importing any external libraries.

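def get_s_dev(list):
    # Number of elements in the list
    a = len(list)
    # Mean: sum of the elements divided by their count
    m = sum(list) / a
    # Population variance: mean of the squared deviations from the mean
    v = sum((x - m) ** 2 for x in list) / a
    # Standard deviation: square root of the variance
    s_dev = v ** 0.5
    return s_dev

list = [17, 22, 44, 13, 29, 72, 60, 27]
print(get_s_dev(list))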

At the start of the program, we define the standard deviation function, which takes a list as its parameter. Inside it, we first get the length of the list with the len() function and store it in the variable 'a'. Next, we compute the mean: the sum of the list, obtained with sum(), divided by its length. The mean is saved in the variable 'm'. Then we compute the variance of the list.

Here, we apply the formula for the variance: a generator expression with a 'for' loop sums the squared differences between each element and the mean, and the total is divided by the length of the list. The standard deviation is the square root of the variance, so the variance is raised to the power of 0.5 and the result is returned. Because the sum is divided by the number of elements, this is the population standard deviation. Outside the function, we declare a variable 'list'.

Here, we set some sample values. To display the standard deviation of the list, we call the get_s_dev() function with the list as its argument.

In this illustration, we wrote a function that returns the standard deviation of a list of integers. Note that both the mean and the variance are computed with Python's built-in sum() function, which returns the sum of the items in a sequence.

There are several other ways to determine the standard deviation of a set of data. With the right modules, it takes a single line once the data is stored as a NumPy array or a pandas Series.
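One way to sanity-check a hand-written function like this is to compare it with the standard library. The sketch below assumes the function computes the population standard deviation (dividing by the number of elements) and checks it against `statistics.pstdev`; the names `values`, `data`, `ours`, and `reference` are chosen here for illustration.

```python
import math
from statistics import pstdev

def get_s_dev(values):
    # Population standard deviation: square root of the mean squared deviation
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return variance ** 0.5

data = [17, 22, 44, 13, 29, 72, 60, 27]
ours = get_s_dev(data)
reference = pstdev(data)

print(ours, reference)
# Compare with a tolerance, since floating-point results may differ slightly
print(math.isclose(ours, reference))
```

Comparing with `math.isclose` rather than `==` avoids false alarms from rounding differences between the two calculations.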

Example no 2:
The standard deviation can be computed directly by storing the elements in a NumPy array and calling the ndarray std() method. Let's look at an example.

import numpy as np
list = [17, 23, 14, 33, 19, 10, 40, 62]
x = np.array(list)
print(x.std())

First, we import the NumPy library as np. Then we define the elements of the data set and store them in the variable 'list'. Next, we call the array() function of the NumPy module with the list as its argument to create the array 'x'. In the last line, we call x.std() and pass the result to print() to display the standard deviation. By default, std() divides by the number of elements, so it returns the population standard deviation.

Example no 3:
The data can alternatively be stored as a pandas Series, whose std() method then computes the standard deviation. This is similar to the NumPy array approach, and many pandas functions are wrappers around NumPy functions. One difference is that the pandas std() method uses ddof=1 by default and therefore returns the sample standard deviation, whereas NumPy defaults to ddof=0. Let's use the pandas module to calculate the standard deviation of a set of elements.

import pandas as pd
l = [34, 22, 74, 23, 19, 16, 40, 62]
c = pd.Series(l)
print(c.std())

Here, we import the required module 'pandas' as pd. We specify the data set's elements as a list, saved in the variable 'l'. We then call the Series() constructor of the pandas module with that list as its argument, which builds a pandas Series from the list values. In the final line, we call the std() method to compute the standard deviation and pass the result to print() to display it.
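The NumPy and pandas std() methods can report different numbers for the same data, because NumPy defaults to ddof=0 (population standard deviation) while pandas defaults to ddof=1 (sample standard deviation). The sketch below illustrates the two conventions using only the standard library. The helper `std_dev` and its `ddof` parameter are hypothetical names that mirror the NumPy and pandas keyword; they are not part of either library.

```python
import math
from statistics import pstdev, stdev

def std_dev(values, ddof=0):
    """Standard deviation with a divisor of len(values) - ddof.

    The ddof name is borrowed from NumPy and pandas for illustration only.
    """
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - ddof))

data = [34, 22, 74, 23, 19, 16, 40, 62]
population = std_dev(data, ddof=0)  # the convention NumPy uses by default
sample = std_dev(data, ddof=1)      # the convention pandas uses by default

print("population:", population)
print("sample:", sample)
```

The sample value is always the larger of the two, since it divides the same sum by n - 1 instead of n. If both libraries need to agree, passing an explicit ddof to std() in each is the usual approach.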

Example no 4:
In this example, we will determine the standard deviation of data sets containing different kinds of numbers.

from statistics import stdev
from fractions import Fraction as fr

set_1 = (11, 22, 15, 41, 78, 59, 90)
set_2 = (-21, -14, -33, -51, -35, -26)
set_3 = (-59, -71, -20, 12, 15, 33, 74, 69)
set_4 = (5.13, 4.40, 3.31, 8.5, 7.2)

print("The calculated Standard Deviation of Set 1: %s" % stdev(set_1))
print("The calculated Standard Deviation of Set 2: %s" % stdev(set_2))
print("The calculated Standard Deviation of Set 3: %s" % stdev(set_3))
print("The calculated Standard Deviation of Set 4: %s" % stdev(set_4))

We import the stdev function from the statistics module and Fraction as 'fr' from the fractions module (the alias is imported but not used in this example). Now, we create four data sets, each stored as a tuple. The first is stored in the variable 'set_1' and contains only positive integers. The second is stored in 'set_2' and consists only of negative values. Next, we declare a variable 'set_3'.

Here, we define the elements of the third data set, which mixes positive and negative values. The last data set, stored in 'set_4', holds floating-point values. Finally, we call print() once for each data set to display its standard deviation, computed in every case with the stdev() function. Note that stdev() returns the sample standard deviation, which divides by n - 1.
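The statistics functions also accept Fraction values. The sketch below is a minimal illustration with a made-up tuple, `set_5`, which is not part of the original example. The type of the value returned by stdev() for fractions may depend on the Python version, so the code only prints the results.

```python
from fractions import Fraction as fr
from statistics import stdev, variance

# Hypothetical fifth data set made of fractions: 1/2, 3/2, 5/2
set_5 = (fr(1, 2), fr(3, 2), fr(5, 2))

# Sample variance: squared deviations from the mean (3/2) sum to 2,
# divided by n - 1 = 2
print("Variance of Set 5: %s" % variance(set_5))

# Sample standard deviation: square root of the sample variance
print("Standard Deviation of Set 5: %s" % stdev(set_5))
```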

Conclusion

In this article, we looked at several ways to compute the standard deviation in Python. In the first example, we calculated the standard deviation of a data set from its mean and variance. In the next two examples, we used the NumPy and pandas modules to obtain the standard deviation of a list of elements. In the last example, we used the statistics module to obtain the standard deviation of data sets containing different kinds of numbers.

About the author

Kalsoom Bibi

Hello, I am a freelance writer and usually write about Linux and other technology-related topics.