How to Work with Normal Distribution in MATLAB Using fitdist

The normal distribution is a probability distribution widely used in artificial intelligence, data science, machine learning, and many other fields. It is symmetric about the mean and is also referred to as the Gaussian distribution. Data values close to the mean occur more frequently than values far from the mean, so on a graph the normal distribution forms a bell curve.

Fitting a normal distribution to a data set by hand can be tedious; however, we can do it in MATLAB with the fitdist() function. Read this guide to learn in detail about working with the normal distribution in MATLAB using the fitdist() function.

What is Normal Distribution

A normal distribution, also called a Gaussian distribution, is defined by two parameters: the mean and the standard deviation of the data points. The mean is the average of the data values, while the standard deviation measures how widely the values are spread around the mean. Given both parameters, the probability density of the normal distribution is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

Where:

  • x represents a data value.
  • f(x) represents the probability density function.
  • μ denotes the mean.
  • σ denotes the standard deviation.
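As a quick check of the density defined by μ and σ, the sketch below writes the formula out term by term and compares it with normpdf() from the Statistics and Machine Learning Toolbox. The values of mu, sigma, and x are assumptions chosen only for illustration; they do not come from any data set in this guide.

```matlab
% Assumed illustrative values (not taken from the article)
mu = 10;          % mean
sigma = 2;        % standard deviation
x = 4:0.5:16;     % points at which to evaluate the density

% Normal probability density written out term by term
f_manual = 1 ./ (sigma * sqrt(2*pi)) .* exp(-(x - mu).^2 ./ (2 * sigma^2));

% Same density computed by the toolbox function normpdf
f_builtin = normpdf(x, mu, sigma);

% Largest absolute difference between the two results
max_diff = max(abs(f_manual - f_builtin));
disp(max_diff)
```

The difference should be at the level of floating-point rounding, which suggests the hand-written expression and the built-in function describe the same curve.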

How to Fit a Normal Distribution in MATLAB Using the fitdist() Function

MATLAB lets us fit a normal distribution to data using the fitdist() function, which is part of the Statistics and Machine Learning Toolbox. This function produces a probability distribution object by fitting the specified distribution to the input data. The normal distribution is described by two parameters: the mean and the standard deviation. A standard normal distribution has a mean of zero and a standard deviation of 1. This means that the standard normal distribution is centered at zero and its values are spread out symmetrically on both sides of the mean.
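The properties of the standard normal distribution can be checked with a short sketch. It assumes the Statistics and Machine Learning Toolbox is installed and uses makedist(), which is not otherwise covered in this guide, to build the distribution object directly instead of fitting it to data.

```matlab
% makedist with no parameters returns a normal distribution with mu = 0, sigma = 1
pd_std = makedist('Normal');

% Half of the probability lies below the mean
p_left = cdf(pd_std, 0);

% Symmetry: the density is the same at equal distances on both sides of the mean
d_neg = pdf(pd_std, -1.5);
d_pos = pdf(pd_std, 1.5);

% Probability of falling within one standard deviation of the mean
within_one_sigma = cdf(pd_std, 1) - cdf(pd_std, -1);

fprintf('P(X <= 0) = %.4f\n', p_left);
fprintf('pdf(-1.5) = %.4f, pdf(1.5) = %.4f\n', d_neg, d_pos);
fprintf('P(-1 <= X <= 1) = %.4f\n', within_one_sigma);
```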

Syntax

The fitdist() function can be called in the following ways:

pd = fitdist(x,distname)
pd = fitdist(x,distname,Name,Value)
[pdca,gn,gl] = fitdist(x,distname,'By',groupvar)

Here:

  • pd = fitdist(x,distname) fits the distribution named by distname to the data in the column vector x and returns a probability distribution object.
  • pd = fitdist(x,distname,Name,Value) builds the probability distribution object using one or more name-value pair arguments that specify additional options.
  • [pdca,gn,gl] = fitdist(x,distname,'By',groupvar) fits the distribution named by distname to the data in the column vector x separately for each group defined by the grouping variable groupvar. It returns a cell array of fitted probability distribution objects, pdca, a cell array of group labels, gn, and a cell array of grouping variable levels, gl.
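Before moving to real data, the call forms can be tried on synthetic data whose true parameters are known. The sketch below is only an illustration: the true mean of 5, the true standard deviation of 2, and the numeric grouping variable g are all assumed values, not part of any data set used in this guide.

```matlab
rng(0);                              % fix the seed so the run is repeatable
x = 5 + 2 * randn(1000, 1);          % column vector drawn with mean 5 and std 2 (assumed values)
g = [ones(500, 1); 2 * ones(500, 1)];  % hypothetical grouping variable with two groups

% First form: one fit to all of the data
pd = fitdist(x, 'Normal');

% Grouped form: one fit per level of the grouping variable
[pdca, gn, gl] = fitdist(x, 'Normal', 'By', g);

fprintf('mu = %.3f, sigma = %.3f\n', pd.mu, pd.sigma);
fprintf('number of group fits: %d\n', numel(pdca));
```

The fitted mu and sigma should land close to the values used to generate the data, and pdca should hold one distribution object per group.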

Example 1: How to Fit a Normal Distribution Using the fitdist(x,distname) Function

This example fits a normal distribution to the sample data z using the fitdist() function.

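load patients              % sample data set shipped with MATLAB
z = Weight;                % patient weights as a column vector
pd = fitdist(z,'Normal')   % no semicolon, so the fitted object is displayed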
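Once the fit is done, the estimated parameters can be read from the returned object. The sketch below assumes the same patients data as the example and compares the fitted values with the sample mean and sample standard deviation; for uncensored data the two are expected to coincide, but treat that as something to verify on your own release rather than a guarantee.

```matlab
load patients
z = Weight;
pd = fitdist(z, 'Normal');

mu_hat = pd.mu;        % fitted mean
sigma_hat = pd.sigma;  % fitted standard deviation
ci = paramci(pd);      % confidence intervals, one column per parameter

% Compare the fitted parameters with the sample statistics
fprintf('mu    = %.4f, mean(z) = %.4f\n', mu_hat, mean(z));
fprintf('sigma = %.4f, std(z)  = %.4f\n', sigma_hat, std(z));
disp(ci)
```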

Example 2: How to Fit a Distribution Using the fitdist(x,distname,Name,Value) Function

In this example, we fit a kernel distribution to the sample data using the fitdist() function in MATLAB. The name-value pair 'Kernel','epanechnikov' selects the Epanechnikov kernel as the smoothing function.

load patients
z = Weight;
pd = fitdist(z,'Kernel','Kernel','epanechnikov')
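To see how the kernel fit differs from a normal fit on the same data, both densities can be evaluated on a common grid and plotted together. This is a sketch under assumptions: the grid 50:0.5:300 and the legend labels are chosen here for illustration, and the area check with trapz() is only a rough numerical sanity test.

```matlab
load patients
z = Weight;

pd_normal = fitdist(z, 'Normal');
pd_kernel = fitdist(z, 'Kernel', 'Kernel', 'epanechnikov');

z_values = 50:0.5:300;                  % assumed grid, wide enough to cover the weights
normal_pdf = pdf(pd_normal, z_values);
kernel_pdf = pdf(pd_kernel, z_values);

% Both curves are densities, so the area under each should be close to 1
area_normal = trapz(z_values, normal_pdf);
area_kernel = trapz(z_values, kernel_pdf);
fprintf('area (normal) = %.4f, area (kernel) = %.4f\n', area_normal, area_kernel);

figure
plot(z_values, normal_pdf, 'LineWidth', 2)
hold on
plot(z_values, kernel_pdf, 'LineStyle', ':', 'LineWidth', 2)
legend({'Normal fit', 'Kernel fit'}, 'Location', 'NorthEast')
hold off
```

The normal fit is always a single bell curve, while the kernel fit follows the shape of the data more closely and can show more than one peak.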

Example 3: How to Fit Normal Distributions Using the fitdist(x,distname,'By',groupvar) Function

The following MATLAB code fits a normal distribution to each group of the data, then computes and plots the pdf of both groups.

load patients
z = Weight;
% Fit one normal distribution per gender
[pdca,gn,gl] = fitdist(z,'Normal','By',Gender)
female = pdca{1}
male = pdca{2}
% Evaluate both fitted densities on a common grid of weight values
z_values = 80:1:220;
femalepdf = pdf(female,z_values);
malepdf = pdf(male,z_values);
% Plot the two densities on the same axes
figure
plot(z_values,femalepdf,'LineWidth',2)
hold on
plot(z_values,malepdf,'Color','r','LineStyle',':','LineWidth',2)
legend(gn,'Location','NorthEast')
hold off
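The grouped fits can also be summarized numerically instead of graphically. The sketch below loops over the cell array returned by fitdist() and prints the parameters of each group next to its label; the threshold of 150 is a hypothetical value picked only to show how cdf() can be applied to each fitted object.

```matlab
load patients
[pdca, gn] = fitdist(Weight, 'Normal', 'By', Gender);

% Collect the fitted parameters of every group
mus = cellfun(@(d) d.mu, pdca);
sigmas = cellfun(@(d) d.sigma, pdca);

% Probability, under each fitted model, of a weight above the threshold
threshold = 150;   % hypothetical value for illustration
p_above = cellfun(@(d) 1 - cdf(d, threshold), pdca);

for k = 1:numel(pdca)
    fprintf('%s: mu = %.2f, sigma = %.2f, P(weight > %d) = %.3f\n', ...
        gn{k}, mus(k), sigmas(k), threshold, p_above(k));
end
```

Printing the label from gn alongside each result avoids relying on the order in which the groups are returned.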

Conclusion

The normal distribution is a probability distribution that is widely used in machine learning, artificial intelligence, data science, and many other fields. It is defined by two parameters: the mean and the standard deviation of the data points. We can fit a normal distribution object to a dataset using the fitdist() function. This guide has covered the basics of the normal distribution and how to work with it in MATLAB using the fitdist() function.

About the author

Komal Batool Batool
