How to Plot Data in Pandas Python

Data visualization plays an important role in data analysis. Pandas is a powerful Python library for data analysis, and its .plot() method provides several options for visualizing data. Even as a beginner, you can plot your data easily with Pandas. To follow along, import the pandas and matplotlib.pyplot packages.

In this article, we explore several ways to plot data with Pandas. All examples were run in the PyCharm editor and use the matplotlib.pyplot package.

Plotting in Pandas Python

In Pandas, the .plot() method accepts several parameters that you can set based on your needs. The most important is the 'kind' parameter, which defines the type of plot to create.

The Syntax for Plotting Data using Pandas Python

The following syntax is used to plot a DataFrame in Pandas Python:

# Import the pandas and matplotlib.pyplot packages
import pandas as pd
import matplotlib.pyplot as plt

# Prepare the data used to create the DataFrame
data_frame = {
    'Column1': ['field1', 'field2', 'field3', 'field4',...],
    'Column2': ['field1', 'field2', 'field3', 'field4',...]
    }
var_df = pd.DataFrame(data_frame, columns=['Column1', 'Column2'])
print(var_df)

# Plot a bar graph
var_df.plot.bar(x='Column1', y='Column2')
plt.show()

You can also define the plot kind by using the kind parameter as follows:

var_df.plot(x='Column1', y='Column2', kind='bar')

Pandas DataFrame objects provide the following plot methods:

  • Scatter Plotting: plot.scatter()
  • Bar Plotting: plot.bar() and plot.barh(), where the h stands for horizontal bars
  • Line Plotting: plot.line()
  • Pie Plotting: plot.pie()

If you call the plot() method without the kind parameter, it creates a line graph by default.
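
As a minimal sketch of this default behaviour, the snippet below calls plot() once without kind and once with kind='bar', then inspects the Matplotlib Axes that each call returns. The DataFrame name sales and its values are made up for illustration, and the Agg backend is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data, used only for illustration
sales = pd.DataFrame({"units": [3, 5, 4, 6]}, index=[2019, 2020, 2021, 2022])

ax_default = sales.plot()         # no kind given, so pandas draws a line chart
ax_bar = sales.plot(kind="bar")   # equivalent to sales.plot.bar()

n_lines = len(ax_default.lines)   # one line per numeric column
n_bars = len(ax_bar.patches)      # one rectangle per row

print(n_lines, n_bars)
plt.close("all")                  # release the figures
```

In an interactive session, drop the matplotlib.use("Agg") line and call plt.show() instead of plt.close("all") to see both figures.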

We will now elaborate on some major types of plotting in detail with the help of some examples.

Scatter Plotting in Pandas

A scatter plot represents the relationship between two variables. Let's take an example.

Example

Suppose we have data on two related variables, GDP_growth and Oil_Price. To plot the relationship between them, we ran the following code in our source code editor:

import matplotlib.pyplot as plt
import pandas as pd
# Prepare the data as a dictionary of columns
gdp_cal = {
    'GDP_growth': [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
    'Oil_Price': [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565]
}
df = pd.DataFrame(gdp_cal, columns=['Oil_Price', 'GDP_growth'])
print(df)
# Draw one red point per row: oil price on the x-axis, GDP growth on the y-axis
df.plot(x='Oil_Price', y='GDP_growth', kind='scatter', color='red')
plt.show()
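
Since the example is about how the two variables relate, it can help to put a number on what the scatter plot shows. The sketch below is one possible way to do that, using Series.corr(), which computes the Pearson correlation coefficient by default. The variable name r is our own choice, not part of the original example.

```python
import pandas as pd

# Same values as in the scatter plot example above
df = pd.DataFrame({
    "GDP_growth": [6.1, 5.8, 5.7, 5.7, 5.8, 5.6, 5.5, 5.3, 5.2, 5.2],
    "Oil_Price": [1500, 1520, 1525, 1523, 1515, 1540, 1545, 1560, 1555, 1565],
})

# Pearson correlation between the two columns (a value between -1 and 1)
r = df["Oil_Price"].corr(df["GDP_growth"])
print(f"Correlation between Oil_Price and GDP_growth: {r:.3f}")
```

For this data the coefficient is strongly negative, which matches the downward trend of the points in the scatter plot.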

Line Chart Plotting in Pandas

The line chart is a basic plot type in which the data is displayed as a series of data points connected by straight line segments. Line charts are also useful for showing trends over time.

Example

In the example below, we use data on the inflation rate over past years. First, prepare the data and create a DataFrame. The following source code then plots a line graph of the data:

import pandas as pd
import matplotlib.pyplot as plt

infl_cal = {'Year': [2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011],
        'Infl_Rate': [5.8, 10, 7, 6.7, 6.8, 6, 5.5, 8.2, 8.5, 9, 10]
        }
data_frame = pd.DataFrame(infl_cal, columns=['Year', 'Infl_Rate'])
data_frame.plot(x='Year', y='Infl_Rate', kind='line')
plt.show()

In the above example, setting kind='line' produces the line chart.

Method 2: Using the plot.line() method

You can also implement the above example using the following method:

import pandas as pd
import matplotlib.pyplot as plt

inf_cal = {'Year': [2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011],
        'Inflation_Rate': [5.8, 10, 7, 6.7, 6.8, 6, 5.5, 8.2, 8.5, 9, 10]
        }
data_frame = pd.DataFrame(inf_cal, columns=['Inflation_Rate'], index=[2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011])
data_frame.plot.line()

plt.title('Inflation Rate Summary of Past 11 Years')
plt.ylabel('Inflation_Rate')
plt.xlabel('Year')
plt.show()

Running the above code displays a line graph of the inflation rate by year.
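
The plot methods return a Matplotlib Axes object, so the title and axis labels can also be set on that object instead of through plt.title() and friends. The sketch below is an alternative to the code above rather than a replacement; it assumes a headless Agg backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

years = list(range(2001, 2012))
rates = [5.8, 10, 7, 6.7, 6.8, 6, 5.5, 8.2, 8.5, 9, 10]
data_frame = pd.DataFrame({"Inflation_Rate": rates}, index=years)

# plot.line() returns the Axes it drew on; label that Axes directly
ax = data_frame.plot.line()
ax.set_title("Inflation Rate Summary of Past 11 Years")
ax.set_xlabel("Year")
ax.set_ylabel("Inflation_Rate")

print(ax.get_title())
plt.close("all")  # in an interactive session, call plt.show() instead
```

Working with the returned Axes is handy when several figures are open, because the labels go to a specific plot rather than to whichever figure is currently active.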

Bar Chart Plotting in Pandas

Bar charts are used to represent categorical data. Rectangular bars are plotted with lengths proportional to the values they represent. The bars can be drawn either vertically or horizontally.

Example

The following example uses the literacy rates of several countries. A DataFrame is created with two columns, 'Country_Names' and 'litr_Rate'. Using Pandas, you can plot this information as a bar graph as follows:

import pandas as pd
import matplotlib.pyplot as plt

lit_cal = {
    'Country_Names': ['Pakistan', 'USA', 'China', 'India', 'UK', 'Austria', 'Egypt', 'Ukraine', 'Saudia', 'Australia',
                      'Malaysia'],
    'litr_Rate': [5.8, 10, 7, 6.7, 6.8, 6, 5.5, 8.2, 8.5, 9, 10]
    }
data_frame = pd.DataFrame(lit_cal, columns=['Country_Names', 'litr_Rate'])
print(data_frame)
data_frame.plot.bar(x='Country_Names', y='litr_Rate')
plt.show()

You can also implement the above example using the following method. Set kind='bar' for bar chart plotting in this line:

data_frame.plot(x='Country_Names', y='litr_Rate', kind='bar')
plt.show()
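
Bar charts are often easier to read when the bars are ordered by value. As an optional sketch (sorting is our addition, not part of the original example), the snippet below sorts the rows with sort_values() before plotting. It uses only the first five countries from the data above and assumes a headless Agg backend.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

# First five rows of the example data
lit_cal = {
    "Country_Names": ["Pakistan", "USA", "China", "India", "UK"],
    "litr_Rate": [5.8, 10, 7, 6.7, 6.8],
}
data_frame = pd.DataFrame(lit_cal)

# Sort ascending by value, then plot one bar per row
sorted_frame = data_frame.sort_values("litr_Rate")
ax = sorted_frame.plot.bar(x="Country_Names", y="litr_Rate")

heights = [bar.get_height() for bar in ax.patches]
labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(labels)
plt.close("all")  # in an interactive session, call plt.show() instead
```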

Horizontal bar chart plotting

You can also plot the data on horizontal bars by executing the following code:

import matplotlib.pyplot as plt
import pandas as pd

data_chart = {'litr_Rate': [5.8, 10, 7, 6.7, 6.8, 6, 5.5, 8.2, 8.5, 9, 10]}
df = pd.DataFrame(data_chart, columns=['litr_Rate'], index=['Pakistan', 'USA', 'China', 'India', 'UK', 'Austria', 'Egypt', 'Ukraine', 'Saudia', 'Australia',
                      'Malaysia'])

df.plot.barh()

plt.title('Literacy Rate in Various Countries')
plt.ylabel('Country_Names')
plt.xlabel('litr_Rate')
plt.show()

In df.plot.barh(), barh stands for horizontal bar plotting. Running the above code displays a horizontal bar chart with one bar per country.
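
As with vertical bars, the same chart can be requested through the kind parameter. The short sketch below uses kind='barh' on the first three countries from the data above and checks that the bar lengths run along the x-axis. The Agg backend is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

values = [5.8, 10, 7]
df = pd.DataFrame({"litr_Rate": values}, index=["Pakistan", "USA", "China"])

# kind='barh' is equivalent to df.plot.barh()
ax = df.plot(kind="barh")

# For horizontal bars, the value is the bar's width, not its height
widths = [bar.get_width() for bar in ax.patches]
countries = [tick.get_text() for tick in ax.get_yticklabels()]
print(list(zip(countries, widths)))
plt.close("all")  # in an interactive session, call plt.show() instead
```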

Pie Chart Plotting in Pandas

A pie chart represents the data as a circle divided into slices, where the size of each slice is proportional to the quantity it represents.

Example

In the following example, we display the 'Earth_Part' data as slices of a pie chart. First, create the DataFrame, then use Pandas to display the details on the graph.

import pandas as pd
import matplotlib.pyplot as plt

material_per = {'Earth_Part': [71,18,7,4]}
dataframe = pd.DataFrame(material_per,columns=['Earth_Part'],index = ['Water','Mineral','Sand','Metals'])

dataframe.plot.pie(y='Earth_Part',figsize=(7, 7),autopct='%1.1f%%', startangle=90)
plt.show()

The above source code plots a pie chart of the available data.
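
The autopct argument labels each slice with its share of the total. The sketch below computes those shares by hand and draws the same pie from a Series instead of a DataFrame. Using a Series and creating the Axes explicitly with plt.subplots() are our choices for illustration, and the Agg backend is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

material_per = pd.Series(
    [71, 18, 7, 4],
    index=["Water", "Mineral", "Sand", "Metals"],
    name="Earth_Part",
)

# Share of each part in percent; this is what autopct prints on the slices
shares = material_per * 100 / material_per.sum()

fig, ax = plt.subplots(figsize=(7, 7))
material_per.plot.pie(ax=ax, autopct="%1.1f%%", startangle=90)

n_slices = len(ax.patches)  # one wedge per index entry
print(shares.round(1).to_dict())
plt.close("all")  # in an interactive session, call plt.show() instead
```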

Conclusion

In this article, you have seen how to plot DataFrames in Pandas. We covered scatter, line, bar, and pie plots. To create other kinds of plots, such as box, hexbin, hist, kde, density, or area, you can usually reuse the same source code and change only the plot kind, provided the data suits that kind.
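
As a rough sketch of switching kinds, the snippet below reuses the inflation data from the line chart example and draws it as a histogram and as a box plot. Only these two kinds are shown. Some kinds have extra requirements: for example, hexbin, like scatter, needs both an x and a y column. The Agg backend is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

infl = pd.DataFrame(
    {"Infl_Rate": [5.8, 10, 7, 6.7, 6.8, 6, 5.5, 8.2, 8.5, 9, 10]}
)

# Same data, different kind
ax_hist = infl.plot(kind="hist")  # histogram with the default of 10 bins
ax_box = infl.plot(kind="box")    # box plot of the single column

n_bins = len(ax_hist.patches)     # one rectangle per bin
print(n_bins)
plt.close("all")  # in an interactive session, call plt.show() instead
```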

About the author

Samreena Aslam

Samreena Aslam holds a master’s degree in Software Engineering. Currently, she's working as a Freelancer & Technical writer. She's a Linux enthusiast and has written various articles on Computer programming, different Linux flavors including Ubuntu, Debian, CentOS, and Mint.