In Python, a dictionary stores information as key-value pairs and is optimized for retrieving a value when its key is known. To look up values efficiently by their index label, we can convert a pandas Series or DataFrame with a meaningful index into a dictionary of “index: value” pairs. The to_dict() method does this. It is a built-in method of both the pandas Series and DataFrame classes.
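As a minimal sketch of this idea, the snippet below turns a small Series into a dictionary keyed by its index. The Series name `fees` and its labels are made up for illustration and do not come from the examples that follow.

```python
import pandas

# Hypothetical Series: the index holds the lookup keys, the values hold the data
fees = pandas.Series([1000, 400], index=["a", "b"])

# Series.to_dict() returns a dictionary of {index label: value}
fee_by_label = fees.to_dict()

print(fee_by_label)       # {'a': 1000, 'b': 400}
print(fee_by_label["b"])  # 400
```

Once the data is in a dictionary, a lookup by label is an ordinary key access.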
Depending on the value of the orient parameter, DataFrame.to_dict() converts a DataFrame into a Python dictionary of dictionaries, lists, or Series, or into a list of dictionaries.
The to_dict() method can arrange the returned dictionary’s key-value pairs in several ways. Its syntax is as follows:
Syntax:
DataFrame.to_dict(orient='dict', into=dict)
Parameters:
- orient: a string (“dict”, “list”, “series”, “split”, “records”, or “index”) that determines how the columns (Series) are represented in the result. For instance, “list” returns a Python dictionary whose keys are the column names and whose values are lists built from each column. The default is “dict”.
- into: the mapping class used for the returned dictionary. It can be passed as the class itself or as an instance. For instance, a collections.defaultdict must be passed as an initialized instance. The default is dict.
Return Type:
A dictionary built from the DataFrame or Series. With orient set to “records”, a list of dictionaries is returned instead.
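The two parameters can be sketched together as follows. This is only an illustration: it uses the “list” orient described above and passes a defaultdict instance to into. The variable names `as_lists` and `as_default` are assumptions chosen for this sketch.

```python
from collections import defaultdict

import pandas

remarks = pandas.DataFrame(
    [[23, "sravan", "pass", 1000], [21, "sravan", "fail", 400]],
    columns=["id", "name", "status", "fee"],
)

# orient="list": each column name maps to a plain list of that column's values
as_lists = remarks.to_dict(orient="list")
print(as_lists["id"])  # [23, 21]

# into: a defaultdict is passed as an initialized instance, not as the bare class
as_default = remarks.to_dict(orient="list", into=defaultdict(list))
print(type(as_default).__name__)  # defaultdict
```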
Data:
In all the examples, we will use the following DataFrame named “remarks”, which holds 2 rows and 4 columns. The column labels are ['id', 'name', 'status', 'fee'].
import pandas
# Create the dataframe using lists
remarks = pandas.DataFrame([[23,'sravan','pass',1000],
[21,'sravan','fail',400],
],columns=['id','name','status','fee'])
# Display the DataFrame - remarks
print(remarks)
Output:
   id    name  status   fee
0  23  sravan    pass  1000
1  21  sravan    fail   400
Example 1: to_dict() with No Parameters
We will convert the remarks DataFrame to a dictionary without passing any parameters to the to_dict() method.
import pandas
# Create the dataframe using lists
remarks = pandas.DataFrame([[23,'sravan','pass',1000],
[21,'sravan','fail',400],
],columns=['id','name','status','fee'])
# Convert to Dictionary
print(remarks.to_dict())
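Output:
{'id': {0: 23, 1: 21}, 'name': {0: 'sravan', 1: 'sravan'}, 'status': {0: 'pass', 1: 'fail'}, 'fee': {0: 1000, 1: 400}}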
Explanation
The DataFrame is converted to a dictionary.
Each column label of the original DataFrame becomes a key in the outer dictionary. Its value is an inner dictionary that holds the column’s two values, keyed by the row index labels, which start from 0.
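The nested layout can be explored with ordinary dictionary access, as in this short sketch. The variable name `nested` is an assumption made for illustration.

```python
import pandas

remarks = pandas.DataFrame(
    [[23, "sravan", "pass", 1000], [21, "sravan", "fail", 400]],
    columns=["id", "name", "status", "fee"],
)

nested = remarks.to_dict()

# The outer key is the column name, the inner key is the row's index label
print(nested["status"][1])  # fail
print(list(nested.keys()))  # ['id', 'name', 'status', 'fee']
```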
Example 2: to_dict() with ‘series’
We will convert the remarks DataFrame to a dictionary whose values are pandas Series by passing ‘series’ as the orient argument to the to_dict() method.
Format: {column -> Series(values)}
import pandas
# Create the dataframe using lists
remarks = pandas.DataFrame([[23,'sravan','pass',1000],
[21,'sravan','fail',400],
],columns=['id','name','status','fee'])
# Convert to Dictionary with series of values
print(remarks.to_dict('series'))
Output:
{'id': 0    23
1    21
Name: id, dtype: int64, 'name': 0    sravan
1    sravan
Name: name, dtype: object, 'status': 0    pass
1    fail
Name: status, dtype: object, 'fee': 0    1000
1     400
Name: fee, dtype: int64}
Explanation
The DataFrame is converted to a dictionary in the ‘series’ format.
Each column label of the original DataFrame becomes a key, and its value is the column as a Series, printed with its index labels, name and data type. The ‘id’ and ‘fee’ columns have the int64 data type, while ‘name’ and ‘status’ are ‘object’.
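Because every value in this layout is still a pandas Series, Series methods remain available on it. The following sketch checks that and sums one column. The names `by_series` and `total_fee` are assumptions for illustration.

```python
import pandas

remarks = pandas.DataFrame(
    [[23, "sravan", "pass", 1000], [21, "sravan", "fail", 400]],
    columns=["id", "name", "status", "fee"],
)

by_series = remarks.to_dict("series")

# Each value is a Series, so its type and dtype can be inspected directly
for column, values in by_series.items():
    print(column, type(values).__name__, values.dtype)

# Series methods still work on the dictionary's values
total_fee = by_series["fee"].sum()
print(total_fee)  # 1400
```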
Example 3: to_dict() with ‘split’
If you want to separate the row labels, column labels and values in the converted dictionary, use the ‘split’ orient. The ‘index’ key stores a list of index labels, the ‘columns’ key holds a list of column names, and the ‘data’ key holds a nested list in which each row’s values form one inner list.
Format: {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}
import pandas
# Create the dataframe using lists
remarks = pandas.DataFrame([[23,'sravan','pass',1000],
[21,'sravan','fail',400],
],columns=['id','name','status','fee'])
# Convert to Dictionary with separate index, columns and data
print(remarks.to_dict('split'))
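Output:
{'index': [0, 1], 'columns': ['id', 'name', 'status', 'fee'], 'data': [[23, 'sravan', 'pass', 1000], [21, 'sravan', 'fail', 400]]}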
Explanation
We can see that the two index labels are stored in a list under the ‘index’ key. Similarly, the column names are stored in a list under the ‘columns’ key, and each row is stored as a list inside the nested list under the ‘data’ key.
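One practical use of the three separate parts is rebuilding a DataFrame from them. The sketch below is an illustration under that assumption, with `parts` and `rebuilt` as made-up names. It passes the three lists back to the pandas.DataFrame constructor.

```python
import pandas

remarks = pandas.DataFrame(
    [[23, "sravan", "pass", 1000], [21, "sravan", "fail", 400]],
    columns=["id", "name", "status", "fee"],
)

parts = remarks.to_dict("split")

# Feed the three lists back into the DataFrame constructor
rebuilt = pandas.DataFrame(
    parts["data"], index=parts["index"], columns=parts["columns"]
)

# Converting the rebuilt frame again yields the same three parts
print(rebuilt.to_dict("split") == parts)  # True
```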
Example 4: to_dict() with ‘records’
To convert your DataFrame to a list in which each row is a dictionary, pass ‘records’ as the orient to the to_dict() method. Each row becomes a dictionary whose keys are the column names and whose values are the row’s values in the DataFrame. All rows are stored in a list. Note that recent pandas versions no longer accept the shortened spelling ‘record’.
Format: [{column -> value}, ..., {column -> value}]
import pandas
# Create the dataframe using lists
remarks = pandas.DataFrame([[23,'sravan','pass',1000],
[21,'sravan','fail',400],
],columns=['id','name','status','fee'])
# Convert to a list of dictionaries, one per record
print(remarks.to_dict('records'))
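Output:
[{'id': 23, 'name': 'sravan', 'status': 'pass', 'fee': 1000}, {'id': 21, 'name': 'sravan', 'status': 'fail', 'fee': 400}]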
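As a hedged illustration of the row-per-dictionary layout, the sketch below loops over the rows and filters them with a list comprehension. It uses the full spelling 'records', and the names `records` and `passed` are assumptions for this example.

```python
import pandas

remarks = pandas.DataFrame(
    [[23, "sravan", "pass", 1000], [21, "sravan", "fail", 400]],
    columns=["id", "name", "status", "fee"],
)

records = remarks.to_dict("records")

# Each element of the list is one row, keyed by column name
for row in records:
    print(row["name"], row["status"])

# Ordinary list comprehensions work on the result
passed = [row["id"] for row in records if row["status"] == "pass"]
print(passed)  # [23]
```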
Example 5: to_dict() with ‘index’
Here, each row is placed in a dictionary and stored as the value of its index label, which starts from 0. All rows are stored together in an outer dictionary.
Format: {index -> {column -> value}}
import pandas
# Create the dataframe using lists
remarks = pandas.DataFrame([[23,'sravan','pass',1000],
[21,'sravan','fail',400],
],columns=['id','name','status','fee'])
# Convert to Dictionary with index
print(remarks.to_dict('index'))
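Output:
{0: {'id': 23, 'name': 'sravan', 'status': 'pass', 'fee': 1000}, 1: {'id': 21, 'name': 'sravan', 'status': 'fail', 'fee': 400}}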
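The ‘index’ layout becomes a convenient lookup table when the index carries a meaningful key. The sketch below assumes the ‘id’ column is unique and promotes it to the index with set_index() before converting. The name `by_id` is made up for illustration.

```python
import pandas

remarks = pandas.DataFrame(
    [[23, "sravan", "pass", 1000], [21, "sravan", "fail", 400]],
    columns=["id", "name", "status", "fee"],
)

# Assumption: 'id' is unique, so it can serve as the dictionary key
by_id = remarks.set_index("id").to_dict("index")

print(by_id[21])         # {'name': 'sravan', 'status': 'fail', 'fee': 400}
print(by_id[23]["fee"])  # 1000
```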
Example 6: OrderedDict()
Let us use the ‘into’ parameter and pass OrderedDict to it, which converts the pandas DataFrame into an ordered dictionary.
import pandas
from collections import OrderedDict
# Create the dataframe using lists
remarks = pandas.DataFrame([[23,'sravan','pass',1000],
[21,'sravan','fail',400],
],columns=['id','name','status','fee'])
# Convert to OrderedDict
print(remarks.to_dict(into=OrderedDict))
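Output:
The printed result is an OrderedDict keyed by column name, and each value is itself an OrderedDict keyed by index label. The exact text of the printout depends on the Python version.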
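Since the printed form of an OrderedDict can differ between Python versions, checking the types is a more stable way to confirm the effect of ‘into’. A minimal sketch, with `ordered` as an assumed variable name:

```python
from collections import OrderedDict

import pandas

remarks = pandas.DataFrame(
    [[23, "sravan", "pass", 1000], [21, "sravan", "fail", 400]],
    columns=["id", "name", "status", "fee"],
)

ordered = remarks.to_dict(into=OrderedDict)

# Both the outer mapping and each inner mapping use the requested class
print(isinstance(ordered, OrderedDict))        # True
print(isinstance(ordered["id"], OrderedDict))  # True

# Keys follow the DataFrame's column order
print(list(ordered))  # ['id', 'name', 'status', 'fee']
```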
Conclusion
We have discussed how to convert a DataFrame or other pandas object into a Python dictionary. We looked at the syntax of the to_dict() method and its parameters, and saw how different values of ‘orient’ and ‘into’ change the shape and type of the result. Every example in this tutorial used to_dict(), a built-in pandas method, to turn the same DataFrame into a Python dictionary or a list of dictionaries.