Let’s discuss how to convert JSON into a Pandas DataFrame with examples. In this guide, we use the pandas.read_json(), pandas.DataFrame.from_dict(), json.loads(), and pandas.json_normalize() functions to convert JSON to a Pandas DataFrame.
Using pandas.read_json()
The pandas.read_json() function converts a JSON string or JSON file into a Pandas object, which in this guide is a DataFrame. A more in-depth tutorial that explores its syntax and further examples is available separately. Here, we look at different examples of loading JSON into a Pandas DataFrame with different orientations.
The parameters are discussed under each example. The orient parameter must match the layout of the JSON input, so each of the following examples pairs one orient with the JSON string format that it expects.
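As a rough illustration of how each orient corresponds to a JSON layout, the following sketch serializes one small DataFrame with to_json() under the four orients covered in this guide. The frame and its values are made up for the illustration.

```python
import pandas

# Hypothetical two-row frame, used only to illustrate the layouts.
frame = pandas.DataFrame(
    {"Campaign_Name": ["Marketing", "Sales"], "Status": ["Completed", "Planned"]},
    index=["camp-1", "camp-2"],
)

# Serialize the same frame under each orient to see the JSON shape
# that read_json() expects for that orient.
layouts = {
    orient: frame.to_json(orient=orient)
    for orient in ("records", "values", "columns", "index")
}
for orient, text in layouts.items():
    print(orient, "->", text)
```

The “records” and “values” layouts are JSON arrays, while “columns” and “index” are JSON objects keyed by column label and row label respectively.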
Example 1: Records Orient
Consider the “campaign_json” JSON string and read this string into the DataFrame using the read_json() function by passing the orient parameter as “records”. Display the type of the result and the result after converting to the DataFrame.
from io import StringIO
import pandas
# Create JSON related to the Campaigns
campaign_json = '''
[
{ "Campaign_Name": "Technical sessions on python", "Type": "Webinar","Status":"Planned"},
{ "Campaign_Name": "Marketing", "Type": "Conference","Status":"Completed"}
] '''
# orient='records'; StringIO wraps the string because newer pandas versions deprecate passing a literal JSON string
DataFrame_from_json = pandas.read_json(StringIO(campaign_json), orient='records')
print(type(DataFrame_from_json),"\n")
print(DataFrame_from_json)
Output:
Example 2: Values Orient
Read the JSON string, which is a list of value lists without any keys, into the Pandas DataFrame. Set the orient parameter to “values”. After converting to the DataFrame, the columns are labeled 0, 1, and 2.
from io import StringIO
import pandas
campaign_json = '''
[
["Technical sessions on python","Webinar","Planned"],
["Marketing", "Conference","Completed"]
] '''
# orient='values'; StringIO wraps the string because newer pandas versions deprecate passing a literal JSON string
DataFrame_from_json = pandas.read_json(StringIO(campaign_json), orient='values')
print(type(DataFrame_from_json),"\n")
print(DataFrame_from_json)
Output:
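Because the “values” layout carries no column names, the labels usually have to be assigned afterwards. A minimal sketch, assuming the three column names used elsewhere in this guide apply to this data:

```python
from io import StringIO
import pandas

values_json = '''
[
["Technical sessions on python", "Webinar", "Planned"],
["Marketing", "Conference", "Completed"]
]'''

named_frame = pandas.read_json(StringIO(values_json), orient="values")
# Replace the default 0, 1, 2 labels with descriptive names (assumed here).
named_frame.columns = ["Campaign_Name", "Type", "Status"]
print(named_frame)
```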
Example 3: Columns Orient
Read the JSON string into the Pandas DataFrame with the orient parameter set to “columns”. Strictly speaking, the “columns” orient describes JSON shaped as {column: {index: value}}. The list of dictionaries used here is the “records” layout, which read_json() still accepts under this setting. After converting to the DataFrame, the columns are the keys of the dictionaries and the cells hold the dictionary values.
from io import StringIO
import pandas
# Create JSON related to the Campaigns
campaign_json = '''
[
{ "Campaign_Name": "Sales", "Type": "Public relations","Status":"Planned"},
{ "Campaign_Name": "Training Sessions", "Type": "Email","Status":"Aborted"}
] '''
# orient='columns'; StringIO wraps the string because newer pandas versions deprecate passing a literal JSON string
DataFrame_from_json = pandas.read_json(StringIO(campaign_json), orient='columns')
print(type(DataFrame_from_json),"\n")
print(DataFrame_from_json)
Output:
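For comparison, JSON that is laid out by column, in the form {column: {row label: value}}, could look like the following sketch. The row labels “camp-1” and “camp-2” are borrowed from the index example in this guide and are otherwise an assumption.

```python
from io import StringIO
import pandas

# Hypothetical column-oriented JSON: each column maps row labels to values.
column_json = '''
{
  "Campaign_Name": {"camp-1": "Sales", "camp-2": "Training Sessions"},
  "Type": {"camp-1": "Public relations", "camp-2": "Email"},
  "Status": {"camp-1": "Planned", "camp-2": "Aborted"}
}'''

column_frame = pandas.read_json(StringIO(column_json), orient="columns")
print(column_frame)
```

Here the outer keys become the column labels and the inner keys become the index.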
Example 4: Index Orient
Read the JSON string, which is in the form of a dictionary of dictionaries, into the Pandas DataFrame. Set the orient parameter to “index”. After converting to the DataFrame, the columns are the keys of the inner dictionaries and the cells hold their values. The keys of the outer dictionary become the index of the DataFrame.
from io import StringIO
import pandas
# Create JSON related to the Campaigns
campaign_json = '''
{
"camp-1": { "Campaign_Name": "Marketing", "Type": "Conference","Status":"Completed"},
"camp-2": { "Campaign_Name": "Training Sessions", "Type": "Email","Status":"Aborted"}
} '''
# orient='index'; StringIO wraps the string because newer pandas versions deprecate passing a literal JSON string
DataFrame_from_json = pandas.read_json(StringIO(campaign_json), orient='index')
print(type(DataFrame_from_json),"\n")
print(DataFrame_from_json)
Output:
Example 5: JSON from File
We convert the JSON that is stored in a file (camps.json) into a Pandas DataFrame by passing the file path to the read_json() function.
import pandas
# Create the DataFrame from the file - 'camps.json'
DataFrame_from_json = pandas.read_json('camps.json')
print(DataFrame_from_json)
Output:
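To try the file-based route without an existing file, the sketch below writes a small JSON file to a temporary directory and reads it back. The file name “camps_demo.json” and the records are hypothetical and do not reproduce the real contents of camps.json.

```python
import json
import tempfile
from pathlib import Path

import pandas

# Hypothetical records, written to disk only for this demonstration.
records = [
    {"Campaign_Name": "Marketing", "Type": "Conference", "Status": "Completed"},
    {"Campaign_Name": "Sales", "Type": "Webinar", "Status": "Planned"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "camps_demo.json"
    demo_path.write_text(json.dumps(records), encoding="utf-8")
    # read_json() accepts a path as well as a file-like object.
    frame_from_file = pandas.read_json(demo_path)

print(frame_from_file)
```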
Using pandas.DataFrame.from_dict()
The pandas.DataFrame.from_dict() function constructs a DataFrame from a dictionary whose values are array-like objects or nested dictionaries. We can use this function to convert JSON to a Pandas DataFrame.
The JSON input must be converted to a dictionary first.
The json.loads() function parses a JSON string into Python objects, so a JSON object becomes a dictionary. We then pass this dictionary to the pandas.DataFrame.from_dict() function.
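As a quick check of what json.loads() returns, the following sketch parses one JSON object and one JSON array. The strings are made-up samples.

```python
import json

# A JSON object is parsed into a Python dict.
as_dict = json.loads('{"camp-1": {"Status": "Completed"}}')
# A JSON array is parsed into a Python list.
as_list = json.loads('[{"Status": "Completed"}, {"Status": "Planned"}]')

print(type(as_dict).__name__, type(as_list).__name__)
```

Only the dictionary result suits from_dict(). A list of records can be passed to the DataFrame constructor or to json_normalize() instead.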
Syntax:
The following is the syntax of the pandas.DataFrame.from_dict() function and its parameters.
pandas.DataFrame.from_dict(data, orient='columns', dtype=None, columns=None)
- The data is a dictionary whose values can be array-like objects or nested dictionaries.
- The orient parameter sets the data orientation. The supported values are “index”, “columns” (the default), and “tight”.
- We can create the Pandas DataFrame with specific columns by passing a list of column labels to the “columns” parameter. All columns are included by default. This parameter can be used only if the orient parameter is set to “index”; combining it with the “columns” orient raises a ValueError.
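The effect of the orient parameter can be sketched with one small nested dictionary. The data below is hypothetical and only meant to show how the outer keys are interpreted.

```python
import pandas

# Hypothetical nested dictionary: outer keys are campaign ids.
data = {
    "camp-1": {"Type": "Conference", "Status": "Completed"},
    "camp-2": {"Type": "Email", "Status": "Aborted"},
}

# Default orient="columns": the outer keys become the column labels.
by_columns = pandas.DataFrame.from_dict(data)
# orient="index": the outer keys become the row labels.
by_index = pandas.DataFrame.from_dict(data, orient="index")

# The columns parameter is rejected unless orient="index".
columns_rejected = False
try:
    pandas.DataFrame.from_dict(data, orient="columns", columns=["Type"])
except ValueError as err:
    columns_rejected = True
    print("columns rejected:", err)

print(by_columns)
print(by_index)
```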
Example 1:
Consider the “campaign_json” JSON string in the form of a nested dictionary.
- Convert the campaign_json string into a dictionary using the json.loads() function.
- Pass the dictionary to the pandas.DataFrame.from_dict() function with the orient as “index”.
import json
import pandas
# Create JSON related to the Campaigns
campaign_json = '''
{
"camp-1": { "Campaign_Name": "Marketing", "Type": "Conference","Status":"Completed"},
"camp-2": { "Campaign_Name": "Training Sessions", "Type": "Email","Status":"Aborted"}
}'''
# Load the campaign_json
loaded = json.loads(campaign_json)
# Use pandas.DataFrame.from_dict()
DataFrame_from_json = pandas.DataFrame.from_dict(loaded,orient='index')
print(type(DataFrame_from_json),"\n")
print(DataFrame_from_json)
Output:
The DataFrame is created from JSON (after converting JSON to a dictionary) with all the columns.
Example 2:
Use the same JSON and create the DataFrame with only one column – “Campaign_Name”.
import json
import pandas
# Create JSON related to the Campaigns
campaign_json = '''
{
"camp-1": { "Campaign_Name": "Marketing", "Type": "Conference","Status":"Completed"},
"camp-2": { "Campaign_Name": "Training Sessions", "Type": "Email","Status":"Aborted"}
}'''
# Load the campaign_json
loaded = json.loads(campaign_json)
# Create pandas DataFrame by including only the Campaign_Name column
DataFrame_from_json = pandas.DataFrame.from_dict(loaded,orient='index',columns=["Campaign_Name"])
print(type(DataFrame_from_json),"\n")
print(DataFrame_from_json)
Output:
The DataFrame is created from JSON (after converting JSON to a dictionary) with only one column – “Campaign_Name”.
Using pandas.json_normalize()
The pandas.json_normalize() function normalizes semi-structured JSON data into a flat table. It accepts a dictionary or a list of dictionaries as its input. We can use this function to convert JSON to a Pandas DataFrame, but the JSON input must be parsed into Python objects first.
The json.loads() function does this parsing. We then pass the result to the pandas.json_normalize() function.
Example:
Consider the “campaign_json” JSON string, which holds a list of records. Parse it with the json.loads() function and pass the result to the json_normalize() function. Display the type of the result and the result after converting to the DataFrame.
import json
import pandas
# Create JSON related to the Campaigns
campaign_json = '''
[
{ "Campaign_Name": "Sales", "Type": "Public relations","Status":"Planned"},
{ "Campaign_Name": "Training Sessions", "Type": "Email","Status":"Aborted"}
]'''
# Using json_normalize()
DataFrame_from_json = pandas.json_normalize(json.loads(campaign_json))
print(type(DataFrame_from_json),"\n")
print(DataFrame_from_json)
Output:
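The flattening becomes visible once the records contain nested objects. In the sketch below, the “Owner” field and its values are hypothetical additions to the campaign data, used only to show how nested keys turn into dotted column names.

```python
import json
import pandas

# Hypothetical records with a nested "Owner" object.
nested_json = '''
[
{"Campaign_Name": "Sales", "Status": "Planned", "Owner": {"Name": "A", "Team": "EMEA"}},
{"Campaign_Name": "Training Sessions", "Status": "Aborted", "Owner": {"Name": "B", "Team": "APAC"}}
]'''

# Nested keys are flattened into columns joined with the default "." separator.
flat_frame = pandas.json_normalize(json.loads(nested_json))
print(flat_frame.columns.tolist())
print(flat_frame)
```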
Conclusion
There are several ways to convert JSON to a Pandas DataFrame. In this guide, we first converted a JSON string or JSON from a file using the pandas.read_json() function with different orients. Then, we converted JSON to a Pandas DataFrame using the pandas.DataFrame.from_dict() and pandas.json_normalize() functions. For these two functions, the JSON must first be parsed into Python objects with json.loads().