Python Pandas

Pandas Json Normalize

JSON or JavaScript Object Notation is a widely utilized format for storing and exchanging/sharing data. It is very difficult to work/deal with JSON data as it is often nested and complex. In order to make it a simple, flattened, or more manageable format, the “pandas.json_normalize()” method of the “Pandas” module is used in Python.

This blog presents a thorough tutorial on the “pandas.json_normalzie()” method using numerous examples and via the following content:

What is the “pandas.json_normalize()” Method in Python?

The “pandas.json_normalzie()” method is used to normalize semi-structured JSON (JavaScript object notation) data into a flat table. The syntax of “pandas.json_normalize()” method is shown below:

pandas.json_normalize(data, max_level=None, record_path=None, record_prefix=None, meta=None, errors='raise', meta_prefix=None, sep='.')

In the above syntax:

  • The “data” parameter specifies the data that can be a dictionary, a nested dictionary, or a list of dictionaries.
  • The “max_level” parameter is used to indicate the maximum level or depth of the dictionary to normalize. By default, it normalizes all the levels.
  • The “record_path” parameter is used to indicate the path to the record of the JavaScript Object Notation, which should be flattened.
  • The other parameters are optional and can perform certain operations when passed to the function.

Example 1: Normalizing JSON Data Using “pandas.json_normalize()” Method

The below code is used to normalize the nested dictionary JSON data:

import pandas

input_json = [{"name": "Joseph", "id_no": "1234"},{"name": "Lily", "id_no": "1453"},{"name": "Anna", "id_no": "1932"},

{"name": "Henry", "id_no": "1567"},{"name": "David", "id_no": "1354"}]

print(input_json, '\n')

print(pandas.json_normalize(input_json))

In the above code:

  • The “pandas” module is imported, the JSON data is created, and stored in the variable “input_json”.
  • The “pandas.json_normalize()” method takes the JSON data as a parameter and normalizes it.

Output

The JSON data has been normalized to the maximum level.

Example 2: Normalizing JSON Data Using “pandas.json_normalize()” Method With “max_level” Parameter

The following code is used to normalize the JSON data based on the level of the normalization:

import pandas

input_json = [

{

"id_no": 18012,

"name": "Joseph",

"grades": {"english": 22, "sports": 30},

},

{"name": "Henry", "grades": {"english": 38, "sports": 60}},

{

"id_no": 18043,

"name": "Anna",

"grades": {"english": 31, "sports": 190},

},

]

print(input_json, '\n')

print(pandas.json_normalize(input_json, max_level=0), '\n')

print(pandas.json_normalize(input_json, max_level=1))

In the above code:

  • The “pandas.json_normalize()” method takes the JSON data and the “max_level=0” parameter value as an argument to normalize the JSON data at level “0”.
  • The “pandas.json_normalize()” method is used again to normalize the JSON data at level “1” by assigning the parameter value “max_level=1”.

Output

The JSON data has been normalized according to the specified levels, such as “0” and “1”.

Conclusion

In Python, the “pandas.json_normalize()” method of the “pandas” module is utilized to normalize semi-structured JSON (JavaScript Object Notation) data into a flat table. This method is used with various parameters such as “max_level” to normalize the data based on the passed level value. This write-up has delivered a detailed guide on Panda’s JSON normalize method using numerous examples.

About the author

Haroon Javed

Hi, I'm Haroon. I am an electronics engineer and a technical content writer. I am a tech geek who loves to help people to the best of my knowledge.