
Pandas DataFrame from Dict

A Pandas DataFrame can be generated from a Python dict (dictionary), which is a collection of key-value pairs. In practice, we usually produce a Pandas DataFrame by importing a CSV file or another resource, but it is possible to generate it from a dict object as well.

Pandas is frequently used in data science, data processing, and machine learning tasks. It is built on top of NumPy, another prominent Python library that supports scientific computation. Pandas DataFrames are extremely useful for working with 2D (two-dimensional) data. A Pandas DataFrame may be constructed in a variety of ways, one of which is building it from the data in a given dictionary.

Example 1: Utilizing the Default Constructor pd.DataFrame() to Generate a DataFrame from a Dictionary

This approach starts by generating a Python dictionary of lists, which we then pass to the pd.DataFrame() constructor. The constructor returns a Pandas DataFrame object containing the contents of the dictionary of lists.

Now, let us put it into practice with a Python script.

In the Python file, we first import the necessary library, which is Pandas in this illustration.

dic 1 in.jpg

We then create a dictionary “data” and initialize it with three keys, ‘Name’, ‘Age’, and ‘Institute’, each of which maps to a list. We assign four values to each list. We invoke the print() method to display the dictionary on the terminal.
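The step described above could look roughly like the following sketch. The values in the lists are hypothetical placeholders, since the original values are not given in the text; only the key names and the four-values-per-list shape come from the description.

```python
import pandas as pd  # imported first, as in the walkthrough; used in the later steps

# Hypothetical sample values (assumption); only the keys come from the article.
data = {
    "Name": ["Alice", "Bob", "Carol", "Dave"],
    "Age": [21, 22, 20, 23],
    "Institute": ["Institute A", "Institute B", "Institute C", "Institute D"],
}

# Display the plain Python dictionary.
print(data)
```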

The output we get by running the previous code is attached here:

dic 1.jpg

list.jpg
We now convert this Python dictionary to a Pandas DataFrame.

To generate the DataFrame from a dictionary, we use the simplest method: the Pandas default constructor. A DataFrame object with the name “output” is created and is assigned the result of invoking the pd.DataFrame() function. The dict “data” that we previously created is passed as an argument to the pd.DataFrame() function. The print() statement with the object “output” as an argument displays the DataFrame created from the specified dictionary.

This yields the following Pandas DataFrame:
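A minimal sketch of this conversion is shown here. The dictionary values are the same hypothetical placeholders (an assumption, not the article's original data); the names “data” and “output” follow the description.

```python
import pandas as pd

# Hypothetical sample values (assumption); only the keys come from the article.
data = {
    "Name": ["Alice", "Bob", "Carol", "Dave"],
    "Age": [21, 22, 20, 23],
    "Institute": ["Institute A", "Institute B", "Institute C", "Institute D"],
}

# Each dictionary key becomes a column; each list supplies that column's values.
output = pd.DataFrame(data)

print(output)
```

Because no index is supplied, the rows receive the default integer labels starting at 0.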

list out.jpg

Example 2: Utilizing the User-Defined Indexes to Generate a DataFrame from a Dictionary

We can also construct a Pandas DataFrame from a dictionary with user-defined indexes. This technique starts by generating a Python dictionary, which is then passed together with an index list to the pd.DataFrame() constructor. The constructor returns a Pandas DataFrame object containing the dictionary’s contents, with rows labeled by the entries of the provided index list.

Let us see how to do this in Python code.

index.jpg

To create a DataFrame from a dict with user-defined indices, we first need a dictionary of lists. Since we generated a dictionary in the previous example, we use the same dictionary in this instance as well.

To construct the DataFrame from the dict, we employ the same Pandas DataFrame constructor with one addition: we want the DataFrame to be displayed with the index labels that we assign instead of the default ones. We pass the “index” parameter inside the parentheses after the name of the dict, separated by a comma. We assign the labels to the “index” parameter using “=” and put them inside square brackets as a list. The list must contain one label per row. Lastly, we employ the print() method to display the outcome of the program.

The following image shows the output DataFrame with the user-defined index labels “R, X, Y, Z” instead of the default zero-based integer index.
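As a rough sketch of this step, the call might look as follows. The index labels R, X, Y, and Z are the ones mentioned in the article; the dictionary values remain hypothetical placeholders.

```python
import pandas as pd

# Hypothetical sample values (assumption); only the keys come from the article.
data = {
    "Name": ["Alice", "Bob", "Carol", "Dave"],
    "Age": [21, 22, 20, 23],
    "Institute": ["Institute A", "Institute B", "Institute C", "Institute D"],
}

# The index list must have one label per row (four rows here).
output = pd.DataFrame(data, index=["R", "X", "Y", "Z"])

print(output)
```

Rows can then be selected by label, for example with output.loc["X"].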

index our.jpg

Example 3: Generate a DataFrame from a Dictionary with the Needed Columns

In the third example, we keep only certain columns when generating a DataFrame from a dictionary. The columns parameter makes this task simple. It accepts a list whose items are the names of the columns to keep, and the constructor returns a DataFrame with the selected columns only.

Let’s see how to write a Python script that generates a DataFrame using selected dictionary columns.

roll.jpg

In this example, we add one new key, “Roll”, to the previously created dict “data” and assign it a list with the same number of values as the other three keys. The dict now has four keys, so the print() method displays a dictionary with four entries this time.
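One way to add the new key is sketched here. The “Roll” values, like the other values, are hypothetical placeholders; the article only states that the new list has the same length as the others.

```python
# Hypothetical sample values (assumption); only the keys come from the article.
data = {
    "Name": ["Alice", "Bob", "Carol", "Dave"],
    "Age": [21, 22, 20, 23],
    "Institute": ["Institute A", "Institute B", "Institute C", "Institute D"],
}

# Add a fourth key whose list has the same length as the existing lists.
data["Roll"] = [101, 102, 103, 104]

print(data)
```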

This is the output:

roll out.jpg

Now, we explore how we can construct a DataFrame out of a dictionary with some specified columns.

col.jpg

The pd.DataFrame() constructor provides the “columns” parameter to specify the names of the columns that we want in the DataFrame. Inside the parentheses of the pd.DataFrame() call, we pass the “columns” argument and assign it a list of column names in square brackets. Here, we choose two columns, “Name” and “Institute”. The print() statement displays on the terminal a DataFrame that holds only these two columns.
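The column selection could be sketched like this, again with hypothetical placeholder values in the four-key dictionary.

```python
import pandas as pd

# Hypothetical sample values (assumption); only the keys come from the article.
data = {
    "Name": ["Alice", "Bob", "Carol", "Dave"],
    "Age": [21, 22, 20, 23],
    "Institute": ["Institute A", "Institute B", "Institute C", "Institute D"],
    "Roll": [101, 102, 103, 104],
}

# Only the keys listed in `columns` are kept; "Age" and "Roll" are left out.
output = pd.DataFrame(data, columns=["Name", "Institute"])

print(output)
```

The order of the names in the list also determines the column order in the resulting DataFrame.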

We got our DataFrame with two selected columns.

col out.jpg

Example 4: Generate a DataFrame from a Dictionary with a Changed Orientation by Utilizing the from_dict() Function

As in the previous examples, we first generate a Python dictionary of lists and then pass it to the DataFrame.from_dict() method, which returns a Pandas DataFrame object containing the contents of the dictionary of lists. The from_dict() function offers a choice of orientation when constructing a DataFrame from a dictionary. By default, the dict’s keys become the columns. When the orientation is “index”, the dictionary keys become the row labels instead.

Here, we first see the default settings and then change the orientation to “index”.

ori def.jpg

We utilize the dict “data”. To generate a DataFrame from the dictionary, we use the from_dict() method this time. By default, from_dict() uses the dictionary’s keys as column names and the dictionary’s values as the DataFrame values. So, we run this method with the default setting and pass it the name of the dict “data”. We display the output through the print() function.
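A minimal sketch of the default call follows. It assumes the three-key version of “data” with hypothetical placeholder values; the text does not say which version of the dictionary is used here.

```python
import pandas as pd

# Hypothetical sample values (assumption); only the keys come from the article.
data = {
    "Name": ["Alice", "Bob", "Carol", "Dave"],
    "Age": [21, 22, 20, 23],
    "Institute": ["Institute A", "Institute B", "Institute C", "Institute D"],
}

# Default orientation is "columns": keys become column names.
output = pd.DataFrame.from_dict(data)

print(output)
```

With the default orientation, the result matches what pd.DataFrame(data) produces for the same dictionary.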

The following is the outcome generated from the previous Python script:

or def out.jpg

Now, to change the orientation of the DataFrame generated from the provided dictionary of lists, we alter the default settings of the from_dict() function.

ori index.jpg

If you prefer to use the dict keys as rows, supply the orient=’index’ argument. The DataFrame is then constructed with each dict key as a row label and the values of that key’s list spread across the columns. We thus add the “orient” parameter and assign it the “index” value. This means that the dictionary keys move from the column headers to the row index.
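The changed orientation can be sketched as follows, under the same assumptions as before (three-key dictionary, hypothetical placeholder values).

```python
import pandas as pd

# Hypothetical sample values (assumption); only the keys come from the article.
data = {
    "Name": ["Alice", "Bob", "Carol", "Dave"],
    "Age": [21, 22, 20, 23],
    "Institute": ["Institute A", "Institute B", "Institute C", "Institute D"],
}

# orient="index": each key becomes a row label, and the list items
# are spread across integer-labeled columns 0, 1, 2, 3.
output = pd.DataFrame.from_dict(data, orient="index")

print(output)
```

Here the result has three rows (one per key) and four columns (one per list position), which is the transposed layout of the default orientation.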

This is the output of this program:

ori index out.jpg

Conclusion

This guide explains how to generate a DataFrame from a dictionary of lists. We covered four ways of doing so: the default pd.DataFrame() constructor, user-defined indexes, selected columns, and the from_dict() function with a changed orientation. The practical examples show how to employ both the Pandas DataFrame constructor and the Pandas from_dict() function step by step.

About the author

Aqsa Yasin

I am a self-motivated information technology professional with a passion for writing. I am a technical writer and love to write for all Linux flavors and Windows.