Pandas Explode Multiple Columns

Whether a dataset is small or large, you often need to clean, reshape, or transform it before analysis. Python is a high-level, user-friendly language that makes this work easier, and the Pandas library provides ready-made functions for it. In this article, we look at the Pandas explode() function, which turns list-like column values into separate rows, and work through examples that explode one column and then multiple columns.

What Is explode() in Pandas?

explode() is a method of the Pandas DataFrame (and Series) that turns each element of a list-like value into its own row. The other columns are copied into every new row, and the original index label is repeated for each element. The result is returned as a new DataFrame.

What Is the Syntax of the explode() Method?

The syntax of the explode() method is given below:

dataframe.explode(column, ignore_index=False)

The explode() method takes two parameters: “column” and “ignore_index”. The first parameter names the column to explode. It is required. It can be a single column label or, from Pandas 1.3.0 onward, a list of labels to explode several columns at once.

On the other hand, “ignore_index” is an optional keyword argument that defaults to False. When it is False, each new row keeps the index label of the row it came from, so labels repeat. When it is True, the original index is discarded and the result is labeled 0, 1, ..., n - 1.
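As a small illustration of the index behavior, the sketch below explodes a made-up column with and without the keyword, which Pandas spells ignore_index. The DataFrame and its column names ("id" and "tags") are assumptions for this example and are not part of the article's data.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"id": [1, 2], "tags": [["a", "b"], ["c"]]})

# Default (ignore_index=False): the original index labels are repeated.
kept = df.explode("tags")
print(kept.index.tolist())   # [0, 0, 1]

# ignore_index=True: the result gets a fresh 0..n-1 index.
reset = df.explode("tags", ignore_index=True)
print(reset.index.tolist())  # [0, 1, 2]
```

Repeated labels are handy when you want to trace each new row back to its source row, while a fresh index is usually easier to work with afterward.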

The explode() method returns a new DataFrame in which every element of the exploded column or columns occupies its own row. Whether the original index labels are kept or replaced depends on “ignore_index”. The original DataFrame is not modified, so assign the result to a variable if you want to keep it.
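The following minimal sketch shows that explode() leaves the original DataFrame untouched. The data and the names "id", "tags", and "exploded" are hypothetical and chosen only for this illustration.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"id": [1, 2], "tags": [["a", "b"], ["c"]]})

# explode() returns a new DataFrame; df itself is not changed.
exploded = df.explode("tags")

print(len(df))        # 2: the original still has two rows
print(len(exploded))  # 3: one row per list element
```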

What Are the Errors or Exceptions Raised by the explode() Method?

The explode() method raises a ValueError in the following cases:

  • The list of columns to explode is empty.
  • The column labels of the DataFrame are not unique.
  • Several columns are exploded together, and their element counts do not match in every row.
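The three cases can be reproduced with a short sketch like the one below. The helper explode_error and the sample data are hypothetical, and the sketch assumes Pandas 1.3.0 or later, where a list of columns is accepted. The exact wording of the messages may vary between Pandas versions, so the code only prints them.

```python
import pandas as pd

def explode_error(frame, column):
    """Return the ValueError message raised by explode(), or None."""
    try:
        frame.explode(column)
    except ValueError as exc:
        return str(exc)
    return None

# Hypothetical data: the element counts of A and B differ in each row.
df = pd.DataFrame({"A": [[1, 2], [3]], "B": [["x"], ["y", "z"]]})

# Case 1: an empty list of columns to explode.
empty_msg = explode_error(df, [])
print(empty_msg)

# Case 2: duplicate column labels in the DataFrame.
dup = df.copy()
dup.columns = ["A", "A"]
dup_msg = explode_error(dup, "A")
print(dup_msg)

# Case 3: row-wise element counts differ between the exploded columns.
count_msg = explode_error(df, ["A", "B"])
print(count_msg)
```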

The explode() method works on list-like values, which include lists, tuples, sets, Series, and NumPy arrays. The exploded rows have the object dtype. Scalar values are returned unchanged, and an empty list-like produces np.nan for that row. Furthermore, the row order of the result is non-deterministic when sets are exploded. Now, let us look at some simple examples of the explode() method.
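To make these rules concrete, here is a minimal sketch with one cell of each kind: a list, a tuple, a string scalar, an empty list, and np.nan. The column name "v" and the values are assumptions for illustration. Sets are left out because their element order is not guaranteed.

```python
import numpy as np
import pandas as pd

# Hypothetical column holding a list, a tuple, a scalar, an empty list, and NaN.
df = pd.DataFrame({"v": [[1, 2], (3, 4), "text", [], np.nan]})

out = df.explode("v")

# The list and tuple are expanded; the scalar string is unchanged;
# the empty list and the NaN scalar each give one NaN row.
print(out["v"].tolist())    # [1, 2, 3, 4, 'text', nan, nan]
print(out["v"].dtype)       # object
print(out.index.tolist())   # [0, 0, 1, 1, 2, 3, 4]
```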

Example 1

In this example, we go through the steps one by one. First, we import the Pandas and NumPy libraries and create a DataFrame, as shown in the code below.

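import pandas as pd
import numpy as np

# X and Z mix lists, scalars, and empty lists; Y is 9 in every row
df = pd.DataFrame({'X': [[15, 2, 89], 'function', [], [23, 8]],'Y': 9,'Z': [['ant', [], 'c'], np.nan, [], ['d', np.nan, 'e']]})

# Display the DataFrame before exploding anything
df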

The DataFrame created above has four rows and three columns. The “X” and “Z” columns contain a mix of lists, scalar values, and empty lists, while “Y” holds the value 9 in every row.

Now, let us apply the explode() function to see how it transforms a column in the DataFrame. In this example, we explode only one column.

df.explode('Z')

Here is the complete code:

import numpy as np
import pandas as pd

df = pd.DataFrame({'X': [[15, 2, 89], 'function', [], [23, 8]],'Y': 9,'Z': [['ant', [], 'c'], np.nan, [], ['d', np.nan, 'e']]})

# Explode only the "Z" column
df.explode('Z')

The output has eight rows. Each element of the lists in the “Z” column is now in its own row, while the values of the “X” and “Y” columns are repeated for those rows. The scalar np.nan in “Z” is kept as is, and the empty list becomes np.nan. Moreover, the index labels are repeated (0, 0, 0, 1, 2, 3, 3, 3) rather than renumbered because we left the “ignore_index” parameter at its default value of False.

Example 2

In the previous example, we only exploded one column. Here, we explode two columns simultaneously to see how the explode() function reacts. The DataFrame is the same as in the previous example; only the argument passed to explode() changes:

df.explode(list('XZ'))

The expression list('XZ') evaluates to ['X', 'Z'], so explode() is asked to transform both the “X” and “Z” columns. Instead of a DataFrame, the call raises this error:

ValueError: columns must have matching element counts

This is the element count mismatch mentioned earlier. The last cell of the “X” column has only two elements, [23, 8], while the last cell of the “Z” column has three elements, ['d', np.nan, 'e']. Because the counts differ in that row, explode() cannot pair the elements up and raises the error.
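On a larger DataFrame, the offending row is harder to spot by eye. The sketch below counts the elements per cell and lists the rows where the two columns disagree. The helper element_count is hypothetical; it assumes that scalars and empty list-likes each count as one row, which matches the behavior described earlier in this article.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_list_like

def element_count(value):
    """Rows a cell expands to: scalars and empty list-likes count as one."""
    if is_list_like(value) and len(value) > 0:
        return len(value)
    return 1

df = pd.DataFrame({
    "X": [[15, 2, 89], "function", [], [23, 8]],
    "Y": 9,
    "Z": [["ant", [], "c"], np.nan, [], ["d", np.nan, "e"]],
})

# Per-row element counts for the two columns we want to explode.
counts = pd.DataFrame({col: df[col].map(element_count) for col in ["X", "Z"]})

# Rows where the counts differ are the ones that trigger the ValueError.
mismatched = counts[counts["X"] != counts["Z"]]
print(mismatched.index.tolist())  # [3]
```

Here only row 3 is reported, with two elements in “X” and three in “Z”.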

Example 3

In this example, let us resolve the error we encountered in the previous example. All we have to do is make the element counts of the two columns match in every row. We can either add an element to the last list in the “X” column or remove one from the last list in the “Z” column. Here, we remove np.nan from the “Z” column. See the code below:

import pandas as pd

import numpy as np

df = pd.DataFrame({'X': [[15, 2, 89], 'function', [], [23, 8]], 'Y': 9, 'Z': [['ant', [], 'c'], np.nan, [], ['d', 'e']]})

df

df.explode(list('XZ'))

The error is resolved, and we have two columns exploding simultaneously.
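As a quick check of this result, the sketch below rebuilds the corrected DataFrame and inspects the shape and index of the exploded output. It passes the columns as ["X", "Z"], which is the same list that list('XZ') produces. It assumes Pandas 1.3.0 or later; the variable name "result" is chosen only for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "X": [[15, 2, 89], "function", [], [23, 8]],
    "Y": 9,
    "Z": [["ant", [], "c"], np.nan, [], ["d", "e"]],
})

# list('XZ') and ["X", "Z"] are the same list of column labels.
result = df.explode(["X", "Z"])

print(result.shape)            # (7, 3)
print(result.index.tolist())   # [0, 0, 0, 1, 2, 3, 3]
print(result["X"].tolist()[:3], result["Z"].tolist()[:3])
```

The elements are paired by position within each row, so 15 lines up with 'ant', 2 with the nested empty list, and 89 with 'c'.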

Conclusion

We learned about the Pandas explode() method in this post. We covered its syntax, parameters, return value, and the errors it can raise. We then used examples to explode a single column, to explode multiple columns at once, and to resolve the element count error. Check out other Linux Hint articles for more tips and tutorials.

About the author

Kalsoom Bibi

Hello, I am a freelance writer, and I usually write about Linux and other technology-related topics.