
Pandas bfill

“In pandas, the DataFrame.bfill() function fills a dataset’s missing values backward: each NaN value in the dataframe is replaced with the next valid value that follows it. A pandas series is a one-dimensional array with axis labels. The labels must be of a hashable type, but they do not have to be unique. The object supports both label-based and integer-based indexing and offers a wide range of methods for operations involving the index. The Series.bfill() function applies the same backward fill technique to populate the NaN values in a series object.”

Syntax of the bfill() Method

The syntax for the bfill() function for the dataframe is the following.

Syntax:
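As a minimal sketch, the call takes the form shown here. The dataframe df and its values are made up for illustration; the keyword arguments are the ones described under Parameters, passed with their default values.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column dataframe, used only to show the call form.
df = pd.DataFrame({"a": [np.nan, 2.0], "b": [3.0, np.nan]})

# General form: DataFrame.bfill(axis=..., inplace=..., limit=...)
filled = df.bfill(axis=0, inplace=False, limit=None)
print(filled)
```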


The syntax for using the bfill() function on a series is the same as for a dataframe. The only difference is that bfill() is called on the series object instead.

Syntax:
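A comparable sketch for a series, again with a made-up object named s and the keyword arguments left at their default values:

```python
import numpy as np
import pandas as pd

# Hypothetical series, used only to show the call form.
s = pd.Series([np.nan, 2.0, np.nan])

# General form: Series.bfill(inplace=..., limit=...)
filled = s.bfill(inplace=False, limit=None)
print(filled)
```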

Parameters

axis: {0 or 'index'} for a series; a dataframe also accepts 1 or 'columns'

The axis along which the missing values are filled. A series has only one axis, so 1 and 'columns' are not supported for it.

inplace: boolean, False by default

If True, make the alterations to the same object instead of returning a new one.

limit: int, None by default

The maximum number of consecutive NaN values to backward-fill. In other words, if a gap contains more than this many consecutive NaNs, it is filled only partially. Must be greater than 0 if not None.

Returns: a DataFrame or Series with the missing values filled.

How to Use the pandas bfill() Method?

We will show how to utilize the bfill() function in pandas dataframes and series in the examples that follow.

Example # 1: Filling up the Dataframe’s Missing Values Using the bfill() function

As we already know, the DataFrame.bfill() method is used to fill the NaN values in a dataframe. It fills them backward, that is, each NaN takes the next valid value that follows it.

First, we will create the dataframe after importing the pandas module. To create the dataframe in pandas, we will use the pd.DataFrame() function. The following parameters will be passed to the pd.DataFrame() function to create the required dataframe.
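The original values are not recoverable, so the following is a sketch with made-up numbers. Only the column names X, Y, and Z and the presence of missing values in every column follow the text.

```python
import numpy as np
import pandas as pd

# Illustrative values only; each column contains at least one NaN.
df = pd.DataFrame(
    {
        "X": [1.0, np.nan, 3.0, np.nan, np.nan],
        "Y": [np.nan, 5.0, np.nan, 7.0, 8.0],
        "Z": [9.0, np.nan, 11.0, np.nan, 13.0],
    }
)
```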


As can be seen, we have created 3 columns: X, Y, and Z. Missing values are present in each column of our df dataframe. To visualize the dataframe, we will pass it to the print() function as an argument.


Now we will apply the bfill() method to populate the NaN cells in our dataframe. When axis = “rows”, each NaN cell is populated with the next valid value below it in the same column. If the following row also holds a NaN, the function keeps looking further down; a cell stays NaN only when no valid value exists below it.
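A sketch of the row-wise fill on made-up data (the values are assumptions; only the column names follow the text). Here axis=0 is the numeric spelling of the row-wise direction.

```python
import numpy as np
import pandas as pd

# Illustrative values only.
df = pd.DataFrame(
    {
        "X": [1.0, np.nan, 3.0, np.nan, np.nan],
        "Y": [np.nan, 5.0, np.nan, 7.0, 8.0],
        "Z": [9.0, np.nan, 11.0, np.nan, 13.0],
    }
)

# Fill each NaN with the next valid value below it in the same column.
filled_rows = df.bfill(axis=0)
print(filled_rows)
```

With this data, the last two cells of column X stay NaN because no valid value follows them.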

As you can see, in the 4th row, the 1st cell is still NaN. This is because the value in row 5 is also NaN and there is no valid value further down the column. The cell in the 5th row is NaN for the same reason: there is no row below it from which the bfill() function can populate the cell.

What if we use axis = “columns”? The bfill() function will fill each null cell with the next valid value to its right in the same row. As in the case of axis = “rows”, a cell stays NaN only when no valid value exists in any of the columns to its right.
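The column-wise case can be sketched the same way, again on made-up values:

```python
import numpy as np
import pandas as pd

# Illustrative values only.
df = pd.DataFrame(
    {
        "X": [1.0, np.nan, 3.0, np.nan, np.nan],
        "Y": [np.nan, 5.0, np.nan, 7.0, 8.0],
        "Z": [9.0, np.nan, 11.0, np.nan, 13.0],
    }
)

# Fill each NaN with the next valid value to its right in the same row.
filled_cols = df.bfill(axis="columns")
print(filled_cols)
```

In this sketch, the NaN cells in the last column Z stay NaN because there is no column to their right.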


In the above dataframe, after using the bfill() function, all the values having a corresponding value in the next column have been changed.

Example # 2: Filling up the Boolean Dataframe’s Missing Values Using the bfill() function

In this example, we will create a dataframe with Boolean data and NaN values to check how the bfill() function will work on a Boolean dataframe.
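A sketch of such a dataframe follows. The column names A and B and the values are made up. It also assumes pandas' nullable "boolean" dtype, in which a missing entry is displayed as <NA> rather than NaN; the original listing may have used plain NaN values instead.

```python
import pandas as pd

# Illustrative Boolean data with missing entries (nullable "boolean" dtype).
df_bool = pd.DataFrame(
    {
        "A": pd.array([True, None, False, None], dtype="boolean"),
        "B": pd.array([None, False, None, True], dtype="boolean"),
    }
)
print(df_bool)
```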


We have created the dataframe with Boolean values. Now, the bfill() function will be used to fill the Na values.
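Applied to a made-up Boolean dataframe (nullable "boolean" dtype assumed, column names hypothetical), the call might look like this:

```python
import pandas as pd

# Illustrative Boolean data with missing entries.
df_bool = pd.DataFrame(
    {
        "A": pd.array([True, None, False, None], dtype="boolean"),
        "B": pd.array([None, False, None, True], dtype="boolean"),
    }
)

# Backward fill down each column (the default direction).
filled_bool = df_bool.bfill()
print(filled_bool)
```

The last entry of column A has no valid value below it, so it remains missing.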


The NaN cells are filled wherever a valid value exists further down the same column. We can also specify axis = “columns”, as we did in example # 1, to fill each empty cell from the next column in the same row.

Example # 3: Filling up the Missing Values in Series Object Using the bfill() function

We have seen how the bfill() works in dataframes. Now, we will use the bfill() function in a series object with one or more null values. First, we will create a series of a person and specify the index name of each value in the series.
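As a sketch, the series below uses made-up names as values and the labels A to E as the index. Only the name s and the missing value at label D follow the text.

```python
import numpy as np
import pandas as pd

# Illustrative values only; the entry at label "D" is missing.
s = pd.Series(
    ["Ann", "Ben", "Cara", np.nan, "Eli"],
    index=["A", "B", "C", "D", "E"],
)
```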


We will use the print() function to demonstrate our series “s”.


As can be seen, there is one NaN cell at index D. To fill that cell, we will use the bfill() method on our series.
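A sketch of the call on a made-up labelled series (the values are assumptions):

```python
import numpy as np
import pandas as pd

# Illustrative values only; the entry at label "D" is missing.
s = pd.Series(
    ["Ann", "Ben", "Cara", np.nan, "Eli"],
    index=["A", "B", "C", "D", "E"],
)

# bfill() returns a new series; s itself is left unchanged.
filled = s.bfill()
print(filled)
```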


As you can see, the NaN cell is filled with the next value in the series. As a series has only a single axis, bfill() always fills a missing value with the next valid value after the missing cell.

Example # 4: Filling up the Missing Values in Series Object With Numeric Values Using the bfill() function

After importing the pandas module, we will now use the bfill() function on a series object that holds numeric values and one or more null values. First, we will create a numeric series and specify the index for each value in the series.
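The following sketch builds such a series with made-up numbers and a start date chosen for illustration. It uses the month-start alias "MS" as an assumption, because that alias is accepted across pandas versions, whereas the month-end alias is spelled "M" in older versions and "ME" in newer ones.

```python
import numpy as np
import pandas as pd

# Illustrative monthly index; start date and frequency alias are assumptions.
idx = pd.date_range("2022-01-01", periods=6, freq="MS")

# Illustrative numeric values with several missing entries.
s = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan, 6.0], index=idx)
```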


We have created the required series and specified the index of each value using the pd.date_range() function, which returns a DatetimeIndex with a fixed frequency. Here, freq = “M” requests a monthly (month-end) frequency; recent pandas versions spell this alias “ME”. Let’s visualize our series using the print() function.


As you can see, there are multiple Na values in our series. Now we will fill these null cells using the bfill() function.
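A sketch of the fill on a made-up date-indexed series (values, start date, and the "MS" frequency alias are assumptions):

```python
import numpy as np
import pandas as pd

# Illustrative date-indexed series with several missing entries.
idx = pd.date_range("2022-01-01", periods=6, freq="MS")
s = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan, 6.0], index=idx)

# Each NaN takes the next valid value that follows it.
filled = s.bfill()
print(filled)
```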


The backward fill technique is used in the bfill() function to fill the null cells by the next adjacent value in the series.

Example # 5: Specify the Limit of the bfill() function in the Series Object

In this example, we will create a series with runs of consecutive missing values to show you how the limit parameter works in the bfill() function.
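A sketch of such a series, with made-up numbers, containing one gap of two and one gap of three consecutive missing values:

```python
import numpy as np
import pandas as pd

# Illustrative values: a gap of two NaNs, then a gap of three NaNs.
s = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan, np.nan, np.nan, 8.0])
```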


Let’s use the bfill() function on the series with the limit parameter.
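The effect of the limit can be sketched on made-up data as follows:

```python
import numpy as np
import pandas as pd

# Illustrative values: a gap of two NaNs, then a gap of three NaNs.
s = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan, np.nan, np.nan, 8.0])

# Fill at most two consecutive NaNs per gap, counting back from the next valid value.
filled = s.bfill(limit=2)
print(filled)
```

In this sketch, the cell at position 4 stays NaN: it is the third missing value counted back from 8.0.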


The series in this example has gaps of two and three consecutive missing values. Because the limit is set to 2, the “Series.bfill()” method fills the two-value gap completely but only two cells of the three-value gap, namely the two closest to the next valid value; the third stays null. If we set the limit to 3, it will fill all three consecutive null cells in the series.

Let’s additionally specify the “inplace=True” argument for the series.bfill() function. When this option is True, the series.bfill() method populates the missing values in the existing series without generating a new object, and it returns None.
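A sketch of the in-place variant on made-up data. The return value is deliberately not used here, since the series itself is modified.

```python
import numpy as np
import pandas as pd

# Illustrative values: a gap of two NaNs, then a gap of three NaNs.
s = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan, np.nan, np.nan, 8.0])

# Modify s directly instead of creating a new series; no limit is set.
s.bfill(inplace=True)
print(s)
```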


As you can see, it has filled all the Na cells from our series.

Conclusion

We learned how to use the Python pandas “Series.bfill()” function in this tutorial. We studied the syntax and parameters of the bfill() function before using it on a series and dataframes consisting of NaN values to understand how the DataFrame.bfill() and Series.bfill() function backfills the null values that exist in the pandas dataframe and series respectively.

About the author

Aqsa Yasin

I am a self-motivated information technology professional with a passion for writing. I am a technical writer and love to write for all Linux flavors and Windows.