
Check Whether a Cell Value Is NaN in Pandas

The pandas documentation treats null values as missing values. Missing values in a dataframe are usually shown as NaN, and developers use both NaN and None to mark them. Conveniently, pandas treats NaN and None the same way: for a cell that holds either one, pandas.isnull returns True and pandas.notnull returns False.
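
As a quick illustration of that claim, the minimal sketch below passes NaN and None directly to the top-level pandas functions. The variable names are chosen only for this example.

```python
import numpy as np
import pandas as pd

# Both NaN and None count as missing for the top-level pandas checks.
nan_is_missing = pd.isnull(np.nan)
none_is_missing = pd.isnull(None)

# notnull is the logical inverse of isnull.
nan_is_present = pd.notnull(np.nan)
none_is_present = pd.notnull(None)

print(nan_is_missing, none_is_missing)   # True True
print(nan_is_present, none_is_present)   # False False
```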

So, in this article, we will explore different methods to check whether a particular cell value is null or not (NaN or None).

The different methods which we are going to discuss are:

  1. isnull
  2. isnan
  3. isna
  4. notnull

Let’s discuss each method in detail.

Method 1: using the isnull function

In this method, we will use the isnull() function to find out whether a particular cell holds a NaN value.

# python isnull.py

import pandas as pd
import numpy as np

data = {'x': [1,2,3,4,5,np.nan,6,7,np.nan,8,9,10,np.nan],
        'y': [11,12,np.nan,13,14,np.nan,15,16,np.nan,np.nan,17,np.nan,19]}
df = pd.DataFrame(data)

print(df)

nan_in_df = pd.isnull(df.iloc[5, 0])  # single cell: row position 5, column position 0
print(nan_in_df)

Output: python isnull.py

       x     y
0    1.0  11.0
1    2.0  12.0
2    3.0   NaN
3    4.0  13.0
4    5.0  14.0
5    NaN   NaN
6    6.0  15.0
7    7.0  16.0
8    NaN   NaN
9    8.0   NaN
10   9.0  17.0
11  10.0   NaN
12   NaN  19.0
True

Lines 3 to 4: we import the pandas and numpy libraries.

Line 6: we create a dictionary with the keys x and y, whose values include some np.nan entries.

Lines 8 to 10: we convert the dictionary to a dataframe and print it, as shown in the output above.

Lines 12 to 13: we use isnull to check whether the value in cell [5, 0] is null, and print the result. Here we check a single cell rather than the whole dataframe, so the result is a single True, as shown in the output above. In [5, 0], the first value, 5, is the row position, and the second value, 0, is the column position (column x).
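
For comparison, the sketch below shows the two ways isnull is usually applied: as a dataframe method that builds a Boolean mask for every cell, and as a top-level function applied to one extracted value. The small dataframe and the variable names are assumptions made for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1, np.nan, 3], "y": [np.nan, np.nan, 6]})

# DataFrame.isnull() takes no arguments and returns a Boolean mask
# with the same shape as the dataframe.
mask = df.isnull()

# Read one cell of the mask by row and column position.
cell_is_null = mask.iloc[1, 0]

# Equivalent: extract the cell first, then pass it to pd.isnull.
same_answer = pd.isnull(df.iloc[1, 0])

print(mask)
print(cell_is_null, same_answer)
```

The second form avoids building a mask for the whole dataframe when only one cell matters.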

Method 2: using the isnan() function

In the above example, we checked for a NaN value using isnull. Now we are going to use another function called isnan, which belongs to numpy rather than pandas. The program below uses it to check a single cell, and it reuses the pandas and numpy imports from Method 1.

# We can also check the cell NaN value in dataframe
data = {'x': [1,2,3,4,5,np.nan,6,7,np.nan,8,9,10,np.nan],
        'y': [11,12,np.nan,13,14,np.nan,15,16,np.nan,np.nan,17,np.nan,19]}
df = pd.DataFrame(data)
print(df)
value = df.at[5, 'x']  #nan
isNaN = np.isnan(value)
print("===============")
print("Is value at df[5, 'x'] NaN :", isNaN)

Output:

       x     y
0    1.0  11.0
1    2.0  12.0
2    3.0   NaN
3    4.0  13.0
4    5.0  14.0
5    NaN   NaN
6    6.0  15.0
7    7.0  16.0
8    NaN   NaN
9    8.0   NaN
10   9.0  17.0
11  10.0   NaN
12   NaN  19.0
===============
Is value at df[5, 'x'] NaN : True

Line 2: we create a dictionary with the keys x and y, whose values include some np.nan entries.

Lines 4 and 5: we convert the dictionary to a dataframe and print it, as shown in the output above.

Line 6: we retrieve a single cell using the row label and column name [5, 'x'] and assign it to the variable value. The first value, 5, is the row label, and 'x' is the column name.

Line 7: we check whether the value is NaN.

Line 9: we print the result, which shows True because the cell holds NaN.
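
One caveat worth knowing: np.isnan only accepts numeric input, so it raises a TypeError when the cell holds None or a string. The sketch below demonstrates this on an object-dtype column, which is an assumption added for this example, and falls back to pd.isna.

```python
import numpy as np
import pandas as pd

# An object-dtype column (forced here for illustration) keeps None as None.
df = pd.DataFrame({"name": ["a", None, "c"]}, dtype=object)
value = df.at[1, "name"]

# np.isnan is a numeric ufunc and rejects None.
try:
    np.isnan(value)
    isnan_failed = False
except TypeError:
    isnan_failed = True

# pd.isna accepts any scalar, including None.
is_missing = pd.isna(value)

print("np.isnan raised TypeError:", isnan_failed)
print("pd.isna result:", is_missing)
```

For columns that may contain non-numeric data, pd.isna is therefore the safer choice.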

Method 3: checking a NaN value in a series using isnan

In the previous example, we checked a dataframe cell for NaN. We can also check whether a value inside a pandas Series is NaN. Let's see how to do that.

# We can also check for a NaN value in a pandas Series

series_df = pd.Series([2,3,np.nan,7,25])

print(series_df)
value = series_df[2]  #nan
isNaN = np.isnan(value)
print("===============")
print("Is value at df[2] NaN :", isNaN)

Output:

0     2.0
1     3.0
2     NaN
3     7.0
4    25.0
dtype: float64
===============
Is value at df[2] NaN : True

Line 3: we create the pandas series.

Line 6: we assign the value we want to check to another variable.

Line 7: we check whether the value in that variable is NaN.
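
The example above relies on the default integer index. When a series has its own labels, it can be clearer to state whether the lookup is by position or by label. The sketch below assumes a series with string labels, which is not part of the original example.

```python
import numpy as np
import pandas as pd

# Hypothetical labels "a" to "e", used only for this illustration.
series = pd.Series([2, 3, np.nan, 7, 25], index=["a", "b", "c", "d", "e"])

# Explicit positional lookup.
by_position = np.isnan(series.iloc[2])

# Explicit label lookup.
by_label = np.isnan(series.loc["c"])

# Series.isna() gives a Boolean mask for every element at once.
mask = series.isna()

print(by_position, by_label)
print(mask.tolist())
```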

Method 4: using pandas.isna

We can also check whether a particular dataframe cell value is null using the pandas.isna function.

data = {'x': [1,2,3,4,5,np.nan,6,7,np.nan,8,9,10,np.nan],
        'y': [11,12,np.nan,13,14,np.nan,15,16,np.nan,np.nan,17,np.nan,19]}
df = pd.DataFrame(data)

print (df)

print("checking NaN value in cell [5, 0]")
print(pd.isna(df.iloc[5, 0]))

Output:

       x     y
0    1.0  11.0
1    2.0  12.0
2    3.0   NaN
3    4.0  13.0
4    5.0  14.0
5    NaN   NaN
6    6.0  15.0
7    7.0  16.0
8    NaN   NaN
9    8.0   NaN
10   9.0  17.0
11  10.0   NaN
12   NaN  19.0
checking NaN value in cell [5, 0]
True

Line 1: we create a dictionary with the keys x and y, whose values include some np.nan entries.

Lines 3 to 5: we convert the dictionary to a dataframe and print it, as shown in the output above.

Line 8: we check whether the value in cell [5, 0] is NaN. The first value, 5, is the row position, and 0 is the column position. The output True shows that the cell holds NaN.
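
pd.isna also recognizes the other missing-value markers that pandas uses, not only NaN. The sketch below lists a few scalars and what pd.isna reports for each. The dictionary of checks is just an illustration.

```python
import numpy as np
import pandas as pd

checks = {
    "np.nan": pd.isna(np.nan),    # float NaN
    "None": pd.isna(None),        # Python None
    "pd.NaT": pd.isna(pd.NaT),    # missing datetime
    "pd.NA": pd.isna(pd.NA),      # missing value for nullable dtypes
    "empty string": pd.isna(""),  # an empty string is NOT missing
    "zero": pd.isna(0),           # zero is NOT missing
}

for label, result in checks.items():
    print(f"{label}: {result}")
```

Note that empty strings and zeros are ordinary values, so they are reported as not missing.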

Method 5: using the pandas.notnull method

We can also check whether a particular cell value is NaN using the notnull function. It is the inverse of isnull: if the cell value is NaN or missing, it returns False, as shown in the program below.

data = {'x': [1,2,3,4,5,np.nan,6,7,np.nan,8,9,10,np.nan],
        'y': [11,12,np.nan,13,14,np.nan,15,16,np.nan,np.nan,17,np.nan,19]}
df = pd.DataFrame(data)

print (df)

print("checking NaN value in cell [5, 0]")
print(pd.notnull(df.iloc[5, 0]))

Output:

       x     y
0    1.0  11.0
1    2.0  12.0
2    3.0   NaN
3    4.0  13.0
4    5.0  14.0
5    NaN   NaN
6    6.0  15.0
7    7.0  16.0
8    NaN   NaN
9    8.0   NaN
10   9.0  17.0
11  10.0   NaN
12   NaN  19.0
checking NaN value in cell [5, 0]
False

Line 1: we create a dictionary with the keys x and y, whose values include some np.nan entries.

Lines 3 to 5: we convert the dictionary to a dataframe and print it, as shown in the output above.

Line 8: we check whether the value in cell [5, 0] is not NaN. The first value, 5, is the row position, and 0 is the column position. The output is False because we asked whether the cell is not null, and the cell is in fact null.
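
Because notnull returns True for usable values, it reads naturally in a condition. The sketch below uses a hypothetical helper, cell_or_default, to return a fallback when a cell is missing. The helper and its parameters are assumptions for this example, not part of pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1.0, np.nan]})

def cell_or_default(frame, row, col, default=0.0):
    """Hypothetical helper: return the cell value, or default if it is missing."""
    value = frame.iloc[row, col]
    return value if pd.notnull(value) else default

present = cell_or_default(df, 0, 0)   # cell holds 1.0
fallback = cell_or_default(df, 1, 0)  # cell holds NaN, so the default is used

print(present, fallback)
```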

Conclusion

In this blog, we have seen different methods to determine whether a particular cell value is NaN or None, because sometimes we need to check a single cell and not the whole dataframe. That is why this blog focuses on single cell values. We have seen both pandas and numpy functions for checking missing values. To keep the tutorials simple, we focused on the concept and did not use any loops. The functions discussed above are vectorized, so they also run quickly when applied to a whole dataframe or series.
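
For readers who do want the whole-dataframe view, the sketch below applies isna to the same data used throughout this article and summarizes the result without a loop. The variable names are chosen only for this example.

```python
import numpy as np
import pandas as pd

data = {'x': [1, 2, 3, 4, 5, np.nan, 6, 7, np.nan, 8, 9, 10, np.nan],
        'y': [11, 12, np.nan, 13, 14, np.nan, 15, 16, np.nan, np.nan, 17, np.nan, 19]}
df = pd.DataFrame(data)

# Count the missing cells in each column.
missing_per_column = df.isna().sum()

# Count the missing cells in the whole dataframe.
total_missing = int(missing_per_column.sum())

# Check whether the dataframe contains any missing cell at all.
has_missing = bool(df.isna().any().any())

print(missing_per_column)
print(total_missing, has_missing)
```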

The code for this blog is available at the Github link.

About the author

Shekhar Pandey