How to Iterate Over Rows in a DataFrame in Pandas

Iteration lets us traverse all the values in a collection. Once we create a DataFrame in Pandas, we often need to access its values row by row, and this is where iteration helps. In this article, we review different methods for iterating over the rows of a DataFrame.

pandas.DataFrame

A pandas DataFrame can be created using the following constructor (in recent pandas versions, the default for copy is None rather than False):

pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
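
As a minimal sketch of how these constructor arguments fit together, the example below builds a small DataFrame from a list of rows. The data, the row labels r1 and r2, and the name df_example are hypothetical and are not taken from the article's notebook.

```python
import pandas as pd

# Hypothetical data: a list of rows, plus explicit row labels and column names.
rows = [["Alice", 30], ["Bob", 25]]
df_example = pd.DataFrame(data=rows, index=["r1", "r2"], columns=["Name", "Age"])

print(df_example)
print(df_example.shape)  # (2, 2): two rows, two columns
```

Here data supplies the values, index supplies the row labels, and columns supplies the column names. When index is omitted, pandas assigns default row labels starting at 0.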

1. Method: Using Index Attribute of the DataFrame

We create a dictionary of data with four keys and then convert that dictionary to a DataFrame using the Pandas library.

In cell number [4], we print the DataFrame to see how it looks.

In cell number [5], we display the index attribute of the DataFrame. The output shows that the index stores the row labels as a range (a RangeIndex with a start, a stop, and a step).

In cell number [6], we use the fact that the index is a range from 0 to 4. The stop value is excluded, so the loop runs from 0 to 3. On each iteration, we select a column by name, such as df['Name'], and then print the value stored at the current index (row number) in that column.
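
The steps of this method can be sketched as follows. The original notebook cells are not reproduced here, so the dictionary contents (the Age, City, and Score columns and all values) are hypothetical stand-ins. Only the Name column and the four-row shape come from the text.

```python
import pandas as pd

# Hypothetical data: four keys (columns) and four rows.
data = {
    "Name": ["Ann", "Ben", "Cara", "Dev"],
    "Age": [28, 34, 25, 41],
    "City": ["Pune", "Oslo", "Lima", "Kyiv"],
    "Score": [88, 92, 79, 85],
}
df = pd.DataFrame(data)

# The default index is a range over the row numbers.
print(df.index)  # RangeIndex(start=0, stop=4, step=1)

# Loop over the index and read one column value per row.
names = []
for i in df.index:
    names.append(df["Name"][i])
    print(i, df["Name"][i])
```

Looping over df.index rather than range(4) keeps the loop correct if the number of rows changes.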

2. Method: Using loc[ ] Function of the DataFrame

Let’s first understand the loc and iloc indexers. We create a Series named series_df in cell number [24] and print it to see the index labels along with the values. At cell number [26], we print series_df.loc[4], which gives the output c. The value stored at index label 4 is c, so loc looks up values by label.

At cell number [27], we print series_df.iloc[4] and get e, which is not the value at index label 4. Instead, iloc uses the integer position, which counts from 0 at the first row to the end. Counting from the first row, the value at position 4 is e. This is the difference between the two: loc selects by index label, and iloc selects by position.
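
The difference can be reproduced with a small Series. The index labels below are an assumption, chosen only so that label 4 maps to c and position 4 maps to e as described above. The actual labels used in the notebook are not known.

```python
import pandas as pd

# Hypothetical Series: labels start at 2, so labels and positions differ.
series_df = pd.Series(["a", "b", "c", "d", "e", "f"], index=[2, 3, 4, 5, 6, 7])
print(series_df)

by_label = series_df.loc[4]      # row whose index label is 4
by_position = series_df.iloc[4]  # fifth row, counting positions from 0

print(by_label)     # c
print(by_position)  # e
```

When a Series or DataFrame uses the default index (0, 1, 2, ...), labels and positions coincide, so loc and iloc return the same rows.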

Now, we are going to use .loc to iterate over the rows of a DataFrame.

In cell number [7], we just print the DataFrame which we created before. We are going to use the same DataFrame for this concept too.

In cell number [8], because the index labels start from zero (0) and match the row numbers, we can loop over the rows and pass each row label together with a column name to .loc to get the value in that column.
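
A minimal sketch of row iteration with .loc is shown below. The DataFrame contents are hypothetical, and the loop assumes the default index, whose labels run from 0 to len(df) - 1.

```python
import pandas as pd

# Hypothetical data standing in for the article's DataFrame.
df = pd.DataFrame({"Name": ["Ann", "Ben", "Cara", "Dev"], "Age": [28, 34, 25, 41]})

# .loc takes a row label and a column label.
pairs = []
for i in range(len(df)):
    pairs.append((df.loc[i, "Name"], df.loc[i, "Age"]))
    print(df.loc[i, "Name"], df.loc[i, "Age"])
```

If the index holds other labels, such as strings or dates, looping over df.index instead of range(len(df)) keeps the .loc lookups valid.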

3. Method: Using iterrows( ) Method of the DataFrame

Let’s first understand iterrows( ) and see how it returns the values.

In cell number [32]: we created a DataFrame df_test.

In cell number [33 and 35]: we print df_test to see how it looks. Then, we loop over it with iterrows( ) and print each row, which shows every value with its column name on the left side.

When we print the whole row as above, the column names appear on the left side. When we select a column by name instead, we get only the value of that column, as shown in cell number [37]. This confirms that iterrows( ) iterates row-wise.

In cell number [9]: we just print the DataFrame which we created before. We are going to use the same DataFrame for this concept too.

In cell number [10]: we iterate over each row using iterrows( ) and print the result.
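
The behavior described in this section can be sketched as follows, with hypothetical data in place of the notebook's DataFrame. iterrows() yields one (index, row) pair per row, where row is a Series whose labels are the column names.

```python
import pandas as pd

# Hypothetical data standing in for the article's DataFrame.
df = pd.DataFrame({"Name": ["Ann", "Ben", "Cara", "Dev"], "Age": [28, 34, 25, 41]})

seen = []
for index, row in df.iterrows():
    # Printing the whole row shows each column name next to its value.
    print(row)
    # Selecting by column name returns only that value.
    seen.append((index, row["Name"], row["Age"]))
```

Because each row is returned as a single Series, columns with different dtypes are converted to one common dtype for that row, so iterrows() does not always preserve the original column dtypes.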

4. Method: Using itertuples( ) Method of the DataFrame

This method is similar to iterrows( ). The only difference is how we access the values. In cell number [11], we access a column value on each iteration with the dot operator, as in row.Name.
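
A minimal sketch of attribute access with itertuples(), again using hypothetical data:

```python
import pandas as pd

# Hypothetical data standing in for the article's DataFrame.
df = pd.DataFrame({"Name": ["Ann", "Ben", "Cara", "Dev"], "Age": [28, 34, 25, 41]})

collected = []
for row in df.itertuples():
    # Each row is a named tuple: row.Index is the row label,
    # and every column is available as an attribute.
    collected.append((row.Index, row.Name, row.Age))
    print(row.Index, row.Name, row.Age)
```

Attribute access only works directly for column names that are valid Python identifiers. Other names are replaced with positional field names.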

5. Method: Using iloc[ ] Function of the DataFrame

We already explained before how the .iloc method works. So now, we are going to use that method directly to iterate the rows.

In cell number [18]: we just print the DataFrame, which we created before for this concept.

In cell number [19]: we use df.iloc[i, 0], in which i is the row position and 0 is the column position, that is, the first column.
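
The same loop can be sketched with positions only. The data is hypothetical. Column 0 is Name and column 1 is Age in this example.

```python
import pandas as pd

# Hypothetical data standing in for the article's DataFrame.
df = pd.DataFrame({"Name": ["Ann", "Ben", "Cara", "Dev"], "Age": [28, 34, 25, 41]})

values = []
for i in range(len(df)):
    # iloc[row position, column position]
    values.append((df.iloc[i, 0], df.iloc[i, 1]))
    print(df.iloc[i, 0], df.iloc[i, 1])
```

Since iloc ignores index labels, this loop works unchanged for any index, but it breaks silently if the column order changes.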

6. Method: Iterate Over Rows and Print Along With Their Column Names

In cell number [20]: we just print the DataFrame (df), which we created before to understand the concept.

In cell number [21]: we iterate with the itertuples() method, which we explained already. This time we print the whole row without selecting any column, so each value appears in the output along with its column name.
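
As a sketch with hypothetical data, printing each named tuple from itertuples() shows the field names next to the values:

```python
import pandas as pd

# Hypothetical data standing in for the article's DataFrame.
df = pd.DataFrame({"Name": ["Ann", "Ben", "Cara", "Dev"], "Age": [28, 34, 25, 41]})

rows = list(df.itertuples())
for row in rows:
    # The printed form has the shape Pandas(Index=..., Name=..., Age=...).
    print(row)

# The field names are also available directly.
fields = rows[0]._fields
first_as_dict = rows[0]._asdict()
print(fields)
```

Passing index=False to itertuples() drops the Index field if only the column values are needed.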

Conclusion:

In this article, we learned different methods to iterate over the rows of a pandas DataFrame. We looked at .loc and .iloc and the difference between them: .loc selects by index label, and .iloc selects by position. We also studied the iterrows( ) and itertuples( ) methods and the index attribute. Each of these methods has its own advantages and disadvantages, so which one to use depends on the situation.

About the author

Shekhar Pandey