The “Pandas” library makes it simple and efficient to work with tabular data in Python. Pandas “DataFrames” are like tables of data, with rows and columns. It is often necessary to select only the rows of a DataFrame that satisfy a condition, for example, only the rows where the age column value is greater than 18. Pandas offers several ways to accomplish this task.
This tutorial presents a detailed guide on selecting rows based on a condition using several examples.
How to Select Rows By Condition in Pandas DataFrame?
To select the rows of a Pandas DataFrame that satisfy a particular condition, different methods can be used in Python. Here are some of them:
- Using Relational Operators Method
- Using “df.isin()” Method
- Using “&” Operator
- Using “df.loc[]” Method
Method 1: Select DataFrame Rows By Condition Using Relational Operators Method
Relational operators are used together with square bracket notation to select rows of a Pandas DataFrame. In the below code, the “==” operator is used to select the DataFrame rows whose “Name” column value is equal to “Henry”.
import pandas

students = {'Name': ['Lily', 'Joseph', 'Henry', 'David'],'Age': [15, 23, 17, 26], 'Grades': ['A', 'B', 'A+', 'D']}
df = pandas.DataFrame(students)
print(df, '\n')
# Keep only the rows where the Name column equals 'Henry'
output = df[df['Name'] == 'Henry']
print(output)
The above code prints the full DataFrame, followed by the single row that has a “Henry” value in the “Name” column.
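To see why this selection works, it can help to look at the intermediate result of the comparison. The sketch below reuses the same sample data; the variable names “mask” and “selected” are illustrative and not part of the original example.

```python
import pandas

students = {'Name': ['Lily', 'Joseph', 'Henry', 'David'],
            'Age': [15, 23, 17, 26],
            'Grades': ['A', 'B', 'A+', 'D']}
df = pandas.DataFrame(students)

# The comparison produces a boolean Series with one True/False per row
mask = df['Name'] == 'Henry'
print(mask.tolist())  # [False, False, True, False]

# Indexing the DataFrame with the mask keeps only the rows marked True
selected = df[mask]
print(selected)
```

The same pattern applies to any relational operator: the comparison builds the mask, and the square brackets apply it.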
We can also use other relational operators to select rows based on a specified condition. In this example, the “>” operator is used to select only those rows that have an “Age” value greater than 18.
import pandas

students = {'Name': ['Lily', 'Joseph', 'Henry', 'David'],'Age': [15, 23, 17, 26], 'Grades': ['A', 'B', 'A+', 'D']}
df = pandas.DataFrame(students)
print(df, '\n')
# Keep only the rows where Age is greater than 18
output = df[df['Age'] > 18]
print(output)
When the above code is executed, it prints the full DataFrame, followed by the rows for “Joseph” (23) and “David” (26).
Method 2: Select DataFrame Rows By Condition Using “df.isin()” Method
The “isin()” method checks whether each value is contained in a given list of values and returns a boolean result that can be used to select DataFrame rows. In the following example, “isin()” is called on the “Grades” column to select the Pandas DataFrame rows whose grade is “A” or “A+”.
import pandas

students = {'Name': ['Lily', 'Joseph', 'Henry', 'David'],'Age': [15, 23, 17, 26], 'Grades': ['A', 'B', 'A+', 'D']}
df = pandas.DataFrame(students)
print(df, '\n')
# Keep only the rows whose Grades value is 'A' or 'A+'
output = df[df['Grades'].isin(['A', 'A+'])]
print(output)
The code prints the full DataFrame, followed by the rows for “Lily” (grade “A”) and “Henry” (grade “A+”).
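The “isin()” check can also be inverted to select the rows that are not in the list. The following is a minimal sketch on the same sample data, using the “~” operator to negate the boolean result; the variable name “not_top” is illustrative.

```python
import pandas

students = {'Name': ['Lily', 'Joseph', 'Henry', 'David'],
            'Age': [15, 23, 17, 26],
            'Grades': ['A', 'B', 'A+', 'D']}
df = pandas.DataFrame(students)

# "~" flips each True/False, so this keeps rows whose grade is NOT 'A' or 'A+'
not_top = df[~df['Grades'].isin(['A', 'A+'])]
print(not_top)
```

With this data, the rows for “Joseph” and “David” are kept.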
Method 3: Select DataFrame Rows By Condition Using “&” Operator
The “&” operator combines two conditions so that only the rows satisfying both of them are selected. In the below code, the “&” operator is placed between two conditions, so only the rows having an “Age” greater than 18 and a “Grades” value equal to “D” are fetched:
import pandas

students = {'Name': ['Lily', 'Joseph', 'Henry', 'David'],'Age': [15, 23, 17, 26], 'Grades': ['A', 'B', 'A+', 'D']}
df = pandas.DataFrame(students)
print(df, '\n')
# Keep only the rows where Age is greater than 18 AND Grades is 'D'
output = df[(df['Age'] > 18) & df['Grades'].isin(['D'])]
print(output)
The above code prints the full DataFrame, followed by the single row for “David”, the only student older than 18 with the grade “D”.
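Conditions can be combined with “|” (or) in the same way as with “&” (and). The sketch below, on the same sample data, shows both; the variable names “both” and “either” are illustrative. Each comparison is wrapped in parentheses because “&” and “|” bind more tightly than comparison operators in Python.

```python
import pandas

students = {'Name': ['Lily', 'Joseph', 'Henry', 'David'],
            'Age': [15, 23, 17, 26],
            'Grades': ['A', 'B', 'A+', 'D']}
df = pandas.DataFrame(students)

# AND: both conditions must hold for a row to be kept
both = df[(df['Age'] > 18) & (df['Grades'] == 'D')]
print(both, '\n')

# OR: a row is kept if at least one condition holds
either = df[(df['Age'] > 18) | (df['Grades'] == 'A')]
print(either)
```

Leaving out the parentheses around a comparison typically raises an error or gives an unintended result, so keeping them is the safer habit.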
Method 4: Select Rows By Condition Using “df.loc[]” Method
The “df.loc[]” indexer selects rows by index label, and it also accepts a boolean condition and returns the rows for which that condition is true. It is used in the following code to select the DataFrame rows based on a condition, which in this case requires the “Age” to be greater than or equal to 20.
import pandas

students = {'Name': ['Lily', 'Joseph', 'Henry', 'David'],'Age': [15, 23, 17, 26], 'Grades': ['A', 'B', 'A+', 'D']}
df = pandas.DataFrame(students)
print(df, '\n')
# Keep only the rows where Age is greater than or equal to 20
output = df.loc[df['Age'] >= 20]
print(output)
This code prints the full DataFrame, followed by the rows for “Joseph” (23) and “David” (26).
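One reason to prefer “df.loc[]” over plain square brackets is that it can select rows and columns in a single step. The following sketch, on the same sample data, passes the row condition first and a list of column names second; the variable name “subset” is illustrative.

```python
import pandas

students = {'Name': ['Lily', 'Joseph', 'Henry', 'David'],
            'Age': [15, 23, 17, 26],
            'Grades': ['A', 'B', 'A+', 'D']}
df = pandas.DataFrame(students)

# First argument: row condition; second argument: columns to keep
subset = df.loc[df['Age'] >= 20, ['Name', 'Grades']]
print(subset)
```

The result contains only the “Name” and “Grades” columns for the rows that meet the age condition.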
Conclusion
Relational operators, the “isin()” method, the “&” operator, and the “df.loc[]” indexer can all be used to select DataFrame rows based on particular conditions, whether single or multiple. This guide has presented a detailed tutorial on selecting rows according to a condition using several examples.