Filter Rows in the Dataframe

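To filter the rows of a dataframe, we can use the filter() function from the dplyr package and specify a condition inside it. Only the rows that satisfy the condition are returned. The package has to be loaded with library(dplyr) before filter() is called; otherwise R finds stats::filter(), a time-series function that does not filter dataframe rows.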

In this R tutorial, we will filter the rows using the filter() function.
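
A minimal setup sketch for the examples that follow, assuming dplyr can be installed from CRAN in the usual way. It checks for the package before attaching it; has_dplyr is only an illustrative name.

```r
# Check whether dplyr is installed without attaching it (has_dplyr is an illustrative name).
has_dplyr <- requireNamespace("dplyr", quietly = TRUE)

if (has_dplyr) {
  # Attach dplyr so that filter() refers to dplyr::filter() rather than stats::filter().
  library(dplyr)
} else {
  # Assumption: the package is available from CRAN via install.packages("dplyr").
  message("dplyr is not installed; run install.packages('dplyr') first.")
}
```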

Let’s create a dataframe with four rows and five columns.

#create a dataframe named market that has 4 rows and 5 columns
market <- data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'), market_place=c('India','USA','India','Australia'), market_type=c('grocery','bar','grocery','restaurant'), market_squarefeet=c(120,342,220,110))

#display market
print(market)

Result:

Let’s filter the rows in this dataframe.

Syntax:

filter(dataframe_object,condition)

Parameters:
It takes two parameters:

  1. dataframe_object is the dataframe to filter
  2. condition is a logical expression on the columns; only the rows for which it evaluates to TRUE are returned

We can specify the conditions using the relational and logical operators.
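
To make this concrete, a condition is simply an expression that yields one TRUE or FALSE per row, and filter() keeps the rows where it is TRUE. The sketch below evaluates such conditions on plain vectors that mirror two columns of the market dataframe. It uses only base R, and the names ids, places, and keep are illustrative.

```r
# Vectors mirroring the market_id and market_place columns (illustrative names).
ids <- c(1, 2, 3, 4)
places <- c('India', 'USA', 'India', 'Australia')

# A relational operator compares every element and returns a logical vector.
print(ids > 2)        # values: FALSE FALSE TRUE TRUE

# A logical operator combines two such vectors element by element.
keep <- ids > 2 & places == 'India'
print(keep)           # values: FALSE FALSE TRUE FALSE

# which() lists the positions (row numbers) where the condition holds.
print(which(keep))    # 3
```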

Example 1:
In this example, we will specify the condition on the market_id column.

We will keep only the rows whose market_id value is greater than 3.

#load dplyr, which provides the filter() function
library(dplyr)

#create a dataframe named market that has 4 rows and 5 columns
market <- data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'), market_place=c('India','USA','India','Australia'), market_type=c('grocery','bar','grocery','restaurant'), market_squarefeet=c(120,342,220,110))

#return rows only when the values in market_id column are greater than 3
print(filter(market, market_id > 3))

Result:

Only the row with market_id 4 (M4, Australia) is returned, because it is the only row that satisfies the greater than (>) comparison on the market_id column.

Example 2:
In this example, we will specify the condition on the market_id and market_place columns.

We will keep only the rows whose market_id value is greater than 2 and whose market_place value is “India”.

#load dplyr, which provides the filter() function
library(dplyr)

#create a dataframe named market that has 4 rows and 5 columns
market <- data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'), market_place=c('India','USA','India','Australia'), market_type=c('grocery','bar','grocery','restaurant'), market_squarefeet=c(120,342,220,110))

#return rows only when the values in market_id column are greater than 2 and place is India
print(filter(market, market_id > 2 & market_place == 'India'))

Result:

Only the row with market_id 3 (M3) is returned: it is the only row where market_id is greater than 2 and market_place equals “India”. The two comparisons, > on the market_id column and == on the market_place column, are combined with the and (&) operator.
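
As a side note, dplyr's filter() also accepts several conditions as separate comma-separated arguments and combines them with a logical AND, so the call above can be written without &. The sketch below assumes dplyr is installed and checks that both spellings return the same rows. The names with_and and with_comma are illustrative, and the dataframe is trimmed to three columns to keep the example short.

```r
library(dplyr)

# A trimmed copy of the market dataframe with only the columns needed here.
market <- data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'),
                     market_place=c('India','USA','India','Australia'))

# Two conditions joined explicitly with &.
with_and <- filter(market, market_id > 2 & market_place == 'India')

# The same two conditions passed as separate arguments; filter() ANDs them.
with_comma <- filter(market, market_id > 2, market_place == 'India')

# Both calls keep the same single row (M3).
print(with_and)
print(with_comma)
```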

Example 3:
In this example, we will specify the condition on the market_id and market_place columns.

We will keep the rows whose market_id value is greater than 2 or whose market_place value is “India”.

#load dplyr, which provides the filter() function
library(dplyr)

#create a dataframe named market that has 4 rows and 5 columns
market <- data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'), market_place=c('India','USA','India','Australia'), market_type=c('grocery','bar','grocery','restaurant'), market_squarefeet=c(120,342,220,110))

#return rows when the values in market_id column are greater than 2 or place is India
print(filter(market, market_id > 2 | market_place == 'India'))

Result:

Three rows are returned (M1, M3 and M4), since each satisfies at least one of the two conditions. The > comparison on the market_id column and the == comparison on the market_place column are combined with the or (|) operator.
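
When & and | appear in the same condition, R evaluates & before |, so parentheses are worth adding whenever the grouping matters. The base R sketch below shows the difference on vectors that mirror three columns of the market dataframe. The names ids, places, sqft, a, and b are illustrative.

```r
# Vectors mirroring market_id, market_place and market_squarefeet (illustrative names).
ids <- c(1, 2, 3, 4)
places <- c('India', 'USA', 'India', 'Australia')
sqft <- c(120, 342, 220, 110)

# Without parentheses, & binds tighter than |, so this reads as
# ids > 3 | (places == 'India' & sqft > 200).
a <- ids > 3 | places == 'India' & sqft > 200
print(a)    # values: FALSE FALSE TRUE TRUE

# Parentheses force the | to be evaluated first.
b <- (ids > 3 | places == 'India') & sqft > 200
print(b)    # values: FALSE FALSE TRUE FALSE
```

The same grouping rules apply when such an expression is passed as the condition to filter().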

Example 4:
In this example, we will specify the condition on the market_place column.

We will keep the rows whose market_place value is either “India” or “USA”, using the %in% operator.

#load dplyr, which provides the filter() function
library(dplyr)

#create a dataframe named market that has 4 rows and 5 columns
market <- data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'), market_place=c('India','USA','India','Australia'), market_type=c('grocery','bar','grocery','restaurant'), market_squarefeet=c(120,342,220,110))

#return rows only when the value in market_place is India or USA
print(filter(market, market_place %in% c('India','USA')))

Result:

Rows M1, M2 and M3 are returned, because the %in% operator keeps every row whose market_place value is “India” or “USA”. The Australia row is dropped.
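
For intuition, %in% tests each value for membership in a set, which here gives the same answer as chaining == comparisons with |. The base R sketch below checks this on a vector that mirrors the market_place column and also shows how to negate the test. The names places, in_set, chained, and not_in_set are illustrative.

```r
# Vector mirroring the market_place column (illustrative name).
places <- c('India', 'USA', 'India', 'Australia')

# Membership test: TRUE where the value is one of the listed places.
in_set <- places %in% c('India', 'USA')
print(in_set)        # values: TRUE TRUE TRUE FALSE

# The same test written as two == comparisons joined with |.
chained <- places == 'India' | places == 'USA'
print(identical(in_set, chained))    # TRUE

# Wrapping the test in !( ) selects the values that are NOT in the set.
not_in_set <- !(places %in% c('India', 'USA'))
print(not_in_set)    # values: FALSE FALSE FALSE TRUE
```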

Conclusion

In this article, we worked through four examples of filtering a dataframe with the filter() function, specifying the conditions with relational operators, logical operators, and the %in% operator.

About the author

Sireesha Lavu

This is Sireesha Lavu from Gogulamudi, Andhra Pradesh, India 522015.
I am currently working as a teacher and interested in writing technical articles on computer science.