In this R tutorial, we will filter the rows of a dataframe using the filter() function from the dplyr package.
Let’s create a dataframe with four rows and five columns.
market=data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'), market_place=c('India','USA','India','Australia'), market_type=c('grocery','bar','grocery','restaurent'), market_squarefeet=c(120,342,220,110))
#display market
print(market)
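Result: the dataframe is displayed with four rows (markets M1 to M4) and the five columns market_id, market_name, market_place, market_type and market_squarefeet.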
Let’s filter the rows in this dataframe.
Syntax:
filter(dataframe_object,condition)
Parameters:
It takes two parameters:
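- dataframe_object is the dataframe whose rows we want to filter
- condition is a logical expression on the columns; only the rows for which it evaluates to TRUE are returned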
We can build the conditions with relational operators (such as >, < and ==) and combine them with logical operators (such as & and |).
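To see what a condition actually is, the following minimal sketch evaluates a few of them directly in base R, without filter(). It uses a cut-down copy of the market dataframe with only two columns; the variable names (id_gt_2, in_india and so on) are illustrative and not part of any library.

```r
# Base R only: inspect the logical vectors that a filter condition produces.
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_place = c("India", "USA", "India", "Australia")
)

# Relational operators compare every value in a column and return TRUE/FALSE.
id_gt_2 <- market$market_id > 2              # FALSE FALSE TRUE TRUE
in_india <- market$market_place == "India"   # TRUE FALSE TRUE FALSE

# Logical operators combine those vectors element by element.
both <- id_gt_2 & in_india      # FALSE FALSE TRUE FALSE
either <- id_gt_2 | in_india    # TRUE FALSE TRUE TRUE
not_india <- !in_india          # FALSE TRUE FALSE TRUE

# Show all five vectors side by side, one row per market.
print(data.frame(id_gt_2, in_india, both, either, not_india))
```

Each vector has one entry per row of the dataframe, and filter() returns the rows where the entry is TRUE.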
Example 1:
In this example, we will specify the condition on the market_id column.
We will keep only the rows whose market_id value is greater than 3.
#load dplyr, which provides filter()
library(dplyr)
market=data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'), market_place=c('India','USA','India','Australia'), market_type=c('grocery','bar','grocery','restaurent'), market_squarefeet=c(120,342,220,110))
#return rows only when the values in market_id column are greater than 3
print(filter(market,market_id>3))
Result: only the row for market M4 (market_id 4, Australia) is returned.
The greater than (>) operator on the market_id column keeps only the rows whose market_id exceeds 3.
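For comparison, the same selection can be written in base R, which makes the mechanism visible: the condition becomes a logical vector, and that vector picks the rows. This is a sketch of an equivalent base R approach (bracket indexing and subset()), not what filter() does internally; the names keep, by_index and by_subset are illustrative.

```r
# Base R equivalent of Example 1 (no dplyr needed).
market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c("M1", "M2", "M3", "M4"),
  market_place = c("India", "USA", "India", "Australia"),
  market_type = c("grocery", "bar", "grocery", "restaurent"),
  market_squarefeet = c(120, 342, 220, 110)
)

# The condition is a logical vector with one entry per row.
keep <- market$market_id > 3

# Bracket indexing keeps the rows where the vector is TRUE;
# the original row label is kept as well.
by_index <- market[keep, ]

# subset() takes the condition directly, much like filter() does.
by_subset <- subset(market, market_id > 3)

print(by_index)
print(by_subset)
```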
Example 2:
In this example, we will specify the condition on the market_id and market_place columns.
We will keep only the rows whose market_id value is greater than 2 and whose market_place value is “India”.
#load dplyr, which provides filter()
library(dplyr)
market=data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'), market_place=c('India','USA','India','Australia'), market_type=c('grocery','bar','grocery','restaurent'), market_squarefeet=c(120,342,220,110))
#return rows only when the values in market_id column are greater than 2 and place is India
print(filter(market,market_id>2 & market_place=='India'))
Result: only the row for market M3 (market_id 3, India) is returned.
The greater than (>) operator on the market_id column and the == operator on the market_place column are combined with the and (&) operator, so a row must satisfy both conditions to be kept.
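As a side note, dplyr's filter() also accepts several conditions separated by commas and combines them with &. The sketch below, assuming dplyr is installed, checks that both spellings select the same row; the names with_and and with_comma are illustrative.

```r
# Load dplyr, which provides filter().
library(dplyr)

market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c("M1", "M2", "M3", "M4"),
  market_place = c("India", "USA", "India", "Australia"),
  market_type = c("grocery", "bar", "grocery", "restaurent"),
  market_squarefeet = c(120, 342, 220, 110)
)

# Conditions joined explicitly with &.
with_and <- filter(market, market_id > 2 & market_place == "India")

# The same conditions passed as separate arguments; filter() ANDs them.
with_comma <- filter(market, market_id > 2, market_place == "India")

print(with_comma)

# Compare the two results.
print(all.equal(with_and, with_comma))
```

The comma form can be easier to read when there are many conditions, while & is needed when a condition has to be mixed with | inside a single expression.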
Example 3:
In this example, we will specify the condition on the market_id and market_place columns.
We will keep the rows whose market_id value is greater than 2 or whose market_place value is “India”.
#load dplyr, which provides filter()
library(dplyr)
market=data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'), market_place=c('India','USA','India','Australia'), market_type=c('grocery','bar','grocery','restaurent'), market_squarefeet=c(120,342,220,110))
#return rows only when the values in market_id column are greater than 2 or place is India
print(filter(market,market_id>2 | market_place=='India'))
Result: the rows for markets M1, M3 and M4 are returned.
The greater than (>) operator on the market_id column and the == operator on the market_place column are combined with the or (|) operator, so a row is kept if it satisfies at least one of the conditions.
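A condition can also be negated with the ! operator to get the rows that it would otherwise drop. The following sketch, assuming dplyr is installed, splits the dataframe into the rows matched by the or condition and the rest; or_rows and other_rows are illustrative names.

```r
# Load dplyr, which provides filter().
library(dplyr)

market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c("M1", "M2", "M3", "M4"),
  market_place = c("India", "USA", "India", "Australia"),
  market_type = c("grocery", "bar", "grocery", "restaurent"),
  market_squarefeet = c(120, 342, 220, 110)
)

# Rows that satisfy at least one of the two conditions.
or_rows <- filter(market, market_id > 2 | market_place == "India")

# Negating the whole condition returns the remaining rows.
other_rows <- filter(market, !(market_id > 2 | market_place == "India"))

print(or_rows$market_name)     # M1, M3 and M4
print(other_rows$market_name)  # M2
```

The parentheses matter here: without them, ! would apply only to the first comparison.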
Example 4:
In this example, we will specify the condition on the market_place column.
We will keep the rows whose market_place value is either “India” or “USA”, using the %in% operator.
#load dplyr, which provides filter()
library(dplyr)
market=data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'), market_place=c('India','USA','India','Australia'), market_type=c('grocery','bar','grocery','restaurent'), market_squarefeet=c(120,342,220,110))
#return rows only when the value in market_place is either India or USA
print(filter(market, market_place %in% c('India','USA')))
Result: the rows for markets M1, M2 and M3 are returned.
The %in% operator checks each market_place value against the vector c('India','USA') and keeps the rows whose value matches any element of it.
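To make the role of %in% concrete, the sketch below, assuming dplyr is installed, compares it with the equivalent chain of == tests joined by |, and also shows the negated form. The names with_in, with_or and not_in are illustrative.

```r
# Load dplyr, which provides filter().
library(dplyr)

market <- data.frame(
  market_id = c(1, 2, 3, 4),
  market_name = c("M1", "M2", "M3", "M4"),
  market_place = c("India", "USA", "India", "Australia"),
  market_type = c("grocery", "bar", "grocery", "restaurent"),
  market_squarefeet = c(120, 342, 220, 110)
)

# %in% tests each value against a set of allowed values.
with_in <- filter(market, market_place %in% c("India", "USA"))

# The same selection written as separate == tests joined by |.
with_or <- filter(market, market_place == "India" | market_place == "USA")

# Negating %in% keeps the rows whose value is NOT in the set.
not_in <- filter(market, !(market_place %in% c("India", "USA")))

print(with_in$market_name)   # M1, M2 and M3
print(not_in$market_name)    # M4
```

With two values the forms are about equally long, but %in% stays compact as the list of allowed values grows.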
Conclusion
In this article, we walked through four examples of filtering a dataframe with the filter() function, specifying the conditions with relational operators, logical operators and the %in% operator.