# Sort the DataFrame in R

Sorting a DataFrame is a common operation in data analysis and manipulation. R can sort by one or several columns, in ascending or descending order, using either base functions or add-on packages. In this article, we go through several functions that sort a DataFrame in the order we specify.

## Example 1: Sorting the DataFrame Using the order() Function in R

The order() function in R is used to sort a DataFrame by one or multiple columns. It does not sort the data itself: it returns the row indices that would put the given columns in sorted order, and those indices are then used to rearrange the rows of the DataFrame.

    emp = data.frame(names = c("Andy", "Mark", "Bonnie", "Caroline", "John"),
                     age = c(21, 23, 29, 25, 32),
                     salary = c(2000, 1000, 1500, 3000, 2500))

    cat("\n\n Dataframe Sorted by Names in Ascending Order\n")

    # order(names) gives the row positions in alphabetical order of names
    sorted_asc = emp[with(emp, order(names)), ]
    print(sorted_asc)

Here, we define the “emp” DataFrame with three columns: “names”, “age”, and “salary”. The cat() function prints a label for the output. The order() function returns the index positions that put a vector in ascending order, and with(emp, ...) lets us refer to the “names” column without writing emp$names. Indexing “emp” with those positions gives the sorted DataFrame, which we store in the “sorted_asc” variable and pass to print().
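To make the role of the returned indices concrete, here is a minimal sketch that reuses the “emp” DataFrame from above. The “idx” and “same” variables are names introduced only for this illustration.

```r
emp <- data.frame(
  names  = c("Andy", "Mark", "Bonnie", "Caroline", "John"),
  age    = c(21, 23, 29, 25, 32),
  salary = c(2000, 1000, 1500, 3000, 2500)
)

# order() returns row positions, not the sorted values themselves.
idx <- order(emp$names)
print(idx)  # 1 3 4 5 2

# Indexing the rows with those positions yields the sorted DataFrame.
sorted_asc <- emp[idx, ]

# with() only saves typing emp$; the result is the same.
same <- identical(sorted_asc, emp[with(emp, order(names)), ])
print(same)
```

Reading the indices directly can help when debugging a sort: position 1 of “idx” says which original row ends up first.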

Sorting “emp” this way returns its rows in ascending alphabetical order of the “names” column. To sort in descending order, a minus sign in front of the column works only for numeric columns such as “age”, for example order(-age). For a character column such as “names”, the minus sign raises an error, so the decreasing argument of order() is used instead.

## Example 2: Sorting the DataFrame Using the order() Function Parameters in R

The order() function also takes a decreasing argument that controls the sort direction. In the following example, we use this argument to sort in decreasing and then in increasing order:

    df = data.frame(
      id = c(1, 3, 4, 5, 2),
      course = c("Python", "Java", "C++", "MongoDB", "R"))

    print("Sorted in Decreasing order by ID")

    # decreasing = TRUE puts the largest id first
    print(df[order(df$id, decreasing = TRUE), ])

Here, we first create the “df” DataFrame with two columns, “id” and “course”. Next, we print a message to indicate that the DataFrame is going to be sorted in decreasing order based on the “id” column. The second print() call performs the sort and prints the result: inside it, we call order() on the “id” column of “df” and set the “decreasing” argument to TRUE to sort in decreasing order.
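When sorting by more than one column, each key may need its own direction. The following is a minimal sketch of two ways to do that with order(). The “staff” DataFrame and its values are hypothetical and are not part of the example above.

```r
# Hypothetical data used only for this illustration
staff <- data.frame(
  dept   = c("ops", "dev", "ops", "dev"),
  salary = c(1500, 3000, 2500, 1000)
)

# dept ascending, salary descending within each dept.
# Negating the key works here because salary is numeric.
idx_minus <- order(staff$dept, -staff$salary)

# Same ordering with one decreasing flag per key.
# A vector-valued decreasing argument requires method = "radix".
idx_radix <- order(staff$dept, staff$salary,
                   decreasing = c(FALSE, TRUE), method = "radix")

print(staff[idx_minus, ])
```

The second form also works when the descending key is a character column, where a minus sign is not allowed.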

Sorting “df” by “id” with decreasing = TRUE lists the rows from id 5 down to id 1. To get the results in ascending order instead, we set the decreasing argument of the order() function to FALSE, which is also its default value:

    print("Sorted in Increasing order by ID")

    print(df[order(df$id, decreasing = FALSE), ])

With decreasing = FALSE, the rows of the DataFrame come back ordered by the “id” column from 1 to 5.

## Example 3: Sorting the DataFrame Using the arrange() Function in R

We can also use the arrange() function from the “dplyr” package to sort a DataFrame by one or more columns, in ascending or descending order. The following R code uses arrange() to sort in ascending order:

    library("dplyr")

    student = data.frame(
      Id = c(3, 5, 2, 4, 1),
      marks = c(70, 90, 75, 88, 92))

    print("Increasing Order Sorting by Id ")

    # arrange() returns a new DataFrame sorted by Id (ascending by default)
    print(arrange(student, Id))

Here, we load the “dplyr” package to access the arrange() function. Then, we create a DataFrame with two columns and assign it to the “student” variable. Next, we call arrange() inside print() to sort the DataFrame. The arrange() function takes the “student” DataFrame as its first argument, followed by the name of the column to sort by, which is “Id” here. The print() function then prints the sorted DataFrame to the console.
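arrange() sorts in ascending order by default. As a minimal sketch of the descending case, the desc() helper from “dplyr” can wrap a column. This reuses the “student” DataFrame, assumes the “dplyr” package is installed, and uses “by_marks_desc” as a name chosen for illustration.

```r
library(dplyr)

student <- data.frame(
  Id    = c(3, 5, 2, 4, 1),
  marks = c(70, 90, 75, 88, 92)
)

# desc() reverses the sort direction for the column it wraps.
by_marks_desc <- arrange(student, desc(marks))
print(by_marks_desc)
```

Because desc() applies per column, it can be mixed with ascending keys in the same arrange() call.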

The arrange(student, Id) call returns the rows with the “Id” column in ascending order, from 1 to 5.

## Example 4: Sorting the DataFrame by Date in R

A DataFrame in R can also be sorted by date values. For this, the date strings must first be converted with the as.Date() function so that order() compares them as dates rather than as text.

    event_date = data.frame(event = c('3/4/2023', '2/2/2023',
                                      '10/1/2023', '29/3/2023'),
                            charges = c(3100, 2200, 1000, 2900))

    # Parse the strings as day/month/year, then order by the resulting dates
    event_date[order(as.Date(event_date$event, format = "%d/%m/%Y")), ]

Here, we have an “event_date” DataFrame which contains the “event” column with date strings in the “day/month/year” format. We need to sort these date strings in ascending order. To do so, we convert the strings in the “event” column to actual dates using the as.Date() function, describing their layout with the “format” parameter (“%d/%m/%Y”), and pass the result to order(), which returns the row indices in ascending date order.
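The format string has to match how the dates are written, otherwise as.Date() returns NA for values it cannot parse. The sketch below reuses the “event” strings from above to show the parsed dates and what happens with a mismatched format. The “dates”, “parsed”, “idx”, and “wrong” names are illustrative.

```r
dates <- c("3/4/2023", "2/2/2023", "10/1/2023", "29/3/2023")

# Correct layout: day/month/year
parsed <- as.Date(dates, format = "%d/%m/%Y")
print(parsed)

# Row positions in ascending date order
idx <- order(parsed)
print(idx)  # 3 2 4 1

# Mismatched layout (month/day/year): "29/3/2023" has no month 29,
# so it cannot be parsed and becomes NA.
wrong <- as.Date(dates, format = "%m/%d/%Y")
print(is.na(wrong))
```

Checking for NA values after the conversion is a cheap way to catch a wrong format before sorting.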

Indexing “event_date” with the order() of the converted dates returns its rows sorted by the “event” column in ascending date order.

## Example 5: Sorting the DataFrame Using the setorder() Function in R

The setorder() function from the “data.table” package is another way to sort a DataFrame. Like arrange(), it takes the DataFrame followed by the columns to sort by. The R code for setorder() is given as follows:

    library("data.table")

    d1 = data.frame(orderId = c(1, 4, 2, 5, 3),
                    orderItem = c("apple", "orange", "kiwi", "mango", "banana"))

    # Sort the rows of d1 by orderItem in ascending order
    print(setorder(d1, orderItem))

Here, we load the “data.table” library first since setorder() is a function of this package. Then, we use the data.frame() function to create a DataFrame with two columns. After this, we call setorder() within the print() function. The setorder() function takes the “d1” DataFrame as the first parameter and the “orderItem” column as the second parameter, which is the column the DataFrame is sorted by. It rearranges the rows in ascending order based on the values in the “orderItem” column.
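One detail worth illustrating is that setorder() reorders its input in place, so no assignment is needed, and a minus sign in front of a column requests descending order. The following is a sketch that assumes the “data.table” package is installed. The “orders” table is a data.table copy of the data above, created only for this illustration.

```r
library(data.table)

# Illustrative data.table holding the same values as d1
orders <- data.table(
  orderId   = c(1, 4, 2, 5, 3),
  orderItem = c("apple", "orange", "kiwi", "mango", "banana")
)

# Reorders `orders` itself; the minus sign means descending.
setorder(orders, -orderId)
print(orders)
```

Because the table is modified by reference, any code that still needs the original row order should work on a copy made with copy() before calling setorder().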

After the setorder(d1, orderItem) call, the rows of “d1” are in alphabetical order of the “orderItem” column.

## Example 6: Sorting the DataFrame Using the row.names() Function in R

Row names offer another way to sort a DataFrame in R. The row.names() function gets or sets the row names, and order() then sorts the rows by them.

    df <- data.frame(team=c('X', 'X', 'Y', 'Y', 'Z'),
                     score=c(91, 80, 86, 83, 95))

    # Assign custom row names, then sort the rows by them
    row.names(df) <- c('A', 'D', 'C', 'E', 'B')

    df[order(row.names(df)), ]

Here, the data.frame() function creates the “df” DataFrame with two columns, “team” and “score”. Then, the row names of the DataFrame are set using the row.names() function. After that, we call the order() function on those row names. It returns the indices of the rows in sorted order, which are used to reorganize the rows of the DataFrame.
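Row names are stored as character strings, so numeric-looking row names sort as text (“10” comes before “2”). The sketch below uses a hypothetical “scores” DataFrame to show the difference, along with one way to sort such row names numerically by converting them with as.integer().

```r
# Hypothetical data used only for this illustration
scores <- data.frame(score = c(91, 80, 86))
row.names(scores) <- c("10", "2", "1")

# Character comparison: "1" < "10" < "2"
as_text <- scores[order(row.names(scores)), , drop = FALSE]

# Numeric comparison after conversion: 1 < 2 < 10
as_number <- scores[order(as.integer(row.names(scores))), , drop = FALSE]

print(row.names(as_text))    # "1" "10" "2"
print(row.names(as_number))  # "1" "2" "10"
```

The drop = FALSE argument keeps the result a DataFrame even though it has a single column.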

Ordering “df” by row.names(df) returns its rows in alphabetical order of the row names: A, B, C, D, E.

## Conclusion

We have seen several functions for sorting a DataFrame in R. Each has its own advantage: order() is part of base R and needs no extra packages, arrange() from “dplyr” takes column names directly, and setorder() from “data.table” reorders the data in place. There are other ways to sort a DataFrame in R, but order(), arrange(), and setorder() are the most important and the easiest to use.