
Combine Columns in R

“Data sets are often split across several tables, for a variety of reasons. Sometimes it is simpler to gather information in smaller chunks; in other situations, it is better to keep the file size down. Whatever the reason the data is fragmented, the tables must share at least one column so that they can be combined when necessary. We’ll look at three main strategies to reduce your workload and ensure that each important column from your datasets is integrated properly. After going through the techniques and their practical example code, you will have a firm grasp of combining columns in R.”

Combining Columns in R

R provides several ways to combine the columns of a dataframe. In this article, we will cover 3 of them: the “paste()” function, the “unite()” function, and the “str_c()” function.

Combine Columns by “paste()” Function

One way to combine columns of a dataframe in R is the “paste()” function. It can combine columns from two different dataframes as well as columns within the same dataframe.

Before we focus on its implementation, first, we need to understand the syntax for the “paste()” function.

paste(data$c1, data$c2, sep = " ")

This call to the “paste()” function takes 3 arguments. “data” is the name of the dataframe, and “c1” is the name of a column in that dataframe. “c2” is another column of the same dataframe that you want to combine with the first one. “sep” is the separator: whatever string you pass to it is placed between the values of the 2 columns. Here it is a single space (" "), so the joined values are separated by a space.
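The effect of “sep” is easiest to see on plain vectors before moving to a dataframe. The snippet below is a minimal sketch; the vectors “first” and “last” and their values are hypothetical stand-ins for two columns.

```r
# Hypothetical vectors standing in for two dataframe columns
first <- c("Alex", "Sam")
last <- c("Smith", "Jones")

# sep decides what is placed between the paired values
with_space <- paste(first, last, sep = " ")
with_dash <- paste(first, last, sep = "-")
no_gap <- paste(first, last, sep = "")

print(with_space)  # "Alex Smith" "Sam Jones"
print(with_dash)   # "Alex-Smith" "Sam-Jones"
print(no_gap)      # "AlexSmith"  "SamJones"
```

If “sep” is left out, “paste()” falls back to a single space, so the first call would give the same result without it.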

We create a dataframe with 3 columns: “firstname,” “lastname,” and “age.” The “c()” function assigns the values to each column. The “firstname” and “lastname” columns hold character values, whereas the “age” column holds numeric values. The result of the “data.frame()” function is stored in a dataframe named “Info,” and the “print()” statement displays the dataframe we have just created.

colmn.png
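For readers who cannot see the screenshot, the step can be sketched roughly as follows. The column names and the dataframe name “Info” come from the description above; the actual values are placeholders chosen for illustration, not the ones used in the original example.

```r
# Placeholder values; only the column names and "Info" come from the article
Info <- data.frame(
  firstname = c("Alex", "Sam", "Jamie"),
  lastname = c("Smith", "Jones", "Brown"),
  age = c(25, 30, 35),
  stringsAsFactors = FALSE  # keeps the name columns as character on R < 4.0
)

# Display the dataframe we have just created
print(Info)
```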

In the output screen, you can see a table with 3 columns.

dtafrm out.png

Now we will use the “paste()” function. First, write the name of the dataframe, “Info,” followed by the “$” operator, which selects a column or assigns values to a new one. After the “$,” write “fullname,” the name of the new column that will receive the combined values of the 2 columns. Inside the “paste()” function, write the dataframe name, the “$” operator, and the first column you want to select. After a comma, write the dataframe name, the “$” operator, and the second column you want to merge. Finally, setting “sep” to a single space (" ") adds a space between the values of the 2 columns when combining them.

paste.png
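Put together, the steps described above look roughly like this. It is a sketch that rebuilds “Info” with placeholder values so that it runs on its own; only the column names and the “fullname” column come from the article.

```r
# Rebuild the dataframe (placeholder values)
Info <- data.frame(
  firstname = c("Alex", "Sam", "Jamie"),
  lastname = c("Smith", "Jones", "Brown"),
  age = c(25, 30, 35),
  stringsAsFactors = FALSE
)
print(Info)  # the initial dataframe

# Create the new column from the two existing ones, separated by a space
Info$fullname <- paste(Info$firstname, Info$lastname, sep = " ")
print(Info)  # the dataframe with the combined column
```

The original “firstname” and “lastname” columns stay in place; “fullname” is simply added as a fourth column.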

The image below shows both tables: the first is the initial dataframe, and the second includes the combined “fullname” column.

paste out.png

To display only the dataframe with the combined column, remove the first “print()” statement.

paste only - Copy.png

The output can be found in the image below.

space out.png

Combine Columns by “unite()” Function

Another way to join columns in R is the “unite()” function. It belongs to the “tidyr” package, which must be loaded first.

For using the “unite()” function, the syntax we will follow is:

unite(dataframe, combined_column, c(column1, column2))

The first argument is the dataframe you pass in. The second is the name of the new column where you wish to store the merged data. Inside the “c()” function are “column1” and “column2,” the columns you want to combine.

In this example, we first install and load the “tidyr” package. We create the dataframe the same way as in the previous example. We want to combine 2 columns: “firstname” and “lastname.” Inside the “unite()” function, we first write the dataframe name, “Info,” then “fullname,” the name of the column that will store the combined values, and then use the “c()” function to list the 2 columns to be combined. Finally, the “print()” statement displays the output.

unite.png
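A rough equivalent of the code in the screenshot is sketched below. It assumes the “tidyr” package is already installed; the data values and the variable name “combined” are placeholders, not taken from the original.

```r
# install.packages("tidyr")  # run once if the package is not installed
library(tidyr)

# Placeholder values; only the column names and "Info" come from the article
Info <- data.frame(
  firstname = c("Alex", "Sam", "Jamie"),
  lastname = c("Smith", "Jones", "Brown"),
  age = c(25, 30, 35),
  stringsAsFactors = FALSE
)

# Merge firstname and lastname into a single column called fullname
combined <- unite(Info, fullname, c(firstname, lastname))
print(combined)
```

Since no separator is passed here, “unite()” uses its default underscore, so the first value comes out as "Alex_Smith".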

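The output contains a new column named “fullname” that stores the merged values of the “firstname” and “lastname” columns. Note that unless a “sep” argument is supplied, “unite()” joins the values with an underscore ("_") and drops the original columns from the result.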
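To get a space instead of the default underscore of “unite()”, and to keep the original columns, the “sep” and “remove” arguments can be set. The following is a minimal sketch with placeholder data; the variable name “combined” is hypothetical.

```r
library(tidyr)

# Placeholder values; only the column names and "Info" come from the article
Info <- data.frame(
  firstname = c("Alex", "Sam", "Jamie"),
  lastname = c("Smith", "Jones", "Brown"),
  age = c(25, 30, 35),
  stringsAsFactors = FALSE
)

# sep = " " joins with a space; remove = FALSE keeps firstname and lastname
combined <- unite(Info, fullname, c(firstname, lastname),
                  sep = " ", remove = FALSE)
print(combined)
```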

unite out.png

Combine Columns by “str_c()” Function

Now we will join columns with the “str_c()” function. For this, we first need to install and load the “stringr” package.

The “str_c()” function uses this syntax:

str_c(dataframe$column1, " ", dataframe$column2)

“dataframe” is the dataframe we have created. “column1” and “column2” are the 2 columns that we want to concatenate. Notice the space (" ") passed between the 2 columns. Without this extra argument, the values of both columns are merged with no space between them.

Here’s an example of what we’ve done.

stringr.png

At the very beginning, we load the “stringr” package, which allows us to use the “str_c()” function. We create the same dataframe as in the examples above, following the same steps, and store it in a dataframe named “Info.”

Inside the “str_c()” function, we pass both columns as “Info$firstname” and “Info$lastname,” with a space between them so that the values don’t run together when concatenated.
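The steps just described can be sketched as follows, assuming the “stringr” package is installed. The data values are placeholders, and storing the result in a “fullname” column is an assumption made here to match the earlier “paste()” example.

```r
# install.packages("stringr")  # run once if the package is not installed
library(stringr)

# Placeholder values; only the column names and "Info" come from the article
Info <- data.frame(
  firstname = c("Alex", "Sam", "Jamie"),
  lastname = c("Smith", "Jones", "Brown"),
  age = c(25, 30, 35),
  stringsAsFactors = FALSE
)

# The " " argument sits between the two columns and becomes the space
Info$fullname <- str_c(Info$firstname, " ", Info$lastname)
print(Info)

# Without the " " argument, the values are joined with nothing in between
no_space <- str_c(Info$firstname, Info$lastname)
print(no_space)  # "AlexSmith" "SamJones" "JamieBrown"
```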

You can see the resultant combined column of the example demonstrated above.

space out.png

Conclusion

Combining columns in R is a simple and useful technique. It can be done in several ways; the examples here were run using RStudio on Ubuntu 20.04. We have introduced three methods for concatenating columns in R: the “paste()”, “unite()”, and “str_c()” functions. By walking through example code and explaining each step, our goal is to give you a simple and reliable guide to combining columns in R.

About the author

Saeed Raza
