Set Seed in R

In R, you generate pseudorandom numbers rather than truly random numbers. These numbers are produced by an algorithm that starts from a seed. Because the sequence is pseudorandom rather than purely random, it can be predicted (and reproduced) if the seed and generator are known. In this tutorial, you will learn what setting a seed means, what the set.seed function does in R and how it works, how to set or unset a seed, and how to produce repeatable outputs as a result.

The purpose of the set.seed() function is to make randomized results reproducible. Whenever we randomly pick observations in R or any other statistical software, we get different values on every run. If we wish to keep the values produced by the initial random pick, we can either store the results in an object or fix the seed of the random number generator so that we always get the same results.

What Does the set.seed() Function Do in R in Ubuntu 20.04?

A pseudorandom number generator has to be initialized with a seed before it can produce numbers; if you do not supply one, R creates one for you. Most simulation methods in statistics rely on the ability to generate pseudorandom numbers that mimic the properties of independent draws from the uniform distribution on the interval (0, 1). These pseudorandom sequences are produced by a recursive algorithm called a random number generator (RNG):

$$x_i = f(x_{i-1}, x_{i-2}, \ldots, x_{i-k})$$

where $(x_0, x_1, \ldots, x_{k-1})$ is the seed, or the generator's initial state, and $k$ is the generator's order. R uses the Mersenne-Twister generator by default; a different generator can be selected with the RNGkind function or with the kind argument of the set.seed function. The syntax of the set.seed function in R is as follows:

Syntax:

set.seed(n)

Here, n is an integer that serves as the seed. The seed value you select is used as the starting point for generating a sequence of random numbers. As a result, the same seed with the same generator gives you the same results.
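As a rough illustration of choosing a generator, the sketch below checks the active generator with RNGkind() and then seeds two different generators with the same value through the kind argument of set.seed. The seed 42 and the variable names are assumptions made for this example only.

```r
# Show which generators are currently active; the first element is the
# uniform generator, which is "Mersenne-Twister" by default.
current_kinds <- RNGkind()
print(current_kinds)

# Same seed, default generator.
set.seed(42, kind = "Mersenne-Twister")
draws_mt <- runif(3)

# Same seed, different generator: the sequence is not the same.
set.seed(42, kind = "Wichmann-Hill")
draws_wh <- runif(3)

print(identical(draws_mt, draws_wh))

# Switch back to the default generator, since the kind persists in the session.
RNGkind("Mersenne-Twister")
```

Note that the kind argument changes the generator for the rest of the session, which is why the sketch switches back at the end.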

How Does the set.seed Function Work in R in Ubuntu 20.04?

Let’s look at an example of how to use R’s set.seed() function to generate a reproducible sample of random numbers. An example that uses set.seed() with a data frame is also shown.

Example # 1: Using the set.seed Function for the Random Values in R in Ubuntu 20.04

When you use pseudorandom number functions without the set.seed function, you get a different result each time you run them.

First, consider random numbers generated without the set.seed function. R has a built-in function called rnorm that creates a vector of normally distributed random numbers. Passing the value 3 to rnorm returns three random values. However, if you run the same code again, the outcome is different. Because you don’t know the seed R used to construct that sequence, the code isn’t reproducible.
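A minimal sketch of this behavior follows. The variable names first_draw and second_draw are illustrative and do not come from the original listing.

```r
# Draw three values from the standard normal distribution, twice,
# without setting a seed beforehand.
first_draw <- rnorm(3)
second_draw <- rnorm(3)

print(first_draw)
print(second_draw)

# The two draws differ, and so will the values printed in another session.
print(identical(first_draw, second_draw))
```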

Now, we call the set.seed function with a value. The random number generator’s current state is held in .Random.seed, an integer vector whose length is determined by the generator, and we save it in the variable x. Then, we call rnorm. We read .Random.seed a second time into y, but without setting the seed again, so rnorm generates different random values each time. Comparing x and y with identical returns FALSE because the generator state has moved on.
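The idea can be sketched as follows. This is a reconstruction under assumptions rather than the original listing: the seed 123 and the names a and b are chosen only for illustration, while x and y follow the description above.

```r
set.seed(123)
x <- .Random.seed   # generator state right after seeding
a <- rnorm(3)
y <- .Random.seed   # generator state after drawing three values

# Drawing numbers advances the state, so the two snapshots differ.
print(identical(x, y))

# Setting the same seed again restores the starting state,
# so the same three values come back.
set.seed(123)
b <- rnorm(3)
print(identical(a, b))
```

Resetting the seed immediately before each draw is what makes the two results match; seeding once and drawing twice gives two different sets of values.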

We can pass any integer value to the set.seed function. Larger seed values work in the same way, and each one yields its own reproducible sequence of random values.

Example # 2: Using the set.seed Function for a Random Sample Data Frame in R in Ubuntu 20.04

Let’s look at an example that uses the set.seed() function to extract a reproducible random sample from a data frame.

We call the set.seed function with the value 1234. Then, we create a variable named index, where the sample function draws 10 row numbers from the mtcars data frame. Subsetting mtcars with index returns ten randomly selected rows in the output, not the first ten.
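A sketch of these steps, assuming the row numbers are drawn with sample(nrow(mtcars), 10); the names sample_df and index_again are illustrative.

```r
# mtcars ships with base R, so no extra package is needed.
set.seed(1234)
index <- sample(nrow(mtcars), 10)   # 10 distinct row numbers
sample_df <- mtcars[index, ]
print(sample_df)

# Re-seeding with the same value selects exactly the same rows.
set.seed(1234)
index_again <- sample(nrow(mtcars), 10)
print(identical(index, index_again))
```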

Thus, the set.seed function makes the random sample of the data set reproducible.

Example # 3: Using the set.seed Function for Computing the Median in R in Ubuntu 20.04

Setting a seed in R is especially useful in simulation studies. Assume you want to find the median of a set of numbers drawn from a uniform distribution, repeated several times.

We call the set.seed function with an integer. Then, we create a variable n_rep that holds the number of repetitions and another variable, n, that holds the number of points in each sample. The numeric function creates a vector of length n_rep, and we then loop over the repetitions, storing the median of each sample.
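The simulation can be sketched roughly as below. Wrapping the steps in a function named simulate_medians is a choice made here so the run can be repeated easily; the function name, the seed 1, and the default values of n_rep and n are assumptions, not taken from the original listing.

```r
# Hypothetical helper: seeds the generator, then stores the median of
# n uniform draws for each of n_rep repetitions.
simulate_medians <- function(seed, n_rep = 5, n = 10) {
  set.seed(seed)
  medians <- numeric(n_rep)          # vector of zeros, length n_rep
  for (i in seq_len(n_rep)) {
    medians[i] <- median(runif(n))   # runif draws from Uniform(0, 1)
  }
  medians
}

run_1 <- simulate_medians(seed = 1)
run_2 <- simulate_medians(seed = 1)

print(run_1)
print(identical(run_1, run_2))
```

Because the seed is set inside the function, every call with the same arguments repeats the whole simulation exactly.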

Because the seed is fixed, running this code again returns the same median values every time.

Example # 4: Using the set.seed Function to Unset a Seed in R in Ubuntu 20.04

Finally, you might want to reset or unset a seed in R. There are two ways to do this.

Since R uses the system clock (together with the process ID) to generate a seed when one is not supplied, you can mimic the default behavior by passing the current time from Sys.time to the set.seed function. Alternatively, you can pass NULL to the set.seed function, which re-initializes the generator as if no seed had been set.
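Both approaches can be sketched as follows. The explicit conversion of Sys.time() to an integer and the variable names are assumptions made for this example.

```r
# Start from a fixed seed so there is something to unset.
set.seed(42)
fixed_draw <- runif(3)

# Approach 1: pass NULL to re-initialize the generator as if no seed was set.
set.seed(NULL)
after_null <- runif(3)

# Approach 2: seed from the clock; the time is converted to whole seconds.
set.seed(as.integer(as.numeric(Sys.time())))
after_clock <- runif(3)

print(fixed_draw)
print(after_null)
print(after_clock)
```

The clock-based seed has a resolution of one second in this sketch, so two runs started within the same second would receive the same seed; set.seed(NULL) does not have that limitation.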

Conclusion

We can use a random seed in R to ensure that the results of our R code are repeatable. By specifying a seed, the random operations in our program always start from the same generator state and, as a result, produce the same output. We covered the set.seed function through examples for several cases. All the examples were executed in the Ubuntu terminal and produced valid outputs.

About the author

Saeed Raza