Histogram in R

Histograms are diagrams composed of rectangles that display the distribution of a statistical data set. Each rectangle represents the frequency of a variable within a continuous range of values, which makes histograms a simple and versatile way to summarize data graphically.

This article covers the histogram in R. We will first go through the syntax of the hist() function and then look at some examples of how to create histograms with it.

The Histogram in R:

In R programming, histograms are very helpful for visualizing how values are distributed across user-defined ranges. The histogram is one of the most often used plots for graphical data display and analysis. Histograms are commonly depicted as vertical rectangles aligned along a two-dimensional axis, displaying a comparison of data groups. The data counts are represented by the height of the columns on the y-axis, while the data groups’ value ranges are shown on the x-axis. In R, a histogram may be generated for a specific variable, which is useful for variable selection and feature learning applications in data science projects.
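As a small illustration of counts per group, the sketch below computes a histogram without drawing it and prints the bin edges and the counts. The vector x and the breakpoints are made-up values chosen only for this illustration.

```r
# Hypothetical sample data (not from the article)
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5)

# plot = FALSE returns the histogram object without drawing it
h <- hist(x, breaks = c(0, 2, 4, 6), plot = FALSE)

print(h$breaks)  # bin edges, shown along the x-axis: 0 2 4 6
print(h$counts)  # bar heights, shown along the y-axis: 3 5 1
```

With the default settings each bin includes its right edge, so the value 2 falls in the first bin and the value 4 in the second.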

Constructing a Histogram in R:

The syntax for the construction of a histogram in R is:

hist(v, main, xlab, ylab, xlim, ylim, breaks, col, border)

“v” represents the data (a numeric vector) used to create the histogram. “main” is the title of the chart, “col” is the color of the bars, “xlab” is a label for the horizontal axis, and “ylab” is a label for the vertical axis. “xlim” defines the horizontal axis limits and “ylim” defines the vertical axis limits. “breaks” specifies how the data is divided into bins, either as a suggested number of bins or as a vector of breakpoints. “border” specifies the color of the bars’ borders.
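To check which of these arguments your installed version of R accepts, one option is to inspect the default method behind hist(). This is only a sketch; note that R itself names the data argument x, whereas the syntax line in this article calls it v.

```r
# Look up the default S3 method behind hist() and list its argument names
hist_default <- getS3method("hist", "default")
arg_names <- names(formals(hist_default))
print(arg_names)

# Check that the arguments discussed in this article are present
wanted <- c("main", "xlab", "ylab", "xlim", "ylim", "breaks", "col", "border")
print(wanted %in% arg_names)
```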

We’ll work through several examples to understand and apply the parameters described above.

Example # 1: Creating a simple Histogram in R using a built-in dataset or vector:

A histogram needs data to work with, and R ships with a number of built-in datasets as well as numerous graphical functions in its base packages and libraries. We use the built-in AirPassengers dataset in this example. To construct a histogram, pass the data to the hist() function. When the data is stored in a data frame, use the $ sign to choose a specific column; AirPassengers is a single time series of monthly passenger counts, so it can be passed to hist() directly.

The following example creates a histogram of the values in the AirPassengers dataset:
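The original listing is not reproduced here. A minimal sketch of such a call, assuming only the AirPassengers dataset that ships with base R, might look like this:

```r
# AirPassengers is a built-in monthly time series, so no import is needed
h <- hist(AirPassengers)

# hist() also returns the computed bins, which can be inspected
print(h$breaks)
print(h$counts)
```

Because no title or labels are given, R derives them from the name of the data passed in.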

The resultant histogram looks like this:

In the above example, you learned to create a histogram from a built-in dataset. You can just as quickly generate a histogram from your own values: place the name of your data between the parentheses of the “hist()” function. The function accepts a numeric vector and plots its histogram.

Using the “main” option, you can add a title to the histogram. Passing “main” as an argument to the hist() function replaces the default title. In this scenario, you create a histogram from a vector named “s” and title it “All students”, while “xlab” names the x-axis.
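A minimal sketch of this call is shown here. The title and the x-axis label come from this article's description; the values stored in s are hypothetical, since the original data is not shown.

```r
# Hypothetical data: the article's actual values of s are not shown
s <- c(12, 15, 17, 22, 23, 25, 27, 33, 34, 38, 41, 45)

# main sets the plot title, xlab sets the x-axis label
h <- hist(s, main = "All students", xlab = "No. of students")
```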

You can see the output histogram of hist() for a list of numbers:

Example # 2: Adding Color, Border, and Breaks to the Histogram:

The default appearance is rarely the best for analyzing your histograms, and a little customization helps you understand them better. R provides quick and straightforward options for improving the plot while still using the hist() function.

The “col” argument adds color to the histogram; pass it a color name. You can also add a border color to the bins with the “border” parameter. The “breaks” option lets us choose the number of bars in the histogram. When given as a single number, however, it is only a suggestion, and R may pick a nearby number of bins that gives round breakpoints. If you want full control over the breakpoints between bins, give “breaks” a vector of breakpoints, which you can build with the “c()” function.
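The two ways of using “breaks” can be sketched as follows. The data in s and the specific breakpoints are hypothetical values chosen for illustration; the colors follow the ones this article uses.

```r
# Hypothetical data: the article's actual values of s are not shown
s <- c(12, 15, 17, 22, 23, 25, 27, 33, 34, 38, 41, 45)

# A single number is only a suggested bin count
h1 <- hist(s, col = "blue", border = "black", breaks = 6)
print(length(h1$counts))  # number of bins R actually used

# A vector of breakpoints is used exactly as given
h2 <- hist(s, col = "blue", border = "black",
           breaks = c(0, 10, 20, 30, 40, 50))
print(h2$breaks)
```

When breakpoints are given explicitly, they must cover the whole range of the data, otherwise hist() stops with an error.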

Colored output histogram:

Example # 3: Setting Ranges of the X-Axis and Y-Axis:

To specify the range of values, use the “xlim” and “ylim” arguments. The range provided to these two arguments will determine the axes of our histogram graphic. Let’s have a glimpse at how it’s done below.
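A minimal sketch of such a call is given here. The title, label, colors, axis limits, and “breaks” value are the ones this example describes; the contents of s are hypothetical and were picked so that they fit inside those limits.

```r
# Hypothetical data: the article's actual values of s are not shown
s <- c(12, 15, 17, 22, 23, 25, 27, 33, 34, 38, 41, 45)

h <- hist(s,
          main = "All Students",
          xlab = "No. of students",
          col = "blue",           # fill color of the bins
          border = "black",       # outline color of the bins
          xlim = c(0, 50),        # x-axis runs from 0 to 50
          ylim = c(0, 4),         # y-axis runs from 0 to 4
          breaks = 6)             # suggested number of bins
```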

The histogram in this example has an x-axis limited to values 0 to 50 and a y-axis limited to values 0 to 4. With “xlim” and “ylim”, the “c()” function supplies the limits for each axis. It takes two values: one for the start and one for the end.

The script for this example creates a histogram of the data values in “s”, titles it “All Students”, labels the x-axis “No. of students”, adds a black border and a blue fill to the bins, limits the x-axis to 0 to 50 and the y-axis to 0 to 4, and sets “breaks” to 6. Note that a single number passed to “breaks” is a suggested number of bins rather than a bin width.

Example # 4: Histogram with Hatched Fill Pattern:

You can also construct a histogram with a hatched fill pattern. In this example, a histogram hatched with 45° slanting lines is constructed. In hist(), the “density” argument sets the density of the shading lines in lines per inch, and the “angle” argument sets their slope in degrees. This is unrelated to a density histogram, in which the area of each bar rather than its height represents the proportion of observations. The hatching lines will be blue while the borders will be black.
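A hatched histogram along these lines can be sketched as follows. The 45° angle and the colors follow this example's description; the data in s and the line density of 20 are assumptions made for illustration.

```r
# Hypothetical data: the article's actual values of s are not shown
s <- c(12, 15, 17, 22, 23, 25, 27, 33, 34, 38, 41, 45)

h <- hist(s,
          density = 20,      # shading lines per inch (assumed value)
          angle = 45,        # slope of the shading lines in degrees
          col = "blue",      # color of the hatching lines
          border = "black")  # color of the bin outlines
```

When “density” is set, “col” colors the hatching lines instead of filling the bars with a solid color.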

When we run the above code, we get the following result:

Conclusion:

In this article, we explored the fundamentals of building histograms in R using RStudio. We ran several pieces of code to exercise the arguments of the “hist()” function. In four examples, we covered the basic construction of a histogram, how to add color and borders, how to restrict the ranges of the x and y axes, and how to add hatching. By following these simple examples, you should be able to improve the visual appeal of your histograms.

About the author

Saeed Raza
