Plotly.Express.Box

One of the most common statistical plots is the box plot. A box plot summarizes the distribution of numerical data using quartiles. The two ends of the box mark the first (lower) and third (upper) quartiles, and a line inside the box marks the second quartile, which is the median.
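
To make the quartile description concrete, here is a minimal sketch that computes the three values a box is drawn from. It uses only the Python standard library. The sample list and the choice of method="inclusive" are assumptions made for illustration; other quartile conventions can give slightly different results.

```python
import statistics

# Hypothetical sample of bill totals, chosen only for illustration.
bills = [2, 4, 4, 5, 7, 9, 10, 12]

# statistics.quantiles with n=4 returns the three cut points Q1, Q2, Q3.
# method="inclusive" is an assumption for this sketch; other quartile
# conventions can give slightly different Q1 and Q3 values.
q1, q2, q3 = statistics.quantiles(bills, n=4, method="inclusive")

# The distance between the two ends of the box (interquartile range).
iqr = q3 - q1

print(f"Q1 (lower end of the box): {q1}")
print(f"Q2 (median line):          {q2}")
print(f"Q3 (upper end of the box): {q3}")
print(f"IQR (length of the box):   {iqr}")
```

The box spans Q1 to Q3, and the line inside it sits at Q2.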

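Although the two look different, a box plot and a violin plot summarize the same kind of data. The main difference is that a violin plot also draws an estimate of the distribution's density, while a box plot shows only the quartile summary.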

This article explores how to create box plots using the Plotly Express module.

Function Syntax

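The syntax for the box() function is shown below. Newer Plotly releases may add parameters or change some defaults, so check the documentation for your installed version: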

plotly.express.box(data_frame=None, x=None, y=None, color=None, facet_row=None, facet_col=None, facet_col_wrap=0, hover_name=None, hover_data=None, custom_data=None, animation_frame=None, animation_group=None, category_orders={}, labels={}, color_discrete_sequence=None, color_discrete_map={}, orientation=None, boxmode=None, log_x=False, log_y=False, range_x=None, range_y=None, points=None, notched=False, title=None, template=None, width=None, height=None)

The most commonly used parameters are:

  1. data_frame – the data frame whose column names are referenced by the other parameters
  2. x – the column whose values position the marks along the x-axis in the Cartesian coordinate system
  3. y – the column whose values position the marks along the y-axis in the Cartesian coordinate system
  4. color – the column whose values are used to assign a distinct color to each group of marks
  5. points – controls which underlying data points are drawn alongside the boxes
  6. notched – if True, the boxes are drawn with notches
  7. title – the title of the figure
  8. width/height – the width and height of the figure in pixels

Example

The following code shows how to create a basic box plot:

import plotly.express as px
df = px.data.tips()
fig = px.box(df, y='total_bill')
fig.show()

To create multiple box plots, specify both the x and y parameters. With a categorical column as x, one box is drawn for each category:

import plotly.express as px
df = px.data.tips()
fig = px.box(df, x='sex', y='total_bill')
fig.show()

To display the underlying data points next to each box, set the points parameter to 'all', as shown below:

import plotly.express as px
df = px.data.tips()
fig = px.box(df, x='sex', y='total_bill', points='all')
fig.show()

To draw notched boxes, set the notched parameter to True:

import plotly.express as px
df = px.data.tips()
fig = px.box(df, x='sex', y='total_bill', points='all', notched=True)
fig.show()

To give each category its own color, pass a column name to the color parameter:

import plotly.express as px
df = px.data.tips()
fig = px.box(df, x='sex', y='total_bill', points='all', notched=True, color='sex')
fig.show()

You can also change the algorithm used to calculate the quartiles by setting the quartilemethod attribute on the traces. The following example uses the inclusive algorithm:

import plotly.express as px
df = px.data.tips()
fig = px.box(df, x='sex', y='total_bill', points='all', notched=False, color='sex')
fig.update_traces(quartilemethod='inclusive')
fig.show()

You can check the following resource to learn about various quartile algorithms.

https://en.wikipedia.org/wiki/Quartile
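
To see how the choice of algorithm can change the result, here is a minimal pure-Python sketch of two median-of-halves rules. It assumes that the 'exclusive' and 'inclusive' options split the sorted data at the median and, for an odd number of values, either leave the median out of both halves or include it in both. The quartiles helper is hypothetical and is not part of Plotly; it also does not cover interpolation-based methods.

```python
from statistics import median


def quartiles(values, method="exclusive"):
    """Return (Q1, median, Q3) using a median-of-halves rule.

    Hypothetical helper for illustration; expects at least two values.
    """
    if method not in ("exclusive", "inclusive"):
        raise ValueError("method must be 'exclusive' or 'inclusive'")

    data = sorted(values)
    n = len(data)
    half = n // 2

    if n % 2 == 0:
        # Even count: the two halves are the same under both rules.
        lower, upper = data[:half], data[half:]
    elif method == "inclusive":
        # Odd count: the median belongs to both halves.
        lower, upper = data[:half + 1], data[half:]
    else:
        # Odd count: the median is left out of both halves.
        lower, upper = data[:half], data[half + 1:]

    return median(lower), median(data), median(upper)


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print("exclusive:", quartiles(sample, "exclusive"))
print("inclusive:", quartiles(sample, "inclusive"))
```

With an odd number of values, the two rules give different first and third quartiles, so the box ends move while the median line stays in place.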

Conclusion

This article explained what a box plot is and how it shows the distribution of numerical data using quartiles. It also walked through several ways to create and customize box plots using the Plotly Express module.

About the author

John Otieno

My name is John, and I am a fellow geek like you. I am passionate about all things computers, from hardware and operating systems to programming. My dream is to share my knowledge with the world and help out fellow geeks. Follow my content by subscribing to the LinuxHint mailing list.