PostgreSQL

Autovacuum in PostgreSQL

Vacuum? Yes, the concept does exist in Postgres databases. When a record is deleted in Postgres, a dead tuple is formed and with the help of the vacuum command, these dead tuples can be removed. After removing the dead tuples, the ANALYZE action of Postgres is performed on the table to update the statistics. To perform the vacuum and analyze actions collectively and automatically, the AUTOVACUUM command of Postgres is utilized. It performs both the deletion and then updates the statistics . To get a more in-depth understanding of this Postgres utility, this guide is structured to get insight and good hands-on AUTOVACUUM in Postgres.

How Postgres Auto vacuum works

The Autovacuum is the background process in Postgres and is enabled by default. However, you can get the settings of Autovacuum by navigating to the postgresql.conf file.

As discussed earlier, the auto vacuum performs both vacuuming and analysis. Postgres analyzes and vacuums a table based on the threshold described below:

For the automatic vacuuming threshold of a table, the following equation is utilized:

Autovacuum VACUUM threshold = autovacuum_vacuum_scale_factor * no._of_tuples + autovacuum_vacuum_threshold

And to get the automatic analyzing threshold, the following equation does the job:

Autovacuum ANALYZE threshold = autovacuum_analyze_scale_factor * number of tuples + autovacuum_analyze_threshold

The terminologies used in the above sections are described in the upcoming section.

How to Configure Postgres Auto Vacuum

The Auto vacuum functionality of Postgres depends on the settings and options available inside the postgresql.conf file. With the help of the description provided below, you can change the parameters to tune the Auto vacuum functionality.

The following parameters have the key role in defining policy for Postgres autovacuum.

  • autovacuum: Shows the default status for auto vacuum call-up.
  • autovacuum_naptime: The parameter’s default value is the 60s (1min), which indicates the difference between each auto vacuum wakeup.
  • autovacuum_max_workers: The processes that are vacuumed after naptime.
  • autovacuum_vacuum_scale_factor: This factor directs the auto vacuum to start the process when the percentage of the value specified is changed. It is by default set to 20%, which means the auto vacuum will come into action when 20% of the tables/records are updated.
  • autovacuum_vacuum_threshold: This parameter ensures that auto vacuum must run after specific(by default, it is 50) numbers of tables are updated/changed.
  • autovacuum_analyze_scale_factor: This factor refers to the analysis process that an auto vacuum performs. When a specific percentage of records is updated/changed, the table performs the analysis.
  • autovacuum_analyze_threshold: When specific numbers of tables encounter updates, and the threshold is met, the auto vacuum starts analyzing the table, and the default value of this parameter is 50.

To edit the file, you can open it in a nano editor by using the command stated below (the location of postgresql.conf file may vary in your case):

$ sudo nano /etc/postgresql/12/main/postgresql.conf

How to configure auto vacuum parameters for a table

The postgresql.conf file deals with the global changes to the autovacuum parameters. However, you can tune parameters for a single table. For instance, we have used the following parameters for the linuxhint table of database mydb:

  • autovacuum_vacuum_scale_factor = 0.2
  • autovacuum_vacuum_threshold = 30
  • autovacuum_analyze_scale_factor= 0.2
  • autovacuum_analyze_threshold = 20

These parameters will be passed with ALTER TABLE statement of PowerShell and the ALTER TABLE will be applied on linuxhint table as shown below:

> ALTER TABLE linuxhint SET (autovacuum_vacuum_scale_factor = 0.2, autovacuum_vacuum_threshold = 30, autovacuum_analyze_scale_factor = 0.2, autovacuum_analyze_threshold = 20);

It can be calculated manually by inserting the values in the equations given in the above section and is as follows. Suppose the linuxhint table contains 10 tuples:

The VACUUM operation of Autovacuum is calculated as shown below. The output shows that the VACUUM operation of Autovacuum will be performed when the number of obsolete records reaches 31.

Autovacuum VACUUM threshold of linuxhint table = (0.2 * 5) + 30 = 31

Similarly, the ANALYZE operation of Autovacuum on linuxhint table will be carried out when the number of insertions/deletions/updating equals or exceeds 21.

Autovacuum ANALYZE of linuxhint table= (0.2*5) + 20 = 21

Why Auto vacuum

Looking at the working and the above discussion, the following benefits of Auto Vacuum can be extracted:

  • The space is managed effectively by deleting the dead cells
  • The analyzing and vacuuming time are saved as it is carried out automatically
  • The table bloating does not occur as deletion of dead tuples occurs automatically and thus the table does not overflow.
  • Autovacuum can be carried out in parallel manners. For example, Autovacuum does not lock the table as the manual vacuum does.

Conclusion

The auto vacuum functionality of Postgres allows you to perform the vacuum and analyze the tables of a database. Even though one can vacuum and analyze manually, the automated vacuum will do both. This article provides detailed information about the auto vacuum functionality of Postgres. Postgres calculates threshold values for each table and then decides whether to VACUUM or ANALYZE that table. Moreover, we have provided Postgres’s equations to calculate the VACCUM and ANALYZE threshold.

About the author

Adnan Shabbir