PostgreSQL

Postgres EXPLAIN ANALYZE Command

Regarding databases, performance is one feature that takes a high priority. This ensures that the applications can query and write the data with maximum speed and efficiency.

As database administrators, we need to be obsessed with the tools and methods of enhancing the database performance.

In PostgreSQL, we have access to the EXPLAIN ANALYZE command that allows us to analyze the execution plan and performance of a given database query. The command returns a detailed information on how the database engine processes the query. This includes the sequence of operations performed, estimated query costs, execution timing, and more.

We can then use this information to identify the database queries as well as identify and fix the potential performance bottlenecks.

This tutorial discusses how to use the EXPLAIN ANALYZE command in PostgreSQL to view and optimize the query performance.

PostgreSQL EXPLAIN ANALYZE

The command is pretty straightforward. First, we need to prepend the EXPLAIN ANALYZE command at the beginning of the query that we wish to analyze.

The command syntax is as follows:

EXPLAIN ANALYZE <target_query>

Once you execute the command, PostgreSQL returns a detailed output about the provided query.

Understanding the EXPLAIN ANALYZE Query Output

As mentioned, once we run the EXPLAIN ANALYZE command, PostgreSQL generates a detailed report of the query plan and the execution statistics.

The output is comprised of a set of columns that contain useful information. The resulting columns are as shown with their respective meaning:

QUERY PLAN – This column displays the execution plan of the specified query. The execution plan refers to a sequence of operations that the database engine performs to complete the query successfully.

PLAN – The second column is the PLAN column. This contains a textual representation of each operation or step in the execution plan. Again, each operation is indented to indicate the hierarchy of operations.

TOTAL COST – The total cost column represents the estimated total cost of the query. The cost refers to a relative measure that the database query planner uses to determine the optimal execution plan.

ACTUAL ROWS – This column shows the exact number of rows which are processed at each step in the query execution.

ACTUAL TIME – This column shows the actual time taken by each operation which includes both the execution time of the operation and the time spent on resources.

PLANNING TIME – This column shows the time that the query planner takes to generate an execution plan. This includes the total time of the query optimization and the plan generation.

EXECUTION TIME – This column shows the total time to execute the query. This also includes the time spent on planning and query execution time.

PostgreSQL EXPLAIN ANALYZE Example

Let us look at some basic examples of using the EXPLAIN ANALYZE statement.

Example 1: Select Statement
Let us use the EXPLAIN ANALYZE statement to show the execution of a simple select statement in PostgreSQL.

EXPLAIN ANALYZE select * from wp_users where id > 3;

Once we run the previous statement, we should get an output as follows:

                    QUERY PLAN
-------------------------------------------------------------------
 Seq Scan on wp_users  (cost=0.00..10.38 rows=10 width=2256) (actual time=0.009..0.010 rows=7 loops=1)
   Filter: (id > 3)
   Rows Removed by Filter: 3
 Planning Time: 0.995 ms
 Execution Time: 0.021 ms
(5 rows)

In this case, we can see that the Query plan section indicates that the query performs a sequential scan on the wp_users table. The filter line denotes the condition that is used to filter the resulting rows.

We then see the “Rows Removed by Filter” which shows the number of rows that are eliminated by the filter condition.

Finally, the execution time shows the total execution time of the query. In this case, the query takes 0.021ms.

Example 2: Analyzing a Join
Let us take a more complex query that involves an SQL join. For this, we use the Pagila sample database. You can download and install the sample database on your machine for demonstration purpose.

We can run a simple join as shown in the following:

explain analyze SELECT f.title, c.name
FROM film f
JOIN film_category fc ON f.film_id = fc.film_id
JOIN category c ON fc.category_id = c.category_id;

Once we run the given query, we should see the output as follows:

Let us explore the following query plan:

  1. Nested Loop – This indicates that the join uses a nested loop join strategy.
  2. Hash Join – This operation joins the film_category and the film tables using a Hash join algorithm. This operation has a cost of 77.50 and estimated 1000 rows. However, the actual time taken for this operation is 0.254 to 0.439 milliseconds, and it retrieves 1000 rows.
  3. Hash Cond – This indicates that the join condition uses a Hash join to match the film_id columns and the film_category columns in the film tables.
  4. Seq Scan on film_category – This operation performs a sequential scan on the film_category table with a costing of 16.00 and estimated 1000 rows. The actual time taken for this operation is 0.008 to 0.056 milliseconds, and it retrieves 1000 rows.
  5. Seq Scan on film – The query performs a sequential scan on the film table with the resulting estimated and actual costs and rows in this operation.
  6. Memoize – This operation caches the results of the join between film_category and film tables for subsequent use.
  7. Cache Key – This indicates that the cache key that is used for memoization is based on the category_id column from film_category.
  8. Cache Mode – This indicates that the query uses the logical cache mode.
  9. Hits, Misses, Evictions, Overflows – The three lines provide statistics about the cache, number of hits, misses, evictions, and overflows during the execution. This block also includes the memory usage during query execution.
  10. Index Scan using category_pkey – This shows the operation that performs an index scan on the category table using the primary key index.
  11. Index Cond – This shows that the index scan is based on the condition that matches the category_id column in the category table.
  12. Planning Time – This line shows the time taken for query planning which is 3.005 milliseconds.
  13. Execution Time – Finally, this line shows the total execution time of the query which is 0.745 milliseconds.

There you have it! A detailed information about the execution of a simple join in PostgreSQL.

Conclusion

You discovered the power and usage of the EXPLAIN ANALYZE statement in PostgreSQL. The EXPLAIN ANALYZE statement is a powerful tool for query analysis and optimization. Use this tool to build efficient and less resource-intensive queries.

About the author

John Otieno

My name is John and am a fellow geek like you. I am passionate about all things computers from Hardware, Operating systems to Programming. My dream is to share my knowledge with the world and help out fellow geeks. Follow my content by subscribing to LinuxHint mailing list