On June 16 2022, a new PostgreSQL 14 minor release will be published by the PostgreSQL Global Development Group. The new PostgreSQL 14.4 release fixes an issue with all versions of PostgreSQL 14 that can lead to silent corruption of indexes. Usually, the community announces a minor version update on the second Tuesday of the second month of each quarter. However, when some emergency fixes are required, we may see such important release updates soon after a minor release. In this article, I am going to talk about the Important PostgreSQL 14 update to avoid silent corruption of indexes. We shall also discuss how to overcome this corruption with a rebuild of an impacted index.
What PostgreSQL versions are impacted ?
Following minor releases of PostgreSQL 14 are all impacted.
The problem was introduced in PostgreSQL 14 by a VACUUM improvement to ignore certain transactions that are executing CREATE INDEX CONCURRENTLY and REINDEX CONCURRENTLY. These changes caused the indexes created with the CONCURRENTLY option to miss heap tuples that were HOT-updated and HOT-pruned during the index creation. CONCURRENTLY is generally used to rebuild or create an Index online without an Access Exclusive lock on its table.
The cause of the silent data corruption in indexes is an existing issue in the concurrent build of indexes using CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY. This leaves the newly rebuilt Indexes to remain in a silently corrupted state. Your databases with PostgreSQL 14 are impacted if you have run such Index rebuild commands. See this discussion for more details.
Due to this reason, everyone using PostgreSQL 14.0 or 14.1 or 14.2 or 14.3 should upgrade to PostgreSQL 14.4 as soon as the release is published.
How to identify whether any of your Indexes are impacted ?
In order to identify such corrupted Indexes, we could use pg_amcheck if it is a b-tree index. pg_amcheck verifies the presence of all heap tuples has index tuples in an index. It verifies whether tuple(or data) pointed in the index really exists. We may write a detailed article on pg_amcheck to help you understand it in detail. For now, following can be considered as an example.
-- Create the extension
psql -d migops -c "CREATE EXTENSION amcheck"
-- To check all b-tree indexes of tables with a pattern
pg_amcheck -d migops --relation pattern --heapallindexed
-- To check all b-tree indexes matching this pattern
pg_amcheck -d migops --index pattern --heapallindexed
If you need assistance in finding and fixing the corrupted indexes, please send an email to email@example.com or fill the following form.
How to fix it on existing PostgreSQL 14 versions ?
If you are facing this issue and need to solve the problem immediately, the solution is to run CREATE INDEX or REINDEX without using CONCURRENTLY. You have to drop the existing corrupted indexes and then then create them.
This could cause downtime as we are not using the concurrently flag. However, to speed up the Index creations, you may want to use the reindexdb command with -j (jobs) for increased parallelism, but do not set the --concurrently flag.
pg_repack may not support here
pg_repack is a widely used extension to rebuild tables and indexes online. However, pg_repack does use the CONCURRENTLY option internally. In case you have used an automated job to rebuild indexes or tables using pg_repack, then, those indexes that got rebuilt may have got impacted too. In addition to that, pg_repack cannot be used a solution but you must follow the above mentioned approach of REINDEX without CONCURRENTLY as a solution instead.
What about your PostgreSQL databases on DBaaS ?
Customers using any of the following 3 cloud vendors to host their PostgreSQL 14 Instances are also impacted.
- AWS - Amazon Web Services
- Amazon RDS does support PostgreSQL 14. So, your databases may be impacted. Please validate and perform the necessary steps as discussed above.
- GCP - Google Cloud
- Cloud SQL for PostgreSQL does allow customers to create database of version PostgreSQL 14. So, you may have to validate using the above steps and perform the necessary action.
- Azure - Microsoft Azure Cloud
- Azure Database for PostgreSQL only supports up to PostgreSQL 11 version for Single Server and up to PostgreSQL 13 on flexible server. However, Hyperscale (Citus) on PostgreSQL 14 may have been impacted but we would recommend you to validate with your respective cloud vendor.
After upgrading to PostgreSQL 14.4 you can safely use CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY. However, it is important to validate and make sure that your databases are not impacted with such corruption. In case you need assistance in verifying whether any of your databases are impacted and how to fix it, please contact us or send an email to firstname.lastname@example.org. You may also fill the following form.