Is Aurora PostgreSQL really faster and cheaper than RDS PostgreSQL – Benchmarking

This article explains why you should benchmark before choosing Aurora PostgreSQL over PostgreSQL. We have often noticed that customers migrating from Oracle to PostgreSQL get confused between PostgreSQL and Aurora PostgreSQL. The majority of leaders and key decision makers lean towards Aurora PostgreSQL because it is often marketed as 3 times faster than PostgreSQL. Is Aurora PostgreSQL really faster and cheaper than RDS PostgreSQL? I would request all leaders and experts to spend time reading this article for interesting observations from a benchmark of AWS RDS PostgreSQL against AWS Aurora PostgreSQL, and for insights into how one could reduce their bills.

Let's meet at AWS re:Invent 2021

By the way, if you are coming to AWS re:Invent, you can always chat with me directly about migrations to PostgreSQL or any existing problems with your PostgreSQL deployments on AWS. You may find me around the Percona booth (1429) at The Venetian, Las Vegas. If you cannot find me, please send an email to talktoavi@migops.com for a conversation anytime.

General observations by some of our Customers using Aurora PostgreSQL

Some of our customers, maybe including you, are curious about why there is such high CPU utilization on their Aurora instances. This forced them to upgrade their Aurora instance types to resolve performance issues. Some customers have also seen Aurora IOPS become the major reason for heavy bills on Aurora PostgreSQL. Others were surprised by wait events on Aurora PostgreSQL that do not appear anywhere in the PostgreSQL documentation.

For this reason, I have long wanted to publish a benchmark of RDS PostgreSQL and Aurora PostgreSQL and provide some insights. RDS PostgreSQL is generally considered to be PostgreSQL, though it has some limitations compared to vanilla PostgreSQL. We will discuss this difference in my next article.

A high level view of the benchmark results

In short, I would assume that the overall magic of Aurora PostgreSQL comes from its huge IOPS allocation compared to RDS. This does not mean that the same workload performs the same number of IOPS on RDS PostgreSQL and Aurora PostgreSQL. There is always a huge deviation, and you pay for it. You will understand this in detail by the end of this article. When RDS was tuned with slightly higher IOPS limits, it outperformed Aurora Postgres each time. This benchmark also sheds light on the mystery of Aurora's high CPU utilization, which explains why some of our customers had to upgrade their Aurora instances during performance issues caused by 100% CPU usage.

Following are the results of the benchmark performed against RDS with tuned IOPS and Aurora PostgreSQL databases.

RDS PostgreSQL 13.4 vs Aurora PostgreSQL 13.4

Before looking into how the benchmark was performed and seeing some interesting graphs, let me talk about some of my observations from the benchmark. Every detail mentioned in this long article is worth reading even if it takes some time.

Observations from the Benchmark

Aurora Postgres CPU usage could be much higher than the RDS Postgres CPU usage

One of the most common reasons I have heard for choosing Aurora Postgres is that its replication latency is low enough for near real-time reads on standby or reader instances. However, the CPU utilization graphs from the AWS monitoring dashboards make it very clear that RDS, with tuned IOPS limits and a lower price, had a CPU utilization of less than 40%, while Aurora Postgres had a CPU utilization of over 60% for the same workload with lower performance than RDS. For this reason, I would rather use the remaining server resources on the RDS instance to satisfy my read workload than spin up a synchronous standby for near real-time reads at a cost to performance.

Aurora IOPS and IO Queue depth could be much higher than RDS

While RDS had a very small IO queue depth, only sometimes between 5 and 10, Aurora had an IO queue depth mostly between 15 and 25 (surprising!). The total IOPS reported on Aurora was several times higher than the total IOPS on RDS, which helped me understand why one of these Aurora blogs mentioned that 65% of the Aurora bill is its IOPS. Unfortunately, such articles talk about tuning the Postgres database and do not consider the mystery of why Aurora total IOPS and RDS total IOPS differ so widely. So customers end up spending time tuning the wrong places and may assume that the untuned database or the application is responsible for the huge bills. Eventually, one may end up upgrading their Aurora PostgreSQL instances for more CPUs and pay more than they actually should.

AWS never claimed Aurora PostgreSQL as PostgreSQL

An important fact to consider is that AWS never claimed that Aurora PostgreSQL is PostgreSQL. AWS only describes Aurora Postgres as a PostgreSQL-compatible database. The biggest difference is that PostgreSQL is open source database software with over 30 years of active development. The development, patches, bugs, discussions and ideas are all open to the world. Anybody is free to contribute, review and discuss. More details about PostgreSQL can be seen on postgresql.org. At the same time, there may be some white papers explaining how AWS designed Aurora Postgres by modifying the PostgreSQL source code, eliminating checkpoints and other IO-generating background processes by shifting a lot of logic to the storage layer. This deviation from PostgreSQL may be the reason why we still see Aurora PostgreSQL 13 while PostgreSQL 14 is already released. By the way, it took almost a year for Aurora PostgreSQL 13 (Aug, 2021) to be released after PostgreSQL 13 (Sept, 2020).

AWS Instances chosen for this benchmark

Following are the Server specifications chosen for this benchmark.

                  EC2               RDS PostgreSQL 13.4                            Aurora PostgreSQL 13.4
Instance Type     r5.xlarge         db.r5.xlarge                                   db.r5.xlarge
Region/AZ         us-east-1a        us-east-1a                                     us-east-1a
VPC               Same VPC (all three)
CPU(s)            4 vCPUs           4 vCPUs                                        4 vCPUs
RAM               32 GiB            32 GiB                                         32 GiB
Storage           300 GB            Initially 300 GB and then upgraded to 1000 GB  Unlimited (Up to 65TB)
IOPS              900               Initially 900 and later increased to 3000      Up to Instance limits
Network           Up to 10 Gigabit  4,750 Mbps                                     4,750 Mbps
EBS Encryption    Enabled           Enabled                                        Enabled
Has Standby?      No                No                                             No

What was used to perform the benchmark?

pgbench was used to perform the data load and to run the TPC-B-like benchmark. pgbench was designed and developed by the PostgreSQL community specifically for running benchmarks against PostgreSQL databases. When benchmarking, it is always important to repeat the runs and compare the TPS, so I ran 4 iterations each with 4, 8 and 12 clients. By the way, I kept all the PostgreSQL parameters at their defaults and performed the test with the AWS-assigned default parameter groups.

As the EC2 Instance, RDS and Aurora PostgreSQL Instances are in the same network, we can run pgbench from the EC2 instance remotely for both initialization and the benchmark.

Eliminating performance burst through free credits, during benchmark

AWS allocates 5.4 million free IO credits to a newly created RDS instance. Before running the actual benchmark, I put enough load on the RDS instance to use up these credits, so that the benchmark would not report performance numbers boosted by them. My initial tests therefore consumed all of the free IO credits.
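
To get a feel for how long such a burst can last, here is a minimal Python sketch. It assumes the usual credit-bucket model for gp2 volumes, where credits drain at the burst rate and refill at the baseline rate. The 3,000 IOPS burst ceiling is an assumption on my part and is not stated in this article; the 900 IOPS baseline is the value of the 300 GB volume used here.

```python
def burst_seconds(credits, burst_iops, baseline_iops):
    """Seconds a full credit balance lasts when the volume is driven at burst_iops.

    Credits drain at burst_iops and refill at baseline_iops, so the
    net drain rate is the difference between the two.
    """
    if burst_iops <= baseline_iops:
        raise ValueError("burst_iops must exceed baseline_iops")
    return credits / (burst_iops - baseline_iops)


INITIAL_CREDITS = 5_400_000  # free IO credits mentioned in this article
BURST_IOPS = 3_000           # assumption: burst ceiling of a small gp2 volume
BASELINE_IOPS = 900          # 300 GB x 3 IOPS per GB

seconds = burst_seconds(INITIAL_CREDITS, BURST_IOPS, BASELINE_IOPS)
print(f"Credits last roughly {seconds / 60:.0f} minutes at full burst")
```

Under these assumptions the credits are gone in well under an hour of sustained load, which is why a warm-up run before the measured iterations matters.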

Initial Data Load - pgbench

In order to run the benchmark using pgbench, we must start with initialization. In this stage, pgbench creates 4 tables and then loads data according to the specified scale factor. The scale factor used for the data load was 10000.

The commands used to perform the data load are as follows. They created the 4 tables and loaded 146 GB of data.

-- RDS
$ pgbench -i -s 10000 -h rds_host -U postgres -d postgres -p 5432

-- Aurora
$ pgbench -i -s 10000 -h aurora_host -U postgres -d postgres -p 5432
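
To make the scale factor more concrete, the following Python sketch estimates how many rows pgbench loads into each of its 4 tables. It relies on the per-scale-unit row counts documented for pgbench (1 branch, 10 tellers and 100,000 accounts per unit of scale, with an empty history table). The function name is my own and purely illustrative.

```python
# Rows created per unit of scale factor, as documented for pgbench -i.
ROWS_PER_SCALE_UNIT = {
    "pgbench_branches": 1,
    "pgbench_tellers": 10,
    "pgbench_accounts": 100_000,
}


def initial_row_counts(scale):
    """Return the expected row count of each pgbench table after initialization."""
    counts = {table: rows * scale for table, rows in ROWS_PER_SCALE_UNIT.items()}
    counts["pgbench_history"] = 0  # history only fills up while the benchmark runs
    return counts


for table, rows in initial_row_counts(10_000).items():
    print(f"{table}: {rows:,} rows")
```

With a scale factor of 10000, almost all of the loaded data sits in pgbench_accounts, which holds one billion rows.
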

Benchmark - pgbench

Each pgbench run lasted an hour and was executed against both the RDS and Aurora Postgres instances at different levels of concurrency.

  • Using 4 clients and 4 threads (jobs)
  • Using 8 clients and 8 threads (jobs)
  • Using 12 clients and 12 threads (jobs)

The command used to perform the TPS benchmark with 4 clients is as follows. The number 4 is replaced by 8 and 12 when benchmarking with 8 and 12 clients.

pgbench -T 3600 -j 4 -c 4 -h <host> -U postgres -d postgres -p 5432
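
The full test matrix (3 client counts, 4 iterations each) can be sketched in Python as below. This is only an illustration of how one might script the runs, not the script used for this benchmark. The helper names are hypothetical, and the sketch passes the database name as the trailing positional argument, which is my own choice. The parser assumes the "tps = ..." summary lines that pgbench prints at the end of a run.

```python
import re
import shlex

CLIENT_COUNTS = (4, 8, 12)  # concurrency levels used in this article
ITERATIONS = 4              # repeated runs per concurrency level
DURATION_SECONDS = 3600     # one hour per run


def pgbench_command(host, clients, duration=DURATION_SECONDS):
    """Build the pgbench argument list for one run (threads equal clients)."""
    return ["pgbench", "-T", str(duration), "-j", str(clients), "-c", str(clients),
            "-h", host, "-U", "postgres", "-p", "5432", "postgres"]


def benchmark_plan(host):
    """Return one command per (client count, iteration) pair."""
    return [pgbench_command(host, clients)
            for clients in CLIENT_COUNTS
            for _ in range(ITERATIONS)]


def parse_tps(output):
    """Extract every 'tps = N' figure from pgbench's summary output."""
    return [float(value)
            for value in re.findall(r"^tps = ([0-9.]+)", output, flags=re.MULTILINE)]


for command in benchmark_plan("rds_host"):
    print(shlex.join(command))
```

Each command could then be executed with subprocess.run and its captured output passed to parse_tps to collect the TPS of every iteration.
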

RDS PostgreSQL vs Aurora PostgreSQL with less IOPS on PostgreSQL RDS

When I initially benchmarked RDS with 300 GB of storage (GP2 SSD), which gets us 900 IOPS (3 IOPS per GB), the TPS (Transactions Per Second) was not that great because of the IOPS limit for this workload on RDS.
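
The relationship between GP2 storage size and baseline IOPS can be written down as a small Python sketch. The 3 IOPS per GB ratio comes from this article. The floor of 100 IOPS and the ceiling of 16,000 IOPS are the gp2 limits as I understand them from the AWS documentation, so treat them as assumptions and verify them for your region and volume type.

```python
GP2_IOPS_PER_GB = 3        # ratio quoted in this article
GP2_MIN_IOPS = 100         # assumption: documented gp2 floor
GP2_MAX_IOPS = 16_000      # assumption: documented gp2 ceiling


def gp2_baseline_iops(size_gb):
    """Baseline IOPS of a gp2 volume of the given size, clamped to the gp2 limits."""
    return max(GP2_MIN_IOPS, min(GP2_MAX_IOPS, GP2_IOPS_PER_GB * size_gb))


for size_gb in (300, 1000):
    print(f"{size_gb} GB -> {gp2_baseline_iops(size_gb)} baseline IOPS")
```

For the two storage sizes used in this benchmark, this gives 900 IOPS for 300 GB and 3000 IOPS for 1000 GB.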

Here are the benchmarking results for both RDS PostgreSQL 13.4 with untuned IOPS and Aurora PostgreSQL 13.4, with 8 clients for a period of 1 hour for all 4 iterations.

RDS PostgreSQL 13.4 (untuned IOPS) vs Aurora PostgreSQL 13.4

No doubt Aurora PostgreSQL outperformed RDS in this test. However, this is where everybody stops, without understanding why RDS did not perform well. Instead, they go ahead and migrate to Aurora PostgreSQL, because the results create the assumption that Aurora is always many times faster than RDS.

What to do when your RDS PostgreSQL performance is not as good as expected?

To understand the performance degradation better, I took a closer look at the wait events on RDS to see where Postgres' performance went bad. The 3 major wait events observed in RDS Performance Insights are all directly related to IO, as seen below.

IO:DataFileRead   -  Waiting for a read from a relation data file.
IO:WALSync        -  Waiting for a WAL file to reach durable storage.
LWLock:WALWrite   -  Waiting for WAL buffers to be written to disk.
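
If Performance Insights is not at hand, a similar picture can be approximated by sampling pg_stat_activity repeatedly and counting the wait events. The Python sketch below only shows the counting step. The query is one possible way to sample, the helper name is hypothetical, and the sample rows are made up for illustration rather than taken from this benchmark.

```python
from collections import Counter

# One possible sampling query; run it periodically and collect the rows.
WAIT_EVENT_SQL = """
SELECT wait_event_type, wait_event
FROM pg_stat_activity
WHERE state = 'active' AND wait_event IS NOT NULL
"""


def top_wait_events(samples, limit=3):
    """Count (wait_event_type, wait_event) samples and return the most frequent ones."""
    counts = Counter(f"{event_type}:{event}" for event_type, event in samples)
    return counts.most_common(limit)


# Made-up samples, shaped like the rows the query above would return.
samples = [
    ("IO", "DataFileRead"), ("IO", "DataFileRead"), ("IO", "WALSync"),
    ("LWLock", "WALWrite"), ("IO", "DataFileRead"), ("IO", "WALSync"),
]
print(top_wait_events(samples))
```

When the top entries are all IO related, as they were on RDS here, the storage limits are the first thing to check.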

The following image is just a snippet of the top wait events from AWS Performance Insights.

Wait Events on Aurora PostgreSQL

Optimizing the IOPS on RDS

I noticed that my benchmark of around 8 hours on Aurora generated 849 million IO requests as per AWS billing, which is equivalent to $169.88. This is when I decided to spend some of those dollars on upgrading my RDS storage for better IOPS instead. So I upgraded my RDS storage to 1000 GB, which costs $100 per month ($0.10 per GB-month x 1000) and gets me approximately 3000 IOPS. With this change, RDS performed much better than Aurora, with lower resource utilization (CPU, IOPS, IO queue depth, etc.).
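
The arithmetic behind these numbers can be sketched in Python as follows. The price of $0.20 per million IO requests is not stated in this article; it is the rate implied by the figures above ($169.88 for roughly 849 million requests), so treat it as an assumption and check the current pricing for your region. The 8 hours is also approximate.

```python
AURORA_IO_REQUESTS = 849_000_000   # billed IO requests reported in this article
PRICE_PER_MILLION_IO = 0.20        # assumption: USD rate implied by the article's bill
BENCHMARK_HOURS = 8                # approximate duration stated in this article

RDS_STORAGE_GB = 1000
PRICE_PER_GB_MONTH = 0.10          # USD, as quoted in this article


def aurora_io_cost(requests, price_per_million=PRICE_PER_MILLION_IO):
    """Cost in USD of the given number of billed Aurora IO requests."""
    return requests / 1_000_000 * price_per_million


def average_requests_per_second(requests, hours):
    """Average billed IO requests per second over the benchmark window."""
    return requests / (hours * 3600)


io_cost = aurora_io_cost(AURORA_IO_REQUESTS)
io_rate = average_requests_per_second(AURORA_IO_REQUESTS, BENCHMARK_HOURS)
storage_cost = RDS_STORAGE_GB * PRICE_PER_GB_MONTH

print(f"Aurora IO cost for the benchmark: ${io_cost:.2f}")
print(f"Average billed IO requests per second: {io_rate:,.0f}")
print(f"RDS 1000 GB storage per month: ${storage_cost:.2f}")
```

Under these assumptions, about 8 hours of benchmarking on Aurora costs more in IO charges than a full month of the larger RDS volume, and the average billed IO rate is several times the 3000 IOPS provisioned on RDS.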

The following TPS was observed on RDS and Aurora once the TPS benchmarking was completed.

4 Clients

RDS PostgreSQL vs Aurora PostgreSQL with 4 clients

RDS PostgreSQL vs Aurora PostgreSQL 13 with 8 clients

RDS PostgreSQL vs Aurora PostgreSQL 13 with 12 clients

Consistent TPS rate

It is very important that the TPS rate stays consistent throughout the benchmark duration. If it fluctuates between high and low values, the performance cannot be considered consistent. I analyzed this using the Performance Insights graphs, as seen below. By the way, the Performance Insights graphs are in UTC, whereas all the other graphs later in this article are in my local timezone. So please do not get confused by the timings between the Performance Insights graphs and the other graphs.
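
Consistency can also be expressed as a number instead of being judged from a graph. One simple option, sketched below in Python, is the coefficient of variation (standard deviation divided by mean) of per-interval TPS readings, which could be collected, for example, from the progress reports of pgbench's -P option. This is my own suggestion and not the method used in this article, and the sample values are made up purely for illustration.

```python
from statistics import mean, pstdev


def coefficient_of_variation(tps_samples):
    """Relative spread of TPS readings: population standard deviation over mean."""
    average = mean(tps_samples)
    if average == 0:
        raise ValueError("mean TPS is zero, cannot compute relative spread")
    return pstdev(tps_samples) / average


# Made-up readings: both series average 1000 TPS, but only one is steady.
steady = [1000, 1010, 990, 1005, 995]
spiky = [1800, 200, 1700, 300, 1000]

print(f"steady: {coefficient_of_variation(steady):.3f}")
print(f"spiky:  {coefficient_of_variation(spiky):.3f}")
```

Two runs can report the same average TPS while one of them swings wildly, so a low relative spread is a useful companion to the headline TPS number.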

Following is the RDS TPS Rate that has been consistently around the same value throughout the 4 iterations of the benchmark with 8 clients.

RDS PostgreSQL TPS Consistency

CPU Utilization on RDS vs Aurora

One of the important differences observed during this benchmark is the CPU utilization. The average CPU utilization of RDS was around 30%, and it never went beyond 50%. In contrast, the average CPU utilization of Aurora was above 60%, and its maximum went up to 90%.

RDS CPU Graph for all the 4 iterations of the benchmark with 8 clients.

RDS PostgreSQL CPU Utilization

Aurora CPU Graph for all the 4 iterations of the benchmark with 8 clients.

Aurora PostgreSQL CPU Utilization

IOPS Utilization on RDS vs Aurora

Another huge difference observed during this benchmark is the IOPS utilization. The TPS of Aurora was lower than that of RDS in this benchmark, yet Aurora shows huge IOPS utilization compared with RDS. Please remember that these IOPS numbers are without a replica. Adding a replica could increase the IOPS.

IOPS Utilization on RDS for all the 4 iterations of the benchmark

Total IOPS on RDS PostgreSQL

IOPS Utilization on Aurora for all the 4 iterations of the benchmark

Total IOPS on Aurora PostgreSQL

IO Queue depth on RDS vs Aurora

Another mystery observed during this benchmark is the IO queue depth. Some blogs on Aurora claim that Aurora PostgreSQL is great for massively concurrent workloads and faster than PostgreSQL. Some customers may also assume that Aurora, unlike RDS, has almost unlimited IOPS. If that were the case, Aurora's IO queue depth should not be worse than that of RDS. See the following graphs showing the IO queue depth on RDS vs Aurora.

IO Queue depth on RDS for all the 4 iterations of the benchmark

IO Queue depth on RDS PostgreSQL

IO Queue depth on Aurora for all the 4 iterations of the benchmark

IO Queue depth on Aurora PostgreSQL

Conclusion

When you do not get enough performance on RDS, please look at where the performance is going bad. A good place to start is the wait events. Most of the problems on RDS are related to IOPS. Tune your IOPS and compare the performance of RDS and Aurora before switching to Aurora. As noted above, your bill for Aurora IOPS may be huge, and you may also need to upgrade your Aurora instances because of their unusually high CPU utilization compared with RDS.

I am volunteering to spend a few hours each month talking to customers who have deployed PostgreSQL on AWS or other cloud platforms and providing the necessary advice. In particular, if you are facing performance issues or huge bills and would like some advice, please send an email to talktoavi@migops.com and our team should be able to schedule a call with me. For professional services around migrations to PostgreSQL and tuning and maintaining PostgreSQL databases in the cloud and on-premises, please contact us at sales@migops.com.
