Using pg_dump, I created and restored backups 21 times: in 4 different formats, with 1 to 7 compression levels per format. I recorded the results and compared them to understand which methods are most effective for my use case.

Details and measurements are below.

Why do I need this measurement?

I had a very specific task: to find the best backup format using the standard pg_dump. “Best” means the optimal balance of backup creation speed, restore speed, and final file size.

I used this information in my open source project for PostgreSQL backups called Postgresus.

There were additional requirements:

Backup formats and compression types in pg_dump

pg_dump supports 4 formats:

Plain (SQL): a plain-text SQL script; it is restored by feeding it to psql, always single-threaded.

Custom (-Fc): pg_dump’s own binary format in a single file; it is restored with pg_restore, which can restore it in parallel.

Directory (-Fd): one file per table in a directory; supports both parallel backup and parallel restore.

Tar (-Ft): a tar archive that can be unpacked with standard tools; does not support compression.

I was most interested in the custom and directory formats, because they support parallel processing. The custom format cannot create a backup in parallel (only restore one), but it writes everything to a single file. The directory format can both back up and restore in parallel, but writes everything to a directory of files.
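The trade-off between the two formats can be sketched as pg_dump/pg_restore invocations. This is an illustrative sketch, not the article’s actual benchmark code; host, ports, database name, and output paths are assumptions matching the Docker setup shown later.

```python
# Sketch of pg_dump invocations for the two parallel-friendly formats.
# Ports/db names mirror the article's Docker Compose setup; paths are made up.

def build_dump_cmd(fmt: str, output: str, jobs: int = 1) -> list[str]:
    """Build a pg_dump argument list for the custom or directory format."""
    cmd = ["pg_dump", "--host=localhost", "--port=7000", "--dbname=testdb"]
    if fmt == "custom":
        # Single output file; only the *restore* can be parallelized.
        cmd += ["--format=custom", f"--file={output}"]
    elif fmt == "directory":
        # One file per table; the dump itself can also run in parallel.
        cmd += ["--format=directory", f"--jobs={jobs}", f"--file={output}"]
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return cmd

def build_restore_cmd(backup: str, jobs: int = 4) -> list[str]:
    """Both formats are restored with pg_restore, which supports --jobs."""
    return ["pg_restore", "--host=localhost", "--port=7001",
            "--dbname=testdb", f"--jobs={jobs}", backup]

print(build_dump_cmd("directory", "backup_dir", jobs=4))
```

In a real script these lists would be passed to `subprocess.run`, with timing wrapped around each call.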

For these formats the following compression types are supported:

gzip: the long-standing default; moderate speed and moderate compression ratio.

lz4: very fast compression at the cost of a lower compression ratio.

zstd: a modern algorithm with a high compression ratio and good speed, tunable across a wide range of levels.

The compression characteristics above assume ideally prepared data. With a real database, compression does not occupy 100% of the time at 100% CPU utilization: many database-specific operations are likely to slow it down (especially when compressing “on the fly”).

Before the test, I assumed that the custom format with gzip compression would be the best option for me (as a happy medium) and had cautious hopes for zstd (as a more modern algorithm). By the way, pg_dump only gained zstd support in PostgreSQL 16.
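For reference, since PostgreSQL 16 pg_dump accepts a compression specification of the form `method[:level]`. A small sketch of building that flag (the helper function itself is mine, not from the article):

```python
# Sketch: building pg_dump's --compress option.
# Since PostgreSQL 16, pg_dump accepts "method[:level]" where the method is
# one of none, gzip, lz4, zstd; older versions only accept a gzip level.

def compress_flag(method, level=None):
    supported = {"none", "gzip", "lz4", "zstd"}
    if method not in supported:
        raise ValueError(f"unsupported method: {method}")
    if level is None:
        return f"--compress={method}"
    return f"--compress={method}:{level}"

# The combination the article ends up recommending: zstd at level 5.
print(compress_flag("zstd", 5))  # --compress=zstd:5
```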

PostgreSQL configuration

I launched two PostgreSQL instances in Docker Compose: one for creating backups (with data) and one for restoring them. I didn’t use the standard port because it is already taken by my local PostgreSQL installations.

docker-compose.yml

version: "3.8"

services:
  db:
    image: postgres:17
    container_name: db
    environment:
      POSTGRES_DB: testdb
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: testpassword
    ports:
      - "7000:7000"
    command: -p 7000
    volumes:
      - ./pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres -d testdb -p 7000"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  db-restore:
    image: postgres:17
    container_name: db-restore
    environment:
      POSTGRES_DB: testdb
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: testpassword
    ports:
      - "7001:7001"
    command: -p 7001
    volumes:
      - ./pgdata-restore:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres -d testdb -p 7001"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped
    depends_on:
      - db

Then I updated postgresql.conf a little to use more computer resources. I have an AMD Ryzen 9 7950X (16 cores, 32 threads), 64 GB of RAM and a 1 TB NVMe drive. I configured the database to use 4 threads and 16 GB of memory via PgTune.

postgresql.conf

# DB Version: 17
# OS Type: linux
# DB Type: web
# Total Memory (RAM): 16 GB
# CPUs num: 4
# Connections num: 100
# Data Storage: ssd

max_connections = 100
shared_buffers = 4GB
effective_cache_size = 12GB
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 40329kB
huge_pages = off
min_wal_size = 1GB
max_wal_size = 4GB
max_worker_processes = 4
max_parallel_workers_per_gather = 2
max_parallel_workers = 4
max_parallel_maintenance_workers = 2

listen_addresses = '*'

Data preparation

To begin with, I created a database with 3 tables and 9 indexes, with a total size of ~11 GB. The data is as diverse as possible. I have no doubt that pg_dump handles some types of data better than others, but my project is aimed at a wide audience, so it was important to measure the “average across the board.”

Below is the table structure.

Tables:

Indexes:

The data is generated and inserted into the database using a Python script. The algorithm is as follows:
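The article’s script itself isn’t shown, so here is a minimal illustrative sketch of the idea: generating diverse rows (numbers, timestamps, free-form text) and collecting them into batches for insertion. Column types, batch size, and the suggested psycopg2 call are all assumptions.

```python
import random
import string

# Hypothetical sketch of the generation step; the real schema and
# batch sizes from the article are not shown, so these are illustrative.

def random_text(rng, max_len=200):
    n = rng.randint(1, max_len)
    return "".join(rng.choices(string.ascii_letters + string.digits + " ", k=n))

def generate_rows(rng, count):
    """Yield diverse rows: an int id, a float metric, an epoch timestamp, text."""
    for _ in range(count):
        yield (
            rng.randint(0, 10**9),           # numeric id
            rng.random() * 10**6,            # float metric
            rng.randint(0, 1_700_000_000),   # epoch-style timestamp
            random_text(rng),                # free-form text
        )

rng = random.Random(42)
batch = list(generate_rows(rng, 1000))
# In a real script each batch would go to PostgreSQL, e.g. with
# psycopg2.extras.execute_values(cur, "INSERT INTO t VALUES %s", batch).
print(len(batch))  # 1000
```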

Measurement results

After 21 creations and restores, I obtained the following table with data that includes:

The table with raw CSV data:

tool,format,compression_method,compression_level,backup_duration_seconds,restore_duration_seconds,total_duration_seconds,backup_size_bytes,database_size_bytes,restored_db_size_bytes,compression_ratio,backup_success,restore_success,backup_error,restore_error,timestamp
pg_dump,plain,none,0,100.39210295677185,735.2188968658447,835.6109998226166,9792231003,11946069139,11922173075,0.8197031918249641,True,True,,,2025-07-29T09:56:20.611844
pg_dump,custom,none,0,264.56927490234375,406.6467957496643,671.216070652008,6862699613,11946069139,11943709843,0.5744734550878778,True,True,,,2025-07-29T10:07:37.226681
pg_dump,custom,gzip,1,214.07211470603943,383.0168492794037,597.0889639854431,7074031563,11946069139,11943611539,0.5921639562511493,True,True,,,2025-07-29T10:17:39.801883
pg_dump,custom,gzip,5,260.6179132461548,393.76623010635376,654.3841433525085,6866440205,11946069139,11943718035,0.5747865783384196,True,True,,,2025-07-29T10:28:40.167485
pg_dump,custom,gzip,9,272.3802499771118,385.1409020423889,657.5211520195007,6856264586,11946069139,11943619731,0.5739347819121977,True,True,,,2025-07-29T10:39:42.912960
pg_dump,custom,lz4,1,84.0079517364502,379.6986663341522,463.7066180706024,9146843234,11946069139,11943685267,0.765678075990583,True,True,,,2025-07-29T10:47:32.131593
pg_dump,custom,lz4,5,150.24981474876404,393.44346714019775,543.6932818889618,8926348325,11946069139,11943718035,0.7472205477078983,True,True,,,2025-07-29T10:56:41.333595
pg_dump,custom,lz4,12,220.93980932235718,418.26913809776306,639.2089474201202,8923243046,11946069139,11943767187,0.7469606062188722,True,True,,,2025-07-29T11:07:26.574678
pg_dump,custom,zstd,1,87.83108067512512,419.07846903800964,506.90954971313477,6835388679,11946069139,11943767187,0.5721872692570225,True,True,,,2025-07-29T11:15:59.917828
pg_dump,custom,zstd,5,102.42366409301758,413.64263129234314,516.0662953853607,6774137561,11946069139,11944357011,0.567059966100871,True,True,,,2025-07-29T11:24:42.075008
pg_dump,custom,zstd,15,844.7868592739105,388.23959374427795,1233.0264530181885,6726189591,11946069139,11943636115,0.5630462633973209,True,True,,,2025-07-29T11:45:17.885163
pg_dump,custom,zstd,22,5545.566084384918,404.1370210647583,5949.7031054496765,6798947241,11946069139,11943750803,0.5691367731000038,True,True,,,2025-07-29T13:24:30.014902
pg_dump,directory,none,0,114.9900906085968,395.2716040611267,510.2616946697235,6854332396,11946069139,11943693459,0.5737730391684116,True,True,,,2025-07-29T13:33:05.944191
pg_dump,directory,lz4,1,53.48561334609985,384.92091369628906,438.4065270423889,9146095976,11946069139,11943668883,0.7656155233641663,True,True,,,2025-07-29T13:40:30.590719
pg_dump,directory,lz4,5,83.44352841377258,410.42058181762695,493.86411023139954,8925601067,11946069139,11943718035,0.7471579950814815,True,True,,,2025-07-29T13:48:50.201990
pg_dump,directory,lz4,12,114.15110802650452,400.04946303367615,514.2005710601807,8922495788,11946069139,11943758995,0.7468980535924554,True,True,,,2025-07-29T13:57:30.419171
pg_dump,directory,zstd,1,57.22735643386841,414.4600088596344,471.6873652935028,6835014976,11946069139,11943750803,0.5721559867493079,True,True,,,2025-07-29T14:05:28.529630
pg_dump,directory,zstd,5,60.121564865112305,398.27933716773987,458.4009020328522,6773763858,11946069139,11943709843,0.5670286835931563,True,True,,,2025-07-29T14:13:13.472761
pg_dump,directory,zstd,15,372.43965554237366,382.9877893924713,755.427444934845,6725815888,11946069139,11943644307,0.5630149808896062,True,True,,,2025-07-29T14:25:54.580924
pg_dump,directory,zstd,22,2637.47145485878,394.4939453601837,3031.9654002189636,6798573538,11946069139,11943660691,0.5691054905922891,True,True,,,2025-07-29T15:16:29.450828
pg_dump,tar,none,0,126.3212628364563,664.1294028759003,790.4506657123566,9792246784,11946069139,11942759571,0.8197045128452776,True,True,,,2025-07-29T15:29:45.280592

I had to remove the zstd level 15 and level 22 results from the graphs: their long compression times distorted the scale significantly while providing no noticeable gain in compression.
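A quick sanity check over the raw table: which run had the lowest total (backup + restore) time? Only a rounded subset of the CSV above is embedded here for brevity.

```python
import csv
import io

# A rounded subset of the article's raw CSV, enough to find the fastest run.
data = """format,compression_method,compression_level,total_duration_seconds
plain,none,0,835.61
custom,gzip,1,597.09
custom,lz4,1,463.71
custom,zstd,5,516.07
directory,lz4,1,438.41
directory,zstd,5,458.40
tar,none,0,790.45
"""

rows = list(csv.DictReader(io.StringIO(data)))
best = min(rows, key=lambda r: float(r["total_duration_seconds"]))
print(best["format"], best["compression_method"], best["compression_level"])
# directory lz4 1
```

The fastest combination overall is directory + lz4 level 1, although, as the conclusions explain, raw speed is not the only criterion.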

So, measurements.

Backup speed in seconds (lower is better):


Restore speed from backup in seconds (lower is better):

Total time for creation and restoration from backup in seconds (lower is better):

Total backup size as a percentage of the original database size (smaller is better):


Conclusions based on measurements

Before talking about conclusions, I would like to make an important disclaimer: the test was performed on synthetic data, and the results with “real-world” data will differ significantly. The trend will be the same, but the time difference will be greater.

The measurements show that there is no radical difference in speed between the plain, custom, and directory formats on synthetic data, even though the custom format (unlike plain) restores in parallel, and the directory format additionally creates the backup itself in parallel.

The difference in speed between the custom and plain formats is ~30%; between custom and directory, only ~20%. I suspect the test data lacked a sufficient number of independent tables and objects; otherwise, the gap between the formats would have been several times greater.

So, based on the measurements, I can make the following conclusions:

Conclusion

The measurements showed that the optimal format for my task is the custom format with zstd compression at level 5: I get near-maximum total speed with almost maximum compression, in a single backup file.

After switching the project from gzip 6 to zstd 5, the backup size was reduced by almost half with a slightly shorter backup time. And unlike the synthetic data, a real 4.7 GB database was compressed to 276 MB (17 times smaller!).
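As a quick check on that ratio (the 4.7 GB and 276 MB figures are from the paragraph above; the arithmetic is mine):

```python
# Verifying the quoted real-world compression: 4.7 GB down to 276 MB.
original_mb = 4.7 * 1024    # 4812.8 MB
compressed_mb = 276
ratio = original_mb / compressed_mb
print(round(ratio, 1))  # 17.4
```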


I hope that my test will be useful to those who develop backup tools or regularly dump databases using pg_dump scripts. Perhaps in the future, I will conduct the same test, but with a more diverse data set.

And once again, if you need to create regular backups, I have an open source project for this task. I would be extremely grateful for a star on GitHub ❤️, as the first stars are hard to come by.