🚀 About


🔍 Features


🛠️ Usage

Receive mode (the main loop of the WAL receiver)

cat <<EOF >config.yml
main:
  listen_port: 7070
  directory: wals
receiver:
  slot: pgrwl_v5
log:
  level: trace
  format: text
  add_source: true
EOF

export PGHOST=localhost
export PGPORT=5432
export PGUSER=postgres
export PGPASSWORD=postgres
export PGRWL_MODE=receive

pgrwl -c config.yml
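
If the replication slot named in the config (pgrwl_v5 above) does not already exist, and assuming pgrwl does not create it for you automatically, you can create it up front with standard PostgreSQL tooling:

psql -h "$PGHOST" -p "$PGPORT" -U "$PGUSER" -d postgres \
  -c "SELECT pg_create_physical_replication_slot('pgrwl_v5');"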

Serve mode (used during restore to serve archived WAL files from storage)

cat <<EOF >config.yml
main:
  listen_port: 7070
  directory: wals
log:
  level: trace
  format: text
  add_source: true
EOF

export PGRWL_MODE=serve

pgrwl -c config.yml

restore_command example for postgresql.conf

# where 'k8s-worker5:30266' represents the host and port 
# of a 'pgrwl' instance running in 'serve' mode. 
restore_command = 'pgrwl restore-command --serve-addr=k8s-worker5:30266 %f %p'
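
A rough end-to-end restore sketch (the data directory path and the serve address are placeholders; adjust them to your environment):

# 1. Restore the base backup into the data directory, then request archive recovery
touch "$PGDATA/recovery.signal"

# 2. Point restore_command at the pgrwl instance running in 'serve' mode
cat <<EOF >>"$PGDATA/postgresql.conf"
restore_command = 'pgrwl restore-command --serve-addr=k8s-worker5:30266 %f %p'
EOF

# 3. Start PostgreSQL; it fetches WAL through pgrwl until recovery completes
pg_ctl -D "$PGDATA" start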

โญ See also: examples (step-by-step archive and recovery), and k8s (basic setup)


⚙️ Configuration Reference

The configuration file may be written in JSON or YAML (*.json is preferred). It supports environment variable placeholders such as ${PGRWL_SECRET_ACCESS_KEY}.

main:                                    # Required for both modes: 'receive' / 'serve'
  listen_port: 7070                      # HTTP server port (used for management)
  directory: "/var/lib/pgwal"            # Base directory for storing WAL files

receiver:                                # Required for 'receive' mode
  slot: replication_slot                 # Replication slot to use
  no_loop: false                         # If true, do not loop on connection loss

uploader:                                # Optional (used in receive mode)
  sync_interval: 10s                     # Interval for the upload worker to check for new files
  max_concurrency: 4                     # Maximum number of files to upload concurrently

log:                                     # Optional
  level: info                            # One of: trace / debug / info / warn / error
  format: text                           # One of: text / json
  add_source: true                       # Include file:line in log messages (for local development)

storage:                                 # Optional
  name: s3                               # One of: s3 / sftp
  compression:                           # Optional
    algo: gzip                           # One of: gzip / zstd
  encryption:                            # Optional
    algo: aesgcm                         # One of: aes-256-gcm
    pass: "${PGRWL_ENCRYPT_PASSWD}"      # Encryption password (from env)
  sftp:                                  # Required section for 'sftp' storage
    host: sftp.example.com               # SFTP server hostname
    port: 22                             # SFTP server port
    user: backupuser                     # SFTP username
    pass: "${PGRWL_VM_PASSWORD}"         # SFTP password (from env)
    pkey_path: "/home/user/.ssh/id_rsa"  # Path to SSH private key (optional)
    pkey_pass: "${PGRWL_SSH_PKEY_PASS}"  # Required if the private key is password-protected
  s3:                                    # Required section for 's3' storage
    url: https://s3.example.com          # S3-compatible endpoint URL
    access_key_id: AKIAEXAMPLE           # AWS access key ID
    secret_access_key: "${PGRWL_AWS_SK}" # AWS secret access key (from env)
    bucket: postgres-backups             # Target S3 bucket name
    region: us-east-1                    # S3 region
    use_path_style: true                 # Use path-style URLs for S3
    disable_ssl: false                   # Disable SSL
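
For example, a receive-mode config that uploads completed segments to S3 with zstd compression, taking the secret from the environment (a sketch based on the reference above; the endpoint, bucket, and slot names are placeholders):

cat <<'EOF' >config.yml   # quoted heredoc keeps the ${...} placeholder literal
main:
  listen_port: 7070
  directory: wals
receiver:
  slot: pgrwl_v5
uploader:
  sync_interval: 10s
  max_concurrency: 4
storage:
  name: s3
  compression:
    algo: zstd
  s3:
    url: https://s3.example.com
    access_key_id: AKIAEXAMPLE
    secret_access_key: "${PGRWL_AWS_SK}"
    bucket: postgres-backups
    region: us-east-1
    use_path_style: true
EOF

export PGRWL_AWS_SK='<secret>'   # resolved via the ${PGRWL_AWS_SK} placeholder above
export PGRWL_MODE=receive
pgrwl -c config.yml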


🚀 Installation

Manual Installation

  1. Download the latest binary for your platform from the Releases page.
  2. Place the binary in your system's PATH (e.g., /usr/local/bin).

Installation script for Unix-based systems (requires: tar, curl, jq):

(
set -euo pipefail

OS="$(uname | tr '[:upper:]' '[:lower:]')"
ARCH="$(uname -m | sed -e 's/x86_64/amd64/' -e 's/\(arm\)\(64\)\?.*/\1\2/' -e 's/aarch64$/arm64/')"
TAG="$(curl -s https://api.github.com/repos/hashmap-kz/pgrwl/releases/latest | jq -r .tag_name)"

curl -L "https://github.com/hashmap-kz/pgrwl/releases/download/${TAG}/pgrwl_${TAG}_${OS}_${ARCH}.tar.gz" |
tar -xzf - -C /usr/local/bin && \
chmod +x /usr/local/bin/pgrwl
)
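
A quick sanity check that the binary is installed and on your PATH:

command -v pgrwl   # should print /usr/local/bin/pgrwl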


🗃️ Usage In Backup Process

The full process may look like this (a typical, rough, and simplified example):
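
(Host names, paths, and schedules below are placeholders; in practice the receiver runs as a service or Kubernetes pod rather than a backgrounded shell job.)

# 1. Continuously archive WAL with pgrwl in 'receive' mode (see Usage above)
PGRWL_MODE=receive pgrwl -c config.yml &

# 2. Take periodic base backups; WAL is already covered by the receiver
pg_basebackup -h localhost -U postgres -D "/backups/base/$(date +%F)" -X none

# 3. On restore: unpack the latest base backup, run pgrwl in 'serve' mode,
#    set restore_command, create recovery.signal, and start PostgreSQL
#    (see the restore_command example above).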


🧱 Architecture

Design Notes

pgrwl is designed to use the local filesystem exclusively. This is a deliberate choice because, as mentioned earlier, we must rely on fsync after each message is written to disk.

This ensures that *.partial files always contain fully valid WAL segments, making them safe to use during the restore phase (after simply removing the *.partial suffix).

pgrwl supports compression and encryption as optional features for completed WAL files (applied during upload to remote storage).

However, streaming *.partial files to any location other than the local filesystem can introduce numerous unpredictable issues.

In short: PostgreSQL waits for the replica to confirm commits, so we cannot afford to depend on external systems in such critical paths.

💾 Notes on fsync (since the utility works in synchronous mode only):

🔁 Notes on archive_command and archive_timeout

There's a significant difference between using archive_command and archiving WAL files via the streaming replication protocol.

The archive_command is triggered only after a WAL file is fully completed, typically when it reaches 16 MiB (the default segment size). This means that in a crash scenario, you could lose up to 16 MiB of data.

You can mitigate this by setting a lower archive_timeout (e.g., 1 minute), but even then, in a worst-case scenario, you risk losing up to 1 minute of data. Also, it's important to note that PostgreSQL preallocates WAL files to the configured wal_segment_size, so they are created at full size regardless of how much data has been written. (Quote from the documentation: "It is therefore unwise to set a very short archive_timeout — it will bloat your archive storage.")

In contrast, streaming WAL archiving, when used with replication slots and the synchronous_standby_names parameter, ensures that the system can be restored to the latest committed transaction. This approach provides true zero data loss (RPO=0), making it ideal for high-durability requirements.
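
A hedged sketch of that RPO=0 wiring: it assumes pgrwl connects with a libpq-style application_name (set here via the standard PGAPPNAME environment variable) and confirms flushed WAL the way pg_receivewal --synchronous does.

# On the pgrwl side: give the replication connection a recognizable name
export PGAPPNAME=pgrwl

# On the PostgreSQL side: commits wait until that standby confirms the flush
psql -c "ALTER SYSTEM SET synchronous_standby_names = 'pgrwl'"
psql -c "SELECT pg_reload_conf()"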


👷 Developer Notes

🧪 Integration Testing:

Here is an example of a fundamental golden test. It verifies that we can restore to the latest committed transaction after an abrupt system crash. It also checks that the generated WAL files are byte-for-byte identical to those produced by pg_receivewal.

Test Steps:
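
Roughly, based on the description above (the exact steps live in the integration test scripts):

  1. Start PostgreSQL and stream WAL with both pgrwl and pg_receivewal, each on its own replication slot.
  2. Generate write load, then kill the server abruptly (SIGKILL), without a clean shutdown.
  3. Restore a base backup, recover using the WAL archived by pgrwl, and verify that the last committed transaction is present.
  4. Compare the archived segments against those produced by pg_receivewal (byte-for-byte, e.g., with cmp).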

To contribute or verify the project locally, the following make targets should all pass:

# Compile the project
make build

# Run linter (should pass without errors)
make lint

# Run unit tests (should all pass)
make test

# Run integration tests (slow, but critical)
# Requires Docker and Docker Compose to be installed
make test-integ-scripts

# Run GoReleaser builds locally
make snapshot

✅ All targets should complete successfully before submitting changes or opening a PR.

🗂️ Source Code Structure

internal/xlog/pg_receivewal.go
  โ†’ Entry point for WAL receiving logic.
    Based on the logic found in PostgreSQL:
    https://github.com/postgres/postgres/blob/master/src/bin/pg_basebackup/pg_receivewal.c

internal/xlog/receivelog.go
  โ†’ Core streaming loop and replication logic.
    Based on the logic found in PostgreSQL: 
    https://github.com/postgres/postgres/blob/master/src/bin/pg_basebackup/receivelog.c

internal/xlog/xlog_internal.go
  โ†’ Helpers for LSN math, WAL file naming, segment calculations.
    Based on the logic found in PostgreSQL:
    https://github.com/postgres/postgres/blob/master/src/include/access/xlog_internal.h

internal/xlog/walfile.go
  โ†’ Manages WAL file descriptors: open, write, close, sync.

internal/xlog/streamutil.go
  โ†’ Utilities for querying server parameters (e.g. wal_segment_size),
    replication slot info, and streaming setup.

internal/xlog/fsync/
  โ†’ Optimized wrappers for safe and efficient `fsync` system calls.

📝 Main Loop


✅ TL;DR

If you're building reliable PostgreSQL backup pipelines and want streaming, durability, and developer control, give pgrwl a try.

💬 Questions or feedback? Drop a GitHub Issue or comment here!

👉 Check out the source

🔖 Licensed under MIT