If your PostgreSQL tables are growing into the hundreds of millions of rows and queries are getting sluggish despite good indexes, partitioning might be exactly what you need. This guide covers the fundamentals and walks you through hands-on examples to get you started.
What Is Partitioning?
Table partitioning is a technique where a single logical table is split into multiple physical sub-tables called partitions. From the application's perspective, you still query one table. Under the hood, PostgreSQL routes reads and writes to the appropriate partition automatically.
Think of it like a filing cabinet with labeled drawers. Instead of searching every paper in a single drawer, you go directly to the "2024" drawer and search there. The result? Dramatically faster queries on large datasets.
Why Partition?
- Query performance: PostgreSQL can skip entire partitions via partition pruning, scanning only the relevant data.
- Faster maintenance: Vacuuming, reindexing, and analyzing smaller partitions is quicker than doing so on a single table.
- Easy data lifecycle management: Dropping old data is as simple as DROP TABLE partition_name, which is much faster and less expensive than DELETE.
- Improved I/O: Frequently accessed partitions can live on faster storage.
Partitioning Strategies
PostgreSQL (version 10 and later) supports three built-in declarative partitioning strategies:
1. Range Partitioning
Rows are distributed based on a range of values — most commonly dates or numeric IDs. This is the most popular strategy for time-series data.
2. List Partitioning
Rows are distributed based on a discrete list of values (e.g., country codes, status enums).
3. Hash Partitioning
Rows are distributed by computing a hash on the partition key, evenly spreading data across N partitions. Good when you don't have a natural range or list to partition on.
Setting Up Range Partitioning
Let's say we have an orders table that gets millions of rows per year. We'll partition it by created_at (monthly).
Step 1: Create the Partitioned Parent Table
CREATE TABLE orders (
id BIGSERIAL,
customer_id BIGINT NOT NULL,
amount NUMERIC(10,2) NOT NULL,
status TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (created_at);
Note: The parent table holds no data itself — it's purely a logical container.
Step 2: Create Partitions
CREATE TABLE orders_2024_01
PARTITION OF orders
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE orders_2024_02
PARTITION OF orders
FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
CREATE TABLE orders_2024_03
PARTITION OF orders
FOR VALUES FROM ('2024-03-01') TO ('2024-04-01');
The ranges are inclusive on the lower bound and exclusive on the upper bound.
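This matters for rows that land exactly on a boundary. For example (values are illustrative), a row timestamped at exactly midnight on the first of the month belongs to that month's partition, not the previous one:

```sql
-- '2024-02-01 00:00:00+00' equals the upper bound of orders_2024_01
-- and the lower bound of orders_2024_02, so it is routed to orders_2024_02.
INSERT INTO orders (customer_id, amount, status, created_at)
VALUES (7, 50.00, 'pending', '2024-02-01 00:00:00+00');
```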
Step 3: Add Indexes
On PostgreSQL 11 and later, you can create an index on the parent table and it is automatically cascaded to every partition (including partitions added later). On PostgreSQL 10, you must create the index on each partition individually:
-- Create index on parent — propagates to all partitions automatically (PG 11+)
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
CREATE INDEX idx_orders_created_at ON orders (created_at);
Step 4: Insert Data
INSERT INTO orders (customer_id, amount, status, created_at)
VALUES (42, 199.99, 'completed', '2024-01-15 10:30:00+00');
PostgreSQL automatically routes this row to orders_2024_01.
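The flip side: if no partition covers the value (and there's no default partition, covered below), the insert is rejected rather than silently landing somewhere. A sketch, assuming only the Jan–Mar 2024 partitions from Step 2 exist:

```sql
-- ERROR: no partition of relation "orders" found for row
INSERT INTO orders (customer_id, amount, status, created_at)
VALUES (42, 10.00, 'completed', '2024-06-01 00:00:00+00');
```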
Step 5: Verify Partition Pruning
EXPLAIN SELECT * FROM orders
WHERE created_at >= '2024-02-01'
AND created_at < '2024-03-01';
You should see only orders_2024_02 in the query plan — that's partition pruning in action.
List Partitioning Example
Perfect for partitioning by a categorical column like region:
CREATE TABLE customers (
id BIGSERIAL,
name TEXT NOT NULL,
region TEXT NOT NULL
) PARTITION BY LIST (region);
CREATE TABLE customers_us
PARTITION OF customers
FOR VALUES IN ('US', 'CA');
CREATE TABLE customers_eu
PARTITION OF customers
FOR VALUES IN ('DE', 'FR', 'GB', 'NL');
CREATE TABLE customers_apac
PARTITION OF customers
FOR VALUES IN ('AU', 'JP', 'SG', 'IN');
Hash Partitioning Example
Useful when data doesn't have a natural range. Here we split into 4 partitions:
CREATE TABLE events (
id BIGSERIAL,
user_id BIGINT NOT NULL,
event_type TEXT NOT NULL,
payload JSONB,
occurred_at TIMESTAMPTZ NOT NULL
) PARTITION BY HASH (user_id);
CREATE TABLE events_p0 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE events_p1 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE events_p2 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE events_p3 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 3);
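To check which partition a given row actually landed in (useful for hash partitioning, where the mapping isn't obvious), you can select the system column tableoid — this works for any partitioning strategy:

```sql
-- Shows the physical partition (e.g. events_p2) holding each matching row
SELECT tableoid::regclass AS partition, id, user_id
FROM events
WHERE user_id = 42;
```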
Default Partitions
To catch rows that don't match any existing partition, create a default partition:
CREATE TABLE orders_default
PARTITION OF orders DEFAULT;
This is especially useful during development or when you're not sure all values are accounted for.
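One caveat: if the default partition accumulates rows that belong to a partition you later try to create, that CREATE TABLE ... PARTITION OF will fail with an overlap error. You have to relocate those rows first. A sketch (dates and the temp table name are illustrative):

```sql
-- Move misrouted rows out of the default partition, then create
-- the partition that should own them, then re-insert the rows
-- (they will now be routed to the new partition).
BEGIN;
CREATE TABLE orders_2024_04_tmp AS
  SELECT * FROM orders_default
  WHERE created_at >= '2024-04-01' AND created_at < '2024-05-01';
DELETE FROM orders_default
  WHERE created_at >= '2024-04-01' AND created_at < '2024-05-01';
CREATE TABLE orders_2024_04 PARTITION OF orders
  FOR VALUES FROM ('2024-04-01') TO ('2024-05-01');
INSERT INTO orders SELECT * FROM orders_2024_04_tmp;
DROP TABLE orders_2024_04_tmp;
COMMIT;
```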
Automating Partition Creation
In production, you don't want to manually create monthly partitions. Use a scheduled function:
CREATE OR REPLACE FUNCTION create_monthly_partition(target_date DATE)
RETURNS VOID AS $$
DECLARE
partition_name TEXT;
start_date DATE;
end_date DATE;
BEGIN
start_date := DATE_TRUNC('month', target_date);
end_date := start_date + INTERVAL '1 month';
partition_name := 'orders_' || TO_CHAR(start_date, 'YYYY_MM');
EXECUTE FORMAT(
'CREATE TABLE IF NOT EXISTS %I PARTITION OF orders FOR VALUES FROM (%L) TO (%L)',
partition_name, start_date, end_date
);
END;
$$ LANGUAGE plpgsql;
-- Create partitions for the next 3 months
SELECT create_monthly_partition((DATE_TRUNC('month', NOW()) + (n || ' month')::INTERVAL)::DATE)
FROM generate_series(0, 2) AS n;
Schedule this with pg_cron or an external scheduler (cron job, Airflow, etc.) to run monthly.
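With the pg_cron extension installed, for instance, the schedule might look like this (the job name and cron expression are illustrative):

```sql
-- Run at midnight on the 25th of every month, creating the next 3 months
SELECT cron.schedule(
  'create-order-partitions',
  '0 0 25 * *',
  $$SELECT create_monthly_partition((DATE_TRUNC('month', NOW()) + (n || ' month')::INTERVAL)::DATE)
    FROM generate_series(0, 2) AS n$$
);
```

Scheduling a few days ahead of month-end gives you a buffer if the job ever fails.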
Dropping Old Partitions
This is where partitioning really shines for data lifecycle management. Instead of a slow, lock-heavy DELETE:
-- Instantly drop a year's worth of data
DROP TABLE orders_2022_01;
DROP TABLE orders_2022_02;
-- ... etc
Or detach it first if you want to archive it:
ALTER TABLE orders DETACH PARTITION orders_2022_01;
-- Partition now exists as a standalone table — archive or export it
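Once detached, the standalone table can be exported with ordinary tooling. One option (the path is illustrative; server-side COPY requires appropriate file-access privileges, and psql's \copy is the client-side alternative):

```sql
-- Export the detached table to CSV, then drop it
COPY orders_2022_01 TO '/archive/orders_2022_01.csv' WITH (FORMAT csv, HEADER);
DROP TABLE orders_2022_01;
```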
Common Gotchas
Primary keys must include the partition key. PostgreSQL can't enforce uniqueness across partitions without it:
-- This will fail:
ALTER TABLE orders ADD PRIMARY KEY (id);
-- This works:
ALTER TABLE orders ADD PRIMARY KEY (id, created_at);
Foreign keys referencing partitioned tables require PostgreSQL 12 or later (foreign keys from a partitioned table to another table work from version 11).
Partition pruning requires the partition key in the WHERE clause. A query without a filter on created_at will scan all partitions.
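You can compare the two cases directly with EXPLAIN (the status value and dates are illustrative):

```sql
-- Scans every partition: no filter on the partition key
EXPLAIN SELECT * FROM orders WHERE status = 'completed';

-- Pruned to a single partition: filter on created_at included
EXPLAIN SELECT * FROM orders
WHERE status = 'completed'
  AND created_at >= '2024-01-01' AND created_at < '2024-02-01';
```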
Be careful with very fine-grained partitions. Hundreds of partitions can hurt planning time. Monthly or quarterly granularity is usually a sweet spot for time-series data.
Checking Your Partitions
Some handy queries for inspecting your partition setup:
-- List all partitions of a table
SELECT inhrelid::regclass AS partition_name,
pg_get_expr(c.relpartbound, inhrelid) AS partition_bound,
pg_size_pretty(pg_relation_size(inhrelid)) AS size
FROM pg_inherits
JOIN pg_class c ON c.oid = inhrelid
WHERE inhparent = 'orders'::regclass
ORDER BY partition_name;
When NOT to Partition
Partitioning adds operational complexity. Skip it if:
- Your table is under ~10 million rows — indexing alone is sufficient.
- You don't have a natural partition key.
- Your queries are mostly aggregations across the entire table (partitioning won't help much).
- Your team isn't comfortable managing the added complexity.
Summary
PostgreSQL's declarative partitioning is mature, powerful, and relatively straightforward to implement. To recap:
- Use range partitioning for time-series and sequential data.
- Use list partitioning for categorical/enum-style columns.
- Use hash partitioning for even distribution without a natural key.
- Always include the partition key in your primary key.
- Automate partition creation and drop old partitions instead of deleting rows.
- Verify partition pruning with EXPLAIN to make sure your queries are benefiting.
Start with one table that's causing pain, instrument it, and measure the improvement. You'll likely find the effort well worth it.
Have questions or war stories about PostgreSQL partitioning? Drop them in the comments below.