Select Page

Deploying, Configuring, and Automating pg_partman on Amazon RDS PostgreSQL

Author: Abdul Sayeed | | November 29, 2023

Introduction

This blog helps to configure pg_partman and convert tables into Partition Tables. Also, it will list out the steps to schedule retentions.

Initial Configuration

To configure pg_partman on RDS PostgreSQL, install PostgreSQL 14 on an EC2 Node or Kubernetes pods where pg_partman can be configured along with deploying the partition scripts and schedule for the retentions.

$ sudo apt update

$ sudo apt install postgresql postgresql-contrib

$ sudo -i -u postgres

postgres@ip-xxx-xx-xx-xx:/usr/lib/postgresql/14/bin$ psql -h -d dv_partman_test -U datavail -f ‘Partman_Config’ >> /tmp/partman_config.log

postgres@ip-xxx-xx-xx-xx:/usr/lib/postgresql/14/bin$ more Partman_Config

CREATE SCHEMA partman;

CREATE EXTENSION pg_partman WITH SCHEMA partman;

GRANT ALL ON SCHEMA partman TO partman;

GRANT ALL ON ALL TABLES IN SCHEMA partman TO partman;

GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA partman TO partman;

GRANT EXECUTE ON ALL PROCEDURES IN SCHEMA partman TO partman;

Deploying SQL scripts for each table

Let’s take a table called “orders” as an example, which will be partitioned using the pg_partman extension.

postgres@ip-xxx-xx-xx-xx:/usr/lib/postgresql/14/bin$ more partman_test.sq

---Rename original table with _old name:

ALTER TABLE public.orders rename to orders_old;

--- Check min value :

SELECT TO_TIMESTAMP(min(clock)) from public.orders_old;

SELECT TO_TIMESTAMP(max(clock)) from public.orders_old;

-- Create template table

CREATE TABLE public.orders_template
(
itemid bigint NOT NULL,
clock integer NOT NULL,
value bigint NOT NULL,
ns integer NOT NULL DEFAULT 0
);

CREATE INDEX orders_template_inx ON public.orders_template USING btree (itemid, clock, ns);

-- Create orders table - partition

CREATE TABLE public.orders
(
itemid bigint NOT NULL,
clock integer NOT NULL,
value bigint NOT NULL,
ns integer NOT NULL DEFAULT 0
) PARTITION BY RANGE (clock);

CREATE INDEX orders_inx ON public.orders USING btree (itemid, clock, ns);

-- Create partitions

SELECT partman.create_parent(
p_parent_table := 'public.orders',
p_control := 'clock',
p_type := 'native',
p_interval := 'daily',
p_constraint_cols := null,
p_premake := 10,
p_automatic_maintenance := 'on',
p_start_partition := null,
p_inherit_fk := true,
p_epoch := 'seconds',
p_template_table :='public.orders_template' );

-- Validate config in partman
select * from partman.part_config where parent_table='public.orders' \gx

-- Validate partitions

WITH RECURSIVE partition_info
(relid,
relname,
relsize,
relispartition,
relkind) AS (
SELECT oid AS relid,
relname,
pg_relation_size(oid) AS relsize,
relispartition,
relkind
FROM pg_catalog.pg_class
WHERE
relname = 'orders' AND
relkind = 'p'
UNION ALL
SELECT
c.oid AS relid,
c.relname AS relname,
pg_relation_size(c.oid) AS relsize,
c.relispartition AS relispartition,
c.relkind AS relkind
FROM partition_info AS p,
pg_catalog.pg_inherits AS i,
pg_catalog.pg_class AS c
WHERE p.relid = i.inhparent AND
c.oid = i.inhrelid AND
c.relispartition
)
SELECT * FROM partition_info;

-- move data
CALL partman.partition_data_proc('public.orders', p_batch := 1000, p_source_table := 'public.orders_old');

-- Validate row count with respect to child partition:

select to_char(TO_TIMESTAMP(clock),'dd-mm-YYYY') as daywise,
count(*) as total_records
from public.orders
group by daywise
order by daywise desc
;

-- Set retention:
UPDATE partman.part_config set retention = '30 day', retention_keep_table = false, retention_keep_index = false WHERE parent_table = 'public.orders';

You need to update the schema name in the partman_test.sql script if there is a change in schema from public:

postgres@ip-xxx-xx-xx-xx:/usr/lib/postgresql/14/bin$ psql -h -d dv_partman_test -U datavail -f 'partman_test.sql' >> /tmp/ partman_test.log

Verify the log output:

postgres@ip-xxx-xx-xx-xx:/usr/lib/postgresql/14/bin$ more /tmp/ partman_test.log

SQL Scripts Include the Following Tasks

  1. Partman_Config – It includes creating the partman extension, schema, and provisioning related permission to schema and underlying tables to the partman user.
  2. ALTER Table – This will rename the existing table with _old to keep data and perform data load post creating partitions.
  3. SELECT TO_TIMESTAMP(min(clock)) – To get the oldest record from current normal table.
  4. SELECT TO_TIMESTAMP(max(clock)) – To get the latest record from current normal table.
  5. CREATE TEMPLATE Table – This will create a template table to retain the index and which can be use while creating child tables. It means using the template table, pg_partman will create child tables with index.
  6. CREATE INDEX on Template Table – Manually creating an index to apply for all child tables.
  7. CREATE PARENT Table – This will create a parent table with the same name as existing normal table along with index.
  8. CREATE PARTITIONS – This will create partition tables using pg_partman extension. It includes pre-make parameter which we have set to 10. It means it will create 10 future partitions.
  9. SELECT Statement from pg_partman config file – To validate config in the partman schema.
  10. SQL – WITH RECURSIVE – To validate partitions.
  11. CALL partman.partition_data_proc – To load/move data from Normal Table to Partition tables. In this environment, we are using clock column for Partitioning, hence this CALL Procedure will load data with respect to date and move data in child partitions as per date.
  12. VACUUM – Perform vacuum analyze. It will update the statistics to improve overall Partition tables performance.
  13. SELECT TO_CHAR – Validate row count with respect to each partition’s tables.
  14. UPDATE – Set retention. This will invoke when we call run_maintenance from scheduler and retain 30 child partitions as per our requirement.

Post Partition Validation

Manually check by comparing the output of SELECT TO_TIMESTAMP(min(clock)) and SELECT TO_TIMESTAMP(max(clock)).

Example:

dv_partman_test=> SELECT TO_TIMESTAMP(min(clock)) from public.orders_old;

to_timestamp

------------------------

2019-11-06 20:00:00+00

(1 row)

dv_partman_test=> SELECT TO_TIMESTAMP(max(clock)) from public.orders_old;

to_timestamp

------------------------

2023-07-20 18:00:00+00

(1 row)

dv_partman_test=> SELECT * from public.orders_p2019_11_06 ;

dv_partman_test=> SELECT * from public.orders_p2023_07_20 ;

Scheduling Maintenance

Linux crontab is used for scheduling or the pg_cron rds extension.

Below is a cron job example using Linux crontab scheduler:

00 00 * * * /usr/lib/postgresql/14/bin/psql -h -d dv_partman_test -U datavail -c “SELECT partman.run_maintenance(‘public.partman_test’);” >> /tmp/partman_test.log

To schedule maintenance tasks for pg_partman using pg_cron in PostgreSQL, first ensure that pg_cron is installed and properly configured. Below are the steps on how to schedule maintenance tasks such as partition creation, management, and cleanup using pg_cron.

Install and configure pg_cron

Ensure pg_cron is installed and properly configured in your PostgreSQL database. You’ll need to follow the installation instructions specific to your PostgreSQL version and environment. You can find the installation and configuration steps for pg_cron in the official documentation: https://github.com/citusdata/pg_cron

Schedule partition management and cleanup

Create SQL scripts for partition management (e.g., merging or splitting partitions) and cleanup (e.g., dropping old partitions). Schedule these scripts using pg_cron to run at the desired intervals.

-- Example: Schedule partition management (e.g., merging) script

SELECT cron.schedule('0 1 * * *', 'CALL merge_partitions()');

-- Example: Schedule partition cleanup script

SELECT cron.schedule('0 2 * * *', 'CALL cleanup_partitions()');

Adjust the cron schedules (0 1 * * * and 0 2 * * * in these examples) to your desired frequencies.

Verify and monitor scheduled jobs

Use SELECT * FROM pg_cron.job_stats; to monitor the scheduled jobs and their execution status.
With these steps, you have scheduled maintenance tasks for partition creation, management, and cleanup using pg_partman through pg_cron in PostgreSQL. Make sure to review the pg_partman and pg_cron documentation for a deeper understanding of their functionalities and usage.

How to Solve the Oracle Error ORA-12154: TNS:could not resolve the connect identifier specified

The “ORA-12154: TNS Oracle error message is very common for database administrators. Learn how to diagnose & resolve this common issue here today.

Vijay Muthu | February 4, 2021

Oracle Database Extended Support Deadlines: What You Need to Know

If you’re confused about Oracle’s extended support deadlines, you are not alone. Here’s an overview of what’s in store for 11g through 19c.

Steve Thompson | January 19, 2021

Data Types: The Importance of Choosing the Correct Data Type

Most DBAs have struggled with the pros and cons of choosing one data type over another. This blog post discusses different situations.

Craig Mullins | October 11, 2017

Subscribe to Our Blog

Never miss a post! Stay up to date with the latest database, application and analytics tips and news. Delivered in a handy bi-weekly update straight to your inbox. You can unsubscribe at any time.

Work with Us

Let’s have a conversation about what you need to succeed and how we can help get you there.

CONTACT US

Work for Us

Where do you want to take your career? Explore exciting opportunities to join our team.

EXPLORE JOBS