Upgrade RDS PostgreSQL to higher version with less downtime using bucardo

Downtime ! when the business team listen to this word they simply say no…To upgrade a PostgreSQL Database using pg_upgrade we need good amount of downtime based on the database size. Even if we are using PostgreSQL RDS we need to have downtime to upgrade to latest versions.

If we are using PostgreSQL 9.4+ we can use the pglogical extension, some times it might not be possible to install this extension and we need another way. In that I found bucardo is very much useful. We can install this on a separate machine and configure the replication.

Let’s start!

What is bucardo ?

Bucardo is a perl based software used to replicate a set of tables from source to destination using triggers. I am using this to migrate from RDS PostgreSQL 9.5 to 11 version.

How bucardo works ?

Bucardo creates a schema bucardo in the source database and create a trigger on each table who are part of replication. Once there is a change (insert, update, delete) on source table, it will be stored in the delta tables created in bucardo schema. A notice will be sent to the bucardo daemon process which will handle the replication of record to destination.

My configuration ?

I have two rds instances and one EC2 instance with CentOS 7.

hostdatabaseversion
   
pghost1compliance9.5
pghost2compliance11
bucardobucardo10
configuration

Installation

The simplest way to install bucardo is using the yum repository of PostgreSQL. After installation the configure bucardo to connect to all databases (source and destination) without prompting password. Create the users with rds_superuser permissions and make entry for them in the pg_hba.conf of bucardo instance. As this is my private instance I created users with same password and used * for all instances in pg_hba.conf

yum install epel
yum install bucardo_10
[root@bucardo usr]# systemctl enable postgresql-10
Created symlink from /etc/systemd/system/multi-user.target.wants/postgresql-10.service to /usr/lib/systemd/system/postgresql-10.service.
[root@bucardo usr]# /usr/pgsql-10/bin/postgresql-10-setup initdb
Initializing database ... OK
[root@bucardo usr]# systemctl start postgresql-10
# Set the configuration parameters
postgres=# alter system set listen_addresses to '*';
ALTER SYSTEM
postgres=# alter system set shared_buffers to '250MB';
ALTER SYSTEM
# Add a line to pg_hba.conf
host    all             all             0.0.0.0/0               md5
# Create a file ~/.pgpass file with below content add in all hosts
-bash-4.2$ cat ~/.pgpass
*:*:*:*:<password>
# I have set password for all users to postgres so I set it like that
systemctl restart postgresql-10

Create bucardo repository database

bucardo use repository database to maintain the metadata about the instances involved in the replication and other information. Login to the bucardo instance and create the database.

postgres=# create user bucardo password 'postgres' superuser;
CREATE ROLE
postgres=# create database bucardo owner bucardo;
CREATE DATABASE

Install bucardo repository

This will create metadata tables required to store metadata in the bucardo database created in previous step.

To store the log files and process id related files create the required folders.

[root@bucardo usr]# mkdir -p /var/run/bucardo
[root@bucardo usr]# mkdir -p /var/log/bucardo
[root@bucardo usr]# chown -R postgres:postgres /var/run/bucardo
[root@bucardo usr]# chown -R postgres:postgres /var/log/bucardo

install bucardo

-bash-4.2$ bucardo install
This will install the bucardo database into an existing Postgres cluster.
Postgres must have been compiled with Perl support,
and you must connect as a superuser

Current connection settings:
1. Host:           <none>
2. Port:           5432
3. User:           bucardo
4. Database:       bucardo
5. PID directory:  /var/run/bucardo
Enter a number to change it, P to proceed, or Q to quit: 1

# Change hostname to localhost and select P to proceed
Enter a number to change it, P to proceed, or Q to quit: P


Attempting to create and populate the bucardo database and schema
Database creation is complete

Updated configuration setting "piddir"
Installation is now complete.
If you see errors or need help, please email bucardo-general@bucardo.org

You may want to check over the configuration variables next, by running:
bucardo show all
Change any setting by using: bucardo set foo=bar

Now the bucardo installation is completed. We will configure the instances for replication.

Configure instances

Take a schema backup of the compliance database from 9.5 and restore it to 11 version server.

pg_dump -C -d compliance -h 192.168.159.137 --schema-only | psql
# Install the extension plperl in both the instances
# For 9.5,11 version install plperl
yum install postgresql11-plperl
compliance=# create extension plperl;
CREATE EXTENSION

As we are using AWS create the below function in source and destination environments (databases)

postgres=# CREATE OR REPLACE FUNCTION public.rds_session_replication_role(role text)
postgres-#  RETURNS text
postgres-#  LANGUAGE plpgsql
postgres-#  SECURITY DEFINER
postgres-# AS $function$
postgres$#         DECLARE
postgres$#                 curr_val text := 'unset';
postgres$#         BEGIN
postgres$#                 EXECUTE 'SET session_replication_role = ' || quote_literal(role);
postgres$#                 EXECUTE 'SHOW session_replication_role' INTO curr_val;
postgres$#                 RETURN curr_val;
postgres$#         END
postgres$# $function$;
CREATE FUNCTION
postgres=# revoke all on function rds_session_replication_role(text) from public;
REVOKE
Also, make changes to the /usr/local/share/perl5/Bucardo.pm file at lines 5397 and 5330 with the following code:
$dbh->do(q{select rds_session_replication_role('replica');}); ## Assumes a sane default !
From.
$dbh->do(q{SET session_replication_role = default}); ## Assumes a sane default !

Add databases to bucardo

Add the databases into bucardo , to be part of replication and create a database group.

postgres@bucardo $ bucardo add db pgdb95 dbhost=192.168.159.137 dbport=5432 dbname=compliance dbuser=bucardo dbpass=postgres
Added database "pgdb95"
postgres@bucardo $ bucardo add db pgdb11 dbhost=192.168.159.130 dbport=5432 dbname=compliance dbuser=bucardo dbpass=postgres
Added database "pgdb11"
postgres@bucardo $
postgres@bucardo $ bucardo list db
Database: pgdb11  Status: active  Conn: psql -p 5432 -U bucardo -d compliance -h 192.168.159.130
Database: pgdb95  Status: active  Conn: psql -p 5432 -U bucardo -d compliance -h 192.168.159.137
postgres@bucardo $
postgres@bucardo $ bucardo add dbgroup migration_group
Created dbgroup "migration_group"
postgres@bucardo $
[root@bucardo ~]# bucardo add dbgroup migration_group pgdb95:source
Added database "pgdb95" to dbgroup "migration_group" as source
[root@bucardo ~]# bucardo add dbgroup migration_group pgdb11:target
Added database "pgdb11" to dbgroup "migration_group" as target
postgres@bucardo $

Add tables and sequences to be part of replication

We can add some specific tables to be part of replication process. Here I am adding all tables because of the requirement. This will not start replication it just note them as herds(relation groups).

[root@bucardo ~]# bucardo add all tables db=pgdb95 relgroup=migration_rel_group
Creating relgroup: migration_rel_group
Added table public.pgbench_branches to relgroup migration_rel_group
Added table public.pgbench_tellers to relgroup migration_rel_group
Added table public.pgbench_history to relgroup migration_rel_group
Added table public.pgbench_accounts to relgroup migration_rel_group
New tables added: 0
Already added: 4
[root@bucardo ~]#
[root@bucardo ~]# bucardo add all sequences db=pgdb95 relgroup=migration_rel_group --verbose
Relgroup already exists: migration_rel_group
Added sequence public.pgbench_history_sid_seq to relgroup migration_rel_group
New sequences added: 0
Already added: 1

[root@bucardo ~]# bucardo list herd
Relgroup: migration_rel_group  DB: pgdb95  Members: public.pgbench_accounts, public.pgbench_branches, public.pgbench_history, public.pgbench_history_sid_seq, public.pgbench_tellers

Create sync

Sync is the one which will start the replication by creating the triggers and other processes. Here I am creating the sync with “autokick=0” which means it will make a note of the changes (insert, update, delete) but not replicate the changes.

postgres@bucardo $ bucardo add sync migration_sync dbs=migration_group herd= migration_rel_group onetimecopy=2 autokick=0
Added sync "migration_sync"

postgres@bucardo $ bucardo list sync
Sync "migration_sync"  Relgroup "migration" [Active]
  DB group "migration_group" pgdb11:target pgdb95:source

Now take a backup of data from source database and copy it to the destination. After data copy is completed, start the replication process.

# Copy the data
pg_dump -v -U postgres -h 192.168.159.137 -d compliance --data-only --disable-triggers -N bucardo | PGOPTIONS='-c session_replication_role=replica' psql compliance
# Start replication
bucardo -U bucardo -d bucardo -P postgres update sync migration_sync autokick=1
# start bucardo
[root@bucardo bucardo]# bucardo start
Checking for existing processes
Removing file "/var/run/bucardo/fullstopbucardo"
Starting Bucardo

Check the bucardo status

[root@bucardo bucardo]# bucardo status
PID of Bucardo MCP: 11239
 Name             State    Last good    Time    Last I/D    Last bad    Time
================+========+============+=======+===========+===========+=======
 migration_sync | Good   | 06:55:33   | 52s   | 0/0       | none      |

# to check the status of the sync
postgres@bucardo $ bucardo status migration_sync
======================================================================
Sync name                : migration_sync
Current state            : No records found
Source relgroup/database : migration / pgdb95
Tables in sync           : 5
Status                   : Active
Check time               : None
Overdue time             : 00:00:00
Expired time             : 00:00:00
Stayalive/Kidsalive      : Yes / Yes
Rebuild index            : No
Autokick                 : Yes
Onetimecopy              : Yes
Post-copy analyze        : Yes
Last error:              :

Once the databases are in sync we can proceed to take the down time. stop the app, check the status of replication again and stop and start the bucardo to replicate the sequences. I observed that sequences are not getting replicated immediately only after restarting bucardo I observed them getting replicated. After that start the application to connect to the new instance.

errors:

Can’t locate Sys/Syslog.pm in @INC


[root@localhost perl5]# bucardo start
Checking for existing processes
Can’t locate Sys/Syslog.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at /usr/local/share/perl5/Bucardo.pm line 32.
BEGIN failed–compilation aborted at /usr/local/share/perl5/Bucardo.pm line 32.
Compilation failed in require at /usr/local/bin/bucardo line 836.

Solution:
yum install perl-Sys-Syslog

Reference:

https://aws.amazon.com/blogs/database/migrating-legacy-postgresql-databases-to-amazon-rds-or-aurora-postgresql-using-bucardo/

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s