VACUUM FULL or CLUSTER in online using pg_repack
pg_repack is a PostgreSQL extension which lets you remove bloat from tables and indexes, and optionally restore the physical order of clustered indexes. Unlike CLUSTER and VACUUM FULL it works online, without holding an exclusive lock on the processed tables during processing. pg_repack is efficient to boot, with performance comparable to using CLUSTER directly.
One caveat is though that it requires about double the size of the target table and its indexes, and the target table must have a PRIMARY KEY, or at least a UNIQUE total index on a NOT NULL column.
Requirements:
PostgreSQL versions
PostgreSQL 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 10
Disk space
Performing a full-table repack requires free disk space about twice as large as the target table(s) and its indexes. For example, if the total size of the tables and indexes to be reorganized is 1GB, an additional 2GB of disk space is required.
Installation:
[postgres@localhost ~]$ unzip pg_repack-1.4.2.zip
[postgres@localhost ~]$ cd pg_repack-1.4.2/
[postgres@localhost pg_repack-1.4.2]$
[postgres@localhost pg_repack-1.4.2]$ make
make[1]: Entering directory `/home/postgres/pg_repack-1.4.2/bin’
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -I/home/postgres/SOFTWARE_9.6/include -DREPACK_VERSION=1.4.2 -I. -I./ -I/home/postgres/SOFTWARE_9.6/include/server -I/home/postgres/SOFTWARE_9.6/include/internal -D_GNU_SOURCE -c -o pg_repack.o pg_repack.c
………………….
………………….
make[1]: Entering directory `/home/postgres/pg_repack-1.4.2/regress’
make[1]: Nothing to be done for `all’.
make[1]: Leaving directory `/home/postgres/pg_repack-1.4.2/regress’
[postgres@localhost pg_repack-1.4.2]$ make install
make[1]: Entering directory `/home/postgres/pg_repack-1.4.2/bin’
/bin/mkdir -p ‘/home/postgres/SOFTWARE_9.6/bin’
…………………..
…………………..
/bin/install -c -m 644 .//pg_repack.control ‘/home/postgres/SOFTWARE_9.6/share/extension/’
/bin/install -c -m 644 pg_repack–1.4.2.sql pg_repack.control ‘/home/postgres/SOFTWARE_9.6/share/extension/’
make[1]: Leaving directory `/home/postgres/pg_repack-1.4.2/lib’
make[1]: Entering directory `/home/postgres/pg_repack-1.4.2/regress’
make[1]: Nothing to be done for `install’.
make[1]: Leaving directory `/home/postgres/pg_repack-1.4.2/regress’
[postgres@localhost pg_repack-1.4.2]$
[postgres@localhost pg_repack-1.4.2]$ psql
psql (9.6.6)
Type “help” for help.
postgres=# create extension pg_repack ;
CREATE EXTENSION
postgres=#
pg_repack command options
–all repack all databases
–table=TABLE repack specific table only
–parent-table=TABLE repack specific parent table and its inheritors
–schema=SCHEMA repack tables in specific schema only
–tablespace=TBLSPC move repacked tables to a new tablespace
–moveidx move repacked indexes to TBLSPC too
–order-by=COLUMNS order by columns instead of cluster keys
–no-order do vacuum full instead of cluster
–dry-run print what would have been repacked and exit
–jobs=NUM Use this many parallel jobs for each table
–index=INDEX move only the specified index
–only-indexes move only indexes of the specified table
–wait-timeout=SECS timeout to cancel other backends on conflict
–host=HOSTNAME
–port=PORT
–username=USERNAME
Example :
Repack a table(non-clustered) :
pg_repack -t test_rpk postgres
Reoack a cluster table without clustering(ordering) :
pg_repack –no-order -t test_rpk postgres
Like this we can do for schema , indexes , move to different tablespaces without taking locks for much time compare to remaining methods like VACUUM FULL and CLUSTER.
How repack works internally :
Full Table Repacks:
To perform a full-table repack, pg_repack will:
*) create a log table to record changes made to the original table
*) add a trigger onto the original table, logging INSERTs, UPDATEs and DELETEs into our log table
*) create a new table containing all the rows in the old table
*) build indexes on this new table
*) apply all changes which have accrued in the log table to the new table
*) swap the tables, including indexes and toast tables, using the system catalogs
*) drop the original table
*) pg_repack will only hold an ACCESS EXCLUSIVE lock for a short period during initial setup (steps 1 and 2 above) and during the final swap-and-drop phase (steps 6 and 7). For the rest of its time, pg_repack only needs to hold an ACCESS SHARE lock on the original table, meaning INSERTs, UPDATEs, and DELETEs may proceed as usual.
Index Only Repacks:
To perform an index-only repack, pg_repack will:
*) create new indexes on the table using CONCURRENTLY matching the definitions of the old indexes
*) swap out the old for the new indexes in the catalogs
*) drop the old indexes