Sometimes it is required to install packages in different locations due to permission issues or for security reasons etc. To install the package in different location using the “make install” use below command
This is very typical requirement we get in DBA world where we need to copy tables from one schema to another schema. In Oracle we have a option like remap_schema while doing restoration which handles this automatically. But in PostgreSQL we don’t have such parameter.
Below is the way how I have done that in my dev environment.
PostgreSQL Version is 9.6
Source schema : s_schema (Database name : noncompliance) Destination schema : d_schema (Database name : compliance)
Scenario 1:
If the source schema does not exists in the destination database but destination schema is there and it has some objects.
Restore the complete schema
Change the schema for all restored tables
---- Take dump in custom format
pg_dump -Fc -n s_schema -f s_schema.dmp noncompliance
---- Restore the schema in destination database
pg_restore -Fc -d compliance s_schema.dmp
---- Change the schema for restored tables
DO
$$
DECLARE r record;
BEGIN
FOR r IN SELECT 'ALTER TABLE s_schema.'||tablename||' SET SCHEMA d_schema' as query FROM pg_tables WHERE schemaname = 's_schema'
LOOP
EXECUTE r.query;
END LOOP;
END$$;
----- We can see the schema of the table is changed
compliance=# \dt d_schema.*
List of relations
Schema | Name | Type | Owner
----------+------------------+-------+----------
d_schema | pgbench_accounts | table | postgres
d_schema | pgbench_branches | table | postgres
d_schema | pgbench_history | table | postgres
d_schema | pgbench_tellers | table | postgres
(4 rows)
Scenario 2
If the source schema already exists in the destination database and assume it has other objects too.
Take the backup of source schema s_schema
replace the s_schema word in the backup file with d_schema (user data is in binary format only the schema, table definition are in string format so this works. I tested by inserting s_schema name in a varchar column)
Restore the file
---- Take backup from the source database
pg_dump -Fc -n s_schema -f s_schema.dmp noncompliance
---- Replace the s_schema with d_schema
sed -i 's/s_schema/d_schema/g' s_schema.dmp
---- Restore the file
pg_restore -Fc -n d_schema -d compliance s_schema.dmp
---- The objects are restored directly into the destination schema
compliance=# \dt d_schema.*
List of relations
Schema | Name | Type | Owner
----------+------------------+-------+----------
d_schema | pgbench_accounts | table | postgres
d_schema | pgbench_branches | table | postgres
d_schema | pgbench_history | table | postgres
d_schema | pgbench_tellers | table | postgres
d_schema | student | table | postgres
(5 rows)
NOTE:
Very careful when running the sed command to replace the string as it is a binary file there are some chances it can corrupt the file and you can get below error.
-bash-4.2$ pg_restore -Fc -c --if-exists -t student -d compliance s_schema.dmp
pg_restore: [custom archiver] could not read from input file: end of file
In this post we will discuss about a PostgreSQL performance issue on my Linux box. The issue is CPU utilization reached 100% and good amount of I/O on the system.
This is how I started troubleshooting the issue. I started by checking the load average on the server.
What is Load average and how to check ?
Load average is the number of processes/threads that are running on CPU and waiting for CPU or IO or uninterruptable sleep.
It conventionally appears in the form of three numbers which represent the system load during the last one-, five-, and fifteen-minute periods.
The first column “run queue” is more than my CPU count so there is load on the system(more processes to run than my cpu count).
I can see there is high block out activity (disk writing is high).
High context switching and interrupts
I started with finding the cause of I/O.
You can check the disk activity using “iostat -mx” command, we can see the disk write activity is more.
Now I executed the “iotop” command to pinpoint which process is doing more writes. Below is the information about the same.
I can see the Wal writer, check pointer and writer processes are doing heavy writes in my system. It means there is high transactions (DML activity) in my system which are causing more pressure on writer and check pointer processes.
I opened my PostgreSQL error logs and found that the checkpoints are happening too frequently due to heavy update commands running on my machine.
2020-04-12 05:37:19.809 PDT,,,1071,,5e90791e.42f,4,,2020-04-10 06:48:14 PDT,,0,LOG,00000,"checkpoints are occurring too frequently (11 seconds apart)",,"Consider increasing the configuration parameter ""max_wal_size"".",,,,,,,""
2020-04-12 05:37:28.649 PDT,,,1071,,5e90791e.42f,5,,2020-04-10 06:48:14 PDT,,0,LOG,00000,"checkpoints are occurring too frequently (9 seconds apart)",,"Consider increasing the configuration parameter ""max_wal_size"".",,,,,,,""
I have changed the checkpoint related parameters
postgres=# alter system set max_wal_size = '10GB';
ALTER SYSTEM
postgres=# alter system set checkpoint_timeout = '30min';
ALTER SYSTEM
I can still still the wal writer and writer processes doing more writes
Writer process will do writes during below times.
When the check pointer process invokes it at checkpoint timeout or max_wal_size is reached.
when the shared_buffers is getting filled too frequently.
Also, writer process wakeup for every bgwriter_delay time and write the data from shared buffers to disk.
I verified the shared buffers setting and found that it is not enough and increased it. Also, I changed the wal writer parameters.
alter system set wal_buffers to '16MB';
alter system set shared_buffers to '2GB';
Now the I/O by writer and check pointer is not so frequent but there is high CPU utilization.
After that I started checking the context switching, which high on the system. As you can see in below image on average it is 590 but after the activity started it reached to 2000+ and same for system interrupts. If you want to know more about the vmstat command check
What is a context switch?
context switch is the process of storing the state of a process or thread, so that it can be restored and resume execution at a later point. This allows multiple processes to share a single CPU, and is an essential feature of a multitasking operating system.
Also, I observed in the top, pg_stat_activity where there are lot of shot lived transactions, which means there are multiple processes getting created and closed which means there is high load on the system. But I implemented the connection pooling (pgbouncer) to reduce the connection creation overhead.
Finally, the CPU utilization is reduced from 100 to ~70%. But still the load average is not reduced as expected and see more processes in the run queue as we see in vmstat output.
Finally we need to increase the CPU count to meet the load on the system.