Category Archives: AWS

port 5432 failed: FATAL: PAM authentication failed for user

I faced this issue in RDS PostgreSQL after granting all roles to the postgres user. To resolve it, run the command below:

revoke rds_iam from postgres;
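
Before revoking, you can confirm that the postgres user really picked up the rds_iam role. A quick sketch using only the standard catalogs:

--- list the roles granted to the postgres user
select r.rolname as granted_role
from pg_auth_members m
join pg_roles r on r.oid = m.roleid
join pg_roles u on u.oid = m.member
where u.rolname = 'postgres';

If rds_iam shows up in the list, that membership is what forces IAM authentication for the user and breaks password (PAM) logins until it is revoked.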

Query to kill idle connections in PostgreSQL

PostgreSQL has a process-oriented architecture: a backend process is created for every user connection. If a user forgets to close a connection, that process keeps consuming resources. Below is the query to close idle connections.

--- Close the connections which are idle for more than 15 mins

SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'databasename'
   AND pid <> pg_backend_pid()
   AND state = 'idle'
   and extract(epoch from (current_timestamp - query_start)) > 15*60;
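
To preview which sessions would be affected before killing them, the same filters can be reused in a plain select (a sketch based on the conditions above):

--- List the sessions that would be terminated
SELECT pid, usename, application_name, state, query_start
FROM pg_stat_activity
WHERE datname = 'databasename'
   AND pid <> pg_backend_pid()
   AND state = 'idle'
   AND extract(epoch from (current_timestamp - query_start)) > 15*60;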

Error: error configuring Terraform AWS Provider: error validating provider credentials: error calling sts:GetCallerIdentity: SignatureDoesNotMatch: Signature expired: is now earlier than (- 15 min.)

While configuring Terraform on my server, I got the error below.

Error: error configuring Terraform AWS Provider: error validating provider credentials: error calling sts:GetCallerIdentity: SignatureDoesNotMatch: Signature expired: 20210206T042543Z is now earlier than 20210206T042641Z (20210206T044141Z – 15 min.)

The error was due to a time misconfiguration on my server. When an API call is made, AWS compares the request's signature timestamp with its own clock; if the difference is within the allowed window (15 minutes, as the error message shows) the call is accepted, otherwise it fails with the above error. Syncing the server clock (for example with NTP) resolved the issue.

Load AWS metrics to PostgreSQL

AWS provides a lot of metrics about the resource utilization of EC2/RDS instances, which are useful for analyzing system utilization. There are different ways to read these metrics, such as the AWS command line or the Python boto3 library.

To know the resource utilization trend of an instance we need to keep these metrics for the long term. As it is costly to store them in CloudWatch for long periods, it is better to copy the data to a database, which makes it easier to find utilization trends and helps with capacity planning.

Below is a simple way to load AWS metrics of an RDS instance into PostgreSQL. PostgreSQL's bulk-loading command COPY can read data directly from other programs (FROM PROGRAM), which makes it easy to load the CloudWatch CLI output.

Here I create three tables, for write IOPS, read IOPS and CPU utilization, and load the respective metrics into them. The number of days ago to collect is passed as the script's first argument.

#!/bin/bash
export PGPASSWORD='**********'
# The day to collect is passed as the first argument (number of days ago)
START_TIME="$(date +%Y-%m-%d --date="$1 days ago")T00:00:00Z"
END_TIME="$(date +%Y-%m-%d --date="$1 days ago")T23:59:59Z"
psql -h 10.12.34.56 -U viswamitra -d postgres -c "create table if not exists write_iops(datapoint varchar,value decimal,gather_time timestamp,unit varchar)" -c "\copy write_iops from PROGRAM 'aws cloudwatch get-metric-statistics --metric-name WriteIOPS --start-time $START_TIME --end-time $END_TIME --period 60 --namespace AWS/RDS --statistics Average --dimensions Name=DBInstanceIdentifier,Value=testinstance1 --output text | tail -n +2'"
psql -h 10.12.34.56 -U viswamitra -d postgres -c "create table if not exists read_iops(datapoint varchar,value decimal,gather_time timestamp,unit varchar)" -c "\copy read_iops from PROGRAM 'aws cloudwatch get-metric-statistics --metric-name ReadIOPS --start-time $START_TIME --end-time $END_TIME --period 60 --namespace AWS/RDS --statistics Average --dimensions Name=DBInstanceIdentifier,Value=testinstance1 --output text | tail -n +2'"
psql -h 10.12.34.56 -U viswamitra -d postgres -c "create table if not exists cpu_utilization(datapoint varchar,value decimal,gather_time timestamp,unit varchar)" -c "\copy cpu_utilization from PROGRAM 'aws cloudwatch get-metric-statistics --metric-name CPUUtilization --start-time $START_TIME --end-time $END_TIME --period 60 --namespace AWS/RDS --statistics Average --dimensions Name=DBInstanceIdentifier,Value=testinstance1 --output text | tail -n +2'"
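
Once the data is loaded, trend reports become simple SQL. For example, a rough sketch of a daily CPU trend from the cpu_utilization table created above:

--- daily average and peak CPU utilization
select date_trunc('day', gather_time) as day,
       round(avg(value), 2) as avg_cpu,
       max(value) as max_cpu
from cpu_utilization
group by 1
order by 1;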

PostgreSQL idle sessions filled my disk space

We have an internal application that uses PostgreSQL as its backend with application-side connection pooling. After a few days we started seeing a peculiar issue: the disk allocated for the database kept filling up and the application ran very slowly. After restarting the application the space was released and the application worked as expected.

After checking which tables were consuming the most space, we found it was temporary tables (to know more about temporary tables, see the post below). Temporary tables are dropped once the session is closed, but because the application connection pool keeps sessions open and was not managed properly, the temporary tables were never cleaned up. A size query like the sketch below is what pointed us at them.
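
The following sketch lists the largest relations along with their schema, so the pg_temp_* schemas stand out; it assumes only the standard catalogs and size functions:

--- top 20 relations by total size, including temporary schemas
select n.nspname as schema_name,
       c.relname as relation_name,
       pg_size_pretty(pg_total_relation_size(c.oid)) as total_size
from pg_class c
join pg_namespace n on n.oid = c.relnamespace
where c.relkind = 'r'
order by pg_total_relation_size(c.oid) desc
limit 20;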

Before returning a connection to the pool (or before reusing it), it is recommended to execute “DISCARD ALL”, which drops the temporary tables and releases the other resources allocated in that session.

test_temp=# create temporary table test_emp(id int, name varchar);
 CREATE TABLE
 test_temp=#
 test_temp=# insert into test_emp values (1,'Viswamitra');
 INSERT 0 1
 test_temp=#
 test_temp=# select * from test_emp ;
  id |    name
 ----+------------
   1 | Viswamitra
 (1 row)
 test_temp=#
 test_temp=# discard all;
 DISCARD ALL
 test_temp=# select * from test_emp ;
 ERROR:  relation "test_emp" does not exist
 LINE 1: select * from test_emp ;
                       ^
 test_temp=#

In the Tomcat connection pool configuration, set the validationQuery to “discard all”, which releases the resources occupied by the session before the connection is returned to the pool.

Reference:

PostgreSQL: Documentation: 11: DISCARD

Temporary tables in PostgreSQL

Temporary tables are short-lived tables that are dropped at the end of the session or transaction. They are used to store intermediate data. Temporary tables are created the same way as permanent tables; we just add TEMP or TEMPORARY to the CREATE TABLE statement. Below is the syntax to create temporary tables:

CREATE [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } TABLE table_name
(
  col1 datatype,
  col2 datatype,
  .....
  .....
) [ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
[ TABLESPACE tablespace_name ]

Databases like Oracle or SQL Server support global temporary tables, where a temporary table is created once and used in all sessions; PostgreSQL does not support this yet. The GLOBAL keyword is accepted only for SQL-standard compatibility and behaves exactly like LOCAL.

Temporary tables are visible only within the session that created them; sessions cannot see each other's temporary tables. A temporary table can have the same name as a permanent table, although this is not recommended. When you access a table without schema-qualifying it, the temporary table takes priority over the permanent one.

In the example below, the temporary table named emp takes priority over the permanent emp table. To access the permanent table, use the fully qualified name (schema.table).

 postgres=# create database test_temp;
 CREATE DATABASE
 postgres=# \c test_temp
 psql (12.4, server 10.14)
 You are now connected to database "test_temp" as user "postgres".
 test_temp=# create table emp(id int, name varchar);
 CREATE TABLE
 test_temp=# insert into emp values (1,'Viswamitra');
 INSERT 0 1
 test_temp=# create temporary table emp (id int, name varchar);
 CREATE TABLE 
 test_temp=# \dt
           List of relations
   Schema   | Name | Type  |  Owner   
 -----------+------+-------+----------
  pg_temp_4 | emp  | table | postgres
 (1 row)
 test_temp=# show search_path ;
    search_path   
 "$user", public
 (1 row)
 test_temp=# \dt public.*
         List of relations
  Schema | Name | Type  |  Owner   
 --------+------+-------+----------
  public | emp  | table | postgres
 (1 row)
 test_temp=# select * from emp;
  id | name 
 ----+------
 (0 rows)
 test_temp=# select * from public.emp;
  id |    name    
 ----+------------
   1 | Viswamitra
 (1 row)

Temporary tables are created in a special schema with the naming convention “pg_temp_nnn”; they cannot be created in non-temporary schemas. When a query is executed, PostgreSQL first looks for a matching table in the session's pg_temp_nnn schema and then in the schemas listed in the search_path variable.

postgres=# create temporary table public.test(id int);
 ERROR:  cannot create temporary relation in non-temporary schema
 LINE 1: create temporary table public.test(id int);
                                ^

By default, temporary tables (the files that store their data) are created in the default tablespace pg_default. If a tablespace is created and set in the temp_tablespaces parameter, temporary tables will use that tablespace instead.

 test_temp=# create temporary table test_size(id int, name varchar);
 CREATE TABLE
 test_temp=# select pg_relation_filepath('test_size');
  pg_relation_filepath
 base/90113/t4_90140
 (1 row)
 test_temp=# \! du -sh /pgdata/10/data/base/90113/t4_90140
 0       /pgdata/10/data/base/90113/t4_90140
 test_temp=# insert into test_size select generate_series(1,10000),md5(generate_series(1,10000)::text);
 INSERT 0 10000
 test_temp=# \! du -sh /pgdata/10/data/base/90113/t4_90140
 960K    /pgdata/10/data/base/90113/t4_90140
 test_temp=#
 test_temp=# \dt+ test_size
                         List of relations
   Schema   |   Name    | Type  |  Owner   |  Size  | Description
 -----------+-----------+-------+----------+--------+-------------
  pg_temp_4 | test_size | table | postgres | 704 kB |
 (1 row)
 test_temp=# \! mkdir /tmp/test
 test_temp=# create tablespace test location '/tmp/test';
 CREATE TABLESPACE
 test_temp=# create temporary table test_size_1(id int, name varchar) tablespace test;
 CREATE TABLE
 test_temp=# select pg_relation_filepath('test_size_1');
               pg_relation_filepath
 pg_tblspc/90147/PG_10_201707211/90113/t4_90148
 (1 row)
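
Instead of naming the tablespace on every CREATE TEMPORARY TABLE, the temp_tablespaces parameter mentioned earlier can route all temporary objects of a session to it. A small sketch using the test tablespace created above:

--- route temporary objects of this session to the "test" tablespace
set temp_tablespaces = 'test';
create temporary table test_size_2(id int, name varchar);
select pg_relation_filepath('test_size_2');   --- the file path should now point under pg_tblspc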

Indexes can be created on temporary tables; such indexes are themselves temporary and are dropped along with the table. The autovacuum process cannot handle temporary tables, so it is highly recommended to ANALYZE a temporary table before using it in complex queries, as in the sketch below.
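
A minimal sketch (the table and column names are made up for illustration):

create temporary table temp_orders(id int primary key, customer_id int);
insert into temp_orders select g, g % 100 from generate_series(1,10000) g;
create index on temp_orders(customer_id);   --- index lives in the temp schema, dropped with the table
analyze temp_orders;                        --- collect statistics manually, autovacuum will not do it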

temp_buffers is a session-level memory area allocated in each session to access temporary tables; by default its size is 8 MB. If the temporary tables are larger than temp_buffers, the query will start using disk to process the data. The setting can be raised per session, as shown below.
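
A sketch; note that temp_buffers can only be changed before the first temporary-table access in the session:

set temp_buffers = '64MB';   --- must run before any temporary table is touched in this session
show temp_buffers;
create temporary table big_temp as select generate_series(1,100000) as id;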

Snowflake monitoring queries

This post describes some useful queries that help in monitoring a Snowflake account. There are two schemas in a Snowflake account that provide the functions and views for monitoring:

information_schema

Provides a set of system-defined views and table functions with near real-time statistics about queries. This data is typically retained for 7 to 14 days and is most useful for monitoring ongoing queries.

account_usage

Provides a rich set of historical metadata that is retained for up to a year and is the primary source for usage analysis over time.

Check currently running queries

select * from table(information_schema.query_history())  where execution_status  in ('RESUMING_WAREHOUSE','RUNNING','QUEUED','BLOCKED');

Queries currently running for a user

select * from table(information_schema.QUERY_HISTORY_BY_USER(USER_NAME => '<usr name>'))  where execution_status  in ('RESUMING_WAREHOUSE','RUNNING','QUEUED','BLOCKED');

Time consuming queries

Find top 10 time consuming queries

--- Queries taking more than 10 sec
SET long_running_threshold_seconds = 10;

SELECT * FROM TABLE(information_schema.query_history()) WHERE (execution_status = 'RUNNING' AND datediff('seconds',start_time,current_timestamp()) > $long_running_threshold_seconds);

--- To get the history , list top 10 queries
use schema snowflake.account_usage;

select QUERY_TEXT,DATABASE_NAME,SCHEMA_NAME,sum(TOTAL_ELAPSED_TIME)/1000 as Execution_time_sec
from query_history
group by QUERY_TEXT,DATABASE_NAME,SCHEMA_NAME 
order by sum(TOTAL_ELAPSED_TIME) desc
limit 10;

Queries consuming the most cloud services credits

use schema snowflake.account_usage;

select QUERY_TEXT,DATABASE_NAME,SCHEMA_NAME,USER_NAME,sum(CREDITS_USED_CLOUD_SERVICES) as credits_used
from query_history
group by QUERY_TEXT,DATABASE_NAME,SCHEMA_NAME,USER_NAME
order by sum(CREDITS_USED_CLOUD_SERVICES) desc
limit 10;

Queries doing full partition scans

use schema snowflake.account_usage;

select QUERY_TEXT,DATABASE_NAME,SCHEMA_NAME,USER_NAME,PARTITIONS_SCANNED,PARTITIONS_TOTAL
from query_history
where PARTITIONS_SCANNED=PARTITIONS_TOTAL
and PARTITIONS_TOTAL > 0
order by PARTITIONS_TOTAL desc;

Lock Monitoring in Snowflake

select * from table(information_schema.query_history())  where execution_status  in ('BLOCKED');

To see more information about the blocked sessions, execute the “show locks in account” command. It shows the sessions that are blocking each other, with status “HOLDING” or “WAITING”.

The “show transactions in account” command shows the currently running transactions across the account along with their status.

To cancel either of the transactions (the blocking or the blocked one):

select system$abort_transaction(<transaction id>);
select system$abort_transaction(1594638696404);
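
If only a single statement needs to be stopped rather than the whole transaction, Snowflake also provides a SYSTEM$CANCEL_QUERY function that takes the query ID (a sketch with a placeholder ID):

select system$cancel_query('<query id>');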

Check query execution plan

If you find a long-running query and want to know its execution plan, you can get it in two ways:

  • using explain command
  • using query profile

EXPLAIN Command

The explain command will give the logical plan of a query.  EXPLAIN compiles the SQL statement, but does not execute it, so EXPLAIN does not require a running warehouse. Although EXPLAIN does not consume any compute credits, the compilation of the query does consume Cloud Service credits, just as other metadata operations do.

use schema snowflake_sample_data.tpch_sf1;

--- Check the query execution plan in text format
explain using text select
       l_returnflag,
       l_linestatus,
       sum(l_quantity) as sum_qty,
       sum(l_extendedprice) as sum_base_price,
       sum(l_extendedprice * (1-l_discount)) as sum_disc_price,
       sum(l_extendedprice * (1-l_discount) * (1+l_tax)) as sum_charge,
       avg(l_quantity) as avg_qty,
       avg(l_extendedprice) as avg_price,
       avg(l_discount) as avg_disc,
       count(*) as count_order
 from
       lineitem
 where
       l_shipdate <= dateadd(day, -90, to_date('1998-12-01'))
 group by
       l_returnflag,
       l_linestatus
 order by
       l_returnflag,
       l_linestatus;

GlobalStats:
    partitionsTotal=10
    partitionsAssigned=10
    bytesAssigned=165278208
Operations:
1:0     ->Result  LINEITEM.L_RETURNFLAG, LINEITEM.L_LINESTATUS, SUM(LINEITEM.L_QUANTITY), SUM(LINEITEM.L_EXTENDEDPRICE), SUM(LINEITEM.L_EXTENDEDPRICE * (1 - LINEITEM.L_DISCOUNT)), SUM((LINEITEM.L_EXTENDEDPRICE * (1 - LINEITEM.L_DISCOUNT)) * (1 + LINEITEM.L_TAX)), (SUM(LINEITEM.L_QUANTITY)) / (COUNT(LINEITEM.L_QUANTITY)), (SUM(LINEITEM.L_EXTENDEDPRICE)) / (COUNT(LINEITEM.L_EXTENDEDPRICE)), (SUM(LINEITEM.L_DISCOUNT)) / (COUNT(LINEITEM.L_DISCOUNT)), COUNT(*)  
1:1          ->Sort  LINEITEM.L_RETURNFLAG ASC NULLS LAST, LINEITEM.L_LINESTATUS ASC NULLS LAST  
1:2               ->Aggregate  aggExprs: [SUM(LINEITEM.L_QUANTITY), SUM(LINEITEM.L_EXTENDEDPRICE), SUM(LINEITEM.L_EXTENDEDPRICE * (1 - LINEITEM.L_DISCOUNT)), SUM((LINEITEM.L_EXTENDEDPRICE * (1 - LINEITEM.L_DISCOUNT)) * (1 + LINEITEM.L_TAX)), COUNT(LINEITEM.L_QUANTITY), COUNT(LINEITEM.L_EXTENDEDPRICE), SUM(LINEITEM.L_DISCOUNT), COUNT(LINEITEM.L_DISCOUNT), COUNT(*)], groupKeys: [LINEITEM.L_RETURNFLAG, LINEITEM.L_LINESTATUS]  
1:3                    ->Filter  LINEITEM.L_SHIPDATE <= '1998-09-02'  
1:4                         ->TableScan  SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.LINEITEM  L_QUANTITY, L_EXTENDEDPRICE, L_DISCOUNT, L_TAX, L_RETURNFLAG, L_LINESTATUS, L_SHIPDATE  {partitionsTotal=10, partitionsAssigned=10, bytesAssigned=165278208}

NOTE: The EXPLAIN plan is the “logical” explain plan. It shows the operations that will be performed, and their logical relationship to each other. The actual execution order of the operations in the plan does not necessarily match the logical order shown by the plan.

Query profile

Query Profile, available through the Snowflake web interface, provides execution details for a query.

Go to History => click on the Query ID => this shows the query details and the execution plan along with timing information.

Get query plan using query_id

You can get the query ID from the query_history view or from the History page in the GUI.

--- To get the plan in text format
select SYSTEM$EXPLAIN_JSON_TO_TEXT(system$explain_plan_json('01958e99-0187-c08e-0000-00214103b041')) as explain_plan;

GlobalStats:
	bytesAssigned=0
	partitionsAssigned=0
	partitionsTotal=0
Operations:
1:0     ->Result  TEST_123.ID  
1:1          ->TableScan  LOCK_TEST.PUBLIC.TEST_123  ID  {partitionsTotal=0, partitionsAssigned=0, bytesAssigned=0}

--- To get the plan in json format
select system$explain_plan_json('01958e99-0187-c08e-0000-00214103b041') as explain_plan;

{"GlobalStats":{"partitionsTotal":0,"partitionsAssigned":0,"bytesAssigned":0},"Operations":[[{"id":0,"operation":"Result","expressions":["TEST_123.ID"]},{"id":1,"parent":0,"operation":"TableScan","objects":["LOCK_TEST.PUBLIC.TEST_123"],"expressions":["ID"],"partitionsAssigned":0,"partitionsTotal":0,"bytesAssigned":0}]]}

Login history

Snowflake maintains user login history for 1 year, and it also contains failed login attempts. You have two methods to read the login history:

  • login_history function in information_schema
  • login_history in account_usage schema

information_schema maintains the data for the last 7 days.

--- login history for last 24 hours
select * from table(information_schema.login_history(dateadd('hours',-24,current_timestamp()),current_timestamp())) order by event_timestamp;

--- login history of a user in the last 24 hours
select * from table(information_schema.login_history_by_user('<USERNAME>',dateadd('hours',-24,current_timestamp()),current_timestamp())) order by event_timestamp;

If you need login history beyond 7 days, use the snowflake.account_usage.login_history view. There is a delay of up to 2 hours before data appears in this view.

select * from snowflake.account_usage.login_history order by event_timestamp;
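
Since this view also records failed attempts, it is handy for a quick security check. A sketch listing recent failed logins, using the columns documented for account_usage.login_history:

--- failed login attempts in the last 7 days
select event_timestamp, user_name, client_ip, error_message
from snowflake.account_usage.login_history
where is_success = 'NO'
  and event_timestamp > dateadd('day', -7, current_timestamp())
order by event_timestamp desc;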

NOTE: The query_id in query_history is unique for each run; the same query text can have different query_ids across runs.

PostgreSQL upgrade to 12 failed due to earth distance extension

While upgrading RDS PostgreSQL from 9.5 to 12, I got the error below, which is a little confusing because it is not listed in the prerequisites documentation.

"/rdsdbbin/postgres/bin/pg_restore" --host /rdsdbdata/log/upgrade --port 8193 --username rdsadmin --create --exit-on-error --verbose --dbname template1 "pg_upgrade_dump_1.custom" >> "pg_upgrade_dump_1.log" 2>&1

*failure*
There were problems executing ""/rdsdbbin/postgres/bin/pg_restore" --host /rdsdbdata/log/upgrade --port 8193 --username rdsadmin --create --exit-on-error --verbose --dbname template1 "pg_upgrade_dump_16395.custom" >> "pg_upgrade_dump_16395.log" 2>&1"
Consult the last few lines of "pg_upgrade_dump_16395.log" for
the probable cause of the failure.
"/rdsdbbin/postgres/bin/pg_ctl" -w -D "/rdsdbdata/db" -o "--config_file=/rdsdbdata/config_new/postgresql.conf" -m fast stop >> "pg_upgrade_server.log" 2>&1
child process exited abnormally: status 256
"/rdsdbbin/postgres/bin/pg_ctl" -w -D "/rdsdbdata/db" -o "--config_file=/rdsdbdata/config_new/postgresql.conf" -m fast stop >> "pg_upgrade_server.log" 2>&1
----------------------- END OF LOG ----------------------

Database instance is in a state that cannot be upgraded: pg_restore: from TOC entry 4241; 1259 62347 INDEX addresses_earthdistance_ix f0187642fc72b3a
pg_restore: error: could not execute query: ERROR: type "earth" does not exist
LINE 1: ...ians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth
                                                                      ^
QUERY: SELECT cube(cube(cube(earth()*cos(radians($1))*cos(radians($2))),earth()*cos(radians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth
CONTEXT: SQL function "ll_to_earth" during inlining
Command was: -- For binary upgrade, must preserve pg_class oids
SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('62347'::pg_catalog.oid);
CREATE INDEX "addresses_earthdistance_ix" ON "publicw9l8ylklm8dupgr4zedbfssfgjmnfzwf"."addresses" USING "gist" ("publicw9l8ylklm8dupgr4zedbfssfgjmnfzwf"."ll_to_earth"("latitude", "longitude"));

After digging further, I found there is a known bug causing this issue. So we dropped the index “addresses_earthdistance_ix” before the upgrade and recreated it afterwards, as sketched below.
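
Based on the index definition shown in the log above, the drop/recreate looked roughly like this (a sketch; the schema prefix is omitted):

--- before starting the upgrade
drop index addresses_earthdistance_ix;

--- after the upgrade completes
create index addresses_earthdistance_ix on addresses using gist (ll_to_earth(latitude, longitude));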

PostgreSQL: Re: BUG #15112: Unable to run pg_upgrade with earthdistance extension

Aurora read replica option is not available for RDS PostgreSQL

We tried to migrate our existing RDS PostgreSQL 9.5 instance to Aurora PostgreSQL 11.8, the highest Aurora PostgreSQL version supported at the time. Below are the steps we planned after checking the documentation.

  1. Upgrade the existing RDS PostgreSQL instance to version 11.8
  2. Create an Aurora read replica for the existing RDS PostgreSQL instance
  3. Once both instances are in sync, promote the read replica

When we performed this on a dev instance, we found that RDS does not support Aurora read replica creation for 11.8. After discussing with the AWS team, they confirmed that RDS does not support Aurora read replicas for all PostgreSQL versions. Below are the details, which can help others who want to migrate to Aurora.

RDS PostgreSQL Version | Supported Aurora Read Replica Versions
-----------------------+---------------------------------------
9.6.1                  | 9.6.8, 9.6.9
9.6.2                  | 9.6.8, 9.6.9
9.6.3                  | 9.6.8, 9.6.9
9.6.5                  | 9.6.8, 9.6.9
9.6.6                  | 9.6.8, 9.6.9
9.6.8                  | 9.6.8, 9.6.9
9.6.9                  | 9.6.9
9.6.10                 | 9.6.11, 9.6.12
9.6.11                 | 9.6.11
9.6.12                 | 9.6.12
9.6.14                 | No Aurora Read Replica
9.6.15                 | No Aurora Read Replica
9.6.16                 | No Aurora Read Replica
9.6.17                 | No Aurora Read Replica
9.6.18                 | No Aurora Read Replica
-----------------------+---------------------------------------
10.1                   | 10.4, 10.5, 10.6, 10.7, 10.12
10.3                   | 10.4, 10.5, 10.6, 10.7, 10.12
10.4                   | 10.4, 10.5, 10.6, 10.7, 10.12
10.5                   | 10.5, 10.6, 10.7, 10.12
10.6                   | 10.6, 10.7, 10.12
10.7                   | 10.7, 10.12
10.9                   | 10.11, 10.12
10.10                  | 10.11, 10.12
10.11                  | 10.11, 10.12
10.12                  | 10.12
10.13                  | No Aurora Read Replica
-----------------------+---------------------------------------
11.1                   | 11.4, 11.7
11.2                   | 11.4, 11.7
11.4                   | 11.4, 11.7
11.5                   | 11.6, 11.7
11.6                   | 11.6, 11.7
11.7                   | 11.7
11.8                   | No Aurora Read Replica
-----------------------+---------------------------------------
12.2                   | No Aurora Read Replica
12.3                   | No Aurora Read Replica
