Tuesday, April 30, 2024

Predefined roles introduced in PostgreSQL 14 reduced my responsibilities

 

User management is a crucial part of database administration, and the DBA should assign suitable permissions to users. For example, some users require read-only access to all tables, while others (such as application users) require read/write access.

Now, if I want to give read or read-write access to a user, I need to run a specific set of statements on the database.

Let's take read and read-write access as examples.

Read

1. Grant SELECT on all existing tables in the schema to the user
2. Alter default privileges so future tables are covered
3. Grant USAGE on the schema

Read-Write

1. Grant SELECT, INSERT, UPDATE, DELETE on all existing tables in the schema to the user
2. Alter default privileges so future tables are covered
3. Grant USAGE on the schema
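Put together, the traditional sequence looks like this (a sketch; the schema name public and the role/user names app_read, app_readwrite, and app_user are hypothetical):

```sql
-- Read role: covers existing tables and, via default privileges, future ones
CREATE ROLE app_read;
GRANT USAGE ON SCHEMA public TO app_read;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_read;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO app_read;

-- Read-write role
CREATE ROLE app_readwrite;
GRANT USAGE ON SCHEMA public TO app_readwrite;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_readwrite;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO app_readwrite;

-- Grant the role to a user as needed
GRANT app_read TO app_user;
```

Note that ALTER DEFAULT PRIVILEGES only affects objects created later by the role that runs it (or by the role named with FOR ROLE), which is one more detail the DBA has to keep track of.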
 
If there are comparable requirements for additional users, it is preferable to create separate roles with different permissions, such as read and write roles, and then grant those roles to users depending on the requirements.
 
I used to follow the above procedure, but not anymore after PostgreSQL 14. A new set of predefined roles was introduced in PostgreSQL 14, which simplifies this part of DBA work, including pg_read_all_data, pg_write_all_data, and pg_database_owner.
Now, to give the same read or read-write access, I only need to follow these easy steps.
 
Read
 
GRANT pg_read_all_data TO <username>;
 
Read-Write
 
GRANT pg_read_all_data, pg_write_all_data TO <username>;
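To verify which roles (predefined or otherwise) a user has been granted, you can query the catalogs; the user name app_user below is a placeholder:

```sql
SELECT r.rolname AS granted_role
FROM pg_auth_members m
JOIN pg_roles r ON r.oid = m.roleid
JOIN pg_roles u ON u.oid = m.member
WHERE u.rolname = 'app_user';
```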

Tuesday, March 5, 2024

Enhancement to Logical replication in PostgreSQL 16

 

Big transactions used to cause slowness on the target database, but not anymore. In PostgreSQL 16, a new subscription option called parallel was introduced, which speeds up applying changes on the target database. Before PostgreSQL 16, for big in-progress transactions the publisher streams the data, but on the subscriber side the apply worker writes the modifications into temporary files and applies the complete transaction only after receiving the commit. With streaming = parallel, parallel apply workers apply the streamed changes directly.

Let's understand the new parameter and the catalog table change related to this feature.

 Parameter:

max_parallel_apply_workers_per_subscription: Maximum number of parallel apply workers per subscription. The default value is 2, and these workers are taken from the pool defined by max_logical_replication_workers. This parameter controls the amount of parallelism when applying streamed in-progress transactions for subscriptions created with streaming = parallel.
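On the subscriber, that might look like this in postgresql.conf (the values are illustrative; the parallel apply workers are drawn from the max_logical_replication_workers pool, so size that pool accordingly):

```
max_logical_replication_workers = 8
max_parallel_apply_workers_per_subscription = 4
```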

Catalog table:

In pg_subscription, the substream column shows which streaming option was used while creating the subscription.
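For example, the following query shows the streaming mode of each subscription; substream is 'f' for off, 't' for on, and 'p' for parallel:

```sql
SELECT subname, substream FROM pg_subscription;
```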

I did a small test to check the advantage of this option. I created two database instances on the same server, running on ports 5432 and 5433.

Table create and Insert

create table replication_tabl(id bigint primary key, rand_data_1 varchar(50), dt timestamp, rand_data_2 varchar(50));
insert into replication_tabl values(generate_series(1,10000000), md5(random()::text), now(), md5(random()::text));

 

Set the parameters in postgresql.conf and add an entry in pg_hba.conf for the replication user.

postgresql.conf

wal_level='logical'


pg_hba.conf

host     blog     postgres     127.0.0.1/32     md5


Restart the db instance after setting up the parameters.

 

Without Parallel option:

psql -d blog -h 127.0.0.1 -p 5432 -c "create publication blog_pub for table replication_tabl;"
psql -d blog -h 127.0.0.1 -p 5433 -c "create subscription blog_sub connection 'dbname=blog host=127.0.0.1 port=5432 password=password user=postgres' publication blog_pub;"

Workers running on operating system

postgres: logical replication launcher
postgres: logical replication apply worker for subscription 16403

 

 Parallel option :

psql -d blog -h 127.0.0.1 -p 5432 -c "create publication blog_pub for table replication_tabl;"
psql -d blog -h 127.0.0.1 -p 5433 -c "create subscription blog_sub connection 'dbname=blog host=127.0.0.1 port=5432' publication blog_pub WITH (streaming=parallel);"

Workers running on operating system


postgres: logical replication parallel apply worker for subscription
postgres: parallel worker for PID 747256
postgres: parallel worker for PID 747256


Comparing the two runs, the subscription created with streaming = parallel took less time to apply the data on the target database.

Note: When you are planning to use this option, please analyze your workload and resources first.

Friday, March 1, 2024

How to produce DDLs for Views and MViews using pg_dump in the absence of a GUI client.

 

pg_dump is a utility for backing up a PostgreSQL database. It creates consistent backups even when the database is being used concurrently, and it does not prevent other users from using the database.
pg_dump supports various switches/options, and here we will look at the usage of the -t option. This option allows you to dump only the objects whose names match a pattern, together with their data (tables only).
The -t option can also be used to dump the definition of matching views, materialized views, foreign tables, and sequences.

Example: 

 

I created view as given below

postgres=# \d+ test_vw
                              View "public.test_vw"
 Column |         Type          | Collation | Nullable | Default | Storage  | Description
--------+-----------------------+-----------+----------+---------+----------+-------------
 i      | integer               |           |          |         | plain    |
 seq    | double precision      |           |          |         | plain    |
 name   | character varying(10) |           |          |         | extended |
View definition:
 SELECT i,
    i_gen::double precision / 10::double precision AS seq,
    name
   FROM test;

Let's generate DDL of view using pg_dump.

pg_dump -t public.test_vw

 
pg_dump output:

pg_dump: creating VIEW "public.test_vw"
--
-- TOC entry 216 (class 1259 OID 34287)
-- Name: test_vw; Type: VIEW; Schema: public; Owner: postgres
--

CREATE VIEW public.test_vw AS
SELECT i,
((i_gen)::double precision / (10)::double precision) AS seq,
name
FROM public.test;


ALTER VIEW public.test_vw OWNER TO postgres;

-- Completed on 2024-03-02 04:04:31 UTC

--
-- PostgreSQL database dump complete
--
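The same approach works for materialized views, and -t accepts patterns, so a sketch like the one below would dump the DDL of every matching object (the object pattern and output file name are hypothetical; adding -s skips table data entirely if you want DDL only):

```shell
pg_dump -d postgres -s -t 'public.test_*' > test_objects_ddl.sql
```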




 

Saturday, April 6, 2019

WOW!!! What? Now I can get the timestamp of a created table and of committed rows in PostgreSQL? That's Awesome!!

I have seen people, including DBAs, asking whether there is any way to check when a table was created in Postgres.
Until 9.5 there was no appropriate answer to that question, but from 9.5 onwards we can say yes, we can get the creation time of a table, thanks to a new feature: commit timestamps.

track_commit_timestamp

Records the commit time of transactions. This parameter can only be set in the postgresql.conf file or on the server command line.
It is a boolean, and the default value is off.
Changing the value requires a postmaster restart.

Test Case:

Without enabling track_commit_timestamp

SELECT pg_xact_commit_timestamp(xmin), * FROM pg_class where relname= 'tracking';
ERROR: could not get commit timestamp data
HINT: Make sure the configuration parameter "track_commit_timestamp" is set.

Enabling track_commit_timestamp

alter system set track_commit_timestamp='on';

Then restart the Postgres service (postmaster spin-up) and create a table:

create table tracking(i int);

To find out when rows were INSERTed/UPDATEd, call the pg_xact_commit_timestamp function, passing in the xmin system column.
When a new table is created, a new entry is added to pg_class, so from pg_class we can take the last commit time for a table, in our case "tracking".
postgres=# SELECT pg_xact_commit_timestamp(xmin), * FROM pg_class where relname= 'tracking';
-[ RECORD 1 ]------------+---------------------------------
pg_xact_commit_timestamp | 2016-10-08 00:41:57.824771+05:30
relname                  | tracking
relnamespace             | 2200
reltype                  | 16508
Insert and update

postgres=# insert into tracking values (1);
INSERT 0 1
postgres=# insert into tracking values (2);
INSERT 0 1
postgres=# SELECT pg_xact_commit_timestamp(xmin), * FROM tracking ;
     pg_xact_commit_timestamp     | i
----------------------------------+---
 2016-10-08 12:32:11.223785+05:30 | 1
 2016-10-08 12:32:13.862489+05:30 | 2
postgres=# update tracking set i=10 where i=2;
UPDATE 1
postgres=# SELECT pg_xact_commit_timestamp(xmin), * FROM tracking ;
     pg_xact_commit_timestamp     | i
----------------------------------+----
 2016-10-08 12:32:11.223785+05:30 |  1
 2016-10-08 12:34:21.379262+05:30 | 10
Basically, track_commit_timestamp records the commit time of row changes (insert, update).
Note: the user must enable track_commit_timestamp if they want to record rows' commit times.
Enabling this parameter is useful in multimaster (BDR) systems or pglogical (replication setups) to help with conflict resolution.
Performance Impact:
This GUC was added by the community to track the last committed timestamp of rows in tables.
Once enabled, commit timestamps are tracked for all transactions in the DB cluster, which causes a performance hit. If you only need this for one or a few tables, adding timestamp columns to the involved tables and capturing the timestamp with a trigger would be preferable.
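A minimal sketch of that trigger-based alternative, reusing the tracking table from above (the column and function names are mine, not part of the original post):

```sql
ALTER TABLE tracking ADD COLUMN modified_at timestamptz;

CREATE OR REPLACE FUNCTION set_modified_at() RETURNS trigger AS $$
BEGIN
    -- stamp the row at insert/update time instead of tracking every transaction
    NEW.modified_at := now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tracking_modified
    BEFORE INSERT OR UPDATE ON tracking
    FOR EACH ROW EXECUTE PROCEDURE set_modified_at();
```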


Friday, March 22, 2019

Updated pg_rewind feature in PostgreSQL-9.6 for timeline switches

In this test case, I am trying to make my previous Master the Master again by rolling back the changes that happened on the promoted Standby.

Step 1:Install PostgreSQL-10 on both Master and Standby. Follow the Installation steps from Installation_Steps

Step 2: Now, Create Data Directory and make the changes to config files.

initdb -D /var/lib/pgsql/10/data/

vi /var/lib/pgsql/10/data/postgresql.conf

      port=5432
      listen_addresses = '*'
      wal_level = replica
      max_wal_senders = 10
      wal_keep_segments = 32
      wal_log_hints = on
      archive_mode    = on
      archive_command = 'cp %p /var/lib/pgsql/pg_archive/%f' 

vi /var/lib/pgsql/10/data/pg_hba.conf

host replication all 0.0.0.0/0 md5

Step 3: Start Master

pg_ctl -D /var/lib/pgsql/10/data/ start

Step 4: Upload the data in Master

/usr/pgsql-10/bin/pgbench -i -s 10

Step 5: Build the Streaming Replication

/usr/pgsql-10/bin/pg_basebackup -h 127.0.0.1 -p 5432 -U replication -D /var/lib/pgsql/10/slave-data/ -Xs -c fast -R

vi /var/lib/pgsql/10/slave-data/postgresql.conf

      port=6432
      listen_addresses = '*'
      hot_standby='on'

vi /var/lib/pgsql/10/slave-data/recovery.conf

         standby_mode = 'on'
         primary_conninfo = 'host=127.0.0.1 port=5432 dbname=postgres user=postgres'
         #recovery_target_timeline = 'latest'
         #restore_command = 'cp /var/lib/pgsql/pg_archive/%f %p'


Step 6: Start the Standby server and then promote it

/usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/slave-data/ start
/usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/slave-data/ promote

Step 7: Now load data using pgbench into the New Master (which was earlier the Standby) and stop the New Master

/usr/pgsql-10/bin/pgbench -T 60 -p 6432 -d postgres
/usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/slave-data/ -m fast stop

Step 8: Run pg_rewind to keep the Previous Master as Master and roll back the changes that happened in the New Master

# dry run first (-n reports what would be done without touching the target), then the actual rewind
/usr/pgsql-10/bin/pg_rewind --target-pgdata /var/lib/pgsql/10/slave-data/ -n --source-server="port=5432 user=postgres dbname=postgres"
/usr/pgsql-10/bin/pg_rewind --target-pgdata /var/lib/pgsql/10/slave-data/ --source-server="port=5432 user=postgres dbname=postgres"
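Before and after running pg_rewind, you can confirm where each cluster stands by checking its timeline ID with pg_controldata (a quick sanity check; the grep pattern is approximate):

```shell
/usr/pgsql-10/bin/pg_controldata /var/lib/pgsql/10/data/ | grep -i timeline
/usr/pgsql-10/bin/pg_controldata /var/lib/pgsql/10/slave-data/ | grep -i timeline
```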

Step 9: Edit the postgresql.conf file and create recovery.conf; once that is done, start the server

vi /var/lib/pgsql/10/slave-data/postgresql.conf

      port=6432
      listen_addresses = '*'
      hot_standby='on'

vi /var/lib/pgsql/10/slave-data/recovery.conf

    standby_mode = 'on'
    primary_conninfo = 'host=127.0.0.1 port=5432 dbname=postgres user=postgres'
    #recovery_target_timeline = 'latest'
    #restore_command = 'cp /var/lib/pgsql/pg_archive/%f %p'

/usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/slave-data/ start

postgres=# select * from pg_stat_replication ;
 pid  | usesysid | usename  | application_name | client_addr | client_port |         backend_start         |   state   | sent_lsn  | write_lsn | flush_lsn | replay_lsn | sync_priority | sync_state
------+----------+----------+------------------+-------------+-------------+-------------------------------+-----------+-----------+-----------+-----------+------------+---------------+------------
 6011 |       10 | postgres | walreceiver      | 127.0.0.1   |       55866 | 2019-03-21 11:50:00.193565+00 | streaming | 0/D02BEE0 | 0/D02BEE0 | 0/D02BEE0 | 0/D02BEE0  |             0 | async
(1 row)

-bash-4.2$ psql
psql (10.7)
Type "help" for help.

postgres=# create table test(i int);
CREATE TABLE
postgres=# \q
-bash-4.2$ psql -p 6432    (Standby)
psql (10.7)
Type "help" for help.

postgres=# \dt
              List of relations
 Schema |       Name       | Type  |  Owner
--------+------------------+-------+----------
 public | pgbench_accounts | table | postgres
 public | pgbench_branches | table | postgres
 public | pgbench_history  | table | postgres
 public | pgbench_tellers  | table | postgres
 public | test             | table | postgres
(5 rows)

postgres=#

Friday, March 16, 2018

Use pg_rewind to avoid rebuilding standby from scratch after promote

Sometimes it happens that the Master is up & running but the Standby gets promoted due to a false failover. Under this situation, the first question that arises is: how do we rebuild the standby?

Now, let's rewind the promoted cluster to make the old standby a new standby of the current Master.


pg_rewind: pg_rewind is a tool for synchronizing a PostgreSQL cluster with another copy of the same cluster after the clusters' timelines have diverged. A typical scenario is to bring an old master server back online after failover as a standby that follows the new master.


pg_rewind requires that the target server either has the wal_log_hints option enabled in postgresql.conf or data checksums enabled when the cluster was initialized with initdb.


How it works:

pg_rewind copies file-system-level changes from the source cluster to the target cluster:

It finds the divergence point of the Master and Standby timelines and, depending on that divergence point, decides from where it has to apply the changes on the target cluster.
It then applies the WAL from the source cluster, starting from the checkpoint created at failover.



Lets do the setup and test the pg_rewind:


On Master Set parameters:

wal_level = replica
archive_mode='on'
archive_command='cp %p /tmp/archive/%f'
wal_keep_segments = 500
max_wal_senders=4
wal_log_hints = on

Initially, I built the replication: Master on 5432 and Standby on 5433.



I have promoted the standby and deleted records from Standby


The Master is still the Master, and the Standby is also accepting DMLs (replication is broken).

Now, instead of taking a full base backup and rebuilding the replication from scratch, just use pg_rewind and rebuild the replication in a short time.
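The rewind itself is a single command; a sketch with illustrative paths and ports (run it with the promoted cluster stopped, against the running Master as the source):

```shell
pg_rewind --target-pgdata=/var/lib/pgsql/standby-data \
          --source-server="host=127.0.0.1 port=5432 user=postgres dbname=postgres"
```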



Once pg_rewind completes successfully, you will see the backup_label file in the Standby's data directory.



I created the recovery.conf file in the Standby cluster and started it.




Now, let's check the status of replication.



Data is also streamed to Standby Cluster.




This way, pg_rewind helps you rebuild a falsely promoted Standby, or the Master, as a standby.


In earlier versions of PostgreSQL (< 9.6), pg_rewind used to work only in one direction: if the Standby got promoted, the user could use pg_rewind to rebuild the Master as a new standby for the new Master.
From 9.6, pg_rewind can be used in both directions.


Always try to keep your environments up to date to make use of features like this.

