Big transactions used to cause slowness on the target database, but not anymore. In PostgreSQL 16, a new option called parallel(in subscriber) was introduced, which speeds up writing to the target database.Before PostgreSQL 16, for big transactions, the publisher transmits data via multiple streams. On the subscriber-side, the apply worker writes modifications into temporary files and applies the complete transaction after receiving the commit.
Let's understand new parameter,catalog table change related to above
Parameter:
max_parallel_apply_workers_per_subscription: Default value is 2 and these workers are taken from max_logical_replication_workers. Maximum number of parallel apply workers per subscription. This parameter controls the amount of parallelism for streaming of in-progress transactions with subscription parameter streaming = parallel.
Catalog table:
In pg_subscription, substream column gives information about the option used while creating the subscriber.
I did small testing to check advantage of this option. I have created two databases on same server running on 5432, 5433.
Table create and Insert
create table replication_tabl(id bigint primary key, rand_data_1 varchar(50),dt timestamp,rand_data_2 varchar(50));
insert into replication_tabl values(generate_series(1,10000000),md5(random()::text),now(),md5(random()::text));
Set parameters in postgresql.conf, entry in pg_hba.conf for replication user.
postgresql.conf
wal_level='logical'
pg_hba.conf
host blog postgres 127.0.0.1/32 md5
Restart the db instance after settingup the parameters.
Without Parallel option:
psql -d blog -h 127.0.0.1 -p 5432 "create publication blog_pub for table replication_tabl;"
psql -d blog -h 127.0.0.1 -p 5433 create subscription blog_sub connection 'dbname=blog host=127.0.0.1 port=5432 password=passsword user=postgres ' publication blog_pub;
Workers running on operating system
postgres: logical replication launcher
postgres: logical replication apply worker for subscription 16403
Parallel option :
psql -d blog -h 127.0.0.1 -p 5432 "create publication blog_pub for table replication_tabl;"
psql -d blog -h 127.0.0.1 -p 5433 -c "create subscription blog_sub connection 'dbname=blog host=127.0.0.1 port=5432' publication blog_pub WITH (streaming=parallel);"
Workers running on operating system
postgres: logical replication parallel apply worker for subscription
postgres: parallel worker for PID 747256
postgres: parallel worker for PID 747256
Restart the db instance after settingup the parameters.
below tables shows time taken to apply data on target database.
Note: When you are planing to use this option please analyze your workload and resource.