'Postgresql pglogical subscription is down

I want to setup logical replication between two postgresql 9.5 servers. I could make it work on two virtual pc-s but when I try to do it on our production machines the replication status is down (pglogical.show_replication_status()).

On our main server there is a lot of data (~250GB) so we can't really afford to delete it. However I moved it to the replication storage server via pg_dump.

Main production server:

pg_hba.conf:

local   all             postgres                                peer
local   all             all                                     md5
host    all             all             127.0.0.1/32            md5
host    all             all             ::1/128                 md5
local   replication     all                                     trust
host    replication     all             127.0.01/32             trust
host    replication     all             ::1/128                 trust
host    all             all             all                     trust
host    replication     all             all                     trust

postgresql.conf:

data_directory = '/mnt/Data/fmeterdb/postgresql/9.5/main'       # use data in another directory
hba_file = '/etc/postgresql/9.5/main/pg_hba.conf'   # host-based authentication file
ident_file = '/etc/postgresql/9.5/main/pg_ident.conf'   # ident configuration file
external_pid_file = '/var/run/postgresql/9.5-main.pid'          # write an extra PID file
listen_addresses = '*'      # what IP address(es) to listen on;
port = 5432             # (change requires restart)
max_connections = 1000          # (change requires restart)
unix_socket_directories = '/var/run/postgresql' # comma-separated list of directories
ssl = true              # (change requires restart)
ssl_cert_file = '/etc/ssl/certs/ssl-cert-snakeoil.pem'      # (change requires restart)
ssl_key_file = '/etc/ssl/private/ssl-cert-snakeoil.key'     # (change requires restart)
shared_buffers = 1024MB         # min 128kB
dynamic_shared_memory_type = posix  # the default is the first option
shared_preload_libraries = 'pglogical'  # (change requires restart)
max_worker_processes = 16
wal_level = logical         # minimal, archive, hot_standby, or logical
synchronous_commit = local      # synchronization level;
archive_mode = on       # enables archiving; off, on, or always
archive_command = 'cp %p /mnt/Data/fmeterdb/postgresql/9.5/main/archive/%f'     # command to use to archive a logfile segment
max_wal_senders = 16        # max number of walsender processes
wal_keep_segments = 300     # in logfile segments, 16MB each; 0 disables
max_replication_slots = 16  # max number of replication slots
track_commit_timestamp = on # collect timestamp of transaction commit
synchronous_standby_names = 'pgslave001'    # standby servers that provide sync rep
log_line_prefix = '%t [%p-%l] %q%u@%d '         # special values:
log_timezone = 'UTC'
stats_temp_directory = '/var/run/postgresql/9.5-main.pg_stat_tmp'
datestyle = 'iso, ymd'
timezone = 'UTC'
lc_messages = 'hu_HU.UTF-8'         # locale for system error message
lc_monetary = 'hu_HU.UTF-8'         # locale for monetary formatting
lc_numeric = 'hu_HU.UTF-8'          # locale for number formatting
lc_time = 'hu_HU.UTF-8'             # locale for time formatting
default_text_search_config = 'pg_catalog.hungarian'

bash:

createuser -s --replication -P khrh_replicator
    * Enter password for new role: qwe123
    * Enter it again: qwe123
createdb -O khrh_replicator khrh_replicator
psql --username=khrh_replicator
\c fmeter
create extension pglogical;
select pglogical.create_node( node_name := 'khrh_provider', dsn := 'host=10.0.2.1 port=5432 user=khrh_replicator dbname=fmeter' );
select pglogical.create_replication_set('khrh_replication_set');
select pglogical.replication_set_add_table( set_name := 'khrh_replication_set', relation := 'measure_results' synchronize_data := true ); 
select pglogical.replication_set_add_table( set_name := 'khrh_replication_set', relation := 'measure_runs' synchronize_data := true );
select pglogical.replication_set_add_table( set_name := 'khrh_replication_set', relation := 'measures' synchronize_data := true );       

I don't get any error after these commands.

The steps I made on our storage server:

pg_hba.conf:

local   all             postgres                                peer
local   all             all                                     md5
host    all             all             127.0.0.1/32            md5
host    all             all             ::1/128                 md5
local   replication     all                                     trust
host    replication     all             127.0.0.1/32            trust
host    replication     all             ::1/128                 trust
host    all             all             all                     trust
host    replication     all             all                     trust

postgresql.conf:

data_directory = '/mnt/Data/fmeterdb/postgresql/9.5/main'       # use data in another directory
hba_file = '/etc/postgresql/9.5/main/pg_hba.conf'   # host-based authentication file
ident_file = '/etc/postgresql/9.5/main/pg_ident.conf'   # ident configuration file
external_pid_file = '/var/run/postgresql/9.5-main.pid'          # write an extra PID file
listen_addresses = 'localhost,172.17.11.18,10.0.2.3'        # what IP address(es) to listen on;
port = 5432             # (change requires restart)
max_connections = 1000          # (change requires restart)
unix_socket_directories = '/var/run/postgresql' # comma-separated list of directories
ssl = on
ssl_cert_file = '/etc/ssl/certs/ssl-cert-snakeoil.pem'
ssl_key_file = '/etc/ssl/private/ssl-cert-snakeoil.key'
shared_buffers = 1024MB         # min 128kB
dynamic_shared_memory_type = posix  # the default is the first option
shared_preload_libraries = 'pglogical'  # (change requires restart)
max_worker_processes = 16       # (change requires restart)
wal_level = logical         # minimal, replica, or logical
synchronous_commit = local      # synchronization level;
archive_mode = on
archive_command = 'cp -i %p /mnt/Data/fmeterdb/postgresql/9.5/main/archive/%f'      # command to use to archive a logfile segment
max_wal_senders = 16        # max number of walsender processes
wal_keep_segments = 400
max_replication_slots = 16  # max number of replication slots
track_commit_timestamp = on # collect timestamp of transaction commit
synchronous_standby_names = 'pgslave001'    # standby servers that provide sync rep
hot_standby = on            # "off" disallows queries during recovery
log_line_prefix = '%m [%p] %q%u@%d '        # special values:
log_timezone = 'UTC'
cluster_name = '9.5/main'           # added to process titles if nonempty
stats_temp_directory = '/var/run/postgresql/9.5-main.pg_stat_tmp'
datestyle = 'iso, ymd'
timezone = 'UTC'
lc_messages = 'hu_HU.UTF-8'         # locale for system error message
lc_monetary = 'hu_HU.UTF-8'         # locale for monetary formatting
lc_numeric = 'hu_HU.UTF-8'          # locale for number formatting
lc_time = 'hu_HU.UTF-8'             # locale for time formatting
default_text_search_config = 'pg_catalog.hungarian'
include_dir = 'conf.d'          # include files ending in '.conf' from

bash:

createuser -s --replication -P khrh_replicator
    * Enter password for new role: qwe123
    * Enter it again: qwe123
createdb -O khrh_replicator khrh_replicator
psql --username=urh_replicator
create database fmeter_khrh;
\c fmeter_khrh
create extension pglogical;
select pglogical.create_node( node_name := 'khrh_subscriber', dsn := 'host=10.0.2.3 port=5432 dbname=fmeter_khrh user=khrh_replicator' );
select pglogical.create_subscription( subscription_name := 'khrh_subscription', replication_sets := array['khrh_replication_set'], provider_dsn := 'host=10.0.2.1 port=5432 dbname=fmeter user=khrh_replicator' );

I don't get any error after this either.

Do somebody met with this, and how can I fix this?

Thanks in advance!



Solution 1:[1]

pglogical extension uses postgresql log to provide error messages. To get current replication status use this select * from pglogical.show_subscription_status();

Some tips for initial configuration can be found in this Depesz post https://www.depesz.com/2016/11/08/major-version-upgrading-with-minimal-downtime/

Solution 2:[2]

Check your postgresql log files. Either follow the live logging or simply view the logs. The logs will have the errors, and you will be able to debug accordingly. Further, one has to reset the subscriptions if the status is down. Check subscription: select * from pglogical.show_subscription_status(); If status is down, drop the subscription and recreate it.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ???? ???????
Solution 2 jsibs