Exploring failover handling with PGD v5.7.0
Documentation improvements are made only to the latest version.
As per semantic versioning, PGD minor releases remain backward compatible and may include important bug fixes and enhancements.
We recommend upgrading the latest minor release as soon as possible.
If you want up-to-date information, read the latest PGD documentation.
With a high-availability cluster, the ability to failover is crucial to the overall resilience of the cluster. When the lead data nodes stops working for whatever reason, applications need to be able to continue working with the database with little or no interruption. For PGD, that means directing applications to the new lead data node, which takes over automatically. This is where PGD Proxy is useful. It works with the cluster and directs traffic to the lead data node automatically.
In this exercise, you'll create an application that sends data to the database regularly. Then you'll first softly switch lead data node by requesting a change through the PGD CLI. And then you'll forcibly shut down a database instance and see how PGD handles that.
Your quick started configuration
This exploration assumes that you created your PGD cluster using the quick start for Docker, the quick start for AWS, or the quick start for Linux hosts.
At the end of each quick start, you'll have a cluster with four nodes and these roles:
Host name | Host role |
---|---|
kaboom | PGD data node and pgd-proxy co-host |
kaftan | PGD data node and pgd-proxy co-host |
kaolin | PGD data node and pgd-proxy co-host |
kapok | Barman backup node |
You'll use these hostnames throughout this exercise.
A best practice recommendation
This example is based on the quick start configuration. For speed and simplicity, it uses the Barman backup server in place of creating a bastion server. It also uses the Barman login to the Postgres cluster.
In a production environment, we recommend that you create a separate bastion server to run the failover experiment from and that you create an appropriate Postgres user to log in to the cluster.
Installing xpanes
Xpanes optional
We recommend the xpanes utility for this exercise. It allows you to easily switch between multiple terminal sessions. If you prefer to use multiple terminals, tmux, or another terminal multiplexer, you can do so. Just make sure you can easily switch between multiple terminal sessions.
You'll use xpanes, a utility that allows you to quickly create multiple terminal sessions that you can easily switch between. It isn't installed by default, so you have to install it. For this exercise, you launch xpanes from the system where you ran tpaexec to configure your quick-start cluster.
If the system is running Ubuntu, run:
sudo apt install software-properties-common sudo add-apt-repository ppa:greymd/tmux-xpanes sudo apt update sudo apt install tmux-xpanes
These are the installation instructions from the xpanes repository. If you aren't on Ubuntu, the repository also contains installation instructions for other systems.
Connecting to the four servers
With xpanes installed, you can create an SSH session with all four servers by running:
cd democluster xpanes -d -c "ssh -F ssh_config {}" "kaboom" "kaolin" "kaftan" "kapok"
After running these commands, there are four panes. The four panes are connected to kaboom, kaolin, kaftan, and kapok and you're logged in as the root user on each. You need this privilege so you can easily stop and start services later in the exercise.
Press Control-b followed by q to briefly display the numeric values for each pane.
To switch the focus between the panes, you can use Control-b and the cursor keys to navigate between them. Or you can use Control-b followed by q and the number of the pane you want to focus on. We'll show both ways.
Use Control-b ↓ Control-b → or Control-b q 3 to move the focus to the bottom-right pane, which is the kapok host. This server is responsible for performing backups. You'll use this as the base of operations for your demo application. You can use Barman credentials to connect to the database servers and proxies:
sudo -iu barman psql -h kaboom -p 6432 bdrdb
This code connects to the proxy on the kaboom host, which also runs a Postgres instance as part of the cluster.
The next step is to create the table for your application to write to:
drop table if exists ping cascade; CREATE TABLE ping (id SERIAL PRIMARY KEY, node TEXT, timestamp TEXT) ;
This code first drops the ping
table. Then it re-creates the ping
table with an id primary key and two text fields for a node and timestamp. The table should now be ready. To verify that it is, use Control-b ← Control-b ↑ or Control-b q 0 to move to the top left pane, which puts you on the kaboom server. In this pane, become the enterprisedb user so you can easily connect to the database:
sudo -iu enterprisedb
You can now connect to the local database by running:
psql bdrdb
This command connects you directly to the local database instance on kaboom. Use \dt
to view the available tables:
bdrdb=# \dt
List of relations
Schema | Name | Type | Owner
--------+------+-------+--------
public | ping | table | barman
(1 row)
Running \d ping
shows that the DDL to create ping is on the kaboom server:
bdrdb=# \d ping
Table "public.ping"
Column | Type | Collation | Nullable | Default
-----------+---------+-----------+----------+----------------------------------
id | integer | | not null | nextval('ping_id_seq'::regclass)
node | text | | |
timestamp | text | | |
Indexes:
"ping_pkey" PRIMARY KEY, btree (id)
If you want to be sure that this table is replicated, you can connect to another node in the cluster and look. The \c
command in psql lets you connect to another server. To connect to the kaftan node, run:
\c - - kaftan
You'll see a login message similar to this:
psql.bin (16.2.0, server 16.2.0) SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
You are now connected to database "bdrdb" as user "enterprisedb" on host "kaftan" (address "10.33.25.233") at port "5444".
bdrdb=#
Run \dt
and \d ping
, and you'll see the same results on the kaftan node.
To reconnect to the kaboom node, run:
\c - - kaboom
Setting up a monitor
Next, you want to monitor the activity of the ping table. Enter this SQL to display the 10 most recent entries:
select * from ping order by timestamp desc limit 10;
To run this command more than once, use the \watch
command in the shell, which executes the last query at regular intervals. To update every second, enter:
\watch 1
So far, there's nothing to see. You'll add activity soon.
Creating pings
Return to the Barman host kapok by using Control-b ↓ Control-b → or Control-b q 3.
This session is still logged into the psql session. Since you next want to run a shell script, you need to exit psql. Press Control-d.
The shell prompt now reads:
barman@kapok:~$
If it says admin@kapok
or root@kapok
, run sudo -iu barman
to become the Barman user again.
The application you'll create is simple. It gets the node to write to and a timestamp for the ping. Then, as quickly as it can, it writes a new ping to the ping table.
In the shell, enter:
while true; do psql -h kaftan,kaolin,kaboom -p 6432 bdrdb -c "INSERT INTO ping(node, timestamp) select node_name, current_timestamp from bdr.local_node_summary;"; done
In a more readable form, that is:
while true; do psql -h kaftan,kaolin,kaboom -p 6432 bdrdb -c \ "INSERT INTO ping(node, timestamp) select node_name, current_timestamp from bdr.local_node_summary;" done
In a constant loop, you call the psql
command, telling it to connect to any of the three proxies as hosts, giving the proxy port and selecting the bdrdb database. You also pass a command that inserts two values into the ping table. One of the values comes from bdr.local_node_summary
, which contains the name of the node you're actually connected to. The other value is the current time.
Once the loop is running, new entries appear in the table. You'll see them in the top-left pane where you set up the monitor.
You can now start testing failover.
Changing the write leader
For this part of the process, switch to the host kaftan, which is in the lower-left corner. Use Control-b ← or Control-b q 2 to switch focus to it.
To gain appropriate privileges to run pgd, at the PGD command line interface, run:
sudo -iu enterprisedb
To see the state of the cluster, run:
pgd groups list -v
You'll see output like this: