How PGD CLI and Connection Manager join forces for comprehensive high availability (HA)

August 13, 2025

This blog was co-authored by Abhijit Save.

Overview

EDB Postgres Distributed (PGD) is a distributed database that provides high availability and scalability. PGD 6 integrates the CLI tooling directly into the product and embeds a new Connection Manager, allowing you to create production-grade PGD clusters with a single command.

This blog provides a quick overview of how the PGD CLI and the PGD Connection Manager combine forces to provide comprehensive HA for a Postgres cluster in just a couple of steps.

Getting started with PGD

Before we explore the cool HA stuff, let's get a PGD cluster up and running. The PGD 6 release introduces a new CLI command, pgd node setup, that automates the tedious, mundane, and error-prone but critical tasks of setting up a PGD node. With a command as simple as the one below, you can set up a PGD node with Connection Manager up and listening for client connections in one go, without any extra setup or configuration.

pgd node pgd-1 setup \
    --dsn "host=pgd-1 port=5432 dbname=pgddb user=pgdadmin" \
    --pgdata /var/lib/edb-pge/17/main \
    --cluster-name dc-1 \
    --group-name group-1

So what does this command do? Well, it sets up a node called pgd-1 on the system it's running on. We assume you've already installed the PGD packages, which include the CLI, along with the Postgres packages, at this point.

Anyway, the setup command initializes the Postgres database (that's why the pgdata setting is there) and, once it's initialized, automatically reconfigures it for PGD use.

But, you might be asking, if there's no database running at the start, what's the DSN setting for? That's the connection string that will be used once the database is up and running. And once it is up and running, the command connects to the newly PGD-enabled database and creates a cluster and group for our new node. More nodes can be added to the cluster with the same command, plus an additional --cluster-dsn argument pointing to an existing active node in the cluster, as in the sketch below.
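
Just for illustration, joining a second node, pgd-2, might look something like this. The flags mirror the first command, with --cluster-dsn pointing at the already-running pgd-1; the exact set of required flags can vary by release, so check the CLI documentation.

pgd node pgd-2 setup \
    --dsn "host=pgd-2 port=5432 dbname=pgddb user=pgdadmin" \
    --pgdata /var/lib/edb-pge/17/main \
    --cluster-dsn "host=pgd-1 port=5432 dbname=pgddb user=pgdadmin"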


As the cluster grows, the command also validates the cluster configuration, locally and remotely, and updates any existing settings that are no longer compatible with the recommended configuration.

I won't go into detail about the command arguments here; check the CLI documentation in the references.

The pgd binary ships as an integral part of PGD and lives in PGD's bin directory, so no separate installation is required for the CLI. As a prerequisite, Postgres and its corresponding PGD packages need to be installed on the node where the command will be run.

The HA challenge

PGD already provides active multi-master replication with strong conflict-resolution capabilities. So you might be wondering: why do we need a separate HA solution? The simple answer is to avoid conflicts in the first place. Conflict resolution can be expensive, and it can hamper data consistency and server performance if the servers get too busy resolving conflicts.

Before PGD 6, PGD relied on external services, be it HAProxy or its own HARP or PGD Proxy, to handle incoming connections and routing. These services did the job, but they were external, and users had to deal with managing and maintaining an additional service. Another thing to worry about!

Connection Manager: the solution

Now, PGD 6 introduces an embedded Connection Manager, a background worker within the database. That tight integration allows it to work in harmony with Postgres, leveraging the host database's configuration and authentication to provide a smarter, faster proxy.

Every PGD data node has a Connection Manager instance that listens for incoming connections and routes them to the appropriate node, specifically the current write-leader of the cluster. If the current write-leader goes down, PGD automatically elects a new write-leader, and the Connection Managers start routing connections to it.

The Raft-backed write-leader design is about getting the best performance, ensuring consistency, and maximizing availability. The fully integrated nature of the Connection Manager provides faster failure detection and response, ensuring availability and partition tolerance. Together, this powers the five nines of HA for PGD, which is critical for any distributed application.

The Connection Manager accepts client connections on a read-write port and a read-only port and routes them to the current write-leader or the read nodes accordingly. Client applications can connect to the read-write port for all their write queries, ensuring data consistency across the cluster, and leverage the read-only port to route read-only queries to non-write-leader nodes, unleashing the full power of your PGD cluster.

The read-write port is, by default, set to the Postgres port + 1000 (usually 6432). The read-only port is set to the Postgres port + 1001 (usually 6433).
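
For example, with the default Postgres port of 5432 and the node and database names from the setup example above, clients would connect through the Connection Manager like this:

# Read-write traffic, routed to the current write-leader
psql "host=pgd-1 port=6432 dbname=pgddb user=pgdadmin"

# Read-only traffic, routed to non-write-leader nodes
psql "host=pgd-1 port=6433 dbname=pgddb user=pgdadmin"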

The Connection Manager does session-level pooling in 6.0, handling authentication and TLS termination directly at its endpoints. There's no separate configuration file for Connection Manager: it reads the pg_hba.conf file, beloved by Postgres administrators, and uses it to configure its own incoming connections as well.
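
In other words, a standard pg_hba.conf entry like the one below governs connections arriving via the Connection Manager ports just as it does direct Postgres connections (the client subnet here is just a placeholder):

# TYPE  DATABASE  USER      ADDRESS       METHOD
host    pgddb     pgdadmin  10.0.0.0/24   scram-sha-256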

How is HA achieved?

In the simplest approach, end users can leverage PostgreSQL's multi-host connection string capability to achieve near-seamless HA in the event of a write-leader node failure.

Example: 

Consider a 3-node PGD cluster with nodes pgd-1, pgd-2, and pgd-3.
The multi-host connection string for client applications can look like the one below.

"host=pgd-1,pgd-2,pgd-3 user=postgres port=6432 dbname=pgddb" 

Note that the host component lists the hostnames of all three nodes, and the port is 6432, the default read-write port for Connection Manager.

Now, consider that node pgd-1 is the current write-leader and it goes down. PGD automatically elects a new write-leader, say pgd-3. Because the Connection Manager on pgd-1 is dead, client connections start falling through to the next host in the list, pgd-2. The Connection Manager on pgd-2 accepts the client connections and routes them to pgd-3, the current write-leader.

A similar multi-host connection string can be used with the read-only port 6433 to achieve read scalability by leveraging the non-write-leader data nodes in your cluster, as in the sketch below.
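
Concretely, client applications might connect like this with psql (any libpq-based driver accepts the same string; libpq tries the hosts left to right and moves on whenever one is unreachable):

# Writes: any reachable Connection Manager routes to the write-leader
psql "host=pgd-1,pgd-2,pgd-3 port=6432 dbname=pgddb user=pgdadmin"

# Reads: the read-only port spreads queries across non-write-leader nodes
psql "host=pgd-1,pgd-2,pgd-3 port=6433 dbname=pgddb user=pgdadmin"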

This ensures that client apps continue to function near-seamlessly, without manual intervention or any external HA tool.

For more sophisticated setups involving load balancers or connection poolers, rest assured that Connection Manager integrates seamlessly. You'll find detailed configuration instructions for these tools in their respective product documentation. A quick example with HAProxy is available in the Load Balancing section of the Connection Manager documentation in the references section.

The bottom line is that these tools, too, can be combined with Connection Manager, much like the multi-host connection string described above, for a seamless HA solution.

Customizing the Connection Manager

The default configuration is typically good enough to get you started. To make the cluster production-ready, however, you'll need some customization, and the ease of making those changes becomes critical. The PGD CLI again comes to the rescue.

The PGD CLI offers get-option and set-option commands for node and group resources. You can view the current active options using the get-option command and make necessary changes using the set-option command. 
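
As an illustrative sketch, inspecting and changing a node option looks roughly like this. The option shown, route_priority, is just an example; option names and the exact command syntax depend on your release, so check the CLI documentation.

# View the current value of a node option
pgd node pgd-1 get-option route_priority

# Change it
pgd node pgd-1 set-option route_priority 100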


Again, I won't go into the specifics of these commands; you can find the details in the PGD CLI documentation.

Monitoring the Connection Manager

The PGD 6 release enhances the bdr.stat_activity view with a few columns for monitoring Connection Manager metrics. There are also several additional views showing, for example, how many clients are connected to the Connection Manager on the current node and which nodes their connections are routed to.
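
For a quick look, you can query the view directly. A minimal sketch, selecting everything because the exact Connection Manager columns vary by release:

psql "host=pgd-1 port=5432 dbname=pgddb user=pgdadmin" \
    -c "SELECT * FROM bdr.stat_activity;"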


The Connection Manager also optionally exposes an HTTP(S) interface (port 8080 by default), which provides a liveness check, information about whether the node is read-write or read-only (or neither), and some routing information. These endpoints can be instrumental when working with HAProxy and other existing proxy/load-balancing solutions.
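
A load balancer's health check might probe this interface with a plain HTTP request. The endpoint path below is purely illustrative; check the Connection Manager documentation for the exact paths your version exposes.

# Probe the Connection Manager HTTP interface on node pgd-1
# (the /health path is illustrative; see the docs for real endpoints)
curl -s http://pgd-1:8080/health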


This gives users a great level of transparency and monitoring capability for Connection Manager. The enhanced views and the HTTP(S) interface provide extensibility for custom monitoring and automation.


Check the Monitoring section in the Connection Manager documentation in the references for more details.

Conclusion

  • The PGD CLI pgd node setup command is a powerful tool to set up a PGD cluster with a single command. The CLI is integrated into PGD, eliminating the need for a separate CLI installation.
  • The PGD Connection Manager is a powerful HA solution that provides read-write and read-only network interfaces to route client connections to the write-leader and read nodes accordingly.
  • The Connection Manager uses the underlying PostgreSQL configuration (postgresql.conf) and authentication (pg_hba.conf) mechanisms, eliminating the need to install and manage an external HA tool.
  • The Raft-backed, fully integrated Connection Manager empowers PGD to provide five nines of HA.

References

The PGD 6 product documentation for PGD CLI 

The PGD 6 product documentation for Connection Manager

About the authors

Jagdish Kewat and Abhijit Save are part of the PGD dev team. The team has extensive experience in designing and developing HA solutions for PostgreSQL clusters across multiple products.

