Postgres gives each connecting client its own process called a client backend. And because of concerns about 1) resource contention, 2) latency, or 3) both, we users tend to limit the maximum number of client connections to a few hundred. Then we introduce connection poolers like pgbouncer or pgcat.
These poolers can handle more connections (with lower latency) by transparently multiplexing many client connections onto a single Postgres client backend. But how does this work? And how does it behave?
Let's take a look at pgcat to see.
Build Postgres and pgcat
I was hoping to skip thinking about dependencies and configuration by finding a nice docker-compose.yml for pgcat and Postgres. There's one in the pgcat repo but I couldn't get it to work, possibly because I am using podman, not docker. And I couldn't find any other working docker-compose yamls either. So let's just build Postgres and pgcat from source and avoid docker/podman permissions issues.
$ git clone https://github.com/postgres/postgres
$ cd postgres
$ git checkout REL_18_STABLE
$ ./configure --without-icu \
--prefix=$(pwd)/build \
--libdir=$(pwd)/build/lib
$ make -j16 && make install
Then create and start an instance and add a password to your user (pgcat seems to require this).
$ ./build/bin/initdb testdb
$ ./build/bin/pg_ctl -D $(pwd)/testdb -l logfile start
$ ./build/bin/psql postgres -c "ALTER USER $(whoami) WITH PASSWORD '$(whoami)'"
Now let's grab and build pgcat.
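As a quick sanity check that the server is up (note that initdb's default pg_hba.conf trusts local connections, so this doesn't actually exercise the password; connecting through pgcat later will):
$ ./build/bin/psql -h localhost -p 5432 postgres -c 'SELECT 1'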
$ git clone https://github.com/postgresml/pgcat
$ cd pgcat
$ cargo build
pgcat requires a config file, but the ones that come in the repo involve too many features and I just wanted a minimal config. So I started with an empty file and added settings until pgcat stopped erroring about a missing setting, ending up with this pgcat.toml:
$ cat <<EOF > pgcat.toml
[general]
admin_username = "$(whoami)" # Must be set but we won't use this.
admin_password = "" # Must be set but can be blank.
port = 6432

[pools.postgres.shards.0]
servers = [["localhost", 5432, "primary"]]
database = "postgres"

[pools.postgres.users.0]
username = "$(whoami)"
password = "$(whoami)"
pool_size = 1
EOF
We're telling pgcat how and where to connect and that pgcat itself should listen on 6432. (Note the heredoc rather than a single-quoted echo, so that $(whoami) actually expands.) The only thing we'll tweak in the rest of this post is pool_size. More on that as we go.
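As an aside: even though we said we won't use them, the admin credentials enable a pgbouncer-style admin console that pgcat serves on a virtual database, which I believe is named pgcat. Once pgcat is running, something like this should dump pool state (hedged; I didn't need it for this post):
$ ./build/bin/psql -h localhost -p 6432 -U $(whoami) pgcat -c 'SHOW POOLS'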
Start pgcat:
$ ./target/debug/pgcat pgcat.toml
pool_size = 1
So we've got Postgres running right now and we've got pgcat running. Let's connect to pgcat with the psql REPL (back in the root of the Postgres repo we cloned):
$ PGPASSWORD=$(whoami) ./build/bin/psql -h localhost -p 6432 postgres
psql (18rc1)
Type "help" for help.
postgres=#
pgcat is intercepting all interactions between the client (psql) and Postgres's client backend. But it isn't really easy to tell from the client itself that we're connected to a pooler and not directly to Postgres.
Let's edit the psql prompt for this session so it'll be easier to tell them apart as we add more.
postgres=# \set PROMPT1 '[client1] %/%R%x%# '
[client1] postgres=# \set PROMPT2 '[client1] %/%R%x%# '
[client1] postgres=#
Now let's run a simple query.
[client1] postgres=# SELECT 'hello world';
?column?
-------------
hello world
(1 row)
Now let's check the PID of the Postgres client backend our queries are actually being executed on.
[client1] postgres=# SELECT pg_backend_pid();
pg_backend_pid
----------------
19584
(1 row)
Now let's do something session-local: let's change our timezone. I'm in NYC so my timezone is America/New_York.
[client1] postgres=# SHOW timezone;
TimeZone
------------------
America/New_York
(1 row)
Let's change it to America/Los_Angeles.
[client1] postgres=# SET timezone TO 'America/Los_Angeles';
SET
[client1] postgres=# SHOW timezone;
TimeZone
---------------------
America/Los_Angeles
(1 row)
Keep that psql session open and in another terminal open another psql session.
$ PGPASSWORD=$(whoami) ./build/bin/psql -h localhost -p 6432 postgres
psql (18rc1)
Type "help" for help.
postgres=#
Let's set this one's prompt too to distinguish it.
postgres=# \set PROMPT1 '[client2] %/%R%x%# '
[client2] postgres=# \set PROMPT2 '[client2] %/%R%x%# '
[client2] postgres=#
Within client2 the timezone should be America/New_York, since that's the server default.
[client2] postgres=# SHOW timezone;
TimeZone
------------------
America/New_York
(1 row)
Now check the Postgres client backend PID.
[client2] postgres=# SELECT pg_backend_pid();
pg_backend_pid
----------------
19584
(1 row)
That's so cool! In Postgres that's not supposed to be possible! If we were connected directly to Postgres these PIDs would be unique. One process per client. But we're sharing the same process across clients.
Check back on that timezone setting in client1 to make sure it didn't somehow get reset.
[client1] postgres=# SHOW timezone;
TimeZone
---------------------
America/Los_Angeles
(1 row)
Still its own value. So, among other things, the pooler is making it look as if each client has its own session, just as if each client had its own unique Postgres client backend, even though multiple clients are sharing the same one.
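As an aside, if you want a setting scoped to a single transaction rather than the whole session, stock Postgres has SET LOCAL, and it's the natural fit for transaction pooling since the setting can't outlive the transaction that holds the pooled connection:
BEGIN;
SET LOCAL timezone TO 'America/Los_Angeles';
SHOW timezone; -- America/Los_Angeles
COMMIT;
SHOW timezone; -- back to the session value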
Sharing is ... caring?
The basic way these poolers work is that the pooler keeps a pool of open Postgres client backend connections. In transaction pooling mode, it acquires a client backend connection from the pool when a client starts a transaction and relinquishes it when the transaction is over. Each statement in Postgres runs in its own implicit transaction (unless you've explicitly opened one with BEGIN), so each client only holds onto a pooled connection for the duration of the statement's execution.
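To make the boundaries concrete, here's a sketch of when a transaction-mode pooler is free to swap backends out from under you (the PID comments assume pool_size > 1; we'll observe exactly this behavior later in the post):
-- Each bare statement is its own implicit transaction, so the pooler
-- may run each one on a different backend from the pool:
SELECT pg_backend_pid(); -- might be backend A
SELECT pg_backend_pid(); -- might be backend B
-- Inside an explicit transaction, the pooler must pin one backend
-- until COMMIT or ROLLBACK:
BEGIN;
SELECT pg_backend_pid(); -- backend A
SELECT pg_backend_pid(); -- still backend A
COMMIT;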
If one client runs a statement that takes a while to execute it can starve other clients from access to the pooled connection. We can see this most easily because we set pool_size = 1 so the pooler is only allowed to have a single open Postgres client backend.
For example, let's just call pg_sleep() on client2 for 15 seconds. It will appear to hang for 15 seconds before returning:
[client2] postgres=# SELECT pg_sleep(15);
pg_sleep
----------
(1 row)
And immediately after hitting Enter on client2, run a SELECT 1 on client1. client1 will hang for a few seconds and then return an error:
[client1] postgres=# SELECT 1;
FATAL: could not get connection from the pool - AllServersDown
But once the client2 sleep completes and the statement is executed, client1 will be able to run commands again.
[client1] postgres=# SELECT 1;
FATAL: could not get connection from the pool - AllServersDown
-- Shortly later --
[client1] postgres=# SELECT 1;
?column?
----------
1
(1 row)
That makes sense! We're sharing the same Postgres client backend!
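How long a client waits before getting this error should be tunable. pgcat's example config includes a connect_timeout setting (in milliseconds) under [general]; my unverified assumption is that it bounds how long pgcat waits for a usable server connection before giving up, which is where the AllServersDown error comes from:
[general]
# Assumption: 5000 is the value in pgcat's example config; tune to taste.
connect_timeout = 5000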
Long-running statements are bad practice, you say. Indeed. Let's do an interactive transaction instead.
On client1 start an interactive transaction with BEGIN.
[client1] postgres=# BEGIN;
BEGIN
[client1] postgres=*#And on client2 try to run a SELECT 1:
[client2] postgres=# select 1;
FATAL: could not get connection from the pool - AllServersDown
It hangs and fails. Let's run a query in client1's transaction and then commit it.
[client1] postgres=# BEGIN;
BEGIN
[client1] postgres=*# SELECT 1;
?column?
----------
1
(1 row)
[client1] postgres=*# COMMIT;
COMMIT
Now back in client2, once client1's transaction is complete we can access the Postgres client backend again.
[client2] postgres=# SELECT 1;
FATAL: could not get connection from the pool - AllServersDown
-- After client1 commits
[client2] postgres=# SELECT 1;
?column?
----------
1
(1 row)
It's quite neat, to me, to play with the behavior when you only allow a single Postgres client backend. But now let's take a look at what happens if we allow more than a single client backend.
pool_size = 2
In the pgcat.toml we wrote above (in the pgcat repo), change pool_size from 1 to 2, as shown below. Then Ctrl-C the pgcat server and run it again. Close the two psql sessions and start two new ones, giving them the same prompt prefixes of [client1] and [client2].
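For example (assuming GNU sed; otherwise just edit the file by hand):
$ sed -i 's/pool_size = 1/pool_size = 2/' pgcat.toml
$ ./target/debug/pgcat pgcat.toml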
See what the current Postgres client backend PID for client1 is. (It will have changed because we restarted the pooler.)
$ PGPASSWORD=$(whoami) ./build/bin/psql -h localhost -p 6432 postgres
psql (18rc1)
Type "help" for help.
postgres=# \set PROMPT1 '[client1] %/%R%x%# '
[client1] postgres=# \set PROMPT2 '[client1] %/%R%x%# '
[client1] postgres=# SELECT pg_backend_pid();
pg_backend_pid
----------------
20809
(1 row)And do the same for client2:
$ PGPASSWORD=$(whoami) ./build/bin/psql -h localhost -p 6432 postgres
psql (18rc1)
Type "help" for help.
postgres=# \set PROMPT2 '[client2] %/%R%x%# '
postgres=# \set PROMPT1 '[client2] %/%R%x%# '
[client2] postgres=# SELECT pg_backend_pid();
pg_backend_pid
----------------
20809
(1 row)
So even though we allowed pgcat to use more than one Postgres client backend, it's still only using one. Interesting. Certainly any observations we make at this point are observations only; how a pooler chooses to implement its pooling policies is likely not part of its public API. Still though, let's poke and see what it does.
What happens if we have a blocking client connection now? If we start an interactive transaction on client2:
[client2] postgres=# BEGIN;
BEGIN
[client2] postgres=*#
Let's see what its client backend PID is:
[client2] postgres=# BEGIN;
BEGIN
[client2] postgres=*# SELECT pg_backend_pid();
pg_backend_pid
----------------
20809
(1 row)
Still the same one as before. Leave this transaction open.
Then on client1 execute a simple SELECT 1:
[client1] postgres=# SELECT 1;
?column?
----------
1
(1 row)And it works! Let's see what client backend PID client1 is running on.
[client1] postgres=# SELECT pg_backend_pid();
pg_backend_pid
----------------
21004
(1 row)
Not the same one as before! So since one client was blocking the use of a pooled connection and pool_size allowed for more, pgcat opened a new Postgres client backend for client connections to use.
Now let's commit the transaction on client2 and check what Postgres client backend PID it has after that transaction.
[client2] postgres=# BEGIN;
BEGIN
[client2] postgres=*# SELECT pg_backend_pid();
pg_backend_pid
----------------
20809
(1 row)
[client2] postgres=*# COMMIT;
COMMIT
[client2] postgres=# SELECT pg_backend_pid();
pg_backend_pid
----------------
21004
(1 row)
It changed! Let's check it again. And again. And again.
[client2] postgres=# SELECT pg_backend_pid();
pg_backend_pid
----------------
20809
(1 row)
[client2] postgres=# SELECT pg_backend_pid();
pg_backend_pid
----------------
21004
(1 row)
[client2] postgres=# SELECT pg_backend_pid();
pg_backend_pid
----------------
20809
(1 row)
And the same on client1.
[client1] postgres=# SELECT pg_backend_pid();
pg_backend_pid
----------------
21004
(1 row)
[client1] postgres=# SELECT pg_backend_pid();
pg_backend_pid
----------------
20809
(1 row)
[client1] postgres=# SELECT pg_backend_pid();
pg_backend_pid
----------------
21004
(1 row)
So basically the logic seems to be that clients will (randomly?) pull from the pool of existing connections, and new connections aren't created until all the existing ones are busy. Each client may end up running every transaction on an entirely different Postgres client backend.
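We can corroborate this from the Postgres side. Connect directly to Postgres on 5432, bypassing pgcat, and pg_stat_activity should show both pooled backends hanging around even while the clients are idle (this psql invocation will of course show up as a third backend):
$ ./build/bin/psql postgres -c "SELECT pid, state FROM pg_stat_activity WHERE backend_type = 'client backend' ORDER BY pid"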
That's pretty neat!
Prepared statements and other advanced things
We've only looked at the absolute basics of connection poolers. And even then I'd like to try out a few more (pgbouncer, odyssey, pgpool-ii, etc.) to see if they all behave the same way, or even just to get a minimal working environment for each like I did here for pgcat.
There are many advanced things that the various poolers support (prepared statements in transaction mode) and don't/can't support (LISTEN/NOTIFY in transaction mode), and I'm ignoring these for this post.
What about session pooling?
Connection poolers might support a few types of pooling modes, not just transaction pooling. They might also support session pooling and statement pooling (which I won't get into).
With session pooling, each client connection is mapped to a single Postgres client backend. Just like connecting to Postgres directly. This means things like LISTEN/NOTIFY work without a problem. It's just that rather than allowing the Postgres client backend to exit when the client connection is over, the pooler keeps the Postgres client backend around and reuses it for a future client connection.
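In pgcat, pool mode appears to be a per-pool setting: pool_mode shows up in the example config that ships with the repo. Something like this in our pgcat.toml should switch our pool to session mode ("transaction" being the default we've been observing):
[pools.postgres]
pool_mode = "session"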
Session pooling helps avoid the latency of establishing connections, and it can even help scale the number of client connections: poolers will queue waiting clients up to a configurable timeout, while Postgres will reject new connections immediately once there are max_connections open connections. It's not quite as effective for scaling the number of client connections as transaction pooling, but it's still an improvement.
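You can check the hard limit Postgres enforces (100 is the initdb default); past it, direct connections fail immediately with "FATAL: sorry, too many clients already", whereas a pooler would instead queue the client:
$ ./build/bin/psql postgres -c 'SHOW max_connections'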
Basically, poolers default to transaction pooling because this mode can handle more connections. Users switch to session pooling when they need features or behavior that isn't possible with transaction pooling.