PostgreSQL Connection Poolers

The reason ot use connection poolers is, that it is relative expensive to open a new connection to a PostgreSQL server.

This is due to the architecture of PostgreSQL as each connection is a forked process. This is obviously taking some time, in fact longer than checking the user credentials.

A connection pooler can step in to help to solve this problem. When a client is connecting to PostgreSQL through a connection pooler, the pooler takes care to open connections. It keeps unused connections opened to reuse them. That way the cost and time to open a database connection is substantially reduced.

In addition poolers can queue incoming queries and handle more connections than the database server itself capable to handle.

PostgreSQL itself does not come with a connection pooler.

Available Connections

The maximum amount of connections is configured in the configuration file with max_connections.

But to calculate the connections that are really available, one has also to check the reserved connections for superusers configured in superuser_reserved_connections.

With PostgreSQL 16 arrived a new parameter to reserve connections to certain roles, reserved_connections.

These two reduce the amount of connections being available for regular users.

-- default is 100
show max_connections;

-- default is 3
show superuser_reserved_connections;

-- default is 0
show reserved_connections;

-- calculation, usable in all PostgreSQL versions
SELECT sum(
    CASE name
      WHEN 'max_connections' THEN
        setting::int
      ELSE
        setting::int * (-1)
      END)
  AS available_connections
FROM pg_settings
WHERE name IN
  (
    'max_connections',
    'superuser_reserved_connections',
    'reserved_connections'
  )
;

The formula to calculate available connections for regular users is:

max_connections - superuser_reserved_connections - reserved_connections

Client Side Connection Pooler

There are several client side connection pooling ORMs arround. Nearly every programming language has minimum one of them on the list.

These poolers have all one thing in common: They only pool connections for one application, whether it is a web server, or a desktop client.

They also cannot handle more connections than what is configured in PostgreSQL itself.

For example your web application is load balanced on several servers, virtual machines, or Kubernetes pods. In all of these cases you have to divide the available connections by the amount of servers, virtual machines, or Kubernetes pods.
Ohterwise the application will run out of connections at some point and will throw errors.

Server Side Connection Poolers

The advantage of server side pooling it doesn’t matter, which client is accessing the pooler, nor from which other hardware the connection is established.

Usually connection poolers are installed on different servers or virtual machines. The hardware requirements differ by tool and database i/o.
In Kubernetes they are installed in their own pods.

All three presented conection pooling tools are in active development.

PgBouncer

PgBouncer is licensed under ISC License.

This pooler is very lightweight as it is only handling connection pooling.

There are different pool modes available:

session: The connection is reusued after it has been closed
transaction: The connection is reused after each transaction
statement: The connection is reused after each query

To get the most out of it, consider transaction as pool mode. Transactions can be useful with several statements, that will be either, submitted or rolled back.

PgBouncer is used by all relevant Kubernetes PostgreSQL operators like CloudNativePG, PGO by CrunchyData, Postgres Operator by Zalando.

Pgpool II

Pgpool II is licensed under pgpool-II License.

It is not that lightweight as it does also handle other things, like load balancing, failover, and query caching.

In an environment with PostgreSQL replications, load balancing Pgpool II is using the primary PostgreSQL server for INSERT and UPDATE queries, but SELECT queries are automatically forwarded to followers.
All queries that are calling functions and procedures will always use the primary server. It cannot take for granted, that a function or procedure will not change data.

Pgpool II supports transactions, but be aware that the timeout configuration affects also running transactions.

There are Kubernetes operators to install Pgpool II Kubernetes. But one needs to implement it with either, other PostgreSQL operators or write one from scratch.
Only CloudNativePG has an interface to implement a connection pooler, currently only used for PgBouncer.

Odyssey

Odyssey is licensed under BSD-3-Clause license.

It is used mostly in very high load environments.

As of now I don’t know any Kubernetes operator for or Kubernetes implementation of Odyssey.

Odyssey has two pooling options per configuration, session and transaction.
If there are more requests than what is available. the clients will be put into a wait queue.

It is important to change the configuration to demonize yes, otherwise the pooler will not start automatically.

Client Side Plus Server Side poolers

This covers having a connection pooler running in an application and one of the three mentioned above on a server between your application and PostgreSQL.

Session Based Connections

Session based connections are available for PgBouncer and Odissey.

When the connection pooler in the application is keeping the database connection open. In that case the server side pooler has to be configured to session. But the disadvantage is, that there are only as much connections available, as shown in Available Connections.

The only advantage here is, that the connection pooler is between the application and the database.

Transaction Based Connections

Transactions are supported by PgBouncer when configured to mode transaction, Pgpool II always, and Odyssey.

With transaction based connections statements can be rolled back or committed. That is the same behaviour as running directly against the database.
For example a simple update statement without explicit begin/end is directly committed after it is finished.

The pooler is keeping the connection to the database and the connection can be reused by the next incoming transaction. In combination with queing incoming transactions the pooler can offer more connections than what the database itself can handle.

Statement Based Connections

Statement based connections are only available for PgBouncer.

The pooler is forwarding the connection to the next incoming query as soon, as the current query is finished. There is no rollback possible, changes are applied immediately.

How To Calculate Pool sizes

Pool sizes have to be calculated for each databases by keeping in mind, how many ressources have already been used by other pools.

To get the actual usage of the databases one can query pg_stat_database. The data is available since PostgreSQL 14.

active_time is the time spend to execute queries.
idle_in_transaction_time is the time spend doing nothing within a transaction

SELECT datname AS database_name
	, active_time / (active_time + idle_in_transaction_time) AS busy
FROM pg_stat_database
WHERE active_time > 0
;

Author: Stefanie Janine Stölting