PostgreSQL Streaming Replication (WAL); What It Is and How to Configure One (mindhub365.com)
139 points by thunderbong 7 hours ago | 24 comments





It's a great article, but I've always felt articles like these are missing critical real-world guidance from the perspective of a full-stack dev who also wants to manage their own databases.

- How do I check how many seconds the replica is lagging behind master?

- How would I monitor the replica? A simple cron task that pings a health check if everything is OK (lag is < x) would be a nice start.

And then things get complicated quickly:

- How do I failover to the replica if the primary goes down? There's pgBouncer, repmgr, patroni...

- Should I have it automatically or manually?

- Do I need 2 replicas to avoid the split-brain scenario? My brain hurts already.

- After a failover occurs (either automatically or manually), how in the world am I going to configure the primary to be the primary again, and the replica to act as the replica again, going back to the original scenario?

I'd pay to learn this with confidence.


> Do I need 2 replicas to avoid the split-brain scenario? My brain hurts already.

It will hurt even more.

The recommended way is to set up a witness server. Yet another thing to manage in a properly designed Postgres cluster. Certainly not an easy/trivial thing to do, ops-wise.

From [0]:

> By creating a witness server in the same location (data centre) as the primary, if the primary becomes unavailable it's possible for the standby to decide whether it can promote itself without risking a "split brain" scenario: if it can't see either the witness or the primary server, it's likely there's a network-level interruption and it should not promote itself. If it can see the witness but not the primary, this proves there is no network interruption and the primary itself is unavailable, and it can therefore promote itself (and ideally take action to fence the former primary).

An interesting acronym you'll hear is STONITH ("Shoot The Other Node In The Head"), the classic approach to fencing the former primary.

[0] - https://www.repmgr.org/docs/current/repmgrd-witness-server.h...


You want STONITH anyway. It's all necessary complexity in an HA cluster.

I am managing my own PostgreSQL cluster with Patroni and, so far, the experience has been a breeze. The initial learning curve is steep, though: the Patroni docs are not the best, and the community support around it is minimal. Naive questions are usually torpedoed by the maintainer or other experienced devs, which does not foster a good environment for people learning on their own.

Luckily, the Percona Distribution for Postgres includes amazing documentation on how to set up and run Patroni, even if you choose not to use their distribution. I would highly recommend following their step by step: https://docs.percona.com/postgresql/17/solutions/ha-setup-ap...

I have OpenTofu scripts for setting this cluster up, although they might not be useful to you since I am using CloudStack instead of other clouds.


> I'd pay to learn this with confidence.

Great, there's a whole industry of PostgreSQL consultants/platforms ready to take your money. You could put on your tinfoil hat and say PostgreSQL's lack of out-of-the-box, integrated, easy-to-use HA is by design to make money.

I think most folks use Patroni with some kind of service discovery solution like Kubernetes or Consul and have it abstract all of this for them.


You’re completely right. You need a full management solution around postgres to make it work, and I wouldn’t recommend building it yourself.

One solution is Kubernetes and one of the many Postgres operators. Still not as easy as pie.


Lag reporting is one little detail that I find lacking in streaming replication. If there has been no transaction in the last n seconds, it will tell you that the lag is n seconds.

Oracle's Data Guard will tell you there's no lag, because the databases are the same.

In a development database, lag can get quite high, but it makes no sense to set off an alarm if lag gets too high because it might just be that there was nothing going on.

But I much prefer the simplicity and reliability of PostgreSQL over Oracle, any day of the week.


If it's really a problem, you can always use the pt-heartbeat tool from the Percona Toolkit: https://docs.percona.com/percona-toolkit/pt-heartbeat.html

Conceptually it is very straightforward: it updates the only row in a special table to the current time every --interval seconds on the main database, then reads that value on any followers. Subtract the two timestamps to get the replication lag.
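A minimal sketch of the same idea in plain SQL (not pt-heartbeat's actual schema, just the concept):

    -- On the primary, once:
    CREATE TABLE heartbeat (id int PRIMARY KEY, ts timestamptz NOT NULL);
    INSERT INTO heartbeat VALUES (1, now());

    -- On the primary, every --interval seconds (e.g. from cron):
    UPDATE heartbeat SET ts = now() WHERE id = 1;

    -- On a replica, the lag is the age of the replicated timestamp:
    SELECT now() - ts AS replication_lag FROM heartbeat WHERE id = 1;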


I'll check it out. Thanks.

> - How do I check how many seconds the replica is lagging behind master?

> - How would I monitor the replica? A simple cron task that pings a health check if everything is OK (lag is < x) would be a nice start.

No, you should use Patroni. It configures and monitors Postgres' native replication.

https://patroni.readthedocs.io/en/latest/


> How do I check how many seconds the replica is lagging behind the master?

Use PostgreSQL administrative functions, specifically: pg_last_xact_replay_timestamp. (https://www.postgresql.org/docs/current/functions-admin.html...)
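For example, something along these lines run on the replica (a sketch; these functions exist in PostgreSQL 10+, with the usual caveat that the value only moves when the primary actually commits something):

    -- Run on the replica: 0 when all received WAL has been replayed, otherwise
    -- the seconds since the last replayed transaction was committed on the primary.
    SELECT CASE
             WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn() THEN 0
             ELSE EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())
           END AS lag_seconds;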

> How would I monitor the replica? A simple cron task that pings a health check if everything is OK (lag is < x) would be a good start.

There are many solutions, highly dependent on your context and the scale of your business. Options range from simple cron jobs with email alerts to more sophisticated setups like ELK/EFK, or managed services such as Datadog.
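If you go the simple-cron-job route, a query against the primary-side pg_stat_replication view is a reasonable building block; a sketch (PostgreSQL 10+ column names):

    -- Run on the primary: one row per connected standby, with the byte distance
    -- between the primary's current WAL position and what the standby has replayed.
    SELECT application_name, client_addr, state,
           pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
    FROM pg_stat_replication;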

> How do I failover to the replica if the primary goes down?

> Should I handle failover automatically or manually?

> Do I need two replicas to avoid a split-brain scenario? My head hurts already.

While it may be tempting to automate failover with a tool, I strongly recommend manual failover if your business can tolerate some downtime.

This approach allows you to understand why the primary went down, preventing the same issue from affecting the replica. It's often not trivial to restore the primary or convert it to a replica. YOU become the consensus algorithm, the observer, deciding which instance becomes the primary.

Two scenarios to avoid:

* Failing over to a replica only for it to fail as well (e.g., due to a full disk).

* Successfully switching over so transparently that you will not notice that you're now running without a replica.

> After a failover (whether automatic or manual), how do I reconfigure the primary to be the primary again, and the replica to be the replica?

It's easier to switch roles and configure the former primary as the new replica (typically after running pg_rewind or taking a fresh base backup). It will then synchronize with the current primary.

You might also want to use the replica for:

* Some read-only queries. However, for long-running queries you will need to tune how long WAL replay may be delayed, or those queries risk being cancelled (see the sketch after this list).

* Backups or point-in-time recovery.
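For the read-only-queries point above, the relevant knobs are the standby's conflict-handling settings; a sketch (both parameters are real, the values are just illustrative):

    -- On the standby: let WAL replay wait up to 5 minutes for conflicting queries
    -- before cancelling them (at the cost of extra replication lag).
    ALTER SYSTEM SET max_standby_streaming_delay = '5min';

    -- Or tell the primary which rows the standby still needs, so vacuum doesn't
    -- create conflicts in the first place (can cause bloat on the primary).
    ALTER SYSTEM SET hot_standby_feedback = on;

    SELECT pg_reload_conf();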

If you manage a database yourself, I strongly recommend first gaining confidence in your backups and your ability to restore them quickly. Then you can play with replication; there are tons of little settings to configure (async for perf, a large enough retained WAL size to catch up quickly, ...).
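Two of those little settings, as a sketch (parameter names assume PostgreSQL 13+):

    -- Keep enough WAL on the primary for a replica to catch up after a restart
    -- or a slow period (PostgreSQL 13+; older versions use wal_keep_segments).
    ALTER SYSTEM SET wal_keep_size = '2GB';

    -- An empty synchronous_standby_names means asynchronous replication (the default):
    -- faster commits, but a small window of possible data loss on failover.
    ALTER SYSTEM SET synchronous_standby_names = '';

    SELECT pg_reload_conf();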

It's not that hard, but you want to have the confidence and the procedure written down before you have to do it in a production incident.


The modern way is to sidestep the issue altogether and use Kubernetes with a database designed to run on Kubernetes. You can get sharding, replication and leader election essentially for free - you can concentrate on using the database instead of running the database.

Compute is really cheap compared to engineering man-hours.


Postgres is fortunately one of them: https://artifacthub.io/packages/helm/bitnami/postgresql. The downside is that you cannot scale writes.

> The modern way is to sidestep the issue altogether and use Kubernetes

Kubernetes does require quite some time to learn/master. So you could say one replaces one time-consuming issue with another.


What's "a database designed to run on Kubernetes"? Cassandra?

Pretty sure they were referring to the operators you can install that will stand up production-ready replicated clusters.

They work quite well.


Something like Yugabyte or Cockroach

Does the Cloud Native PG operator count?

+1 on checking out the CloudNativePG operator, or other PG operators like Crunchy, Zalando, or OnGres

It was a minor pain finding and setting up a Postgres operator in k8s, but once I got it going it wasn't too horrible. Are the other solutions that are purpose-built for this significantly easier to manage?

Which one did you end up choosing?

The only real-world, easy-to-use solutions for Postgres replication I've found are the Kubernetes operators, for example CloudNativePG.

It's not just replication that you need. It's failover, recovery, monitoring, self-healing, backups, and so on.

Are there any other free/open implementations outside of kubernetes?


For the K8S folks: https://stackgres.io

I see this as one of the reasons to use kubernetes (& helm).

https://artifacthub.io/packages/helm/bitnami/postgresql

It configures all of this for you with near-zero additional configuration required. There's also postgres-ha, which handles zero-downtime failover by spawning a proxy that handles failures in a specialized way, versus just forwarding directly to a Postgres server.



