Patroni replica health check spew in logs after Postgres HA upgrade
mcassano
PRO, OP

2 months ago

Ever since upgrading to the Postgres HA option, I am getting this spew in my logs:

{"message":"\u001b[2m2026-03-24T13:57:04.777280Z\u001b[0m \u001b[32m INFO\u001b[0m Primary check: OK (via Patroni fallback)","severity":"info","attributes":{"level":"info"},"timestamp":"2026-03-24T13:57:04.840537606Z"}
{"message":"\u001b[2m2026-03-24T13:57:04.876602Z\u001b[0m \u001b[32m INFO\u001b[0m Replica check: PostgreSQL unreachable, falling back to Patroni \u001b[3merror\u001b[0m\u001b[2m=\u001b[0mFailed to connect to PostgreSQL","severity":"info","attributes":{"level":"info"},"timestamp":"2026-03-24T13:57:04.879488336Z"}
{"message":"\u001b[2m2026-03-24T13:57:04.892415Z\u001b[0m \u001b[32m INFO\u001b[0m Replica check: FAIL (via Patroni fallback)","severity":"info","attributes":{"level":"info"},"timestamp":"2026-03-24T13:57:04.901749790Z"}
{"message":"\u001b[2m2026-03-24T13:57:04.949706Z\u001b[0m \u001b[32m INFO\u001b[0m Replica check: PostgreSQL unreachable, falling back to Patroni \u001b[3merror\u001b[0m\u001b[2m=\u001b[0mFailed to connect to PostgreSQL","severity":"info","attributes":{"level":"info"},"timestamp":"2026-03-24T13:57:04.965695918Z"}

It seems harmless, which is great, but it points to an issue. Are you seeing this in other replica setups? I haven't touched any of the Postgres HA options since upgrading from single-instance Postgres to the HA option.
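For anyone triaging similar output, here is a rough sketch (not Railway tooling) of how the lines above could be decoded and tallied. It assumes each log entry is one JSON object with a `message` field containing ANSI color codes, as in the sample; the names `ANSI_SGR`, `clean_message`, and `SAMPLE_LINES` are made up for illustration.

```python
import json
import re
from collections import Counter

# Matches ANSI color/style sequences such as ESC[2m, ESC[32m, ESC[0m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")

# Two of the log lines from the post, kept as raw strings so the JSON
# parser (not Python) decodes the \u001b escapes.
SAMPLE_LINES = [
    r'{"message":"\u001b[2m2026-03-24T13:57:04.777280Z\u001b[0m \u001b[32m INFO\u001b[0m Primary check: OK (via Patroni fallback)","severity":"info","attributes":{"level":"info"},"timestamp":"2026-03-24T13:57:04.840537606Z"}',
    r'{"message":"\u001b[2m2026-03-24T13:57:04.876602Z\u001b[0m \u001b[32m INFO\u001b[0m Replica check: PostgreSQL unreachable, falling back to Patroni \u001b[3merror\u001b[0m\u001b[2m=\u001b[0mFailed to connect to PostgreSQL","severity":"info","attributes":{"level":"info"},"timestamp":"2026-03-24T13:57:04.879488336Z"}',
]


def clean_message(raw_line: str) -> str:
    """Return the human-readable text of one JSON log line."""
    message = json.loads(raw_line)["message"]
    plain = ANSI_SGR.sub("", message)
    # The message starts with "<timestamp> <level> "; keep only what follows.
    _timestamp, _level, text = plain.split(None, 2)
    return text


counts = Counter(clean_message(line) for line in SAMPLE_LINES)
for text, count in counts.most_common():
    print(count, text)
```

Run against a full log export, a tally like this would show which of the three distinct messages accounts for most of the volume.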

Solved

3 Replies

Status changed to Awaiting Railway Response Railway about 2 months ago


mcassano
PRO, OP

2 months ago

To provide context, I get 39k of these messages every 30 minutes in the logs. That's a lot.
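To put that figure in perspective, a quick back-of-the-envelope calculation (assuming the 39k count is spread evenly over the 30 minutes):

```python
messages = 39_000          # count reported above
window_seconds = 30 * 60   # 30 minutes

per_second = messages / window_seconds
per_day = messages * (24 * 60 // 30)  # 48 half-hour windows per day

print(f"{per_second:.1f} messages per second")  # 21.7 messages per second
print(f"{per_day:,} messages per day")          # 1,872,000 messages per day
```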


Status changed to Awaiting User Response Railway about 2 months ago


2 months ago

Hi, thank you for reporting this. Your source Postgres is 2 years old and had a different env var configuration. `PGHOST` should be `${{RAILWAY_PRIVATE_DOMAIN}}`, not `${{RAILWAY_TCP_PROXY_DOMAIN}}`.

I am working on a fix so future conversions don't run into this issue, but in the meantime you may want to update your `PGHOST` to `${{RAILWAY_PRIVATE_DOMAIN}}`.

These logs are not a big deal: they mean the health check is failing to do the fast liveness check and is falling back to a slightly slower one (which is perfectly fine). Updating the env var will make the logs go away.
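As an illustration of the "fast check, then slower fallback" pattern described above, here is a minimal sketch. It is not Railway's actual health checker: the function names are hypothetical, the fast path is assumed to be a plain TCP connection to PostgreSQL on port 5432, and the fallback is assumed to query Patroni's REST API (`/primary` or `/replica`, default port 8008), which returns HTTP 200 when the node holds that role.

```python
import socket
import urllib.request
from typing import Callable


def postgres_reachable(host: str, port: int = 5432, timeout: float = 1.0) -> bool:
    """Fast path (assumed): can a TCP connection to PostgreSQL be opened?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def patroni_reports_role(host: str, role: str, port: int = 8008, timeout: float = 2.0) -> bool:
    """Slow path (assumed): ask Patroni's REST API whether the node holds `role`."""
    url = f"http://{host}:{port}/{role}"  # role is "primary" or "replica"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError (non-2xx), and timeouts
        return False


def check_node(role: str, fast_check: Callable[[], bool], fallback_check: Callable[[], bool]) -> str:
    """Try the fast check first; only consult the fallback if it fails."""
    if fast_check():
        return f"{role} check: OK"
    healthy = fallback_check()
    return f"{role} check: {'OK' if healthy else 'FAIL'} (via Patroni fallback)"


# Simulated run without touching the network: the fast path fails,
# the fallback succeeds.
print(check_node("Primary", lambda: False, lambda: True))
# Primary check: OK (via Patroni fallback)
```

In real use the two callables would wrap `postgres_reachable` and `patroni_reports_role` with the host taken from `PGHOST`. If that host cannot be reached by the checker, the fast path fails on every cycle and every check goes through the fallback, which would be consistent with the repeated log lines in the original post.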


paulo


mcassano
PRO, OP

2 months ago

Very good, thank you very much for the investigation and findings.

I changed PGHOST as you recommended (just on Postgres; Postgres 2 and Postgres 3 didn't carry over vars from before HA) and the logs are clean now.


Status changed to Awaiting Railway Response Railway about 2 months ago


Status changed to Solved mcassano about 2 months ago

