Deployments fail healthchecks intermittently due to database connection failures
xevion
HOBBYOP

24 days ago

I'm running Axum/Svelte/Postgres/sqlx for my personal website.
This is what my database connection logs look like (there are 6 more attempts after this one before the container is killed because the healthcheck fails).
It's quite intermittent and always goes one of two ways: it connects immediately (the healthcheck succeeds on the first try), or it fails because the healthcheck never returns 200 OK.
The issue is always that the database connection never appears to resolve.

{
  "message": "Database connection failed, retrying...",
  "attributes": {
    "attempt": 1,
    "delay_secs": 1,
    "error": "pool timed out while waiting for an open connection",
    "level": "warn",
    "max_attempts": 10,
    "target": "xevion::db",
    "timestamp": "2026-01-14T18:28:59.926786936Z"
  },
  "tags": {
    "project": "4f95bcc3-aeda-4bd6-9345-bee985903c94",
    "environment": "0efba50e-21aa-41aa-b594-6d532c39ccfd",
    "service": "36ae7bd0-8406-42e9-bb51-ba96d9f7261e",
    "deployment": "ae911cf1-2afc-48c9-a73a-5155f15e35d7",
    "replica": "6110a7fc-efd5-41bc-a87d-bb135f61650d"
  },
  "timestamp": "2026-01-14T18:28:59.929080955Z"
}
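
For anyone following along, the log above implies a bounded retry loop around pool creation (`attempt`, `delay_secs`, `max_attempts`). The sketch below is not the code from this project. It is a minimal, std-only illustration of that pattern, assuming an exponential delay schedule (the log only shows a 1-second delay on attempt 1, so the doubling is an assumption). The names `backoff_delay` and `retry_with_backoff` are hypothetical, and the loop is synchronous so it runs without dependencies; in an async Axum/sqlx app the sleep would be an async one instead.

```rust
use std::time::Duration;

/// Delay to wait after failed attempt number `attempt` (1-based).
/// Assumed schedule: base, 2*base, 4*base, ... capped at `cap`.
fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt.saturating_sub(1));
    base.saturating_mul(factor).min(cap)
}

/// Calls `op` until it succeeds or `max_attempts` is reached.
/// `sleep` is injected so callers can pass `std::thread::sleep`
/// in a real program or a recorder in tests.
fn retry_with_backoff<T, E, F, S>(
    max_attempts: u32,
    base: Duration,
    cap: Duration,
    mut op: F,
    mut sleep: S,
) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
    S: FnMut(Duration),
{
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(backoff_delay(attempt, base, cap));
                attempt += 1;
            }
        }
    }
}
```

One thing this makes easy to check: the worst-case startup time is the sum of all delays plus whatever each attempt spends waiting on the pool before it times out. If that total exceeds the healthcheck window, the container is killed even though a later attempt might have connected.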

16 Replies

xevion
HOBBYOP

24 days ago

Project ID: 4f95bcc3-aeda-4bd6-9345-bee985903c94


xevion
HOBBYOP

24 days ago

Fully open source. Some key files:
Dockerfile:
async fn create_pool(database_url: &str):


xevion
HOBBYOP

24 days ago

Database is completely fine, and re-deploying will often resolve it. It's just a random roll of the dice whether it works on the first try or not. Which screams race condition, meaning I must be doing something wrong.


xevion
HOBBYOP

24 days ago

I will note I'm on an Alpine image, which I recall was once problematic with Railway, but I can't find anything in the docs about it still being problematic, so I assume I'm fine there.


xevion
HOBBYOP

24 days ago

Also, why is the edge being used in Asia? railway/asia-southeast1-eqsg3a??


xevion
HOBBYOP

24 days ago

I set my deployment for US East


brody
EMPLOYEE

24 days ago

It's an anycast network.


xevion
HOBBYOP

24 days ago

I assume the edge being used here has something to do with why my request takes 3.5 seconds.


xevion
HOBBYOP

24 days ago

Note that it was fine earlier, so I'm wondering what changed


brody
EMPLOYEE

24 days ago

Are you geographically close to Asia?


xevion
HOBBYOP

24 days ago

Not at all. I live in Texas. I also migrated both the database and server services in Railway to US West, but the edge used is still Asia.


brody
EMPLOYEE

24 days ago

VPN? Cloudflare?


xevion
HOBBYOP

24 days ago

VPN no, Cloudflare yes


brody
EMPLOYEE

24 days ago

What happens if you disable Cloudflare?


xevion
HOBBYOP

24 days ago

Unsure, because it takes time to propagate. But the generated domain resolves in 370ms via a us-east4 edge.


brody
EMPLOYEE

24 days ago

So then it's the routing between the Cloudflare PoP and our network.

