a month ago
🚨 URGENT: Production Postgres stuck in catatonit crash loop after May 19 incident
Project: wholesome-unity / production
Service: Postgres
Volume: vol_rhvs1pn63q838ifk
Project URL: https://railway.com/project/aa57fc91-f60b-4a74-8c35-1fca90366964/service/46ebb7f5-d1b1-4544-8752-01c6316b2e5f
The status banner now says "fix pushed", but my Postgres service has been crash-looping for hours with:
Mounting volume on: /var/lib/containers/railwayapp/bind-mounts/.../vol_rhvs1pn63q838ifk
ERROR (catatonit:2): failed to exec pid1: No such file or directory
Another Postgres in the same project (quality-label-db) is Online — only this one is stuck. Volume mounts fine, but the container image cannot exec its entrypoint. Looks like the image was not properly re-pulled after the GCP incident.
Please rebuild/repull the image for this service. Do NOT delete the volume — production data is on it. This is a production database for a Japanese client and I need it back ASAP. Happy to provide any further info.
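For anyone comparing their own logs against the two lines quoted above, here is a minimal sketch of a signature check. It assumes you have copied the deployment log lines into a list of strings; the function name `diagnose` and the returned labels are hypothetical and are not part of any Railway tooling.

```python
import re

# Patterns copied from the log lines quoted in this post.
EXEC_FAILURE = re.compile(
    r"ERROR \(catatonit:\d+\): failed to exec pid1: (?P<reason>.+)"
)
MOUNT_LINE = re.compile(r"^Mounting volume on: ")


def diagnose(log_lines):
    """Return a rough label for the failure pattern seen in log_lines.

    The labels are illustrative only (hypothetical names).
    """
    mounted = any(MOUNT_LINE.search(line) for line in log_lines)
    failed = any(EXEC_FAILURE.search(line) for line in log_lines)
    if failed and mounted:
        # Volume mounted, but the container could not start its entrypoint.
        return "image-exec-failure"
    if failed:
        return "exec-failure-no-mount-seen"
    return "no-known-signature"


sample = [
    "Mounting volume on: /var/lib/containers/railwayapp/bind-mounts/.../vol_rhvs1pn63q838ifk",
    "ERROR (catatonit:2): failed to exec pid1: No such file or directory",
]
print(diagnose(sample))
```

A result of `image-exec-failure` only says the logs match the pattern described here (mount succeeded, entrypoint exec failed). It does not prove what caused it.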
1 Reply
a month ago
Postgres crash loop after Image Registry outage
If your Postgres service is stuck in a crash loop after the recent Railway incident and shows no new logs after
restarting, here's what happened and how to fix it:
What happened:
Railway had an outage affecting their Image Registry. Services that needed to restart couldn't pull their Docker
image and got stuck in a crash loop. Your data is safe — the volume was not affected.
Symptoms:
- Postgres service shows "Crashed" and won't restart
- No new logs appear after clicking Restart
- Backend reports Prisma error P1001: Can't reach database server
- Last logs are from before the incident date
Fix:
- Go to your Postgres service in Railway
- Click the Deployments tab
- Find the latest failed deployment
- Click the three dots (⋯) next to it
- Select Redeploy
This forces Railway to pull a fresh copy of the Postgres image from the registry, which is now stable again. Your
data will be intact.
- Once Postgres is back to Active (green), go to your backend service → Deployments → Redeploy as well.
Note: Use Redeploy from the Deployments tab, not the Restart button — Restart won't pull a fresh image.
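If you would rather not watch the dashboard before redeploying the backend, the wait can be sketched as a small polling script. This is only an illustration under assumptions: `wait_for_port` and `port_open` are hypothetical helper names, the host and port in the usage comment are placeholders for your own database connection details, and an open TCP port only shows that something is listening, not that Postgres is ready to serve queries.

```python
import socket
import time


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def wait_for_port(host, port, attempts=30, delay=5.0,
                  sleep=time.sleep, probe=port_open):
    """Poll host:port until it accepts connections or attempts run out."""
    for attempt in range(attempts):
        if probe(host, port):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # pause between probes, not after the last one
    return False


# Usage (placeholder values, replace with your own host and port):
# if wait_for_port("your-db-host.example", 5432):
#     print("Port is open; redeploy the backend now.")
```

The `sleep` and `probe` parameters exist only so the loop can be exercised without real network access or real delays.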