Postgres crash-looping since outage

wingrover

HOBBYOP

a month ago

Project: Socket Study

Service: Postgres (production)

Symptom: Postgres container is crash-looping with `failed to exec pid1: No such file or directory`. Logs show the entrypoint binary can't be found, so it looks like the container image is broken post-outage rather than a config issue on my end.

I restarted the Postgres service once during the outage recovery (around 04:13 UTC). That's likely when the broken container was pulled. Have not touched it since.

App-side impact: App deploys fail at the pre-deploy migration step with psycopg2.OperationalError: connection to server at "postgres.railway.internal" ... Connection timed out. App code/config is fine — same commit deployed successfully on May 15.
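One way to confirm that the timeout comes from the database being unreachable rather than from app code is a plain TCP probe from the app container. This is only a minimal sketch: the helper name `postgres_reachable` is hypothetical, and port 5432 is assumed because it is the Postgres default (the actual port is not shown in the error above).

```python
import socket


def postgres_reachable(host: str, port: int = 5432, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper: it only checks that something accepts connections,
    not that Postgres itself is healthy or accepting logins.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Hostname taken from the error message; port 5432 is an assumption.
    host = "postgres.railway.internal"
    state = "reachable" if postgres_reachable(host) else "unreachable"
    print(f"{host}: {state}")
```

If the probe fails while the same commit deployed fine earlier, that points at the database container rather than the migration code.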

Please help recover the Postgres container with the existing volume attached — do not provision fresh. Data should be intact on the volume; just need a working container to mount it.

Happy to share project ID / logs / anything else needed. Day before a major exam for my users, so any help appreciated. Thanks 🙏

Solved

7 Replies

Railway

BOT

a month ago

Your Postgres service appears to have been affected by the [service disruption on May 20](https://status.railway.com/incident/I23M92U0), which impacted the Image Registry during the window when you restarted. The incident is now resolved, so redeploying your Postgres service from the service's three-dot menu should pull a clean image and start it against your existing volume - your data will remain intact on the volume.

Status changed to Awaiting User Response Railway • about 1 month ago

wingrover

HOBBYOP

a month ago

I've done that a couple of times, and that is when the above issue arises.

Status changed to Awaiting Railway Response Railway • about 1 month ago

wingrover

HOBBYOP

a month ago

I really need help here. I have been out of action since the initial downtime earlier today.

simenrom

FREE

a month ago

Same for me. Restarts just produce the same crash. When I try to redeploy, I get a "Problem processing request" error.

airwingwebdesignseo

PRO

a month ago

Same issue here. After attempting to redeploy the DB I keep getting:

Mounting volume on: /var/lib/containers/railwayapp/bind-mounts/XXX/vol_XXX

ERROR (catatonit:2): failed to exec pid1: No such file or directory

wingrover

HOBBYOP

a month ago

Back online now after updating Postgres from 18 to 18.4.

chandrika

EMPLOYEE

a month ago

Glad you're back online! Updating the Postgres version works because it forces a fresh image pull instead of reusing the corrupted one from the outage. For the others still stuck, updating your Postgres service's version from 18 to 18.4 in the service settings should resolve the crash-loop the same way.

Status changed to Awaiting User Response Railway • about 1 month ago

Status changed to Solved chandrika • about 1 month ago
