Postgres crashes on every restart: catatonit: failed to exec pid1 (No such file or directory)

gopalvserve-rgb

HOBBY OP

a month ago

Hi Railway team,

Our production Postgres service on the 'beneficial-prosperity' project will not start. It crashes immediately on every Restart and Redeploy attempt. The lead-crm Node service depends on this DB, so our production CRM at crm.stockboxtech.com is returning 502 Bad Gateway for all users.

This started after today's platform disruption (May 20, 2026). The Postgres deployment last succeeded 3 weeks ago and began failing sometime between then and today.

Project details:

- Project: beneficial-prosperity
- Project ID: 7278e4dd-22bf-4c2e-b406-10bfd9962ef1
- Postgres service ID: c3503024-4f50-4195-8d54-88f5d0f49f14
- Environment ID: 30ef9853-585b-4a44-a834-c4c521ef8a82
- Region: US East
- Image: ghcr.io/railwayapp-templates/postgres-ssl:18
- Volume: postgres-volume (vol_7ak6u125zr206y8)
- Public hostname: crm.stockboxtech.com

Deploy logs show this repeating pattern every second:

Mounting volume on: /var/lib/containers/railwayapp/bind-mounts/6e42bf23-cf5c-4fec-a3aa-76bcee9f44e3/vol_7ak6u125zr206y8

ERROR (catatonit:2): failed to exec pid1: No such file or directory

What I've tried:

1. Clicked Restart on the crashed deployment - it immediately re-crashes with the same error
2. Clicked Redeploy - the container starts, then crashes within seconds with the same error
3. Repeated both actions multiple times - identical result

The 'catatonit: failed to exec pid1' error suggests the container init process cannot locate the entrypoint binary inside the postgres-ssl:18 image. This appears to be an infrastructure-level issue with the container runtime or image pull, not something I can fix from my side.
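To illustrate what that message means at the OS level, here is a minimal sketch (not Railway's or catatonit's actual code) that reproduces the same "No such file or directory" failure by trying to exec a path that does not exist. The path `/hypothetical/missing-entrypoint` and the helper `try_exec` are made up for illustration. Note that the kernel reports the same error when the file exists but its interpreter or loader is missing, so this alone does not pinpoint which piece of the image is absent.

```python
import os
import subprocess


def try_exec(path: str) -> str:
    """Attempt to run `path` and describe how the exec step fared."""
    try:
        # subprocess.run performs fork + exec; a missing target surfaces
        # as FileNotFoundError (errno ENOENT) in the parent process.
        subprocess.run([path], check=False)
    except FileNotFoundError as exc:
        return f"failed to exec: {os.strerror(exc.errno)}"
    return "exec succeeded"


# Hypothetical path standing in for an entrypoint missing from the image.
MISSING_ENTRYPOINT = "/hypothetical/missing-entrypoint"
print(try_exec(MISSING_ENTRYPOINT))
```

An init process such as catatonit is in the same position: if the command it is asked to exec cannot be resolved inside the container's filesystem, it has nothing to run and exits, which matches the crash loop in the deploy log.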

My other Railway projects (intuitive-charm = smartcrm-saas, thorough-wholeness = celesteabode, proud-prosperity = demo, newshop, meshcentral) are all running fine, so this is specific to the Postgres service on beneficial-prosperity.

Request:

- Please investigate why catatonit cannot find pid1 in this Postgres container
- The data on postgres-volume must be preserved - we have not authorized any volume reset
- If a fresh image pull is needed on your side, please proceed

The outage banner on the dashboard was visible earlier today but has now cleared. Even after the platform recovered, Postgres keeps crashing with this same error.

Thank you for the help - happy to grant access or share more logs as needed.

Gopal (gopalvserve@gmail.com)

Solved

7 Replies

dolevalgam

PRO

a month ago

Same here. Staging DB is fine but Prod isn't. I removed the deployment and now it won't let me redeploy!

ruthwikkakumani

PRO

a month ago

Same here, I'm getting this error:

ERROR (catatonit:2): failed to exec pid1: No such file or directory

arguser

FREE

a month ago

Glad to know the Pros already reported this.

ruthwikkakumani

PRO

a month ago

Hey, it's working for me now. Try redeploying once.

coinfox

PRO

a month ago

My Postgres service (id db976043) in project b4e32f6d-b327-465c-9915-4a89f09c766d is crash-looping after the May 19 outage. Deploy log repeats:

ERROR (catatonit:2): failed to exec pid1: No such file or directory

Volume mounts successfully, but the container can't start. This appears to be image/runtime corruption, not just transient. The volume postgres-volume contains the data and is still intact per the mount log. Please restore the container image so Postgres can boot against the existing volume.

Status changed to Awaiting Railway Response Railway • about 1 month ago

sam-a

EMPLOYEE

a month ago

Apologies for this canned message but in an effort to help all our customers get back up and running, we are sending this bulk message. As you may know, we had a major interruption to our services yesterday. We've published a post-mortem if you'd like more information on the incident. It describes what happened and what we are doing to prevent it in the future. We are deeply sorry for the impact that it has had on you.

It is taking some time to bring everything back up, but we are working on it as fast as we can. In general, a redeployment should fix most service issues. Due to the volume of customers redeploying right now, builds and deploys may take longer than normal to process.

You can track recovery status here: https://status.railway.com/incident/KVZ1Z8GY

If you are still having other issues that might be related to the incident you can read more here: https://station.railway.com/community/road-to-recovery-post-gcp-outage-builds-d362e48c

Feel free to respond if your question has not been addressed.

Status changed to Awaiting User Response Railway • about 1 month ago

sam-a

EMPLOYEE

a month ago

You can track recovery status here: https://status.railway.com/incident/KVZ1Z8GY

If you are still having other issues that might be related to the incident you can read more here: https://station.railway.com/community/road-to-recovery-post-gcp-outage-builds-d362e48c

Feel free to respond if your question has not been addressed.

Railway

BOT

a month ago

This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!

Status changed to Solved Railway • 28 days ago
