16 days ago
Our production PostgreSQL service on Railway stopped starting unexpectedly.
Service details:
Project: RIP-Tear-api
Environment: production
Service: Postgres
Source image: ghcr.io/railwayapp-templates/postgres-ssl:17
Current behavior:
The Postgres service is stuck in a crash loop.
Deploy logs show:
Mounting volume on: /var/lib/containers/railwayapp/bind-mounts/.../vol_...
ERROR (catatonit:2): failed to exec pid1: No such file or directory
The volume still exists and is attached to the service.
There are no backups available in the Backups tab.
Important context:
We previously saw some application-level SQL errors caused by invalid UUID input from our API, but those were normal query errors and should not prevent PostgreSQL from starting.
The current issue appears to happen before PostgreSQL itself starts, during container startup.
We do not have a custom start command configured for this Postgres service.
The service still points to the default Railway Postgres image and variables look default.
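The distinction drawn above, between a failure during container startup and a failure inside PostgreSQL, can be checked mechanically against the deploy logs. The following is a minimal sketch, not a Railway tool: the function name `classify_deploy_log` and its return labels are hypothetical. The first pattern is taken from the log line quoted in this thread, and the second is the standard PostgreSQL "ready" message.

```python
import re

# Pattern based on the error line quoted in this thread.
PID1_EXEC_FAILURE = re.compile(r"\(catatonit:\d+\): failed to exec pid1")

# Message PostgreSQL logs once it has finished starting up.
POSTGRES_READY = "database system is ready to accept connections"


def classify_deploy_log(lines):
    """Return 'container-startup', 'postgres-started', or 'unknown'.

    Hypothetical helper: the labels are illustrative, not Railway terminology.
    """
    saw_postgres_ready = False
    for line in lines:
        if PID1_EXEC_FAILURE.search(line):
            # The container's init process could not exec the entrypoint,
            # so PostgreSQL never had a chance to run in this attempt.
            return "container-startup"
        if POSTGRES_READY in line:
            saw_postgres_ready = True
    return "postgres-started" if saw_postgres_ready else "unknown"


if __name__ == "__main__":
    sample = [
        "Mounting volume on: /var/lib/containers/railwayapp/bind-mounts/.../vol_...",
        "ERROR (catatonit:2): failed to exec pid1: No such file or directory",
    ]
    print(classify_deploy_log(sample))
```

A result of `container-startup` is consistent with the observation that the earlier SQL errors are unrelated, since the failure occurs before any query could run.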
What we need help with:
Please help recover data from the existing volume.
If possible, please reattach or migrate the existing volume to a healthy PostgreSQL instance.
If this is a broken deployment/image issue, please advise the safest recovery path without losing the current volume data.
This is a production database, so we want to avoid any action that could destroy the existing volume or make recovery harder.
2 Replies
16 days ago
The `catatonit: failed to exec pid1` error is a known issue caused by a stale container image, typically after host-level disruptions, and does not indicate volume data loss. To resolve it, open your Postgres service, use the command palette (Cmd/Ctrl+K), and select "Redeploy source image" to re-pull a fresh image. A normal redeploy from the three-dot menu will not work because it reuses the cached image. If the service does not recover after that, let us know and we can look into the volume directly.
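To confirm whether the service has recovered after the redeploy, one option is to probe it with PostgreSQL's `pg_isready` client utility. This is a minimal sketch under stated assumptions: `pg_isready` is installed locally, the database is reachable from where the script runs, and a `DATABASE_URL` environment variable holds the connection string. None of these details come from this thread, and the helper names are hypothetical.

```python
import os
import subprocess

# Exit codes documented for pg_isready.
PG_ISREADY_STATUS = {
    0: "accepting connections",
    1: "rejecting connections (for example, still starting up)",
    2: "no response",
    3: "no attempt made (for example, invalid parameters)",
}


def describe_exit_code(code):
    """Translate a pg_isready exit code into a readable status."""
    return PG_ISREADY_STATUS.get(code, f"unexpected exit code {code}")


def check_postgres(database_url, timeout_seconds=5):
    """Run pg_isready against the given connection string (hypothetical helper)."""
    result = subprocess.run(
        [
            "pg_isready",
            f"--dbname={database_url}",
            f"--timeout={timeout_seconds}",
        ],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit code is a result here, not an error
    )
    return describe_exit_code(result.returncode)


if __name__ == "__main__":
    # DATABASE_URL is an assumed variable name; substitute your own connection string.
    url = os.environ.get("DATABASE_URL")
    if url:
        print(check_postgres(url))
    else:
        print("Set DATABASE_URL to the Postgres connection string first.")
```

A status of "accepting connections" only shows that the server is up. It does not verify that the data on the volume is intact, so checking a few known tables afterwards is still worthwhile.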
Status changed to Awaiting User Response Railway • 16 days ago
15 days ago
Thanks, it worked!
Status changed to Awaiting Railway Response Railway • 15 days ago
Status changed to Solved flamerevenge • 15 days ago