Postgres volume mount failure on redeploy — Infrastructure Error (deployment 2d490562)
louisdurden
HOBBY · OP

3 hours ago

Project: 7cf16c71-ff58-4a9d-8669-9bb569068d7e

Service: Postgres-tKZj (environment: production)

Failed deployment: 2d490562-7709-4cef-9a57-454cb2f72dd4 (2026-07-02 ~10:45 UTC, reason "redeploy")

Redeploying the same image and digest that has been running since May 23 (ghcr.io/railwayapp-templates/postgres-ssl:18 @ sha256:db079cbb84096338c227efac952c9ea91ff7491b7d97975cf854affcf86ea5d0) failed at the "Deploy > Create container" step after 11 min 44 s.

The platform Diagnosis says: "This deployment failed because the platform could not mount the persistent volume to the Postgres container. The system retried 10 times over 11 minutes before giving up."

Volume: postgres-volume-pDmN. The previous deployment (969c3f9d, from May 23) is still Online and serving traffic normally.

Questions:

  1. What is the root cause? Is the node/volume healthy?
  2. Is it safe to retry the redeploy? We want to enable PITR next, which we understand may trigger a redeploy of this service.
  3. Possibly related: on the same day, long-lived pg_dump streams through the public TCP proxy (shortline.proxy.rlwy.net:47945) were dropped at ~84 s unless TCP keepalives were set (they run fine with keepalives=1&keepalives_idle=15).
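
For anyone hitting the same drops, here is a minimal sketch of how the keepalive parameters from question 3 can be appended to a libpq-style connection URI. The user, password, and database name in the example URI are placeholders, and `with_keepalives` is a hypothetical helper, not part of any Railway or PostgreSQL tooling. Only `keepalives` and `keepalives_idle` come from the report above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_keepalives(dsn: str, idle_seconds: int = 15) -> str:
    """Return the connection URI with libpq TCP keepalive parameters set."""
    parts = urlsplit(dsn)
    # Keep any query parameters already present on the URI.
    params = dict(parse_qsl(parts.query))
    # keepalives=1 enables client-side TCP keepalives; keepalives_idle is the
    # number of idle seconds before the first keepalive probe is sent.
    params.update({"keepalives": "1", "keepalives_idle": str(idle_seconds)})
    return urlunsplit(parts._replace(query=urlencode(params)))


# Placeholder credentials and database name; host and port are from the post.
dsn = with_keepalives(
    "postgresql://user:password@shortline.proxy.rlwy.net:47945/railway"
)
print(dsn)
```

The resulting URI can be passed to pg_dump as its connection string. Whether a 15 s idle interval is needed outside an incident is not something this thread establishes.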

This database backs a production system, so we would like to know whether to expect instability before we retry anything.

Solved

2 Replies

Railway
BOT

3 hours ago

Apologies for the trouble. We had an incident affecting US East, US West, and network traffic on July 2 between 07:44 and 12:01 UTC, which has since been resolved. Your volume mount failure at ~10:45 UTC and the TCP proxy connection drops both fall within that window and are almost certainly related. It should be safe to retry the redeploy now, including enabling PITR. You can see the full timeline here: incident details. Let us know if the issue persists on retry.
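
As a quick sanity check of the timing, the sketch below tests whether a timestamp falls inside the incident window quoted above. The year is taken from the failed deployment's timestamp in the original post. The names `INCIDENT_START`, `INCIDENT_END`, and `in_incident_window` are illustrative only.

```python
from datetime import datetime, timezone

# Incident window as stated in the reply: July 2, 07:44 to 12:01 UTC.
INCIDENT_START = datetime(2026, 7, 2, 7, 44, tzinfo=timezone.utc)
INCIDENT_END = datetime(2026, 7, 2, 12, 1, tzinfo=timezone.utc)


def in_incident_window(ts: datetime) -> bool:
    """True if the timezone-aware timestamp lies within the incident window."""
    return INCIDENT_START <= ts <= INCIDENT_END


# Approximate time of the failed deployment reported in the original post.
failed_deploy = datetime(2026, 7, 2, 10, 45, tzinfo=timezone.utc)
print(in_incident_window(failed_deploy))
```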


Status changed to Awaiting User Response Railway about 3 hours ago


louisdurden
HOBBY · OP

2 hours ago

Confirming resolution: we retried after the incident window closed. The redeploy succeeded on the first attempt (deployment b94ce9b4, SUCCESS) and PITR is now enabled: WAL archiving is confirmed flowing to the bucket, and the recovery window is populating. The pg_dump TCP proxy drops we mentioned also fall inside the incident window, so we consider all three questions answered. Thanks for the quick response.


Status changed to Awaiting Railway Response Railway about 2 hours ago


Status changed to Solved Railway about 2 hours ago

