AI agent replaced my data volume with empty one - original vol missing
siriuswitje
HOBBYOP

24 days ago

During recovery attempts today after the May 19 incident, Railway's AI agent created and attached a new empty volume to my Postgres service, replacing my original volume, which contained 433 MiB of data.

Can you please confirm my original volume still exists and reattach it?

Project: responsible-forgiveness

Service: Postgres

Original volume ID: vol_8nc40nhuv4kug67p

New empty volume ID: vol_evxob2zpespqk3yx

Last clean checkpoint: 2026-05-20 08:12:58 UTC

During the incident on 2026-05-20 (~09:00-11:00 UTC), my Postgres volume was replaced:

Old volume ID: vol_8nc40nhuv4kug67p

  • Size: 433 MiB
  • Status: Missing/replaced

New volume ID: d9cdb01f-22d3-44c3-a75c-9e5befbf9bc7

  • Size: 500 MiB
  • Status: Empty

Last clean database checkpoint: 2026-05-12 12:59 UTC

The data is production critical. Can Railway support help restore from a backup or provide access to the old volume?

Awaiting Railway Response

9 Replies

Status changed to Awaiting Railway Response Railway 24 days ago


sam-a
EMPLOYEE

24 days ago

Apologies for this canned message but in an effort to help all our customers get back up and running, we are sending this bulk message. As you may know, we had a major interruption to our services yesterday. [We've published a post-mortem if you'd like more information on the incident](https://blog.railway.com/p/incident-report-may-19-2026-gcp-account-outage). It describes what happened and what we are doing to prevent it in the future. We are deeply sorry for the impact that it has had on you.

It is taking some time to bring everything back up, but we are working on it as fast as we can. In general, a redeployment should fix most service issues. Due to the volume of customers redeploying right now, builds and deploys may take longer than normal to process.

You can track recovery status here: https://status.railway.com/incident/KVZ1Z8GY

If you are still having other issues that might be related to the incident you can read more here: https://station.railway.com/community/road-to-recovery-post-gcp-outage-builds-d362e48c

Feel free to respond if your question has not been addressed.


Status changed to Awaiting User Response Railway 24 days ago


siriuswitje
HOBBYOP

23 days ago

My Postgres is still in a crash loop with WAL corruption (PANIC: could not locate valid checkpoint record). I have not been able to connect at all. Please move my volume to a healthy node.

Project ID: responsible-forgiveness

Service: Postgres

Volume: vol_8nc40nhuv4kug67p


Status changed to Awaiting Railway Response Railway 23 days ago


chandrika
EMPLOYEE

23 days ago

Hey, we've escalated this internally for investigation. We'll follow up when we have more information. Sorry for the wait.


Status changed to Awaiting User Response Railway 23 days ago


siriuswitje
HOBBYOP

23 days ago

Hey thank you! Hopefully it can be resolved soon, I have an important week where a lot of people should be able to visit my website.


Status changed to Awaiting Railway Response Railway 23 days ago


siriuswitje
HOBBYOP

21 days ago

Hey, it's been 3 days since my website went down, and 2 days since this was escalated with still no resolution or update. This is directly costing me customers during a critical week. I really need either a fix or a concrete timeline today, otherwise I'll have to look at alternatives. Please prioritize this.


siriuswitje
HOBBYOP

17 days ago

Update: I've reviewed the startup logs and can see pgbackrest is configured with a 433 MiB volume. The last known good state was 2026-05-20 08:12:58 UTC before the WAL corruption occurred. Could you please perform a pgbackrest restore to that point? This has now been down for 7 days with no update.
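For readers unfamiliar with the request above, a point-in-time restore with pgBackRest might look roughly like the sketch below. It assumes that a usable backup actually exists in the pgBackRest repository and that Postgres is stopped while the restore runs; neither is confirmed in this thread. `pitr_restore_command` is a hypothetical helper that only prints the command, and the stanza name `main` is a placeholder because the real stanza name is not given here.

```shell
#!/bin/sh
# Hypothetical helper: prints a pgBackRest point-in-time restore command.
# It does not run pgbackrest itself.
# STANZA is a placeholder; the real stanza name is not stated in this thread.
STANZA="${STANZA:-main}"

pitr_restore_command() {
    # $1 = recovery target timestamp (with time zone offset)
    # --delta           reuse files already present in the data directory
    # --type=time       recover to a point in time
    # --target-action   promote the server once the target is reached
    printf 'pgbackrest --stanza=%s --delta --type=time "--target=%s" --target-action=promote restore\n' \
        "$STANZA" "$1"
}

# Timestamp taken from the post above (last known good state, UTC).
pitr_restore_command "2026-05-20 08:12:58+00"
```

Printing the command rather than running it keeps the sketch safe to try anywhere; whether a restore is possible at all depends on what the repository contains.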


mykal

16 days ago

Really sorry for the wait. Your 433 MiB of data is confirmed still on the volume - the issue is WAL corruption from the May 19 service disruption, not missing data.

Before attempting recovery, please take a manual backup of your Postgres volume first - go to your Postgres service's Backups tab and create one. That way you have a safety net.

Once the backup completes, you can fix the corruption by going to your Postgres service settings and setting the custom start command to `sh -c "pg_resetwal -D /var/lib/postgresql/data/pgdata"`, then redeploy. Your earlier recovery attempts failed because `cd` isn't a standalone executable in containers - wrapping the command in `sh -c` avoids that.

Once the redeploy succeeds and logs show the WAL was reset, remove the custom start command and redeploy again so Postgres starts normally. Note that `pg_resetwal` may lose any transactions that were in-flight at the moment of the crash (08:12:58 UTC), but the rest of your data will be intact.
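The start command from the reply above can be sketched as follows, assuming the data directory path quoted there. `reset_wal_command` is a hypothetical helper that only prints the string to paste into the custom start command field; it does not run `pg_resetwal`. The `-n` (dry-run) option is a standard `pg_resetwal` flag that prints the values it would write without modifying anything, so it may be worth deploying that variant first and reading the logs.

```shell
#!/bin/sh
# Hypothetical helper: prints the custom start command; it does not run pg_resetwal.
# PGDATA_DIR is the path quoted in the reply above; adjust if your volume differs.
PGDATA_DIR="/var/lib/postgresql/data/pgdata"

reset_wal_command() {
    # $1 = "dry-run" adds -n, which reports what would change and modifies nothing
    if [ "${1:-}" = "dry-run" ]; then
        printf 'sh -c "pg_resetwal -n -D %s"\n' "$PGDATA_DIR"
    else
        printf 'sh -c "pg_resetwal -D %s"\n' "$PGDATA_DIR"
    fi
}

reset_wal_command dry-run   # inspect first
reset_wal_command           # the command from the reply
```

Note that `pg_resetwal` can refuse to proceed on a cluster that was not shut down cleanly unless `-f` is given; since forcing a reset may lose data, that is best decided together with support.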


Status changed to Awaiting User Response Railway 16 days ago


siriuswitje
HOBBYOP

15 days ago

Hey, thank you for your answer! I cannot create a backup because I am on the Hobby plan, not the Pro plan. How should I proceed?


Status changed to Awaiting Railway Response Railway 15 days ago


siriuswitje
HOBBYOP

15 days ago

Also, the start command does not work: it needs to be run by the postgres superuser, not root.
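`pg_resetwal` refuses to run as root, which matches the failure described above. One possible workaround, offered only as a sketch, is to hand the command off to the `postgres` OS user. This assumes the container image ships `su` and has a `postgres` user, neither of which is confirmed in this thread. `wal_reset_invocation` is a hypothetical helper that prints the invocation to use for a given user ID.

```shell
#!/bin/sh
# Hypothetical wrapper: pick the pg_resetwal invocation based on the current user.
# It only prints the command; nothing is executed against the data directory.
PGDATA_DIR="/var/lib/postgresql/data/pgdata"

wal_reset_invocation() {
    # $1 = numeric uid of the user the start command runs as
    if [ "$1" -eq 0 ]; then
        # root: switch to the postgres OS user (assumes `su` exists in the image)
        printf "su postgres -c 'pg_resetwal -D %s'\n" "$PGDATA_DIR"
    else
        # already unprivileged: run pg_resetwal directly
        printf 'pg_resetwal -D %s\n' "$PGDATA_DIR"
    fi
}

wal_reset_invocation "$(id -u)"
```

As a custom start command, the root case would be wrapped the same way as before: `sh -c "su postgres -c 'pg_resetwal -D /var/lib/postgresql/data/pgdata'"`.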

