Backup restore infinite loading
askara
PROOP

9 months ago

I attempted to restore a database backup, but it has been showing "Restoring from backup..." for 2 hours now.

The "Restore" buttons in the UI are now disabled. There is nothing I can do at the moment: I can't cancel or try again.

Can you please restore our database https://railway.com/project/f314bc41-e76d-4834-bdf6-180b61b1712a/service/89b56dbf-4964-4c4f-be7d-6eb2d832151b/data?environmentId=4d4d389c-8360-4211-9746-6ab65cfa6d8c to the Railway backup labeled "2025-06-17 at 04:50 UTC"?

Thank you,

Antonio

Solved

5 Replies

9 months ago

I see a couple volumes on the canvas. The restore user experience needs some definite work and we will do better here.

Please let me know if you're still blocked; happy to look into this.


Status changed to Awaiting User Response Railway 9 months ago


jake

I see a couple volumes on the canvas. The restore user experience needs some definite work and we will do better here. Please let me know if you're still blocked; happy to look into this.

askara
PROOP

9 months ago

The backup restore "loading" process eventually finished, and it created a volume on the canvas labeled "2025-06-17 at 04:50". The process didn't end up creating any "Stage Changes" against the "Postgres" service, and the volume looked suspiciously empty. I then clicked "Restore" again on the same backup labeled "2025-06-17 at 04:50 UTC". I got an error notification (I don't remember the text), but it kept loading and eventually created another volume, now labeled "2025-06-17 at 04:50-6WXN". The same thing occurred: no staged changes, and the volume looked suspiciously empty.

At that point I did not want to proceed with mounting the volume on the "Postgres" service, as I am afraid I may be mounting an empty or corrupted volume and losing all my backups, which I want to avoid.

I created a new Postgres service named "Postgres-Main-Alt" for volume testing purposes. I mounted the volume "2025-06-17 at 04:50-6WXN" on the "Postgres-Main-Alt" service. The volume looks empty in the UI.
I can't establish a connection to this database; the error is "[28P01] FATAL: password authentication failed for user "postgres"". The Railway UI can't establish a connection to this database either: the "Database Connection" spinner spins indefinitely, with an API network error of "password authentication failed for user "postgres"". So I can't test this volume, but so far it seems like a good decision not to mount it on my production "Postgres" service.
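For readers triaging the same error: `28P01` is PostgreSQL's SQLSTATE for `invalid_password`, which means the server was reached and answered but rejected the credentials. That is a different failure from an unreachable database. One possible explanation, not confirmed anywhere in this thread, is that the restored volume already holds a data directory initialized with the original service's password, so the password generated for "Postgres-Main-Alt" does not match it. Below is a minimal Python sketch for sorting such errors by SQLSTATE. The `diagnose` and `server_answered` helpers and the hint wording are hypothetical; only the codes themselves come from PostgreSQL's error-code table.

```python
# Hypothetical triage helpers. The SQLSTATE codes are PostgreSQL's own;
# the hint text and function names are illustrative assumptions.
SQLSTATE_HINTS = {
    "28P01": "invalid_password: server reachable, credentials rejected",
    "28000": "invalid_authorization_specification: server reachable, login refused",
    "3D000": "invalid_catalog_name: server reachable, database does not exist",
    "57P03": "cannot_connect_now: server is starting up or recovering",
    "08006": "connection_failure: network or server-side connection problem",
}

# Codes that can only be produced by a PostgreSQL server that actually replied.
ANSWERED_BY_SERVER = {"28P01", "28000", "3D000", "57P03"}


def normalize(sqlstate: str) -> str:
    """Accept forms like '[28P01]' or ' 28p01 ' and return '28P01'."""
    return sqlstate.strip("[] ").upper()


def diagnose(sqlstate: str) -> str:
    """Return a short triage hint for a SQLSTATE code."""
    code = normalize(sqlstate)
    return SQLSTATE_HINTS.get(code, f"unrecognized SQLSTATE {code}")


def server_answered(sqlstate: str) -> bool:
    """True when the code implies PostgreSQL itself responded."""
    return normalize(sqlstate) in ANSWERED_BY_SERVER


if __name__ == "__main__":
    print(diagnose("[28P01]"))
    print(server_answered("[28P01]"))
```

If `server_answered` is true for the code you see, the data directory on the volume was at least readable enough for PostgreSQL to start, so the next thing to check is which credentials that data directory expects.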

I then clicked "Restore" on an older backup labeled "2025-06-16 at 04:50 UTC". Contrary to the previous attempts, this action did result in staged changes for the "Postgres" service, waiting for me to click "Deploy". I am not comfortable doing that at this point: unlike the previous restores, this volume is not on the canvas but attached to the "Postgres" service in the UI, and it also looks empty like the previous volumes. I don't feel confident proceeding with this backup restore, and I can't test it by attaching it to the test service "Postgres-Main-Alt" (or another one), since it is attached to the main "Postgres" service rather than sitting on the canvas.

Could you please shed some light on this situation? I am not sure how to proceed.

The service did get automatically upgraded to Railway Metal today (after a failed automatic upgrade), and I deployed some app changes today. I didn't inspect the app in between those, but after the deployment I noticed the database is in a weird state: it has several-months-old data for some rows and tables, some rows are missing, but it also appears to have some newer rows. Very odd; I didn't make a full inspection to better understand the state.
This was my initial reason for doing a backup restore.
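A mixed state like this (old values, missing rows, and some newer rows) is easier to reason about with a per-table comparison against a known-good reference, such as an external backup. The sketch below assumes you have already collected a row count and a latest-change timestamp per table from both sources. How you collect them is up to you, and a timestamp column such as `updated_at` is an assumption, not something mentioned in this thread. `TableSummary` and `find_regressions` are hypothetical names.

```python
from datetime import datetime
from typing import Dict, NamedTuple, Optional


class TableSummary(NamedTuple):
    row_count: int
    latest_change: Optional[datetime]  # None if the table has no timestamp column


def find_regressions(
    reference: Dict[str, TableSummary],
    current: Dict[str, TableSummary],
) -> Dict[str, str]:
    """Return {table: reason} for tables where `current` looks behind `reference`."""
    problems: Dict[str, str] = {}
    for table, ref in reference.items():
        cur = current.get(table)
        if cur is None:
            problems[table] = "missing from current"
        elif cur.row_count < ref.row_count:
            problems[table] = f"rows dropped from {ref.row_count} to {cur.row_count}"
        elif ref.latest_change is not None and (
            cur.latest_change is None or cur.latest_change < ref.latest_change
        ):
            problems[table] = "latest change is older than in the reference"
    return problems


if __name__ == "__main__":
    # Made-up numbers, purely to show the shape of the input.
    reference = {"users": TableSummary(100, datetime(2025, 6, 17))}
    current = {"users": TableSummary(100, datetime(2025, 2, 1))}
    print(find_regressions(reference, current))
```

A lower row count is not always a regression (rows may have been deleted on purpose), so treat the output as a list of tables to inspect by hand rather than as proof of data loss.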

Thank you in advance!


Status changed to Awaiting Railway Response Railway 9 months ago


9 months ago

What looks to have happened is that we migrated the deployment successfully 11 hours ago, and now we've gone ahead and swapped in a few different backups about 5 hours ago.

My best recommended course of action would be:

  • Figure out which ones are restores (looks like the ones with dates)

  • Discard the current staged change (which looks to be swapping in a restore)

  • Figure out what to do with pgdata and volcano-volume

It looks like you've really twisted up the deployment here. What were you trying to solve? What was the issue?


Status changed to Awaiting User Response Railway 9 months ago


askara
PROOP

9 months ago

Thanks for the follow-up.

To clarify:
My original goal was to restore the database to a known good state because, after the Railway Metal migration and a deploy, I noticed critical data inconsistencies. Some row values reflect a prior version of the data, from several months ago, suggesting a possible rollback or partial data loss. I was deploying regular app updates with a DB migration for index updates. A code deployment alone wouldn't roll back persisted database changes made months ago, so there must have been something more to it.

For this reason, I attempted to initiate a restore from the backup labeled "2025-06-17 at 04:50 UTC", but no restore ever actually completed. The process:

  • Got stuck on "Restoring from backup..." for over 2 hours

  • Created canvas volumes that looked empty

  • Did not trigger any UI staged changes

  • Returned errors or authentication failures when I mounted the volumes on a test Postgres service

Out of caution, I’ve not deployed any of these restore attempts to the production service.

What I need now:

  1. I’d appreciate help confirming whether the backups labeled '2025-06-17 at 04:50' and '2025-06-16 at 04:50' are complete and valid. From what I see in the UI, they appear empty or incomplete, but I’m not familiar enough with the backup interface to determine that with confidence.

  2. I need a reliable way to safely test a backup within the constraints of the Railway UI, without risking corruption or loss of the original backups.
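On point 2, one hedged approach is the one already tried above: mount the restored volume on a throwaway service rather than production, then check that a few tables you know should be populated actually contain rows. The sketch below takes any Python DB-API 2.0 connection. The demo uses an in-memory `sqlite3` database only so that it runs without a server; against the test service you would pass a connection from a PostgreSQL driver instead. The table names, `count_rows`, and `looks_populated` are all hypothetical.

```python
import sqlite3  # demo only; swap in a PostgreSQL driver's connection in practice
from typing import Dict, Iterable


def count_rows(conn, tables: Iterable[str]) -> Dict[str, int]:
    """Run SELECT COUNT(*) for each table over a DB-API 2.0 connection."""
    counts: Dict[str, int] = {}
    cur = conn.cursor()
    try:
        for table in tables:
            # Identifiers cannot be bound as parameters, so allow only plain names.
            if not table.replace("_", "").isalnum():
                raise ValueError(f"unsafe table name: {table!r}")
            # Note: double-quoted identifiers are case-sensitive in PostgreSQL.
            cur.execute(f'SELECT COUNT(*) FROM "{table}"')
            counts[table] = cur.fetchone()[0]
    finally:
        cur.close()
    return counts


def looks_populated(counts: Dict[str, int], minimums: Dict[str, int]) -> bool:
    """True if every table listed in `minimums` has at least that many rows."""
    return all(counts.get(table, 0) >= minimum for table, minimum in minimums.items())


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO users (id) VALUES (?)", [(1,), (2,)])
    counts = count_rows(conn, ["users"])
    print(counts, looks_populated(counts, {"users": 1}))
    conn.close()
```

This only reads from the restored copy, so it does not touch the original backups. It also only proves the volume is non-empty, not that the data is correct.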

Appreciate your help - just trying to avoid overwriting production with a broken or empty volume, and losing all other backups.

Thank you


Status changed to Awaiting Railway Response Railway 9 months ago


askara
PROOP

9 months ago

I restored from an external AWS backup instead of using a Railway backup.


Status changed to Solved askara 9 months ago

