20 days ago
Important clarification:
The database data was already missing before we attempted any volume swap.
Timeline:
- Nobody from our team deployed, restored, deleted, migrated, or intentionally modified the production database.
- We noticed the application was failing and immediately checked the Railway PostgreSQL database.
- At that moment, the database was already missing data. Many tables were empty/missing and the DB size was only around 14 MB, while previous production volume/backups were around 1.15 GB.
- We then saw multiple PostgreSQL volumes/snapshots in the Railway UI, including:
  - postgres-volume
  - postgres-2026-03-30 18:33 UTC
  - postgres-2026-03-28 07:10 UTC
- Because the production DB was already missing data, we tried switching/attaching another volume to check whether the original data was still there.
- After trying the volume swap, the DB still did not show the expected production data.
- We then attempted to switch back / reattach the previous volume, but Railway now blocks the operation with:
  "Service already has a volume attached in this environment. A service can only have one volume."
So the critical point is:
The data loss happened first.
The volume swap attempt was only a recovery attempt after we had already confirmed the production DB was missing data.
Please investigate the original PostgreSQL volume state before our recovery attempts and help us restore or reattach the correct production volume/snapshot.
15 Replies
Status changed to Awaiting Railway Response Railway • 20 days ago
20 days ago
URGENT UPDATE:
It has been several hours and our entire company's operations are completely paralyzed. This is a severe production outage.
We are a paying PRO customer and this data loss/volume conflict issue is causing catastrophic business disruption for us. We have also escalated this via email to team@railway.com.
Could a Level 3 Infrastructure Engineer please look into our physical volume immediately? We desperately need to recover our data from your internal platform snapshot from MAY 24 before it is permanently overwritten or garbage-collected by your system.
Please advise immediately. Every minute counts!
20 days ago
Adding a +1 to this issue.
I am facing the exact same catastrophic data loss today with my Postgres database. Absolutely no deployments or manual changes were made from our side. The database just woke up completely empty (logs show normal checkpoints but with almost 0 data, acting like a brand-new blank volume).
This appears to be a broader platform issue silently affecting persistent volumes. Our production ERP system is completely paralyzed right now. Could you please escalate this urgently and help us restore the connection to our original volumes?
20 days ago
+1, we are also experiencing the same issue with our Postgres database. No deployments or infrastructure changes were made from our side before the data suddenly disappeared. This looks like a serious persistent volume problem on the platform side. Please investigate urgently.
20 days ago
We are seeing the same behavior on our side. The database restarted with an empty state even though no actions were taken from our infrastructure team. This appears to be related to persistent storage/volume handling on the platform. Any urgent update would be greatly appreciated.
20 days ago
EXPLICIT PERMISSION TO RESTORE:
Dear Railway Engineering Team,
We are in the GMT+7 timezone and our team will be offline for the next few hours (night time).
To minimize our severe downtime, if you are able to locate our lost volume or the internal platform snapshot from MAY 24, 2026, you have our EXPLICIT AND FULL PERMISSION to restore/reattach it immediately to our production Postgres service.
You do NOT need to wait for our confirmation to proceed with the restoration. Please just do it and let us know. Thank you!
20 days ago
Additional Note on Snapshot Timing:
For the restoration, the absolute best/most complete snapshot would be around May 24 at 23:00 UTC (which is May 25 at 6:00 AM in our GMT+7 timezone - just 1 hour before the incident).
However, if you cannot find a snapshot at that exact time, ANY recent snapshot from the last 1 to 3 days is perfectly acceptable. Please just restore the most recent valid snapshot you have prior to the incident. Thank you!
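The UTC-to-local conversion quoted above can be double-checked with a few lines. This is only a sketch: it assumes GMT+7 is a fixed offset with no daylight saving, and `utc_to_local` is an illustrative helper name.

```python
from datetime import datetime, timedelta, timezone

# Assumption: GMT+7 treated as a fixed offset (the thread only states the offset).
GMT_PLUS_7 = timezone(timedelta(hours=7))


def utc_to_local(utc_time: datetime, tz: timezone = GMT_PLUS_7) -> datetime:
    """Convert a timezone-aware UTC datetime into the given fixed-offset zone."""
    return utc_time.astimezone(tz)


# Requested snapshot time from the post: May 24, 2026 at 23:00 UTC.
requested_snapshot = datetime(2026, 5, 24, 23, 0, tzinfo=timezone.utc)
local = utc_to_local(requested_snapshot)
print(local.isoformat())  # 2026-05-25T06:00:00+07:00
```

Quoting the snapshot time in UTC, as the post does, avoids ambiguity when support staff work in another timezone.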
20 days ago
Your data is not lost. Your original postgres-volume is still attached to the "Postgres" service and holds ~1.15 GB, which matches your expected production size. The service your application is currently querying is a different Postgres instance created during your recovery attempts, which is why it appears empty (~14 MB with missing tables).
During the backup restore process, new volumes and services were created ("Postgres-Phong-đang-recovery", "test 1", and others). Your app's DATABASE_URL likely now points to one of these newer, near-empty instances instead of the original "Postgres" service. Update your application's database connection variables to point back to the original "Postgres" service and your data should be accessible again.
For the other users reporting similar issues in this thread, please open individual support threads so we can investigate your specific projects and volumes.
Status changed to Awaiting User Response Railway • 20 days ago
20 days ago
URGENT (PRO Plan)
Our entire company's operations are currently paralyzed due to this production outage. Every single minute of downtime is causing severe business disruption for us right now. We urgently need a Railway engineer to look into this volume conflict immediately.
Thank you for your prompt assistance!
Status changed to Awaiting Railway Response Railway • 20 days ago
20 days ago
Thanks for checking this.
However, I manually verified both points already:
- The backend DATABASE_URL is still pointing to the original Postgres service.
- When I directly open/query the original Postgres instance itself, it also appears nearly empty (~14 MB with missing tables).
So this does not seem to be only an application connection issue.
Additionally:
- before any recovery attempt, the production app suddenly failed
- I immediately checked the original Postgres service and the data was already missing at that moment
- the volume swap attempts happened only afterwards while trying to recover
This is why I’m confused by the current state:
- the original postgres-volume still shows ~1.15 GB
- but querying the original Postgres instance itself does not expose the expected production data
Could there possibly be:
- a volume mount mismatch
- stale attachment metadata
- partial volume rollback
- or PostgreSQL starting against a different/empty PGDATA path while the original volume still exists?
At the moment, the behavior seems inconsistent between:
- the reported volume size (~1.15 GB)
- and the actual visible database contents (~14 MB / missing tables)
Could you please inspect:
- which exact volume is currently mounted to the original Postgres service
- which PGDATA path PostgreSQL is actually reading from
- and whether the original 1.15 GB volume contents are still accessible internally?
Thank you.
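Part of what is asked above can also be checked from the database side with a few read-only queries. Below is a minimal sketch, assuming any Python DB-API 2.0 connection to the Postgres instance; `DIAGNOSTIC_QUERIES` and `collect_diagnostics` are illustrative names, not part of Railway or PostgreSQL tooling. It reports the data directory Postgres is reading from, the size of the current database, and the number of user tables. It cannot show which volume the platform has mounted; only Railway can confirm that.

```python
from typing import Any, Dict

# Read-only checks; each statement is standard PostgreSQL.
DIAGNOSTIC_QUERIES = {
    # Directory the running server actually reads its data from.
    "data_directory": "SHOW data_directory",
    # On-disk size of the database this connection is using, in bytes.
    "database_size_bytes": "SELECT pg_database_size(current_database())",
    # Number of ordinary tables outside the system schemas (visible to this user).
    "user_table_count": (
        "SELECT count(*) FROM information_schema.tables "
        "WHERE table_type = 'BASE TABLE' "
        "AND table_schema NOT IN ('pg_catalog', 'information_schema')"
    ),
}


def collect_diagnostics(conn: Any) -> Dict[str, Any]:
    """Run each query on a DB-API connection; keep the first column of the first row."""
    results: Dict[str, Any] = {}
    cur = conn.cursor()
    try:
        for name, sql in DIAGNOSTIC_QUERIES.items():
            cur.execute(sql)
            results[name] = cur.fetchone()[0]
    finally:
        cur.close()
    return results
```

With a driver such as psycopg2 this could be called as `collect_diagnostics(psycopg2.connect(dsn))`, where `dsn` is the connection string of the service being inspected. A `database_size_bytes` near 14 MB alongside a volume reported at ~1.15 GB would reproduce the mismatch described above.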
20 days ago
Hi Sam,
I need to clarify that the assumption about us querying the wrong PostgreSQL service is incorrect.
We already verified that:
- our backend DATABASE_URL points to the original Postgres service,
- and when directly opening the original PostgreSQL instance itself, the database still appears nearly empty in the Railway UI (~14 MB with missing tables).
So this is not an application-side misconfiguration issue.
At the same time, Railway still reports the original volume size as ~1.15 GB, which strongly suggests there may be an underlying volume mounting/platform issue.
At this stage, we urgently need recovery first, investigation second.
Could you please escalate this to the infrastructure/platform team and perform a snapshot/volume restore from before the incident (ideally May 24 before the corruption occurred)?
We are currently losing critical business time and need the production database restored as soon as possible.
20 days ago
Hi Sam!!!
CRITICAL CEO UPDATE ON THE 1.15 GB SIZE:
Following up on my IT Lead's (Phong) response: I need to explicitly point out a massive red flag regarding the 1.15 GB volume size you mentioned.
Please look at our backup history: 1.15 GB was the size of our database TWO MONTHS AGO (March 30, 2026).
For the past 2 months (April and May), our company has inputted a massive amount of production data. The actual production volume on MAY 24 MUST be significantly larger than 1.15 GB.
If the postgres-volume you are looking at right now holds exactly 1.15 GB, it means you are looking at stale data from March 30, NOT our recent data!
We do NOT want the 1.15 GB volume. We desperately need you to find the actual AUTOMATED SNAPSHOT from MAY 24 (or May 23) which should be much larger, and restore that. Please do not restore the stale March data!
20 days ago
Visual Proof:
Hi Sam, please look at the attached screenshot from our Railway dashboard.
As you can see, 1.15 GB is EXACTLY the size of our manual backup from March 30. This visually confirms my point: the 1.15 GB volume you are currently looking at is the stale data from 2 months ago, NOT our recent production data!
Please escalate to Tier 3 to find our automated snapshot from May 24.
19 days ago
Hi Phong, we investigated the platform side carefully before responding.
When the new container started, Postgres read the existing data on postgres-volume and logged that the database system was last known up at 2026-03-31 01:08:43 UTC, with only ~20 KB of WAL replay. If this database had been actively receiving writes for the past 2 months, we would see a recent checkpoint timestamp and a much larger replay. We don't. No writes have been hitting this Postgres service since March 31.
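The gap described above can be measured from the startup log. This is a rough sketch, not the method Railway used: the sample line is reconstructed from the timestamp quoted in this reply rather than copied from the real log, the incident time is the one given earlier in the thread, and it assumes the server logs timestamps in UTC. `last_known_up` and `idle_days` are illustrative names.

```python
import re
from datetime import datetime, timezone

# PostgreSQL reports the previous run at startup with one of these two messages.
LAST_UP = re.compile(
    r"database system was (?:shut down at|interrupted; last known up at) "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC"
)


def last_known_up(log_text: str):
    """Return the previous-run timestamp as an aware UTC datetime, or None."""
    match = LAST_UP.search(log_text)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return stamp.replace(tzinfo=timezone.utc)


def idle_days(log_text: str, incident: datetime):
    """Whole days between the last-known-up time and the incident, or None."""
    up = last_known_up(log_text)
    return None if up is None else (incident - up).days


# Hypothetical log line built from the timestamp quoted in this reply.
SAMPLE_LINE = "LOG:  database system was interrupted; last known up at 2026-03-31 01:08:43 UTC"
# Incident reference time taken from earlier in the thread (May 24, 23:00 UTC).
INCIDENT = datetime(2026, 5, 24, 23, 0, tzinfo=timezone.utc)

print(idle_days(SAMPLE_LINE, INCIDENT))  # 54
```

A gap of about 54 days matches the claim that this data directory had not been running since the end of March.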
The postgres-volume has stayed continuously attached at the same mount path since March 19, no remounts. No backup/restore or snapshot operation touched it.
Your production writes for the past 2 months were likely landing on a different database. Worth checking what host your DATABASE_URL actually resolves to inside the running container, and looking at the other postgres volumes in this project (the older services still have volumes attached).
Apologies, but this looks like an issue with the application-level code. Due to volume, we can only answer platform-level issues.
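One way to see which host a DATABASE_URL points at is to parse it from inside the running container. The sketch below uses only the standard library; `describe_database_url` and `points_at` are illustrative names, and the URL and host in the usage line are made-up placeholders, not values from this project.

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Split a postgres:// style URL into its parts, leaving the password out."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname,
        "port": parts.port or 5432,  # 5432 is the PostgreSQL default port
        "database": parts.path.lstrip("/") or None,
        "user": parts.username,
    }


def points_at(url: str, expected_host: str) -> bool:
    """True if the URL's host matches the expected host (case-insensitive)."""
    host = urlsplit(url).hostname
    return host is not None and host == expected_host.lower()


# Placeholder URL for illustration only.
example = "postgresql://app:secret@db.example.internal:5432/railway"
print(describe_database_url(example)["host"])  # db.example.internal
```

Inside the container the same helper could be run on `os.environ["DATABASE_URL"]`, and the resulting host compared with the host of each Postgres service in the project.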
Status changed to Awaiting User Response Railway • 19 days ago
Status changed to Open mykal • 19 days ago
19 days ago
Hi Mykal,
I already verified the data directly from the Railway Postgres service itself via the Railway UI / psql connection — not through our backend application. So this is not an application-level DATABASE_URL or code issue.
At the moment, the currently attached postgres-volume only shows ~14 MB and missing tables directly inside Postgres itself.
What I’m trying to do now is re-attach the older volume:
postgres-2026-03-30 18:33 UTC
because before the incident, the original Postgres service was attached to that volume, not postgres-volume.
After the data suddenly disappeared, I manually attached postgres-volume to the Postgres service to test recovery. Now when I try to switch back to postgres-2026-03-30 18:33 UTC, Railway throws this error:
“Service already has a volume attached in this environment. A service can only have one volume.”
So my main question now is:
How can I safely detach the current volume and re-attach postgres-2026-03-30 18:33 UTC to the original Postgres service without hitting this volume swap error?