20 days ago
Project: fortunate-joy / Environment: production / Plan: Hobby (paying)
My production marketplace (Glowp) has been completely DOWN since the
May 19 outage (started 22:29 UTC). 8+ hours later it is still offline.
Railway's OWN deployment Diagnosis confirms this needs your team:
"The Postgres service is stuck and needs Railway support to resolve.
The CREATE_CONTAINER step has been pending across multiple consecutive
deployments, meaning the container never starts. A volume migration to
the europe-west4-drams3a region appears to be involved in the
scheduling failure."
Current state:
- Postgres: deployment fails repeatedly, CREATE_CONTAINER stuck,
  container never starts.
- Volume 3a73e3f6-2ba4-4a8a-a0ae-5325ddf5f3d5 (mounted at
  /var/lib/postgresql/data) holds ALL my production data. It MUST stay
  attached: do NOT migrate, wipe, detach or re-initialize it.
- Glowp service: crash-looping (P1001 cannot reach
  postgres.railway.internal) because Postgres is down.
- Both services are correctly configured for europe-west4-drams3a. US East
  deployments were cancelled.
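For anyone checking the same P1001 symptom, here is a minimal sketch of a TCP reachability check. It is not part of the original report. The hostname comes from the error above; port 5432 is the Postgres default and is an assumption here, and `can_reach` is a hypothetical helper name. The private hostname should only resolve from inside the project's private network, so the check is meant to run from the app service itself.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Host taken from the P1001 error; port 5432 is an assumed default.
    host, port = "postgres.railway.internal", 5432
    print(f"{host}:{port} reachable: {can_reach(host, port)}")
```

A False result only says the database port is not accepting connections. It does not distinguish a stuck container from a DNS or network problem.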
What I need:
- Manual intervention to unstick the Postgres CREATE_CONTAINER
  scheduling failure in europe-west4-drams3a, with the existing
  volume attached, so my database comes back online with all data
  intact.
- Full credit refund for the Railway Agent usage I was forced to
  consume during this incident. I only used the Agent because your
  outage broke the normal recovery path. Over $3 of my $5 credit
  consumed.
- A service credit for the production downtime, per standard practice
  after a major incident.
This is a live marketplace with real sellers and buyers locked out
right now. Please prioritize.
Thank you.
4 Replies
Status changed to Awaiting Railway Response Railway • 20 days ago
20 days ago
Thanks for reaching out. We sincerely apologize for the service disruption.
We're seeing recovery in our API, builds, and deployments. If your service is experiencing an issue, please try redeploying it. We'll publish a public postmortem once we're fully recovered.
For all customers, we’ll publish a detailed postmortem outlining what happened and the steps we’re taking to prevent similar incidents in the future. For Enterprise customers, service credits are covered under our SLA and will be reviewed as part of our post-incident process.
Status changed to Awaiting User Response Railway • 20 days ago
20 days ago
Looks like they aren't even giving credits for this extended downtime. If you are using Railway for business, I would suggest you look elsewhere.
Status changed to Awaiting Railway Response Railway • 20 days ago
Status changed to Awaiting User Response brody • 20 days ago
20 days ago
Thanks for confirming recovery — my app is back online.
I understand SLA service credits are reserved for Enterprise plans.
However, I'm not raising an SLA claim. I'm raising a billing-fairness
issue:
During the incident, your own dashboard's recovery path (Restart) was
broken, and I was pushed to use the billable Railway Agent to recover
my production database. It consumed over $3 of my $5 monthly credit.
The Agent also nearly triggered a destructive cross-region migration
that would have wiped my production volume — I had to manually catch
and reverse it.
I would never have spent that credit if your infrastructure had been
working. Charging me for it is not fair.
I'm formally requesting a refund of the Agent usage consumed during
this incident (May 19-20 outage). This is independent of any SLA
discussion.
Separately — while I understand downtime credits are Enterprise-only,
I'd genuinely appreciate any goodwill gesture for a 10+ hour
production outage on a paid account. I run a live marketplace and
this was a hard hit.
Please confirm the Agent refund. Thank you.
Status changed to Awaiting Railway Response Railway • 20 days ago
20 days ago
You can request a refund by following the docs at https://docs.railway.com/reference/pricing/refunds#requesting-a-refund
Status changed to Awaiting User Response Railway • 20 days ago
13 days ago
This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!
Status changed to Solved Railway • 13 days ago