What we know so far: May 19th 2026

Full post-mortem still in progress.

Hello all,

As part of our commitment to being as transparent as possible about the recent outage (https://status.railway.com/), and about any incident on the platform, we are summarizing what we know so far, along with answers to some common questions we have fielded. We plan to update this thread with our official post-mortem once we have the details.

As it stands right now, we are working on getting everyone recovered. If your service has not recovered yet, you can now redeploy and we will route your code to a healthy machine. Our whole support team is working on restoring workloads on Google Cloud hosts. Keep in mind that we are being rate-limited by GH while our build pipeline returns to full health.

What we know:

Around 22:20 UTC, our Google Cloud account was placed into a "restricted" status, which removed all of our cloud overflow VMs, our CloudSQL instance, and our API. Losing the API removed a central dependency, which first affected all GCP-hosted workloads and then, once our network route cache expired, all workloads hosted on the Railway platform.

We don't have full knowledge as to why our account was suspended automatically. We got in contact with the GCP engineering team at the start of the incident, and we remain in contact with them as we root-cause the issue.

FAQ:

Q: "Doesn't Railway run its own hardware?"

A: Yes, Railway runs hardware at 8 sites across 4 locations around the world. At the start of 2026, due to demand on our systems, we burst back onto the cloud on AWS and GCP.

A subset of non-latency-sensitive customers and Enterprise customers are using a public cloud for their hosts. However, when we migrated fully onto Metal in Mar. 2025, we kept our API and our DB on GCP, as we felt that leaving that workload there was well within our risk model. (Candidly: we didn't expect to get our cloud account to get removed via automated enforcement.)

Q: "Why does your API being down mean that my workload went down?"

A: Railway's API talks to our distributed edge, which is a suite of proxies that we run all around the world. Each edge location maintains a routing table of DNS; however, the network team was still in the middle of fully distributing the routing table so that each region would be fully independent. Once the routing cache at the edge expired, all workloads were affected, not just GCP workloads. This is due to be rectified.
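
To make the failure mode in the answer above concrete, here is a minimal Python sketch of a time-limited route cache at a single edge location. This is not Railway's implementation: the class name, the `fetch_route` callback, the `serve_stale` flag, and every value below are hypothetical and only illustrate how an expired cache plus an unreachable central API can make otherwise healthy workloads unroutable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CachedRoute:
    target: str        # where the edge proxy should send traffic
    fetched_at: float  # time (in seconds) of the last successful fetch


class EdgeRouteCache:
    """Hypothetical per-edge-location route cache backed by a central API."""

    def __init__(self, fetch_route: Callable[[str], str], ttl_seconds: float,
                 serve_stale: bool = False) -> None:
        self._fetch_route = fetch_route  # assumed call into the central API
        self._ttl = ttl_seconds
        self._serve_stale = serve_stale
        self._entries: Dict[str, CachedRoute] = {}

    def resolve(self, hostname: str, now: float) -> Optional[str]:
        entry = self._entries.get(hostname)
        # Fresh entry: answer locally without touching the central API.
        if entry is not None and now - entry.fetched_at < self._ttl:
            return entry.target
        try:
            target = self._fetch_route(hostname)
        except ConnectionError:
            # Central API unreachable. With serve_stale the edge keeps using
            # the last known route; without it the lookup fails outright.
            if self._serve_stale and entry is not None:
                return entry.target
            return None
        self._entries[hostname] = CachedRoute(target, now)
        return target


if __name__ == "__main__":
    api = {"up": True}

    def fetch_route(hostname: str) -> str:
        if not api["up"]:
            raise ConnectionError("central API unreachable")
        return "metal-host-1"  # made-up target name

    cache = EdgeRouteCache(fetch_route, ttl_seconds=60)
    print(cache.resolve("app.example", now=0))   # fetched from the API
    api["up"] = False
    print(cache.resolve("app.example", now=30))  # still served from cache
    print(cache.resolve("app.example", now=61))  # cache expired, API down
```

In this sketch the third lookup returns `None` even though the workload's host never went away, which mirrors the platform-wide impact described above. Passing `serve_stale=True` is one possible way to ride out a control-plane outage, at the cost of possibly routing to outdated targets; the post does not say whether the planned fix looks like this.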

Q: "My workload is still down. What gives?"

A: We're seeing recovery in our API, builds, and deployments. If your service is having an issue, please try to give it a redeploy. We'll publish a public postmortem covering what happened when we're fully recovered.

64 Replies

3 hours ago

Thank you guys for keeping us in the loop

Wishing you guys the best with the final recovery


asadullahjan
FREE

3 hours ago

Is my mongodb getting crashed related to this or not?

"MongoDB cannot start: Linux kernel versions 6.19 and newer has a known incompatibility with this version of MongoDB. See https://jira.mongodb.org/browse/SERVER-121912 for more information."

It started happening now.


3 hours ago

For any technical issues, please open your own thread explaining any remaining issues.


optimusopus
FREE

3 hours ago

"Candidly: we didn't expect to get our cloud account to get removed via automated enforcement." very politely put haha


qubestream
HOBBY

3 hours ago

These things happen. Thanks for keeping us in the loop and working hard to get us all back up and running.


yeferson59
HOBBY

3 hours ago

My Postgres database crashed; I think it's a problem with the GCP network.


yeferson59

My Postgres database crashed; I think it's a problem with the GCP network.

angustemple
PRO

3 hours ago

Crashed or timed out? Mine is the same. I suspect an issue with the server talking to the DB over the Railway VPC.


radicitus
HOBBY

3 hours ago

Thanks for the updates! Stuff like this is always interesting to follow haha, thought I broke something


linkovichchomofski
PRO

3 hours ago

I sincerely hope Google pays for this. This was a resume generating event for sure


edoswald
HOBBY

3 hours ago

My question is, this isn't an isolated issue with service instability. I am seeing more and more issues daily; most don't affect us, but they're becoming more numerous. The failure point seems to have stemmed from a sudden need for more capacity, and if Railway hadn't been nearing capacity, this wouldn't have happened, since GCP wouldn't have been needed.


ridaouledhaddou
HOBBY

3 hours ago

Thank you for keeping us informed.


ricks-yfwi
PRO

3 hours ago

redeploying kicked us back online... I wish they told us to try that earlier. Thanks for the mid-post mortem still!


ricks-yfwi

redeploying kicked us back online... I wish they told us to try that earlier. Thanks for the mid-post mortem still!

It's very likely that you wouldn't have been able to, because after we got services online, we had to re-bootstrap the build fleet. We are back. Thanks again.


banggsatga
PRO

2 hours ago

MongoDB cannot start: Linux kernel versions 6.19 and newer has a known incompatibility with this version of MongoDB. See https://jira.mongodb.org/browse/SERVER-121912 for more information.


Anonymous
PRO

2 hours ago

Whats the point of having multiple 'Replicas' (that I am paying for) when they are all going to be hosted by GCP? How do you not have a redundancy system for this? Like using AWS or something? This downtime has cost me thousands of dollars as it happened during the morning in my timezone.


edoswald

My question is, this isn't an isolated issue with service instability. I am seeing more and more issues daily; most don't affect us, but they're becoming more numerous. The failure point seems to have stemmed from a sudden need for more capacity, and if Railway hadn't been nearing capacity, this wouldn't have happened, since GCP wouldn't have been needed.

angustemple
PRO

2 hours ago

Agreed. I really enjoy Railway as a platform but am starting to consider other options given recent stability issues.


edoswald

My question is, this isn't an isolated issue with service instability. I am seeing more and more issues daily; most don't affect us, but they're becoming more numerous. The failure point seems to have stemmed from a sudden need for more capacity, and if Railway hadn't been nearing capacity, this wouldn't have happened, since GCP wouldn't have been needed.

This is a fair read. That said, I will say that the uptime crunchiness from Feb. - Mar. was GH and then capacity. You don't have to take my word for it, but each outage was a unique way to stress-test our systems, which, knock on wood, we've been able to manage so far.

Outages as of late, until this one, were usually tied to host failures, which we are working to mitigate with "live migrations" (an in-progress feature that will come with VMs). That said, this one was egregiously bad because it was a single and expected point of failure, like a cloud account getting removed. We own our uptime and it affected everyone, so everyone has a right to be mad because we did impact businesses for 6 or so hours.

The good I will take away from this is that we have acted on your feedback on comms and we were on the ball with information delivery; now we just need to land the rest of the reliability work to make the platform anti-fragile. For those who feel the need to migrate, it's been an honor to serve your business.


white
PRO

2 hours ago

Recently, I’ve seen quite a few tutorials on YouTube explaining how to use the Railway service to set up a “free” VPN. These users are taking advantage of the platform’s promotional offers for their own convenience, but they’re disrupting the system’s balance. I wonder if Google has detected abnormal network requests due to the sheer number of these “free” users and made an erroneous judgment as a result?


white

Recently, I’ve seen quite a few tutorials on YouTube explaining how to use the Railway service to set up a “free” VPN. These users are taking advantage of the platform’s promotional offers for their own convenience, but they’re disrupting the system’s balance. I wonder if Google has detected abnormal network requests due to the sheer number of these “free” users and made an erroneous judgment as a result?

I would like to hope that it is, but we have no word yet on why GCP got us in the automated "cull" - to say we are livid is an understatement.


sonicviz
HOBBY

2 hours ago

That was a double dose of "welcome to API risk". I hope we get a full accounting of why Google killed access. My deployment crashed as a result, but I need to manually restart. Need a mechanism to auto-restart and continually try to restart until success, in case it was nighttime for me.

Fortunately, I'm still in alpha test deploy mode (Hence Hobby plan currently) so not live yet, but this is a bit of a rude awakening to me. I have to reassess my deployment options now.


banggsatga

MongoDB cannot start: Linux kernel versions 6.19 and newer has a known incompatibility with this version of MongoDB. See https://jira.mongodb.org/browse/SERVER-121912 for more information.

This is related to your DB version not being pinned on redeploy; you need to pin it to the correct version of Mongo.


Whats the point of having multiple 'Replicas' (that I am paying for) when they are all going to be hosted by GCP? How do you not have a redundancy system for this? Like using AWS or something? This downtime has cost me thousands of dollars as it happened during the morning in my timezone.

Anonymous
PRO

2 hours ago

Can you respond to this? We may as well just go direct with Google at this rate.


Can you respond to this? We may as well just go direct with Google at this rate.

Not to be dismissive, but Q: "Why does your API being down mean that my workload went down?" should be the answer that you are looking for. As for your hosts, the records show that you are indeed on AWS/GCP/Metal, but the networking being tied to the API is what got you and others. A mitigation for this shortcoming is in progress.


santidevi
PRO

2 hours ago

You should add a feature that allows customers to deploy in replica mode across AWS, Google Cloud, and Azure. I’m willing to pay extra for this. For example, if Google Cloud goes down today, my instances running on AWS and Azure would still remain operational.


santidevi

You should add a feature that allows customers to deploy in replica mode across AWS, Google Cloud, and Azure. I’m willing to pay extra for this. For example, if Google Cloud goes down today, my instances running on AWS and Azure would still remain operational.

white
PRO

2 hours ago

That's a good suggestion 👍


santidevi

You should add a feature that allows customers to deploy in replica mode across AWS, Google Cloud, and Azure. I’m willing to pay extra for this. For example, if Google Cloud goes down today, my instances running on AWS and Azure would still remain operational.

Anonymous
PRO

2 hours ago

That should be the default! All our replicas being on the same host is insane. This downtime cost me thousands. Is Railway not made for serious work? I get outages happen, but no redundancy is mind blowing.


paddyjakes
PRO

2 hours ago

our Postgres service has been down since the GCP outage and is now stuck in a crash loop. The volume mounts successfully but the container immediately fails with catatonit: failed to exec pid1: No such file or directory ...the Postgres binary appears missing from the container layer. We've tried restarting multiple times, same result every time. anyone else having this issue


Whats the point of having multiple 'Replicas' (that I am paying for) when they are all going to be hosted by GCP? How do you not have a redundancy system for this? Like using AWS or something? This downtime has cost me thousands of dollars as it happened during the morning in my timezone.

Anonymous
PRO

2 hours ago

I feel like AI has made us all dumber. If you host an app with a load balancer on AWS, do you assume replicas would be hosted on external services like GCP? No, that's not the point of replicas. Replicas reduce strain on a singular system. What people are saying about "just host it on multiple services (AWS, GCP, etc)" is like building the whole service pipeline again. What's the point if you're building on a system intended to have quadruple 9's uptime?

Ultimately every system has bottlenecks regardless of its size. Anger should not be pointed at the Railway team; these are circumstances even the best devops team cannot plan for.


inoovated
HOBBY

2 hours ago

I'm still a beginner and don't understand much. I'm already thinking about plan Bs. It will increase my costs, but I need a plan B so that my operation keeps running when this happens.


yeferson59

My Postgres database crashed; I think it's a problem with the GCP network.

mika9339
HOBBY

2 hours ago

Redeploying Postgres fixed the issue for me


progrennis
PRO

2 hours ago

Hey, my service keeps crashing immediately after restarting. Is there anything I can do? My complete app is down now


progrennis

Hey, my service keeps crashing immediately after restarting. Is there anything I can do? My complete app is down now

frostykdev
PRO

2 hours ago

The same: ERROR (catatonit:2): failed to exec pid1: No such file or directory


Anonymous
PRO

2 hours ago

Just started scaling up on Railway and this happened. Hilarious. I will be waiting to know what measures Railway will take to prevent these sorts of outages in the future. Also, why should I redeploy the services myself again? It is a PaaS after all, or have I missed anything?


newair2222-hash
PRO

2 hours ago

My PostgreSQL has the same problem. How can I recover the data and redeploy?


asadullahjan

Is my mongodb getting crashed related to this or not? "MongoDB cannot start: Linux kernel versions 6.19 and newer has a known incompatibility with this version of MongoDB. See https://jira.mongodb.org/browse/SERVER-121912 for more information." It started happening now.

mfaqihridho
HOBBY

2 hours ago

Yes, I have the same issue. I think it's because they still have not fully recovered.


mika9339

Redeploying Postgres fixed the issue for me

newair2222-hash
PRO

2 hours ago

How did you redeploy the PG DB? Did you recover the full data?


ivanchflores-star
HOBBY

2 hours ago

I redeployed my services and it worked. Thanks for letting us know about every update on this.


marcpope
FREE

2 hours ago

people that are complaining, why did you rely on a single provider yourself? you can plan for redundancy at every level and something will still take you down.


Just started scaling up on Railway and this happened. Hilarious. I will be waiting to know what measures Railway will take to prevent these sorts of outages in the future. Also, why should I redeploy the services myself again? It is a PaaS after all, or have I missed anything?

sonicviz
HOBBY

2 hours ago

Same.


frostykdev
PRO

2 hours ago

Our production Postgres service crashed with 'ERROR (catatonit:2): failed to exec pid1: No such file or directory' and won't restart—the volume appears corrupted. We need help recovering the data without losing it. Project ID: 5beb36b2-c52f-43a5-be5b-29c953a7d463


frostykdev

Our production Postgres service crashed with 'ERROR (catatonit:2): failed to exec pid1: No such file or directory' and won't restart—the volume appears corrupted. We need help recovering the data without losing it. Project ID: 5beb36b2-c52f-43a5-be5b-29c953a7d463

If you can open up a new support thread so we can get you going, would love to.


frostykdev
PRO

2 hours ago

Redeploying again seems to have fixed it


tutoviewplus-byte
HOBBY

2 hours ago

Ah! The problem was resolved after making the reconfiguration.


brody

For any technical issues, please open your own thread explaining any remaining issues.

ttannouss7-beep
HOBBY

2 hours ago

My database IS DOWN (crashed for like 8 to 9 hours). Postgres is down and my clients can't log in or do anything. When should we expect this to be live?


kolade-amire
HOBBY

2 hours ago

psql: error: connection to server at "monorail.proxy.rlwy.net" , port 30194 failed: server closed the connection unexpectedly

    This probably means the server terminated abnormally

    before or while processing the request.

Is the outage responsible for this too?


brody

For any technical issues, please open your own thread explaining any remaining issues.

swingmicro
PRO

an hour ago

I'm still getting a 502 Bad Gateway error. Please help, my app is not accessible through my Cloudflare domain.


Yup, for those with Postgres issues, I would redeploy and check that you have matching databases.


doryza
PRO

an hour ago

I get the feeling Railway team will grow wayyyy more robust from this, weary geeks on the verge. While I thought I broke the whole thing when adding an env var; I'm glad to have taken some time off the screen. Live and learn @railway; a crash like this is worthy of a bountiful bounce back!


angelo-railway

Yup, for those with Postgres issues, I would redeploy and check that you have matching databases.

swingmicro
PRO

an hour ago

Do I have to do anything from my side??


doryza

I get the feeling Railway team will grow wayyyy more robust from this, weary geeks on the verge. While I thought I broke the whole thing when adding an env var; I'm glad to have taken some time off the screen. Live and learn @railway; a crash like this is worthy of a bountiful bounce back!

Appreciate it. We still disappointed (understatement) a lot of our customers; we'll keep on working for you all and improving the system.


ealiyoruk-cell
FREE

an hour ago

Hello,

Could you please provide an estimated timeline for when this issue is expected to be resolved? Having a tentative timeframe would greatly help us manage our expectations and plan accordingly.


brody

For any technical issues, please open your own thread explaining any remaining issues.

progrennis
PRO

an hour ago

My app is still down. On your status page you say: Monitoring

Railway services have fully recovered. Some workloads may still need a redeploy, we're automatically redeploying any we detect as unhealthy. If your service isn't responding correctly, please trigger a redeploy from the dashboard or CLI.

We're sorry for the disruption. A detailed postmortem will follow once we've confirmed stability.

Yet after restarting my app it immediately crashes without further info


progrennis

My app is still down. On your status page you say: Monitoring Railway services have fully recovered. Some workloads may still need a redeploy, we're automatically redeploying any we detect as unhealthy. If your service isn't responding correctly, please trigger a redeploy from the dashboard or CLI. We're sorry for the disruption. A detailed postmortem will follow once we've confirmed stability. Yet after restarting my app it immediately crashes without further info

Note that "monitoring" doesn't mean "resolved" for this exact reason; we are working case by case to get you and others in a good spot.


Anonymous
HOBBY

an hour ago

Good job with comms. Just to confirm, will redeploying my DB instance wipe my data? Our DB is still down and I want to see what works.


jithinzac
FREE

an hour ago

In our case, a redeployment fixed the Postgres issue without any data loss.


santidevi

You should add a feature that allows customers to deploy in replica mode across AWS, Google Cloud, and Azure. I’m willing to pay extra for this. For example, if Google Cloud goes down today, my instances running on AWS and Azure would still remain operational.

this would be a cool feature


jithinzac

In our case, a redeployment fixed the Postgres issue without any data loss.

oscargm10176
PRO

an hour ago

It worked, No data loss after the redeploy.


bogk9
HOBBY

an hour ago

I understand outages happen, but what I don’t understand is why my deployment wasn’t automatically restarted afterward. The incident happened overnight in Europe, so I woke up to 8+ hours of downtime reports from clients.

If instances had been automatically redeployed after the outage, the impact could likely have been reduced significantly, probably closer to 4–6 hours instead of 8+.


bogk9

I understand outages happen, but what I don’t understand is why my deployment wasn’t automatically restarted afterward. The incident happened overnight in Europe, so I woke up to 8+ hours of downtime reports from clients. If instances had been automatically redeployed after the outage, the impact could likely have been reduced significantly, probably closer to 4–6 hours instead of 8+.

We are indeed rolling through redeployments; however, it's a queued system to avoid back pressure. Absolutely heard on the feedback.


gabbygall
FREE

24 minutes ago

Good morning, I am still without a DB and cannot restart - need help!


tempo22
PRO

13 minutes ago

Same for me, multiple services cannot be restarted


macciep
HOBBY

11 minutes ago

Good morning, I also have a problem with my servers - please help!


asadullahjan

Is my mongodb getting crashed related to this or not? "MongoDB cannot start: Linux kernel versions 6.19 and newer has a known incompatibility with this version of MongoDB. See https://jira.mongodb.org/browse/SERVER-121912 for more information." It started happening now.

mohannedm
PRO

7 minutes ago

Yoo, my mongodb image keeps on crashing 😿


mohannedm
PRO

4 minutes ago

This is frustrating, anyway, I ate a burger from yesterday for breakfast 😋

