Repeated incidents and host-level performance variability: is anyone else experiencing this?
injung
PRO · OP

9 days ago

Hi all,

I wanted to share our recent experience with Railway and see if others are facing similar issues.

We chose Railway despite its relatively high pricing (especially memory, which is significantly more expensive than alternatives like AWS Fargate), mainly because:

  • the developer experience is excellent

  • we wanted to support a startup platform

  • and it helped us simplify early infrastructure setup (e.g. avoiding VPC complexity/costs)

However, over the past few months, our experience has been quite concerning.

Repeated incidents

We've experienced multiple major incidents in a short period:

  • Feb 12

  • Mar 25

  • Mar 31

  • May 4

That's at least four production-impacting incidents in under three months (Feb 12 to May 4).

Ongoing performance instability

Beyond those incidents, we're also seeing repeated patterns like:

  • latency gradually increasing without any deploy or traffic change

  • database queries (even simple indexed lookups) becoming significantly slower

  • API latency increasing accordingly

  • performance eventually recovering on its own

From our perspective, this strongly suggests host-level variability (e.g. noisy neighbors / underlying infra issues) rather than application-level problems.
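
For anyone who wants to check whether they're seeing the same pattern, here is a rough sketch of one way to measure it. This is not our production tooling. The idea is to time a fixed, trivial operation on a schedule (for example a constant query or a single indexed lookup) and compare recent medians against an earlier baseline. The function names, window sizes, and the 2x threshold are all assumptions made up for illustration, and `probe` stands in for whatever call you choose to time.

```python
import statistics
import time
from typing import Callable, List


def time_probe(probe: Callable[[], object], runs: int = 20) -> float:
    """Return the median wall-clock duration (seconds) of a fixed probe call.

    `probe` is a hypothetical placeholder: in practice it would run the same
    trivial query every time, so the work itself never changes.
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        probe()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


def drift_ratio(medians: List[float], baseline_n: int = 5, recent_n: int = 5) -> float:
    """Ratio of the recent median latency to the baseline median latency."""
    if len(medians) < baseline_n + recent_n:
        raise ValueError("not enough samples to compare baseline and recent windows")
    baseline = statistics.median(medians[:baseline_n])
    recent = statistics.median(medians[-recent_n:])
    return recent / baseline


def looks_like_drift(medians: List[float], factor: float = 2.0,
                     baseline_n: int = 5, recent_n: int = 5) -> bool:
    """True when recent latency is at least `factor` times the baseline.

    The 2.0 default is an arbitrary illustrative threshold, not a recommendation.
    """
    return drift_ratio(medians, baseline_n, recent_n) >= factor


if __name__ == "__main__":
    # Synthetic medians (seconds), one per probe interval: stable, then degraded.
    history = [0.004] * 5 + [0.012] * 5
    print(looks_like_drift(history))
```

If the ratio climbs while the probe, the deploy, and the traffic are all unchanged, that is consistent with something below the application slowing down. It doesn't prove a noisy neighbor on its own, but it gives you timestamps and numbers to attach to a support request.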

Lack of clarity

We've reported these issues multiple times and waited for responses, but:

  • we haven't received clear root cause explanations

  • it's unclear what guarantees exist around host isolation

  • and it's hard to understand what's being done to prevent recurrence

This makes it difficult to reason about reliability.

Pricing vs trade-offs

We were told that a dedicated VM option is available at ~10x the cost.

At that point, it raises a serious question:

If we need to pay 10x for predictable performance, why not just move to AWS (e.g. Flightcontrol + ECS/Fargate)?

Even accounting for AWS complexity, the cost difference doesn’t seem justified given the current instability.

Where we're at

To be honest, this is starting to impact our trust in Railway as a production platform.

We were early supporters. We recommended Railway to others and even helped migrate some workloads onto it. But given the current experience, it's becoming difficult to confidently continue doing so.

Would really appreciate hearing others' experiences, and also more transparency from the Railway team on:

  • what's causing these issues

  • and how reliability is expected to improve going forward

Thanks 🙏

0 Replies
