12 days ago
Hi everyone,
I’m experiencing a serious performance issue with one of my production services. Requests are taking unusually long to process, and overall response times are extremely slow.
This service is a backend running on Node.js (v22.22.2). I haven’t made any changes in the past month, and the issue started suddenly yesterday.
So far, I’ve tried:
Restarting and redeploying the service
Scaling by duplicating replicas across two regions
Unfortunately, none of these actions improved the situation.
The database appears to be functioning normally: I can connect remotely, and queries are responding within expected timeframes.
I also checked the Railway status page, and everything seems operational (all green), so I’m unsure if there’s an ongoing incident affecting performance.
This is a user-facing production system, also consumed by mobile apps, so the impact is significant.
Is anyone else experiencing similar issues, or is there any known problem that could explain this behavior?
Any guidance would be greatly appreciated.
3 Replies
Status changed to Open Railway • 12 days ago
12 days ago
Here’s a screenshot covering the last 7 days, you can clearly see the degradation starting yesterday.
For context, my last release was about a month ago, and there haven’t been any deployments or changes since then.
Attachments
10 days ago
Hey, I've run into this in my own Node project, and it took a while to figure out, but I'm sure I can help. Your issue is likely one of the problems below, as it was in my case. Let me know how else I can help.
1. Node.js 22 "Adaptive Semi-Space" Issue
One of the most documented performance regressions in Node.js 22 involves a change in how V8 manages the Young Generation (semi-space) memory.
The Problem: V8 now attempts to size the memory used for short-lived objects "adaptively". In certain workloads, especially those doing heavy JSON parsing or object transformation, the semi-space can shrink too far (sometimes to 1MB).
The Symptom: This forces objects into the "Old Space" prematurely, triggering massive, "stop-the-world" Major GC pauses. These pauses look exactly like your chart: flat median latency (p50) but extreme spikes in p99.
Why it started yesterday: It often triggers once you hit a specific traffic threshold or a slight change in payload size that breaks the adaptive heuristic.
Fix: Explicitly set the semi-space size in your Railway start command:
node --max-semi-space-size=64 index.js (or 128, depending on your RAM).
2. Railway Edge Network Latency
While the status page is green, historical incidents and recent user reports (as recently as yesterday, May 1) suggest localized issues in specific regions like us-east4 or the Edge Network.
In late March, Railway experienced "Request timeouts and Service Unavailable responses on Edge Network".
If your mobile apps are hitting a global edge URL, the latency might be occurring at the entry point before it even reaches your replicas, which is why scaling replicas doesn't fix it.
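One way to narrow this down is to time the same request from different vantage points (from inside a replica vs. through the public edge URL) and compare. A minimal timing helper, assuming Node 18+ with global fetch and performance; the URL below is a hypothetical placeholder, not your real service:

```javascript
// Time how long an async operation takes, in milliseconds.
async function timeAsync(fn) {
  const start = performance.now();
  await fn();
  return performance.now() - start;
}

// Example probe against a health endpoint through the public edge URL.
// Replace the placeholder URL with your own service's.
async function probeEdge() {
  const ms = await timeAsync(() =>
    fetch('https://myapp.up.railway.app/health')
  );
  console.log(`edge round trip: ${ms.toFixed(0)} ms`);
}
```

If the edge round trip is tens of seconds while the same request measured from inside the replica is fast, the problem is in front of your app, not in it.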
3. Outbound Networking Timeouts
Your database is fine, but check if your service makes calls to third-party APIs (Auth, Analytics, or Push Notifications).
Railway has had past issues with "Outbound networking to Azure timing out".
If a specific outbound dependency is hanging, in-flight requests stall waiting on it and the connection pool gets exhausted, leading to the 20-30s spikes seen in your screenshot.
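On Node 18+ you can put a hard ceiling on outbound calls with global fetch and AbortSignal.timeout, so a hung dependency fails fast instead of holding your own requests open. A sketch (the URL, timeout value, and null fallback are illustrative assumptions, not from this thread):

```javascript
// Wrap an outbound third-party call with a hard timeout. On timeout or
// network failure, log and return null so the caller can degrade gracefully
// instead of hanging.
async function callWithTimeout(url, ms = 5000) {
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(ms) });
    return await res.json();
  } catch (err) {
    // Catches TimeoutError (aborted by the signal) as well as ordinary
    // network errors; either way, fail fast rather than stall the request.
    console.error(`outbound call to ${url} failed: ${err.name}`);
    return null;
  }
}
```

Even if the root cause turns out to be elsewhere, explicit timeouts on every outbound dependency make this class of failure visible in your logs instead of showing up only as mystery latency.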
9 days ago
We have the same issue with our Node application. Nothing changed; everything was fine until 2-3 days ago. We redeployed the service and the database and checked everything, but nothing worked... We've since migrated to DigitalOcean and everything is back to normal. Is there support for this?
