12 days ago
Hi everyone,
I’m experiencing a serious performance issue with one of my production services. Requests are taking unusually long to process, and overall response times are extremely slow.
This service is a backend running on Node.js (v22.22.2). I haven’t made any changes in the past month, and the issue started suddenly yesterday.
So far, I’ve tried:
Restarting and redeploying the service
Scaling by duplicating replicas across two regions
Unfortunately, none of these actions improved the situation.
The database appears to be functioning normally: I can connect remotely, and queries are responding within expected timeframes.
I also checked the Railway status page, and everything seems operational (all green), so I’m unsure if there’s an ongoing incident affecting performance.
This is a user-facing production system, also consumed by mobile apps, so the impact is significant.
Is anyone else experiencing similar issues, or is there any known problem that could explain this behavior?
Any guidance would be greatly appreciated.
3 Replies
Status changed to Open Railway • 12 days ago
12 days ago
Here’s a screenshot covering the last 7 days, you can clearly see the degradation starting yesterday.
For context, my last release was about a month ago, and there haven’t been any deployments or changes since then.
Attachments
10 days ago
Hey, I've run into this in my own Node project, and it took a while to figure out, but I'm sure I can help. Your issue is likely one of the problems below, as it was in my case. Let me know how else I can help.
1. Node.js 22 "Adaptive Semi-Space" Issue
One of the most documented performance regressions in Node.js 22 involves a change in how V8 manages the Young Generation (semi-space) memory.
The Problem: V8 now attempts to size the memory used for short-lived objects "adaptively". In certain workloads, especially those doing heavy JSON parsing or object transformation, the semi-space can shrink too far (sometimes to 1MB).
The Symptom: This forces objects into the "Old Space" prematurely, triggering massive, "stop-the-world" Major GC pauses. These pauses look exactly like your chart: flat median latency (p50) but extreme spikes in p99.
Why it started yesterday: It often triggers once you hit a specific traffic threshold or a slight change in payload size that breaks the adaptive heuristic.
Fix: Explicitly set the semi-space size in your Railway start command:
node --max-semi-space-size=64 index.js (or 128, depending on your RAM).
2. Railway Edge Network Latency
While the status page is green, historical incidents and recent user reports (as recently as yesterday, May 1) suggest localized issues in specific regions like us-east4 or the Edge Network.
In late March, Railway experienced "Request timeouts and Service Unavailable responses on Edge Network".
If your mobile apps are hitting a global edge URL, the latency might be occurring at the entry point before it even reaches your replicas, which is why scaling replicas doesn't fix it.
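One way to narrow this down is to time the same request from different vantage points (from inside a replica vs. through the public edge URL) and compare. A minimal timing helper, assuming Node 18+ with global fetch and performance; the URL below is a hypothetical placeholder, not your real service:

```javascript
// Time how long an async operation takes, in milliseconds.
async function timeAsync(fn) {
  const start = performance.now();
  await fn();
  return performance.now() - start;
}

// Example probe against a health endpoint through the public edge URL.
// Replace the placeholder URL with your own service's.
async function probeEdge() {
  const ms = await timeAsync(() =>
    fetch('https://myapp.up.railway.app/health')
  );
  console.log(`edge round trip: ${ms.toFixed(0)} ms`);
}
```

If the edge round trip is tens of seconds while the same request measured from inside the replica is fast, the problem is in front of your app, not in it.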
3. Outbound Networking Timeouts
Your database is fine, but check if your service makes calls to third-party APIs (Auth, Analytics, or Push Notifications).
Railway has had past issues with "Outbound networking to Azure timing out".
If a specific outbound dependency is hanging, in-flight requests stall waiting on it and the connection pool gets exhausted, leading to the 20-30s spikes seen in your screenshot.
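On Node 18+ you can put a hard ceiling on outbound calls with global fetch and AbortSignal.timeout, so a hung dependency fails fast instead of holding your own requests open. A sketch (the URL, timeout value, and null fallback are illustrative assumptions, not from this thread):

```javascript
// Wrap an outbound third-party call with a hard timeout. On timeout or
// network failure, log and return null so the caller can degrade gracefully
// instead of hanging.
async function callWithTimeout(url, ms = 5000) {
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(ms) });
    return await res.json();
  } catch (err) {
    // Catches TimeoutError (aborted by the signal) as well as ordinary
    // network errors; either way, fail fast rather than stall the request.
    console.error(`outbound call to ${url} failed: ${err.name}`);
    return null;
  }
}
```

Even if the root cause turns out to be elsewhere, explicit timeouts on every outbound dependency make this class of failure visible in your logs instead of showing up only as mystery latency.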
9 days ago
We have the same issue with our Node application. Nothing changed; everything was fine until 2-3 days ago. We redeployed the service and the database and checked everything, but nothing worked... We've since migrated to DigitalOcean and everything is back to normal. Is there support for this?
