3 months ago
Hi Railway team,
We’re seeing a sudden outage in the Singapore region where our Node/NestJS services started failing to connect to Redis and then crashed across all nodes.
What happened
Around 2025-12-16 07:26 (timestamp from app logs), Redis connections began timing out.
Shortly after, all Node services that depend on Redis crashed.
The error is consistent across services and looks like a network/connection issue rather than an auth/config change on our side (we didn’t deploy any related changes right before this).
Error logs
Redis Connection Error: ConnectionTimeoutError: Connection timeout
at Socket.<anonymous> (/app/node_modules/@redis/client/dist/lib/client/socket.js:177:124)
at Object.onceWrapper (node:events:631:28)
at Socket.emit (node:events:517:28)
at Socket._onTimeout (node:net:598:8)
at listOnTimeout (node:internal/timers:569:17)
at process.processTimers (node:internal/timers:512:7)
[Nest] 15 - 12/16/2025, 7:26:14 AM ERROR [ExceptionHandler] Connection timeout
Error: Connection timeout
at Socket.<anonymous> (/app/node_modules/@redis/client/dist/lib/client/socket.js:177:124)
at Object.onceWrapper (node:events:631:28)
at Socket.emit (node:events:517:28)
at process.processTimers (node:internal/timers:512:7)
npm warn config production Use `--omit=dev` instead.
What we’re doing now
We are currently redeploying all Node services and the Redis container to recover.
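In the meantime we're also adding a client-side reconnect strategy so a Redis timeout doesn't take the whole process down. A rough sketch of what we're trying, assuming node-redis v4 (the backoff base and cap below are values we picked for illustration, not anything prescribed by Railway or node-redis):

```typescript
// Exponential-backoff reconnect strategy of the kind node-redis v4
// accepts via `socket.reconnectStrategy`. Returning a number tells the
// client to retry after that many milliseconds instead of surfacing a
// fatal error that crashes the process.
function reconnectStrategy(retries: number): number {
  const base = 100;  // 100 ms initial delay (our assumption)
  const cap = 5_000; // never wait more than 5 s between attempts
  return Math.min(base * 2 ** retries, cap);
}

// Usage sketch (assuming node-redis v4):
//   const client = createClient({ socket: { reconnectStrategy } });
//   client.on('error', (err) => logger.warn('Redis error', err)); // log, don't crash

// Retries 0, 1, 2, 3 back off to 100, 200, 400, 800 ms; retry 10 is capped at 5000 ms.
console.log([0, 1, 2, 3, 10].map(reconnectStrategy));
```

Attaching an `error` listener matters as much as the backoff itself: without one, node-redis emits an unhandled `error` event and Node terminates the process, which matches the crash pattern in our logs.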
Request
Could you please check whether any network incidents or infrastructure degradation affected Redis connectivity in the Singapore region around that time?
If there’s anything you can see on your end (connection drops, networking problems, platform incident), we’d appreciate details and recommended mitigation steps.
Thanks!
6 Replies
Hi all,
We have an ongoing incident that this may be related to. We're investigating elevated network latency and slow deploys and are actively working to resolve this right now.
Best,
The Railway Team
Status changed to Awaiting User Response Railway • 3 months ago
Just an update: we are starting to see recovery, so this should resolve for you soon.
Hey folks, we are seeing the same issue, only for our Redis container, in Amsterdam, and only in our production environment. Other environments are working fine.
Some of our critical flows are affected by this. Do you have any ETA on when the fix will be in place? Happy to provide more details. Our Redis redeployment has been stuck in the "Creating containers" phase for the last 15 minutes or so.
Status changed to Awaiting Railway Response Railway • 3 months ago
Hey folks - apologies for the silence here. Are you still experiencing this issue?
Status changed to Awaiting User Response Railway • 3 months ago
Hi all, just checking in to make sure you're all set as the incident was resolved earlier today. Please let us know if we can help in any way.
We'll also be sharing a post-mortem for the incident. We're very sorry for any inconvenience.
Best,
The Railway Team
Status changed to Solved hjh010501 • 3 months ago
Hi, we've published an incident report for December 16, 2025: https://blog.railway.com/p/incident-report-december-16-2025
Once again, we're deeply sorry for the impact this caused. We're actively working to eliminate the class of issues that contributed to this incident and to make the platform more resilient.
Best,
The Railway Team
Status changed to Awaiting User Response Railway • 3 months ago
Status changed to Solved chandrika • 3 months ago
