2 months ago
Hello,
I posted a ticket a few weeks ago and never got a response.
We had no issues until yesterday, but our production server now keeps becoming unresponsive for no apparent reason (cf. old ticket, link below). We decided to upgrade our plan to PRO to see if it would fix anything, and it just made everything worse. Our production server has gone down 4 times since yesterday evening (CET).
This situation cannot continue for us, we'd really appreciate your help.
Old ticket:
https://station.railway.com/questions/understanding-unwanted-redis-restart-814f1f2a
To recap the issue: it seems my server gets disconnected from the Redis instance (on Railway) when the Redis instance runs a backup.
Let us know if there's anything we can share to help resolve this ASAP.
7 Replies
2 months ago
Logs attached, showing that the error most likely comes from Redis.
Attachments
2 months ago
Do you have a link to a deployment where this happened?
I checked some hosts your deploys were on and there was nothing abnormal about them with all other user workloads operating normally. There is no infrastructure-level evidence suggesting this is an issue on our end as far as I can tell.
If your deployment is "non-responsive" without any logs etc. and requests are 502-ing, it's highly likely your app has crashed or is stuck in some infinite loop. If it crashes without exiting with a proper error code (e.g. 1), we don't automatically restart it (because we'd have no way to know whether it's crashed or running).
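To illustrate the restart behaviour described above, here is a rough sketch, assuming a Node.js app whose Redis client emits an `end` event once it has stopped reconnecting (ioredis does). `exitOnPermanentClose` and `fakeClient` are hypothetical names used only for this illustration, and a plain `EventEmitter` stands in for the client so the snippet runs without a Redis server.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical helper: exits the process with a non-zero code once the client
// signals that it has stopped reconnecting. `exit` is injectable so the
// behaviour can be checked without actually killing the process.
const exitOnPermanentClose = (
  client: EventEmitter,
  name: string,
  exit: (code: number) => void = (code) => process.exit(code),
): void => {
  client.once("end", () => {
    console.error(`[Redis:${name}] Connection closed for good, exiting`);
    exit(1);
  });
};

// Stand-in for a Redis client (assumption: the real client emits "end").
const fakeClient = new EventEmitter();
const exitCodes: number[] = [];
exitOnPermanentClose(fakeClient, "rate-limiter", (code) => exitCodes.push(code));

// Simulate the client giving up on reconnection.
fakeClient.emit("end");
console.log(exitCodes);
```

With a real client you would call `exitOnPermanentClose(client, name)` and leave out the third argument, so the process exits with code 1 instead of staying up in an unresponsive state.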
Status changed to Awaiting User Response Railway • about 2 months ago
2 months ago
Thanks for confirming. We identified the issue: our Redis client did not have a reconnection strategy configured.
That said, it was not trivial to figure out, since it is not documented that keepalive connections should be expected to fail at any time on Railway.
A transient socket close (likely from an idle timeout or a network hiccup) killed our Redis connections permanently, leaving the app unresponsive without crashing.
We've implemented reconnection logic and TCP keepalives.
Hopefully this is resolved now.
Thanks for the quick answer!
Status changed to Awaiting Railway Response Railway • about 2 months ago
Status changed to Solved ibadus • about 2 months ago
a month ago
Hey, coming back to this because the retry logic didn't fix the issue. We still have the problem even with a reconnection strategy on the Redis connection.
Below is the TypeScript code we use to create the Redis connections (using Private Networking from Railway):
import IORedis from "ioredis";
import { env } from "@/lib/env";

const createRedisClient = (url: string, name: string) => {
  const client = new IORedis(url, {
    maxRetriesPerRequest: null,
    enableReadyCheck: true,
    retryStrategy: (times) => {
      if (times > 10) {
        console.error(`[Redis:${name}] Max reconnection attempts reached`);
        return null;
      }
      const delay = Math.min(times * 100, 3000);
      console.warn(`[Redis:${name}] Reconnecting in ${delay}ms (attempt ${times})`);
      return delay;
    },
    reconnectOnError: (err) => {
      const targetErrors = ["READONLY", "ECONNRESET", "ETIMEDOUT", "ECONNREFUSED"];
      return targetErrors.some((e) => err.message.includes(e));
    },
  });

  client.on("error", (err) => console.error(`[Redis:${name}] Error:`, err.message));
  client.on("reconnecting", (ms: number) => console.warn(`[Redis:${name}] Reconnecting in ${ms}ms`));
  client.on("ready", () => console.log(`[Redis:${name}] Connected`));
  return client;
};

export const redisRateLimiter = createRedisClient(env.RATE_LIMIT_REDIS_URL, "rate-limiter");
export const api_key_redis = createRedisClient(env.API_KEYS_REDIS_URL, "api-keys");

Status changed to Awaiting Railway Response Railway • about 2 months ago
a month ago
I've attached the logs. It's also very strange that Railway observability shows 5XX/4XX errors on the dashboard even though our application receives no requests at all.
Attachments
a month ago
The whole app (blitz-api) was down from 25 Jan at ~10:50 PM to 26 Jan at ~9:03 (manual restart).
a month ago
Here are some tests run on localhost, killing the Redis instance to verify the retry logic:
Started development server: http://localhost:3000
[Redis:subscriber] Connected
[Redis:api-keys] Connected
[Redis:rate-limiter] Connected
[2026-01-26T09:50:00.023Z] info: --> POST /v2/enrichment/email 200 231ms
[Redis:rate-limiter] Reconnecting in 100ms (attempt 1)
[Redis:rate-limiter] Reconnecting in 100ms
[2026-01-26T09:50:37.071Z] info: <-- POST /v2/enrichment/email
[Redis:rate-limiter] Reconnecting in 200ms (attempt 2)
[Redis:rate-limiter] Reconnecting in 200ms
[Redis:rate-limiter] Reconnecting in 300ms (attempt 3)
[Redis:rate-limiter] Reconnecting in 300ms
[Redis:rate-limiter] Reconnecting in 400ms (attempt 4)
[Redis:rate-limiter] Reconnecting in 400ms
[Redis:rate-limiter] Reconnecting in 500ms (attempt 5)
[Redis:rate-limiter] Reconnecting in 500ms
[Redis:rate-limiter] Reconnecting in 600ms (attempt 6)
[Redis:rate-limiter] Reconnecting in 600ms
[Redis:rate-limiter] Reconnecting in 700ms (attempt 7)
[Redis:rate-limiter] Reconnecting in 700ms
[Redis:rate-limiter] Reconnecting in 800ms (attempt 8)
[Redis:rate-limiter] Reconnecting in 800ms
[Redis:rate-limiter] Reconnecting in 900ms (attempt 9)
[Redis:rate-limiter] Reconnecting in 900ms
[2026-01-26T09:50:46.692Z] info: <-- POST /v2/enrichment/email
[Redis:rate-limiter] Reconnecting in 1000ms (attempt 10)
[Redis:rate-limiter] Reconnecting in 1000ms
[Redis:rate-limiter] Max reconnection attempts reached
[2026-01-26T09:50:48.451Z] error: Connection is closed.
[2026-01-26T09:50:48.451Z] error: Connection is closed.
[2026-01-26T09:50:48.469Z] info: --> POST /v2/enrichment/email 500 11s
[2026-01-26T09:50:48.469Z] info: --> POST /v2/enrichment/email 500 2s
[2026-01-26T09:50:53.956Z] info: <-- POST /v2/enrichment/email
[2026-01-26T09:50:53.989Z] error: Connection is closed.
[2026-01-26T09:50:53.993Z] info: --> POST /v2/enrichment/email 500 36ms
[2026-01-26T09:51:00.020Z] info: <-- POST /v2/enrichment/email
[2026-01-26T09:51:00.044Z] error: Connection is closed.
[2026-01-26T09:51:00.046Z] info: --> POST /v2/enrichment/email 500 25ms
[2026-01-26T09:53:04.075Z] info: <-- POST /v2/enrichment/email
[2026-01-26T09:53:04.112Z] error: Connection is closed.
[2026-01-26T09:53:04.115Z] info: --> POST /v2/enrichment/email 500 39ms
a month ago
This thread has been marked as public for community involvement, as it does not contain any sensitive or personal information. Any further activity in this thread will be visible to everyone.
Status changed to Open Railway • about 1 month ago
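For reference, the localhost test above shows the client going quiet after "Max reconnection attempts reached". In ioredis, a `retryStrategy` that returns a non-number stops reconnection for good, and every later command then fails with "Connection is closed." Below is a minimal sketch of a capped backoff that never gives up, assuming that retrying indefinitely is acceptable for this app. `retryDelayMs` is a hypothetical helper, not part of ioredis.

```typescript
// Hypothetical helper (not an ioredis API): computes the delay before the next
// reconnection attempt. It always returns a number, so a client using it as
// its retryStrategy would keep retrying instead of closing permanently.
const retryDelayMs = (attempt: number, stepMs = 100, maxMs = 3000): number =>
  Math.min(Math.max(attempt, 1) * stepMs, maxMs);

// Delays for attempts 1, 10, 30 and 1000: linear growth, then capped at maxMs.
const sampleDelays = [1, 10, 30, 1000].map((attempt) => retryDelayMs(attempt));
console.log(sampleDelays);
```

It could be plugged into the options passed to `new IORedis(...)` as `retryStrategy: (times) => retryDelayMs(times)`, in place of the version that returns `null` after 10 attempts.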