a month ago
Evidence this is a networking/Redis-path problem, not my code:
/health endpoint (does not touch Redis) responds in ~500ms consistently:
$ time curl -s -o /dev/null -w "Total: %{time_total}s\n" https://api.myapp.com/health
Total: 0.475s
Total: 0.488s
Total: 0.715s
/nonexistent (a 404 that passes through my rate-limit middleware, which calls Redis INCR) takes 11-15 seconds:
$ time curl -s -o /dev/null -w "Total: %{time_total}s\n" https://api.myapp.com/nonexistent
Total: 11.355s
Total: 15.348s
Total: 15.747s
Total: 15.563s
The only difference between these two paths is one Redis call.
- Redis itself appears healthy in the Redis service logs:
Ready to accept connections tcp
- Memory < 1.2 MB with 7 keys loaded
- CPU ~0%
- No OOM, no MISCONF, no slow-log entries
- I restarted/redeployed the Redis service — no change.
- Backend request timing middleware log excerpt (every line is a single request):
[req] POST /refresh 200 12315ms
[req] POST /refresh 200 13054ms
[req] GET /active 200 14699ms
[req] POST /refresh 200 14866ms
[req] GET /active 200 14785ms
[req] GET /e0a40dca-... 200 14733ms
[req] POST /refresh 200 10645ms
[req] POST /refresh 200 11484ms
[req] POST /refresh 200 11032ms
[req] GET /nonexistent 404 14239ms
[req] GET /.env 404 10949ms
Even 404 responses for random nonexistent URLs take 10-14 seconds because they still pass through the rate-limit middleware.
- I ruled out:
  - Postgres slowness (pg_stat_activity shows no stuck queries, tables are tiny: 6 rows in refresh_tokens)
  - Query inefficiency (dropped LATERAL joins, added indexes; no change)
  - Cold starts (Hobby plan, sustained across many requests)
  - Sentry overhead (removed entirely; no change)
  - Concurrent request stampede (added dedup on /auth/refresh; verified 1 call, still slow)
  - Rate-limit misconfiguration (simple INCR + EXPIRE per request, tested on a tiny dataset)
- The app has only 2 test users. Redis has 7 keys. There is zero load.
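For reference, the INCR + EXPIRE pattern mentioned above looks roughly like the sketch below. This is not the actual middleware from the app; `CounterStore`, `hit`, `redisMs`, and the `ratelimit:` key format are hypothetical names for illustration, and the assumption is that an ioredis client satisfies the `CounterStore` shape. The sketch also records how long the store calls took, so the Redis wait can be logged separately from the rest of the request.

```typescript
// Minimal subset of a Redis client that this sketch needs.
// Assumption: an ioredis client satisfies this shape.
interface CounterStore {
  incr(key: string): Promise<number>;
  expire(key: string, seconds: number): Promise<number>;
}

interface RateLimitResult {
  allowed: boolean;
  count: number;
  redisMs: number; // wall-clock time spent waiting on the store
}

// Fixed-window limiter: INCR the per-client key, and set the TTL when the
// key is first created so the window resets after `windowSeconds`.
async function hit(
  store: CounterStore,
  clientId: string,
  limit: number,
  windowSeconds: number,
): Promise<RateLimitResult> {
  const key = `ratelimit:${clientId}`; // hypothetical key format
  const started = Date.now();
  const count = await store.incr(key);
  if (count === 1) {
    await store.expire(key, windowSeconds);
  }
  const redisMs = Date.now() - started;
  return { allowed: count <= limit, count, redisMs };
}
```

With an ioredis client this would be called as something like `await hit(redis, req.ip, 100, 60)` (the limit and window values are placeholders). Logging `redisMs` per request would show whether the 10-15s is actually spent inside the Redis calls.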
What I'd like help with:
- Is there a known issue with redis.railway.internal or private networking in my region?
- Can you confirm traffic between my backend and Redis services is routing correctly?
- Any diagnostics from your side that show latency or packet drops on my services' private network?
Update: Confirmed slowness is NOT specific to private networking. I switched my backend from redis.railway.internal:6547 to the public proxy URL trolley.proxy.rlwy.net:558754 and latency is identical (~11-15s per request). OPTIONS preflight requests (which short-circuit before my rate-limit middleware) complete in ~500ms. GET/POST requests (which do one Redis INCR via ioredis) take 11-15 seconds. Either my backend's connection to Redis has an internal issue (ioredis auto-reconnect storm?) or there's something weirder going on with traffic from this specific backend service.
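One way to test the reconnect-storm hypothesis is to count the client's connection lifecycle events and time a single command in isolation. The sketch below is a rough diagnostic under assumptions: `watchConnection`, `timeCall`, and `ConnectionEvents` are hypothetical helper names, and it assumes the ioredis client emits the `connect`, `ready`, `error`, `close`, `reconnecting`, and `end` events and exposes the usual EventEmitter `on` method.

```typescript
// Minimal event-emitter shape. Assumption: an ioredis client matches it,
// since it extends Node's EventEmitter.
interface ConnectionEvents {
  on(event: string, listener: (...args: unknown[]) => void): unknown;
}

// Connection lifecycle events this sketch listens for.
const EVENTS = ["connect", "ready", "error", "close", "reconnecting", "end"] as const;
type EventName = (typeof EVENTS)[number];

// Counts each lifecycle event so a reconnect loop shows up as a growing
// "close"/"reconnecting" count instead of a single "connect" at startup.
function watchConnection(
  client: ConnectionEvents,
  log: (line: string) => void = console.log,
): Record<EventName, number> {
  const counts: Record<EventName, number> = {
    connect: 0, ready: 0, error: 0, close: 0, reconnecting: 0, end: 0,
  };
  for (const name of EVENTS) {
    client.on(name, () => {
      counts[name] += 1;
      log(`[redis] ${name} #${counts[name]}`);
    });
  }
  return counts;
}

// Times one async call and returns both its result and the elapsed time.
async function timeCall<T>(
  label: string,
  fn: () => Promise<T>,
  log: (line: string) => void = console.log,
): Promise<{ value: T; ms: number }> {
  const started = Date.now();
  const value = await fn();
  const ms = Date.now() - started;
  log(`[redis] ${label} took ${ms}ms`);
  return { value, ms };
}
```

Usage would be `watchConnection(redis)` once at startup and `await timeCall("INCR", () => redis.incr("probe"))` inside a handler (the `probe` key is a placeholder). If the `reconnecting` count keeps climbing while requests are slow, that would support the reconnect-storm theory; if it stays at zero and the timed call is still slow, the time is being spent somewhere else.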
5 Replies
Status changed to Open Railway • 29 days ago
a month ago
- The latency seems to come from the Redis connection. You might be creating a new Redis connection for every single API request.
- A TCP handshake plus TLS setup on every request can easily take a few seconds under load.
- If you're using a framework like Next.js or a serverless-style backend, you need to initialize Redis globally. The fix: make your Redis client instance global so it is reused across requests.
bilalnawaz072
a month ago
Thanks — I checked and my Redis client is already a single top-level instance via export const redis = new Redis(REDIS_URL) in ioredis. Not created per-request. The issue persists even with a confirmed global client, so I don't think connection initialization is the cause. Would you mind checking network path diagnostics on the backend service's private networking?
a month ago
Update: confirmed my Redis client is a single global new Redis(REDIS_URL) at module top level, not created per-request. The "Redis connected" line only appears once at container startup. I've also confirmed the latency is consistent across every subsequent request, not just the first one (which rules out TCP/TLS handshake cost).
Switching between redis.railway.internal and the public *.proxy.rlwy.net URL makes no difference — both paths give 10-15s per Redis call.
For now I've worked around by moving rate limiting to in-memory, which makes my backend fast. Still keen on a root-cause investigation on the networking side since I have other legitimate Redis usage.
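The in-memory workaround isn't shown in the thread; a minimal sketch of what a per-process fixed-window limiter could look like is below. `InMemoryRateLimiter` and its parameters are hypothetical names, not the code actually deployed.

```typescript
// Per-process fixed-window counter. State is lost on restart and is not
// shared between replicas, which is the trade-off versus Redis. Stale
// entries are never evicted here, so a real version would need cleanup.
class InMemoryRateLimiter {
  private windows = new Map<string, { count: number; resetAt: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed. `now` is injectable for testing.
  hit(clientId: string, now: number = Date.now()): boolean {
    let entry = this.windows.get(clientId);
    if (!entry || now >= entry.resetAt) {
      // First request from this client, or the previous window has expired.
      entry = { count: 0, resetAt: now + this.windowMs };
      this.windows.set(clientId, entry);
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }
}
```

This keeps the request path free of network calls, which fits the observation that the backend is fast once Redis is out of the loop, but it only works as long as the backend runs as a single instance.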
a month ago
I'm not officially on the Railway team. I'm a developer suggesting a solution based on my experience. Your network flow logs should look like this, with 0ms latency.
So based on my experience, the likely issue is how you're using Redis rather than Railway itself. I use the snippet below for Redis. You need to export a global Redis instance instead of a freshly initialized one.
import Redis from "ioredis";

// 1. Define a function to create the instance
const redisClientSingleton = () => {
  return new Redis(process.env.REDIS_URL as string);
};
// 2. Extend the global object type
declare global {
  var redis: undefined | ReturnType<typeof redisClientSingleton>;
}
// 3. Use the existing global instance or create a new one
export const redis = globalThis.redis ?? redisClientSingleton();
// 4. In development, save the instance to the global object
//    so it survives module hot reloads
if (process.env.NODE_ENV !== 'production') {
  globalThis.redis = redis;
}
bilalnawaz072
a month ago
Thanks. My backend is a plain Node/Express app (not Next.js / serverless). I have a single export const redis = new Redis(REDIS_URL) at module scope. The "Redis connected" log fires once at container startup, not per request, confirming the client is reused. Latency is sustained across ALL requests after startup, not just the first, so it's not a connection-establishment cost.
I've worked around the issue by moving rate limiting to in-memory. Hoping someone from the Railway team can look at the networking path.