3 months ago
Hello Railway Support,
I’m experiencing a critical issue with my n8n setup connected to Redis on Railway. Everything worked fine until yesterday. After I tried to use the RAG tool, the service froze, and since then the connections to Redis no longer work.
Context:
- I have 4 services running in Railway:
1. Primary (main n8n instance)
2. Worker (n8n queue workers)
3. Redis (Bitnami Redis image managed by Railway)
4. Postgres (database for n8n)
- Everything was stable and running normally until yesterday.
Problem:
Since yesterday, both Primary and Worker fail to connect to Redis. Logs always show:
[Redis client] getaddrinfo ENOTFOUND redis.railway.internal
Lost Redis connection. Trying to reconnect...
Unable to connect to Redis after trying for 10s
Exiting process due to Redis connection error
This happens even though Redis starts correctly and shows “Ready to accept connections tcp”.
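A note on reading that log: `getaddrinfo ENOTFOUND` is a name-lookup failure, so the client never reaches the TCP stage, and Redis being "ready to accept connections" does not rule it out. Below is a minimal sketch for checking, from inside the same project, whether the hostname resolves and for which address family. It assumes a container with Python available; `resolve_by_family` and `diagnose` are hypothetical helper names, not part of n8n or Railway.

```python
import socket


def resolve_by_family(host, port, resolver=socket.getaddrinfo):
    """Look the host up separately for IPv4 and IPv6.

    Returns a dict such as {"IPv4": [], "IPv6": ["fd12::1"]}.
    `resolver` is injectable so the logic can be exercised without a network.
    """
    families = {"IPv4": socket.AF_INET, "IPv6": socket.AF_INET6}
    found = {}
    for label, family in families.items():
        try:
            infos = resolver(host, port, family, socket.SOCK_STREAM)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the textual address.
            found[label] = sorted({info[4][0] for info in infos})
        except socket.gaierror:
            # Same class of failure that Node reports as ENOTFOUND.
            found[label] = []
    return found


def diagnose(found):
    """Turn the lookup result into a one-line reading."""
    if not found["IPv4"] and not found["IPv6"]:
        return "no records: the name does not resolve from this container"
    if not found["IPv4"]:
        return "IPv6-only: a client that asks only for IPv4 will fail the lookup"
    if not found["IPv6"]:
        return "IPv4-only: resolution works for IPv4 clients"
    return "dual-stack: resolution works for both families"
```

Calling `diagnose(resolve_by_family("redis.railway.internal", 6379))` from the Primary or Worker environment would show which case applies. If it reports IPv6-only, the n8n documentation lists a `QUEUE_BULL_REDIS_DUALSTACK` setting for dual-stack Redis connections; whether that is the cause here is an assumption to verify, not a confirmed diagnosis.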
What I have already tried:
1. Checked environment variables
- Both Primary and Worker are configured with:
QUEUE_BULL_REDIS_HOST=${{Redis.REDISHOST}}
QUEUE_BULL_REDIS_PORT=${{Redis.REDISPORT}}
QUEUE_BULL_REDIS_USERNAME=${{Redis.REDISUSER}}
QUEUE_BULL_REDIS_PASSWORD=${{Redis.REDIS_PASSWORD}}
- I confirmed multiple times that these values point to the current Redis instance.
- I also deleted old variables referencing redis.railway.internal.
2. Deleted and redeployed Redis
- I cleared the Redis volumes, but then started getting permission errors (mkdir: cannot create directory '/bitnami/redis': Permission denied).
- I deleted and recreated the Redis service multiple times, but Primary and Worker still keep trying to connect to redis.railway.internal instead of the new hostname ($RAILWAY_PRIVATE_DOMAIN).
3. Fully redeployed Primary and Worker
- I manually redeployed both services, but the error persists.
4. Checked n8n configuration warnings
- The logs also mention that the file /home/node/.n8n/config has incorrect permissions (0644 too open). I suspect Railway may be storing old Redis settings inside this file.
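To separate "the variables are wrong" from "the variables are right but the lookup fails", it can help to print what the running process actually receives. The sketch below audits the four `QUEUE_BULL_REDIS_*` variables listed above, masks the password, and flags values that are empty or still contain an unrendered `${{ ... }}` reference. It assumes Python is available in the container being checked; `audit_redis_env` is a hypothetical helper name.

```python
import os

# The four variables from the configuration above.
REDIS_VARS = (
    "QUEUE_BULL_REDIS_HOST",
    "QUEUE_BULL_REDIS_PORT",
    "QUEUE_BULL_REDIS_USERNAME",
    "QUEUE_BULL_REDIS_PASSWORD",
)


def audit_redis_env(env):
    """Return a list of (name, shown_value, problem) tuples.

    `problem` is None when the value looks plausible.
    """
    findings = []
    for name in REDIS_VARS:
        value = env.get(name)
        if not value:
            problem = "missing or empty"
        elif "${{" in value:
            # The reference was passed through literally instead of rendered.
            problem = "unrendered reference"
        elif name.endswith("_PORT") and not value.isdigit():
            problem = "port is not a number"
        else:
            problem = None
        if not value:
            shown = ""
        elif name.endswith("_PASSWORD"):
            shown = "***"  # never print the secret itself
        else:
            shown = value
        findings.append((name, shown, problem))
    return findings


if __name__ == "__main__":
    for name, shown, problem in audit_redis_env(os.environ):
        print(f"{name}={shown!r} -> {problem or 'ok'}")
```

If every line reports `ok` and the host is the one the logs complain about, the variables are being applied as configured and the problem lies in name resolution rather than in cached settings.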
Current situation:
- Redis is running fine and ready to accept TCP connections.
- Primary and Worker still cannot resolve the Redis host and exit after 10s with ENOTFOUND redis.railway.internal.
- Postgres continues to work normally.
- I suspect either:
1. Railway is caching old variables or hostnames, or
2. Something inside the n8n container (/home/node/.n8n/config) is overriding the environment variables.
Request:
Could you please:
- Check why my Primary and Worker are still trying to connect to redis.railway.internal instead of the current Redis private domain ($RAILWAY_PRIVATE_DOMAIN)?
- Confirm if Railway is persisting old volumes or settings even after I delete and recreate Redis.
- Help me reset Redis and reconnect it properly to n8n without depending on the old redis.railway.internal.
This is urgent because all my n8n workflows and credentials are stuck and I cannot continue working.
Thank you for your support!
4 Replies
Railway
Hey there! We've found the following might help you get unblocked faster:
- [📚 ENOTFOUND redis.railway.internal](https://docs.railway.com/reference/errors/enotfound-redis-railway-internal)
- [🧵 ENOTFOUND redis.railway.internal](https://station.railway.com/questions/enotfound-redis-railway-internal-94515829)
- [🧵 "n8n deployment crashed" error](https://station.railway.com/questions/n8n-deployment-crashed-error-0f992914)
If you find the answer from one of these, please let us know by solving the thread!
3 months ago
Hello Railway Team, thank you for the suggestions. I already went through the documentation and applied all the recommended fixes:
- All 4 services (Primary, Worker, Redis, and Postgres) are inside the same project/workspace.
- I updated the environment variables to use ${{Redis.REDISHOST}}, which resolves to redis.railway.internal.
- I tested both redis and redis.railway.internal as QUEUE_BULL_REDIS_HOST.
- I removed old/duplicate variables and confirmed that only the correct ones are present.
However, the issue persists. Both the Primary and Worker containers continue to show the same error:
[Redis client] getaddrinfo ENOTFOUND redis.railway.internal
Lost Redis connection. Trying to reconnect...
Unable to connect to Redis after trying for 10s
Redis itself starts correctly and is ready to accept TCP connections, but the other services inside the same Railway project cannot resolve redis.railway.internal. This setup was working fine until yesterday. The issue started only after I tried using the RAG tool, and since then, both Primary and Worker have been unable to connect to Redis.
Could you please check if the internal DNS resolution for redis.railway.internal is functioning correctly within my workspace, or if something is wrong with the private networking between services? Thank you for your help.
3 months ago
Just wanted to check a few things before this gets escalated as a platform issue, which I'm hoping it's not.
1. Under your Redis service → Settings → Networking → Private Networking, is the address there still redis.railway.internal or is it different?
2. Under your Primary (Worker afterwards) service, have you tried pressing CTRL+K (CMD+K on Mac) to open the command palette, then clicking "Deploy latest version" to make sure that none of the variables or data from the existing deployment are preserved during a re-deploy?
3. Can you connect to the Redis database through either of these (or any other) single-service templates, deployed to your project temporarily? https://railway.com/deploy/dbgate or https://railway.com/deploy/drizzle-studio-gateway (use the password variable to log in)
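If deploying a template is not convenient, the connectivity test in step 3 can also be approximated with a few lines of code. This is a minimal sketch that speaks the Redis wire protocol (RESP) directly over a socket: it optionally sends `AUTH`, then `PING`, and returns the first reply line. It assumes Python is available in a service inside the same project; `encode_command` and `ping` are hypothetical helper names, and the username form of `AUTH` assumes Redis 6 or later.

```python
import socket


def encode_command(*parts):
    """Encode a Redis command as a RESP array of bulk strings."""
    chunks = [b"*%d\r\n" % len(parts)]
    for part in parts:
        data = part.encode()
        chunks.append(b"$%d\r\n" % len(data) + data + b"\r\n")
    return b"".join(chunks)


def ping(host, port, password=None, username=None, timeout=5.0):
    """Connect, authenticate if a password is given, and send PING.

    Returns the reply line, e.g. "+PONG" on success or an "-ERR ..." /
    "-NOAUTH ..." line on failure. Name-lookup and connection errors
    propagate as exceptions, which is itself diagnostic.
    """
    # create_connection tries every address getaddrinfo returns,
    # so it works for both IPv4 and IPv6 targets.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        reader = sock.makefile("rb")
        if password:
            if username:
                auth = encode_command("AUTH", username, password)
            else:
                auth = encode_command("AUTH", password)
            sock.sendall(auth)
            reply = reader.readline().decode().strip()
            if not reply.startswith("+OK"):
                return reply
        sock.sendall(encode_command("PING"))
        return reader.readline().decode().strip()
```

A `socket.gaierror` from `ping("redis.railway.internal", 6379, password=...)` points at name resolution, a timeout or refused connection points at networking, and a `-NOAUTH` or `-WRONGPASS` reply points at credentials.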
