n8n Cluster Worker Not Dequeuing Jobs from Redis (Webhook Triggers Stuck)
jeremielining
PRO OP

2 months ago

Hi Railway Support,

I'm running an n8n cluster setup on Railway and encountering an issue where webhook-triggered executions get stuck indefinitely in the "Queued" status within the n8n application. I've exhausted standard n8n configuration troubleshooting and suspect an environmental issue.

My Setup:

  • Platform: Railway

  • Application: n8n (self-hosted cluster)

  • Services:

    • n8n-master (Handles UI, scheduling)

    • n8n-worker (Executes workflows)

    • n8n-whp (Handles incoming webhooks)

    • n8n-redis (Used as the queue broker)

    • n8n-pg (PostgreSQL database)

  • n8n Version: 1.116.1 (latest stable)

Problem Description:

  • Workflows triggered by the Webhook node immediately enter the "Queued" state and never transition to "Running" or "Succeeded".

  • This happens even with the absolute simplest workflow (Webhook -> Respond to Webhook).

  • Workflows triggered manually (e.g., Manual Trigger -> Set) execute successfully and instantly.

Evidence & Logs:

  • The n8n-whp service logs correctly show "Enqueued execution (job XX)" when a webhook request is received via Postman.

  • The n8n-worker service logs show no activity related to picking up or processing these jobs.

  • The n8n-master service logs show no relevant errors or activity when webhooks are triggered.

  • The n8n-redis service logs are clean, showing normal background saves and no connection or command errors.

  • External requests (e.g., from Postman) to the webhook URL eventually time out, as the workflow never completes.

Troubleshooting Steps Taken:

  • Confirmed sufficient resources (CPU/Memory) on n8n-master (1vCPU/2GB) and n8n-worker (4vCPU/8GB). Metrics do not show maxing out.

  • Confirmed EXECUTIONS_MODE=queue is set correctly on n8n-master, n8n-worker, and n8n-whp.

  • Confirmed EXECUTIONS_PROCESS=main is set correctly on n8n-worker.

  • Triple-checked Redis connection variables (QUEUE_BULL_REDIS_HOST, PORT, PASSWORD, USERNAME) on all services (master, worker, whp) and confirmed they exactly match the connection details from the n8n-redis service.

  • Explicitly set QUEUE_BULL_NAME=n8n_queue on master, worker, and whp.

  • Added N8N_SKIP_WEBHOOK_REGISTRATION_ON_BOOT=true to worker.
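A visual comparison of variables can miss a trailing space or a key that is absent on one service. The following is a minimal sketch, not an n8n or Railway tool, that compares pasted `KEY=value` text from each service. The helper names `parse_env` and `find_mismatches` are hypothetical, and the list of shared keys is an assumption based on the variables named in this post.

```python
"""Sketch: check that queue-related variables agree across n8n services."""

# Variables assumed to need identical values on every service sharing the queue.
SHARED_KEYS = [
    "EXECUTIONS_MODE",
    "QUEUE_BULL_REDIS_HOST",
    "QUEUE_BULL_REDIS_PORT",
    "QUEUE_BULL_REDIS_USERNAME",
    "QUEUE_BULL_REDIS_PASSWORD",
]


def parse_env(text):
    """Parse KEY=value lines into a dict, skipping blanks and # comments.

    Values are kept exactly as written (no trimming), so a stray trailing
    space shows up as a mismatch instead of being silently hidden.
    """
    env = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value
    return env


def find_mismatches(services, keys=SHARED_KEYS):
    """Return {key: {service: value}} for keys whose values differ.

    A key that is missing on a service is treated as None for that service.
    """
    mismatches = {}
    for key in keys:
        values = {name: env.get(key) for name, env in services.items()}
        if len(set(values.values())) > 1:
            mismatches[key] = values
    return mismatches


if __name__ == "__main__":
    # Made-up example values; paste each service's real variables instead.
    master = parse_env("EXECUTIONS_MODE=queue\nQUEUE_BULL_REDIS_PORT=6379")
    worker = parse_env("EXECUTIONS_MODE=queue\nQUEUE_BULL_REDIS_PORT=6379 ")
    for key, values in find_mismatches({"master": master, "worker": worker}).items():
        # repr() makes invisible differences such as whitespace visible.
        print(key, {name: repr(value) for name, value in values.items()})
```

Printing values with `repr()` is deliberate: it exposes differences such as `'6379'` versus `'6379 '` that look identical in a dashboard.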

Suspected Cause: The n8n worker service starts but does not log any attempt to connect to the Redis Bull queue, despite having EXECUTIONS_MODE=queue, EXECUTIONS_PROCESS=main, and the Redis connection variables set. This suggests either a networking issue within the Railway environment that prevents the worker from reaching Redis during startup, or a subtle incompatibility with the specific n8n cluster template/setup used on Railway, rather than an n8n configuration error.

Question: Could you please help investigate why the n8n-worker might be failing to dequeue jobs from the Redis queue in this Railway environment? Are there any specific network policies or configurations required for this type of inter-service communication via Redis Bull queue?

Thank you for your assistance.

$10 Bounty

2 Replies

brody
EMPLOYEE

2 months ago

This thread has been marked as public for community involvement, as it does not contain any sensitive or personal information. Any further activity in this thread will be visible to everyone.

Status changed to Open by brody about 2 months ago


jeremielining
PRO OP

2 months ago

Update:
1. Added N8N_PROCESS_TYPE=worker: I added this environment variable specifically to the n8n-worker service and redeployed it.
2. Verified N8N_DISABLE_PRODUCTION_MAIN_PROCESS=true: I confirmed this variable was already present and set to true on the n8n-master service.
3. Re-verified All Other Settings: I have double-checked and confirmed that all other suggested variables are set correctly across master, worker, and whp, including:

  • EXECUTIONS_MODE=queue

  • EXECUTIONS_PROCESS=main (on worker)

  • Correct Redis connection variables (using private host)

  • QUEUE_BULL_REDIS_DUALSTACK=true

  • QUEUE_BULL_NAME=n8n_queue

Result: Unfortunately, despite verifying all these configurations and adding N8N_PROCESS_TYPE=worker, the issue persists. Even the simplest webhook workflow (Webhook -> Respond to Webhook) immediately gets stuck in the "Queued" status.

Core Observation: The n8n-worker service starts up successfully, but its startup logs still show no attempt to connect to Redis or initialize the Bull queue. This remains the key symptom.

This further strengthens the suspicion that the issue lies within the Railway environment or deployment configuration and prevents the worker from properly initializing its queue processing mode, rather than in an n8n application setting.

Any further insights or suggestions would be greatly appreciated.


andresndp
FREE

a month ago

This might be an issue with the Private Network Configuration.

Are you using the private redis.railway.internal hostname or the public Redis URL?

Are all 5 services (n8n-master, n8n-worker, n8n-whp, n8n-redis, n8n-pg) deployed in the same Railway project?

Is your n8n worker configured to listen on :: (IPv6) or only 0.0.0.0 (IPv4)?
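One way to answer the hostname and IPv4/IPv6 questions from inside the worker container is to resolve the Redis host and attempt a plain TCP connection to each resolved address. This is a rough sketch using only the Python standard library; the `probe` helper is hypothetical, and the default host and port are assumptions taken from this thread, not values confirmed for any particular deployment. If the private hostname resolves only to IPv6 addresses, a client that tries only IPv4 would fail to connect.

```python
import os
import socket

FAMILY_NAMES = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}


def probe(host, port, timeout=3.0):
    """Try a TCP connection to every address that `host` resolves to.

    Returns a list of (family_name, address, reachable) tuples.
    Raises socket.gaierror if the hostname cannot be resolved at all.
    """
    results = []
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in infos:
        name = FAMILY_NAMES.get(family, str(family))
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            reachable = True
        except OSError:
            # Covers refused connections, timeouts, and unsupported families.
            reachable = False
        results.append((name, sockaddr[0], reachable))
    return results


if __name__ == "__main__":
    # Assumed defaults; the worker's own variables take precedence if set.
    host = os.environ.get("QUEUE_BULL_REDIS_HOST", "redis.railway.internal")
    port = int(os.environ.get("QUEUE_BULL_REDIS_PORT", "6379"))
    try:
        for family, address, reachable in probe(host, port):
            status = "reachable" if reachable else "unreachable"
            print(f"{family} {address} {status}")
    except socket.gaierror as exc:
        print(f"could not resolve {host}: {exc}")
```

A successful TCP connection only shows that the network path is open; it does not prove that Redis authentication succeeds or that the worker process is actually subscribing to the queue.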

