3 months ago
Hi. I have a n8n environment with 4 services, Worker, Primary, Redis and Postgres (from template https://railway.com/deploy/n8n-with-workers).
My Worker suddenly crashed after updating image, and it seems that the logs are pointing to it being a Postgres/db issue.
I wanted to test a bit and revert changes if it didn't solve it, but my first change caused the issue I'm now having, where it just loads on the database connection part in the Postgres service. I tried to add public address and use "psql" too, but that also times out now.
The change I did was to use the GUI in Postgres service and entered the "workflow_entity" table and set a row's "versionId" to null or empty string.
Immediately after pressing "save" it got stuck in this loop that seems to be impossible to get out of, I've tried restart, redeploy (redeploy fails) and log out and in again
Should I expect to be able to connect to the database again after doing this, or was that a "dangerous" change and it might not be a fix and I must delete my database/volume?
3 Replies
3 months ago
Hey there! We've found the following might help you get unblocked faster:
If you find the answer from one of these, please let us know by solving the thread!
3 months ago
WHAT HAPPENED (The Real Issue)
When you edited workflow_entity.versionId manually in the Railway Postgres GUI, you did something n8n does not allow at the DB level:
versionId is a NOT NULL + foreign key–dependent field
In n8n:
workflow_entity.versionIdis requiredIt references
workflow_version.idIt is used during boot to migrate, load workflows, and lock versions
Setting it to
NULLor empty string means…
n8n boot MIGRATIONS now crash mid-transaction
When the Worker/Primary starts, n8n loads all workflows:
SELECT ... FROM workflow_entity w
JOIN workflow_version v ON w.versionId = v.id
But since a row has versionId = null, Postgres throws:
ERROR: null value in column "versionId" violates not-null constraint
Depending on where the failure occurred, n8n leaves a transaction open or keeps retrying.
WHY THE POSTGRES SERVICE NOW HANGS
Railway Postgres logs show the “Database Connection” spinner forever because:
n8n is repeatedly attempting migrations → locking system catalog
This causes:
Table locks
Long-running transactions
Any new connection (psql, Railway GUI) = timeout
GOOD NEWS: You probably do NOT need to delete your database
Because…
Railway almost never becomes unrecoverable
99% of these cases are fixed by killing the bad query and correcting the corrupted row.
THE FIX (safe, no data loss)
You need to:
Stop the Worker + Primary
Kill stuck connections
Connect to Postgres
Restore the broken row’s
versionId
STEP 1 — Stop services
In Railway:
Stop:
Worker
Primary (main n8n)
Leave only:
Postgres
Redis
This prevents n8n from instantly reconnecting and re-locking the DB.
STEP 2 — Kill all active pids in Postgres
Railway → Postgres → Shell → run:
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = current_database()
AND pid <> pg_backend_pid();
If GUI is frozen, run via psql:
psql "postgresql://USER:PASSWORD@HOST:PORT/railway"
Then run the terminate query again.
STEP 3 — Fix the broken row
Find the corrupted workflow:
SELECT id, name, versionId FROM workflow_entity WHERE versionId IS NULL OR versionId = '';
You should get 1 row—the one you changed.
Now find the latest version:
SELECT id FROM workflow_version WHERE workflowId = '<workflow-id>' ORDER BY createdAt DESC LIMIT 1;
Then restore it:
UPDATE workflow_entity
SET "versionId" = '<the version id you found>'
WHERE id = '<workflow id>';
STEP 4 — Verify the schema
Check:
\d workflow_entity
Confirm from output:
versionId UUID NOT NULL
If it's NOT NULL, you MUST set a value again (n8n requires it).
STEP 5 — Restart n8n Primary (only)
If it boots normally, restart Worker after.
If this works → your DB is safe
If all connections still hang after terminating backends
You have 2 possibilities:
Option A — There's still an open transaction from n8n (most likely)
Run:
SELECT * FROM pg_stat_activity;
Look for long-running queries like:
idle in transaction
Kill them with:
SELECT pg_terminate_backend(<pid>);
Keep terminating until none remain.
Option B — Corrupt row causes boot-loop
If you cannot restore the versionId because you don't know it:
You can copy version from another workflow version:
UPDATE workflow_entity SET "versionId" = (
SELECT id FROM workflow_version WHERE workflowId = workflow_entity.id LIMIT 1
)
WHERE workflow_entity.versionId IS NULL OR workflow_entity.versionId = '';
🧨 MUST I DELETE THE DATABASE? (Rare)
Only if ALL of these are true:
You cannot connect
Terminating backends fails
Even
psqltimes out when services are stoppedRailway DB logs show corrupted Postgres cluster, not locked tables
From your description, this is only a lock, not corruption.
You should NOT need to delete your DB.
3 months ago
Hi, I would ignore the obvious bot response before mine. Here's what I think is going on as I'm having the exact same issue (not with N8n though). Internal networking seems to not work at all in my new project. Using public Host & Port / URL works everytime. So far i've tried with a Mysql DB, Redis and MinIO service..