Recurring Timeout Issues on n8n Server - Postgres Checkpoint Blocking System
orithee
PROOP

5 months ago

Title:

Recurring Timeout Issues on n8n Server - Postgres Checkpoint Blocking System

Description:

I'm running a self-hosted n8n server on Railway with a Pro Plan and experiencing severe, recurring timeout issues that are affecting my server's availability.

Symptoms:

- Webhook requests failing with "timeout exceeded when trying to connect"

- Server returns "Database is not ready!" during failures

- Issue repeats every 2-4 hours consistently

- Logs show Postgres checkpoints taking 16-17 seconds

Example from logs:

2025-10-22 18:28:04 - timeout exceeded when trying to connect

2025-10-22 18:28:05 - Error in handling webhook request GET /webhook/xxx: timeout exceeded when trying to connect

2025-10-22 18:20:12 - checkpoint complete: wrote 170 buffers (1.0%); write=16.931s, sync=0.008s, total=17.101s
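
To compare checkpoints against timeouts, log lines like the ones above can be parsed and put side by side. The following is a minimal Python sketch, not part of the original setup: the function names are hypothetical, the regex assumes the exact line layout shown in the excerpt, and the 8 kB block size is the Postgres default, assumed here rather than confirmed for this database.

```python
import re
from datetime import datetime

# Assumption: default Postgres block size (8 kB); not confirmed for this database.
BLOCK_SIZE_BYTES = 8192
TS_FORMAT = "%Y-%m-%d %H:%M:%S"

# Matches the "checkpoint complete" line layout shown in the excerpt above.
CHECKPOINT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) - checkpoint complete: "
    r"wrote (?P<buffers>\d+) buffers \([\d.]+%\); "
    r"write=(?P<write>[\d.]+)s, sync=(?P<sync>[\d.]+)s, total=(?P<total>[\d.]+)s"
)


def parse_checkpoint(line):
    """Return the checkpoint fields as a dict, or None for any other log line."""
    m = CHECKPOINT_RE.match(line.strip())
    if m is None:
        return None
    return {
        "timestamp": datetime.strptime(m["ts"], TS_FORMAT),
        "buffers": int(m["buffers"]),
        "write_s": float(m["write"]),
        "sync_s": float(m["sync"]),
        "total_s": float(m["total"]),
    }


def written_kib(checkpoint):
    """Approximate amount of data written by the checkpoint, in KiB."""
    return checkpoint["buffers"] * BLOCK_SIZE_BYTES / 1024


def seconds_after_checkpoint(checkpoint, other_line):
    """Seconds between the checkpoint line and another timestamped log line."""
    other_ts = datetime.strptime(other_line[:19], TS_FORMAT)
    return (other_ts - checkpoint["timestamp"]).total_seconds()


checkpoint = parse_checkpoint(
    "2025-10-22 18:20:12 - checkpoint complete: wrote 170 buffers (1.0%); "
    "write=16.931s, sync=0.008s, total=17.101s"
)
timeout_line = "2025-10-22 18:28:04 - timeout exceeded when trying to connect"
print(written_kib(checkpoint))                             # 1360.0
print(seconds_after_checkpoint(checkpoint, timeout_line))  # 472.0
```

For the excerpt above this reports about 1360 KiB written and a gap of 472 seconds between the checkpoint line and the first timeout line. Running the same comparison over a longer stretch of logs would show whether the two events line up consistently.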

Current Setup:

- Railway Pro Plan active ✅

- CPU Usage: ~30-40% (not high)

- Memory Usage: ~50-60% (normal)

- Sufficient resources allocated to the service

- Architecture: 2 Workers + Primary + Redis + Postgres (all on Railway)

What I've tried:

- ✅ Enabled execution pruning

- ✅ Increased connection pool (DB_POSTGRESDB_POOL_SIZE=20)

- ✅ Reduced logging

- ✅ Increased timeouts

- ✅ Disabled unnecessary execution saves

- ❌ Nothing has helped significantly

My question:

Why are Postgres checkpoints taking so long (16+ seconds) on Pro Plan with available resources? This blocks the entire system and severely impacts availability. Is this:

1. A Disk I/O issue specific to my allocated node?

2. A Postgres configuration that needs tuning?

3. A known issue with Railway Postgres?
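
For option 2, the checkpoint-related settings can be read from `pg_settings` before deciding whether anything needs tuning. The sketch below is an illustration under stated assumptions: it assumes the `psycopg2` driver is installed and that a `DATABASE_URL` environment variable holds the connection string (neither is confirmed for this setup), and the helper names are hypothetical. The setting names themselves are standard Postgres parameters.

```python
import os

# Standard Postgres parameters that influence checkpoint behaviour.
CHECKPOINT_SETTINGS = [
    "checkpoint_timeout",
    "checkpoint_completion_target",
    "max_wal_size",
    "shared_buffers",
    "log_checkpoints",
]

SETTINGS_QUERY = (
    "SELECT name, setting, unit FROM pg_settings "
    "WHERE name = ANY(%s) ORDER BY name"
)


def format_setting(name, setting, unit):
    """Render one pg_settings row; unit is None for unitless settings."""
    if unit:
        return f"{name} = {setting} (unit: {unit})"
    return f"{name} = {setting}"


def fetch_checkpoint_settings(dsn):
    """Fetch the rows for CHECKPOINT_SETTINGS from the given database."""
    # Imported lazily so the pure helpers above work without the driver.
    import psycopg2

    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            # psycopg2 adapts the Python list to a Postgres array for ANY(...).
            cur.execute(SETTINGS_QUERY, (CHECKPOINT_SETTINGS,))
            return cur.fetchall()
    finally:
        conn.close()


def main():
    # Assumption: DATABASE_URL holds the Postgres connection string.
    for name, setting, unit in fetch_checkpoint_settings(os.environ["DATABASE_URL"]):
        print(format_setting(name, setting, unit))
```

Calling `main()` against both a healthy and a problematic database would show whether their checkpoint settings actually differ.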

I would appreciate understanding how to resolve this or if resource/node reallocation is needed.

Additional context:

- Service: n8n (self-hosted)

- Database: Railway Postgres (same project)

- Critical for production (WhatsApp bot serving customers)

- Need 99%+ uptime

Thank you for your help!

$10 Bounty

5 Replies

Railway
BOT

5 months ago

Hey there! We've found the following might help you get unblocked faster:

- [🧵 I can't deploy n8n with Redis and Postgres](https://station.railway.com/questions/i-can-t-deploy-n8n-with-redis-and-postgr-104e28d7)
- [🧵 Postgres slow/timeout](https://station.railway.com/questions/postgres-slow-timeout-dfffea16)
- [🧵 Postgres Connection Limit and Timeout](https://station.railway.com/questions/postgres-connection-limit-and-timeout-2173af9b)
- [🧵 n8n Send Email Node timeout](https://station.railway.com/questions/n8n-send-email-node-timeout-c7eaacbf)

If you find the answer from one of these, please let us know by solving the thread!

orithee
PROOP

5 months ago

Not good enough.
This is very important, and I have already tried everything.

This is not OK. I need to fix it.
It's a production environment!


5 months ago

This thread has been marked as public for community involvement, as it does not contain any sensitive or personal information. Any further activity in this thread will be visible to everyone.

Status changed to Open brody 5 months ago


orithee
PROOP

5 months ago

*** More Data ***
I already have a few projects, all with the same configuration, and this is the only one that gives me timeout problems. I don't understand what could be causing this.


orithee
PROOP

5 months ago

Thank you for your response!
Can you explain the options in more depth?

1. How do I do that? I am afraid to move my existing Postgres to a new one and make changes to it ...
2. Can you explain this in more depth? How do I set up this layer, and what are its benefits?
3. EXECUTIONS_DATA_PRUNE_HARD_DELETE: this does not exist .. I will check on it! Thanks!
4. I have plenty of resources .. how do I control the I/O? The metrics are OK.
5. How does it work? What do I need to do?

Another point: I have a few projects with the same configuration, and still the errors happen only on this project.
My current problematic project:

Another project that works fine:


orithee
PROOP

5 months ago

@Railway !!!
I have proof that this is an infrastructure issue:
Two identical n8n setups:
- Project A: ZERO timeout errors ✅
- Project B: Constant timeouts every few hours ❌
Same code, same config, same (minimal) load.

Please investigate the node/disk allocated to Project B and migrate it to better infrastructure.
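
A side-by-side comparison like this can be backed with numbers by timing connections to each project's database from the same place. The following is a minimal Python sketch using only the standard library; the function names are hypothetical and the host and port are placeholders to be filled in per project. It measures TCP connect time only and does not authenticate to Postgres, so it says nothing about pool exhaustion or query latency.

```python
import socket
import time


def time_tcp_connect(host, port, timeout_s=5.0):
    """Return seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return None


def summarize(samples):
    """Summarize a list of connect times where None marks a failed attempt."""
    ok = [s for s in samples if s is not None]
    return {
        "attempts": len(samples),
        "failures": len(samples) - len(ok),
        "max_s": max(ok) if ok else None,
        "avg_s": sum(ok) / len(ok) if ok else None,
    }


def probe(host, port, attempts=10, pause_s=1.0):
    """Probe one database endpoint several times and summarize the results."""
    samples = []
    for _ in range(attempts):
        samples.append(time_tcp_connect(host, port))
        time.sleep(pause_s)
    return summarize(samples)
```

Running `probe(...)` against the database host of Project A and of Project B over the same period gives two summaries that can be attached to the thread.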
