3 months ago
Subject: Intermittent Postgres disconnects via maglev.proxy.rlwy.net:47180 (ECONNRESET → connect timeouts) from external client
Service: Railway Managed Postgres
Host/Port: maglev.proxy.rlwy.net:47180
DB name: railway
Region: US California
Client app: n8n hosted on Render ($7 Starter web service)
Client driver/runtime: Node 18, pg 8.12.0 via pg-pool 3.6.2
Summary
We see stable DB operation for ~10–20 minutes, then a burst of:
Connection terminated unexpectedly
read ECONNRESET (repeated)
timeout exceeded when trying to connect (for 1–10+ minutes)
During those windows, new connections fail and existing ones drop. After we manually restart the DB, it works again until the next burst. Load is very light.
Client connection settings
SSL enabled (rejectUnauthorized=false)
Pool size: 3
Connect timeout: 70,000 ms
Idle connection timeout: 70,000 ms
App ping/keepalive: lightweight query every 5s
Workload: small n8n metadata/executions reads; no long-running queries
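For reference, here is a rough sketch of how the settings above would map onto a pg-pool configuration object. The option names (`max`, `connectionTimeoutMillis`, `idleTimeoutMillis`, `keepAlive`, `keepAliveInitialDelayMillis`, `ssl`) are node-postgres options; the helper name `buildPoolConfig` and the keepalive values are assumptions for illustration, not what n8n sets internally.

```javascript
// Sketch only: builds a plain config object mirroring the settings listed above.
// `buildPoolConfig` is a hypothetical helper name.
function buildPoolConfig(connectionString) {
  return {
    connectionString,
    ssl: { rejectUnauthorized: false }, // SSL enabled, certificate not verified
    max: 3,                             // pool size
    connectionTimeoutMillis: 70000,     // connect timeout
    idleTimeoutMillis: 70000,           // idle connection timeout
    keepAlive: true,                    // assumption: TCP keepalive, not in the settings above
    keepAliveInitialDelayMillis: 10000, // assumption: illustrative value only
  };
}

// Usage (requires the `pg` package; DATABASE_URL is a placeholder):
//   const { Pool } = require('pg');
//   const pool = new Pool(buildPoolConfig(process.env.DATABASE_URL));
//   pool.on('error', (err) => console.error('idle client error:', err.message));
//   setInterval(() => pool.query('SELECT 1').catch(console.error), 5000); // 5s ping
```

The `pool.on('error', ...)` handler in the usage comment matters here: without it, an error on an idle pooled client is emitted as an unhandled `error` event and can crash the process.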
Timeline (UTC) from client logs (2025-09-09)
16:07:35 OK (periodic query)
16:08:35 OK
16:09:19 ERROR Connection terminated unexpectedly
16:09:27 ERROR read ECONNRESET (twice)
16:10:29 ERROR timeout exceeded when trying to connect
16:10:35 ERROR timeout exceeded when trying to connect
16:11:31, 16:12:33, 16:13:35, 16:14:35, 16:15:35 Repeated connect timeouts
16:14:48 ERROR Failed to hard-delete executions (root cause: timeout exceeded when trying to connect)
Railway Postgres logs after our manual restart (same day)
16:25:39 starting PostgreSQL 17.6
16:25:40 database system was interrupted; last known up at 16:18:35
16:25:40 database system was not properly shut down; automatic recovery in progress
16:25:40 redo starts at 0/32F5850
16:25:40 invalid record length at 0/32F5990: expected at least 24, got 0
16:25:40 redo done; checkpoint end-of-recovery
16:25:40 database system is ready to accept connections
Notably, client-side errors begin at 16:09:19, while the DB reports last known up at 16:18:35 and an unclean shutdown at restart time. That suggests proxy/LB issues or backend host/container problems that preceded the final DB interruption.
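To make the gap concrete, here is a small sketch that computes the intervals between the timestamps quoted above. It assumes the Railway log timestamps are UTC on the same day as the client logs; the variable names are illustrative.

```javascript
// Convert an HH:MM:SS timestamp (same UTC day assumed) to seconds since midnight.
function toSeconds(hms) {
  const [h, m, s] = hms.split(':').map(Number);
  return h * 3600 + m * 60 + s;
}

const firstClientError = '16:09:19'; // first "Connection terminated unexpectedly"
const dbLastKnownUp = '16:18:35';    // from the Postgres recovery log
const dbRestart = '16:25:39';        // "starting PostgreSQL 17.6"

// Seconds between the first client-side error and the DB's "last known up" time.
const errorLeadSeconds = toSeconds(dbLastKnownUp) - toSeconds(firstClientError);
// Seconds between "last known up" and the manual restart.
const downSeconds = toSeconds(dbRestart) - toSeconds(dbLastKnownUp);

console.log(`client errors lead "last known up" by ${errorLeadSeconds}s`);
console.log(`"last known up" to restart: ${downSeconds}s`);
```

With these values the client was already failing 556 seconds (about 9 minutes) before the DB's last recorded "up" time, and the restart came 424 seconds after it.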
What we tried
Tuned pool size/timeouts and added 5s app-level pings.
Restarted DB; problem recurs later.
Traffic is minimal; other app calls (HTTP, Google Sheets OAuth refresh) succeed—only DB path fails.
Hypotheses
Maglev proxy/LB instability in our region causing RSTs, then “black-holed” connects.
DB container/host events (OOM/host drain/migration) leading to unclean shutdown.
Aggressive TCP/NAT idle timeouts on the proxy path despite app pings.
Less likely: per-client limits at very low connection counts.
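One way to separate the first and third hypotheses from the client side is a raw TCP probe that records whether a connect attempt is reset, refused, or silently times out. The sketch below uses only Node's built-in `net` module; `classifySocketError` and `probe` are hypothetical helper names. A successful TCP connect only shows that the proxy listener accepted the socket, not that the Postgres backend behind it is healthy.

```javascript
const net = require('node:net');

// Map a Node socket error code to a coarse failure category.
function classifySocketError(code) {
  switch (code) {
    case 'ECONNRESET': return 'reset';     // peer sent RST
    case 'ECONNREFUSED': return 'refused'; // nothing listening on the port
    case 'ETIMEDOUT': return 'timeout';    // OS-level connect timeout
    default: return 'other';
  }
}

// Attempt one raw TCP connect and resolve with the outcome and elapsed time.
function probe(host, port, timeoutMs = 5000) {
  return new Promise((resolve) => {
    const started = Date.now();
    const socket = net.connect({ host, port });
    const finish = (outcome) => {
      socket.destroy();
      resolve({ outcome, ms: Date.now() - started });
    };
    socket.setTimeout(timeoutMs); // fires 'timeout' if the connect stalls (black-holed)
    socket.once('connect', () => finish('connected'));
    socket.once('timeout', () => finish('timeout'));
    socket.once('error', (err) => finish(classifySocketError(err.code)));
  });
}

// Usage:
//   probe('maglev.proxy.rlwy.net', 47180).then(console.log);
```

Running the probe on a timer during a failure window and logging the outcomes would show whether failures are mostly `reset` (consistent with the proxy or backend tearing down connections) or `timeout` (consistent with black-holed connects).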
Requests to Railway
Check proxy/LB logs for maglev.proxy.rlwy.net:47180 between 16:08–16:16 UTC and surrounding minutes for RSTs, health flaps, backend detachments for our DB.
Check DB container/host events around 16:09–16:19 UTC (OOM kills, node drain, maintenance).
Confirm effective TCP/idle timeouts and recommended Postgres/pg client keepalive settings (tcp_keepalives_*), or other best-practice values for external clients.
Advise on any known incidents in this region during that window.
Is there a way to obtain a direct endpoint (bypassing shared proxy) or recommended PgBouncer approach for cross-provider clients?
Context
We may migrate this DB to Render Postgres (private network) to avoid cross-provider hops, but we’d like root cause on the Railway side to decide future usage and to help you triage a potential platform issue.
Happy to provide full client/server logs or run a short reproduction window if needed.
6 Replies
3 months ago
Railway
Hey there! We've found the following might help you get unblocked faster:
- [🧵 Postgres slow/timeout](https://station.railway.com/questions/postgres-slow-timeout-dfffea16)
- [🧵 Postgres ECONNRESET / unable to connect using TCP Proxy](https://station.railway.com/questions/postgres-econnreset-unable-to-connect-0aba867d)
- [🧵 Postgres Connection Limit and Timeout](https://station.railway.com/questions/postgres-connection-limit-and-timeout-2173af9b)
If you find the answer from one of these, please let us know by solving the thread!
3 months ago
No, these did not help.
3 months ago
Hello!
We're acknowledging your issue and attaching a ticket to this thread.
We don't have an ETA for it, but our engineering team will take a look and you will be updated as we update the ticket.
Please reply to this thread if you have any questions!
3 months ago
The edge proxies will restart at some point. Could you please try using private networking instead?
I've attached a ticket to our 'Support hotreloading connections across machines' ticket
Status changed to Awaiting User Response Railway • 3 months ago
3 months ago
So you are saying I should try private networking between Render and Railway, is that correct?
Status changed to Awaiting Railway Response Railway • 3 months ago
3 months ago
Private networking between Railway services. Per docs here: https://docs.railway.com/guides/private-networking
Status changed to Awaiting User Response Railway • 3 months ago
2 months ago
This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!
Status changed to Solved Railway • 2 months ago