Railway proxy timeout
thecarnivalalldaybuffet
HOBBYOP

3 months ago

Subject: Intermittent Postgres disconnects via maglev.proxy.rlwy.net:47180 (ECONNRESET → connect timeouts) from external client

Service: Railway Managed Postgres
Host/Port: maglev.proxy.rlwy.net:47180
DB name: railway
Region: US California
Client app: n8n hosted on Render ($7 Starter web service)
Client driver/runtime: Node 18, pg 8.12.0 via pg-pool 3.6.2

Summary

We see stable DB operation for ~10–20 minutes, then a burst of:

  1. `Connection terminated unexpectedly`

  2. `read ECONNRESET`

  3. repeated `timeout exceeded when trying to connect` (bursts lasting 1–10+ minutes)

During those windows, new connections fail and existing ones drop. After we manually restart the DB, it works again until the next burst. Load is very light.

Client connection settings

  • SSL enabled (rejectUnauthorized=false)

  • Pool size: 3

  • Connect timeout: 70,000 ms

  • Idle connection timeout: 70,000 ms

  • App ping/keepalive: lightweight query every 5s

  • Workload: small n8n metadata/executions reads; no long-running queries
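For reference, the settings above map onto a node-postgres (pg 8.x) pool configuration roughly like this. The connection string credentials are placeholders, and the `keepAlive` options are an assumption we are considering for the NAT idle-timeout hypothesis, not our exact production config:

```javascript
// Sketch of the pool settings listed above, for node-postgres (pg 8.x).
// user/pass in the connection string are placeholders, not real credentials.
const poolConfig = {
  connectionString: 'postgresql://user:pass@maglev.proxy.rlwy.net:47180/railway',
  ssl: { rejectUnauthorized: false }, // SSL on, server cert not verified
  max: 3,                             // pool size
  connectionTimeoutMillis: 70000,     // connect timeout
  idleTimeoutMillis: 70000,           // idle connection timeout
  keepAlive: true,                    // enable TCP keepalives on client sockets
  keepAliveInitialDelayMillis: 10000, // first probe after 10 s idle (assumed value)
};

// Usage (requires the `pg` package):
//   const { Pool } = require('pg');
//   const pool = new Pool(poolConfig);
```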

Timeline (UTC) from client logs (2025-09-09)

16:07:35 OK (periodic query)
16:08:35 OK
16:09:19 ERROR Connection terminated unexpectedly
16:09:27 ERROR read ECONNRESET (twice)
16:10:29 ERROR timeout exceeded when trying to connect
16:10:35 ERROR timeout exceeded when trying to connect
16:11:31, 16:12:33, 16:13:35, 16:14:35, 16:15:35 Repeated connect timeouts
16:14:48 ERROR Failed to hard-delete executions (root cause: timeout exceeded when trying to connect)

Railway Postgres logs after our manual restart (same day)

16:25:39 starting PostgreSQL 17.6
16:25:40 database system was interrupted; last known up at 16:18:35
16:25:40 database system was not properly shut down; automatic recovery in progress
16:25:40 redo starts at 0/32F5850
16:25:40 invalid record length at 0/32F5990: expected at least 24, got 0
16:25:40 redo done; checkpoint end-of-recovery
16:25:40 database system is ready to accept connections

Notably, client-side errors begin at 16:09:19, roughly nine minutes before the DB's "last known up" timestamp of 16:18:35, and the restart log reports an unclean shutdown. In other words, the database appears to have still been up while connections were already failing, which suggests proxy/LB issues or backend host/container problems that preceded the final DB interruption.

What we tried

  • Tuned pool size/timeouts and added 5s app-level pings.

  • Restarted DB; problem recurs later.

  • Traffic is minimal; other app calls (HTTP, Google Sheets OAuth refresh) succeed—only DB path fails.
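One mitigation we have not yet deployed is wrapping queries in an exponential-backoff retry so that short ECONNRESET bursts don't fail workflows outright. A minimal sketch (`op` is a hypothetical stand-in for a `pool.query(...)` call, not code from our app):

```javascript
// Retry a flaky async operation with capped exponential backoff.
// `op` stands in for something like () => pool.query('SELECT 1').
async function withRetry(op, { retries = 5, baseDelayMs = 500, maxDelayMs = 8000 } = {}) {
  let lastErr;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastErr = err;
      if (attempt === retries) break;
      // 500 ms, 1 s, 2 s, 4 s, 8 s (capped), ...
      const delay = Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastErr;
}
```

This would paper over transient resets but not the multi-minute black-hole windows, which exceed any reasonable retry budget.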

Hypotheses

  • Maglev proxy/LB instability in our region causing RSTs, then “black-holed” connects.

  • DB container/host events (OOM/host drain/migration) leading to unclean shutdown.

  • Aggressive TCP/NAT idle timeouts on the proxy path despite app pings.

  • Less likely: per-client limits at very low connection counts.

Requests to Railway

  1. Check proxy/LB logs for maglev.proxy.rlwy.net:47180 between 16:08–16:16 UTC and surrounding minutes for RSTs, health flaps, backend detachments for our DB.

  2. Check DB container/host events around 16:09–16:19 UTC (OOM kills, node drain, maintenance).

  3. Confirm effective TCP/idle timeouts and recommended Postgres/pg client keepalive settings (tcp_keepalives_*), or other best-practice values for external clients.

  4. Advise on any known incidents in this region during that window.

  5. Is there a way to obtain a direct endpoint (bypassing shared proxy) or recommended PgBouncer approach for cross-provider clients?
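On point 5, if a direct endpoint isn't available, a small PgBouncer instance in front of the DB is a common pattern for cross-provider clients. A hypothetical pgbouncer.ini sketch (host, port, paths, and sizes are placeholders, not a recommendation from Railway):

```ini
[databases]
; Placeholder target; would point at the Railway Postgres host
railway = host=maglev.proxy.rlwy.net port=47180 dbname=railway

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction     ; reuse server connections aggressively
default_pool_size = 5
server_idle_timeout = 60    ; recycle idle server connections
tcp_keepalive = 1           ; TCP keepalives toward the server
```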

Context

We may migrate this DB to Render Postgres (private network) to avoid cross-provider hops, but we’d like root cause on the Railway side to decide future usage and to help you triage a potential platform issue.

Happy to provide full client/server logs or run a short reproduction window if needed.

Solved

6 Replies

Railway
BOT

3 months ago

Hey there! We've found the following might help you get unblocked faster:

- [🧵 Postgres slow/timeout](https://station.railway.com/questions/postgres-slow-timeout-dfffea16)
- [🧵 Postgres ECONNRESET / unable to connect using TCP Proxy](https://station.railway.com/questions/postgres-econnreset-unable-to-connect-0aba867d)
- [🧵 Postgres Connection Limit and Timeout](https://station.railway.com/questions/postgres-connection-limit-and-timeout-2173af9b)

If you find the answer from one of these, please let us know by solving the thread!

thecarnivalalldaybuffet
HOBBYOP

3 months ago

No, these did not help.


Railway
BOT

3 months ago

Hello!

We're acknowledging your issue and attaching a ticket to this thread.

We don't have an ETA for it, but our engineering team will take a look, and you will be updated as we update the ticket.

Please reply to this thread if you have any questions!


jake
EMPLOYEE

3 months ago

The edge proxies will restart at some point. Could you please try using private networking instead?

I've attached this thread to our 'Support hotreloading connections across machines' ticket.


Status changed to Awaiting User Response Railway 3 months ago


thecarnivalalldaybuffet
HOBBYOP

3 months ago

So you are saying I should try private networking between Render and Railway, is that correct?


Status changed to Awaiting Railway Response Railway 3 months ago


david
EMPLOYEE

3 months ago

Private networking between Railway services. Per docs here: https://docs.railway.com/guides/private-networking


Status changed to Awaiting User Response Railway 3 months ago


Railway
BOT

2 months ago

This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!

Status changed to Solved Railway 2 months ago

