read ECONNRESET in both Postgres and Redis

a month ago

Hello, I'm having an issue with the Postgres and Redis connections in a Node.js app.

I get these errors:

[ioredis] Unhandled error event: Error: connect ETIMEDOUT

at Socket.<anonymous> (/app/node_modules/ioredis/built/Redis.js:171:41)

at Object.onceWrapper (node:events:633:28)

at Socket.emit (node:events:519:28)

at Socket._onTimeout (node:net:604:8)

at listOnTimeout (node:internal/timers:585:17)

at process.processTimers (node:internal/timers:521:7)

⚠️ [Postgres LISTEN error]: read ECONNRESET

It's been months now and I'm still getting these errors, and I cannot figure out why. I'm using the private network to connect the Node.js app to these services.

The issue has been happening nearly every day or every two days for a long time now. I want to find a solution so that our services keep running without crashing or going down.


$20 Bounty

3 Replies

Status changed to Open Railway about 1 month ago


Status changed to Awaiting Railway Response Railway about 1 month ago


Status changed to Awaiting User Response Railway about 1 month ago


ve-jo
HOBBYTop 5% Contributor

a month ago

For Redis and Postgres this should be treated as a normal transient TCP disconnect, but there is one important Postgres-specific detail: if the failing connection is used for LISTEN, you must re-create the connection and re-run every LISTEN channel after reconnect. A normal pg.Pool error handler is not enough for notification listeners.

Recommended split:

  1. Use a normal pg.Pool only for queries.
  2. Use a dedicated pg.Client for LISTEN/NOTIFY.
  3. On error or end for the listener client:
    • close/discard that client
    • create a new client
    • connect again
    • re-run all LISTEN ... statements
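
The re-subscribe step can be sketched as a small helper. This is only an illustration: quoteIdent, resubscribe, and the channel names are hypothetical, and the client is assumed to be any object with a pg-style query(sql) method that returns a promise.

```javascript
// Hypothetical helper names; only client.query(sql) is assumed from pg.

// Quote a channel name as a SQL identifier so unusual names are passed through safely.
function quoteIdent(name) {
  return '"' + String(name).replace(/"/g, '""') + '"';
}

// Re-issue LISTEN for every channel on a freshly connected client.
async function resubscribe(client, channels) {
  for (const channel of channels) {
    await client.query('LISTEN ' + quoteIdent(channel));
  }
}
```

Call resubscribe(newClient, channels) right after the new client's connect() resolves. Quoted identifiers are case-sensitive, so the channel names must match what the NOTIFY side uses.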

For Redis with ioredis, make sure you have both a retry strategy and an error listener, so connection errors are logged and retried instead of crashing the process.
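
A rough sketch of the retry-strategy half, assuming ioredis's retryStrategy(times) option, which returns the delay in milliseconds before the next attempt. makeRetryStrategy and its default values are hypothetical and chosen only for illustration.

```javascript
// Hypothetical factory: capped exponential backoff for ioredis's retryStrategy option.
function makeRetryStrategy({ baseMs = 200, maxMs = 10000 } = {}) {
  // ioredis calls the strategy with the attempt number (1, 2, 3, ...).
  return (times) => Math.min(baseMs * 2 ** (times - 1), maxMs);
}

// Usage, assuming ioredis is installed and REDIS_URL is set (both are assumptions):
//   const Redis = require('ioredis');
//   const redis = new Redis(process.env.REDIS_URL, { retryStrategy: makeRetryStrategy() });
//   redis.on('error', (err) => console.error('[Redis]', err.message));
```

The cap keeps the client retrying at a steady interval during a long outage instead of backing off indefinitely.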

Also confirm you are using the private *.railway.internal host from the same Railway environment. Railway private networking is environment-isolated, and the docs describe it as service-to-service networking over internal DNS.
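
One way to check this at startup is a small guard over the connection URLs. This is a sketch under assumptions: the function names are hypothetical, and the variable names you pass in (for example DATABASE_URL or REDIS_URL) depend on how your service is configured.

```javascript
// Hypothetical check: does a connection URL point at a Railway private hostname?
function usesPrivateHost(connectionUrl) {
  const { hostname } = new URL(connectionUrl);
  return hostname.endsWith('.railway.internal');
}

// Return the names of the given env variables that are set but not on the private network.
function findPublicHosts(env, names) {
  return names.filter((name) => env[name] && !usesPrivateHost(env[name]));
}
```

For example, findPublicHosts(process.env, ['DATABASE_URL', 'REDIS_URL']) returns the variables that still point at a public host, which you can log as a warning when the app boots.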

Some useful docs as well:

https://docs.railway.com/networking/private-networking

https://docs.railway.com/networking/private-networking/how-it-works


Status changed to Awaiting Railway Response Railway 29 days ago


Status changed to Awaiting User Response Railway 29 days ago


a month ago

Thanks a lot for your reply

I followed your suggestion, but I'm still seeing a lot of disconnect errors.

The main issue is that I've been getting these errors a lot recently. I've had the same logic for Redis and pg LISTEN for a long time, but I still get these errors often, sometimes once a week, sometimes once every two weeks.

The system runs fine as usual, but sometimes I get these errors and the whole API cannot connect to the Postgres database or Redis. Often a restart or redeploy is needed to restore the system.


Status changed to Awaiting Railway Response Railway 29 days ago


suryalim11
HOBBYTop 10% Contributor

a month ago

Since the issue persists even with retry logic, and happens every 1-2 days requiring a full restart/redeploy to fix, this points to connections getting into a permanently broken state rather than just a transient disconnect.

The most likely root cause: your LISTEN client and/or ioredis get into a state where they stop retrying after too many consecutive failures. Here is a more resilient approach:

For the Postgres LISTEN connection, you need a dedicated client with forced recreation on any error or end event:

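const { Client } = require('pg');

let listenClient = null;
let reconnectTimer = null;

// Schedule a single reconnect even if both 'error' and 'end' fire for the same failure.
function scheduleReconnect() {
  if (reconnectTimer) return;
  reconnectTimer = setTimeout(() => {
    reconnectTimer = null;
    startListening();
  }, 5000);
}

async function startListening() {
  if (listenClient) {
    // Discard the old client; the no-op handler swallows late errors from its dead socket.
    listenClient.removeAllListeners();
    listenClient.on('error', () => {});
    listenClient.end().catch(() => {});
  }
  listenClient = new Client({ connectionString: process.env.DATABASE_URL });
  // Attach handlers before connecting so no event is missed.
  // handleNotification is your own function for processing NOTIFY payloads.
  listenClient.on('notification', handleNotification);
  listenClient.on('error', scheduleReconnect);
  listenClient.on('end', scheduleReconnect);
  try {
    await listenClient.connect();
    await listenClient.query('LISTEN your_channel');
  } catch (err) {
    console.error('[Postgres LISTEN]', err.message);
    scheduleReconnect();
  }
}
startListening();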

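For ioredis, set an explicit retryStrategy and, optionally, reconnectOnError:

const Redis = require('ioredis');

const redis = new Redis({
  host: process.env.REDIS_HOST,
  port: 6379,
  // Delay in ms before each reconnect attempt, capped at 10 seconds.
  retryStrategy: (times) => Math.min(times * 500, 10000),
  reconnectOnError: (err) => /ECONNRESET|ETIMEDOUT/.test(err.message),
  // null means commands wait for the connection instead of failing after N retries.
  maxRetriesPerRequest: null,
  enableOfflineQueue: true,
});
redis.on('error', (err) => console.error('[Redis]', err.message));

One caveat: according to the ioredis documentation, reconnectOnError is evaluated against error replies from the Redis server (the documented example is READONLY), not against socket errors. When ECONNRESET or ETIMEDOUT occurs, the old TCP connection is dead, the client closes it, and retryStrategy decides when to open a new one. With maxRetriesPerRequest set to null and the offline queue enabled, pending commands wait for the new connection instead of failing.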
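
To see whether a client has really stopped retrying, one option is to log its connection lifecycle events. The sketch below is an assumption-laden illustration: logLifecycle is a hypothetical helper, and the event names are the ones ioredis documents ('end' is emitted when no further reconnect attempts will be made).

```javascript
// Hypothetical helper: log connection lifecycle events from any EventEmitter-based client.
function logLifecycle(client, name, log = console.log) {
  const events = ['connect', 'ready', 'close', 'reconnecting', 'end'];
  for (const event of events) {
    client.on(event, () => log(`[${name}] ${event} at ${new Date().toISOString()}`));
  }
}

// Usage, assuming the `redis` instance from your own setup:
//   logLifecycle(redis, 'redis');
```

If the logs show 'end' shortly before the API stops responding, the client gave up reconnecting; if they show an endless run of 'reconnecting' with no 'ready', the network path itself is the more likely problem.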

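Also keep the pool's idleTimeoutMillis short so idle connections are recycled before they go stale, and attach an error handler so an error on an idle client does not crash the process:

const { Pool } = require('pg');

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 10000,
});
// Idle clients can emit errors such as ECONNRESET; unhandled, they become uncaught exceptions.
pool.on('error', (err) => console.error('[Postgres pool]', err.message));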

One more thing to check: have your errors increased since the recent Railway GCP outage? There may be lingering network instability, in which case a full redeploy of all services (app + Postgres + Redis) to get fresh network assignments could help.


Status changed to Open chandrika 28 days ago

