Postgres unresponsive — degraded volume (US East), production down
sebastiamliyo
PRO OP

a month ago

My Postgres service (postgres-production-5798, US East / Virginia) shows as online but is completely unresponsive. External connections are reset with ECONNRESET about 7.4s into the TCP handshake. Deploy logs show checkpoints that write less than 1MB taking sync=2458s, total=3000s (improving, but still around 500s). The volume usage metric reads 0 B, which looks broken. The volume is 50GB, mounted at /var/lib/postgresql/data. All of this strongly indicates degraded volume storage on your side. Please inspect the volume health urgently; production is down.
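For anyone who wants to reproduce the connection symptom from outside the platform, here is a minimal Python sketch of a probe. It opens a TCP connection, sends the PostgreSQL `SSLRequest` packet (a healthy server answers with a single `S` or `N` byte), and reports whether the attempt was answered, reset, refused, or timed out, along with the elapsed time. The function name `probe_postgres` and the outcome labels are made up for illustration; this is not a Railway tool, and it only checks the wire-level handshake, not authentication or queries.

```python
import socket
import struct
import time

# SSLRequest from the PostgreSQL wire protocol:
# Int32 message length (8) followed by Int32 request code 80877103.
SSL_REQUEST = struct.pack("!II", 8, 80877103)


def probe_postgres(host, port, timeout=10.0):
    """Make one connection attempt and return (outcome, seconds_elapsed).

    The outcome labels are hypothetical names chosen for this sketch.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            sock.sendall(SSL_REQUEST)
            reply = sock.recv(1)
        if reply in (b"S", b"N"):
            outcome = "responsive"   # server answered the SSLRequest
        elif reply == b"":
            outcome = "closed"       # peer closed cleanly without answering
        else:
            outcome = "unexpected"   # something other than Postgres replied
    except ConnectionResetError:
        outcome = "reset"            # the ECONNRESET case from this thread
    except ConnectionRefusedError:
        outcome = "refused"
    except socket.timeout:
        outcome = "timeout"
    except OSError as exc:
        outcome = f"error: {exc}"
    return outcome, time.monotonic() - start
```

Calling `probe_postgres(host, port)` with your service's public host and port a few times in a row should show whether the reset is consistent and roughly how long it takes to occur.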

Solved

6 Replies

Status changed to Awaiting Railway Response Railway 27 days ago


axisor
PRO

a month ago

Same issue here at Axisor, multiple production projects affected in US East / Virginia. Would appreciate an update as soon as possible.

image.png

  • Carlos, CTO


sebastiamliyo
PRO OP

a month ago

+1 — same incident, my Postgres in US East / Virginia is fully unresponsive (ECONNRESET on connect). Production down. Please update.


a month ago

Same here. It has been more than an hour and there has been no incident response at all.


circlearoundhere
PRO

a month ago

Same here for 4+ hours


a month ago

This worked for me: using NO_CACHE=1 on Redis/Postgres and making changes on the mounted volume (creating and restoring the DB, or wiping it if you don't need the data, as with Redis).
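The create-and-restore step above could look roughly like the following Python sketch, which wraps the standard `pg_dump` and `pg_restore` command-line tools. It assumes both tools are installed locally, that the source database is reachable long enough to dump, and that you have connection URLs for the old and new databases. The helper names and the default `backup.dump` path are hypothetical, and this is not necessarily the exact procedure the poster followed.

```python
import subprocess


def dump_command(source_url, dump_path):
    """Build a pg_dump invocation that writes a custom-format archive."""
    return [
        "pg_dump",
        "--format=custom",      # archive format that pg_restore can read
        "--no-owner",           # skip ownership commands in the dump
        f"--file={dump_path}",
        f"--dbname={source_url}",
    ]


def restore_command(target_url, dump_path):
    """Build a pg_restore invocation that loads the archive into the target."""
    return [
        "pg_restore",
        "--clean",              # drop existing objects before recreating them
        "--if-exists",          # do not error if those objects are absent
        "--no-owner",
        f"--dbname={target_url}",
        dump_path,
    ]


def dump_and_restore(source_url, target_url, dump_path="backup.dump",
                     run=subprocess.run):
    """Dump the source database, then restore it into the target.

    `run` is injectable so the flow can be exercised without a real database.
    """
    run(dump_command(source_url, dump_path), check=True)
    run(restore_command(target_url, dump_path), check=True)
```

Because `check=True` is passed, a failed dump raises before the restore runs, so a partial archive is never loaded into the new database.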


a month ago

This should now be resolved. The root cause was an underlying issue on the host serving your service, which was fixed a few hours ago. Apologies for being late to follow up here. Can you confirm everything is back online on your end?

To stay resilient against hardware-level failures like this in the future, we'd recommend moving to a high-availability Postgres setup, which runs replicas with automatic failover so a single host going down doesn't take your database offline: https://docs.railway.com/databases/postgresql-ha. We'd also suggest enabling point-in-time recovery (https://docs.railway.com/volumes/point-in-time-recovery) and automated backups (https://docs.railway.com/volumes/backups) so you can restore quickly if anything goes wrong.


Status changed to Awaiting User Response Railway 27 days ago


Railway
BOT

20 days ago

This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!

Status changed to Solved Railway 20 days ago

