Postgres unresponsive — degraded volume (US East), production down
sebastiamliyo
PRO OP

9 days ago

My Postgres service (postgres-production-5798, US East / Virginia) shows as online but is completely unresponsive. External connections fail with ECONNRESET about 7.4s into the TCP handshake. Deploy logs show checkpoints that write less than 1MB taking sync=2458s, total=3000s; this is improving, but checkpoints still take around 500s. The volume usage metric reads 0 B, which looks broken. The volume is 50GB, mounted at /var/lib/postgresql/data. Together this strongly suggests degraded volume storage on your side. Please inspect the volume health urgently: production is down.
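
For anyone trying to reproduce the "online but unresponsive" symptom described above, here is a minimal sketch of a timing probe. It opens a TCP connection, sends the PostgreSQL protocol's `SSLRequest` packet, and reports whether a reply, a reset, or a timeout came back, and after how many seconds. The function name `probe_postgres` and the outcome labels are made up for illustration, and the host and port you pass in are your own. A proxy or pooler in front of the database could answer on Postgres's behalf, so treat the result as a hint rather than proof.

```python
import socket
import struct
import time

# SSLRequest from the PostgreSQL wire protocol: Int32 length (8) followed by
# Int32 request code 80877103. A live server answers with one byte: b"S" or b"N".
SSL_REQUEST = struct.pack("!II", 8, 80877103)


def probe_postgres(host: str, port: int = 5432, timeout: float = 10.0):
    """Hypothetical helper: return (outcome, seconds_elapsed) for one attempt."""
    start = time.monotonic()
    try:
        # The timeout applies to the connect and to the later send/recv calls.
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(SSL_REQUEST)
            reply = sock.recv(1)
    except ConnectionResetError:
        outcome = "reset"  # what Node.js reports as ECONNRESET
    except ConnectionRefusedError:
        outcome = "refused"
    except socket.timeout:
        outcome = "timeout"
    except OSError as exc:
        outcome = f"error: {exc}"
    else:
        if reply in (b"S", b"N"):
            outcome = "postgres_responded"
        elif reply == b"":
            outcome = "closed_without_reply"
        else:
            outcome = "unexpected_reply"
    return outcome, time.monotonic() - start
```

Calling `probe_postgres("your-proxy-host", your_port)` a few times and comparing the elapsed values would show whether the failure delay is consistent, which is the kind of detail that helps support staff.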

Solved

6 Replies

Status changed to Awaiting Railway Response Railway 9 days ago


axisor
PRO

9 days ago

Same issue here at Axisor, multiple production projects affected in US East / Virginia. Would appreciate an update as soon as possible.

![image.png](https://station-server.railway.com/attachments/att_01ksqwwtjqeh5r52z96j76vvqk)

- Carlos, CTO

sebastiamliyo
PRO OP

9 days ago

+1 — same incident, my Postgres in US East / Virginia is fully unresponsive (ECONNRESET on connect). Production down. Please update.


9 days ago

Same here. It started more than an hour ago and there has been no incident response at all.


circlearoundhere
PRO

9 days ago

Same here for 4+ hours


9 days ago

What worked for me was setting NO_CACHE=1 on the Redis/Postgres services and making changes on the mounted volume (creating and restoring the DB, or wiping it if you don't need the data, as with Redis).
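
The post above does not spell out how the database was recreated and restored, so the following is only one possible reading, sketched with the standard `pg_dump` and `pg_restore` command-line tools. The helper names, the connection URLs, and the `backup.dump` path are hypothetical. It also assumes the source database answers queries long enough to be dumped; if it does not, you would need an existing backup instead.

```python
import subprocess


def dump_command(source_url: str, dump_path: str) -> list[str]:
    # Custom format is compressed and can be restored with pg_restore.
    return [
        "pg_dump",
        "--format=custom",
        "--no-owner",
        f"--file={dump_path}",
        f"--dbname={source_url}",
    ]


def restore_command(target_url: str, dump_path: str) -> list[str]:
    # --clean with --if-exists drops existing objects before recreating them.
    return [
        "pg_restore",
        "--no-owner",
        "--clean",
        "--if-exists",
        f"--dbname={target_url}",
        dump_path,
    ]


def dump_and_restore(source_url: str, target_url: str,
                     dump_path: str = "backup.dump") -> None:
    # check=True raises CalledProcessError if either tool exits non-zero,
    # so a failed dump never leads to a restore from a partial file.
    subprocess.run(dump_command(source_url, dump_path), check=True)
    subprocess.run(restore_command(target_url, dump_path), check=True)
```

Keeping the command builders separate from `dump_and_restore` makes it easy to print and review the exact commands before running anything against production data.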


9 days ago

This should now be resolved. The root cause was an underlying issue on the host serving your service, which was fixed a few hours ago. Apologies for being late to follow up here. Can you confirm everything is back online on your end?

To stay resilient against hardware-level failures like this in the future, we'd recommend moving to a high-availability Postgres setup, which runs replicas with automatic failover so a single host going down doesn't take your database offline: https://docs.railway.com/databases/postgresql-ha. We'd also suggest enabling point-in-time recovery (https://docs.railway.com/volumes/point-in-time-recovery) and automated backups (https://docs.railway.com/volumes/backups) so you can restore quickly if anything goes wrong.


Status changed to Awaiting User Response Railway 9 days ago


Railway
BOT

2 days ago

This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!

Status changed to Solved Railway 1 day ago

