URGENT!! Postgres crash-looping — volume resize not propagating after deployment outage
cooldev765
PRO · OP

2 months ago

My Postgres is crash-looping with "No space left on device" on pg_wal. The disk filled up during an index build.

I resized the volume from 10GB to 20GB, but the filesystem never picked up the new size. I then restored from backup, which created a new volume, but Postgres still mounts the old, full volume on every restart. WAL replay completes successfully, but Postgres then crashes while writing the checkpoint. The data is intact; this is purely a disk space issue.

Tried: live resize, redeploy, and multiple restores. Nothing worked.

Note: a Railway deployment outage (incident QVJT4QJO) occurred at the same time. Could this have caused the volume resize to not propagate?

FATAL: could not write to file "pg_wal/xlogtemp.30": No space left on device

@Railway ![image.png](https://station-server.railway.com/attachments/att\_01knhxtnj9e4ea3g1j91017598)

Note: there are currently 3 volumes visible: postgres-volume (the original) and two dated copies from previous restore attempts. The original postgres-volume is the one mounted to the service and contains all the data. Please expand the filesystem on that one only.

Closed

1 Reply

cooldev765
PRO · OP

2 months ago

NEED HELP: Logs confirm that WAL recovery completes successfully on every restart (redo done at 16/F6FD4BA8), but Postgres then immediately crashes with "No space left on device" when writing pg_wal/xlogtemp.30.

The volume was live-resized from 10GB to 20GB, but the ext4 filesystem was never expanded inside the container: the block device is 20GB, but the filesystem on it is still 10GB.
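For anyone who wants to confirm the same mismatch, here is a minimal Python sketch that compares the size of a mounted filesystem with the size of the block device behind it. It assumes a Linux container with `/proc/mounts` and `/sys/class/block` readable. The helper names and the 10% slack are my own choices, and the mount point you pass in (for example `/var/lib/postgresql/data`) is an assumption about where the volume is attached, not something confirmed by Railway.

```python
import os

SECTOR_BYTES = 512  # sysfs reports block device sizes in 512-byte sectors


def filesystem_bytes(mount_point):
    """Total size of the filesystem mounted at mount_point, in bytes."""
    st = os.statvfs(mount_point)
    return st.f_frsize * st.f_blocks


def device_for_mount(mount_point, mounts_file="/proc/mounts"):
    """Return the device mounted at mount_point, or None if not found."""
    device = None
    with open(mounts_file) as fh:
        for line in fh:
            fields = line.split()
            # Later entries shadow earlier ones, so keep the last match.
            if len(fields) >= 2 and fields[1] == mount_point:
                device = fields[0]
    return device


def device_bytes(device):
    """Size of a block device such as /dev/sdb, read from sysfs."""
    name = os.path.basename(os.path.realpath(device))
    with open(f"/sys/class/block/{name}/size") as fh:
        return int(fh.read()) * SECTOR_BYTES


def filesystem_lags_device(fs_bytes, dev_bytes, slack=0.10):
    """True if the filesystem is clearly smaller than its device.

    The slack allows for filesystem overhead, so a fully grown
    filesystem is not reported as lagging.
    """
    return fs_bytes < dev_bytes * (1 - slack)
```

With a 10GB filesystem on a 20GB device, `filesystem_lags_device(filesystem_bytes(mount), device_bytes(device_for_mount(mount)))` would be expected to return `True`, which matches the situation described above.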

Running resize2fs on the block device inside the Postgres container should grow the filesystem to the full 20GB and fix this immediately. The data is intact and uncorrupted; WAL replay completes successfully every time.
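As a rough sketch of that step, the Python wrapper below builds and optionally runs the `resize2fs` call. When `resize2fs` is given only a device, it grows the filesystem to fill that device. The function names and the `dry_run` default are hypothetical, the device path has to be supplied by whoever runs it, and the real run needs root plus the e2fsprogs tools inside the container, which I cannot confirm is available to a Railway service.

```python
import shutil
import subprocess


def build_resize_command(device):
    """Command that grows an ext4 filesystem to fill its device."""
    # With no explicit size argument, resize2fs uses the full device size.
    return ["resize2fs", device]


def grow_filesystem(device, dry_run=True):
    """Return the resize command; run it only when dry_run is False."""
    cmd = build_resize_command(device)
    if dry_run:
        return cmd
    if shutil.which("resize2fs") is None:
        raise RuntimeError("resize2fs not found; e2fsprogs may be missing")
    # check=True raises CalledProcessError if resize2fs exits non-zero.
    subprocess.run(cmd, check=True)
    return cmd
```

Calling `grow_filesystem(device)` with the default `dry_run=True` only returns the command, so it can be reviewed before anything touches the volume.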


Status changed to Closed brody about 2 months ago

