2 months ago
Hi, my production Postgres is stuck in a crash loop and I can't get it back up.
The volume filled up around 22:27 UTC tonight, hit 97% by 22:29, and Postgres crashed. I resized the volume right away; railway volume list now shows 324 MB / 5000 MB, so there's clearly enough space. But Postgres still fails with the same error every time: FATAL: could not write to file "pg_wal/xlogtemp.30": No space left on device
The WAL recovery actually completes fine (redo finishes), it just can't write the checkpoint file after. So I think the block device got resized but the filesystem inside wasn't expanded.
I've tried redeploying 3-4 times, also detached and reattached the volume via CLI. Nothing changes.
Project: devoted-tranquility
Service: Postgres
Volume: postgres-volume (vol_pwwustzkuh8acw2s)
Environment: production
This is my production db so kind of urgent. Is this the correct platform to get help? Also, is there any way to expand the filesystem, or to get shell access so I can run resize2fs myself?
Thanks in advance
4 Replies
2 months ago
Same issue here. Project: blissful-elegance (c8e1eaa7-fce8-4ebc-bf5c-47193effa900), volume resized to 10GB but the filesystem still reports "No space left on device". Postgres has been crash-looping since ~22:39 UTC. WAL redo completes, then the checkpoint write fails. The filesystem needs a resize2fs.
2 months ago
Replace the start command (remember to save the original) with sleep 100000, use railway ssh to access the container, then run whatever commands you need.
Thanks for the responses. I SSH'd in with sleep 100000 and confirmed the filesystem is still 434MB at 99% full. /proc/partitions shows the block device is about 480MB so the resize never hit the actual zvol. I installed e2fsprogs to try resize2fs myself but the device node doesn't exist in /dev and mknod is not permitted in the container.
Could someone on the Railway side either resize the zvol or run resize2fs on the host? I've got a restored backup sitting on this volume and I need the space to start Postgres and dump my data. Production site is still down.
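For anyone else who hits this, here is a rough sketch of the check described above: compare the filesystem size against the block device size in /proc/partitions to see where the resize stopped. This is only an illustration, not an official Railway tool. The function names, the 15% slack for filesystem overhead, and the device name zd0 in the usage note are all assumptions.

```python
import os


def parse_partitions(text):
    """Return {device_name: size_in_bytes} from /proc/partitions content."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: major minor #blocks name (blocks are 1 KiB).
        # The header row and blank lines are skipped by the isdigit() check.
        if len(fields) == 4 and fields[2].isdigit():
            sizes[fields[3]] = int(fields[2]) * 1024
    return sizes


def filesystem_size(path):
    """Total size in bytes of the filesystem mounted at path."""
    st = os.statvfs(path)
    return st.f_frsize * st.f_blocks


def diagnose(fs_bytes, device_bytes, expected_bytes, slack=0.15):
    """Classify where a volume resize stopped.

    slack is an assumed tolerance for filesystem metadata overhead.
    """
    if device_bytes < expected_bytes * (1 - slack):
        # The block device itself never grew; resize2fs cannot help yet.
        return "device-not-resized"
    if fs_bytes < device_bytes * (1 - slack):
        # The device grew but the filesystem was not expanded to fill it.
        return "filesystem-not-grown"
    return "ok"
```

Inside the container you would pass it the text of /proc/partitions, the result of filesystem_size() for the volume mount path, and the size you resized to. With the numbers from this thread (434 MB filesystem, roughly 480 MB device, 5000 MB expected) it reports device-not-resized, which matches what I found: resize2fs alone would not be enough, because the device underneath has to grow first.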