a month ago
Hi Railway team,
We are hitting the EXACT same Live Resize / ext4 bug that angelo-demiryolu manually fixed 7 days ago in this already-solved ticket: https://station.railway.com/questions/postgre-sql-crash-loop-after-disk-full-82a101ad
The old ticket was from another user's Hobby-plan project. Our case is on a Pro-plan project and production has been down for hours, so please treat this as high priority.
Same symptom: volume Live Resize (beta) updated block-device metadata (now 30 GB) but the ext4 filesystem inside the container was never expanded. Metrics shows 30 GB total / 0 B used, yet Postgres crashes on every start with FATAL: could not write to file "pg_wal/xlogtemp.33": No space left on device - exactly as in the solved ticket.
We need the same manual fix you applied there: expand the ext4 filesystem, then redeploy the Postgres service.
Our IDs and logs
- Project: b5c96610-f9ef-4944-b302-0bcfc0b04c83 (diligent-blessing) - PRO plan
- Environment: 11aab17f-d32f-4672-85ca-9fc739fd55e9 (production)
- Postgres service: 5dc68903-4eec-44d7-a9d7-cdaef1385ae6
- Currently attached volume: 3a95d14f-86b5-4ec5-ae5c-049086b7a57a (restored from 2026-04-16 09:01 UTC backup, resized to 30 GB metadata)
- Bind-mount path: /var/lib/containers/railwayapp/bind-mounts/66638ab2-a424-40e6-b6a4-76ddef2508f4/vol_rbx38morrxkkhj1g
- Container mount target: /var/lib/postgresql/data
- Postgres image: ghcr.io/railwayapp-templates/postgres-ssl:18 (PostgreSQL 18.3)
Deploy log from the latest crash (2026-04-16 09:23:37 UTC)
2026-04-16 09:23:37.309 UTC [7] LOG: starting PostgreSQL 18.3
2026-04-16 09:23:37.322 UTC [33] LOG: database system was interrupted while in recovery at 2026-04-16 09:23:30 UTC
2026-04-16 09:23:37.791 UTC [33] LOG: database system was not properly shut down; automatic recovery in progress
2026-04-16 09:23:37.796 UTC [33] LOG: redo starts at 0/1132C738
2026-04-16 09:23:43.308 UTC [33] LOG: redo done at 0/18FFE5B0 elapsed: 5.51 s
2026-04-16 09:23:43.331 UTC [33] FATAL: could not write to file "pg_wal/xlogtemp.33": No space left on device
2026-04-16 09:23:43.337 UTC [7] LOG: startup process (PID 33) exited with exit code 1
2026-04-16 09:23:43.353 UTC [7] LOG: database system is shut down
Recovery successfully replays ~131 MB of WAL (0/1132C738 -> 0/18FFE5B0) and completes in 5.5 s. Postgres then fails when it tries to create the first new WAL segment, because the ext4 filesystem is still at its original, much smaller size.
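The ~131 MB figure can be checked from the two LSNs in the log. This is a minimal sketch, assuming the standard PostgreSQL LSN notation (two hexadecimal halves of a 64-bit byte position separated by a slash); `lsn_to_bytes` is a helper name made up for this illustration.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '0/1132C738' to an absolute byte offset."""
    hi, lo = lsn.split("/")
    # The high half counts 4 GiB units, the low half is the offset within them.
    return (int(hi, 16) << 32) | int(lo, 16)


redo_start = lsn_to_bytes("0/1132C738")  # "redo starts at" from the deploy log
redo_end = lsn_to_bytes("0/18FFE5B0")    # "redo done at" from the deploy log
replayed = redo_end - redo_start

print(f"{replayed} bytes = {replayed / 1e6:.1f} MB")
# prints: 130883192 bytes = 130.9 MB
```

The result is in decimal megabytes; in binary units the same span is about 124.8 MiB.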
Things we've already tried
1. Live Resize of the original volume 500 MB -> 20 GB -> 30 GB + Restart
2. Full Redeploy of Postgres service
3. Backup restore -> new volume attached (3a95d14f...)
4. Live Resize of the new volume 20 GB -> 30 GB + Restart
All failed with the identical ENOSPC on xlogtemp creation.
Request
Please run resize2fs (or equivalent) on the ext4 filesystem backing the bind-mount above so it actually fills the 30 GB block device, then redeploy our Postgres service - same procedure you used in the 82a101ad ticket. Data integrity is critical: this is production accounting data for our customers.
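For anyone hitting the same symptom, the mismatch can be confirmed from inside a container that has the volume mounted by asking the kernel for the filesystem's own size instead of relying on the dashboard figure. This is a minimal sketch, assuming Python is available there and that PostgreSQL's default 16 MB WAL segment size is in use; `fs_usage` is a hypothetical helper, not a Railway tool.

```python
import os

# PostgreSQL's default wal_segment_size; assumed unchanged for this database.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def fs_usage(path: str) -> dict:
    """Report the size the mounted filesystem itself claims, via statvfs(2)."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize   # filesystem size, not block-device size
    free = st.f_bavail * st.f_frsize    # space available to unprivileged users
    return {
        "total_bytes": total,
        "free_bytes": free,
        "room_for_wal_segment": free >= WAL_SEGMENT_BYTES,
    }


# Container mount target from this post; falls back to "/" when run elsewhere.
target = "/var/lib/postgresql/data"
path = target if os.path.isdir(target) else "/"
print(path, fs_usage(path))
```

If `total_bytes` reports roughly the pre-resize size while the dashboard shows 30 GB, the filesystem was not grown to match the block device.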
Thank you very much, standing by for your response.
1 Reply
Status changed to Awaiting Railway Response Railway • about 1 month ago
a month ago
Hey, the volume resize didn't fully apply to your Postgres volume's internal storage. We've expanded it manually and redeployed your Postgres service. You should now have the full 30 GB of disk space available.
Your database's WAL recovery had already completed successfully before the space issue, so your data should be intact once the service comes back up.
Status changed to Awaiting User Response Railway • about 1 month ago
a month ago
This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!
Status changed to Solved Railway • 29 days ago