PostgreSQL Service Crash Loop - "No space left on device" with 99% Free Storage
jackar
HOBBYOP

3 months ago

I'm experiencing a critical issue with my PostgreSQL service that's causing it to crash repeatedly. Even though the volume shows only 100MB used out of 8GB allocated (99% free), PostgreSQL is failing with "No space left on device" errors.

Issue Description: PostgreSQL is in a continuous crash loop and cannot complete basic recovery operations. The checkpointer process repeatedly crashes with disk space errors, even though storage metrics show 99% free space.

Error Logs:

  2025-09-11 00:55:13.748 UTC [53184] PANIC: could not write to file "pg_logical/replorigin_checkpoint.tmp": No space left on device
  2025-09-11 00:55:13.749 UTC [3] LOG: checkpointer process (PID 53184) was terminated by signal 6: Aborted
  2025-09-11 00:55:13.749 UTC [3] LOG: terminating any other active server processes
  2025-09-11 00:55:13.750 UTC [3] LOG: all server processes terminated; reinitializing

This pattern repeats continuously, preventing the database from starting properly.
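To confirm the failure happens at the filesystem level rather than inside PostgreSQL, a small write test against the volume should reproduce the same ENOSPC error. This is only a sketch; it assumes shell access to the container and that the volume is mounted at the default data directory, /var/lib/postgresql/data (adjust the path if your mount differs):

  import errno
  import os

  # Assumed path: the attached volume holds PGDATA; change if your mount differs.
  DATA_DIR = "/var/lib/postgresql/data"
  probe = os.path.join(DATA_DIR, "enospc_probe.tmp")

  try:
      # Write ~1 MiB, roughly the scale of a checkpoint temp file.
      with open(probe, "wb") as f:
          f.write(b"\0" * (1024 * 1024))
          f.flush()
          os.fsync(f.fileno())
      print("write succeeded: the volume still accepts new data")
  except OSError as e:
      if e.errno == errno.ENOSPC:
          print("write failed with ENOSPC: the filesystem really is full")
      else:
          raise
  finally:
      if os.path.exists(probe):
          os.remove(probe)

If even this tiny write fails, the problem is at the volume level and not anything PostgreSQL is doing wrong.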

Impact:

- Production application is down

- Cannot create new tables or perform migrations

- Database recovery fails repeatedly

- Service is completely unusable

What I've Tried:

- Set RAILWAY_SHM_SIZE_BYTES = 536870912 and redeployed (the deployment hangs)

- Simplified migration scripts to reduce disk usage

- Verified actual storage usage is minimal (100MB/8GB); see the cross-check sketched after this list
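For anyone reproducing this, a rough way to cross-check the reported usage from inside the container is to sum what is actually allocated on disk under the data directory. Minimal sketch, again assuming the default mount at /var/lib/postgresql/data:

  import os

  DATA_DIR = "/var/lib/postgresql/data"  # assumed mount point, adjust as needed

  total = 0
  for root, _dirs, files in os.walk(DATA_DIR):
      for name in files:
          try:
              # st_blocks is in 512-byte units and counts space actually
              # allocated on disk, which is what the volume metric should track.
              total += os.lstat(os.path.join(root, name)).st_blocks * 512
          except FileNotFoundError:
              # Files can vanish mid-walk while the server is crash-looping.
              pass

  print(f"on-disk usage under {DATA_DIR}: {total / 1024**2:.1f} MiB")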

Suspected Causes:

This appears to be a Railway infrastructure issue, possibly:

- Inode exhaustion despite available disk space (see the check sketched after this list)

- Underlying storage volume actually full despite metrics

- PostgreSQL service misconfiguration

- Hardware/filesystem issues on Railway's end
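The first two possibilities can be told apart by asking the filesystem directly for both free blocks and free inodes, since inode exhaustion also surfaces as "No space left on device" even when plenty of bytes remain. Minimal sketch, same assumed mount path as above:

  import os

  MOUNT = "/var/lib/postgresql/data"  # assumed volume mount point

  st = os.statvfs(MOUNT)

  # Blocks: byte capacity and free space as the filesystem itself reports it.
  free_bytes = st.f_bavail * st.f_frsize
  total_bytes = st.f_blocks * st.f_frsize

  # Inodes: running out of these also raises ENOSPC.
  print(f"bytes : {free_bytes / 1024**2:.1f} MiB free of {total_bytes / 1024**2:.1f} MiB")
  print(f"inodes: {st.f_favail} free of {st.f_files}")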

Request:

Please investigate this PostgreSQL service immediately. This seems to be a platform-level issue where storage metrics don't reflect actual availability. The service needs to be restored or recreated to resolve the crash loop.

Can you please:

1. Check actual disk usage vs. reported metrics

2. Verify inode availability on the PostgreSQL volume

3. Restart/rebuild the PostgreSQL service if needed

4. Provide timeline for resolution

This is blocking our production application. Any assistance would be greatly appreciated.

Solved · $10 Bounty

2 Replies

Railway
BOT

3 months ago

Hey there! We've found the following might help you get unblocked faster:

If you find the answer from one of these, please let us know by solving the thread!


Status changed to Solved jackar 3 months ago


jackar
HOBBYOP

3 months ago

Resolved. Increased volume max to 5 GB.


Status changed to Awaiting Railway Response Railway 3 months ago


Status changed to Solved brody 3 months ago

