Postgres Crashed on Volume Mount

marvin-bitterlich

HOBBYOP

2 months ago

My postgres crashed suddenly with:

```

01 s; distance=65 kB, estimate=80 kB; lsn=0/4A93A20, redo lsn=0/4A939C8

2026-04-26 21:34:36.100 UTC [28] LOG: checkpoint starting: time

2026-04-26 21:34:37.724 UTC [28] LOG: checkpoint complete: wrote 17 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=1.606 s, sync=0.007 s, total=1.625 s; sync files=15, longest=0.006 s, average=0.001 s; distance=68 kB, estimate=79 kB; lsn=0/4AA4D00, redo lsn=0/4AA4CA8

Mounting volume on: /var/lib/containers/railwayapp/bind-mounts/9c0814b2-1aef-46bc-921a-303ab6ea150e/vol_0dek5cz4jp10aenj

2026-04-26 22:33:48.719 UTC [54320] PANIC: could not fdatasync file "000000010000000000000004": No space left on device

2026-04-26 22:33:48.719 UTC [54320] STATEMENT: COMMIT

2026-04-26 22:33:48.720 UTC [7] LOG: server process (PID 54320) was terminated by signal 6: Aborted

2026-04-26 22:33:48.720 UTC [7] DETAIL: Failed process was running: COMMIT

2026-04-26 22:33:48.720 UTC [7] LOG: terminating any other active server processes

2026-04-26 22:33:48.721 UTC [7] LOG: all server processes terminated; reinitializing

2026-04-26 22:33:48.756 UTC [7] LOG: could not write to file "postmaster.pid": No space left on device

2026-04-26 22:33:48.767 UTC [54321] LOG: database system was interrupted; last known up at 2026-04-26 21:34:37 UTC

2026-04-26 22:33:48.819 UTC [54321] LOG: could not fsync file "./pg_serial": Input/output error

2026-04-26 22:33:49.128 UTC [54321] LOG: database system was not properly shut down; automatic recovery in progress

2026-04-26 22:33:49.128 UTC [54321] LOG: could not remove cache file "global/pg_internal.init": Read-only file system

2026-04-26 22:33:49.128 UTC [54321] LOG: could not remove cache file "base/16384/pg_internal.init": Read-only file system

2026-04-26 22:33:49.129 UTC [54321] PANIC: could not open file "/var/lib/postgresql/data/pgdata/global/pg_control": Read-only file system

2026-04-26 22:33:49.130 UTC [7] LOG: startup process (PID 54321) was terminated by signal 6: Aborted

2026-04-26 22:33:49.130 UTC [7] LOG: aborting startup due to startup process failure

2026-04-26 22:33:49.132 UTC [7] LOG: database system is shut down

Mounting volume on: /var/lib/containers/railwayapp/bind-mounts/9c0814b2-1aef-46bc-921a-303ab6ea150e/vol_0dek5cz4jp10aenj

```

And the mounted volume is only at 224 MB of 5 GB, so it's not as if resizing would help. It seems like some internal volume ran out of disk space for some reason? A redeploy also hangs for 15 minutes before crashing.
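
For anyone comparing numbers in a similar case: "No space left on device" can be reported when a filesystem runs out of inodes as well as bytes, so a volume that looks mostly empty by size can still refuse writes. The following is a minimal sketch, not a Railway tool, assuming you have shell access to a running container on a POSIX system. The function names `disk_report` and `is_nearly_full` and the 5% threshold are hypothetical; the default path is the data directory that appears in the log above.

```python
import os
import shutil
import sys


def disk_report(path: str) -> dict:
    """Return byte and inode usage for the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    vfs = os.statvfs(path)           # POSIX only; exposes inode counters
    return {
        "path": path,
        "bytes_total": usage.total,
        "bytes_free": usage.free,
        "inodes_total": vfs.f_files,
        "inodes_free": vfs.f_ffree,
    }


def is_nearly_full(report: dict, min_free_fraction: float = 0.05) -> bool:
    """True if free bytes or free inodes fall below the (assumed) threshold."""
    bytes_low = report["bytes_free"] < min_free_fraction * report["bytes_total"]
    # Some filesystems report 0 total inodes; skip the inode check for those.
    inodes_low = (
        report["inodes_total"] > 0
        and report["inodes_free"] < min_free_fraction * report["inodes_total"]
    )
    return bytes_low or inodes_low


if __name__ == "__main__":
    # Path taken from the log above; pass other paths as arguments.
    for path in sys.argv[1:] or ["/var/lib/postgresql/data/pgdata"]:
        if os.path.exists(path):
            report = disk_report(path)
            print(report, "NEARLY FULL" if is_nearly_full(report) else "ok")
```

This only reports what the container itself can see. If the shortage is on the host side of the bind mount, the numbers here may still look healthy.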

And of course I can't just download the volume either, so I'm a bit worried that my Postgres is just gone for no reason 😕 Any help appreciated; this feels like a bug/incident on Railway's side.

Solved

5 Replies

ray-chen

EMPLOYEE

2 months ago

I've triggered a deployment of your database and it seems fine and online now?

marvin-bitterlich

HOBBYOP

2 months ago

Nice, that seems to have worked. Is there any way I could have self-served this? Whenever I tried, it just didn't work.

brody

EMPLOYEE

2 months ago

Yep, you could have triggered a deployment too!

marvin-bitterlich

HOBBYOP

2 months ago

That is very weird, because I did trigger multiple deployments and none of them worked. I presume the volume needed some kind of repair task, which maybe runs once a day or something?

When I searched around, I found that other people fixed the same issue by triggering a volume resize, but that is not possible on the Hobby plan, so I'd have to upgrade my plan for the privilege of conceptually running chkdsk 😄

Very glad it's resolved, but it makes me worried about what I can do if it happens again. It also made me realise that I am really missing a "Pause/Shutdown this service" button: I had to completely remove all my cron job settings because things just kept coming back online and spamming me with emails about crashed deployments, which makes restoring service once the DB issues are fixed much more of a pain.
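
One thing that can be done independently of the volume is to keep logical dumps somewhere else, so that a stuck volume does not mean lost data. Below is a minimal sketch, assuming `pg_dump` is installed on the machine that runs it and that a connection string is available in a `DATABASE_URL` environment variable (both are assumptions, as are the `build_pg_dump_command` helper and the `backup-<timestamp>.dump` naming).

```python
import os
import subprocess
from datetime import datetime, timezone
from typing import List, Optional


def build_pg_dump_command(
    database_url: str, out_dir: str, now: Optional[datetime] = None
) -> List[str]:
    """Build a pg_dump invocation that writes a custom-format archive."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    out_file = os.path.join(out_dir, f"backup-{stamp}.dump")
    return [
        "pg_dump",
        "--format=custom",          # archive format readable by pg_restore
        f"--file={out_file}",
        f"--dbname={database_url}",  # pg_dump accepts a connection URI here
    ]


if __name__ == "__main__":
    # DATABASE_URL is an assumed variable name; nothing runs if it is unset.
    url = os.environ.get("DATABASE_URL")
    if url:
        subprocess.run(build_pg_dump_command(url, "."), check=True)
```

Run from a scheduled job outside the database service, this writes the dump to storage that does not depend on the volume in question.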

brody

EMPLOYEE

2 months ago

Nope, I didn't do anything special at all. Nothing you couldn't have done yourself. But glad you're in a good spot now.

Status changed to Solved brody • 2 months ago