Production PostGIS crashed, volume read-only / no space left, need data-safe recovery
emmepi27
HOBBY OP

23 days ago

Hi Railway Support,

Our production PostGIS service in project rsfly, environment production, crashed on April 29, 2026 around 09:03 UTC.

Service: PostGIS

Volume shown in logs: vol_y4k38raujsj0ve8l

Mount path in logs: /var/lib/containers/railwayapp/bind-mounts/7fa0e4b4-7c46-40ae-bfd0-6d6e21561cc8/vol_y4k38raujsj0ve8l

Important: please do not reset, delete, or recreate the volume. We need a data-safe recovery path because this database contains user data.

Relevant logs:

PANIC: could not fdatasync file "00000001000000000000007A": No space left on device
LOG: could not write to file "postmaster.pid": No space left on device
LOG: could not fsync file "./pg_snapshots": Input/output error
LOG: could not remove cache file "global/pg_internal.init": Read-only file system
PANIC: could not open file "/var/lib/postgresql/data/pgdata/global/pg_control": Read-only file system
LOG: database system is shut down

The query shown at the time was only a small Django session delete:

DELETE FROM "django_session" WHERE "django_session"."session_key" IN (...)

So this looks like a storage-level problem on the Postgres volume (WAL, temp files, checkpoints, quota, or the mount itself), not an application-level data deletion issue.
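To make the ordering of failures in the log excerpt easier to see, here is a minimal sketch that tags each line with the matching Linux errno name. The function names `classify` and `first_failure` are illustrative only, and the matching is plain substring search, not a PostgreSQL log parser.

```python
from typing import Iterable, Optional

# Standard Linux error strings and the errno names they correspond to.
PATTERNS = [
    ("No space left on device", "ENOSPC"),
    ("Input/output error", "EIO"),
    ("Read-only file system", "EROFS"),
]

# The log lines quoted in this post.
LOG_LINES = [
    'PANIC: could not fdatasync file "00000001000000000000007A": No space left on device',
    'LOG: could not write to file "postmaster.pid": No space left on device',
    'LOG: could not fsync file "./pg_snapshots": Input/output error',
    'LOG: could not remove cache file "global/pg_internal.init": Read-only file system',
    'PANIC: could not open file "/var/lib/postgresql/data/pgdata/global/pg_control": Read-only file system',
    "LOG: database system is shut down",
]


def classify(line: str) -> Optional[str]:
    """Return the errno name for a log line, or None if no known error text is present."""
    for text, label in PATTERNS:
        if text in line:
            return label
    return None


def first_failure(lines: Iterable[str]) -> Optional[str]:
    """Return the errno name of the earliest classified line, or None."""
    for line in lines:
        label = classify(line)
        if label is not None:
            return label
    return None


if __name__ == "__main__":
    for line in LOG_LINES:
        print(classify(line), "|", line)
```

In this excerpt the earliest classified line is the ENOSPC one, followed by EIO and then EROFS, which fits a full volume being the first thing to go wrong rather than the `DELETE` itself.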

We have already scaled our Celery worker to zero and are not running heavy jobs or write batches against production. The web app is mostly idle, but deploys are failing because the pre-deploy migrate step cannot connect to PostGIS:

connection to server at "postgis.railway.internal", port 5432 failed: Connection timed out
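As a rough way to tell a timeout (nothing answering at all) from a refused connection (host reachable, nothing listening), a TCP probe like the sketch below can be run from another service in the same environment. The helper name `probe_tcp` is hypothetical; the host and port are the ones from the error message above.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and report 'open', 'timeout', 'refused', or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No response within the timeout, e.g. the service is not running or not routable.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # DNS failures and other socket errors end up here.
        return f"error: {exc}"


if __name__ == "__main__":
    print(probe_tcp("postgis.railway.internal", 5432))
```

This only checks that the port accepts connections; it does not confirm that PostgreSQL is healthy or accepting queries.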

Could you please help us with a data-safe recovery?

Requested help:

  1. Check whether the PostGIS volume was mounted read-only due to storage/quota/I/O failure.
  2. Recover/remount the existing volume if possible.
  3. Confirm whether the database can be started long enough for us to take a backup/dump.
  4. If the volume cannot be recovered safely, tell us the latest restorable backup/snapshot and the safest restore path.
  5. Please avoid destructive actions unless explicitly confirmed by us.
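For item 3, if the database can be brought up even briefly, a dump could be taken along the lines of the sketch below. This is our assumption of what that step might look like, not a Railway procedure: the `DATABASE_URL` environment variable and the output filename are placeholders, and the output file should be written somewhere other than the full volume.

```python
import os
import subprocess
from typing import List


def build_pg_dump_command(database_url: str, output_path: str) -> List[str]:
    """Build a pg_dump invocation that writes a custom-format archive."""
    return [
        "pg_dump",
        "--format=custom",        # compressed archive, restorable with pg_restore
        "--no-owner",             # do not emit ownership commands in the dump
        f"--file={output_path}",  # must NOT be on the volume that is full
        f"--dbname={database_url}",
    ]


def run_dump(output_path: str = "backup.dump") -> None:
    """Run pg_dump against the URL in DATABASE_URL (assumed to be set)."""
    database_url = os.environ["DATABASE_URL"]  # assumption: connection string is provided here
    subprocess.run(build_pg_dump_command(database_url, output_path), check=True)


if __name__ == "__main__":
    run_dump()
```

The custom format is chosen here because it can be restored selectively with `pg_restore`; a plain SQL dump would also work.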

Thank you.

Solved

1 Reply

Status changed to Awaiting Railway Response Railway 23 days ago


sam-a
EMPLOYEE

23 days ago

Your volume reached 100% capacity, which caused PostgreSQL to panic and the filesystem to go read-only. Your data has not been deleted or modified by us.

The recovery path is to use live resize on the volume, from the volume settings in your project canvas. Because the volume is at 100%, the system will automatically perform an offline resize instead, with data integrity checks, and then restart the service, which should allow PostgreSQL to recover. On the Hobby plan, volumes can be resized up to 5 GB. If your volume is already at 5 GB and you need more space, you would need to upgrade to Pro (up to 50 GB).
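Once the volume has been resized, a small check like the following sketch could help catch the volume filling up again before it reaches 100%. The 80% threshold and the function names are arbitrary choices for illustration; the path is the data directory seen in the logs above, as it appears inside the container.

```python
import shutil
from typing import Tuple


def percent_used(total_bytes: int, used_bytes: int) -> float:
    """Return used space as a percentage of total space."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def volume_status(path: str, warn_at: float = 80.0) -> Tuple[float, bool]:
    """Return (percent used, whether it is at or above the warning threshold) for the filesystem at path."""
    usage = shutil.disk_usage(path)
    pct = percent_used(usage.total, usage.used)
    return pct, pct >= warn_at


if __name__ == "__main__":
    # Assumption: run inside the database container, where the volume is mounted here.
    pct, warn = volume_status("/var/lib/postgresql/data")
    print(f"{pct:.1f}% used", "(WARNING)" if warn else "")
```

For example, 4 GiB used on a 5 GiB volume comes out at 80%, which would already trigger the warning at the default threshold.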

If the service is still crash-looping after the resize and a redeploy, let us know and we can investigate further. If you have any backups configured on the volume, those are also available as a restore path from the service's Backups tab.


Status changed to Awaiting User Response Railway 23 days ago


Railway
BOT

16 days ago

This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!

Status changed to Solved Railway 16 days ago

