3 days ago
I am trying to verify a non-destructive Point-in-Time Recovery restore for a production Railway Postgres service. This is not an outage, but it is important for recovery readiness.
Environment: production
Source service: Postgres
Volume: production Postgres volume
Postgres image: ghcr.io/railwayapp-templates/postgres-ssl:17.7
Goal:
Create a separate restored Postgres service using PITR, without touching the live production Postgres service, without changing DATABASE_URL, and without restoring over the existing production volume.
We attempted the Railway GraphQL mutation volumeInstancePITRRestore twice:
- Target timestamp: 2026-06-20T11:36:42.649Z
  Proposed new service: issue-143-pitr-drill-20260620-1136
  Trace ID: 2263536005594059044
- Target timestamp: 2026-06-20T11:12:47.188Z
  Proposed new service: issue-143-pitr-drill-20260620-1112
  Trace ID: 5861890342496370075
Both attempts returned:
"Couldn't reach the source service's pgBackRest catalog. This is usually transient (network or storage hiccup) — try again in a moment. If it persists, check that the source service is healthy."
Afterward, no restore drill service was created, and production stayed healthy.
Can someone advise:
- whether this means PITR is not correctly enabled/usable for this Postgres service;
- why Railway cannot reach the source service's pgBackRest catalog;
- whether WAL/archive/catalog storage may need attention;
- what I should do to make a non-destructive PITR restore to a separate service work;
- whether there is a safe dashboard-supported way to run this restore drill without touching the live source service?
Please note: I do not want to restore over the existing production volume or change the live service connection.
1 Reply
Status changed to Awaiting Railway Response Railway • 3 days ago
17 hours ago
I dug into your Postgres service (the drill's source). volumeInstancePITRRestore returns "couldn't reach the source service's pgBackRest catalog" because the source isn't actually running pgBackRest or WAL archiving, so there is no stanza or catalog for a restore to read from. Its logs over the last day show only normal checkpoints, with no archive-push or other WAL-archiving activity at all.
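If you want to confirm this from the database side, here is a minimal sketch. It assumes you can open a SQL session on the source database and that a pgBackRest-based setup shows up as `archive_mode = on` with an `archive_command` that invokes `pgbackrest`. That is the usual pgBackRest arrangement, but it is not confirmed for Railway's PITR implementation. The two queries use standard PostgreSQL views (`pg_settings`, `pg_stat_archiver`); `diagnose_archiving` is a hypothetical helper that only interprets the values you get back.

```python
from typing import Mapping, Optional

# Standard PostgreSQL queries; run them with any SQL client against the source database.
ARCHIVE_SETTINGS_SQL = (
    "SELECT name, setting FROM pg_settings "
    "WHERE name IN ('archive_mode', 'archive_command');"
)
ARCHIVER_STATS_SQL = (
    "SELECT archived_count, failed_count, last_archived_wal FROM pg_stat_archiver;"
)


def diagnose_archiving(
    settings: Mapping[str, str],
    stats: Mapping[str, Optional[object]],
) -> str:
    """Interpret the results of the two queries above (hypothetical helper).

    settings: name -> setting rows from pg_settings.
    stats: the single row from pg_stat_archiver as a mapping.
    """
    mode = settings.get("archive_mode", "off")
    command = settings.get("archive_command", "")

    if mode not in ("on", "always"):
        return "archive_mode is off: WAL is not being archived"
    # Assumption: a pgBackRest setup calls the pgbackrest binary in archive_command.
    if "pgbackrest" not in command.lower():
        return "archive_mode is on, but archive_command does not call pgBackRest"
    if int(stats.get("archived_count") or 0) == 0:
        return "pgBackRest is configured, but no WAL segment has been archived yet"
    if int(stats.get("failed_count") or 0) > 0:
        return "WAL is being archived, but some archive attempts have failed"
    return "WAL archiving looks active"
```

A result of "archive_mode is off" would match what the logs show: checkpoints only, and nothing for a PITR restore to read.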
PITR has to be enabled on the service first: that's what provisions pgBackRest, creates the stanza, takes a base backup, and begins continuously archiving WAL. Until that's running and a base backup has completed, a PITR restore has no catalog to restore from, which is exactly the error you're hitting.
So the next step is to confirm that PITR / continuous backups is enabled on this Postgres service (Settings → Backups). If it's off, enable it and let the first base backup complete; the drill should work after that. If you believe it's already enabled, reply and I'll escalate to find out why the stanza and archiving aren't running on your service. Your live production service and data are untouched by any of this. — Angelo
Status changed to Awaiting User Response Railway • about 17 hours ago