21 days ago
2026-05-20 17:30:36.015 UTC [2] FATAL: private key file "/var/lib/postgresql/data/certs/server.key" must be owned by the database user or root
2026-05-20 17:30:36.015 UTC [2] LOG: database system is shut down
This happened after yesterday's outage, and I don't know how to fix it. Postgres retries loading the private key several times and fails each time.
I tried using the agent to fix the issue, and it is recommending that I wipe the whole volume, data included??
Please help, I do not want to lose the data in here.
7 Replies
Status changed to Awaiting Railway Response Railway • 21 days ago
21 days ago
We don't provide managed PostgreSQL, so the database configuration, including certificate file ownership, is on the application side. You should be able to fix this by temporarily setting your service's start command to `sleep infinity` (make sure to note your current start command first so you can restore it after). This will keep the container alive so you can connect via [railway ssh](https://docs.railway.com/cli/ssh) and run `chown postgres:postgres /var/lib/postgresql/data/certs/server.key` to correct the file ownership. Once that's done, restore your original start command and redeploy. Your data is intact on the volume, no need to wipe it. We're going to connect you with the community for further help with this.
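Before and after running the `chown` in the SSH session, it can help to confirm what the file's owner and mode actually are. A minimal sketch, assuming GNU `stat` (the `-c` flag) is available in the container; `show_owner_mode` is a hypothetical helper name, not something Railway or PostgreSQL provides:

```shell
# Hypothetical helper: print "owner:group mode path" for one file.
# Assumes GNU coreutils `stat`; BSD/macOS `stat` uses different flags.
show_owner_mode() {
    stat -c '%U:%G %a %n' "$1"
}

# Inside the `railway ssh` session, the key path from the error log would be passed in:
#   show_owner_mode /var/lib/postgresql/data/certs/server.key
```

Running it once before the fix and once after shows whether the ownership change actually took effect.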
Status changed to Awaiting User Response Railway • 21 days ago
Status changed to Open mykal • 21 days ago
21 days ago
Alternatively, instead of SSHing into your service, you can set the start command of your Postgres service to `bash -c "chown postgres:postgres /var/lib/postgresql/data/certs/server.key && chmod 600 /var/lib/postgresql/data/certs/server.key"` and redeploy. Once the container has started and run that command, remove it from the start command and redeploy again.
21 days ago
This error specifically points to a filesystem ownership/permissions problem on the SSL private key rather than direct database corruption.
PostgreSQL intentionally refuses to start if:
- the private key owner is incorrect
- or the permissions are considered insecure
So the current logs suggest the startup is aborting during PostgreSQL security checks, not because the database files themselves are necessarily damaged.
At this stage, there’s no indication that wiping the volume is required.
The Railway team recommendation to temporarily regain shell access and fix ownership/permissions on:
/var/lib/postgresql/data/certs/server.key
is technically the correct next recovery step.
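The two conditions above can be checked by hand before restarting the database. The sketch below mirrors the rule PostgreSQL documents for the key file: a key owned by the database user must have no group or world access (0600 or stricter), while a root-owned key may additionally be group-readable (0640 or stricter). It assumes GNU `stat`, compares owner names rather than the server's effective UID, and `key_file_ok` is a hypothetical name:

```shell
# Hypothetical pre-flight check approximating PostgreSQL's SSL key file rules.
# Usage: key_file_ok PATH [DB_USER]   (DB_USER defaults to "postgres")
# Returns 0 if the file would pass, 1 if not, 2 if it cannot be inspected.
key_file_ok() {
    kf_path=$1
    kf_dbuser=${2:-postgres}
    kf_owner=$(stat -c '%U' "$kf_path") || return 2
    kf_mode=$(stat -c '%a' "$kf_path") || return 2

    if [ "$kf_owner" = "$kf_dbuser" ]; then
        kf_mask=077   # any group or other permission bit is rejected
    elif [ "$kf_owner" = "root" ]; then
        kf_mask=027   # group write/execute and any other bit are rejected
    else
        echo "FAIL: owned by $kf_owner, expected $kf_dbuser or root"
        return 1
    fi

    # A leading 0 makes the shell read both values as octal.
    if [ $(( 0$kf_mode & 0$kf_mask )) -ne 0 ]; then
        echo "FAIL: mode $kf_mode is too permissive for owner $kf_owner"
        return 1
    fi
    echo "OK: $kf_owner $kf_mode"
}
```

Called as `key_file_ok /var/lib/postgresql/data/certs/server.key`, a `FAIL` line on the owner would correspond to the FATAL message in the original post.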
21 days ago
Please do not wipe the volume yet. This error does not indicate that the Postgres data itself is corrupted. It means Postgres is refusing to start because the SSL private key file has the wrong owner/permissions:
/var/lib/postgresql/data/certs/server.key
Postgres requires server.key to be owned by the database user, or by root, with restrictive permissions.
The fix should be to repair ownership/permissions on the existing volume, not delete it.
I would try this recovery path:
- First, create a manual backup/snapshot of the Railway volume if the Backups tab is available.
- Do not wipe or recreate the volume.
- If you can run the container as root, set this env var temporarily:
RAILWAY_RUN_UID=0
Then redeploy and see if the Postgres image/entrypoint repairs the ownership.
- If it still does not start, the volume needs a one-time permission repair on the mounted filesystem:
chown postgres:postgres /var/lib/postgresql/data/certs/server.key
chmod 600 /var/lib/postgresql/data/certs/server.key
or, if Railway's image expects root-owned certs (in that case the postgres user must belong to the file's group to be able to read the key):
chown root:root /var/lib/postgresql/data/certs/server.key
chmod 640 /var/lib/postgresql/data/certs/server.key
- After that, restart the Postgres service.
If Railway does not provide a way for you to shell into the failed database container, you will need Railway support/staff to run the permission repair on the mounted volume or provide a safe recovery shell. The important part is that the data volume should be preserved. This should be recoverable without deleting the database data.
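The one-time repair from the list above could be wrapped so that the previous state is printed before anything changes. This is only a sketch: `repair_key` is a hypothetical helper, the defaults are taken from the commands quoted in this thread, it assumes GNU `stat`, and it needs to run as root inside the container to change ownership to another user:

```shell
# Hypothetical one-time repair: show the current owner/mode, then set both.
# Usage: repair_key PATH [OWNER:GROUP] [MODE]
repair_key() {
    rk_path=$1
    rk_owner=${2:-postgres:postgres}
    rk_mode=${3:-600}

    # Refuse to continue if the path is not a regular file.
    [ -f "$rk_path" ] || { echo "no such file: $rk_path" >&2; return 1; }

    echo "before: $(stat -c '%U:%G %a' "$rk_path")"
    chown "$rk_owner" "$rk_path" || return 1
    chmod "$rk_mode" "$rk_path" || return 1
    echo "after:  $(stat -c '%U:%G %a' "$rk_path")"
}
```

For the database-user variant this would be `repair_key /var/lib/postgresql/data/certs/server.key`; the root-owned variant would pass `root:root 640` as the extra arguments.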
21 days ago
THANKS!!!
21 days ago
Thanks for all the replies helping me with my issue. I connected through SSH, ran the command, and managed to restart the DB.
I still have some questions: was this related to the outage and the problems with Google Cloud? Was this my fault? How could I prevent this from happening again?
Again, thanks!
21 days ago
Glad you got it fixed! To answer your questions: this may have been related to the May 19 GCP outage, which caused abrupt shutdowns that could corrupt file permissions on volumes. It wasn't anything you did.
To prevent it in the future, enabling volume backups would give you a restore point, and Point-in-Time Recovery would give you continuous protection.
Status changed to Awaiting User Response Railway • 21 days ago
14 days ago
This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!
Status changed to Solved Railway • 14 days ago
