Replica crashes and is not rebooted

a month ago

Project ID: 2a582b81-1f25-4953-91bc-bcae9e4e8496
Environment ID: 4a42db16-ceec-423d-a5e2-733bec28af15
Service ID: 69a0f188-8e1d-4927-b583-d895f942fec1
Replica ID: 1899dd0d-73bf-48da-8681-1ed268d255a0

Several days ago, the replica crashed and has not restarted automatically, despite having Restart Policy: Always. This has happened to this service both when I had only 1 replica and when I had 3 replicas. With only 1 replica, the entire service was considered crashed and served 502s until I could manually restart it. With 3 replicas, traffic is successfully rerouted to the other 2, but I worry the other replicas could crash in the same way.

Attempting to SSH into the crashed replica returns "Your application is not running or in a unexpected state". Since it is an Elixir service, I thought the application might still be running with only the webserver portion crashed, but this error makes me think that the entire container has exited.

If Railway knows the container has crashed, why is it not restarting it?


a month ago

Could have sworn I had replied, so my apologies.

We have other reports of this and are actively looking into it.

