Apps are crashing and not being restarted?
Anonymous
PRO OP

2 months ago

After migrating from Kubernetes to Railway, I've noticed that a few outages have caused some of our services to crash, and they are not being properly restarted afterwards.

On Kubernetes, a "crashed" app would retry indefinitely. The cause of this crash was a connection error to RabbitMQ which was resolved in 5 minutes, but after 10 app restarts Railway stops spinning up the service. I think this is not good: a crash caused by an oopsie in your infrastructure led to services being taken down.

I was shopping the other day when one of our customers was trading OTC and the services weren't working. Luckily I managed to fix it by using my phone to restart the service, and the trade went through, but why aren't services guaranteed to be up after they crash? Running this system on Kubernetes for years, I never had this issue.

Solved

4 Replies

sarahkb125
EMPLOYEE

2 months ago

Hi there,

Railway has a configurable Restart Policy that controls what happens when a deployed service stops or crashes. You can find this in your service's Settings tab under the deployment settings, where you'll be able to adjust the behavior to fit your needs.
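
If you would rather not click through the Settings tab for each service, the same policy can usually be kept in a config file next to the code. The sketch below is only an illustration: it assumes Railway's config-as-code file is `railway.json`, and that `deploy.restartPolicyType` accepts `ALWAYS` on your plan, so check both against the current docs. The helper `write_restart_config` and the directory names in the usage note are hypothetical.

```python
import json
from pathlib import Path

# Assumed key and value from Railway's config-as-code schema; verify in the docs.
RESTART_POLICY = {"restartPolicyType": "ALWAYS"}


def write_restart_config(service_dirs):
    """Merge the restart policy into railway.json in each service directory.

    Existing keys in a service's railway.json are preserved; only the
    restart policy under "deploy" is added or overwritten.
    """
    written = []
    for service_dir in service_dirs:
        path = Path(service_dir) / "railway.json"
        config = json.loads(path.read_text()) if path.exists() else {}
        config.setdefault("deploy", {}).update(RESTART_POLICY)
        path.write_text(json.dumps(config, indent=2) + "\n")
        written.append(path)
    return written
```

Calling `write_restart_config(["services/api", "services/worker"])` from the repo root would update both files in one pass, and the change would apply on the next deploy of each service.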

For production workloads where you need robust handling of transient failures, pair application-level restarts with Healthchecks in Railway. If a health check fails, Railway will continuously attempt to restart unhealthy services rather than giving up.
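
As a rough illustration of the application-level side, a service can ride out a short broker outage by retrying the connection with exponential backoff instead of exiting. This is a minimal sketch, not Railway-specific: `connect_with_retry` is a hypothetical helper, `connect` stands in for whatever your RabbitMQ client uses to open a connection, and `retry_on` should be set to the exception types that client actually raises.

```python
import random
import time


def connect_with_retry(connect, retry_on=(ConnectionError,),
                       base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call connect() until it succeeds, backing off between attempts.

    The delay doubles after each failure, is capped at max_delay, and is
    jittered so that many replicas do not reconnect at the same instant.
    """
    attempt = 0
    while True:
        try:
            return connect()
        except retry_on as exc:
            # Cap the exponent so the delay calculation never overflows.
            delay = min(max_delay, base_delay * (2 ** min(attempt, 16)))
            delay *= random.uniform(0.5, 1.0)  # jitter
            print(f"connect failed ({exc!r}); retrying in {delay:.1f}s")
            sleep(delay)
            attempt += 1
```

With the defaults, a 5-minute outage would be absorbed inside the process after a handful of attempts, so the platform's restart counter never comes into play for this kind of failure.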

If you are still having issues or need further assistance, could you please link the Railway service you are mentioning?

Best, The Railway Team


Status changed to Awaiting User Response Railway about 2 months ago


Anonymous
PRO OP

2 months ago

How do I set it to infinity?


Status changed to Awaiting Railway Response Railway about 2 months ago


Anonymous
PRO OP

2 months ago

Is there a faster way to set this on a bunch of services than going in one by one?


sarahkb125
EMPLOYEE

2 months ago

I would recommend setting up healthchecks for each service. This would allow them to be restarted infinitely.

Here are some docs on healthchecks: https://docs.railway.com/reference/healthchecks
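
For a service that does not already expose HTTP, a health endpoint can be very small. The following is a minimal sketch using only the Python standard library. The `/health` path is an assumption and should match whatever healthcheck path is configured for the service. The sketch also assumes the platform supplies the listening port in the `PORT` environment variable, and `make_server` is a hypothetical helper name.

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            # A plain 200 response tells the checker the process is up.
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep probe requests out of the application logs


def make_server(port=None):
    """Build the server; fall back to $PORT, then 8080, when no port is given."""
    if port is None:
        port = int(os.environ.get("PORT", "8080"))
    return ThreadingHTTPServer(("0.0.0.0", port), HealthHandler)


# To run as a standalone process: make_server().serve_forever()
```

In a worker that mainly consumes from a queue, this server could run in a background thread, with the handler extended to return a non-200 status while the broker connection is down.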

Best,

The Railway Team


Status changed to Awaiting User Response Railway about 2 months ago


Railway
BOT

a month ago

This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!

Status changed to Solved Railway about 1 month ago
