7 months ago
Hi!
I understand from other threads that you restrict consecutive restarts to avoid restart spinning. However, the semantics of this are unclear to me.
Is that 10 restarts that fail to come up, or 10 restarts within a time window? If so, what is the time window? Are any further restart attempts made, e.g. with a backoff strategy?
I'll add that my reading of the current documentation at https://docs.railway.com/guides/restart-policy is that for paid plans, "always" is not limited to 10 restarts.
4 Replies
7 months ago
"Always" means that we'll keep attempting restarts, with an exponential backoff between attempts.
Status changed to Awaiting User Response Railway • 7 months ago
7 months ago
Thanks. I'd love a little more detail on this. For example, how long does the app need to be up before the backoff resets? Do you always restart within the same replica, or do you (sometimes) attempt full replica replacements? Do you emit information anywhere about when restarts were attempted, or do you rely on us collecting that on the application side? The reason for the last question is that from our side it did not look like an exponential backoff, but rather a number of near-immediate restarts followed by a multi-hour stopped state (that ended when we restarted manually).
Status changed to Awaiting Railway Response Railway • 7 months ago
7 months ago
Absolutely.
With this policy, Railway attempts to restart your service every time it stops, regardless of the reason. There isn't a strict time window for restarts. Instead, an exponential backoff strategy is used, meaning the delay between restart attempts grows with each consecutive failure to avoid a rapid restart loop.
To reset the backoff, the service needs to remain up and running for a period of time without crashing. If a service fails multiple times in quick succession, it might appear as if it stops, but this is typically a result of the backoff strategy coming into play to minimize resource wastage and potential further issues.
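To make the general idea concrete, here is a minimal Python sketch of exponential backoff with a reset after a stable run. It is not Railway's implementation: the class name BackoffPolicy and every number in it (base delay, growth factor, cap, reset threshold) are assumptions chosen only for illustration, since the actual values aren't stated in this thread.

```python
from dataclasses import dataclass


@dataclass
class BackoffPolicy:
    # All values below are hypothetical, picked only to illustrate the idea.
    base_delay: float = 1.0     # seconds to wait before the first restart
    factor: float = 2.0         # growth multiplier per consecutive failure
    max_delay: float = 300.0    # upper bound on the wait
    reset_after: float = 60.0   # uptime (seconds) that counts as "stable"
    failures: int = 0           # consecutive failures seen so far

    def next_delay(self, uptime_seconds: float) -> float:
        """Return the wait before the next restart attempt.

        uptime_seconds is how long the previous run stayed up before stopping.
        """
        if uptime_seconds >= self.reset_after:
            # The last run was stable long enough, so start over.
            self.failures = 0
        delay = min(self.base_delay * self.factor ** self.failures, self.max_delay)
        self.failures += 1
        return delay


if __name__ == "__main__":
    policy = BackoffPolicy()
    # Four quick crashes in a row, then a crash after a two-minute stable run.
    for uptime in (0.5, 0.5, 0.5, 0.5, 120.0):
        print(f"uptime={uptime:>6}s -> wait {policy.next_delay(uptime)}s")
```

With numbers like these, the first few restarts are close together and later ones are minutes apart, which from the outside can look like a burst of restarts followed by a stopped service.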
Regarding replica replacements, restarts within a single replica are generally attempted first. However, if issues persist, a full replica replacement might be considered, especially if additional troubleshooting indicates it's necessary.
As for logging, information about restarts isn't automatically logged within Railway's platform, so it's advisable to implement logging within your application to track restart attempts and any associated errors.
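One way to do that on the application side is to record every process start and the gap since the previous one. The Python sketch below is only an illustration: the function name record_start and the state file path are hypothetical, and the counter only survives restarts if the file lives on storage that persists across them, which is an assumption about your setup.

```python
import json
import logging
import time
from pathlib import Path

logger = logging.getLogger("restart_tracker")


def record_start(state_file: Path) -> dict:
    """Record this process start in a small JSON file and log it.

    Returns the updated state: the total number of starts seen and the
    timestamp of this start.
    """
    try:
        state = json.loads(state_file.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        # First start, or the file is unreadable: begin a fresh count.
        state = {"starts": 0, "last_start": None}

    previous = state["last_start"]
    now = time.time()
    state = {"starts": state["starts"] + 1, "last_start": now}
    state_file.write_text(json.dumps(state))

    gap = None if previous is None else round(now - previous, 1)
    logger.warning(
        "process start #%d, seconds since previous start: %s",
        state["starts"],
        gap,
    )
    return state


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Hypothetical path; point it at storage that persists across restarts.
    record_start(Path("restart_state.json"))
```

Calling this once at startup gives you a timeline of restart attempts in your own logs, which makes it possible to check whether the gaps between starts actually grow.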
Status changed to Awaiting User Response Railway • 7 months ago
6 months ago
This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!
Status changed to Solved Railway • 6 months ago