FastAPI service ran for 3 weeks then shut off without being replaced
cranium
HOBBYOP

a month ago

Hi!

I have a containerized FastAPI service that has been running, with minor updates, for more than a year now. Most recently it ran uninterrupted for ~3 weeks.

Today, the service shut down (gracefully) and sat in the "COMPLETED" state. It never came back up, despite having Restart Policy: On Failure.

I can switch the restart policy to Always, but would that even fix it? Why did my container get a random SIGTERM/SIGINT after 3 weeks? Shouldn't Railway's orchestration restart it if it needs to evict it from the host for some reason (updates, outage, etc.)?
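
For anyone wondering why a graceful shutdown might end up looking like a normal completion: below is a minimal sketch, not the actual service. It assumes a process that traps SIGTERM and exits cleanly; `CHILD_SOURCE` and `exit_code_after_sigterm` are hypothetical names made up for illustration, and it is written for a POSIX host. The point is only that a process which handles SIGTERM this way reports exit status 0, which a supervisor can read as success rather than failure.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a service that shuts down gracefully:
# it traps SIGTERM and exits with status 0 instead of dying from the signal.
CHILD_SOURCE = """
import signal, sys, time
signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(0))
print("ready", flush=True)
while True:
    time.sleep(0.1)
"""


def exit_code_after_sigterm() -> int:
    """Start the child, send it SIGTERM, and return its exit status."""
    with subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE,
        text=True,
    ) as child:
        # Wait for the "ready" line so the signal handler is installed first.
        child.stdout.readline()
        child.send_signal(signal.SIGTERM)
        return child.wait(timeout=10)


if __name__ == "__main__":
    print("exit code after SIGTERM:", exit_code_after_sigterm())
```

Whether a given server actually exits with 0 on SIGTERM depends on the server and its version, so it is worth checking the exit code in the deployment logs.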

$10 Bounty

6 Replies

a month ago

I'd guess that this is related to an incident Railway had earlier today.
Incident Report: February 11, 2026

Did the service come back up after you restarted it?


cranium
HOBBYOP

a month ago

Yes, it does seem to line up with the outage. Looking at the timeline, it looks like it should have been restarted automatically though?


cranium

Yes, it does seem to line up with the outage. Looking at the timeline, it looks like it should have been restarted automatically though?

a month ago

Not all of my services came back up after the outage by themselves.

I had to manually restart a few.


@mykal, is your restart policy set to On Failure or Always for the services that did not start automatically?


mykal

Not all of my services came back up after the outage by themselves. I had to manually restart a few.

I just realized this depends on restart policy.
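
To make the dependence on restart policy concrete, here is a simplified model of how such policies are commonly defined. It is a sketch under assumptions, not Railway's actual implementation: `RestartPolicy` and `should_restart` are hypothetical names, "On Failure" is assumed to mean "restart only on a non-zero exit code", and retry limits are ignored.

```python
from enum import Enum


class RestartPolicy(Enum):
    """Hypothetical policy names mirroring Never / On Failure / Always."""
    NEVER = "never"
    ON_FAILURE = "on_failure"
    ALWAYS = "always"


def should_restart(policy: RestartPolicy, exit_code: int) -> bool:
    """Decide whether a stopped container would be restarted.

    Assumption: ON_FAILURE treats any non-zero exit code as a failure
    and an exit code of 0 as a normal completion.
    """
    if policy is RestartPolicy.ALWAYS:
        return True
    if policy is RestartPolicy.ON_FAILURE:
        return exit_code != 0
    return False


if __name__ == "__main__":
    # A graceful shutdown that exits with 0 is not a failure under this model.
    print(should_restart(RestartPolicy.ON_FAILURE, 0))  # False
    print(should_restart(RestartPolicy.ALWAYS, 0))      # True
    print(should_restart(RestartPolicy.ON_FAILURE, 1))  # True
```

Under this model, a service that was stopped gracefully during the outage and exited with 0 would stay down with On Failure, which would match the "COMPLETED" state described above, and would come back with Always.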


