Previous deploy crash alert after new deploy

danstewart

PRO OP

4 months ago

Hello 👋

Sorry if I've missed something in the docs but I can't find an answer.

When we do a new deploy (from a GitHub push) the previous deploy shuts down as expected, but we get a deploy crash alert for that previous deploy.

We have a healthcheck and a restart policy of "On Failure".

How do we prevent this?

Thanks,

Dan

Solved

17 Replies

passos

MODERATOR

4 months ago

Hey, can you confirm that your deployment doesn't exit with code 1? Railway might interpret this as a crash. Also, if your service uses volumes, be aware that there's a brief volume downtime during new version deployments which could cause your application to crash.

danstewart

PRO OP

4 months ago

I think that's it, thanks!

Looks like my app is just taking too long to shut down

I think I can use RAILWAY_DEPLOYMENT_DRAINING_SECONDS to give it longer

brody

EMPLOYEE

4 months ago

If your app is node -

danstewart

PRO OP

4 months ago

Thanks, it's a rails app.

Setting RAILWAY_DEPLOYMENT_DRAINING_SECONDS=30 didn't seem to help

Here are the logs I get during shutdown

danstewart

PRO OP

4 months ago

The next deploy build started at 14:18:50
The container started at 14:20:23
The healthcheck started at 14:20:26
The server started at 14:20:27
The healthcheck passed at 14:20:29

passos

MODERATOR

4 months ago

Maybe RAILWAY_DEPLOYMENT_OVERLAP_SECONDS also needs to be set?

danstewart

PRO OP

4 months ago

Thanks, I'll give that a try

danstewart

PRO OP

4 months ago

No luck unfortunately, I'm still getting emails telling me the deploy crashed 🫤

passos

MODERATOR

4 months ago

When I set RAILWAY_DEPLOYMENT_DRAINING_SECONDS=60, Railway tells me exactly when it sends SIGTERM (the graceful shutdown request) and then SIGKILL (which actually terminates the process). The only explanations I can offer are that your application takes too long to exit, even with 30 seconds, or that it shuts down successfully but still exits with code 1 for some reason.
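
As a rough local illustration of that SIGTERM-then-SIGKILL sequence, here is a minimal sketch. It is not Railway's implementation: the DRAIN_SECONDS variable, the 2-second window, and the stand-in process that ignores SIGTERM are all assumptions made for the example.

```shell
#!/bin/sh
# Local simulation only: mimics a TERM-then-KILL shutdown sequence.
DRAIN_SECONDS=2   # hypothetical stand-in for RAILWAY_DEPLOYMENT_DRAINING_SECONDS

# Stand-in for an app that ignores SIGTERM and so never shuts down in time.
sh -c 'trap "" TERM; exec sleep 30' &
pid=$!
sleep 1           # give the child time to start ignoring SIGTERM

kill -TERM "$pid" # polite request to stop
elapsed=0
while kill -0 "$pid" 2>/dev/null && [ "$elapsed" -lt "$DRAIN_SECONDS" ]; do
  sleep 1
  elapsed=$((elapsed + 1))
done

# Still running after the drain window: force it to stop.
if kill -0 "$pid" 2>/dev/null; then
  kill -KILL "$pid"
fi

wait "$pid"
status=$?
echo "exit status: $status"   # 137 = 128 + 9 when SIGKILL was needed
```

An app that handles SIGTERM and exits inside the window would never reach the SIGKILL step, so a longer draining time only helps if the shutdown is slow rather than stuck.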

danstewart

PRO OP

4 months ago

Is the "Stopping container" line the SIGKILL?

I see two of those but no SIGTERM line 🤔

I'll try increasing RAILWAY_DEPLOYMENT_DRAINING_SECONDS further

danstewart

PRO OP

4 months ago

I've increased RAILWAY_DEPLOYMENT_DRAINING_SECONDS to 60 but still no luck 🤔

danstewart

PRO OP

4 months ago

I think I've found the issue

I've replicated locally by doing:

railway shell
unset DATABASE_URL
./bin/thrust ./bin/rails s

Wait for it to start up then find the process with ps -ef | grep thrust

Then send a SIGTERM to the process: kill -15 <PID>

Then check the exit code of thrust by running echo $?

It was 255

When sending a SIGTERM the logs had:

- Gracefully stopping, waiting for requests to finish

Exiting

{"time":"2026-01-17T11:21:01.685527Z","level":"INFO","msg":"Server stopping"}

{"time":"2026-01-17T11:21:01.685947Z","level":"INFO","msg":"Server stopped"}

Interestingly I get an exit code of 0 if I use SIGINT

When sending a SIGINT (ctrl+c) the logs had:

- Gracefully stopping, waiting for requests to finish

=== puma shutdown: 2026-01-17 11:12:45 +0000 ===

- Goodbye!

Exiting

{"time":"2026-01-17T11:12:45.07372Z","level":"INFO","msg":"Server stopping"}

{"time":"2026-01-17T11:12:45.07384Z","level":"INFO","msg":"Server stopped"}

If anyone has any suggestions please let me know but I think I'll raise this on the thruster GitHub repository.

When running rails server without thrust it exits with code 143, which I think is the expected exit code.
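
For reference, 143 is 128 + 15, which is how a POSIX shell reports a process that was terminated by SIGTERM (signal 15). A minimal sketch, using sleep as a stand-in for the server rather than the actual Rails app:

```shell
#!/bin/sh
# Start a long-running child, send it SIGTERM, and read the status the shell reports.
sleep 30 &
pid=$!

kill -15 "$pid"   # same signal as in the steps above
wait "$pid"
status=$?

echo "exit status: $status"   # 143 = 128 + 15 (SIGTERM)
```

A process that catches SIGTERM and then exits on its own reports whatever code it chooses, which is how a value like 255 can appear instead.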

brody

EMPLOYEE

4 months ago

I think you now need to look into why your application is exiting with 255. Not much can be done about that on the Railway side, as 255 is indeed an error code, unlike 143.

danstewart

PRO OP

4 months ago

Thanks for the help, I've raised it with thruster.

https://github.com/basecamp/thruster/issues/109

brody

EMPLOYEE

4 months ago

For peace of mind, as long as you can at least get it to exit with 143, we do treat that as a successful exit code.

danstewart

PRO OP

4 months ago

For anyone who has the same issue, I've written a little wrapper script to exit with the right code:

https://gist.github.com/danstewart/d13abd4f06f04ce567b36423df64fb60
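
The gist's contents are not reproduced here. As a sketch of what such a wrapper might look like, the script below runs a command, forwards SIGTERM to it, and reports 143 when a requested shutdown ends with exit code 255. The function names normalize_status and run_wrapped are hypothetical, and the decision to remap only 255 is an assumption based on the exit code seen in this thread.

```shell
#!/bin/sh
# Hypothetical wrapper (not the gist's contents).

# Map the child's status to the status the wrapper should exit with.
# $1 = child's exit status, $2 = 1 if the wrapper received SIGTERM, else 0
normalize_status() {
  if [ "$2" -eq 1 ] && [ "$1" -eq 255 ]; then
    echo 143   # 128 + 15: the conventional "terminated by SIGTERM" status
  else
    echo "$1"
  fi
}

run_wrapped() {
  term_received=0
  child_pid=
  # On SIGTERM, remember it and pass the signal on to the child.
  trap 'term_received=1; [ -n "$child_pid" ] && kill -TERM "$child_pid" 2>/dev/null' TERM

  "$@" &
  child_pid=$!

  wait "$child_pid"
  status=$?
  # A trapped signal interrupts wait, so wait again for the real exit status.
  if [ "$term_received" -eq 1 ]; then
    wait "$child_pid"
    status=$?
  fi
  exit "$(normalize_status "$status" "$term_received")"
}

# Only run when a command was given, e.g. ./wrapper.sh ./bin/thrust ./bin/rails s
if [ "$#" -gt 0 ]; then
  run_wrapped "$@"
fi
```

Exit codes other than 255, and any 255 that occurs without a SIGTERM, pass through unchanged, so genuine crashes would still be reported.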

Status changed to Solved brody • 4 months ago