Issue with Traffic Routing during Long Deployment Overlap (Graceful Shutdown for Stateful Game)
xqm32
PRO · OP

11 days ago

We are hosting a stateful card game server on Railway and releasing updates frequently. We rely on a graceful shutdown strategy to ensure active games are not interrupted.

Our Setup & Logic:

1. We listen for SIGTERM on the old deployment.

2. Once received, we stop accepting "Create Room" requests but keep processing logic for existing rooms.

3. We set a 1-hour timeout to allow players to finish their current games.

4. We have configured:

* RAILWAY_DEPLOYMENT_OVERLAP_SECONDS: 3600

* RAILWAY_DEPLOYMENT_DRAINING_SECONDS: 3600
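
For readers following along, the drain logic in steps 1 to 3 could be sketched roughly as below. This is only an illustration, not our actual server code: `RoomServer`, its method names, and the in-memory `rooms` dict are hypothetical, and the 3600-second timeout simply mirrors the 1-hour window above.

```python
import signal
import time

DRAIN_TIMEOUT_SECONDS = 3600  # mirrors the 1-hour window described above


class RoomServer:
    """Hypothetical in-memory room registry with drain support."""

    def __init__(self, drain_timeout=DRAIN_TIMEOUT_SECONDS, clock=time.monotonic):
        self.rooms = {}
        self.draining_since = None  # None means "not draining"
        self.drain_timeout = drain_timeout
        self.clock = clock

    def begin_drain(self, signum=None, frame=None):
        # Signature is compatible with signal handlers; only the first call counts.
        if self.draining_since is None:
            self.draining_since = self.clock()

    def create_room(self, room_id):
        # Step 2: refuse new rooms once draining has started.
        if self.draining_since is not None:
            return False
        self.rooms[room_id] = {"moves": []}
        return True

    def play_move(self, room_id, move):
        # Existing rooms keep working while draining.
        room = self.rooms.get(room_id)
        if room is None:
            return False
        room["moves"].append(move)
        return True

    def close_room(self, room_id):
        self.rooms.pop(room_id, None)

    def should_exit(self):
        # Step 3: exit when all rooms are finished or the timeout has elapsed.
        if self.draining_since is None:
            return False
        timed_out = self.clock() - self.draining_since >= self.drain_timeout
        return timed_out or not self.rooms


def install_sigterm_handler(server):
    # Step 1: must be called from the main thread of the process.
    signal.signal(signal.SIGTERM, server.begin_drain)
```

The main loop would call `install_sigterm_handler(server)` once at startup and poll `server.should_exit()` to decide when to stop the process.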

The Problem:

Even with the overlap settings, it appears that all incoming traffic is routed to the NEW deployment immediately once it becomes active. This breaks our game logic because players currently in a game (hosted on the OLD deployment) send requests that are being routed to the NEW deployment, where their game session data does not exist.

Our Question:

During the OVERLAP_SECONDS period, how does Railway handle traffic routing? Is there a way to ensure the OLD deployment continues to receive traffic (or enable sticky sessions) so existing players can finish their games?

We need the old deployment to remain publicly accessible via the domain until it shuts down safely.

Solved

4 Replies

brody
EMPLOYEE

10 days ago

Hello,

New traffic is routed to the latest deployment as soon as it passes its health check. Any ongoing connections to the old deployment will continue until they close, but we don’t support sticky sessions, so new requests from existing users will hit the new deployment once its health check has succeeded. For stateful workloads like yours, you’ll want to store session or game state outside of the deployment’s memory (for example, in Redis or a similar external store) so that any deployment can handle any request.
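
As a rough illustration of that approach (a sketch under assumptions, not a Railway-provided pattern): the game server below reads and writes room state through a small store interface instead of process memory. `KeyValueStore`, `InMemoryStore`, `GameServer`, and the `room:` key prefix are all hypothetical names, and the dict-backed store is only a local stand-in for Redis or a similar external store.

```python
import json
from typing import Dict, Optional, Protocol


class KeyValueStore(Protocol):
    """Minimal interface an external store client would need to satisfy."""

    def get(self, key: str) -> Optional[str]: ...

    def set(self, key: str, value: str) -> None: ...


class InMemoryStore:
    """Local stand-in for an external store such as Redis (illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class GameServer:
    """Hypothetical server that keeps no room state in its own memory."""

    def __init__(self, store: KeyValueStore) -> None:
        self.store = store

    def create_room(self, room_id: str) -> None:
        self.store.set(f"room:{room_id}", json.dumps({"moves": []}))

    def play_move(self, room_id: str, move: str) -> dict:
        raw = self.store.get(f"room:{room_id}")
        if raw is None:
            raise KeyError(room_id)
        state = json.loads(raw)
        state["moves"].append(move)
        # Note: this read-modify-write is not atomic; a real store would
        # need a transaction or lock to handle concurrent moves safely.
        self.store.set(f"room:{room_id}", json.dumps(state))
        return state


# Two "deployments" sharing one store can both serve the same room.
shared = InMemoryStore()
old_deployment = GameServer(shared)
new_deployment = GameServer(shared)
old_deployment.create_room("r1")
old_deployment.play_move("r1", "draw")
new_deployment.play_move("r1", "discard")
```

Because both server instances go through the shared store, it no longer matters which deployment a given request lands on.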

Best,

Brody


Status changed to Awaiting User Response Railway 10 days ago


xqm32
PRO · OP

10 days ago

If we add a health check that keeps the new deployment unhealthy, it won't receive any traffic, right? How long can it stay unhealthy before the deployment is marked as failed?


Status changed to Awaiting Railway Response Railway 10 days ago


brody
EMPLOYEE

10 days ago

That is correct.

You can adjust the health check timeout to a maximum of 3600 seconds in the UI.
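
A minimal sketch of the health-check gating idea discussed above, assuming the health check treats an HTTP 200 as healthy and anything else as not yet healthy. The `/health` path, the `ready` flag, and the helper names are hypothetical, and how the new deployment learns that the old one has finished draining (for example, a flag in a shared store) is left out and would need its own mechanism.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set once the old deployment is known to have drained (mechanism not shown).
ready = threading.Event()


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            status, body = 404, b"not found"
        elif ready.is_set():
            status, body = 200, b"ok"
        else:
            status, body = 503, b"waiting for old deployment to drain"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_health_server(host="127.0.0.1", port=0):
    # port=0 lets the OS pick a free port; a real service would bind its own.
    server = ThreadingHTTPServer((host, port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def probe(port, host="127.0.0.1"):
    # Returns the HTTP status code of GET /health.
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/health")
        return conn.getresponse().status
    finally:
        conn.close()
```

With this in place the new deployment would report unhealthy until `ready.set()` is called, so the gate has to open within the configured health check timeout (at most 3600 seconds, per the reply above) or the deployment would be marked as failed.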


Status changed to Awaiting User Response Railway 10 days ago


Status changed to Solved xqm32 10 days ago

