Partial deployment failure triggered by branch changes
Anonymous
PROOP

7 months ago

We have two services, Web (server) and Scheduler, connected to the same GitHub branch, i.e. changes made to the branch automatically trigger new deployments of each.

Earlier today the Ubuntu archive was down when we pushed a code change to the branch:

  • The Web build consequently failed, and so the old Web deployment remained active.

  • However, the Scheduler deployment succeeded and the new deployment went live.

  • This meant that the branch commit that was live on Web was out of sync with what was live on Scheduler, which is a scenario we never want to happen.

  1. Is there a way to configure Railway such that either both deployments succeed and go live, or neither goes live (if either the Web or the Scheduler deployment fails)?

  2. Also, today we noticed that our builds + deployments are taking a long time (~30 minutes), while on previous days on Railway they always finished in well under 10 minutes (normally ~6 mins). We haven't made any changes to build-related code or service compute provisioning. Is this level of variance expected with Railway?

    1. For example, one of today's Web deployments here took 30 mins, and yesterday an equivalent deployment took 5 mins here.

    2. The Scheduler deployment today (which corresponded with the 30-minute Web deployment linked above) took less than 2 minutes here

    3. This meant there was a period of 28 minutes during which Web and Scheduler were live with different versions of the branch code. Is there a way to synchronize these deployments (in line with question 1) so that both always go live at the same time?

Solved

1 Reply

brody
EMPLOYEE

7 months ago

Hello Cormac,

Regarding synchronized deployments, Railway doesn't currently support atomic deployments across multiple services where both must succeed or both must fail. However, you can configure deployment dependencies using reference variables to control the startup ordering of your services. This documentation covers that -

https://docs.railway.com/guides/deployment-actions#deployment-dependencies---startup-ordering

The extended build times you experienced today were directly related to an incident we had this morning where GCP severely limited the number of available builders in the EU-West region -

https://status.railway.com/cmclvq0nm00bstks1igwbytu2

This explains why your Web deployment took 30 minutes while your Scheduler deployment completed in 2 minutes - they were processed by different builder pools with varying availability during the incident.

While Railway doesn't support true atomic deployments where services go live simultaneously, the deployment dependencies feature can help you manage the ordering to minimize the window where services are running different versions of your code.
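
Since the platform won't enforce this for you, one application-level workaround is to have the Scheduler hold off on running jobs until Web reports the same commit. The following is only a rough sketch under several assumptions: it assumes the `RAILWAY_GIT_COMMIT_SHA` variable is available in both services, that Web exposes a hypothetical endpoint returning its own commit SHA as plain text, and that the endpoint's URL is passed in through a hypothetical `WEB_VERSION_URL` variable. None of these names come from the thread, and this is not a Railway feature.

```python
import os
import time
import urllib.request
from typing import Callable, Optional


def fetch_web_sha(url: str) -> Optional[str]:
    """Read the commit SHA that Web reports (hypothetical plain-text endpoint)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode().strip() or None


def wait_for_matching_version(
    own_sha: str,
    fetch_peer_sha: Callable[[], Optional[str]],
    timeout_s: float = 1800.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the peer reports own_sha; return False once timeout_s has passed."""
    deadline = clock() + timeout_s
    while True:
        try:
            peer_sha = fetch_peer_sha()
        except OSError:
            # Network errors (including urllib's URLError) count as "not ready yet".
            peer_sha = None
        if peer_sha is not None and peer_sha == own_sha:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    own = os.environ.get("RAILWAY_GIT_COMMIT_SHA", "")
    url = os.environ.get("WEB_VERSION_URL", "")  # hypothetical variable name
    # Only gate startup when both values are configured.
    if own and url and not wait_for_matching_version(own, lambda: fetch_web_sha(url)):
        raise SystemExit("Web never reported the Scheduler's commit; not starting jobs")
```

If the Web build fails outright, as in your Ubuntu archive case, the Scheduler would time out and exit instead of running against the old Web version. Whether exiting or idling is the right reaction depends on your workload.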

Best,

Brody


Status changed to Awaiting User Response Railway 7 months ago


Railway
BOT

6 months ago

This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!

Status changed to Solved Railway 6 months ago

