Old Deployments Not Going Away
alec2435
PROOP

10 months ago

Suddenly, starting in the last 48 hours, I've noticed that old deployments are not being removed automatically and have to be removed manually. This caused our service to go down as our pg connection count 10x'd from having 10 active deployments when I'd expect at most 2 (given we configured RAILWAY_DEPLOYMENT_DRAINING_SECONDS to be 600 (10 minutes)). Was there a behavior change? Is this a new bug? Project id b34282ae-5797-4e55-bd45-347f7a9fc694

11 Replies

brody
EMPLOYEE

10 months ago

Hello!

Can you link to the specific service in question?



alec2435
PROOP

10 months ago

Seems like it fixed itself, but I want to understand: was there some downtime on Railway's end? A misconfiguration on our end? How do we keep this from causing downtime in the future?


brody
EMPLOYEE

10 months ago

you do have your overlap time set to I think 600 seconds, could that be why?


brody
EMPLOYEE

10 months ago

I was slightly mistaken, you have the draining seconds set to 600. That means the older deployment can keep running for up to 600 seconds after a new deployment rolls out, whereas the default is 3 seconds
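
For anyone reading along, here is a minimal sketch of how an app could use the draining window to let go of its database connections early. It assumes the platform sends SIGTERM to the old deployment when draining starts and force-stops it when the window ends. `FakePool`, `make_sigterm_handler`, and `install` are hypothetical names for illustration, not part of any Railway or database library.

```python
import signal
import threading


class FakePool:
    """Hypothetical stand-in for a real database connection pool."""

    def __init__(self, size):
        self.open_connections = size

    def close(self):
        # A real pool would close its sockets here.
        self.open_connections = 0


# Set once shutdown has been requested, so worker loops can exit cleanly.
shutdown_requested = threading.Event()


def make_sigterm_handler(pool):
    """Build a handler that releases DB connections when draining starts."""

    def handle_sigterm(signum, frame):
        pool.close()
        shutdown_requested.set()

    return handle_sigterm


def install(pool):
    """Register the handler for SIGTERM (must be called from the main thread)."""
    signal.signal(signal.SIGTERM, make_sigterm_handler(pool))
```

With something like this in place, an old deployment that lingers through a long draining window would stop holding Postgres connections shortly after the new one takes over, assuming the signal is actually delivered.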


alec2435
PROOP

10 months ago

I mean you can see in the screenshot above that it was much much more than 600 seconds


alec2435
PROOP

10 months ago

and that was after I manually pruned a few 2 day old deployments


alec2435
PROOP

10 months ago

so not sure how that could be the cause


brody
EMPLOYEE

10 months ago

fair point, is this still happening?


alec2435
PROOP

10 months ago

Not at this moment


alec2435
PROOP

10 months ago

It's happening again. This is twice in two days our app has gone down because we max out db connections because these old containers stay up.
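
As a rough illustration of why lingering deployments exhaust the database, here is a small sketch of the connection arithmetic. All numbers are hypothetical (a pool of 20 per deployment, one replica, a Postgres limit of 100 connections with 10 held in reserve), and both function names are made up for this example.

```python
def total_connections(live_deployments, replicas_per_deployment, pool_size):
    """Connections held if every live deployment fills its pool."""
    return live_deployments * replicas_per_deployment * pool_size


def max_safe_pool_size(max_connections, reserved, worst_case_deployments,
                       replicas_per_deployment):
    """Largest per-replica pool that still fits under the database limit."""
    usable = max_connections - reserved
    return max(usable // (worst_case_deployments * replicas_per_deployment), 0)


# Hypothetical numbers: 2 overlapping deployments fit, 10 do not.
expected = total_connections(2, 1, 20)    # 40 connections
observed = total_connections(10, 1, 20)   # 200 connections, over a limit of 100

# Pool size that would survive 10 live deployments under the same limit.
safe_pool = max_safe_pool_size(100, 10, 10, 1)  # 9 connections per replica
```

Sizing the pool for the worst case is only a stopgap under these assumptions. It would not explain why the old containers stay up.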
