Health check failing after minor code change
Anonymous
TRIALOP

2 years ago

I made a very minor text change, and now my deployment isn't working, whereas it was working fine yesterday. The deploy logs look normal, but my build logs show that my health check isn't working (it works locally).

87 Replies

Anonymous
TRIALOP

2 years ago

My health check is /health which serves:

from django.http import HttpResponse

def health_check(request):
    return HttpResponse(status=200)

brody
EMPLOYEE

2 years ago

what is the health check failing with?


Anonymous
TRIALOP

2 years ago

Attempt #6 failed with service unavailable. Continuing to retry for 6m31s
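One way to double-check the endpoint outside the platform is a small retrying probe. This is only a sketch for local testing, not how Railway's health checker is implemented; `probe` and its parameters are hypothetical names.

```python
import time
import urllib.error
import urllib.request


def probe(url, attempts=5, delay=1.0, timeout=2.0):
    """Return True as soon as `url` answers with HTTP 200, else False."""
    for attempt in range(1, attempts + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status == 200:
                    return True
                reason = f"status {resp.status}"
        except (urllib.error.URLError, OSError) as exc:
            # Covers refused connections, timeouts, and 4xx/5xx responses
            reason = str(exc)
        print(f"Attempt #{attempt} failed with {reason}")
        if attempt < attempts:
            time.sleep(delay)
    return False
```

Running it against the loopback address and then against the address the platform would actually use can hint at whether the app is reachable only from inside its own container.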


brody
EMPLOYEE

2 years ago

are you on the legacy or v2 runtime? check your service settings


Anonymous
TRIALOP

2 years ago

legacy


Anonymous
TRIALOP

2 years ago

Oh wait


Anonymous
TRIALOP

2 years ago

It's V2?


brody
EMPLOYEE

2 years ago

on the v2 runtime your app needs to listen on ::


Anonymous
TRIALOP

2 years ago

Did it get auto-switched or something?


brody
EMPLOYEE

2 years ago

might have


Anonymous
TRIALOP

2 years ago

I don't love that. Would've been nice to know about a breaking change like this.


Anonymous
TRIALOP

2 years ago

Is that safe for me to switch back to legacy?


Anonymous
TRIALOP

2 years ago

And where would I find docs for the difference between legacy and V2?


You can indeed switch to legacy.


brody
EMPLOYEE

2 years ago

it has been mentioned in two places, what else would work best for you?


We expect the legacy runtime to stay in place for as long as we get the expected behavior that our users need.


brody
EMPLOYEE

2 years ago

if the healthcheck issue is the only issue you face, I cannot recommend switching back to the legacy runtime


brody
EMPLOYEE

2 years ago

fwiw the health check issue has been reported to the team


Anonymous
TRIALOP

2 years ago

That's fair, but I'm not always looking at the changelog unless I'm interested to see what new features are available to me. I do get emails about the changelog with some basic bullet points, but it would've been nice to have in this email, or in a separate email, a message along the lines of:
"Starting 6/x/2024, all services will be switched from Legacy to V2, and here's what you need to do before then:"



Anonymous
TRIALOP

2 years ago

Ohhh. So it wasn't anticipated.


Anonymous
TRIALOP

2 years ago

That's fair.


Anonymous
TRIALOP

2 years ago

Ok. Thanks for the help! Will fix up my /health response 😄


brody
EMPLOYEE

2 years ago

it's a fair assumption that the changelogs would only include new features, and they do, but they also mention migration timelines and such for new features, and new features always have the possibility to cause issues


Anonymous
TRIALOP

2 years ago

True, but imo known breaking changes should be communicated more directly. I don't always have the time to read changelogs for all of the services I use. My project is a hobby project, so no bigs, but for the enterprise customers, that could put a snag in their work. Luckily the support here is really on top of things!


brody
EMPLOYEE

2 years ago

i dont think this was known tbh, but i have no way to know for sure


brody
EMPLOYEE

2 years ago

waiting to hear back from char on this issue


Anonymous
TRIALOP

2 years ago

Yeah, in that case, it's a hiccup. And good on the Railway team for testing with Hobby accounts first so they can find these issues before they reach enterprise customers.


Anonymous
TRIALOP

2 years ago

I'm still having issues getting this to work. I've added [::1]:$PORT to my gunicorn command. I've confirmed this working locally, but I'm still having trouble with the health check


Anonymous
TRIALOP

2 years ago

So it was gunicorn project.wsgi and now it's gunicorn -b 127.0.0.1:$PORT -b [::]:$PORT project.wsgi


brody
EMPLOYEE

2 years ago

it needs to be :: not ::1
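For anyone skimming, the difference between the two addresses can be checked with Python's standard `ipaddress` module. A small sketch follows; the helper name `describe` and its wording are made up for illustration.

```python
import ipaddress


def describe(addr):
    """Classify a bind address as loopback, wildcard, or specific."""
    ip = ipaddress.ip_address(addr)
    if ip.is_loopback:
        return "loopback: only reachable from the same host or container"
    if ip.is_unspecified:
        return "wildcard: listens on every interface"
    return "specific interface address"


for addr in ("::1", "::", "127.0.0.1", "0.0.0.0"):
    print(f"{addr:>9} -> {describe(addr)}")
```

`::1` is the IPv6 counterpart of `127.0.0.1`, while `::` is the counterpart of `0.0.0.0`, so a health checker connecting from outside the container can reach only the latter pair.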


Anonymous
TRIALOP

2 years ago

Ah, see, I tried that, but I get [ERROR] Connection in use: ('::', 65090)


Dumb ask, and unsure if you did this in the past: is switching to Legacy confirmed to fix the issue? Wanna make sure our network engineer can do a proper repro.


Anonymous
TRIALOP

2 years ago

I figured that there's already something running there.


brody
EMPLOYEE

2 years ago

yes, check <#880575219541114940>


Anonymous
TRIALOP

2 years ago

I'll give it a try here and confirm.


Anonymous
TRIALOP

2 years ago

I don't have access to that channel.


He is flagging me to another case 🙂


Anonymous
TRIALOP

2 years ago

ahhh


We just wanna have more languages to test runtime with hence why I ask.


The more cases the better.


Anonymous
TRIALOP

2 years ago

Sounds good! Yeah, I'll test and report back.


And sorry to use you as a test pig, I can comp you the month since you are doing QA work.


brody
EMPLOYEE

2 years ago

gunicorn -b [::]:$PORT project.wsgi


Anonymous
TRIALOP

2 years ago

Yup! Tried that, and got the "Connection in use" error.


Anonymous
TRIALOP

2 years ago

Much appreciated!


brody
EMPLOYEE

2 years ago

deploy logs please -


new role added


comped, test away, let us know when you have recovered the healthcheck


Anonymous
TRIALOP

2 years ago

Ah, looks like that only gets the newest logs. Here's what it shows:

[2024-06-11 19:11:19 +0000] [1] [INFO] Starting gunicorn 21.2.0
[2024-06-11 19:11:19 +0000] [1] [ERROR] Connection in use: ('::', 65090)
[2024-06-11 19:11:19 +0000] [1] [ERROR] Retrying in 1 second.
[2024-06-11 19:11:20 +0000] [1] [ERROR] Connection in use: ('::', 65090)
[2024-06-11 19:11:20 +0000] [1] [ERROR] Retrying in 1 second.
[2024-06-11 19:11:21 +0000] [1] [ERROR] Connection in use: ('::', 65090)
[2024-06-11 19:11:21 +0000] [1] [ERROR] Retrying in 1 second.
[2024-06-11 19:11:22 +0000] [1] [ERROR] Connection in use: ('::', 65090)
[2024-06-11 19:11:22 +0000] [1] [ERROR] Retrying in 1 second.
[2024-06-11 19:11:23 +0000] [1] [ERROR] Connection in use: ('::', 65090)
[2024-06-11 19:11:23 +0000] [1] [ERROR] Retrying in 1 second.
[2024-06-11 19:11:24 +0000] [1] [ERROR] Can't connect to ('::', 65090)
container event container died

brody
EMPLOYEE

2 years ago

ill try to reproduce


brody
EMPLOYEE

2 years ago

what version of gunicorn?


Anonymous
TRIALOP

2 years ago

I changed my gunicorn command back to what it was, and flipped the runtime to Legacy and it deployed successfully. And that's including the minor code change mentioned in the original post.


Anonymous
TRIALOP

2 years ago

21.2.0


Gotcha, that seems to be enough. Going to add this case to the Runtime V2 blockers in the root thread.


Anonymous
TRIALOP

2 years ago

Great. Thanks!


brody
EMPLOYEE

2 years ago

my start command is gunicorn -b [::]:$PORT main:app on the v2 runtime with the same gunicorn version you are using, so this new error doesnt look like a v2 vs legacy issue



Anonymous
TRIALOP

2 years ago

But what would already be running on that port? 🤔


Anonymous
TRIALOP

2 years ago

In my case.


brody
EMPLOYEE

2 years ago

does your container run gunicorn and only gunicorn?


Anonymous
TRIALOP

2 years ago

It runs a couple django commands before gunicorn. migrate and collectstatic


brody
EMPLOYEE

2 years ago

can you provide the full command


Anonymous
TRIALOP

2 years ago

python manage.py migrate && python manage.py collectstatic --noinput && gunicorn project.wsgi


brody
EMPLOYEE

2 years ago

and what was the command when you got this error?


Anonymous
TRIALOP

2 years ago

python manage.py migrate && python manage.py collectstatic --noinput && gunicorn -b [::]:$PORT grbot.wsgi


Anonymous
TRIALOP

2 years ago

I may have had an extra -b 127.0.0.1:$PORT in there for IPv4.


Anonymous
TRIALOP

2 years ago

Testing just [::] atm


brody
EMPLOYEE

2 years ago

that would do it, gunicorn supports dual stack binding anyway so that wouldnt be needed, 127.0.0.1 would also be the incorrect address
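As a rough illustration of dual-stack binding (plain sockets, not gunicorn's actual code), a single listener on `::` with `IPV6_V6ONLY` turned off accepts IPv4 and IPv6 clients alike. That would likely explain why an extra IPv4 bind on the same port ends in "Connection in use". The sketch assumes the host has IPv6 enabled and falls back to `0.0.0.0` otherwise; `open_listener` is a hypothetical helper.

```python
import socket


def open_listener(port=0):
    """Listen on [::] for IPv4 and IPv6 if possible, else on 0.0.0.0."""
    if socket.has_dualstack_ipv6():
        try:
            # dualstack_ipv6=True clears IPV6_V6ONLY on the socket
            return socket.create_server(
                ("::", port), family=socket.AF_INET6, dualstack_ipv6=True
            )
        except OSError:
            pass  # IPv6 present but unusable; fall through to IPv4
    return socket.create_server(("0.0.0.0", port))


if __name__ == "__main__":
    with open_listener() as srv:
        port = srv.getsockname()[1]
        # An IPv4 client reaches the same listener
        with socket.create_connection(("127.0.0.1", port), timeout=2):
            conn, peer = srv.accept()
            conn.close()
            print("accepted from", peer[0])
```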


Anonymous
TRIALOP

2 years ago

Their documentation seems to suggest that you need to state both: https://docs.gunicorn.org/en/stable/settings.html#bind


Anonymous
TRIALOP

2 years ago

and I'm assuming the correct address is 0.0.0.0?


Yes, binding on 127.0.0.1 won't bind properly.


But wondering why legacy did it.


Anonymous
TRIALOP

2 years ago

I didn't have that for legacy. I was just adding it in because I assumed I needed it if I also needed to have IPv6. My bad.


Anonymous
TRIALOP

2 years ago

Alright, well. It worked with python manage.py migrate && python manage.py collectstatic --noinput && gunicorn -b [::]:$PORT project.wsgi
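The same bind can also live in a config file rather than in the start command. A minimal sketch of a `gunicorn.conf.py`, assuming the platform injects `PORT` and the WSGI module is `project.wsgi` as in the command above; the `8000` fallback is only an assumption for local runs.

```python
# gunicorn.conf.py (gunicorn looks for this file in the working directory)
import os

# One bind only: [::] is expected to cover IPv4 too on a dual-stack host,
# so there is no separate 127.0.0.1 or 0.0.0.0 entry to clash with it.
bind = [f"[::]:{os.environ.get('PORT', '8000')}"]

# Same application module as in the start command (needs gunicorn 20.1+)
wsgi_app = "project.wsgi"
```

With a file like this in place, the last step of the start command could shrink to a bare `gunicorn`.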


Anonymous
TRIALOP

2 years ago

on V2


brody
EMPLOYEE

2 years ago

by default gunicorn binds to 0.0.0.0:$PORT so that would have worked for legacy
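For reference, gunicorn's settings documentation describes the default bind roughly as `0.0.0.0:$PORT` when `PORT` is set and `127.0.0.1:8000` otherwise. The function below only paraphrases that documented behavior. It is not gunicorn's code, and `default_bind` is a made-up name.

```python
def default_bind(env):
    """Paraphrase of gunicorn's documented default for the bind setting."""
    if "PORT" in env:
        return [f"0.0.0.0:{env['PORT']}"]  # IPv4 wildcard only, no IPv6
    return ["127.0.0.1:8000"]


print(default_bind({"PORT": "65090"}))
print(default_bind({}))
```

Either default is IPv4 only, which fits with the plain `gunicorn project.wsgi` command working on legacy but not passing the V2 health check at the time.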


brody
EMPLOYEE

2 years ago

as i suggested 🙂


Anonymous
TRIALOP

2 years ago

Yup! For some reason I thought I tested that. Sorry about that.


brody
EMPLOYEE

2 years ago

no worries


Anonymous
TRIALOP

2 years ago

Guess this isn't a new bug then, Angelo! I apologize. New to messing with IPv6.


brody
EMPLOYEE

2 years ago

it is a new bug


brody
EMPLOYEE

2 years ago

you should not need to listen on ipv6 just for the health check to work


Yea, if any behavior is different vs. old, it's a bug.


You did us a favor.


brody
EMPLOYEE

2 years ago

technically solved


brody
EMPLOYEE

a year ago

Update: health checks can now pass if your app only listens on 0.0.0.0, but if you have already changed it to :: there's no point in changing anything back, as listening on :: has no known drawbacks.

