Healthcheck fails despite the server being online

ngryman
HOBBY

a year ago

Hi,

I added the /health endpoint as healthcheck in my deploy settings. Despite the server being successfully deployed (cf. deploy log excerpt), the build fails (cf. build log excerpt).

The health path does exist and is responsive. It seems the /health request never reaches the container, which makes the build fail.

When I re-deploy without the healthcheck, everything works fine.

Deploy Log Excerpt

Jun 16 17:31:49 Starting Container
Jun 16 17:31:50 2024-06-16T15:31:50.092279Z INFO listening on 0.0.0.0:65090
Jun 16 17:37:29 Stopping Container

Build Log Excerpt

Jun 16 17:31:52 ====================
Jun 16 17:31:52 Starting Healthcheck
Jun 16 17:31:52 ====================
Jun 16 17:31:52
Jun 16 17:31:52 Path: /health
Jun 16 17:31:52 Retry window: 30s
Jun 16 17:31:52
Jun 16 17:31:52 Attempt #1 failed with service unavailable. Continuing to retry for 29s
Jun 16 17:31:53 Attempt #2 failed with service unavailable. Continuing to retry for 28s
Jun 16 17:31:55 Attempt #3 failed with service unavailable. Continuing to retry for 26s
Jun 16 17:31:59 Attempt #4 failed with service unavailable. Continuing to retry for 22s
Jun 16 17:32:07 Attempt #5 failed with service unavailable. Continuing to retry for 14s
Jun 16 17:32:08
Jun 16 17:32:08 1/1 replicas never became healthy!
Jun 16 17:32:08 Healthcheck failed!

9 Replies

a year ago

What kind of app is this?

There is a bug with the v2 runtime that means your app would need to listen on IPv6 :: to pass the health check.


ngryman
HOBBY

a year ago

It's a v2 app indeed.


a year ago

You mean to say your app is running on the v2 runtime.

Please apply my suggested fix.


ngryman
HOBBY

a year ago

Yes, it's running on the v2 runtime.

Thanks for the fix.

One suggestion: if it's a known, long-lasting bug, it would be nice to have some kind of callout in the UI explaining this.


a year ago

The bug will be fixed, so it wouldn't make sense to add a UI element in the meantime.


a year ago

Hey all, really sorry for not following up here; this was fixed a long time ago.

For health checks it no longer matters if you listen on IPv4 or IPv6.


Status changed to Solved brody 11 months ago


javierortegap
PRO

4 months ago

Hi Railway Team & Community,

I'm encountering an issue where Railway's health check fails for my Python FastAPI service when I configure Uvicorn to bind to the IPv6 wildcard address (`::`) needed for private networking.

Context:

I have two services in the same project/environment: an **Orchestrator** (FastAPI) and a Worker (FastAPI, e.g., my-worker-service).

* The Orchestrator needs to call the Worker over the private network using the internal DNS name (`http://my-worker-service.railway.internal:<PORT>`).

Based on documentation and previous discussions, I've configured the **Worker's Railway Start Command** to:

```bash
/bin/sh -c "exec poetry run uvicorn my_worker_module.main:app --host :: --port $PORT"
```

* My main.py uses uvicorn.run(..., port=port) without specifying the host, letting the Start Command control it.

Observations:

1. Successful Private Connection: After setting the Start Command and redeploying the worker, the worker's logs show Uvicorn successfully binding (e.g., Uvicorn running on http://[::]:8080). Crucially, the Orchestrator service CAN now successfully connect to the worker using the http://my-worker-service.railway.internal:8080 address. Private networking is functional.

2. Failing Health Check: However, Railway's health check for the worker service (using path /health) now consistently fails. Before changing the host binding (when it was likely defaulting to 127.0.0.1 or 0.0.0.0), the health check passed.

3. Health Check Endpoint Works (Externally): If I temporarily expose the worker service publicly and access /health via its public URL, the endpoint responds correctly with a 200 OK.

Hypothesis:

It seems Railway's internal health checker might be trying to connect to the service using an IPv4 address (like 127.0.0.1 or the container's internal IPv4). When Uvicorn is explicitly bound to ::, even though this often enables dual-stack listening on Linux, it might not be accepting the IPv4 connection from the health checker in the specific Railway environment configuration.
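
The dual-stack question in the hypothesis above can be checked locally with a small stdlib-only sketch. It is not Uvicorn's or Railway's code: `open_listener` and `accepts_ipv4` are hypothetical helper names, and whether Railway's health checker really connects over IPv4 remains an assumption from this post. The sketch only shows how a socket bound to `::` can be made to accept IPv4 clients, and how to inspect whether a given IPv6 listener does.

```python
import socket


def open_listener(port: int) -> socket.socket:
    """Return a listening TCP socket, dual-stack when the platform allows it."""
    if socket.has_dualstack_ipv6():
        try:
            # Binds to :: and clears IPV6_V6ONLY, so IPv4 clients are accepted
            # as IPv4-mapped IPv6 addresses (::ffff:a.b.c.d).
            return socket.create_server(
                ("::", port), family=socket.AF_INET6, dualstack_ipv6=True
            )
        except OSError:
            pass  # IPv6 is present but unusable here; fall back to IPv4 below.
    return socket.create_server(("0.0.0.0", port), family=socket.AF_INET)


def accepts_ipv4(listener: socket.socket) -> bool:
    """Report whether the listener will accept IPv4 connections."""
    if listener.family == socket.AF_INET:
        return True
    # For an IPv6 socket, IPv4 is accepted only when IPV6_V6ONLY is 0.
    v6only = listener.getsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY)
    return v6only == 0
```

A socket bound to `::` is dual-stack only while `IPV6_V6ONLY` is 0. Some servers and event loops set it to 1 on the sockets they create, which would match the symptom described here: IPv6 private networking works, but an IPv4 client cannot connect.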

Question:

What is the recommended configuration or Start Command for a Uvicorn/FastAPI service on Railway to reliably support both IPv6 private networking (`railway.internal`) AND Railway's internal health checks? Is there a different host address (other than :: or [::]) or a specific flag required for Uvicorn in this environment to achieve robust dual-stack binding that satisfies both needs?

Disabling the health check works as a temporary workaround, but isn't ideal for ensuring service reliability.

Thanks for any insights!


Status changed to Open Railway 4 months ago


4 months ago

Hello,

You cannot bind only to :: when using a health check; you need a dual-stack bind. For that I would recommend granian:

https://github.com/emmett-framework/granian
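
To confirm that a server really is dual-stack bound after switching, one option is to probe the health path over both address families from inside the container. This is a minimal sketch using only the standard library. `probe` and `probe_both_families` are hypothetical helpers, the loopback addresses stand in for whatever address the health checker actually uses (not stated in this thread), and `/health` is the path from the posts above.

```python
import http.client


def probe(host: str, port: int, path: str = "/health", timeout: float = 3.0):
    """Return the HTTP status for GET path, or None if the request fails."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # Refused, unreachable, timed out, or not a valid HTTP response.
        return None
    finally:
        conn.close()


def probe_both_families(port: int, path: str = "/health") -> dict:
    """Probe the loopback address of each family on the given port."""
    return {
        "ipv4": probe("127.0.0.1", port, path),
        "ipv6": probe("::1", port, path),
    }
```

A dual-stack listener should answer on both entries. A server bound only to `::` with IPv4 disabled would be expected to return `None` for `ipv4`, and one bound only to `0.0.0.0` would return `None` for `ipv6`.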


javierortegap
PRO

4 months ago

That worked, thanks!

