Health check failed

mvpleilao (HOBBY)

2 days ago

We are experiencing persistent healthcheck failures with our FastAPI application on Railway. Despite multiple attempts to fix the issue, the healthcheck continues to fail, preventing successful deployment of the application.

Configuration Details

Application: FastAPI API (Python)
Railway Configuration:
[deploy]
startCommand = "uvicorn api.main:app --host 0.0.0.0 --port $PORT"
healthcheckPath = "/railway-health"
healthcheckTimeout = 300
healthcheckInterval = 15
healthcheckGracePeriod = 60
Healthcheck Endpoint:
@app.get("/railway-health")
def railway_healthcheck():
    """Extremely simple endpoint for Railway healthcheck"""
    return {"status": "ok"}

Solution Attempts

Dedicated healthcheck endpoint implementation:
- Created a simple `/health` endpoint that always returns a 200 OK status
- Later created an even simpler `/railway-health` endpoint specifically for Railway
- Both endpoints are independent of any environment variables or external services
Application resilience improvements:
- Modified the `get_supabase_client()` function to return `None` instead of throwing exceptions when credentials are unavailable
- Updated endpoints to explicitly check if the Supabase client is available
- Implemented proper error handling with 503 status when services are unavailable
Railway configuration adjustments:
- Updated railway.toml to use the dedicated healthcheck endpoint
- Increased healthcheck timeout to 300 seconds
- Added a 15-second interval between checks
- Configured a 60-second grace period for initialization
Local testing:
- Tested the healthcheck endpoints locally and confirmed they are working correctly
- Verified that the application starts without errors even without Supabase environment variables
  
  LOG ERROR:
  ====================
  Jun 10 18:07:55  Starting Healthcheck
  Jun 10 18:07:55  ====================
  Jun 10 18:07:55  Path: /health
  Jun 10 18:07:55  Retry window: 1m40s
  Jun 10 18:08:06  Attempt #1 failed with service unavailable. Continuing to retry for 1m29s
  Jun 10 18:08:17  Attempt #2 failed with service unavailable. Continuing to retry for 1m18s
  Jun 10 18:08:19  Attempt #3 failed with service unavailable. Continuing to retry for 1m16s
  Jun 10 18:08:24  Attempt #4 failed with service unavailable. Continuing to retry for 1m11s
  Jun 10 18:08:32  Attempt #5 failed with service unavailable. Continuing to retry for 1m3s
  Jun 10 18:08:48  Attempt #6 failed with service unavailable. Continuing to retry for 47s
  Jun 10 18:09:18  Attempt #7 failed with service unavailable. Continuing to retry for 17s
  Jun 10 18:09:18  1/1 replicas never became healthy!
  Jun 10 18:09:18  Healthcheck failed!
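
A retry loop like the one in this log can be approximated locally to check that the app actually answers on the expected port and path. The sketch below is not Railway's implementation; the function name `wait_until_healthy`, the 5-second interval, and the per-request timeout are assumptions, and the 100-second default only mirrors the "Retry window: 1m40s" line.

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(url: str, retry_window: float = 100.0, interval: float = 5.0) -> bool:
    """Poll url until it answers HTTP 200 or the retry window runs out."""
    deadline = time.monotonic() + retry_window
    attempt = 0
    while time.monotonic() < deadline:
        attempt += 1
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    print(f"Attempt #{attempt} succeeded")
                    return True
                reason = f"HTTP {response.status}"
        except (urllib.error.URLError, OSError) as exc:
            # Covers refused connections, timeouts, and 4xx/5xx responses.
            reason = str(exc)
        remaining = max(0.0, deadline - time.monotonic())
        print(f"Attempt #{attempt} failed ({reason}); retrying for {remaining:.0f}s")
        time.sleep(min(interval, remaining))
    return False
```

For example, after starting the app with `PORT=8000`, calling `wait_until_healthy("http://127.0.0.1:8000/health")` should return `True` quickly; a refused connection on every attempt suggests the server is listening on a different port than the one being probed.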

Solved | $10 Bounty

6 Replies

mvpleilao (HOBBY)

2 days ago

We would like to request assistance to:

- Verify whether there is any issue with the healthcheck configuration in railway.toml
- Confirm whether Railway is trying to access the correct endpoint for the healthcheck
- Suggest additional configurations that might solve the problem
- Provide detailed logs of the deployment attempt to help with debugging

Thank you in advance for your assistance.

loudbook (HOBBY)

2 days ago

Would it be possible to share the repository?

mvpleilao (HOBBY)

2 days ago

What do you mean? The link or the files?

loudbook (HOBBY)

2 days ago

The link to the GitHub repository.

mvpleilao (HOBBY)

a day ago

Hey guys!
It was a port conflict.
Somehow, after I hardcoded the port and then reverted it back to a variable in Railway, it worked fine.
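
The thread does not say where the port was hardcoded, so the following is only one possible way to avoid such a mismatch: resolve the port in a single place from the `PORT` variable that the start command already relies on. The helper names `resolve_port` and `serve` and the fallback of 8000 are assumptions for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_port(env: Optional[Mapping[str, str]] = None, default: int = 8000) -> int:
    """Return the port to bind to, preferring the PORT environment variable."""
    env = os.environ if env is None else env
    raw = env.get("PORT", "").strip()
    if not raw:
        # Local fallback only; the platform is expected to set PORT.
        return default
    port = int(raw)  # raises ValueError on a non-numeric value
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return port


def serve() -> None:
    """Start the app on the resolved port (hypothetical entry point)."""
    import uvicorn  # imported here so resolve_port stays dependency-free

    # "api.main:app" is the import string from the start command in the original post.
    uvicorn.run("api.main:app", host="0.0.0.0", port=resolve_port())
```

Keeping the lookup in one function means the application and the platform cannot silently disagree about the port: either `PORT` is used, or the fallback is visibly in effect.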

sarahkb125 (EMPLOYEE)

a day ago

Sounds good, let us know if you have any other issues!

Status changed to Awaiting User Response railway[bot] • 1 day ago

Status changed to Solved sarahkb125 • 1 day ago