Health check failed
mvpleilao
HOBBYOP

a year ago

We are experiencing persistent healthcheck failures with our FastAPI application on Railway. Despite multiple attempts to fix the issue, the healthcheck continues to fail, preventing successful deployment of the application.

Configuration Details

  1. Application: FastAPI API (Python)
  2. Railway Configuration:
toml

    [deploy]
    startCommand = "uvicorn api.main:app --host 0.0.0.0 --port $PORT"
    healthcheckPath = "/railway-health"
    healthcheckTimeout = 300
    healthcheckInterval = 15
    healthcheckGracePeriod = 60

  3. Healthcheck Endpoint:

python

    @app.get("/railway-health")
    def railway_healthcheck():
        """Extremely simple endpoint for Railway healthcheck"""
        return {"status": "ok"}
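For anyone reproducing this, a small probe can confirm what the endpoint returns before deploying. The sketch below uses only the standard library and asks whether a GET on the healthcheck path comes back with 200, which is the response the endpoint above is meant to give. The function names `probe` and `is_healthy` and the local port 8000 in the usage note are assumptions for illustration, not part of the original post.

```python
import urllib.error
import urllib.request
from typing import Optional


def probe(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for a GET on url, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 503.
        return exc.code
    except OSError:
        # Connection refused, DNS failure, or timeout: no usable answer.
        return None


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Treat only a 200 response as healthy."""
    return probe(url, timeout) == 200
```

With the app running locally (for example on port 8000, an assumed value), `is_healthy("http://127.0.0.1:8000/railway-health")` should return True. A result of None from `probe` means nothing answered on that host and port, which points at a port or binding problem rather than at the endpoint code.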

Solution Attempts

  1. Dedicated healthcheck endpoint implementation:
    • Created a simple /health endpoint that always returns a 200 OK status
    • Later created an even simpler /railway-health endpoint specifically for Railway
    • Both endpoints are independent of any environment variables or external services
  2. Application resilience improvements:
    • Modified the get_supabase_client() function to return None instead of throwing exceptions when credentials are unavailable
    • Updated endpoints to explicitly check if the Supabase client is available
    • Implemented proper error handling with 503 status when services are unavailable
  3. Railway configuration adjustments:
    • Updated railway.toml to use the dedicated healthcheck endpoint
    • Increased healthcheck timeout to 300 seconds
    • Added a 15-second interval between checks
    • Configured a 60-second grace period for initialization
  4. Local testing:
    • Tested the healthcheck endpoints locally and confirmed they are working correctly
    • Verified that the application starts without errors even without Supabase environment variables
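The resilience changes in item 2 could look roughly like the sketch below. It is not the poster's code: the environment variable names `SUPABASE_URL` and `SUPABASE_KEY`, the `FakeSupabaseClient` stand-in, and the `list_items` handler are all hypothetical. Handlers return plain `(status, body)` tuples so the idea can be shown without a web framework.

```python
import os
from typing import Mapping, Optional, Tuple


class FakeSupabaseClient:
    """Hypothetical stand-in for a real Supabase client."""

    def __init__(self, url: str, key: str) -> None:
        self.url = url
        self.key = key

    def fetch_items(self) -> list:
        return []


def get_supabase_client(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[FakeSupabaseClient]:
    """Return a client, or None when credentials are missing (no exception)."""
    env = os.environ if env is None else env
    url = env.get("SUPABASE_URL")  # assumed variable name
    key = env.get("SUPABASE_KEY")  # assumed variable name
    if not url or not key:
        return None
    return FakeSupabaseClient(url, key)


def list_items(client: Optional[FakeSupabaseClient]) -> Tuple[int, dict]:
    """Data endpoint: answer 503 when the client is unavailable."""
    if client is None:
        return 503, {"detail": "Supabase client unavailable"}
    return 200, {"items": client.fetch_items()}


def healthcheck() -> Tuple[int, dict]:
    """Health endpoint: never touches Supabase or environment variables."""
    return 200, {"status": "ok"}
```

The point of the split is that a missing credential degrades the data endpoints to 503 while the health endpoint keeps answering 200, so a failing healthcheck in this setup is unlikely to be caused by Supabase.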
LOG ERROR:

    ====================
    Jun 10 18:07:55 Starting Healthcheck
    Jun 10 18:07:55 ====================
    Jun 10 18:07:55
    Jun 10 18:07:55 Path: /health
    Jun 10 18:07:55 Retry window: 1m40s
    Jun 10 18:07:55
    Jun 10 18:08:06 Attempt #1 failed with service unavailable. Continuing to retry for 1m29s
    Jun 10 18:08:17 Attempt #2 failed with service unavailable. Continuing to retry for 1m18s
    Jun 10 18:08:19 Attempt #3 failed with service unavailable. Continuing to retry for 1m16s
    Jun 10 18:08:24 Attempt #4 failed with service unavailable. Continuing to retry for 1m11s
    Jun 10 18:08:32 Attempt #5 failed with service unavailable. Continuing to retry for 1m3s
    Jun 10 18:08:48 Attempt #6 failed with service unavailable. Continuing to retry for 47s
    Jun 10 18:09:18 Attempt #7 failed with service unavailable. Continuing to retry for 17s
    Jun 10 18:09:18
    Jun 10 18:09:18 1/1 replicas never became healthy!
    Jun 10 18:09:18 Healthcheck failed!
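One detail in the log above is worth checking against the configuration: the log reports Path: /health and a retry window of 1m40s (100 seconds), while the railway.toml in the post sets /railway-health and a timeout of 300. The sketch below makes that comparison mechanical. It assumes the retry window in the log corresponds to healthcheckTimeout; the helper names are hypothetical, and the thread does not confirm why the values differ.

```python
import re

# Values copied from the post: the railway.toml settings and two log lines.
CONFIG = {"healthcheckPath": "/railway-health", "healthcheckTimeout": 300}
LOG = """Path: /health
Retry window: 1m40s"""


def parse_duration(text: str) -> int:
    """Convert a duration such as '1m40s' or '47s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", text)
    if not text or match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    seconds = int(match.group(2) or 0)
    return minutes * 60 + seconds


def reported_settings(log: str) -> dict:
    """Extract the path and retry window that the healthcheck log reports."""
    path = re.search(r"Path:\s*(\S+)", log).group(1)
    window = re.search(r"Retry window:\s*(\S+)", log).group(1)
    return {"path": path, "retry_window_s": parse_duration(window)}


def mismatches(config: dict, log: str) -> list:
    """List the settings where the log disagrees with the configuration."""
    seen = reported_settings(log)
    problems = []
    if seen["path"] != config["healthcheckPath"]:
        problems.append(
            f"path: configured {config['healthcheckPath']}, log shows {seen['path']}"
        )
    if seen["retry_window_s"] != config["healthcheckTimeout"]:
        problems.append(
            f"timeout: configured {config['healthcheckTimeout']}s, "
            f"log shows {seen['retry_window_s']}s"
        )
    return problems
```

Calling `mismatches(CONFIG, LOG)` on the values from the post reports both settings as different. One possible reading is that the settings in effect for this deployment were not the ones in railway.toml, but that is an inference from the log, not something stated in the thread.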
Solved · $10 Bounty

6 Replies

mvpleilao
HOBBYOP

a year ago

We would like to request assistance to:

  1. Verify if there is any issue with the healthcheck configuration in railway.toml
  2. Confirm if Railway is trying to access the correct endpoint for the healthcheck
  3. Suggest additional configurations that might solve the problem
  4. Provide detailed logs of the deployment attempt to help with debugging

Thank you in advance for your assistance.


a year ago

Would it be possible to share the repository?


mvpleilao
HOBBYOP

a year ago

What do you mean? The link or the files?


mvpleilao

What do you mean? The link or the files?

a year ago

The link to the GitHub repository.


mvpleilao
HOBBYOP

a year ago

Hey guys!

It was a port conflict.

Somehow, after I hardcoded the port and then reverted it to a variable in Railway, it worked fine.
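The thread does not say exactly what conflicted, but the fix amounts to letting the app listen on whatever port Railway supplies in the PORT variable, as the startCommand in the original post already intends. A minimal sketch of that idea, assuming a fallback of 8000 for local runs and a hypothetical `resolve_port` helper:

```python
import os


def resolve_port(default: int = 8000) -> int:
    """Read the listening port from PORT, falling back to a local default."""
    raw = os.environ.get("PORT")
    if raw is None or raw.strip() == "":
        return default  # assumed local default, not a Railway value
    port = int(raw)
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return port


def main() -> None:
    # Imported here so resolve_port() can be used without uvicorn installed.
    import uvicorn

    # Bind to all interfaces on the resolved port, mirroring the startCommand.
    uvicorn.run("api.main:app", host="0.0.0.0", port=resolve_port())
```

Calling `main()` starts the server the same way the startCommand does. The design choice is that the port has a single source of truth: if the value is also hardcoded somewhere else, the app can end up listening on a different port than the one being probed, which would produce "service unavailable" on every attempt.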


sarahkb125
EMPLOYEE

a year ago

Sounds good, let us know if you have any other issues!


Status changed to Awaiting User Response Railway 11 months ago


Status changed to Solved sarahkb125 11 months ago

