Django ASGI Application 502 Errors - Procfile and Port Binding Issues - Health Check failure despite successful build

albinxxx

FREEOP

5 months ago

Issue Description:

Our Django 4.2.23 application with ASGI/WebSocket support is experiencing 502 "Application failed to respond" errors on Railway deployment, despite successful builds.

Error Details:

- HTTP 502 responses on all endpoints (including /api/health/ and /admin/)

- Error message: "Application failed to respond"

- Application builds successfully but fails to respond to requests

- Railway edge reports x-railway-fallback: true

Configuration:

- Python 3.11.13 (runtime.txt specified)

- Django 4.2.23 with Channels for WebSocket support

- Daphne ASGI server for production

- PostgreSQL database (Railway provided)

- Redis instance (Railway provided)

Current Procfile:
web: start_production.sh
release: python manage.py migrate --noinput
Production Start Script:

Our start_production.sh includes:

- Static file collection

- Database connection testing

- Environment validation

- Daphne server startup with: daphne -b 0.0.0.0 -p $PORT lune_backend.asgi:application
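The environment validation and Daphne startup steps above could be sketched roughly as follows. This is not the actual `start_production.sh`; it is a minimal Python illustration, and the helper name `build_daphne_command` is hypothetical. It assumes the platform injects `PORT` as an environment variable and that the server must listen on `0.0.0.0` to be reachable from outside the container.

```python
import os


def build_daphne_command(env):
    """Return the Daphne argv for the given environment mapping.

    Raises ValueError if PORT is missing or not a valid TCP port, so a
    misconfigured deployment fails loudly instead of binding nowhere.
    (Hypothetical helper, not part of the original start script.)
    """
    raw_port = env.get("PORT", "")
    try:
        port = int(raw_port)
    except ValueError:
        raise ValueError(f"PORT must be an integer, got {raw_port!r}") from None
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    # Bind to all interfaces; a server bound only to 127.0.0.1 cannot be
    # reached by a proxy running outside the container.
    return ["daphne", "-b", "0.0.0.0", "-p", str(port),
            "lune_backend.asgi:application"]


# A launcher script would then hand the process over to Daphne:
#     argv = build_daphne_command(os.environ)
#     os.execvp(argv[0], argv)
```

Logging the resolved command before handing over to Daphne makes it easier to confirm from the deploy logs which port the process actually tried to bind.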

Troubleshooting Attempted:

1. Verified local Docker container works correctly

2. Confirmed health endpoints respond with proper X-Forwarded-Proto headers

3. Tested both Procfile approaches:

- Direct: web: daphne -b 0.0.0.0 -p $PORT lune_backend.asgi:application

- Script-based: web: ./scripts/start_production.sh

4. Verified script permissions and execution

5. Confirmed environment variables are set correctly

6. Database migrations complete successfully in release phase

Questions:

1. Are there specific requirements for Django ASGI applications on Railway?

2. Should we use a different ASGI server (uvicorn vs daphne)?

3. Are there Railway-specific port binding or network requirements?

4. How can we access deployment logs to debug the 502 errors?

5. Are there known issues with Django Channels + Railway deployment?

Repository:

https://github.com/2Lune/2lune-backend (private - can provide access if needed)

Expected Behavior:

Application should respond to HTTP requests and serve the Django admin panel and API endpoints.

Additional Context:

- Application includes WebSocket support via Django Channels

- Uses Redis for caching and channel layers

- Requires HTTPS in production (configured correctly)

- Includes custom health check middleware for Railway
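For readers unfamiliar with what a health check middleware might look like at the ASGI level, here is a minimal sketch. It is not the middleware used in this project; the class name `HealthCheckMiddleware` is hypothetical, and it assumes the health path is `/api/health/` as in the configuration shown in this post. It answers the health path directly, without touching the database or Redis, and passes every other request through.

```python
class HealthCheckMiddleware:
    """Answer any HTTP request to the health path with a plain 200.

    Hypothetical sketch: wraps an ASGI application and short-circuits the
    health path before Django (and its database or Redis access) runs.
    """

    def __init__(self, app, path="/api/health/"):
        self.app = app
        self.path = path

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http" and scope.get("path") == self.path:
            await send({
                "type": "http.response.start",
                "status": 200,
                "headers": [(b"content-type", b"text/plain")],
            })
            await send({"type": "http.response.body", "body": b"ok"})
            return
        # All other traffic (HTTP, WebSocket, lifespan) passes through.
        await self.app(scope, receive, send)
```

In an `asgi.py` this would typically wrap the outermost application object, for example `application = HealthCheckMiddleware(application)`. A check that bypasses Django only proves the server process is listening; it says nothing about database or Redis connectivity.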

Railway.toml:
```
# railway.toml

[build]
builder = "Dockerfile"

[deploy]
# Use minimal production startup script (no DB migrations or roles needed)
startCommand = "./scripts/start_production.sh"

[healthcheck]
path = "/api/health/"
interval = 15
timeout = 20
retries = 8
# Custom middleware handles Railway's internal health checks
startPeriod = 30
```
Dockerfile:

# Health check to ensure the application is running
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:$PORT/api/health/ || exit 1

# Run the application with our production startup script
CMD ["./scripts/start_production.sh"]

$10 Bounty

2 Replies

idiegea21

HOBBY

5 months ago

Could you please share your Railway deployment logs?

In the meantime, a quick thing to try: simplify your Procfile to run Daphne (or even Uvicorn) directly like this:

web: daphne -b 0.0.0.0 -p $PORT lune_backend.asgi:application

This helps rule out problems in your startup script. Also, double-check your Redis env vars and try increasing the startPeriod in your railway.toml to 60 seconds to give your app more time to start before health checks kick in.
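One way to act on the Redis and startup suggestions above is a small connectivity probe run from inside the container. The sketch below is only an illustration: the function names are hypothetical, and it assumes the Redis connection string is available as a `redis://` or `rediss://` URL (the exact environment variable name depends on your setup).

```python
import socket
from urllib.parse import urlparse


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def redis_endpoint(url):
    """Extract (host, port) from a Redis URL, defaulting to port 6379."""
    parsed = urlparse(url)
    if parsed.scheme not in ("redis", "rediss") or not parsed.hostname:
        raise ValueError(f"not a Redis URL: {url!r}")
    return parsed.hostname, parsed.port or 6379


# Example usage, assuming the URL is stored in an env var named REDIS_URL:
#     import os
#     host, port = redis_endpoint(os.environ["REDIS_URL"])
#     print("redis reachable:", can_connect(host, port))
#     print("app listening:", can_connect("127.0.0.1", int(os.environ["PORT"])))
```

If the app port probe fails while the process is running, the server is probably bound to a different port or interface than the one the platform expects.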