16 days ago
Issue Description:
Our Django 4.2.23 application with ASGI/WebSocket support returns 502 "Application failed to respond" errors when deployed on Railway, even though the build succeeds.
Error Details:
- HTTP 502 responses on all endpoints (including /api/health/ and /admin/)
- Error message: "Application failed to respond"
- Application builds successfully but fails to respond to requests
- Railway edge reports x-railway-fallback: true
Configuration:
- Python 3.11.13 (runtime.txt specified)
- Django 4.2.23 with Channels for WebSocket support
- Daphne ASGI server for production
- PostgreSQL database (Railway provided)
- Redis instance (Railway provided)
Current Procfile:
web: start_production.sh
release: python manage.py migrate --noinput
Production Start Script:
Our start_production.sh includes:
- Static file collection
- Database connection testing
- Environment validation
- Daphne server startup with: daphne -b 0.0.0.0 -p $PORT lune_backend.asgi:application
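The start script itself isn't shown, so here is only a rough sketch of what the "environment validation" step could look like if done in Python before Daphne starts. The variable names DATABASE_URL and REDIS_URL and the function names are assumptions for illustration, not taken from the repository; PORT is the variable already used in the Daphne command above.

```python
import os

# Assumed names: PORT appears in the Daphne command in this post;
# DATABASE_URL and REDIS_URL are hypothetical and should be replaced
# with whatever the settings module actually reads.
REQUIRED_VARS = ("PORT", "DATABASE_URL", "REDIS_URL")


def find_env_problems(env, required=REQUIRED_VARS):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = [f"{name} is not set" for name in required if not env.get(name)]
    port = env.get("PORT", "")
    # Only check the format when PORT is present; absence is reported above.
    if port and not (port.isdecimal() and 0 < int(port) < 65536):
        problems.append(f"PORT is not a valid TCP port: {port!r}")
    return problems


def main():
    """Print every problem to stdout and return a shell-style exit code."""
    issues = find_env_problems(os.environ)
    for issue in issues:
        # flush=True so the line reaches the deployment logs before exit
        print(f"[startup] {issue}", flush=True)
    return 1 if issues else 0
```

Calling `main()` from the start script and exiting on a non-zero result would make a missing variable show up in the logs as an explicit message rather than as a silent 502.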
Troubleshooting Attempted:
1. Verified local Docker container works correctly
2. Confirmed health endpoints respond with proper X-Forwarded-Proto headers
3. Tested both Procfile approaches:
- Direct: web: daphne -b 0.0.0.0 -p $PORT lune_backend.asgi:application
- Script-based: web: ./scripts/start_production.sh
4. Verified script permissions and execution
5. Confirmed environment variables are set correctly
6. Database migrations complete successfully in release phase
Questions:
1. Are there specific requirements for Django ASGI applications on Railway?
2. Should we use a different ASGI server (uvicorn vs daphne)?
3. Are there Railway-specific port binding or network requirements?
4. How can we access deployment logs to debug the 502 errors?
5. Are there known issues with Django Channels + Railway deployment?
Repository:
https://github.com/2Lune/2lune-backend (private - can provide access if needed)
Expected Behavior:
Application should respond to HTTP requests and serve the Django admin panel and API endpoints.
Additional Context:
- Application includes WebSocket support via Django Channels
- Uses Redis for caching and channel layers
- Requires HTTPS in production (configured correctly)
- Includes custom health check middleware for Railway
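The custom health check middleware isn't included in the post. For comparison, a minimal sketch of one way to answer the health path at the ASGI layer, before the request reaches Django at all, might look like this. The function name `health_check_wrapper` and the JSON body are hypothetical; only the path /api/health/ comes from the post.

```python
HEALTH_PATH = "/api/health/"  # path taken from the post


def health_check_wrapper(app, path=HEALTH_PATH):
    """Wrap an ASGI app so HTTP requests to `path` get a 200 directly."""

    async def wrapped(scope, receive, send):
        if scope["type"] == "http" and scope.get("path") == path:
            body = b'{"status": "ok"}'
            await send({
                "type": "http.response.start",
                "status": 200,
                "headers": [
                    (b"content-type", b"application/json"),
                    (b"content-length", str(len(body)).encode()),
                ],
            })
            await send({"type": "http.response.body", "body": body})
            return
        # Everything else (other HTTP paths, websocket, lifespan)
        # is passed through to the wrapped application unchanged.
        await app(scope, receive, send)

    return wrapped
```

In asgi.py this would wrap whatever object is currently exported, for example `application = health_check_wrapper(application)` after the existing router is built. The design point is that a probe answered here does not depend on Django's host validation, the database, or Redis, so a failing health check then points at the server process or port binding rather than at the app.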
Railway.toml:
```
# railway.toml
[build]
builder = "Dockerfile"
[deploy]
# Use minimal production startup script (no DB migrations or roles needed)
startCommand = "./scripts/start_production.sh"
[healthcheck]
path = "/api/health/"
interval = 15
timeout = 20
retries = 8
# Custom middleware handles Railway's internal health checks
startPeriod = 30
```
Dockerfile:
# Health check to ensure the application is running
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:$PORT/api/health/ || exit 1
# Run the application with our production startup script
CMD ["./scripts/start_production.sh"]
2 Replies
15 days ago
Could you please share your Railway deployment logs?
In the meantime, a quick thing to try: simplify your Procfile to run Daphne (or even Uvicorn) directly like this:
web: daphne -b 0.0.0.0 -p $PORT lune_backend.asgi:application
This helps rule out problems in your startup script. Also, double-check your Redis env vars and try increasing the startPeriod in your railway.toml to 60 seconds to give your app more time to start before health checks kick in.
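If it helps while testing the container locally, here is a small stdlib sketch for checking whether anything is actually accepting TCP connections on the port Daphne was told to bind. The function name `port_is_open` is just an illustration, and this only confirms that the socket is listening, not that Django responds correctly.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError on refusal or timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_open("127.0.0.1", int(os.environ["PORT"]))` (with `import os`) run inside the container a few seconds after startup would show whether the server process got as far as binding.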