a month ago
Our FastAPI application becomes completely unresponsive multiple times per day. The process hangs silently - no error logs, no exceptions, no crash messages. The healthcheck endpoint stops responding, Railway returns 499 errors (client closed request - proxy timed out waiting for response), and manual restart is required to recover.
This has happened at least 4 times today:
- ~01:00 UTC (30 min outage)
- ~19:00 UTC
- ~22:05 UTC
- ~22:33 UTC
From application logs: Nothing - complete silence. Last log entry is a successful healthcheck, then no logs until manual restart.
Logs - Example timeline (22:33 UTC incident):
2026-01-27T22:31:04.471Z [INFO] event="GET /healthcheck" latency_ms=1.53 status_code=200
2026-01-27T22:32:04.855Z [INFO] event="GET /healthcheck" latency_ms=1.02 status_code=200
2026-01-27T22:33:05.532Z [INFO] event="GET /healthcheck" latency_ms=1.36 status_code=200
<-- NO MORE LOGS - APP HUNG -->
No error messages, no exceptions - the process is alive but completely unresponsive.
Stack
- Framework: FastAPI + Uvicorn
- Database: Supabase PostgreSQL (direct connection port 5432)
- Background jobs: Celery + Redis
- Python: 3.13
We checked the DB and that doesn't seem to be responsive.
Any idea what this might be?
Pinned Solution
a month ago
Seen this before, and it's usually not Railway. FastAPI is likely getting stuck in a blocking call (DB, Redis, or Celery). When the event loop blocks, the process stays alive with no errors and no logs; the healthcheck just hangs and Railway returns 499. Common fixes: add timeouts to Postgres and Redis, make sure the DB driver is async, and don't run a single Uvicorn worker. A restart works because it clears the stuck connection. Python 3.13 can also make this worse.
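To make the failure mode concrete, here is a minimal stdlib-only sketch (not the poster's code). `blocking_query` is a hypothetical stand-in for any synchronous DB/Redis call, and `heartbeat` plays the role of the healthcheck. It compares how long the loop goes silent when the call runs directly inside an `async` handler versus when it is pushed to a thread with a bounded wait:

```python
import asyncio
import contextlib
import time


def blocking_query(seconds: float) -> str:
    """Hypothetical stand-in for a synchronous DB/Redis/HTTP call."""
    time.sleep(seconds)  # blocks whichever thread calls it
    return "rows"


async def heartbeat(ticks: list[float], interval: float = 0.05) -> None:
    """Record a timestamp every `interval` seconds, like a healthcheck would."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def handler_blocking() -> str:
    # Bad: the sync call runs on the event loop thread, so nothing else runs.
    return blocking_query(0.5)


async def handler_offloaded(timeout: float = 2.0) -> str:
    # Better: run the sync call in a worker thread and bound the wait.
    # Note: on timeout the thread itself keeps running until the call returns,
    # so driver-level timeouts are still needed.
    return await asyncio.wait_for(asyncio.to_thread(blocking_query, 0.5), timeout)


async def max_gap(handler) -> float:
    """Return the longest pause between heartbeats while `handler` runs."""
    ticks: list[float] = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0.1)  # let the heartbeat start ticking
    await handler()
    await asyncio.sleep(0.1)  # let it tick again afterwards
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task
    return max(b - a for a, b in zip(ticks, ticks[1:]))


if __name__ == "__main__":
    print(f"blocking:  {asyncio.run(max_gap(handler_blocking)):.2f}s gap")
    print(f"offloaded: {asyncio.run(max_gap(handler_offloaded)):.2f}s gap")
```

In the blocking case the heartbeat stalls for roughly the whole duration of the call, which is what a hung healthcheck looks like from the outside. In the offloaded case it keeps ticking at its normal interval.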
2 Replies
a month ago
thanks! it was indeed a blocking request sorta blocking the whole thing
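For anyone landing here with the same silent hang: one way to find which call is blocking is a small watchdog built on the stdlib `faulthandler` module. This is only a sketch. `loop_watchdog` and its `stall_seconds` parameter are hypothetical names, not part of FastAPI or Uvicorn. The task keeps re-arming a timer from inside the event loop; if the loop stalls for longer than the threshold, the timer fires and dumps every thread's traceback to the given file.

```python
import asyncio
import contextlib
import faulthandler
import sys
import time


async def loop_watchdog(stall_seconds: float = 10.0, file=sys.stderr) -> None:
    """Dump all thread tracebacks if the event loop stalls for `stall_seconds`.

    `file` must have a real file descriptor and stay open while this runs.
    """
    try:
        while True:
            # Each call replaces the previous timer and resets the countdown.
            # If the loop is blocked, this line is not reached in time and
            # faulthandler writes the tracebacks from its own watchdog thread.
            faulthandler.dump_traceback_later(stall_seconds, file=file)
            await asyncio.sleep(stall_seconds / 2)
    finally:
        faulthandler.cancel_dump_traceback_later()


async def _demo() -> None:
    task = asyncio.create_task(loop_watchdog(stall_seconds=0.2))
    await asyncio.sleep(0.05)  # let the watchdog arm itself
    time.sleep(0.6)            # simulate a blocking call on the loop thread
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task


if __name__ == "__main__":
    asyncio.run(_demo())  # a traceback pointing at _demo appears on stderr
```

In a real app the task would presumably be started once at startup (for example with `asyncio.create_task` from a startup hook) with a threshold of a few seconds, so the next hang leaves a traceback in the logs instead of silence.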
Status changed to Open brody • about 1 month ago
Status changed to Solved brody • about 1 month ago