backend-production-8e3e: constant 12-15s TTFB on all external requests, app itself is healthy

jjojjo-bot

HOBBY OP

2 months ago

Project: Feams

Service: backend-production-8e3e (FastAPI + Socket.IO via uvicorn)

Domain: backend-production-8e3e.up.railway.app

Started: ~2026-04-25 09:00 KST

== Symptom ==

Every external HTTPS request has a constant 12-15s TTFB, regardless of endpoint or HTTP version. The mobile app (15s axios timeout) fails on all OAuth login flows, so production users cannot sign in.

== Application is healthy ==

- Logs show requests processed in milliseconds:

INFO: 100.64.0.x - "POST /api/auth/kakao/callback ..." 200 OK

- Internal Railway healthchecks (HEAD/GET /health from 100.64.0.x) succeed instantly.

- DB queries, Kakao/Google API calls, WebSocket frames all complete normally. No errors, OOM, or worker restarts.

== Slowdown is at the edge / proxy layer ==

From multiple external clients (Korea ISP, Anthropic infra), curl shows:

DNS: ~3ms | Connect: ~25-200ms | TLS: ~50-140ms

TTFB: 12.5-15.7s ← constant | Total: 12.5-15.7s
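For anyone who wants to reproduce this breakdown without curl, here is a minimal Python sketch using only the standard library. It assumes the figures above are cumulative from the start of the request, which is how curl's `time_namelookup`, `time_connect`, `time_appconnect` and `time_starttransfer` write-out variables behave. The function name `time_phases` and its parameters are made up for illustration.

```python
import socket
import ssl
import time


def time_phases(host, port=443, path="/", use_tls=True, timeout=30.0):
    """Return cumulative timings (seconds) for one GET, similar to curl -w."""
    t0 = time.perf_counter()

    # DNS: resolve the hostname and take the first address returned.
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM)[0]
    t_dns = time.perf_counter()

    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        # TCP connect to the resolved address.
        sock.connect(addr)
        t_conn = time.perf_counter()

        # TLS handshake (skipped for plain HTTP, e.g. a local test server).
        if use_tls:
            ctx = ssl.create_default_context()
            sock = ctx.wrap_socket(sock, server_hostname=host)
        t_tls = time.perf_counter()

        # Send a minimal request and block until the first response byte.
        req = (f"GET {path} HTTP/1.1\r\n"
               f"Host: {host}\r\n"
               "Connection: close\r\n\r\n")
        sock.sendall(req.encode("ascii"))
        first = sock.recv(1)
        t_first = time.perf_counter()
    finally:
        sock.close()

    return {
        "dns": t_dns - t0,
        "connect": t_conn - t0,
        "tls": t_tls - t0,
        "ttfb": t_first - t0,
        "got_data": bool(first),
    }
```

Calling `time_phases("backend-production-8e3e.up.railway.app")` from an affected network should, if it matches the curl runs, show small `dns`, `connect` and `tls` values and a `ttfb` above 12 seconds.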

- Same on HTTP/1.1 and HTTP/2.

- A request to a unique non-existent path (/brandon-probe-1777076633) takes 12.7s and returns 404, but the path NEVER appears in the application access logs: the request is not forwarded to origin until the edge times out.

- Other services on the same network (Vercel, Google, GitHub, Cloudflare R2) respond in <1s. Only this Railway hostname is affected. Public IP resolves to 151.101.2.15 (Fastly).
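The unique-path probe above can be sketched roughly as follows: generate a path that no real route or earlier request would use, request it from outside, then search the application's access log for it. Both helper names are hypothetical, and how you obtain the log lines (dashboard export, CLI, file) is left open.

```python
import time


def make_probe_path(prefix="probe"):
    """Build a path unlikely to collide with real routes or past requests."""
    return f"/{prefix}-{int(time.time())}"


def reached_origin(probe_path, access_log_lines):
    """True if any access-log line mentions the probe path."""
    return any(probe_path in line for line in access_log_lines)
```

If the external request returns a response but `reached_origin` stays false for the log lines covering that window, the response did not come from the application.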

== Reproduction (09:52 KST, immediately after redeploy) ==

Run 1: DNS:0.002s Conn:0.008s SSL:0.116s TTFB:12.684s [200]

Run 2: DNS:0.110s Conn:0.118s SSL:0.139s TTFB:12.608s [200]

Run 3: DNS:0.004s Conn:0.011s SSL:FAIL — curl exit 35

(TLS handshake to edge IP intermittently fails before any data reaches origin)
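To make the arithmetic explicit, the sketch below splits Runs 1 and 2 into per-phase durations. It assumes the four figures in each run are cumulative from the start of the request (as curl's write-out timers are); the function name `phase_deltas` is made up for illustration.

```python
def phase_deltas(dns, conn, ssl_done, ttfb):
    """Split cumulative timings (seconds) into per-phase durations."""
    return {
        "dns": dns,
        "tcp": conn - dns,
        "tls": ssl_done - conn,
        "wait": ttfb - ssl_done,  # time spent waiting for the first byte
    }


# Runs 1 and 2 from the reproduction above (Run 3 failed during TLS).
runs = [
    (0.002, 0.008, 0.116, 12.684),
    (0.110, 0.118, 0.139, 12.608),
]
for i, run in enumerate(runs, start=1):
    d = phase_deltas(*run)
    share = d["wait"] / run[3]
    print(f"Run {i}: wait={d['wait']:.3f}s ({share:.1%} of TTFB)")
```

In both runs the wait after the TLS handshake comes out at roughly 12.5s, about 99% of the TTFB, so DNS, TCP and TLS setup together account for only a small fraction of the delay.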

== What we tried ==

- Redeploy #1 (~09:32 KST): briefly restored ~500ms TTFB for ~5 min, then reverted to 12s.

- Redeploy #2 (~09:50 KST): no recovery at all. TTFB stayed at 12-15s.

Pattern is worsening.

- Application code unchanged for 2+ days.

== Request ==

Please investigate edge / Fastly POP routing for this service.

Symptom (constant 12s TTFB irrespective of endpoint, request sometimes never reaching origin, intermittent TLS failures) is consistent with an edge-side timeout/retry pattern or a stale origin pool entry, not application load.

=== Decisive evidence: Container-internal measurement ===

Just measured localhost from inside the running container via railway ssh:

Run 1: 12.5ms HTTP 200 (first request, includes TCP setup)

Run 2: 0.9ms HTTP 200

Run 3: 0.6ms HTTP 200

Run 4: 0.5ms HTTP 200

Run 5: 0.8ms HTTP 200

/api/me: 0.7ms HTTP 401

External requests to the same endpoint, same time, same instance:

TTFB: 12.5-15.7s

The application responds in <1ms. The entire 12-15s delay is between the external client and the application, purely in Railway's edge / proxy / routing layer. Application code, worker concurrency, DB, and external API calls are NOT the cause.
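A container-internal check like the one above could look roughly like this in Python, in case curl is not available inside the image. The port is an assumption (use whatever port uvicorn listens on), `/health` is taken from the healthcheck mentioned earlier, and `measure_local` is a hypothetical helper name.

```python
import http.client
import time


def measure_local(port, path="/health", runs=5, host="127.0.0.1"):
    """Time sequential GETs; returns a list of (milliseconds, status)."""
    results = []
    for _ in range(runs):
        start = time.perf_counter()
        # A fresh connection per run, so each timing includes TCP setup.
        conn = http.client.HTTPConnection(host, port, timeout=5)
        try:
            conn.request("GET", path)
            resp = conn.getresponse()
            resp.read()  # drain the body so the timing covers the full response
        finally:
            conn.close()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        results.append((elapsed_ms, resp.status))
    return results
```

Running `measure_local(8000)` inside the container (assuming port 8000) and comparing the result with an external measurement taken at the same time gives the same kind of side-by-side as the numbers above.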

$10 Bounty

11 Replies

Status changed to Awaiting Railway Response Railway • 2 months ago

Status changed to Open Railway • 2 months ago

erickim713

HOBBY

2 months ago

same problem here.

recoyang

PRO

2 months ago

same problem

harrythomson1

HOBBY

2 months ago

+1 also having the same issues.

Currently on a hobby plan. Located in Vietnam.

Have deployed via Singapore and US West and still get the same data points.

Tested on a simple health endpoint and all requests are taking 12 seconds minimum.

briankoey

PRO

2 months ago

Hi guys, we are currently suffering massive losses in our production environment. It is a slow-networking issue: high TTFB on all external requests, while our app itself is healthy.

Our customer is angry. Railway team, please take a look immediately.

recoyang

PRO

2 months ago

Exactly the same here; very bad user experience for my customers.

recoyang

PRO

2 months ago

I'm even starting to consider switching to another platform as a backup plan.

briankoey

PRO

2 months ago

Why has this issue been open for 5 hours without a single Railway support person looking into it?

erickim713

HOBBY

2 months ago

I was actually surprised when it hit the 1 hour mark and not a single engineer / support person had replied, either to say they were looking at it or that it was being solved. I kept quiet because I'm only paying for the hobby plan, but this is getting annoying -_-

0x5b62656e5d

MODERATOR

2 months ago

Is serverless enabled?

rhaoio

HOBBY

2 months ago

not for me

osamaa-mustafa

PRO

2 months ago

Same issue. Railway team must look into it asap.
