Half of requests delayed ~12s before reaching container (Singapore)
rajdeori2019
PRO · OP

a month ago

Project / Service: VyaparX (vyaparx), domain www.getvyaparx.com
Railway region: asia-southeast1-eqsg3a
Time window: 2026-04-25, ~10:11–10:12 UTC

Summary

About half of my requests to https://www.getvyaparx.com/health take ~12s; the rest take <0.5s. I instrumented the Node /health endpoint to log when the request actually reaches the container. The data shows the 12s is spent before Node receives the request: every slow hit has an ~11.8s gap between client send and server receive, and Node always responds within 0.3s. The instanceId is the same on all hits, so it's not a multi-replica issue.
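For anyone wanting to reproduce this kind of instrumentation, here is a minimal sketch of such a handler. It is not the code from the repo: `makeHealthHandler`, the injected `instanceId` and `now` parameters, and the `INSTANCE_ID` environment variable in the wiring comment are all hypothetical names chosen for illustration. It only assumes an Express-style `(req, res)` handler where `res.json` sends a JSON body.

```javascript
// Hypothetical sketch of a diagnostic /health handler (not the repo's code).
// `instanceId` and `now` are injected so the handler can be exercised without a server.
function makeHealthHandler(instanceId, now = () => new Date()) {
  return function healthHandler(req, res) {
    // Capture the arrival time first, before any other work in the handler.
    const serverReceived = now().toISOString();
    // Log one JSON line per hit so it can be matched against client-side timings.
    console.log(JSON.stringify({ path: req.url, instanceId, serverReceived }));
    res.json({ status: "ok", instanceId, serverReceived });
  };
}

// Wiring, assuming an Express `app` already exists and that the instance ID
// is exposed through an environment variable (the name here is an assumption):
// app.get("/health", makeHealthHandler(process.env.INSTANCE_ID ?? "unknown"));
```

Returning the timestamp and instance ID in the response body as well as the log lets the client pair each curl run with the server-side record without digging through logs.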

Evidence (5 consecutive hits, same client, same container 741506ab-2d0d-477c-a7da-69d3c9b3a9ec)

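| hit | client_sent (UTC) | server_received (UTC) | pre-Node gap | total |
|-----|-------------------|-----------------------|--------------|-------|
| 1 | 10:11:44.938 | 10:11:45.100 | 0.16 s | 0.46 s |
| 2 | 10:11:46.162 | 10:11:57.960 | 11.80 s | 12.15 s |
| 3 | 10:11:59.021 | 10:11:58.826 | ~0 s | 0.16 s |
| 4 | 10:11:59.873 | 10:12:11.676 | 11.80 s | 12.16 s |
| 5 | 10:12:12.722 | 10:12:24.555 | 11.83 s | 12.19 s |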

server_received is new Date().toISOString() taken at the very start of the Express handler; total is curl's time_total. The 11.8s delay is reproducible and consistent. It looks like a TCP connection timeout plus retry firing on the Fastly→origin path, not random network jitter.
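The pre-Node gap column is just the difference between the two timestamps. A small sketch of that arithmetic, assuming both values are ISO 8601 UTC strings (`preNodeGapSeconds` is a hypothetical helper name, not something from the repo):

```javascript
// Seconds between the client's send time and the server's receive time.
// Both arguments are ISO 8601 UTC strings; a negative result means the
// server clock read earlier than the client clock (clock skew).
function preNodeGapSeconds(clientSentIso, serverReceivedIso) {
  const sent = Date.parse(clientSentIso);
  const received = Date.parse(serverReceivedIso);
  if (Number.isNaN(sent) || Number.isNaN(received)) {
    throw new Error("timestamps must be valid ISO 8601 strings");
  }
  return (received - sent) / 1000;
}

// Hit 2 from the table above.
const gap = preNodeGapSeconds("2026-04-25T10:11:46.162Z", "2026-04-25T10:11:57.960Z");
console.log(gap.toFixed(2)); // prints "11.80"
```

Applied to hit 3, the same function returns a small negative number, because server_received is earlier than client_sent. That suggests the client and server clocks differ by at least ~0.2s, so the gaps are approximate, but a skew of that size cannot account for an ~11.8s difference.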

Notes:

- TCP connect + TLS handshake is always ~80ms (fast).

- All slow time is in time_starttransfer.

- Your dashboard reports the origin as healthy (p50 1.1s, p95 1.9s). Presumably your metrics miss this delay because they start timing only once a request reaches the container.

- HTTP/1.0 reproduces it (so not a keep-alive issue).

- All paths reproduce (/health, /clinic/odontocare, /admin/templates).

- Single replica only — confirmed by instance ID being identical across all hits.
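To make the first two notes concrete: curl's write-out timings are cumulative from the start of the request, so the per-phase durations come from subtracting adjacent values. A rough sketch, where `splitCurlTimings` is a hypothetical helper and the field names are curl's `-w` variables (`time_connect`, `time_appconnect`, `time_starttransfer`, `time_total`), all in seconds:

```javascript
// Splits curl's cumulative -w timings (seconds) into per-phase durations.
// Assumes an HTTPS request; for plain HTTP curl reports time_appconnect as 0,
// so the TLS figure would not be meaningful.
function splitCurlTimings({ time_connect, time_appconnect, time_starttransfer, time_total }) {
  return {
    // Time until the TCP connection is established (includes DNS lookup).
    tcpConnect: time_connect,
    // Extra time spent completing the TLS handshake.
    tlsHandshake: time_appconnect - time_connect,
    // Time from handshake completion to the first response byte: request
    // send plus everything between the edge and the application.
    waitForFirstByte: time_starttransfer - time_appconnect,
    // Time spent receiving the response body.
    download: time_total - time_starttransfer,
  };
}
```

With a fast handshake and a slow time_starttransfer, nearly all of the total lands in waitForFirstByte, which is the phase that covers the edge-to-origin hop.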

What I need

Please look at the Fastly edge → Railway origin path for service vyaparx in asia-southeast1. Specifically the connection between your edge proxy and the container during 10:11–10:12 UTC on 2026-04-25. The ~11.8s pattern strongly suggests a connection-timeout retry loop somewhere in your infrastructure.

Repo (in case you want to inspect): https://github.com/rajdeori2019/vyaparx

The diagnostic /health handler is at src/server.js:221.

Solved

3 Replies

Railway
BOT

a month ago

This thread has been marked as public for community involvement, as it does not contain any sensitive or personal information. Any further activity in this thread will be visible to everyone.

Status changed to Open Railway 27 days ago


There is currently an ongoing incident: https://status.railway.com/incident/L9HP750V


The latency issue has been fixed. Can you check now?


Status changed to Awaiting Railway Response brody 27 days ago


a month ago

This has now been resolved.

The latency was caused by an issue with our CDN provider Fastly, specifically affecting their KV store in the Asia region. Their incident report: https://www.fastlystatus.com/incident/378503

Apologies for the disruption.


Status changed to Awaiting User Response Railway 27 days ago


Railway
BOT

20 days ago

This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!

Status changed to Solved Railway 20 days ago

