High HTTP latency - Railway Central Station

robinvancauter

FREEOP

8 months ago

Hi

I'm creating this public thread because I believe I have found a new clue that justifies re-opening the investigation, and maybe others are experiencing something similar.

The previous private thread has been closed (https://station.railway.com/support/inconsistent-http-latency-high-variance-1947a9d8#3h03).

The problem
I'm experiencing a lot of additional HTTP request latency on my deployed axum service.
Client-side, about 40% of requests take 250-300ms, while the other 60% take around 120ms.

Don't get me wrong, the 120ms is totally fine and comparable with Railway alternatives for routes to this region (Belgium <> Amsterdam).

The HTTP logs on Railway show a totalDuration of either roughly 170ms or roughly 15ms, and the ~170ms entries coincide with the slow requests I'm seeing client-side.

For my testing I'm using a hardwired desktop running Linux.
I've also run 10-minute ICMP tests as well as several traceroutes to the Railway AMS region and they're all perfectly fine: a consistent 20ms ping and no weird routing in the traceroutes.

Additionally, for comparison, the service is currently live on Fly in the same region and it never goes above 120ms.
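
For anyone who wants to check whether they see the same split, here is a rough sketch of how the client-side numbers could be collected. It is not my exact test setup: the function names, the 200ms threshold and the sample values at the bottom are assumptions for illustration only. It reuses one HTTPS connection so that connection setup is only paid on the first request.

```python
import http.client
import time


def measure_latencies_ms(host, path="/", n=50, timeout=10.0):
    """Send n GET requests over one HTTPS connection and time each one.

    The first sample includes the TCP/TLS handshake, so it is usually
    worth discarding before looking at the distribution.
    """
    conn = http.client.HTTPSConnection(host, timeout=timeout)
    samples = []
    try:
        for _ in range(n):
            start = time.perf_counter()
            conn.request("GET", path)
            resp = conn.getresponse()
            resp.read()  # drain the body so the connection can be reused
            samples.append((time.perf_counter() - start) * 1000.0)
    finally:
        conn.close()
    return samples


def split_by_threshold(samples_ms, threshold_ms=200.0):
    """Count how many samples are at or above the threshold."""
    slow = [s for s in samples_ms if s >= threshold_ms]
    share = len(slow) / len(samples_ms) if samples_ms else 0.0
    return {"count": len(samples_ms), "slow_count": len(slow), "slow_share": share}


# Made-up sample values, only to show the shape of the result.
# Real usage would be: split_by_threshold(measure_latencies_ms("your-host")[1:])
example = split_by_threshold([120.0, 118.0, 260.0, 290.0, 121.0])
print(example)
```

A slow_share that stays near 0.4 across runs would match the pattern I described above.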

The conclusion by the Railway team on the private thread

Based on your timestamps, our proxy metrics don't show any spikes or issues on our infrastructure side during those periods. The latency variations you're experiencing are likely due to network path differences between your location in Belgium and our AMS region, or client-side network conditions.
Network routing can vary significantly depending on which ISPs and peering arrangements are involved in the path to Railway. Different requests may take different network routes, causing the latency fluctuations you're observing.

Why I believe the issue originates from within Railway infrastructure
The additional latency consistently shows up in the Railway HTTP logs (see screenshot), which means it must be introduced within the Railway network: Railway can't be tracking client <> proxy latency, it can only track what happens inside the bounds of its own infrastructure.

- The upstreamRqDuration shows 35ms (this is how long it takes for my service to handle the request)
- The totalDuration shows 174ms (this is how long the Railway proxy needed to handle the request, including upstream time)

This means there's about 140ms of additional latency being introduced by the Railway proxy (or internal network issues, hardware issues, configuration issues).
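
The arithmetic behind that estimate is just the difference between the two log fields. A minimal sketch, assuming both fields are reported in milliseconds (the helper name `proxy_overhead_ms` is mine, not part of any Railway API):

```python
def proxy_overhead_ms(total_duration_ms: int, upstream_rq_duration_ms: int) -> int:
    """Time spent outside the upstream service, as seen in the proxy logs."""
    if upstream_rq_duration_ms > total_duration_ms:
        raise ValueError("upstream duration cannot exceed total duration")
    return total_duration_ms - upstream_rq_duration_ms


# Values from the log entry described above.
overhead = proxy_overhead_ms(total_duration_ms=174, upstream_rq_duration_ms=35)
print(f"time spent outside my service: {overhead}ms")
```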

Wild guesses
Some ideas of what might be happening:
- It's not happening on all requests, which might indicate that some kind of sampling rate is at play. For example, if you turn on tracing for your infrastructure, you usually don't want to trace every request, only some of them. Tracing often doesn't add much overhead, but maybe in this case it has been configured to parse JSON request/response bodies.
- It might not be a proxy software issue at all, but rather oversaturated network queues on the NICs, which are often hard to expose in metric dashboards. But again, wild guesses.

If there's any other testing or info I can provide to help the investigation further, feel free to let me know.

Last but not least, I'm a huge fan of the platform as a whole, there's a lot of good here.

While this latency issue is a deal breaker for some of my customers (it doesn't need to be the fastest, but it does need to be predictable), it is the only issue I've come across for my use cases. If it can get resolved, I will most certainly start migrating several customers over to Railway (it would make my life easier too).

Thanks again for all the hard work!

3 Replies

Railway

BOT

8 months ago

Hey there! We've found the following might help you get unblocked faster:

If you find the answer from one of these, please let us know by solving the thread!

zennoia

HOBBY

8 months ago

Experiencing a similar issue. I just filed a ticket at https://station.railway.com/questions/very-high-and-inconsistent-round-trip-ti-e3cb18d5 in case there's any activity there!

gauthamses

PRO

6 months ago

+1, seeing extremely high response times (over 2 minutes) for simple requests, while the upstream response time is 1ms. The app is hosted in Southeast Asia (Singapore) but the edge region is europe-west4-drams3a; I'm not sure whether that has an impact. The app is behind the Cloudflare proxy (free plan). I have hosted multiple services and all of them are facing the same issue.

Here's the HTTP log of a request with a high response time:
```
requestId: "_nNbk-47TpyWFjjhw9P4nw"
timestamp "2025-11-11T08:48:24.918910802Z"
method "GET"
path "/api/realtime"
host "api.snezzi.com"
httpStatus 200
upstreamProto "HTTP/1.1"
downstreamProto "HTTP/2.0"
responseDetails ""
totalDuration 125190
upstreamAddress "http://[fd12:c6d6:733c:0:9000:49:79d6:f3d]:8090"
clientUa "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/142.0.0.0 Safari/537.36"
upstreamRqDuration 1
txBytes 123
rxBytes 946
srcIp "2401:4900:883a:55aa:998e:6993:7d76:e0d"
edgeRegion "europe-west4-drams3a"
upstreamErrors ""
```
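
To pull the two durations out of a pasted entry like this one, here is a small sketch. It assumes the durations are in milliseconds, as in the original post, and `parse_http_log` is a hypothetical helper for this copy-pasted text format, not a Railway tool. Only a few of the fields from the entry are repeated in the sample.

```python
def parse_http_log(text: str) -> dict:
    """Parse 'key value' lines as copied from the HTTP log view."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between fields
        key, _, value = line.partition(" ")
        # Some keys carry a trailing colon and string values are quoted.
        fields[key.rstrip(":")] = value.strip().strip('"')
    return fields


sample = """
requestId: "_nNbk-47TpyWFjjhw9P4nw"
totalDuration 125190
upstreamRqDuration 1
edgeRegion "europe-west4-drams3a"
"""

entry = parse_http_log(sample)
total_ms = int(entry["totalDuration"])
upstream_ms = int(entry["upstreamRqDuration"])
print(f"total: {total_ms / 1000:.1f}s, outside upstream: {total_ms - upstream_ms}ms")
```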
