502's and railway edge gateway issues after horizontal scaling on Railway for FastAPI.
parth220
PROOP

a year ago

Not sure if this is affecting others as well!

I upgraded my (otherwise stable) FastAPI server from 1 instance to 4 with horizontal scaling, and I now see a proliferation of intermittent 502 Bad Gateway errors from the Railway proxy.

I'm hitting this issue with ~5-10% of requests failing to execute (even with exponential retry), which is hurting my customers' workflows.
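For anyone comparing setups, the kind of exponential retry referred to above might look roughly like the sketch below. This is only an illustration under stated assumptions, not the actual client code: `BadGatewayError`, `call_with_backoff`, and the `send` callable are hypothetical names, and the delay values are arbitrary.

```python
import random
import time


class BadGatewayError(Exception):
    """Hypothetical error the caller raises when a response has status 502."""


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call send() until it succeeds or max_attempts is exhausted.

    The wait before retry number `attempt` is drawn uniformly from
    [0, min(max_delay, base_delay * 2**attempt)] (exponential backoff
    with full jitter).
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except BadGatewayError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last 502 to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter keeps many clients from retrying in lockstep.
            sleep(random.uniform(0, delay))
```

Retrying like this is only safe when the request is idempotent or the server deduplicates it, so a POST endpoint would need that checked first.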

The docs, https://docs.railway.com/reference/errors/application-failed-to-respond, suggest either that the IP/ports are not configured properly or that the server is under heavy strain.

  • Confirmed the ports and ip. 0.0.0.0:8080

  • I don't think it's an "Application Under Heavy Load" error either, since you'd expect each instance to have LESS load after adding horizontal scaling. With a single instance I am virtually error-free (maybe one 502 per ~1000 requests), but with horizontal scaling that shoots up to ~50-100 errors per 1000 requests.

This leads me to believe it may be an issue with railway's edge routing or load balancing between my instances…

Would love any help from someone who's either great at FastAPI or railway scaling.

https://station.railway.com/questions/i-get-a-surge-of-502-errors-when-adding-671834e0

Project ID: 7c37f583-3c7b-4fb6-8fc8-9b57b0eb3606
Service ID: 1eee9966-3a1e-4f0b-9313-5737db82166d

10 Replies

a year ago

what is the error you get from the 502's? it's shown in the http logs


parth220
PROOP

a year ago

@Brody

Here's my error:

hud.server.requests.RequestError: Request failed with status 502 - JSON response: {'status': 'error', 'code': 502, 'message': 'Application failed to respond', 'request_id': 'mn4qUBnPQpalDRtY8sXEeg_1861343781'} | Status: 502 | Response Text: {"status":"error","code":502,"message":"Application failed to respond","request_id":"mn4qUBnPQpalDRtY8sXEeg_1861343781"} | Response JSON: {'status': 'error', 'code': 502, 'message': 'Application failed to respond', 'request_id': 'mn4qUBnPQpalDRtY8sXEeg_1861343781'} | Headers: {'content-length': '120', 'content-type': 'application/json', 'server': 'railway-edge', 'x-railway-edge': 'railway/us-west1', 'x-railway-fallback': 'true', 'x-railway-request-id': 'mn4qUBnPQpalDRtY8sXEeg_1861343781', 'date': 'Sat, 08 Mar 2025 02:54:24 GMT'}

a year ago

please provide the error given by the HTTP logs on railway


parth220
PROOP

a year ago

requestId:"Vv0RYWvKT8m0nnbYpEsybQ_3167001623"
timestamp:"2025-03-08T08:00:00.088690431Z"
method:"POST"
path:"/hud-gym/api/v1/execute_step/57ea66ed-190a-45e6-879f-b5abb8bcf1c7"
host:"orchestrator.hud.live"
httpStatus:502
upstreamProto:"HTTP/1.1"
downstreamProto:"HTTP/1.1"
responseDetails:"failed to forward request to upstream: body read after close"
totalDuration:4011
upstreamAddress:"http://[fd12:4680:400d:0:a000:8:396b:580f]:8000"
clientUa:"python-httpx/0.28.1"
upstreamRqDuration:4011
txBytes:120
rxBytes:420
srcIp:"136.25.59.57"
edgeRegion:"us-west1"



parth220
PROOP

a year ago

Btw, sorry @Brody, I didn't realize that a Discord help message opens up another help thread on help-station.
I made a request there at first: https://station.railway.com/questions/i-get-a-surge-of-502-errors-when-adding-671834e0

Happy to upgrade to enterprise and hop on a call to work through this.


a year ago

does this ever happen on the legacy edge network


parth220
PROOP

a year ago

Actually this was on the legacy edge network!

I moved to metal edge and am going to test now…


a year ago

oh, then if these errors are on the legacy edge network, that means these are application-level errors; the legacy edge network has been GA for over 6 months


parth220
PROOP

a year ago

Interesting.

Is there any application-level reason you can think of for seeing no 502s on a single instance and then many 502s with horizontal scaling?

If the application was under heavy load, I would imagine there's LESS load with multiple replica instances rather than the other way around


a year ago

I don't have any ideas unfortunately, what does that endpoint do?

