Severe Latency After Migration to Metal Edge on Railway
devarshishimpi
HOBBYOP

8 months ago

Hey team,
I noticed a significant drop in performance after deploying a new service on Railway. Previously, I had a similar service running with networking set up on what I believe was GCP. The new one, using the same (or even simpler) code, is much slower.

The old deployment’s networking showed "Metal Edge" as an upgrade option, while the new one defaults to Metal with no way to switch or downgrade. At first, I thought GCP was causing the difference in speed, but I later learned that both deployments (old and new) are on Metal; only the proxy was still on GCP for the old one.

Every millisecond matters for our customer experience, and the side-by-side latency difference was huge. I’m trying to get to the root cause, whether it’s related to TCP proxy latency, database region, or something else.

24 Replies

devarshishimpi
HOBBYOP

8 months ago

15b4da75-0c8b-4bb0-8f04-8d4167e64752


8 months ago

I understand that the app feels slower, but concrete numbers would be helpful for actually diagnosing the issue. As this is a new app, you don't necessarily have a direct comparison to your old app.

Please send logs with the RTT as seen by a user, as well as internal response times in the app, i.e. how long the request takes to reach the user vs. how long the service takes to actually process the request.
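
To make that split concrete, here is a minimal sketch, not taken from your app: `timed`, `network_overhead_ms`, and `handle_request` are hypothetical names, and the `time.sleep` call stands in for the real work. The idea is to log the processing time on the server, measure the full round trip on the client, and subtract one from the other.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(timings, label):
    """Store the wall-clock duration of the enclosed block, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = (time.perf_counter() - start) * 1000.0


def network_overhead_ms(round_trip_ms, processing_ms):
    """Time spent outside the handler (proxy, network transit, TLS, queuing).

    Clamped at zero because the two numbers come from different clocks.
    """
    return max(round_trip_ms - processing_ms, 0.0)


def handle_request(timings):
    # Hypothetical handler: replace the sleep with the real work.
    with timed(timings, "processing_ms"):
        time.sleep(0.01)
    return "ok"
```

If the processing time is the same on both deployments and only the overhead differs, the problem is somewhere on the network path rather than in the code.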


8 months ago

Breaking it down even further would be more helpful. RTT for communication between services and databases would be ideal.
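
One rough way to get those numbers, sketched with only the Python standard library: time the TCP handshake from inside the service to each dependency. The function names are made up for illustration, and the host and port are whatever your database or other service listens on. A TCP connect takes about one network round trip, plus DNS lookup time when a hostname is used.

```python
import socket
import time


def tcp_connect_ms(host, port, timeout=3.0):
    """Return the time to open a TCP connection to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0


def sample_connect_ms(host, port, attempts=5):
    """Take several measurements; a single sample is too noisy to trust."""
    return [tcp_connect_ms(host, port) for _ in range(attempts)]
```

Running this from the service container against the Mongo and Redis hosts gives a per-dependency baseline to compare between the two deployments.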


devarshishimpi
HOBBYOP

8 months ago

Actually, I can compare both apps directly, as the new one is a subset of the old one. So it should actually perform better than the old one.


devarshishimpi
HOBBYOP

8 months ago

Anyway, I'll send you the logs and timings in a few hours.


devarshishimpi
HOBBYOP

8 months ago

Or, if we could get on a call, that would be much better.


8 months ago

  1. Is your external Mongo database located in Singapore too?

  2. If every millisecond mattered, you would host Mongo on Railway and connect to it via the private network.

  3. Your Redis database is in a separate project; it needs to be in the same project, and you need to connect to it via the private network.
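
A quick way to check points 2 and 3 from the app side is to look at which hosts the connection strings point to. The sketch below assumes that private-network hostnames end in `.railway.internal`; the helper name and the example URLs are hypothetical.

```python
from urllib.parse import urlparse

# Assumption: services on the private network are addressed by a hostname
# under this suffix.
PRIVATE_SUFFIX = ".railway.internal"


def uses_private_network(url):
    """Return True if the connection URL points at a private-network host."""
    host = urlparse(url).hostname or ""
    return host.endswith(PRIVATE_SUFFIX)
```

For example, `uses_private_network("redis://default:secret@redis.railway.internal:6379")` is true, while an Atlas-style `mongodb+srv://...mongodb.net` URL is not, which means that traffic leaves the data center on every call.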


8 months ago

^ Brody is able to see your project as he is a member of the team, take all this advice!


devarshishimpi
HOBBYOP

8 months ago

I'm also using Mongo for another service, so I'm just using a cluster on Atlas. And yes, Redis is a good catch. I'll try that out and let you know.


8 months ago

I think you should take point #2 into consideration, having Mongo in the same data center on Railway would significantly reduce latency, and you did say every millisecond matters.


devarshishimpi
HOBBYOP

8 months ago

No, actually I just checked: this service authenticates only once (which requires Mongo). Other than that, everything runs over WebSockets, without even involving Redis. The Redis task runs synchronously on the side and doesn't cause any trouble for the WebSockets.


devarshishimpi
HOBBYOP

8 months ago

Wait, I'll actually give you the other service's project ID:
b7229d4a-9cb5-4e53-a3cf-be4923d369f5


devarshishimpi
HOBBYOP

8 months ago

You can have a look at it. The only difference you'll find between the two services is the Metal Edge tag under the networking section.



devarshishimpi
HOBBYOP

8 months ago

Can you manually migrate this from Metal Edge back to GCP? Then I can test it and see whether it actually makes a difference.


devarshishimpi
HOBBYOP

8 months ago

So basically, what happened is that I generated a new domain in my old service, which also migrated my old service to Metal Edge automatically. And now, guess what, I'm getting the same lag as I'm getting on my new service.


devarshishimpi
HOBBYOP

8 months ago

It definitely has something to do with Metal Edge.


devarshishimpi
HOBBYOP

8 months ago

@Brody


8 months ago

How are you measuring latency? Could you also provide both of the measurements you're talking about?
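
For the measurements themselves, a handful of summary numbers per deployment is easier to compare than raw logs. A minimal sketch using Python's `statistics` module; `summarize_ms` is a made-up helper name, and the input is a list of latency samples in milliseconds that you collect yourself.

```python
import statistics


def summarize_ms(samples):
    """Summarize latency samples (milliseconds); needs at least two samples."""
    ordered = sorted(samples)
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    p95 = statistics.quantiles(ordered, n=100, method="inclusive")[94]
    return {
        "count": len(ordered),
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": p95,
        "max": ordered[-1],
    }
```

Posting the median and p95 for the old and new service, measured the same way from the same location, would show whether the difference is consistent or only in the tail.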


devarshishimpi
HOBBYOP

8 months ago

I'll send you the measurements shortly.


8 months ago

what happened to their roles?


8 months ago

can they still chat here without the roles?


8 months ago

they can't because to talk here you need the support access role 😔


8 months ago

and they need the community access role to talk in #🎤|chit-chat


8 months ago

my assumption is that the user left and rejoined and never went through sign-up again

