Heavy latency lately?
kuhamaven
PROOP

a year ago

I've been running an app on Railway for over a year, and suddenly we started experiencing random latency spikes in the app's responses. We went through the whole codebase, testing and improving it in case we had introduced a bug, but there are random moments in the day when response time goes from 0.5-1 second to 16-19 seconds. It's even weirder because we thought it would be a resource usage issue, but the server still has enough resources left, and whenever there is a user spike the response time doesn't get worse, so it isn't a concurrent users issue either.

Is there anything going on with the service?

Solved

50 Replies

kuhamaven
PROOP

a year ago

c17693fa-8f81-4f00-ad4d-8f231f687806


medim
MODERATOR

a year ago

Are you using a metal region?


kuhamaven
PROOP

a year ago

us-west1


medim
MODERATOR

a year ago

I'd suggest analyzing this problem further. The team will probably ask for more data, because if it were a general problem there would be more reports here


medim
MODERATOR

a year ago

Are you using something like uptime kuma to monitor this latency?


medim
MODERATOR

a year ago

also, there's a metal us-west and a gcp one


kuhamaven
PROOP

a year ago

I'm measuring the backend response time as stated earlier, that's why it caught my attention


kuhamaven
PROOP

a year ago

users started complaining about random disconnections


kuhamaven
PROOP

a year ago

and after checking the measurements, response time went from less than 1 second to almost 19 seconds per request


kuhamaven
PROOP

a year ago

Some hours ago having 500 users at the same time still kept the 1 second time


kuhamaven
PROOP

a year ago

Now there are just 60 and server is delaying up to 19 seconds per request


kuhamaven
PROOP

a year ago

No logs, no errors, no anything. Tried even the good ol' reset just in case, even with a fresh start there is heavy lag now


medim
MODERATOR

a year ago

gotcha, let's wait for a team/conductor to answer this thread


kuhamaven
PROOP

a year ago

Okay, now all of a sudden it went from 22-25 seconds per response down to 3-7 seconds, still laggy, but a drastic improvement


kuhamaven
PROOP

a year ago

Again, no change at all in the code itself. In fact, logged-in users increased, so it isn't user load either


adam
MODERATOR

a year ago

This sounds to me like your database and backend service are in different regions. Can you please send screenshots of both? @Kuha


kuhamaven
PROOP

a year ago

both say US West Oregon


adam
MODERATOR

a year ago

Please send screenshots


kuhamaven
PROOP

a year ago

[screenshot attached]


kuhamaven
PROOP

a year ago

[screenshot attached]


adam
MODERATOR

a year ago

Great, that rules out metal/nonmetal


adam
MODERATOR

a year ago

If you have any quantifiable data, such as a grafana dashboard, please share that


kuhamaven
PROOP

a year ago

Sadly only what I stated before


kuhamaven
PROOP

a year ago

userload and average response times
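
An average on its own can hide spikes like the ones described in this thread, so percentiles are usually more useful to share. A minimal sketch, assuming the response times have already been collected as a list of seconds; the function name `summarize` and the sample numbers are made up for illustration, not taken from the app in question:

```python
import statistics

def summarize(samples):
    """Summarize response times (in seconds) so rare spikes stay visible."""
    # quantiles(n=100) returns 99 cut points; index 49 is p50, index 94 is p95.
    cuts = statistics.quantiles(samples, n=100)
    return {
        "mean": statistics.fmean(samples),
        "p50": cuts[49],
        "p95": cuts[94],
        "max": max(samples),
    }

if __name__ == "__main__":
    # Hypothetical data: mostly fast requests plus a few very slow ones.
    samples = [0.5] * 95 + [19.0] * 5
    for name, value in summarize(samples).items():
        print(f"{name}: {value:.2f} s")
```

With data shaped like this, the mean and median stay low while the p95 and max expose the slow requests, which is the kind of detail that helps narrow down when the spikes happen.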


brody
EMPLOYEE

a year ago

can you please try to switch both services to the v2 runtime


kuhamaven
PROOP

a year ago

where is that?


brody
EMPLOYEE

a year ago

within the service settings


kuhamaven
PROOP

a year ago

found it


brody
EMPLOYEE

a year ago

if this doesn't change anything we would have to recommend you set up tracing so that you can pinpoint where the "slow" is coming from


kuhamaven
PROOP

a year ago

I'll be testing with the V2 then


kuhamaven
PROOP

a year ago

never knew that was there


kuhamaven
PROOP

a year ago

what does it change?


brody
EMPLOYEE

a year ago

moves the workload from docker to podman


kuhamaven
PROOP

a year ago

So far average response time went down to 1.6 seconds, nice


kuhamaven
PROOP

a year ago

will keep updating just in case


brody
EMPLOYEE

a year ago

sounds good, I'll leave this thread open, but just know that if you see an increase in latency you will need to add tracing to your app


brody
EMPLOYEE

a year ago

Railway has no observability into what your code is or is not doing.
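
For readers wondering what "add tracing" can look like at its simplest, here is a rough sketch using only the Python standard library. It is not a real tracing library and not anything Railway provides; `span`, `SPANS`, and `handle_request` are hypothetical names, and the sleeps stand in for real work such as a database query. A production setup would more likely use a proper tracing library that exports spans to a backend.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory sink; a real setup would export spans to a tracing backend.
SPANS = []

@contextmanager
def span(name):
    """Record how long the wrapped block takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        SPANS.append((name, time.perf_counter() - start))

def handle_request():
    # Hypothetical handler: the sleeps stand in for a database call and rendering.
    with span("request"):
        with span("db_query"):
            time.sleep(0.05)
        with span("render"):
            time.sleep(0.01)

if __name__ == "__main__":
    handle_request()
    # Print the slowest spans first to see where the time went.
    for name, seconds in sorted(SPANS, key=lambda s: s[1], reverse=True):
        print(f"{name}: {seconds * 1000:.1f} ms")
```

Wrapping each stage of a request this way shows whether the time is going to the database, an external call, or the handler itself, which is the question the platform cannot answer from the outside.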


medim
MODERATOR

a year ago

@silence @Brody how did u know they weren't running in runtime v2? your admin superpowers?


medim
MODERATOR

a year ago

silence fail


brody
EMPLOYEE

a year ago

yes, though I didn't look at anything the user couldn't see themselves. if you see a runtime selector set to legacy then you are on legacy. if you do not see a runtime selector at all, then you are on v2, and the selector isn't there because you cannot go back to legacy


kuhamaven
PROOP

a year ago

when did the runtime change?


medim
MODERATOR

a year ago

A long time ago, lol


medim
MODERATOR

a year ago

As of 2024/06/04 (YYYY/MM/DD)


kuhamaven
PROOP

a year ago

makes sense, I've had it running since almost 2023 haha


adam
MODERATOR

a year ago

It's @silent lol


adam
MODERATOR

a year ago

@Medim


medim
MODERATOR

a year ago

oh yeah


medim
MODERATOR

a year ago

mb


adam
MODERATOR

a year ago

s'all good


brody
EMPLOYEE

a year ago

!s


Status changed to Solved brody about 1 year ago

