
Gradual increase in database latency - Railway Central Station


injung

PRO • OP

3 months ago

Hi,

We're currently experiencing a gradual increase in latency without any changes on our side, and wanted to check if there might be an issue affecting our host.

Starting around Apr 4th, both our database latency and API response times have been steadily increasing over time.

* Database query times have gradually increased
* API latency (p95) has risen accordingly
* No code changes or traffic spikes were introduced during this period

From our monitoring:

* The increase is gradual rather than a sudden spike
* It affects multiple queries, not just a specific one
* This suggests a broader database or host-level degradation rather than a query regression
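
One way to make the reasoning above concrete is sketched below. It assumes per-query p95 latencies are available for a baseline window and for the current window. The function name, the 1.5x threshold, and the 60% cutoff are illustrative assumptions, not values taken from this thread or from any monitoring tool.

```python
def classify_slowdown(baseline_ms, current_ms, threshold=1.5, broad_fraction=0.6):
    """Label a slowdown as broad or query-specific (hypothetical helper).

    baseline_ms / current_ms map a query name to its p95 latency in ms.
    threshold and broad_fraction are arbitrary illustrative cutoffs.
    """
    # Slowdown ratio for every query present in both windows.
    ratios = {
        name: current_ms[name] / baseline_ms[name]
        for name in baseline_ms
        if name in current_ms and baseline_ms[name] > 0
    }
    if not ratios:
        return "no-data", ratios

    slowed = [name for name, ratio in ratios.items() if ratio >= threshold]
    if not slowed:
        return "no-slowdown", ratios
    # Most queries slowing together points away from a single bad query.
    if len(slowed) / len(ratios) >= broad_fraction:
        return "broad", ratios
    return "query-specific", ratios
```

If most queries slow down by a similar factor, the result is "broad", which matches the pattern described in this post. A single slow query would instead come back as "query-specific".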

Given past incidents related to host-level issues, we're wondering:

* Is our database currently on a degraded or heavily loaded host?
* Were there any underlying issues around this timeframe?
* Would it be possible to move this workload to a less crowded host?

This is impacting production traffic, so any insight would be greatly appreciated.

Thanks in advance


Solved

14 Replies

Status changed to Awaiting Railway Response Railway • 3 months ago

angelo-railway

EMPLOYEE

3 months ago

Hey, thanks for the detailed report. The gradual latency increase starting Apr 4th is consistent with a host-level issue rather than a query regression. We're checking which host your database is on and whether it's under elevated load. If so, we can migrate your workload to a healthier host. Will follow up shortly.

Status changed to Awaiting User Response Railway • 3 months ago

angelo-railway

EMPLOYEE

3 months ago

Confirmed, your database is on a host that experienced a significant load spike starting April 4th. System load jumped from ~25 to 95+ and has remained elevated since, with IO wait times and memory pressure both increasing. (Although we should be good now.)

This directly correlates with the latency increase you observed. Are you in a better spot now?

injung

PRO • OP

3 months ago

Hi! It does seem to have improved since the peak on April 5th, but it hasn't fully recovered yet. That said, we're definitely in a better state now compared to April 5th.

We haven't had a chance to observe the impact of the changes you just made yet, but we will continue monitoring closely and will let you know if anything changes.

Really appreciate your support!


Status changed to Awaiting Railway Response Railway • 3 months ago

jake

EMPLOYEE

3 months ago

Hello. This should be resolved now. We were validating a config to speed up reads, but for specific instances it had a negative effect on writes.

We've now rolled out a new config which should increase the speed of both reads AND writes (thus solving your issue).

Please let us know if that's not the case!

Status changed to Awaiting User Response Railway • 3 months ago

injung

PRO • OP

3 months ago

Based on our observations, it has improved, though it hasn't fully returned to previous levels yet.


Status changed to Awaiting Railway Response Railway • 3 months ago

injung

PRO • OP

3 months ago

Hi, is this possibly degraded again?

We're seeing a very simple query taking over 7 seconds:

SELECT "sessions".*
FROM "sessions"
WHERE "sessions"."tenant" = $1 AND "sessions"."token" = $2
LIMIT $3

Sorry for repeatedly suspecting host performance, but we haven't been able to identify any significant issues on our side in terms of database queries.

Would appreciate it if you could take another look. Thanks for your help as always!


Status changed to Solved injung • 3 months ago

injung

PRO • OP

3 months ago

Hey, it's getting worse. Any updates on this?


Status changed to Awaiting Railway Response Railway • 3 months ago

noahd

EMPLOYEE

2 months ago

Hey, how are you collecting those metrics? I've looked around the host and a few other metrics, and I'm not really seeing any major signs of a problem here.

Definitely don't want to be a blocker to you here though.

Status changed to Awaiting User Response Railway • 2 months ago

injung

PRO • OP

2 months ago

I'm collecting these metrics from both Sentry and the Railway dashboard. They show similar trends, but the Railway dashboard is a bit harder to interpret, so I've attached the Sentry data instead.

As you can see, this reflects the P95 latency fluctuations we've been experiencing since Apr 17.

Attachments

%E1%84%89%E...

Status changed to Awaiting Railway Response Railway • 2 months ago

injung

PRO • OP

2 months ago

We're seeing the issue again.

There have still been no deploys or traffic changes on our side, but:

* We experienced a period of timeouts for about 30 minutes
* Overall latency has increased significantly, now roughly 3x slower than baseline
* Database and API latency are both affected
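
For anyone wanting to check a figure like "3x slower than baseline" against raw samples, here is a minimal sketch. It assumes latency samples in milliseconds have been exported for a baseline window and a current window, and it uses the nearest-rank definition of p95, which may differ slightly from what Sentry or the Railway dashboard compute. The function names are hypothetical.

```python
def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def slowdown_factor(baseline_samples, current_samples):
    """Ratio of current p95 to baseline p95 (3.0 means 3x slower)."""
    return p95(current_samples) / p95(baseline_samples)
```

Comparing the same percentile over windows of similar length keeps the ratio meaningful. A short window with a few timeouts can skew p95 considerably.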

This looks very similar to the previous incident and is happening again without any changes on our end.

Could you take another look at the host or underlying infrastructure?

This is now happening repeatedly and is impacting production, so we'd really appreciate urgent investigation.


Anonymous

PRO

2 months ago

Same!

ray-chen

EMPLOYEE

2 months ago

Apologies for the delay here.

We're aware of this and actively investigating. We'll update you as soon as we have more to share.

In the meantime, if you're experiencing impact, redeploying your service to a different region can help as an immediate workaround.

Status changed to Awaiting User Response Railway • about 2 months ago

injung

PRO • OP

2 months ago

It finally seems to be fixed. We didn't change anything on our side, but latency has returned to normal. I hope it stays this way.


Status changed to Awaiting Railway Response Railway • about 2 months ago

codydearkland

EMPLOYEE

2 months ago

Glad to hear it's back to normal. We identified and addressed the root cause on the host your database was running on. Please don't hesitate to reopen this thread if the latency returns.

Status changed to Awaiting User Response Railway • about 2 months ago

Status changed to Solved codydearkland • about 2 months ago
