Spikes every 2 seconds
lucasrolff
FREEOP

a month ago

I'm trying out railway.com for a simple exchange rate API (basically exposes historical ECB data).

I'm currently running this on a basic VM, and it's fairly consistent in terms of response times: around 30-35ms of server processing time.

I've tried deploying the exact same Laravel app to railway.com, with a Postgres backend (both running in Amsterdam, Europe).

However, it seems that every 2 seconds, regardless of whether I do 1 req/s, 2 req/s, or 10 req/s, there's a spike of 170-190ms; the rest of the time, it's fairly consistent.

The API calls are fairly simple: a SELECT against the Postgres DB, with the values returned as JSON. Nothing complex, nothing computational.

Below is the output of httpstat running in a simple loop every 0.5 seconds, doing a request to the API.

What's weird to me is that it happens exactly every 2 seconds, and that the response times within the spike are so consistent. It's worth mentioning that the application doesn't do this on my current VPS; it's always consistent there.

Sat Nov  8 09:27:16 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       2ms      |     15ms      |       182ms       |        1ms       ]
--
Sat Nov  8 09:27:17 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    10ms    |       1ms      |     14ms      |       58ms        |        0ms       ]
--
Sat Nov  8 09:27:17 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    10ms    |       2ms      |     14ms      |       36ms        |        0ms       ]
--
Sat Nov  8 09:27:18 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       1ms      |     15ms      |       34ms        |        0ms       ]
--
Sat Nov  8 09:27:18 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       2ms      |     14ms      |       178ms       |        0ms       ]
--
Sat Nov  8 09:27:19 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       1ms      |     19ms      |       45ms        |        0ms       ]
--
Sat Nov  8 09:27:20 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       1ms      |     14ms      |       37ms        |        1ms       ]
--
Sat Nov  8 09:27:20 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       2ms      |     16ms      |       41ms        |        0ms       ]
--
Sat Nov  8 09:27:21 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    12ms    |       1ms      |     15ms      |       183ms       |        0ms       ]
--
Sat Nov  8 09:27:22 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    10ms    |       2ms      |     18ms      |       44ms        |        0ms       ]
--
Sat Nov  8 09:27:22 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       1ms      |     14ms      |       39ms        |        0ms       ]
--
Sat Nov  8 09:27:23 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       2ms      |     16ms      |       48ms        |        0ms       ]
--
Sat Nov  8 09:27:23 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       1ms      |     15ms      |       181ms       |        0ms       ]
--
Sat Nov  8 09:27:24 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       2ms      |     14ms      |       42ms        |        0ms       ]
--
Sat Nov  8 09:27:25 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       1ms      |     15ms      |       35ms        |        1ms       ]
--
Sat Nov  8 09:27:25 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       2ms      |     14ms      |       34ms        |        0ms       ]
--
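For anyone wanting to reproduce the measurement, the loop above can be roughly approximated in Python. This is a sketch, not the actual httpstat invocation: it times the whole request (DNS, connect, TLS, and server processing together), and the URL is a placeholder.

```python
# Rough Python equivalent of the httpstat loop above: time n sequential
# GET requests, sleeping between them. Measures total wall-clock time
# per request, not just the server-processing phase.
import time
import urllib.request

def measure(url: str, n: int, interval: float = 0.5) -> list:
    """Return the wall-clock time of n sequential GET requests, in ms."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as resp:
            resp.read()
        samples.append((time.perf_counter() - start) * 1000.0)
        time.sleep(interval)
    return samples

# Placeholder endpoint, e.g.:
# measure("https://example.up.railway.app/api/rates", 20)
```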
Solved · $10 Bounty

9 Replies


lucasrolff
FREEOP

a month ago

Apparently, if I use something like Laravel Octane and override the standard startup command, I don't get the spikes at all, so I'm unsure what within the default PHP Railpack could trigger this: whether the PHP process is terminated every X seconds, whether connections are closed, etc.


brody
EMPLOYEE

a month ago

Hello,

My initial thought is that the spikes are due to DNS resolution time. Are you perhaps not using a pooled database connection?

Best,

Brody


Status changed to Awaiting User Response Railway 30 days ago


lucasrolff
FREEOP

a month ago

Hi Brody,

I'm not using a pooled DB connection; I simply use the default Laravel DB connection to pgsql.

But indeed, it seems to be related to DNS, because if I make an endpoint that resolves my Postgres hostname, it fails on every fourth request. So I guess that explains why the response time is slower then: it has to retry resolving the DB hostname.
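A minimal sketch of the kind of diagnostic endpoint described here, written as a standalone Python function rather than the actual Laravel route (the hostname passed in would be a placeholder like the Railway-internal Postgres name):

```python
# Time a single name resolution, the operation that appears to be slow
# on every fourth request in the thread above.
import socket
import time

def time_resolution(hostname: str) -> float:
    """Return how long one getaddrinfo() call takes, in milliseconds."""
    start = time.perf_counter()
    socket.getaddrinfo(hostname, None)
    return (time.perf_counter() - start) * 1000.0
```

Calling this in a 0.5 s loop should reproduce the same every-fourth-request pattern if the resolver is the culprit.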


Status changed to Awaiting Railway Response Railway 30 days ago


brody
EMPLOYEE

a month ago

Then I would recommend moving to a pooled database connection, where the DNS is only resolved when a new connection is added to the pool. That way, you won't have to do a DNS lookup for every SQL query.
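To illustrate the suggestion, here is a toy Python model of a pool that only pays the DNS cost when it creates a brand-new connection. `Pool`, `resolver`, and the hostname are all stand-ins for illustration, not Railway or Laravel APIs.

```python
# Toy connection pool: the hostname is resolved only when a new
# "connection" is created, never when an existing one is reused.
class Pool:
    def __init__(self, hostname, resolver):
        self.hostname = hostname
        self.resolver = resolver      # stand-in for a real DNS lookup
        self.connections = []
        self.resolutions = 0          # counts how often DNS was consulted

    def checkout(self):
        if self.connections:
            return self.connections.pop()
        # Only a brand-new connection pays the DNS lookup cost.
        self.resolutions += 1
        addr = self.resolver(self.hostname)
        return {"addr": addr}

    def checkin(self, conn):
        self.connections.append(conn)

    def query(self, sql):
        conn = self.checkout()
        try:
            return f"ran {sql!r} against {conn['addr']}"
        finally:
            self.checkin(conn)
```

Ten sequential queries through this pool trigger exactly one resolution, whereas a connection-per-request setup would trigger ten.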


Status changed to Awaiting User Response Railway 30 days ago


lucasrolff
FREEOP

a month ago

Wouldn't the ideal solution be for Railway to fix their DNS infrastructure so queries don't fail? That's usually what an infrastructure company would do when there's an issue, rather than hide the problem by only resolving things once.


Status changed to Awaiting Railway Response Railway 30 days ago


lucasrolff
FREEOP

a month ago

root@d815716a0364:/app# while sleep 0.5; do dig postgres.railway.internal | grep "10.182.113.181\|Query time"; done
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 136 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 4 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 4 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 136 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 132 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 136 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec

So the containers have a single resolver (fd12::10) configured in resolv.conf, which likely uses the host resolver in turn, and considering it's exactly every 2 seconds, that seems very non-random.
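The dig output is consistent with a resolver cache that expires on a fixed interval. A toy Python model of that mechanism follows; the ~2 s effective cache lifetime is inferred from the observed spike period, not from the 10 s record TTL in the dig output, so treat the numbers as assumptions.

```python
# Toy model of a resolver cache queried every 0.5 s with a ~2 s
# effective lifetime: every fourth lookup misses the cache and pays
# the slow upstream round trip, matching the observed pattern.
def simulate(ttl: float = 2.0, interval: float = 0.5, n: int = 16):
    expiry = None
    misses = []
    for i in range(n):
        now = i * interval
        if expiry is None or now >= expiry:
            misses.append(i)      # slow lookup: go upstream, refill cache
            expiry = now + ttl
        # otherwise: answered from the cache, ~0 ms
    return misses
```

With the defaults, the misses land on every fourth query, i.e. once per 2 seconds.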


brody
EMPLOYEE

a month ago

Apologies for not mentioning sooner, but we are indeed aware of the periodic slow DNS lookups when the cache expires, and we do have plans to fix it.

My suggestion to use a pooled database connection would be a solution in the meantime.


Status changed to Awaiting User Response Railway 29 days ago


lucasrolff
FREEOP

a month ago

Thanks sir. I'll serve it through Laravel Octane, which "solves" the issue in the meantime. But glad to know it's at least a known issue.

Have a good one!


Status changed to Awaiting Conductor Response Railway 29 days ago


Status changed to Solved brody 28 days ago

