Spikes every 2 seconds
lucasrolff
FREEOP

a month ago

I'm trying out railway.com for a simple exchange rate API (basically exposes historical ECB data).

I'm currently running this on a basic VM, and it's fairly consistent in terms of response times: around 30-35ms of server processing time.

I've tried deploying the exact same Laravel app to railway.com, with a Postgres backend (both running in Amsterdam, Europe).

However, it seems that every 2 seconds, regardless of whether I do 1 req/s, 2 req/s, or 10 req/s, there's a spike of 170-190ms; the rest of the time, it's fairly consistent.

The API calls are fairly simple: a SELECT against the Postgres DB, with the values returned as JSON. Nothing complex, nothing computational.

Below is the output of httpstat running in a simple loop every 0.5 seconds, doing a request to the API.

What's weird to me is that it happens exactly every 2 seconds, and that the response times within the spike are so consistent. It's worth mentioning that the application doesn't do this on my current VPS; it's always consistent there.

Sat Nov  8 09:27:16 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       2ms      |     15ms      |       182ms       |        1ms       ]
--
Sat Nov  8 09:27:17 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    10ms    |       1ms      |     14ms      |       58ms        |        0ms       ]
--
Sat Nov  8 09:27:17 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    10ms    |       2ms      |     14ms      |       36ms        |        0ms       ]
--
Sat Nov  8 09:27:18 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       1ms      |     15ms      |       34ms        |        0ms       ]
--
Sat Nov  8 09:27:18 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       2ms      |     14ms      |       178ms       |        0ms       ]
--
Sat Nov  8 09:27:19 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       1ms      |     19ms      |       45ms        |        0ms       ]
--
Sat Nov  8 09:27:20 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       1ms      |     14ms      |       37ms        |        1ms       ]
--
Sat Nov  8 09:27:20 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       2ms      |     16ms      |       41ms        |        0ms       ]
--
Sat Nov  8 09:27:21 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    12ms    |       1ms      |     15ms      |       183ms       |        0ms       ]
--
Sat Nov  8 09:27:22 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    10ms    |       2ms      |     18ms      |       44ms        |        0ms       ]
--
Sat Nov  8 09:27:22 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       1ms      |     14ms      |       39ms        |        0ms       ]
--
Sat Nov  8 09:27:23 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       2ms      |     16ms      |       48ms        |        0ms       ]
--
Sat Nov  8 09:27:23 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       1ms      |     15ms      |       181ms       |        0ms       ]
--
Sat Nov  8 09:27:24 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       2ms      |     14ms      |       42ms        |        0ms       ]
--
Sat Nov  8 09:27:25 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       1ms      |     15ms      |       35ms        |        1ms       ]
--
Sat Nov  8 09:27:25 AM UTC 2025
--
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    11ms    |       2ms      |     14ms      |       34ms        |        0ms       ]
--
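For anyone wanting to reproduce the measurement, the loop above can be roughly approximated in Python. This is a sketch, not the actual httpstat invocation: it times the whole request (DNS, connect, TLS, and server processing together), and the URL is a placeholder.

```python
# Rough Python equivalent of the httpstat loop above: time n sequential
# GET requests, sleeping between them. Measures total wall-clock time
# per request, not just the server-processing phase.
import time
import urllib.request

def measure(url: str, n: int, interval: float = 0.5) -> list:
    """Return the wall-clock time of n sequential GET requests, in ms."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as resp:
            resp.read()
        samples.append((time.perf_counter() - start) * 1000.0)
        time.sleep(interval)
    return samples

# Placeholder endpoint, e.g.:
# measure("https://example.up.railway.app/api/rates", 20)
```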
Solved · $10 Bounty

9 Replies


lucasrolff
FREEOP

a month ago

Apparently, if I use something like Laravel Octane and override the standard startup command, I don't get the spikes at all, so I'm unsure what within the default PHP Railpack could trigger this: whether the PHP process is terminated every X seconds, whether connections are closed, etc.


brody
EMPLOYEE

a month ago

Hello,

My initial thought is that the spikes are due to DNS resolution time. Are you perhaps not using a pooled database connection?

Best,

Brody


Status changed to Awaiting User Response Railway 30 days ago


lucasrolff
FREEOP

a month ago

Hi Brody,

I'm not using a pooled DB connection; I simply use the default Laravel DB connection to pgsql.

But indeed, it seems to be related to DNS, because if I make an endpoint that resolves my Postgres hostname, it fails on every fourth request. So I guess that explains why the response time is slower then: it has to retry resolving the DB hostname.
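A minimal sketch of the kind of diagnostic endpoint described here, written as a standalone Python function rather than the actual Laravel route (the hostname passed in would be a placeholder like the Railway-internal Postgres name):

```python
# Time a single name resolution, the operation that appears to be slow
# on every fourth request in the thread above.
import socket
import time

def time_resolution(hostname: str) -> float:
    """Return how long one getaddrinfo() call takes, in milliseconds."""
    start = time.perf_counter()
    socket.getaddrinfo(hostname, None)
    return (time.perf_counter() - start) * 1000.0
```

Calling this in a 0.5 s loop should reproduce the same every-fourth-request pattern if the resolver is the culprit.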


Status changed to Awaiting Railway Response Railway 30 days ago


brody
EMPLOYEE

a month ago

Then I would recommend moving to a pooled database connection, where the DNS is only resolved when a new connection is added to the pool. That way, you won't have to do a DNS lookup for every SQL query.
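To illustrate the suggestion, here is a toy Python model of a pool that only pays the DNS cost when it creates a brand-new connection. `Pool`, `resolver`, and the hostname are all stand-ins for illustration, not Railway or Laravel APIs.

```python
# Toy connection pool: the hostname is resolved only when a new
# "connection" is created, never when an existing one is reused.
class Pool:
    def __init__(self, hostname, resolver):
        self.hostname = hostname
        self.resolver = resolver      # stand-in for a real DNS lookup
        self.connections = []
        self.resolutions = 0          # counts how often DNS was consulted

    def checkout(self):
        if self.connections:
            return self.connections.pop()
        # Only a brand-new connection pays the DNS lookup cost.
        self.resolutions += 1
        addr = self.resolver(self.hostname)
        return {"addr": addr}

    def checkin(self, conn):
        self.connections.append(conn)

    def query(self, sql):
        conn = self.checkout()
        try:
            return f"ran {sql!r} against {conn['addr']}"
        finally:
            self.checkin(conn)
```

Ten sequential queries through this pool trigger exactly one resolution, whereas a connection-per-request setup would trigger ten.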


Status changed to Awaiting User Response Railway 30 days ago


lucasrolff
FREEOP

a month ago

Wouldn't the ideal solution be for Railway to fix their DNS infrastructure so queries don't fail? That's usually what an infrastructure company would do when there's an issue, rather than hide the problem by only resolving things once.


Status changed to Awaiting Railway Response Railway 30 days ago


lucasrolff
FREEOP

a month ago

root@d815716a0364:/app# while sleep 0.5; do dig postgres.railway.internal | grep "10.182.113.181\|Query time"; done
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 136 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 4 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 4 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 136 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 132 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 136 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec
postgres.railway.internal. 10	IN	A	10.182.113.181
;; Query time: 0 msec

So the containers have a single resolver (fd12::10) configured in resolv.conf, which likely uses the host resolver in turn, and considering it's exactly every 2 seconds, that seems very non-random.
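The dig output is consistent with a resolver cache that expires on a fixed interval. A toy Python model of that mechanism follows; the ~2 s effective cache lifetime is inferred from the observed spike period, not from the 10 s record TTL in the dig output, so treat the numbers as assumptions.

```python
# Toy model of a resolver cache queried every 0.5 s with a ~2 s
# effective lifetime: every fourth lookup misses the cache and pays
# the slow upstream round trip, matching the observed pattern.
def simulate(ttl: float = 2.0, interval: float = 0.5, n: int = 16):
    expiry = None
    misses = []
    for i in range(n):
        now = i * interval
        if expiry is None or now >= expiry:
            misses.append(i)      # slow lookup: go upstream, refill cache
            expiry = now + ttl
        # otherwise: answered from the cache, ~0 ms
    return misses
```

With the defaults, the misses land on every fourth query, i.e. once per 2 seconds.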


brody
EMPLOYEE

a month ago

Apologies for not mentioning sooner, but we are indeed aware of the periodic slow DNS lookups when the cache expires, and we do have plans to fix it.

My suggestion to use a pooled database connection would be a solution in the meantime.


Status changed to Awaiting User Response Railway 29 days ago


lucasrolff
FREEOP

a month ago

Thanks sir. I'll serve it through Laravel Octane, which "solves" the issue in the meantime. But glad to know it's at least a known issue.

Have a good one!


Status changed to Awaiting Conductor Response Railway 29 days ago


Status changed to Solved brody 28 days ago

