30‑second delay on internal HTTP requests between services in staging environment
bernatgene
HOBBYOP

8 days ago

In our Railway project, the staging environment shows a consistent 30 s delay on HTTP requests proxied from a SvelteKit service to an Adminer service running in the same project.

  • The exact same Docker image and code are deployed to both production and staging. The staging environment is a clone of production, with the same environment variables.

  • In production, Adminer requests complete instantly.

  • In staging, Adminer logs show the TCP connection accepted immediately but no data received until about 30 s later (34 s for the first request below, 30 s for the next two), after which the request proceeds normally:

[Wed Nov 26 18:14:53 2025] [::ffff:10.249.82.250]:35802 Accepted
[Wed Nov 26 18:15:27 2025] [::ffff:10.249.82.250]:35802 [302]: GET /index.php?pgsql=sc-postgis.railway.internal&username=postgres&db=postgres
[Wed Nov 26 18:15:27 2025] [::ffff:10.249.82.250]:35802 Closing
[Wed Nov 26 18:15:27 2025] [::ffff:10.249.82.250]:55424 Accepted
[Wed Nov 26 18:15:57 2025] [::ffff:10.249.82.250]:55424 [200]: GET /index.php?pgsql=sc-postgis.railway.internal&username=postgres&db=postgres&ns=public
[Wed Nov 26 18:15:57 2025] [::ffff:10.249.82.250]:55424 Closing
[Wed Nov 26 18:15:58 2025] [::ffff:10.249.82.250]:41100 Accepted
[Wed Nov 26 18:16:28 2025] [::ffff:10.249.82.250]:41100 [200]: GET /index.php?pgsql=sc-postgis.railway.internal&username=postgres&db=postgres&ns=public&script=db
[Wed Nov 26 18:16:28 2025] [::ffff:10.249.82.250]:41100 Closing
  • All other internal service calls (for example, to a FastAPI backend) work as expected in both environments.

  • Curl and Node fetch from inside the staging container reach the Adminer service instantly; the delay only occurs when the connection is made through Railway’s normal internal networking path during application runtime.

This looks like a networking or proxy issue specific to the staging subnet or ingress, but I have run out of ideas for debugging it. Any pointers on what else to check?
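
The accept-to-request gap can be read straight from the Adminer log rather than estimated by eye. A minimal sketch, assuming log lines in the format shown above; the function name `acceptToRequestGaps` is illustrative, and the query strings in the sample are shortened:

```javascript
// Log excerpt copied from the post above (query strings shortened).
const LOG = `
[Wed Nov 26 18:14:53 2025] [::ffff:10.249.82.250]:35802 Accepted
[Wed Nov 26 18:15:27 2025] [::ffff:10.249.82.250]:35802 [302]: GET /index.php
[Wed Nov 26 18:15:27 2025] [::ffff:10.249.82.250]:35802 Closing
[Wed Nov 26 18:15:27 2025] [::ffff:10.249.82.250]:55424 Accepted
[Wed Nov 26 18:15:57 2025] [::ffff:10.249.82.250]:55424 [200]: GET /index.php
[Wed Nov 26 18:15:57 2025] [::ffff:10.249.82.250]:55424 Closing
[Wed Nov 26 18:15:58 2025] [::ffff:10.249.82.250]:41100 Accepted
[Wed Nov 26 18:16:28 2025] [::ffff:10.249.82.250]:41100 [200]: GET /index.php
[Wed Nov 26 18:16:28 2025] [::ffff:10.249.82.250]:41100 Closing
`;

const MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

// Matches "[<timestamp>] <peer>:<port> <rest of line>".
const LINE = /^\[([^\]]+)\]\s+(\S+):(\d+)\s+(.*)$/;

// Parses "Wed Nov 26 18:14:53 2025" by hand to avoid engine-specific date parsing.
function parseTimestamp(text) {
  const [, month, day, time, year] = text.split(/\s+/);
  const [h, m, s] = time.split(':').map(Number);
  return Date.UTC(Number(year), MONTHS.indexOf(month), Number(day), h, m, s);
}

// Returns, per connection, the seconds between "Accepted" and the request line.
function acceptToRequestGaps(logText) {
  const acceptedAt = new Map(); // "peer:port" -> timestamp in ms
  const gaps = [];
  for (const raw of logText.split('\n')) {
    const match = LINE.exec(raw.trim());
    if (!match) continue;
    const [, stamp, peer, port, rest] = match;
    const key = `${peer}:${port}`;
    const at = parseTimestamp(stamp);
    if (rest === 'Accepted') {
      acceptedAt.set(key, at);
    } else if (/^\[\d{3}\]:/.test(rest) && acceptedAt.has(key)) {
      gaps.push({ port: Number(port), seconds: (at - acceptedAt.get(key)) / 1000 });
      acceptedAt.delete(key);
    }
  }
  return gaps;
}

console.log(acceptToRequestGaps(LOG));
```

On the excerpt above this reports gaps of 34 s, 30 s and 30 s, so the first request waited slightly longer than the other two.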


bernatgene
HOBBYOP

8 days ago

f4a24561-151c-4a6c-91b7-e9ac75a5a9c2


brody
EMPLOYEE

8 days ago

Admittedly, I'm having trouble understanding the network flow and how you have it interconnected. Could you provide a diagram or an explanation of some sort?


bernatgene
HOBBYOP

8 days ago

Both services are in the same Railway project and communicate over the default internal network using their service names.

The SvelteKit app reverse‑proxies /admin/adminer/* requests to that internal hostname.

Basically, I embed the Adminer page in an iframe in the SvelteKit app and proxy all of its requests through a handler that forwards them from the SvelteKit server to the Adminer server on port 8080. Maybe something in there is my own fault, but what made me file the ticket is that in the production environment this flow responds instantly, while in staging the TCP connection is established immediately but the first HTTP bytes are delayed by exactly 30 seconds.
All other internal requests (e.g. to a FastAPI backend over the same network which uses a very similar proxying handler) are instantaneous.
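
For reference, a handler of the kind described could look roughly like the sketch below. It is not the handler from this project: the names `proxyWithTiming`, `buildUpstreamUrl` and `forwardableHeaders`, the header skip list, and the upstream hostname used in the usage note are all assumptions. It logs how long the upstream takes to return response headers, which is one way to see on which side of the proxy the 30 s is spent.

```javascript
// Headers that are commonly not copied to the upstream request (an assumed list;
// the first eight are per-hop headers, host and content-length are recomputed by fetch).
const SKIPPED_HEADERS = new Set([
  'connection', 'keep-alive', 'proxy-authenticate', 'proxy-authorization',
  'te', 'trailer', 'transfer-encoding', 'upgrade', 'host', 'content-length',
]);

// Maps e.g. /admin/adminer/index.php?x=1 onto <upstreamBase>/index.php?x=1.
function buildUpstreamUrl(requestUrl, upstreamBase, prefix) {
  const incoming = new URL(requestUrl);
  const target = new URL(upstreamBase);
  target.pathname = incoming.pathname.slice(prefix.length) || '/';
  target.search = incoming.search;
  return target.toString();
}

// Copies every incoming header except the skipped ones.
function forwardableHeaders(incomingHeaders) {
  const headers = new Headers();
  for (const [name, value] of incomingHeaders) {
    if (!SKIPPED_HEADERS.has(name.toLowerCase())) headers.set(name, value);
  }
  return headers;
}

// Forwards one request and logs the time until upstream response headers arrive.
// fetchImpl and log are injectable so the function can be exercised without a network.
async function proxyWithTiming(request, upstreamBase, prefix,
                               fetchImpl = fetch, log = console.log) {
  const url = buildUpstreamUrl(request.url, upstreamBase, prefix);
  const hasBody = request.method !== 'GET' && request.method !== 'HEAD';
  const init = {
    method: request.method,
    headers: forwardableHeaders(request.headers),
    // Buffer the body for simplicity; streaming would need extra fetch options.
    body: hasBody ? await request.arrayBuffer() : undefined,
    // Hand redirects (such as the 302 in the log) back to the caller instead of following them.
    redirect: 'manual',
  };
  const startedAt = Date.now();
  const upstream = await fetchImpl(url, init);
  log(`${request.method} ${url} -> ${upstream.status}, headers after ${Date.now() - startedAt} ms`);
  return upstream; // returned as-is for brevity
}
```

In SvelteKit this would typically be called from the `handle` hook in `src/hooks.server.js`, passing `event.request`, a base such as `http://adminer.railway.internal:8080` (a placeholder hostname) and the prefix `/admin/adminer`, and falling through to `resolve(event)` for every other path.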


bernatgene
HOBBYOP

8 days ago

[SvelteKit container]  --->  (HTTP 8080)  --->  [Adminer container]
        |                                      |
        +--------------------------------------+
             internal Railway private network

brody
EMPLOYEE

8 days ago

Thank you for the detailed information, that is extremely helpful.

Have you tried to SSH into the SvelteKit app and curl or wget the adminer service?


bernatgene
HOBBYOP

8 days ago

yes, and it responds instantly


brody
EMPLOYEE

8 days ago

What about SSH'ing in and opening the Node REPL and doing a fetch?


bernatgene
HOBBYOP

8 days ago

# node -e "console.time('fetch'); \
  fetch('http://beneficial-smile.railway.internal:8080/index.php') \
  .then(r => r.text()) \
  .then(() => console.timeEnd('fetch'))"
fetch: 223.294ms
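
The one-liner above measures total time with fetch's default headers, whereas the proxied requests presumably carry whatever the browser sent (cookies, accept headers and so on). A small sketch that separates time-to-headers from time-to-body and accepts custom request options, so the same request can be replayed with and without the forwarded headers; `timeFetch` is a hypothetical helper name:

```javascript
// Times a single request: how long until response headers arrive,
// and how long until the full body has been read.
// fetchImpl is injectable so the helper can be tried without a network.
async function timeFetch(url, init = {}, fetchImpl = fetch) {
  const startedAt = Date.now();
  const response = await fetchImpl(url, init);
  const headersMs = Date.now() - startedAt;
  await response.arrayBuffer(); // drain the body
  const totalMs = Date.now() - startedAt;
  return { status: response.status, headersMs, totalMs };
}

// Example call (not executed here; the header values are placeholders):
// timeFetch('http://beneficial-smile.railway.internal:8080/index.php', {
//   redirect: 'manual',
//   headers: { accept: 'text/html', cookie: '<cookie sent by the browser>' },
// }).then(console.log);
```

If the plain call stays fast but the call with the forwarded headers stalls, that would point at the request the proxy builds rather than at the network path. This is only one possible way to narrow it down.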

bernatgene
HOBBYOP

8 days ago

The key difference between the FastAPI proxying and the Adminer proxying is that FastAPI and SvelteKit run in the same service/container and therefore talk over localhost, while Adminer runs in a separate service.


bernatgene
HOBBYOP

8 days ago

Anyway, it's not a major blocker for me since at least it works in production (ironic), but I wanted to check whether this sort of thing, working in one environment but not the other, rang a bell for you.


bernatgene
HOBBYOP

8 days ago

if you think of something though, please let me know


brody
EMPLOYEE

6 days ago

I'm not seeing anything on our end that could contribute to this, so could I trouble you for a reproducible example?

