Private networking returns ECONNREFUSED between two services in same project despite all documented fixes

nmvalletta77

PRO • OP

a month ago

Hi Railway team,

Two services in the same project and environment can't talk over private networking even though every documented config is in place. Public networking works fine; only the private hop fails.

Services:

- Service A: Express app, listens on :::8080 (IPv6 dual-stack, confirmed at runtime by the log line "listening on :::8080 (family=IPv6)")

- Service B: Next.js sidecar that needs to fetch back to Service A
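For reference, a minimal sketch of what binding to :: looks like in Node. It uses the built-in http module rather than Express (Express's app.listen passes its arguments through to the same http.Server listen call). The start helper, the trivial handler, and the PORT fallback of 8080 are illustrative assumptions, not the poster's actual code.

```javascript
// Minimal sketch: a plain Node HTTP server bound to "::" (the IPv6 wildcard,
// dual-stack where the OS allows it). Names and the handler are hypothetical.
const http = require('node:http');

// Format the bound address in the same shape as the log line quoted above.
function formatListenLog({ address, family, port }) {
  return `listening on ${address}:${port} (family=${family})`;
}

// Start the server; PORT falling back to 8080 is an assumption.
function start(port = Number(process.env.PORT) || 8080) {
  const server = http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('ok\n');
  });
  // Passing "::" as the host asks Node to bind the IPv6 wildcard address.
  server.listen(port, '::', () => {
    console.log(formatListenLog(server.address()));
  });
  return server;
}

module.exports = { formatListenLog, start };
```

Calling start() and checking the log for family=IPv6 is a quick way to confirm the bind took effect inside the container.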

Symptom:

Service B → GET http://service-a.railway.internal:8080/... returns ECONNREFUSED every time. DNS resolves fine (no ENOTFOUND), and port 8080 is the port the server is listening on. The reverse direction works: Service A → service-b.railway.internal:8080 via http-proxy succeeds.
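To separate "DNS failed" from "TCP refused" without any HTTP client in the way, a bare socket probe can help. This is only a sketch: probe and explainCode are hypothetical helper names, the 3 second timeout is arbitrary, and the explanations are the usual meaning of each error code rather than anything Railway-specific.

```javascript
// Minimal sketch: open a raw TCP connection and report which layer failed.
const net = require('node:net');

// Map a socket error code to the layer that most likely failed.
function explainCode(code) {
  switch (code) {
    case 'ENOTFOUND':
      return 'dns: hostname did not resolve';
    case 'ECONNREFUSED':
      return 'tcp: resolved, but nothing accepted the connection on that address/port';
    case 'ETIMEDOUT':
      return 'tcp: no answer before the timeout';
    default:
      return `other: ${code}`;
  }
}

// Attempt a TCP connection; always resolves with { ok, code, detail }.
function probe(host, port, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const done = (result) => {
      socket.destroy();
      resolve(result);
    };
    socket.setTimeout(timeoutMs, () =>
      done({ ok: false, code: 'ETIMEDOUT', detail: explainCode('ETIMEDOUT') }));
    socket.once('connect', () =>
      done({ ok: true, code: null, detail: 'tcp: connected' }));
    socket.once('error', (err) =>
      done({ ok: false, code: err.code, detail: explainCode(err.code) }));
  });
}

module.exports = { explainCode, probe };
```

From a shell in Service B, something like probe('service-a.railway.internal', 8080).then(console.log) would show whether the refusal happens at the socket before any HTTP request is sent.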

What we've tried:

1. Service A binding 0.0.0.0 → ECONNREFUSED

2. Service A binding :: (dual-stack) → ECONNREFUSED

3. ipv6EgressEnabled=true on BOTH services → ECONNREFUSED

4. NODE_OPTIONS=--dns-result-order=ipv4first on the calling service → ECONNREFUSED

5. Short hostname service-a vs FQDN service-a.railway.internal → same ECONNREFUSED either way

Control that works: setting the sidecar's internal-API URL to the public edge URL (https://...) — the sidecar successfully fetches via the public route, so it's not a Node/code issue.

It's specifically the private-network hop.

Note: this project predates the Oct 2025 IPv4 private-networking rollout (legacy IPv6-only era), which might be relevant.

Ask: can you check the private-network registration for Service A? Either the VRF isn't routing correctly or it's forwarding to the wrong internal port. Happy to share project/service IDs privately once a ticket is assigned, or open a live debug window.

Thanks

Solved • $20 Bounty

Pinned Solution

nmvalletta77

PRO • OP

17 days ago

Closing this. Created a new project.

7 Replies

Status changed to Open Railway • 28 days ago

darseen

HOBBY • Top 1% Contributor

a month ago

Does this issue happen during the build phase? Private networking isn't available at build time. If not, I suggest creating a new project, just to test whether the issue is related to the legacy IPv6-only networking.

Other things you can try:

1. Since you mentioned that the http-proxy connection from your Express app to your Next app works, try using axios (or http-proxy, or any other library) instead of fetch in your Next app.

2. Remove NODE_OPTIONS=--dns-result-order=ipv4first, because your internal network might be IPv6-only.

nmvalletta77

PRO • OP

a month ago

Thanks for the suggestions — replying to each:

1. Build phase vs runtime. This is at request time, not build. The failing fetch happens inside a Next.js SSR route handler (generateMetadata() + ISR with revalidate: 300), and the logs show it firing on live requests. The same path succeeds when INTERNAL_API_URL is set to the public edge URL, so it's definitely runtime.

2. axios instead of fetch. Pretty sure this isn't a client-library issue: the same fetch() call in the same deployed image succeeds against https://app.tortus.io/... (public edge) and fails with ECONNREFUSED against http://service-a.railway.internal:8080/... (private). The only thing changing is the hostname, so the failure is at the socket level or below, not at the HTTP layer.

3. NODE_OPTIONS=--dns-result-order=ipv4first. Agreed that's a mismatch on IPv6-only private networking. I added it while flailing and should have removed it. It's no longer set, and ECONNREFUSED persists either way.

4. New project test. I'd rather not move a live production service as a debug step. But your instinct about legacy IPv6-only is exactly what I suspect: this project was created well before the Oct 2025 IPv4 rollout, and my read is that Service A's private-network entry is stale (possibly compounded by the fact that Service A was renamed at one point; the old hostname still resolves, the new one is ENOTFOUND). That's a control-plane fix on Railway's side, not something reproducible from a fresh project.

Still hoping a Railway engineer can pull the VRF / internal registration for the service and confirm. Happy to share IDs privately.

Thanks

darseen

HOBBY • Top 1% Contributor

a month ago

For static routes, generateMetadata runs during the build process, and AFAIK ISR pages are built then too. So you can't use internal URLs in anything that runs during the build process in Next.js. You can verify this in your build logs if you have set up logging for failed requests. But since the issue also happens at runtime, it's not just about using the internal network during the build phase.
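One way to keep build-time code off the private network is to pick the base URL by phase. The sketch below is an assumption-heavy illustration: it relies on Next.js setting the NEXT_PHASE environment variable to "phase-production-build" during next build (worth verifying for your Next.js version), PUBLIC_API_URL is a hypothetical variable name, and pickApiBase is a hypothetical helper. INTERNAL_API_URL is the variable mentioned earlier in the thread.

```javascript
// Minimal sketch: choose the public URL at build time and the private URL
// at runtime. Assumes Next.js exposes the build phase via NEXT_PHASE.
const PHASE_PRODUCTION_BUILD = 'phase-production-build';

// env is injectable so the choice can be checked without a real build.
function pickApiBase(env = process.env) {
  const building = env.NEXT_PHASE === PHASE_PRODUCTION_BUILD;
  // Fall back to the public URL when building or when no internal URL is set.
  if (building || !env.INTERNAL_API_URL) {
    return env.PUBLIC_API_URL; // hypothetical variable name
  }
  return env.INTERNAL_API_URL;
}

module.exports = { pickApiBase };
```

A generateMetadata implementation could then fetch from pickApiBase() and behave the same way whether it runs at build time or on a live request.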

Perhaps I wasn't clear about the new project test. What I meant is to create 2 new test services from the same source code in a new project, to rule out the IPv6-only networking issue.

nmvalletta77

PRO • OP

23 days ago

One more datapoint to make this unambiguous: I just ran the control test that darseen suggested. Fresh Railway project (private-net-test, project ID 17d508bd-5edd-42b8-80cc-3e5405715bbd), two new services in it, identical pattern (one Express-style HTTP server binding :::8080, one Node sidecar fetching it via service-a.railway.internal:8080).

Fresh project result:

{
  "dns": { "ok": true, "type": "AAAA", "values": ["fd12:383c:c461:1:8000:11b:91be:28e8"], "ms": 4 },
  "privateFetch": { "ok": true, "status": 200, "ms": 10 },
  "publicFetch": { "ok": true, "status": 200, "ms": 27 }
}

Private-network DNS resolves AAAA in 4ms, fetch round-trip in 10ms. This is exactly what's expected and what our prod project should be doing.
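The thread doesn't include the script behind that output, so here is a rough sketch of how a report with the same shape could be produced. Everything named here is an assumption: diagnose and timed are hypothetical helpers, the caller supplies the host and URLs, it needs a Node version with a global fetch (18+), and "ok" in this sketch only means the step did not throw.

```javascript
// Minimal sketch: time a DNS AAAA lookup plus a private and a public fetch,
// and report each step as data instead of throwing.
const dns = require('node:dns').promises;

// Run an async step, capture its duration, and turn failures into data.
async function timed(step) {
  const started = Date.now();
  try {
    const extra = await step();
    return { ok: true, ...extra, ms: Date.now() - started };
  } catch (err) {
    // Node's fetch wraps socket errors; the code usually sits on err.cause.
    const code = err.code || (err.cause && err.cause.code) || err.name;
    return { ok: false, error: code, ms: Date.now() - started };
  }
}

// host, privateUrl and publicUrl are supplied by the caller (hypothetical).
async function diagnose({ host, privateUrl, publicUrl }) {
  return {
    dns: await timed(async () => ({
      type: 'AAAA',
      values: await dns.resolve6(host),
    })),
    privateFetch: await timed(async () => ({
      status: (await fetch(privateUrl)).status,
    })),
    publicFetch: await timed(async () => ({
      status: (await fetch(publicUrl)).status,
    })),
  };
}

module.exports = { timed, diagnose };
```

Running the same function in both projects gives a like-for-like comparison: a failing step shows up as ok: false with an error code such as ENOTFOUND or ECONNREFUSED.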

So:

- Fresh project = works perfectly, no special config needed.

- Our existing project = ECONNREFUSED on legacy hostname byson-server.railway.internal, ENOTFOUND on current hostname tortus-server.railway.internal. Same project, same docs followed, same Node version, same :::8080 binding.

The only material differences between the two projects are age (the existing project predates the Oct 2025 IPv4 private-networking rollout) and the fact that our service was renamed at some point. This rules out every generic-Railway-bug suspect and confirms the rename / legacy-state-registration theory described in the original report.

Please look at the control-plane state for service cca7ab1e-bb7b-4de1-a32a-e7ccad7133b3 in project b7c836ae-0414-4a42-96b8-7be449b555c4. Whatever migration step normally registers a service in the post-Oct-2025 networking layer didn't run for this one (or went stale after the rename). Re-running it should resolve this.

darseen

HOBBY • Top 1% Contributor

23 days ago

The issue seems to be related to legacy IPv6-only environments. As the docs say, Railway will migrate all legacy environments at a later date, but they don't specify when. That said, your apps shouldn't have a problem running in a legacy environment if they are configured to bind to :: correctly, as per the docs:

Attachment: image.png

nmvalletta77

PRO • OP

21 days ago

@railway is this something you can investigate in my account? Certainly sounds like a bug. Would really appreciate it before we commit to a week of work to switch to a new project.

nmvalletta77

PRO • OP

17 days ago

Closing this. Created a new project.

Status changed to Solved mykal • 16 days ago
