a month ago
Hi Railway team,
Two services in the same project + environment can't talk over private networking despite every documented config being in place. Public networking works fine; only the private hop
fails.
Services:
- Service A — Express app, listens on :::8080 (IPv6 dual-stack, confirmed at runtime via log listening on :::8080 (family=IPv6))
- Service B — Next.js sidecar that needs to fetch back to Service A
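For readers reproducing the Service A side, here is a minimal sketch of a server bound to the IPv6 wildcard that logs the address it actually got. It uses Node's built-in http module as a stand-in for the Express app, and startServer is a hypothetical helper name, not something from the original project.

```javascript
const http = require("node:http");

// Start an HTTP server and log the address it actually bound to.
// host "::" is the IPv6 wildcard; on most systems Node also accepts IPv4 on it (dual-stack).
// NOTE: startServer is a hypothetical helper, and node:http stands in for Express here.
function startServer({ port = 8080, host = "::" } = {}) {
  return new Promise((resolve, reject) => {
    const server = http.createServer((req, res) => {
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ ok: true, path: req.url }));
    });
    server.once("error", reject);
    server.listen(port, host, () => {
      const { address, port: boundPort, family } = server.address();
      // With host "::" and port 8080 this prints "listening on :::8080 (family=...)".
      console.log(`listening on ${address}:${boundPort} (family=${family})`);
      resolve(server);
    });
  });
}
```

With Express the equivalent call would be app.listen(8080, "::", callback), since Express passes those arguments through to the underlying Node server.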
Symptom:
Service B → GET http://service-a.railway.internal:8080/... returns ECONNREFUSED every time. DNS resolves fine (no ENOTFOUND). Port 8080 is the one the server is listening on. The reverse
direction works: Service A → service-b.railway.internal:8080 via http-proxy succeeds.
What we've tried:
1. Service A binding 0.0.0.0 → ECONNREFUSED
2. Service A binding :: (dual-stack) → ECONNREFUSED
3. ipv6EgressEnabled=true on BOTH services → ECONNREFUSED
4. NODE_OPTIONS=--dns-result-order=ipv4first on the calling service → ECONNREFUSED
5. Short hostname service-a vs FQDN service-a.railway.internal → same ECONNREFUSED either way
Control that works: setting the sidecar's internal-API URL to the public edge URL (https://...) — the sidecar successfully fetches via the public route, so it's not a Node/code issue.
It's specifically the private-network hop.
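One way to separate "DNS failed" from "DNS worked but the connection was refused" is to resolve the name and then attempt a raw TCP connect to each returned address, bypassing any HTTP client. The sketch below assumes it runs inside the calling service's container; probe and tcpConnect are hypothetical helper names.

```javascript
const dns = require("node:dns/promises");
const net = require("node:net");

// Try a raw TCP connect so HTTP client behaviour is out of the picture.
// Resolves to { ok: true } or { ok: false, code }, never rejects.
function tcpConnect(host, port, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const done = (result) => {
      socket.destroy();
      resolve(result);
    };
    socket.setTimeout(timeoutMs, () => done({ ok: false, code: "ETIMEDOUT" }));
    socket.once("connect", () => done({ ok: true }));
    socket.once("error", (err) => done({ ok: false, code: err.code }));
  });
}

// Resolve the hostname first, then connect to every returned address separately,
// so the report shows which address family (4 or 6) was refused.
async function probe(hostname, port) {
  let addresses;
  try {
    addresses = await dns.lookup(hostname, { all: true });
  } catch (err) {
    return { stage: "dns", ok: false, code: err.code };
  }
  const results = [];
  for (const { address, family } of addresses) {
    results.push({ address, family, ...(await tcpConnect(address, port)) });
  }
  return { stage: "tcp", ok: results.some((r) => r.ok), results };
}
```

Calling probe("service-a.railway.internal", 8080) from Service B would show whether the refusal is tied to one specific resolved address, which is useful detail to attach to a report like this one.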
Note: this project predates the Oct 2025 IPv4 private-networking rollout (legacy IPv6-only era), which might be relevant.
Ask: can you check the private-network registration for Service A? Either the VRF isn't routing correctly or it's forwarding to the wrong internal port. Happy to share project/service
IDs privately once a ticket is assigned, or open a live debug window.
Thanks
Pinned Solution
17 days ago
Closing this. Created a new project.
7 Replies
Status changed to Open Railway • 28 days ago
a month ago
Does this issue happen during the build phase? Private networking isn't available at that time. If not, I suggest you try a new project, just to test whether the issue is related to the legacy IPv6-only networking.
Other things you can try:
1. Since you mentioned that the http-proxy connection from your Express app to your Next app works, try using axios (or http-proxy, or any other library) instead of fetch in your Next app.
2. Remove NODE_OPTIONS=--dns-result-order=ipv4first, because your internal network might be IPv6-only.
a month ago
Thanks for the suggestions — replying to each:
1. Build phase vs runtime. This is at request time, not build. The failing fetch happens inside a Next.js SSR route handler (generateMetadata() + ISR with revalidate: 300), and the logs
show it firing on live requests. Same path that succeeds when INTERNAL_API_URL is set to the public edge URL, so it's definitely runtime.
2. axios instead of fetch. Pretty sure this isn't a client-library issue — the same fetch() call in the same deployed image succeeds against https://app.tortus.io/... (public edge) and
fails with ECONNREFUSED against http://service-a.railway.internal:8080/... (private). The only thing changing is the hostname, so the failure is under the socket, not at the HTTP layer.
3. NODE_OPTIONS=--dns-result-order=ipv4first. Agreed that's a mismatch on IPv6-only private networking — I added it while flailing and should have removed it. It's no longer set.
ECONNREFUSED persists either way.
4. New project test. I'd rather not move a live production service as a debug step. But your instinct about legacy IPv6-only is exactly what I suspect — this project was created well
before the Oct 2025 IPv4 rollout, and my read is that Service A's private-network entry is stale (possibly compounded by the fact that Service A was renamed at one point — the old
hostname still resolves, the new one is ENOTFOUND). That's a control-plane fix on Railway's side, not something reproducible from a fresh project.
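To back up the "under the socket, not at the HTTP layer" point from the logs, the system error code can be pulled out of a failed fetch. In Node, a failed fetch() typically throws a TypeError ("fetch failed") with the underlying error on its cause property. The sketch below walks that chain; networkErrorCode and describeFailure are hypothetical helper names, and the exact error shape can vary between Node versions.

```javascript
// Walk the `cause` chain of an error and return the first string error code found.
// Assumption: Node's fetch wraps the socket error as err.cause (shape may vary by version).
function networkErrorCode(err) {
  for (let e = err, depth = 0; e && depth < 5; e = e.cause, depth++) {
    if (typeof e.code === "string") return e.code;
    // AggregateError (several addresses attempted): inspect the individual attempts.
    if (Array.isArray(e.errors)) {
      const inner = e.errors.find((x) => x && typeof x.code === "string");
      if (inner) return inner.code;
    }
  }
  return undefined;
}

// Map the code to the layer that failed, for log lines.
function describeFailure(err) {
  const code = networkErrorCode(err);
  switch (code) {
    case "ENOTFOUND":
      return "dns: hostname did not resolve";
    case "ECONNREFUSED":
      return "tcp: address resolved, but nothing accepted the connection on that port";
    case "ETIMEDOUT":
    case "UND_ERR_CONNECT_TIMEOUT":
      return "tcp: connection attempt timed out";
    default:
      return `unknown (${code ?? err.message})`;
  }
}
```

Wrapping the failing call as try { await fetch(url) } catch (err) { console.error(describeFailure(err)) } would then log which layer failed for each hostname.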
Still hoping a Railway engineer can pull the VRF / internal registration for the service and confirm. Happy to share IDs privately.
Thanks
a month ago
AFAIK, generateMetadata runs during the build for static routes, and ISR pages are prerendered at build time too. So you can't use internal URLs in anything that Next.js runs during the build. You can verify this in your build logs if you have set up logging for failed requests. But since the issue also happens at runtime, it isn't only a matter of using the internal network during the build phase.
Perhaps I wasn't clear about the new project test. What I meant is to create 2 new test services from the same source code in a new project, to rule out the IPv6-only networking issue.
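On the build-phase point, one possible way to keep internal URLs out of build-time fetches is to choose the base URL by phase. This is a sketch under assumptions: it relies on Next.js setting NEXT_PHASE to "phase-production-build" while next build runs, PUBLIC_API_URL is a hypothetical variable name (INTERNAL_API_URL is the one mentioned earlier in the thread), and resolveApiBase is a hypothetical helper.

```javascript
// Pick the base URL for server-side calls to Service A.
// Assumption: Next.js sets NEXT_PHASE to "phase-production-build" during `next build`.
// PUBLIC_API_URL is a hypothetical env var holding the public edge URL.
function resolveApiBase(env = process.env) {
  const isBuild = env.NEXT_PHASE === "phase-production-build";
  // Private networking is not reachable during the build, so fall back to the public URL.
  if (isBuild || !env.INTERNAL_API_URL) return env.PUBLIC_API_URL;
  return env.INTERNAL_API_URL;
}
```

This only addresses build-time fetches; it would not change the runtime ECONNREFUSED described above.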
23 days ago
One more datapoint to make this unambiguous: I just ran the control test that darseen suggested — fresh Railway project (private-net-test, project ID 17d508bd-5edd-42b8-80cc-3e5405715bbd), two new services in
it, identical pattern (one Express-style HTTP server binding :::8080, one Node sidecar fetching it via service-a.railway.internal:8080).
Fresh project result:
{
"dns": { "ok": true, "type": "AAAA",
"values": ["fd12:383c:c461:1:8000:11b:91be:28e8"], "ms": 4 },
"privateFetch": { "ok": true, "status": 200, "ms": 10 },
"publicFetch": { "ok": true, "status": 200, "ms": 27 }
}
Private-network DNS resolves AAAA in 4ms, fetch round-trip in 10ms. This is exactly what's expected and what our prod project should be doing.
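The script that produced the report above isn't shown in the thread. A minimal sketch that yields the same shape might look like the following; runReport and timed are hypothetical names, and the use of an AAAA lookup via resolve6 plus the global fetch (Node 18+) is an assumption about how the original test was built.

```javascript
const dns = require("node:dns/promises");

// Run one step, time it, and turn a thrown error into { ok: false, code, ms }.
async function timed(step) {
  const start = performance.now();
  try {
    const data = await step();
    return { ok: true, ...data, ms: Math.round(performance.now() - start) };
  } catch (err) {
    // Node's fetch usually keeps the socket error code on err.cause.
    const code = err.code ?? err.cause?.code ?? err.name;
    return { ok: false, code, ms: Math.round(performance.now() - start) };
  }
}

// Produce { dns, privateFetch, publicFetch }. `deps` lets callers inject fakes for testing.
async function runReport({ hostname, privateUrl, publicUrl }, deps = {}) {
  const resolve6 = deps.resolve6 ?? ((h) => dns.resolve6(h));
  const fetchFn = deps.fetch ?? globalThis.fetch;
  const get = async (url) => {
    const res = await fetchFn(url, { signal: AbortSignal.timeout(5000) });
    return { status: res.status };
  };
  return {
    dns: await timed(async () => ({ type: "AAAA", values: await resolve6(hostname) })),
    privateFetch: await timed(() => get(privateUrl)),
    publicFetch: await timed(() => get(publicUrl)),
  };
}
```

Note that in this sketch ok only means a response was received; a non-2xx status would still need to be checked separately.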
So:
- Fresh project = works perfectly, no special config needed.
- Our existing project = ECONNREFUSED on legacy hostname byson-server.railway.internal, ENOTFOUND on current hostname tortus-server.railway.internal. Same project, same docs followed, same Node version, same
:::8080 binding.
The only material differences between the two projects are age (the existing project predates the Oct 2025 IPv4 private-networking rollout) and the fact that our service was renamed at some point. This rules out
a generic platform or configuration problem and points to the rename / legacy-state-registration theory raised earlier in this thread.
Please look at the control-plane state for service cca7ab1e-bb7b-4de1-a32a-e7ccad7133b3 in project b7c836ae-0414-4a42-96b8-7be449b555c4. Whatever migration step normally registers a service in the
post-Oct-2025 networking layer didn't run for this one (or got stale post-rename). Re-running it should resolve.
23 days ago
Since the issue is related to legacy IPv6-only environments: Railway will migrate all legacy environments at a later date, as the docs say, though they don't specify when. That said, your apps shouldn't have a problem running in a legacy environment if they are configured to bind to :: correctly, as per the docs:
Attachments
21 days ago
@railway is this something you can investigate in my account? Certainly sounds like a bug. Would really appreciate it before we commit to a week of work to switch to a new project.
17 days ago
Closing this. Created a new project.
Status changed to Solved mykal • 16 days ago