17 days ago
Two services in my Toucan MLS project cannot communicate over the private network.
- toucan-mls-api (Node.js/Fastify) cannot reach valkey.railway.internal:6379
- Both services show Online in the Railway canvas
- VALKEY_URL is correctly set with the right hostname and password
- Errors began around 2026-05-31T22:26 UTC
- I have restarted both services and done a full redeploy of toucan-mls-api — errors continue
immediately after startup
- The Fastify app logs "Valkey connection error" every ~20 seconds continuously
This appears to be an internal network routing issue between the two services. The public proxy for
Valkey is still accessible, but the private network connection is broken.
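The log line above carries no error code, which makes it hard to tell a DNS failure from a refused or timed-out connection. Below is a minimal sketch of a more informative handler, assuming the client emits Node-style `error` events the way ioredis-compatible clients do. `formatConnectionError` is a hypothetical helper, not part of any library.

```javascript
// Hypothetical helper: turn a Node.js socket error into a log line that
// keeps the fields needed to tell failure modes apart.
function formatConnectionError(err) {
  const parts = ["Valkey connection error"];
  // System errors raised by net sockets usually carry these properties.
  for (const key of ["code", "syscall", "address", "port"]) {
    if (err[key] !== undefined) parts.push(`${key}=${err[key]}`);
  }
  parts.push(`message=${JSON.stringify(err.message)}`);
  return parts.join(" ");
}

// Usage (assumes `client` is an already-created iovalkey/ioredis instance):
// client.on("error", (err) => console.error(formatConnectionError(err)));
```

With the code in the log, `ENOTFOUND` would point at DNS while `ETIMEDOUT` or `ECONNREFUSED` would point at the connection itself.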
Pinned Solution
12 days ago
Resolved — Destroyed the Valkey instance entirely and recreated it from scratch: new instance, new
endpoints, new private network connections. Everything is running again. Root cause appears to have
been a corrupt Docker container that restarting alone couldn't fix. If you're hitting similar private
network breakage between services on Railway, a full teardown and recreate of the Valkey instance is
what actually solved it for us.
11 Replies
Status changed to Open Railway • 17 days ago
17 days ago
0x5b62656e5d
I assume they're in the same environment? Make sure the private URL is the same as listed in Valkey's network settings.
17 days ago
Yes, both services are in the same project (Toucan MLS, production environment). The VALKEY_URL in
toucan-mls-api is redis://default:***@valkey.railway.internal:6379 and the private hostname shown
in Valkey's network settings is valkey.railway.internal. They match. The password is also confirmed
correct.
17 days ago
darseen
Valkey's internal URL might be resolving to an IPv6 address. You can try to change your client's config to handle that. If you are using **ioredis** for example, you can add `family: 6` to your client's initialization config object.
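A minimal sketch of that suggestion, assuming an ioredis-style client that accepts an options object with `host`, `port`, `username`, `password`, and `family`. `valkeyOptionsFromUrl` is a hypothetical helper built on Node's `URL` class.

```javascript
// Build connection options from a redis:// URL and force IPv6 lookups.
// `family: 6` is the option named in the reply above; the rest is plain
// URL parsing.
function valkeyOptionsFromUrl(urlString) {
  const url = new URL(urlString);
  return {
    host: url.hostname,
    port: Number(url.port || 6379),
    username: decodeURIComponent(url.username) || undefined,
    password: decodeURIComponent(url.password) || undefined,
    family: 6, // ask the resolver for IPv6 addresses only
  };
}

// Usage (assumes `Redis` is the client class exported by ioredis/iovalkey):
// const client = new Redis(valkeyOptionsFromUrl(process.env.VALKEY_URL));
```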
17 days ago
Thanks for the suggestion, darseen. I tried adding family: 6 to the iovalkey client
config and redeployed — unfortunately the errors continue at the same ~20-second
interval immediately after startup.
2026-06-01T04:02:54Z Valkey connection error
2026-06-01T04:02:55Z Valkey connection error
2026-06-01T04:03:15Z Valkey connection error
Both services show Online in the Railway canvas. The public Valkey proxy is still
reachable. This appears to be a routing failure on the private network side rather
than a DNS address family issue. Happy to share more logs if that helps diagnose
further.
17 days ago
0x5b62656e5d
Try opening the console for your Toucan service (click into the service and you'll see a console tab, but you'll need to enable priority boarding in https://railway.com/account/feature-flags first) and run `ping valkey.railway.internal`.
17 days ago
ping isn't available in the container, but I ran getent hosts valkey.railway.internal
and got:
fd12:bfdc:9183:1:8000:d0:c637:9e5e valkey.railway.internal
So it resolves to IPv6 as suggested. I added family: 6 to the iovalkey client and
redeployed, but connection errors continue. The hostname resolves but the connection
to port 6379 isn't getting through. Looks like a routing issue on the private network
side — any idea what's broken?
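Since `ping` is missing from the container, the DNS step and the TCP step can be checked separately from Node itself. The following is a rough sketch using only built-in modules. `probe` and `explain` are hypothetical names, and the hint text is one interpretation of the error codes, not an official diagnosis.

```javascript
const dns = require("node:dns").promises;
const net = require("node:net");

// Rough reading of common socket error codes (an assumption, not a spec).
const HINTS = {
  ENOTFOUND: "DNS lookup failed",
  ECONNREFUSED: "host answered, but nothing is listening on that port",
  ETIMEDOUT: "no answer before the timeout; packets may be dropped en route",
  EHOSTUNREACH: "no route to the host",
  ENETUNREACH: "no route to the network",
  ECONNRESET: "connection was reset by the peer",
};

function explain(code) {
  return HINTS[code] ?? `unclassified (${code})`;
}

// Resolve the host first, then attempt a bare TCP connect to the address.
async function probe(host, port, { family = 6, timeoutMs = 3000 } = {}) {
  let address;
  try {
    ({ address } = await dns.lookup(host, { family }));
  } catch (err) {
    return { stage: "dns", ok: false, code: err.code, hint: explain(err.code) };
  }
  return new Promise((resolve) => {
    const socket = net.connect({ host: address, port });
    socket.setTimeout(timeoutMs);
    socket.once("timeout", () => {
      // Surface the timeout through the regular error path.
      socket.destroy(Object.assign(new Error("timed out"), { code: "ETIMEDOUT" }));
    });
    socket.once("connect", () => {
      socket.destroy();
      resolve({ stage: "tcp", ok: true, address });
    });
    socket.once("error", (err) => {
      resolve({ stage: "tcp", ok: false, address, code: err.code, hint: explain(err.code) });
    });
  });
}

// Usage from the service's console:
// probe("valkey.railway.internal", 6379).then(console.log);
```

A result with `stage: "tcp"` and `ok: false` would match what is described above: the name resolves, but the connection to port 6379 does not complete.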
16 days ago
Since running getent proves your container can resolve the IPv6 address, it's not an issue with the private network. My guess is it's related to either your API service or Valkey.
To check whether Node.js (your API) is the problem, you can bypass DNS resolution by hardcoding the IPv6 address into your connection string. For example: redis://default:***@[fd12:bfdc:9183:1:8000:d0:c637:9e5e]:6379.
If it connects, the iovalkey client is failing to handle DNS resolution. Otherwise, the issue is definitely caused by Valkey (check Valkey's deployment logs and share them here).
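A small sketch of that substitution, in case it helps avoid bracket or port typos when editing the URL by hand. `withIpv6Host` is a hypothetical helper, and the address passed in should be whatever `getent hosts` returned at the time, since it may change across deployments.

```javascript
const net = require("node:net");

// Swap the hostname of a connection URL for a bracketed IPv6 literal,
// keeping the scheme, credentials, port, path, and query unchanged.
function withIpv6Host(urlString, ipv6Address) {
  if (!net.isIPv6(ipv6Address)) {
    throw new TypeError(`not an IPv6 address: ${ipv6Address}`);
  }
  const url = new URL(urlString);
  const auth = url.username || url.password ? `${url.username}:${url.password}@` : "";
  const port = url.port ? `:${url.port}` : "";
  return `${url.protocol}//${auth}[${ipv6Address}]${port}${url.pathname}${url.search}`;
}

// Usage:
// withIpv6Host(process.env.VALKEY_URL, "fd12:bfdc:9183:1:8000:d0:c637:9e5e");
```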
16 days ago
3707:C 19 May 2026 21:45:00.071 * DB saved on disk
3707:C 19 May 2026 21:45:00.072 * Fork CoW for RDB: current 0 MB, peak 0 MB, average 0 MB
1:M 19 May 2026 21:45:00.163 * Background saving terminated with success
1:M 19 May 2026 22:05:00.088 * 100 changes in 300 seconds. Saving...
1:M 19 May 2026 22:05:00.089 * Background saving started by pid 3708
3708:C 19 May 2026 22:05:00.097 * DB saved on disk
3708:C 19 May 2026 22:05:00.098 * Fork CoW for RDB: current 0 MB, peak 0 MB, average 0 MB
1:M 19 May 2026 22:05:00.189 * Background saving terminated with success
Mounting volume on: /var/lib/containers/railwayapp/bind-mounts/1b304bdb-5a8f-4714-8ef6-dbd3c7a3266f/vol_756xswwwx5uqf4wf
Valkey deploy logs end at May 20. No logs are available for May 21 through June 1, despite
the service showing Active.
16 days ago
Valkey is healthy, the hardcoded IPv6 address still fails, and there have been no logs for 11 days despite the Active status. Therefore, the private network routing is broken on Railway's side.
16 days ago
Description:
Our Valkey service became unreachable from our Fastify service on May 31, 2026. We've done extensive
troubleshooting and worked around it by recreating Valkey, but the underlying private network issue
is unresolved and we need it investigated.
Project: Toucan MLS
What broke: valkey.railway.internal stopped routing from our Fastify service (toucan-mls-api).
Troubleshooting steps completed:
- DNS works — getent hosts valkey.railway.internal from the Fastify container resolved correctly to
fd12:bfdc:9183:1:8000:d0:c637:9e5e. DNS is not the problem.
- Hardcoded IPv6 failed — Set the connection URL to
redis://default:***@[fd12:bfdc:9183:1:8000:d0:c637:9e5e]:6379 to bypass DNS entirely. Still failed.
Rules out client-side DNS resolution as the cause.
- Public TCP proxy also failed — Switched to the public proxy URL (hopper.proxy.rlwy.net:48041).
Confirmed broken from local machine with redis-cli — returned I/O error: Connection reset by peer.
- Valkey container was healthy — Deploy logs showed clean periodic RDB saves with zero errors up to
May 19. Service showed Active throughout.
- Workaround — Deleted and recreated the Valkey service. New instance
(roundhouse.proxy.rlwy.net:20217) works via public proxy. Currently running on public proxy as a
fallback.
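For anyone repeating these checks without redis-cli installed, a protocol-level PING can be sent with Node's built-in `net` module. This is a sketch, assuming the server speaks the standard RESP protocol and accepts `AUTH <username> <password>`. `encodeCommand` and `rawPing` are hypothetical helpers.

```javascript
const net = require("node:net");

// Encode a command as a RESP array of bulk strings.
function encodeCommand(args) {
  let out = `*${args.length}\r\n`;
  for (const arg of args) {
    out += `$${Buffer.byteLength(arg)}\r\n${arg}\r\n`;
  }
  return out;
}

// Connect, optionally AUTH, send PING, and resolve with the raw reply text.
function rawPing({ host, port, username, password, timeoutMs = 3000 }) {
  return new Promise((resolve, reject) => {
    const socket = net.connect({ host, port });
    const wanted = password ? 2 : 1; // one reply line per command sent
    let reply = "";
    socket.setTimeout(timeoutMs, () => socket.destroy(new Error("timed out")));
    socket.once("connect", () => {
      if (password) {
        socket.write(encodeCommand(["AUTH", username ?? "default", password]));
      }
      socket.write(encodeCommand(["PING"]));
    });
    socket.on("data", (chunk) => {
      reply += chunk.toString("utf8");
      if (reply.split("\r\n").length - 1 >= wanted) {
        socket.destroy();
        resolve(reply.trim());
      }
    });
    socket.once("error", reject);
    // No-op if the promise has already settled.
    socket.once("close", () => reject(new Error("closed before a full reply")));
  });
}

// Usage:
// rawPing({ host: "valkey.railway.internal", port: 6379, password: "..." })
//   .then(console.log, console.error);
```

A rejection with `ECONNRESET` here would correspond to the "Connection reset by peer" that redis-cli reported, while a `+PONG` reply would show the path works end to end.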
Current state: Online via public proxy. Private network still untested on the new instance.
The ask: Please investigate what caused the private network routing to stop working between services
in this project on May 31. We'd like to switch back to valkey.railway.internal once it's confirmed
safe, to avoid public internet routing and unnecessary bandwidth usage.
Note: the old Valkey instance (hopper.proxy.rlwy.net:48041) is now deleted, but the failure of both
its private network hostname and its public proxy simultaneously suggests a broader infrastructure
issue worth investigating.
Status changed to Solved dev • 10 days ago