a month ago
Hi Railway team 👋
We’re seeing a very unusual spike in egress costs and wanted help understanding what might be happening and how to mitigate it.
Context:
Our typical monthly spend is around $900
Last month we paid $2,070.61
This month the forecast is $3,470.72
After investigating, the increase is almost entirely driven by Egress
What’s confusing us:
Most of our services are development environments
We run many microservices, but a large portion of them receive little to no traffic
Despite that, we’re seeing cases where a service generated ~1TB of egress in a single day (screenshot)
This is happening even on services that barely receive requests
The egress spike appears to be happening across many (or all) services at the same time, not isolated to a single service or workload
Questions:
What are the most common causes of unexpectedly high egress on Railway?
Is there a way to break down egress by destination or source (internal vs. external)?
Last month we didn’t catch this in time, but this month the projected cost really raised a red flag, so we’d appreciate any guidance on what to look for and how to prevent this going forward.
Thanks in advance 🙏
Project ID: e60124ce-5bb4-41cb-8517-9606dbe5b9f7
9 Replies
a month ago
To answer your questions:
Most commonly, from what I've seen, unexpectedly high egress comes from people connecting their databases or other services over the public network rather than the private network.
One thing you could try is using Railway's GraphQL API (the httpLogs query) to fetch all (or at least a lot of) the HTTP logs from that time period. You can then write some code that goes through them and ranks source IPs by total bytes sent (grouping by srcIp, or by clientUa). That tells you which IPs the egress is going to, and from there you can check whether it's internal use (based on whether the IP is a Railway IP, or on what clientUa says).
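If it helps, here's a rough sketch of just the ranking step, assuming the logs have already been fetched from the GraphQL httpLogs query. The field names (srcIp, clientUa, and a bytes-sent field I've called txBytes) are assumptions based on the description above, so check them against the actual schema:

```python
from collections import defaultdict

# Sample rows in the rough shape described above; in practice these would
# come from paging through the httpLogs GraphQL query. Field names are
# assumptions, not confirmed schema.
logs = [
    {"srcIp": "100.64.0.3", "clientUa": "Prometheus/2.x", "txBytes": 2_237_150},
    {"srcIp": "203.0.113.7", "clientUa": "Mozilla/5.0", "txBytes": 14_200},
    {"srcIp": "100.64.0.3", "clientUa": "Prometheus/2.x", "txBytes": 1_900_000},
]

def rank_by_src_ip(rows):
    """Sum bytes sent per source IP and sort heaviest-first."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["srcIp"]] += row["txBytes"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

for ip, total in rank_by_src_ip(logs):
    print(f"{ip}\t{total} bytes")
```

The top entries of the ranking are the IPs to look up: if they belong to Railway's internal ranges (or the user agent is your own monitoring stack), the egress is likely self-inflicted traffic taking the public path.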
Thanks for the pointers!
I dug a bit deeper using the Network Flow Logs tab on one of the services, and I think I found something that might be related to the unexpected egress.
Here is an example of one flow entry while the microservice is “idle” (no application logs being produced, only background monitoring running):
```json
{
  "flowId": "b34adaba-1c84-4017-baa9-574d813af943",
  "captureStart": "2026-02-02T17:49:38.312255295Z",
  "captureEnd": "2026-02-02T17:49:42.260750637Z",
  "flowState": "partial",
  "srcAddr": "10.250.36.108",
  "dstAddr": "100.64.0.3",
  "srcPort": 3000,
  "dstPort": 12498,
  "l4Protocol": "tcp",
  "byteCount": 2237150,
  "packetCount": 1656,
  "direction": "egress",
  "peerKind": "internet",
  "serviceId": "9e53a242-8eda-4e0b-a19f-4c7855665519",
  "deploymentId": "e7672b41-c501-45ba-bdda-c83ad0306522"
}
```
From my understanding:
10.x.x.x and 100.64.x.x both look like internal/private ranges.
However, the flow is still being classified as direction: "egress" and peerKind: "internet".
This type of flow is happening continuously, even when the service is not receiving external requests.
The only thing I know that runs continuously in the background like this is the /metrics endpoint used by our monitoring stack (Grafana + Prometheus + Loki). All of this is configured to run over Railway’s private network, since I know public traffic is billed as egress.
A couple of questions based on this:
Should traffic between addresses like 10.x.x.x and 100.64.x.x (inside Railway) really be classified as peerKind: "internet" and contribute to paid egress?
Is there any known issue where internal/private traffic can be misclassified as internet egress in the Network Flow Logs?
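One detail worth noting here: 100.64.0.0/10 is not RFC 1918 private space but the RFC 6598 shared address space (commonly used for carrier-grade NAT and cloud proxy layers), which may be why peers in that range are treated differently from 10.x.x.x. A quick standard-library check of which block an address falls into:

```python
import ipaddress

def classify(addr: str) -> str:
    """Label an IPv4 address as RFC 6598 shared, RFC 1918 private, or public."""
    ip = ipaddress.ip_address(addr)
    # Check the shared/CGNAT block first, since it is neither private nor global.
    if ip in ipaddress.ip_network("100.64.0.0/10"):
        return "shared (RFC 6598 / CGNAT)"
    if ip.is_private:
        return "private (RFC 1918)"
    return "public"

print(classify("10.250.36.108"))  # the srcAddr from the flow entry above
print(classify("100.64.0.3"))     # the dstAddr from the flow entry above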
In the project network graph UI, some services (for example, our “Hi-There” and “Herald” services) do not show any connection lines to other services, even though they clearly appear in the Network Flow Logs as sending traffic. Could this indicate a bug or misconfiguration on how internal connections are being detected?
Additionally, I attached a screenshot of the Railway UI showing the service network graph. In the image, multiple services that are clearly communicating with each other are not displayed as connected via the private network. I manually drew red arrows on the screenshot to indicate connections that should be visible but are currently missing in the UI.
Given that:
The flows look internal,
They are marked as egress+internet,
And several private connections don’t appear correctly in the UI,
I’m trying to confirm whether this traffic is expected to be billed, or if this might point to a misclassification or visualization issue on Railway’s side, rather than an actual public egress scenario.
Happy to provide more flow samples or details if needed. Thanks again for the help!
15 days ago
You are not billed for private network traffic, you are only billed for traffic that leaves the private network.
There is nothing that is being incorrectly classified there.
Thanks for the clarification.
That makes sense — however, what’s still unclear to me is why this traffic is leaving the private network at all, given our current setup.
Based on what I’m seeing:
The source and destination IPs in the flow logs (10.x.x.x → 100.64.x.x) appear to be internal/private ranges.
The traffic is marked as peerKind: "internet" and direction: "egress", which suggests it is leaving the private network.
This happens continuously, even when the service is idle and not handling external requests.
Multiple services show the same behavior at the same time.
Some services also do not appear connected in the private network graph UI, even though they clearly exchange traffic (as shown in the flow logs).
Given that we are not intentionally using any public endpoints for this traffic (everything is configured to use private networking), I’m trying to understand:
In what scenarios would traffic between services still exit the private network and be billed as egress?
Are there cases where monitoring, metrics scraping, or platform-level components (e.g. Prometheus/Grafana/Loki) operate outside the private network by default?
Is there a recommended way to verify that a given connection is guaranteed to stay within the private network (beyond env var configuration)?
I fully understand that private traffic itself isn’t billed — I just want to identify what is causing this traffic to leave the private network in the first place, since that seems to be driving the cost increase.
Any pointers on where to look next or what configuration detail might cause this would be really helpful.
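While waiting for answers on the billing side, one crude sanity check, under the assumption that staying on the private network means DNS resolves to a non-public address (this only verifies name resolution, not how Railway actually classifies or bills the traffic):

```python
import ipaddress
import socket

def resolves_privately(host: str, port: int = 443) -> bool:
    """Resolve `host` and return True only if every resulting address is
    non-public (RFC 1918 private, RFC 6598 shared, loopback, or IPv6 ULA)."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_global:
            return False
    return True
```

Run from inside a deployment against each configured endpoint (e.g. a hypothetical resolves_privately("loki.railway.internal", 3100)); any endpoint that resolves to a global address is a candidate for the traffic that is leaving the private network.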
15 days ago
I'm sorry, but I won't be able to help further here; I cannot help you identify what is using the public network.
Our community will help you out here.
Thanks for the clarification.
Just to clarify, we were already using Railway’s internal/private domain (*.railway.internal) for Loki/Tempo. The only change I made was switching to the explicit RAILWAY_PRIVATE_DOMAIN variable.
Even with that change, I’m already seeing noticeable differences in the traffic patterns (the larger transfers have disappeared in the Network Flow Logs).
I’ll keep monitoring egress over the next hours/days to see how this affects overall usage long-term, and I’ll report back here with what I find.
Appreciate the help so far!
14 days ago
Hello, the traffic you see to 100.64.x.x is a side effect of how Railway handles traffic from the proxy (e.g. your domain and/or TCP proxy endpoint). On our side, we bill egress traffic to 100.64.x.x, as it is bytes flowing toward the public internet.
14 days ago
We will work on improving the network flow experience so it's easier to understand what's happening.