2 years ago
Intermittent networking issues when calling services internally on Railway, i.e. calling the same service or another service on Railway using a public domain. The failures were intermittent from 6:50pm, and from 9:02pm we have been unable to call services at all (UK times). Network requests are currently hanging.
Calling these services from outside Railway works perfectly, as expected.
This looks like an issue with private networking and internal routing. We also have intermittent connection loss when calling DB/Redis and web services over private networking.
Posted on twitter: https://x.com/cmacrae2016/status/1754637650097586587?s=20
16 Replies
2 years ago
That, unfortunately, is a common problem caused by the private network initialization delay. Please add a 3 second sleep before your app starts.
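A variant of the same idea is to wait until the private hostname resolves rather than sleeping for a fixed time. This is only a minimal Python sketch, not Railway tooling: the function name `wait_for_dns` and the hostname in the usage comment are assumptions for illustration. A plain `time.sleep(3)` before the app starts is the simpler form of the advice above.

```python
import socket
import time


def wait_for_dns(host, attempts=10, delay=0.5,
                 resolve=socket.getaddrinfo, sleep=time.sleep):
    """Return True once `host` resolves, or False after `attempts` tries."""
    for attempt in range(attempts):
        try:
            resolve(host, None)
            return True
        except socket.gaierror:
            # Name not resolvable yet; pause before the next try.
            if attempt < attempts - 1:
                sleep(delay)
    return False


# Usage at startup (hypothetical private hostname, replace with your own):
#   if not wait_for_dns("postgres.railway.internal"):
#       raise SystemExit("private network DNS did not come up in time")
```

The `resolve` and `sleep` parameters exist only so the loop can be exercised without a real network.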
2 years ago
This happens while the service is running, not during startup. I can use the roundhouse.proxy.rlwy.net proxy to get it to work consistently, but if I use that, you charge for egress.
2 years ago
It's not just that, though: calls from inside Railway to services that run on Railway, using their public address, are failing. We have been running this setup for a few months with no issues. As I say, this only started happening at 6:50pm (UK time).
2 years ago
Just tested again with an API /heathz request to that service in my project. It works externally, but the service cannot call its own public address.
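For anyone wanting to reproduce this kind of check from both inside and outside a service, here is a minimal Python sketch of a probe that gives up instead of hanging. The function name `probe` is an assumption for illustration, and the URL is whatever health endpoint your own service exposes. Note that `timeout` applies to each blocking socket operation, so it is not a strict cap on total request time.

```python
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return (ok, detail) for a GET to `url`, with a socket timeout in seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failures, refused connections and timeouts all land here.
        return False, f"{type(exc).__name__}: {exc}"
```

Running the same probe from a laptop and from inside the service makes it easier to show that only the internal path fails.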
2 years ago
We're looking into this as well. Curious, are these errors occurring when attempting to connect to the public database URL? -> Error: getaddrinfo ENOTFOUND postgres
Also, it looks like the service restarted a couple of hours ago, just wondering was that a manual restart?
2 years ago
Yeah @melissa manual restart to see if it was the service… Thanks for looking into this! Maybe it's just me affected?
2 years ago
Ah! I think we found the problem. It looks like you need to add the workaround for Alpine images. Set this variable in your service: ENABLE_ALPINE_PRIVATE_NETWORKING=true
Docs ref: https://docs.railway.app/guides/private-networking#workaround-for-alpine-based-images
I can update that template so it comes default, too. Let me know if this helps?
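As a rough illustration of how a service could notice this situation itself, here is a minimal Python sketch. It assumes the image is Alpine if `/etc/alpine-release` exists (that file ships with Alpine) and only checks whether the variable is visible to the process. The function name `alpine_workaround_missing` is hypothetical, and the variable itself still has to be set in the Railway service settings as described above; this check can only warn.

```python
import os
from pathlib import Path

WORKAROUND_VAR = "ENABLE_ALPINE_PRIVATE_NETWORKING"


def alpine_workaround_missing(environ=os.environ,
                              release_file="/etc/alpine-release"):
    """Return True when running on Alpine without the workaround set to "true"."""
    on_alpine = Path(release_file).exists()
    enabled = environ.get(WORKAROUND_VAR, "").strip().lower() == "true"
    return on_alpine and not enabled


# Usage at startup:
#   if alpine_workaround_missing():
#       print(f"warning: Alpine image detected, consider setting {WORKAROUND_VAR}=true")
```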
2 years ago
@brody and @melissa you guys are awesome! That fix seems to have addressed the issue.
What would be amazing is if you could detect when other Railway users are running Alpine images and suggest adding that env var. I'm not sure whether you do that already; maybe I missed it.
Can't thank you enough for the super fast response.
2 years ago
Neither the Alpine workaround variable nor the 3 second sleep will be needed for much longer; they are actively working on a permanent fix.
2 years ago
Yes!! We've talked about the notion of "suggestions" like what you're describing. Would love to see this kind of thing in the future, too.
Thanks again for raising this.
2 years ago
It looks like I have the same issue. My app cannot connect to the DB over private networking, but everything works over public networking. My project ID is 3e60ecef-0377-406d-8995-dcdadd820852