Outbound connection timeouts
tuxy
PRO · OP

2 months ago

We've built a service called uupsie.com, which is hosted on your platform.

It's a Service Status page SaaS platform - e.g. https://status.aatcorp.net/

While testing the service, we've noticed in the metrics that connections intermittently hang, hit the monitor's 10s timeout, and fail.

We thought it might be our checking code. However, I've since noticed errors from the workers reporting connection problems on Railway's internal network:

Illuminate\Broadcasting\BroadcastException: Pusher error: cURL error 28: Failed to connect to uupsie-ws.railway.internal port 8080 after 10001 ms: Timeout was reached (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for http://uupsie-ws.railway.internal:8080/apps/219368/events?auth_key=REDACTED&auth_timestamp=1776880741&auth_version=1.0&body_md5=660b7bb1d7f491919e4f57426e6351da&auth_signature=REDACTED. in /app/vendor/laravel/framework/src/Illuminate/Broadcasting/Broadcasters/PusherBroadcaster.php:171

This is an example where one of our applications is talking to the WebSocket service (Reverb) and experiencing the same issue. This time, however, we were able to rule out Cloudflare or any other external cause, and the problem seems to occur within the Railway network.

I'm wondering if the Railway network is treating this traffic as spam or a DDoS and is intermittently blocking connections. But this error was on an internal connection, not an external one, which makes it odd.

I can confirm that the service was operational and receiving connections at that time, and the issue is not isolated to this one service.

We are seeing the same problem when talking to customers' on-premise infrastructure endpoints, DigitalOcean, and others.

Could you please assist? We are a bit stuck, as we can't see the internals of outbound traffic within Railway.

Thanks,

Nick

$20 Bounty

2 Replies

Railway
BOT

2 months ago

This thread has been marked as public for community involvement, as it does not contain any sensitive or personal information. Any further activity in this thread will be visible to everyone.

Status changed to Open by Railway about 2 months ago


tuxy
PRO · OP

25 days ago

How do I switch to a Metal plan?


Anonymous
PRO · Top 10% Contributor

20 days ago

It is not a separate billing plan. Railway Metal is selected per service through the region setting.

Go to the service, then Settings, then Scale, then Regions. Pick a region that has the Metal (New) tag. Railway documents that this redeploys the service, so expect a brief deploy window unless health checks and rollout settings are clean.

For your specific case, move the communicating services together, not just one service. The worker and the uupsie-ws/Reverb service should be in the same Metal region if they are talking over private networking. If a database or volume-backed service remains in a different or legacy region, you can create extra latency instead of fixing the timeout.

After switching, confirm in Settings -> Scale -> Regions that the service shows the Metal (New) tag. If latency gets worse, Railway says the manual rollback path is the same screen: select a non-Metal region again.
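
To judge whether the switch helped, it can be useful to measure TCP connect times from the worker to the WebSocket service before and after the move. The following is only a rough sketch, not an official Railway tool: the function names (`time_connect`, `summarize`, `probe`) are made up for illustration, and the 10-second timeout is an assumption chosen to mirror the timeout in the cURL error above. It uses only the Python standard library.

```python
import socket
import statistics
import time
from typing import List, Optional


def time_connect(host: str, port: int, timeout: float = 10.0) -> Optional[float]:
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers connect timeouts, refused connections, and DNS failures.
        return None


def summarize(samples: List[Optional[float]]) -> dict:
    """Summarize connect attempts: failure count plus median and max latency."""
    ok = [s for s in samples if s is not None]
    return {
        "attempts": len(samples),
        "failures": len(samples) - len(ok),
        "median_ms": round(statistics.median(ok) * 1000, 1) if ok else None,
        "max_ms": round(max(ok) * 1000, 1) if ok else None,
    }


def probe(host: str, port: int, attempts: int = 20,
          pause: float = 0.5, timeout: float = 10.0) -> dict:
    """Run several connect attempts in a row and return a summary."""
    samples = []
    for _ in range(attempts):
        samples.append(time_connect(host, port, timeout))
        time.sleep(pause)
    return summarize(samples)
```

Running `probe("uupsie-ws.railway.internal", 8080)` from inside the worker service (the host and port are taken from the error message in the original post) gives a failure count and latency figures you can compare between the old region and the Metal region. A non-zero failure count on the private network after both services are in the same region would be worth reporting back to Railway.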

