Intermittent network issue when running health checks in Uptime Kuma
bert0rm
PROOP

a year ago

I am running Uptime Kuma (a monitoring service) in the US West (Oregon, USA) region using the Railway Uptime Kuma template by Brody Over. I am seeing intermittent network failures when it tries to reach my website. The 30-day uptime report shows 98.67% with checks running every 60s.
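To put that percentage in perspective, here is a rough back-of-the-envelope calculation. It assumes the uptime figure is simply the share of successful checks, which may not be exactly how Uptime Kuma computes it, and `failed_checks` is just a helper name made up for this sketch.

```python
def failed_checks(uptime_percent: float, days: int, interval_seconds: int) -> tuple[int, int]:
    """Estimate (total checks, failed checks) for a reporting window.

    Assumes one check per interval and that uptime is the fraction of
    checks that succeeded (a simplification, not Uptime Kuma's formula).
    """
    total = days * 24 * 3600 // interval_seconds
    failed = round(total * (1 - uptime_percent / 100))
    return total, failed


total, failed = failed_checks(98.67, days=30, interval_seconds=60)
print(f"{failed} of {total} checks failed")  # 575 of 43200 checks failed
```

Under those assumptions, that is roughly 575 failed one-minute checks over the 30 days.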

This is the issue reported by Uptime Kuma:

2024-06-18T11:36:29-07:00 [MONITOR] WARN: Monitor #4 'Google': Failing: connect EHOSTUNREACH 74.125.142.103:443 | Interval: 60 seconds | Type: http | Down Count: 0 | Resend Interval: 0

2024-06-18T11:44:38-07:00 [MONITOR] WARN: Monitor #3 'Auth@Railway': Failing: connect EHOSTUNREACH 35.212.174.161:443 | Interval: 60 seconds | Type: http | Down Count: 0 | Resend Interval: 0
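For anyone who wants to compare failure counts per monitor, here is a minimal sketch that tallies lines like the two above. The regular expression is inferred only from these two sample lines, so it is an assumption about the log format rather than anything documented, and `count_failures` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the two sample lines in this post (an assumption,
# not a documented Uptime Kuma log format).
LINE_RE = re.compile(
    r"^(?P<ts>\S+) \[MONITOR\] WARN: Monitor #(?P<id>\d+) '(?P<name>[^']*)': "
    r"Failing: (?P<error>.*?) \| Interval: (?P<interval>\d+) seconds"
)

SAMPLE = [
    "2024-06-18T11:36:29-07:00 [MONITOR] WARN: Monitor #4 'Google': Failing: connect EHOSTUNREACH 74.125.142.103:443 | Interval: 60 seconds | Type: http | Down Count: 0 | Resend Interval: 0",
    "2024-06-18T11:44:38-07:00 [MONITOR] WARN: Monitor #3 'Auth@Railway': Failing: connect EHOSTUNREACH 35.212.174.161:443 | Interval: 60 seconds | Type: http | Down Count: 0 | Resend Interval: 0",
]


def count_failures(lines):
    """Count failing-check log lines per monitor name; skip non-matching lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("name")] += 1
    return counts


print(count_failures(SAMPLE))
```

Running the same tally over logs from both deployments would make the difference between the two providers easy to see.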

I set up a monitor for Google (www.google.com) and another for an HTTP service deployed on Railway, to debug whether this was an issue with my servers or with Railway, and I am seeing the same issue for both. I also deployed a separate instance of Uptime Kuma on a different deployment provider in the US West (Seattle, USA) region with the same Uptime Kuma configuration, and I do not see this issue there (100% uptime for my servers, the Railway HTTP service, and Google, even during the network issues on Railway).

Is there anything I can do to fix these network issues?

10 Replies

brody
EMPLOYEE

a year ago

This is a known issue with the legacy runtime, and it is fixed in the V2 runtime. Unfortunately, the V2 runtime does not yet support volumes, so I would stick with the service that is not experiencing this issue.


bert0rm
PROOP

a year ago

Thanks for the response Brody.

Yeah I am using the V2 runtime and still see the issue. Do you know where I can track progress on this known issue you mentioned?


brody
EMPLOYEE

a year ago

As mentioned, although the V2 runtime is selectable for a service with a volume, a service with a volume will not actually run on it. In other words, you are not using the V2 runtime on your Uptime Kuma service.

The issue is not tracked and won't be fixed, since the legacy runtime will be deprecated. For Railway, it does not make sense to sink time into fixing a legacy system.

Set the retry attempts in Uptime Kuma to a higher value.
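For context, here is a rough sketch of why a higher retry count hides short network blips: the monitor is only treated as down if the first check and every retry fail. This is not Uptime Kuma's actual code, and `is_down_after_retries` and `flaky_probe` are made-up names for illustration.

```python
from typing import Callable


def is_down_after_retries(probe: Callable[[], bool], max_retries: int) -> bool:
    """Return True only if the initial check and all retries fail."""
    for _ in range(1 + max_retries):
        if probe():
            return False  # one success is enough to stay "up"
    return True


def flaky_probe(failures_before_success: int) -> Callable[[], bool]:
    """Build a fake probe that fails a fixed number of times, then succeeds."""
    remaining = failures_before_success

    def probe() -> bool:
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            return False
        return True

    return probe


# A single transient failure marks the monitor down with 0 retries,
# but is absorbed once retries are allowed.
print(is_down_after_retries(flaky_probe(1), max_retries=0))  # True
print(is_down_after_retries(flaky_probe(1), max_retries=3))  # False
```

The trade-off is that a real outage takes longer to be reported, since every retry has to fail first.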


bert0rm
PROOP

a year ago

Thanks for the clarification. I'll try making some updates to the Uptime Kuma configs.

Do you know when we can expect the V2 runtime to support volumes?


brody
EMPLOYEE

a year ago

If all goes well, the next few months, but take that with a grain of salt.


brody
EMPLOYEE

a year ago

The team prioritised volumes on the V2 runtime before moving to bare metal, and most of Railway's hosts now support volumes on the V2 runtime. Set your service to use the V2 runtime and redeploy. You will know you're on the V2 runtime if you see container event logs.


bert0rm
PROOP

a year ago

Is it the `Starting Container` log?


brody
EMPLOYEE

a year ago

Yep, those too. That would indicate that you are indeed on the V2 runtime!


bert0rm
PROOP

a year ago

Sweet, thanks for the help!


brody
EMPLOYEE

a year ago

Now that you are on the V2 runtime, please let me know if you see that error again.

