15 days ago
monitors went down showing "socket hang up" just now
91 Replies
15 days ago
might've been resolved on its own?
15 days ago
back green but still...
15 days ago
connection err closed something something when I tried to fetch my service's healthcheck endpoint
15 days ago
3 services were/are affected so far, no new deployments
15 days ago
will look around
15 days ago
I just had it happen on railway.com as well
15 days ago
I'm in eu ams
affected services whose healthchecks failed externally
15 days ago
Seeing latency spikes
15 days ago
both screenshots are of services linked above
15 days ago
not seeing the socket hang up anymore but still seeing slight latency spikes
15 days ago
socket hang up again
15 days ago
may I ask which ip/isp you're getting these from? can dm too
15 days ago
502 bad gateways, connection reset by peer
15 days ago
69.46.46.14:443: i/o timeout
15 days ago
constantly, like right now?
15 days ago
upstream error
15 days ago
uhh not super constantly but quite consistent
15 days ago
is there any response body?
15 days ago
upstream error
15 days ago
everything seems to be stable the last 30m, but might just be luck
15 days ago
let me know if you see one in the next 5min or so
15 days ago
instability is back...
15 days ago
unexpected EOF
15 days ago
read tcp 172.18.0.8:33922->66.33.22.216:443: read: connection reset by peer
15 days ago
ok, I know why you see this now
15 days ago
wait what
15 days ago
Oh I see, the origin/railway proxies (on the 66.33.22.0/24 prefix) were just deployed
15 days ago
How aggressively are you monitoring those? Do you keep the connection open?
15 days ago
I don't think so, minutely checks
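For anyone reproducing this with a once-a-minute check: a minimal sketch of a probe that opens a fresh connection on every run (no keep-alive) and buckets the failure types reported in this thread. The names `classify` and `check` and the bucket labels are made up for illustration, and the mapping from Python exceptions to messages like "socket hang up" is an approximation, not how any particular monitoring tool reports them.

```python
import http.client
import socket
import time


def classify(exc):
    """Map a connection-level exception to a coarse failure bucket."""
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    # RemoteDisconnected subclasses ConnectionResetError, so test it first.
    # It means the peer closed the socket before sending any response,
    # roughly what Node-based monitors report as "socket hang up".
    if isinstance(exc, http.client.RemoteDisconnected):
        return "closed_without_response"
    if isinstance(exc, ConnectionResetError):
        return "reset_by_peer"
    if isinstance(exc, ConnectionRefusedError):
        return "refused"
    if isinstance(exc, http.client.IncompleteRead):
        return "unexpected_eof"
    return "other"


def check(host, path="/", timeout=10.0):
    """Run one HTTPS GET on a brand-new connection; return (outcome, seconds)."""
    conn = http.client.HTTPSConnection(host, timeout=timeout)
    start = time.monotonic()
    try:
        # "Connection: close" asks the server not to keep the socket open.
        conn.request("GET", path, headers={"Connection": "close"})
        resp = conn.getresponse()
        resp.read()
        outcome = f"http_{resp.status}"
    except (OSError, http.client.HTTPException) as exc:
        outcome = classify(exc)
    finally:
        conn.close()
    return outcome, time.monotonic() - start
```

Logging the outcome next to the elapsed time makes it easier to tell a slow but successful response (a latency spike) from a reset or a timeout.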
15 days ago
I'm trying to access the failing URLs from my browser. I get hit with a conn refused (or similar, the browser auto-refreshes) and afterwards it seems to work fine
15 days ago
could you send me the failing URL in here/over DMs? (does it resolve to an address within 66.33.22.0/24)?
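A quick way to answer the "does it resolve to an address within 66.33.22.0/24" question is a few lines of Python. This is only a sketch: `in_prefix` and `resolve_ipv4` are illustrative names, and the only prefix checked is the one stated in this thread.

```python
import ipaddress
import socket
import sys

# The only prefix spelled out in the thread; nothing else is assumed.
TARGET_PREFIX = ipaddress.ip_network("66.33.22.0/24")


def in_prefix(address, network=TARGET_PREFIX):
    """Return True if the IP address string falls inside the network."""
    return ipaddress.ip_address(address) in network


def resolve_ipv4(hostname):
    """Resolve a hostname to its unique IPv4 addresses (needs working DNS)."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Usage: python check_prefix.py <hostname> [<hostname> ...]
    for host in sys.argv[1:]:
        for addr in resolve_ipv4(host):
            where = "inside" if in_prefix(addr) else "outside"
            print(f"{host} -> {addr} ({where} {TARGET_PREFIX})")
```

Running it from the monitor's own network matters, since DNS answers can differ by resolver location.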
15 days ago
I'll send you the ones my monitors are annoying me about, do want to note that it's intermittent so far
15 days ago
seeing similar intermittent down behavior
15 days ago
happened again around 40 minutes ago
14 days ago
seeing timeouts, question mark
14 days ago
do you know if it's the 66.33.22 IPs or 69.46.46 IPs you're seeing timeouts on?
14 days ago
69.46.46.58
14 days ago
resolved "itself" as of now but
14 days ago
ok, that would do it. bgp should reconverge
14 days ago
Let me know if you see anything else.
14 days ago
Was this from a monitor out of interest? Is the monitor hitting the "ams1" POP? https:///.railway/cdn-trace
Yeah, it was a monitor hitting my own API. The entire app was running into a Cloudflare timeout error, but the Railway metrics were still showing everything as healthy and operating normally, so I'm not sure what that was
14 days ago
https://discord.com/channels/713503345364697088/1511663784127762483/1511663784127762483 seeing some latency spikes
14 days ago
69.46.46.58
14 days ago
Could you run a traceroute/mtr to that IP from the monitor's network/server if possible? 🙏
14 days ago
not consistent latency... mtr just shows ams eq6 right now, but I'll keep retrying
14 days ago
some cloudflare proxied requests are timing out completely
14 days ago
I'd like to see the hops you take before eq6
14 days ago
2.|-- sre02.gs.core.blackgate.nl 0.0% 10 4.6 4.7 4.3 6.0 0.5
3.|-- 100.65.1.14 0.0% 10 4.3 5.0 3.8 11.2 2.2
4.|-- 100.65.0.161 0.0% 10 8.6 5.6 3.9 9.8 2.0
5.|-- 81.20.68.161 0.0% 10 4.6 5.8 4.5 10.0 1.8
6.|-- ae-7.r23.amstnl07.nl.bb.gin.ntt.net 0.0% 10 84.0 85.5 83.4 90.8 2.5
7.|-- ae-18.a01.amstnl07.nl.bb.gin.ntt.net 0.0% 10 8.6 7.4 4.7 14.1 3.1
8.|-- 81.20.68.138 0.0% 10 3.4 4.3 3.3 10.5 2.2
9.|-- vl221.ams-eq6-dist-2.cdn77.com 0.0% 10 4.5 5.1 4.5 6.5 0.6
10.|-- 69.46.46.70 0.0% 10 4.9 5.0 4.3 7.6 0.9
14 days ago
https://utilities-us-east.up.railway.app/raw
What is the x-railway-edge header you see here?
14 days ago
railway/europe-west4-drams3a
14 days ago
What about here? https://utilities-us-east-cf-proxied.railway.com/raw
14 days ago
same edge val, cf-ray ending in -ams
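To compare the two hostnames repeatedly rather than eyeballing headers in a browser, something like the following could work. It is a sketch only: `cf_colo`, `edge_info` and `probe` are made-up helper names, and the one assumption about formats is that a `cf-ray` value ends in `-<data center code>`, as in the "-ams" and "lhr" values mentioned in this thread.

```python
import urllib.request


def cf_colo(cf_ray):
    """Return the data-center code from a cf-ray value ("<id>-<COLO>"), or None."""
    if not cf_ray or "-" not in cf_ray:
        return None
    return cf_ray.rsplit("-", 1)[1].upper()


def edge_info(headers):
    """Pick the two routing headers out of a mapping of response headers."""
    lowered = {key.lower(): value for key, value in headers.items()}
    return {
        "railway_edge": lowered.get("x-railway-edge"),
        "cf_colo": cf_colo(lowered.get("cf-ray")),
    }


def probe(url, timeout=10.0):
    """Fetch the URL once and report which edge / Cloudflare POP answered."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return edge_info(dict(resp.headers.items()))
```

Calling `probe` on both the `up.railway.app` URL and the Cloudflare-proxied one every few seconds would show whether the POP flips (for example AMS to LHR) right before a timeout. On the non-proxied hostname `cf_colo` is simply `None`.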
14 days ago
Ty
14 days ago
timing out for me
14 days ago
and... not anymore
14 days ago
No timeouts on https://utilities-us-east.up.railway.app/raw right - just via Cloudflare?
14 days ago
timeouts only observed via cloudflare so far
14 days ago
latency spikes on non-cloudflare still a thing but my monitors haven't tripped on them
14 days ago
cf-ray became lhr, then timed out the following refresh
(and once again I am able to fetch it fine..)
14 days ago
I'm assuming you disabled the ams pop
14 days ago
Nope it's still up, do you see that on the non-CF domain as well?
14 days ago
I do not
14 days ago
I have a suspicion on what it could be. I'm going to disable something and we can see if that resolves it.
swag42dev
 railway/europe-west4-drams3a
14 days ago
Is your domain also proxied by Cloudflare?
14 days ago
no
14 days ago
I've disabled something in AMS - let me know if you see improvement over the next half hour or so
phin
Is your domain also proxied by Cloudflare?
14 days ago
Domain management is on CF, but this particular domain is not proxied.
14 days ago
still seeing these (or should I just wait a bit more)
14 days ago
What latency is that tracking? ICMP or HTTP?
14 days ago
HTTP
phin
May I know the domain or service this is happening to?
14 days ago
project/b1f6fd55-ba7c-423a-8784-21ae43029bab/service/88da5b4b-99c2-4e7d-8946-b51c75496f9f
14 days ago
still going cf ams -> hikari lhr
14 days ago
non-proxied latency spikes still occurring
14 days ago
Is there a specific domain this is occurring on or is it all of your domains?
14 days ago
I'll dm
14 days ago
This is starting to affect other deployments.
The problem is becoming widespread.
14 days ago
Hey.
Same here.
Using a VPN, we tested a few other edges.
In a nutshell, railway/europe-west4-drams3a is definitely causing trouble for us. us-west2 works fine as far as we can tell.
14 days ago
Monitors tripped over *.up.railway.app timeout
14 days ago
so it's no longer necessarily CF specific
14 days ago
I'm also seeing latency spikes from Singapore
(Singapore Railway service -> EU-W Railway service over pubnet)
14 days ago
I'm guessing something was changed about 18 minutes ago? Latency seems better
swag42dev
 This is starting to affect other deployments. The problem is becoming widespread.
14 days ago
I’m having the same issue on my Railway-hosted Node/NestJS backend, but Railway support has not responded yet
13 days ago
There are still a few latency spikes going on, but not often
13 days ago
Interesting. All of those domains have pointed to the old edge/66.33.22.0/24 for about 8 hours.