6 months ago
So I posted about this in <#1067670962276945961>, wondering if there was an unreported incident related to routing. I ran some more tests and found that Railway had auto-selected the beta Metal edge network for my newly deployed service, which I hadn't noticed.
I am from Nepal and I am developing a service for users in Nepal, so my closest Railway region is Singapore, which is 70-100 ms away depending on the time of day.
With the new edge network, my requests were constantly being routed through Europe, as reported by the "x-railway-edge" response header. I couldn't figure out why this was happening, so I temporarily used a cloudflared tunnel to work around it.
After finding the root cause, I switched back and forth between the GCP and Metal edge networks, flushing DNS along the way, and I was able to reproduce this behavior consistently.
Let me know if you guys need any more information.
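For anyone who wants to reproduce the check, here is a rough sketch of how the "x-railway-edge" header could be sampled over several requests. The function names `fetch_edge` and `tally_edges` are made up for illustration, and the header is assumed to be present on every response (otherwise the value is recorded as "unknown").

```python
import urllib.error
import urllib.request
from collections import Counter


def fetch_edge(url: str, timeout: float = 10.0) -> str:
    """Send one HEAD request and return the x-railway-edge header value."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.headers.get("x-railway-edge", "unknown")
    except urllib.error.HTTPError as err:
        # Error responses (404, 405, ...) still carry response headers.
        return err.headers.get("x-railway-edge", "unknown")


def tally_edges(url: str, attempts: int = 5, fetch=fetch_edge) -> Counter:
    """Count which edge value answered each of `attempts` requests."""
    counts = Counter()
    for _ in range(attempts):
        counts[fetch(url)] += 1
    return counts
```

Calling `tally_edges("https://site-production-87d6.up.railway.app")` would then show whether every request lands on the same edge. The `fetch` parameter exists only so the counting logic can be exercised without network access.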
6 months ago
can you check again? we updated some routing information
6 months ago
hey, could you please provide a traceroute or mtr to 66.33.22.1
tracert 66.33.22.1
Tracing route to 66.33.22.1 [66.33.22.1]
over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms dsldevice.lan [192.168.1.254]
2 3 ms 4 ms 3 ms 27.34.24.1
3 4 ms 3 ms 3 ms be-82-8.45.gwc-ndc-core-01.wlink.com.np [202.79.45.8]
4 2 ms 5 ms 4 ms ae-20-136.41.gwj-htda-core-01.wlink.com.np [202.79.41.136]
5 5 ms 4 ms 8 ms ae-21-139.41.gwj-btwl-core-01.wlink.com.np [202.79.41.139]
6 4 ms 5 ms 6 ms ae52-ipt-bhwa-01.wlink.com.np [72.9.128.67]
7 24 ms 37 ms 6 ms 125.17.58.157
8 130 ms 133 ms 132 ms 116.119.61.232
9 154 ms 158 ms 153 ms mei-b5-link.ip.twelve99.net [62.115.42.118]
10 158 ms 158 ms 159 ms prs-bb1-link.ip.twelve99.net [62.115.124.54]
11 164 ms 160 ms 164 ms adm-bb1-link.ip.twelve99.net [62.115.134.96]
12 161 ms 158 ms 158 ms adm-b12-link.ip.twelve99.net [62.115.137.189]
13 161 ms 161 ms 161 ms railwaycorp-ic-390073.ip.twelve99-cust.net [62.115.196.223]
14 157 ms 156 ms 157 ms 66.33.22.1 [66.33.22.1]
6 months ago
Is this your isp?

6 months ago
sec
this is the largest ISP in Nepal, however that's just one. there are 3 more that are just as large
6 months ago
ok good to know thanks
6 months ago
@smol hey, could you try again please?
still the same.
This is the service I am testing against: https://site-production-87d6.up.railway.app
traceroute is virtually the same
tracert 66.33.22.2
Tracing route to 66.33.22.2 over a maximum of 30 hops
1 <1 ms <1 ms <1 ms dsldevice.lan [192.168.1.254]
2 6 ms 4 ms 6 ms 27.34.24.1
3 3 ms 3 ms 3 ms be-82-8.45.gwc-ndc-core-01.wlink.com.np [202.79.45.8]
4 3 ms 4 ms 7 ms ae-20-136.41.gwj-htda-core-01.wlink.com.np [202.79.41.136]
5 5 ms 5 ms 7 ms ae-21-139.41.gwj-btwl-core-01.wlink.com.np [202.79.41.139]
6 6 ms 5 ms 4 ms ae52-ipt-bhwa-01.wlink.com.np [72.9.128.67]
7 11 ms 6 ms 13 ms 125.17.58.157
8 139 ms 140 ms 139 ms 116.119.61.204
9 158 ms 160 ms 156 ms mei-b5-link.ip.twelve99.net [62.115.42.118]
10 167 ms 164 ms 166 ms prs-bb1-link.ip.twelve99.net [62.115.124.54]
11 162 ms 162 ms 162 ms adm-bb1-link.ip.twelve99.net [62.115.134.96]
12 168 ms 165 ms 165 ms adm-b12-link.ip.twelve99.net [62.115.137.189]
13 164 ms 164 ms 163 ms railwaycorp-ic-390073.ip.twelve99-cust.net [62.115.196.223]
14 177 ms 176 ms 179 ms 66.33.22.2
6 months ago
Kk
6 months ago
I’ll have to call an ISP to resolve this so I’ll get back to you later
this is the edge network, going straight to Europe. I have already attached the traceroute for this
ping site-production-87d6.up.railway.app
Pinging edge.railway.app [66.33.22.2] with 32 bytes of data:
Reply from 66.33.22.2: bytes=32 time=175ms TTL=51
Reply from 66.33.22.2: bytes=32 time=179ms TTL=51
Reply from 66.33.22.2: bytes=32 time=177ms TTL=51
Reply from 66.33.22.2: bytes=32 time=178ms TTL=51
Ping statistics for 66.33.22.2:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 175ms, Maximum = 179ms, Average = 177ms
this is the GCP network, going to Singapore
ping site-production-4270.up.railway.app
Pinging trestle.proxy.rlwy.net [35.213.168.149] with 32 bytes of data:
Reply from 35.213.168.149: bytes=32 time=94ms TTL=104
Reply from 35.213.168.149: bytes=32 time=93ms TTL=104
Reply from 35.213.168.149: bytes=32 time=93ms TTL=104
Reply from 35.213.168.149: bytes=32 time=95ms TTL=104
Ping statistics for 35.213.168.149:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 93ms, Maximum = 95ms, Average = 93ms
Here's the traceroute
tracert 35.213.168.149
Tracing route to 149.168.213.35.bc.googleusercontent.com [35.213.168.149]
over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms dsldevice.lan [192.168.1.254]
2 6 ms 4 ms 6 ms 27.34.24.1
3 3 ms 2 ms 4 ms be-82-8.45.gwc-ndc-core-01.wlink.com.np [202.79.45.8]
4 3 ms 4 ms 2 ms ae-20-136.41.gwj-htda-core-01.wlink.com.np [202.79.41.136]
5 8 ms 4 ms 4 ms ae-21-139.41.gwj-btwl-core-01.wlink.com.np [202.79.41.139]
6 5 ms 5 ms 8 ms ae52-ipt-bhwa-01.wlink.com.np [72.9.128.67]
7 5 ms 5 ms 6 ms 125.17.58.157
8 81 ms 87 ms 93 ms 116.119.42.23
9 95 ms 93 ms 94 ms 149.168.213.35.bc.googleusercontent.com [35.213.168.149]
Trace complete.
Hey @Phineas, so I am not an expert at this, but I am guessing this is all a big DNS issue at my ISP or the next provider after my ISP, right?
One interesting observation I made was that even when using a custom domain with Cloudflare (full orange cloud), my requests were still getting routed through Europe. CF has a PoP in the capital and peers directly with my ISP, so all the traffic should be getting handed over to CF by my ISP, right?
I also tested this with WARP (their VPN) and from a colocation center in Nepal. The results were the same.
I don't want my users to connect directly to Railway; I want to use CF in between.
This is what my traceroute looks like for an application hosted on Railway with the edge network, using a custom domain with the Cloudflare proxy
traceroute cf.700104.xyz
traceroute to cf.700104.xyz (172.67.206.87), 30 hops max, 60 byte packets
1 DESKTOP-HK6N184 (172.30.80.1) 0.210 ms 0.201 ms 0.193 ms
2 dsldevice.lan (192.168.1.254) 0.674 ms 1.480 ms 1.699 ms
3 27.34.24.1 (27.34.24.1) 5.901 ms 3.977 ms 6.071 ms
4 be-82-8.45.gwc-ndc-core-01.wlink.com.np (202.79.45.8) 4.134 ms 5.726 ms 5.923 ms
5 * * *
6 103.211.151.11 (103.211.151.11) 6.143 ms 3.877 ms 3.637 ms
7 172.67.206.87 (172.67.206.87) 6.502 ms 3.096 ms 6.186 ms
with ipv6
tracert cf.700104.xyz
Tracing route to cf.700104.xyz [2606:4700:3034::ac43:ce57]
over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms 2400-1a00-4b40.ip6.wlink.com.np [2400:1a00:4b40:ca94::1]
2 14 ms 6 ms 28 ms 2400-1a00-4b04.ip6.wlink.com.np [2400:1a00:4b04:0:c8bd:e0e0:2931:d308]
3 * * 3 ms 2400:1a00:0:45::8
4 * * * Request timed out.
5 2 ms 5 ms 3 ms 2400:1a00:4:1151::6
6 2 ms 8 ms 3 ms 2606:4700:3034::ac43:ce57
Trace complete.
I should have mentioned this was done in a VM, hence the extra hop "DESKTOP-HK6N184"
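To spot where a route leaves the region, one option is to look for the hop with the largest latency increase in the tracert output. The sketch below parses the Windows `tracert` line format pasted in this thread; `parse_tracert` and `largest_jump` are illustrative names, and taking the minimum of the three probes as the hop latency is just one reasonable choice. Intermediate routers often rate-limit ICMP replies, so treat the result as a hint rather than proof.

```python
import re

# Hop number, three probes ("<1 ms", "24 ms" or "*"), then the host column.
HOP_RE = re.compile(r"^\s*(\d+)\s+((?:(?:<?\d+\s*ms|\*)\s+){3})(.+?)\s*$")


def parse_tracert(text):
    """Return (hop, best_rtt_ms, host) tuples; hops with no replies are skipped."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if not m:
            continue  # header lines, blank lines, "Trace complete."
        rtts = [int(v) for v in re.findall(r"(\d+)\s*ms", m.group(2))]
        if not rtts:
            continue  # "* * *  Request timed out."
        hops.append((int(m.group(1)), min(rtts), m.group(3)))
    return hops


def largest_jump(hops):
    """Return (delta_ms, hop_before, hop_after) for the biggest RTT increase."""
    best = None
    for prev, cur in zip(hops, hops[1:]):
        delta = cur[1] - prev[1]
        if best is None or delta > best[0]:
            best = (delta, prev, cur)
    return best


# Hops 6-9 of the first tracert to 66.33.22.1 in this thread.
SAMPLE = """\
  6     4 ms     5 ms     6 ms  ae52-ipt-bhwa-01.wlink.com.np [72.9.128.67]
  7    24 ms    37 ms     6 ms  125.17.58.157
  8   130 ms   133 ms   132 ms  116.119.61.232
  9   154 ms   158 ms   153 ms  mei-b5-link.ip.twelve99.net [62.115.42.118]
"""

delta, before, after = largest_jump(parse_tracert(SAMPLE))
print(f"+{delta} ms between hop {before[0]} ({before[2]}) and hop {after[0]} ({after[2]})")
# prints: +124 ms between hop 7 (125.17.58.157) and hop 8 (116.119.61.232)
```

For the traces above, the jump appears right after the traffic leaves the ISP's own network, well before it reaches the twelve99 hops.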
@Phineas there is another routing issue I have discovered. I have 2 services hosted in Singapore, both using the Cloudflare proxy.
The services in question are a Next.js app and a Hono API. When Next.js makes a request to the API using the public CF URL, the request gets routed through us-west1.
I assumed it would first hit the CF server in Singapore and then get routed to Railway in Singapore. I know I can just use the internal URL and I am working on it.
I was experiencing unusually high latency, so I checked the HTTP logs from the API, and the requests were being served from the us-east edge region.
Project ID: 92fbff68-d68f-467c-9438-705fc248eb2b
I decided to SSH into my Next.js container and do a quick curl to my API URL to check.
I could not post the full output of curl, so I attached it as a file.
6 months ago
You don't have the metal edge enabled, so this is routed by Google's geodns stuff
6 months ago
@smol
6 months ago
we've made some adjustments recently, so you can try again
6 months ago
if you do have routing issues, you can ping me and I can debug them
6 months ago
but I cannot control the legacy edge network because it's google's network
Hi, it's me again. I have been using the GCP network without routing issues, but just today I switched my applications to Metal edge to test how things are, and nothing seems to have changed. The requests are still getting routed through Europe instead of directly hitting Singapore. Also, the ISPs really don't matter anymore, as all my application requests are proxied through Cloudflare, so that is where the fix needs to be applied: requests originating from the "96c960dee9d1cf60-KTM" region are being routed through "railway/europe-west4-drams3a" when Singapore is an option.
This behaviour is consistent across all applications and requests.
Let me know if you need more information on this. Also, it seems I cannot revert to GCP after opting into Metal edge.
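The two values quoted above can be checked mechanically. This is a minimal sketch, assuming the first value is a Cloudflare Ray ID (whose suffix after the last hyphen is the PoP code) and the second is the "x-railway-edge" value in the `railway/<region>` shape shown in this thread. The helper names are hypothetical, and `EXPECTED_REGION_PREFIX` is an assumption about how the Singapore region is named, so adjust it to whatever the header reports for a correctly routed request.

```python
# Assumption: Singapore edge names start with this prefix; not confirmed in this thread.
EXPECTED_REGION_PREFIX = "asia-southeast1"


def cf_colo(cf_ray: str) -> str:
    """Return the PoP code at the end of a Ray ID such as '96c960dee9d1cf60-KTM'."""
    _, sep, colo = cf_ray.rpartition("-")
    return colo if sep else ""


def railway_edge_region(edge: str) -> str:
    """Strip the leading 'railway/' from a value like 'railway/europe-west4-drams3a'."""
    return edge.split("/", 1)[1] if "/" in edge else edge


def is_detour(edge: str, expected_prefix: str = EXPECTED_REGION_PREFIX) -> bool:
    """True when the request was served by an edge outside the expected region."""
    return not railway_edge_region(edge).startswith(expected_prefix)


print(cf_colo("96c960dee9d1cf60-KTM"))                    # prints: KTM
print(railway_edge_region("railway/europe-west4-drams3a"))  # prints: europe-west4-drams3a
print(is_detour("railway/europe-west4-drams3a"))          # prints: True
```

Running something like this over exported HTTP logs would give a count of detoured requests per Cloudflare PoP, which is easier to share than individual samples.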
4 months ago
Are you able to disable the Cloudflare proxy?
4 months ago
For smaller Cloudflare PoPs (e.g. KTM), it is very hard for us to influence route selection because they do not have many peers and most outbound traffic goes over various transit providers
I need some level of proxy caching and config, which CF is able to provide since Railway doesn't have them.
I could deploy a similar config on Railway itself, but that defeats the purpose of edge caching.
Does Railway have any plans to sunset the GCP network, and does anycast really need to be configured and optimized on a PoP-by-PoP basis?
I mean, I could try to contact CF regarding this issue, but it would be better coming from you guys. Also, the KTM PoP isn't very reliable (personal experience, I live here), so the next closest PoP that requests get routed through is Delhi (DEL).
I have experienced this routing issue only with Railway; I used to be on Fly and they were fine.
4 months ago
I've done some digging with Cloudflare's traceroute/diagnostics API and figured out that Airtel (a KTM regional upstream transit provider) was responsible for a lot of Cloudflare's bad routes to us via Europe. I've managed to send Airtel preferred routes for our anycast network over an IX in Singapore, and now most traces from Cloudflare KTM land correctly at Singapore (there is one that still seems to be suboptimal). Please let me know if this is fixed from your side.
Sorry, I forgot to mention that Airtel is indeed one of the major transit providers for a lot of ISPs in Nepal, not just for Cloudflare.
Thanks, I will check it out.
It seems the routing issues have been solved. I will report back with the railway-request-id if I do find requests getting routed wrongly. Airtel may not be the only transit provider CF is using, but so far so good.
Thanks phineas.
Does Railway have plans to offer dedicated IPs?
I was recommending Railway to one of my colleagues and they said Railway doesn't have dedicated IPs, so they can't lock things down.
When I was on Fly, they had 2 kinds of dedicated IPs, region-specific and anycast-based, and if you had routing issues with anycast you could switch to the regional one.
It would be nice if Railway had that at the project level, so all services in a project share the same IP and we could use an A record instead of a CNAME for DNS.
4 months ago
We would like to in the future, however it is not yet on our roadmap so I can't guarantee when they will be available.