Bad Anycast routing with Metal edge network in sea region
smolpaw
HOBBYOP

6 months ago

I posted about this in <#1067670962276945961>, wondering whether there was an unreported routing incident. After more testing I found the cause: Railway had automatically selected the beta Metal edge network for my newly deployed service, which I hadn't noticed.

I am from Nepal and I am developing a service for users in Nepal, so my closest Railway region is Singapore, which is 70-100 ms away depending on the time of day.

With the new edge network, my requests were consistently routed through Europe, as reported by the "x-railway-edge" response header. I couldn't figure out why it was happening, so I temporarily used a cloudflared tunnel to work around it.
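
A minimal sketch of how the header mentioned above could be checked from a script, assuming Python 3 is available. The function names and the User-Agent string are hypothetical; the only detail taken from this thread is the `x-railway-edge` header name.

```python
from urllib.request import Request, urlopen

EDGE_HEADER = "x-railway-edge"  # header name as quoted in this thread


def edge_from_headers(headers):
    """Return the edge header value from a mapping of response headers, or None."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lower case.
        if name.lower() == EDGE_HEADER:
            return value
    return None


def fetch_edge(url, timeout=10):
    """Issue one GET request and report which edge answered it."""
    req = Request(url, headers={"User-Agent": "edge-check/0.1"})  # hypothetical UA
    with urlopen(req, timeout=timeout) as resp:
        return edge_from_headers(dict(resp.headers.items()))
```

Calling `fetch_edge(...)` with the URL of the deployed service a few times, before and after flushing DNS, should show whether the reported edge changes between runs.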

After finding the root cause, I switched back and forth between the GCP and Metal edge networks, flushing DNS along the way, and was able to reproduce this behavior reliably.

Let me know if you guys need any more information.


smolpaw
HOBBYOP

6 months ago

N/A


brody
EMPLOYEE

6 months ago

can you check again? we updated some routing information


smolpaw
HOBBYOP

6 months ago

it's still the same for me


phin
EMPLOYEE

6 months ago

hey, could you please provide a traceroute or mtr to 66.33.22.1


smolpaw
HOBBYOP

6 months ago

sure


smolpaw
HOBBYOP

6 months ago

tracert 66.33.22.1

Tracing route to 66.33.22.1 [66.33.22.1]
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  dsldevice.lan [192.168.1.254]
  2     3 ms     4 ms     3 ms  27.34.24.1
  3     4 ms     3 ms     3 ms  be-82-8.45.gwc-ndc-core-01.wlink.com.np [202.79.45.8]
  4     2 ms     5 ms     4 ms  ae-20-136.41.gwj-htda-core-01.wlink.com.np [202.79.41.136]
  5     5 ms     4 ms     8 ms  ae-21-139.41.gwj-btwl-core-01.wlink.com.np [202.79.41.139]
  6     4 ms     5 ms     6 ms  ae52-ipt-bhwa-01.wlink.com.np [72.9.128.67]
  7    24 ms    37 ms     6 ms  125.17.58.157
  8   130 ms   133 ms   132 ms  116.119.61.232
  9   154 ms   158 ms   153 ms  mei-b5-link.ip.twelve99.net [62.115.42.118]
 10   158 ms   158 ms   159 ms  prs-bb1-link.ip.twelve99.net [62.115.124.54]
 11   164 ms   160 ms   164 ms  adm-bb1-link.ip.twelve99.net [62.115.134.96]
 12   161 ms   158 ms   158 ms  adm-b12-link.ip.twelve99.net [62.115.137.189]
 13   161 ms   161 ms   161 ms  railwaycorp-ic-390073.ip.twelve99-cust.net [62.115.196.223]
 14   157 ms   156 ms   157 ms  66.33.22.1 [66.33.22.1]
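
One way to pinpoint where the latency is added in a trace like the one above is to compare the best round-trip time of consecutive hops. This is a rough sketch, assuming Windows `tracert` output in the layout pasted here; the function names are hypothetical, and a large jump only suggests where a long-haul link begins, since intermediate routers may answer probes slowly.

```python
import re

# One Windows tracert hop line: hop number, three samples ("<1 ms", "24 ms"
# or "*"), then the host. Also tolerates an HTML-escaped "&lt;" before the value.
HOP_RE = re.compile(r"^\s*(\d+)\s+((?:(?:(?:<|&lt;)?\d+ ms|\*)\s+){3})(.+?)\s*$")


def parse_hops(text):
    """Return (hop, best_rtt_ms, host) for every hop that answered at least once."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if not m:
            continue  # header lines, blank lines, "Trace complete."
        samples = [int(s) for s in re.findall(r"(\d+) ms", m.group(2))]
        if samples:  # hops that timed out on all three probes are skipped
            hops.append((int(m.group(1)), min(samples), m.group(3)))
    return hops


def largest_jump(hops):
    """Return (increase_ms, previous_hop, hop) for the biggest rise in best RTT."""
    best = None
    for prev, cur in zip(hops, hops[1:]):
        delta = cur[1] - prev[1]
        if best is None or delta > best[0]:
            best = (delta, prev, cur)
    return best
```

Fed the trace above, this points at the step from hop 7 to hop 8, where the best RTT rises from 6 ms to 130 ms.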

phin
EMPLOYEE

6 months ago

Is this your isp?


smolpaw
HOBBYOP

6 months ago

yep


phin
EMPLOYEE

6 months ago

sec


smolpaw
HOBBYOP

6 months ago

This is the largest ISP in Nepal, but it's just one of them. There are three more that are about as large.


phin
EMPLOYEE

6 months ago

ok good to know thanks


phin
EMPLOYEE

6 months ago

@smol hey, could you try again please?


smolpaw
HOBBYOP

6 months ago

still the same.
This is the service I am testing against: https://site-production-87d6.up.railway.app

traceroute is virtually the same

tracert 66.33.22.2

Tracing route to 66.33.22.2 over a maximum of 30 hops

  1    <1 ms    <1 ms    <1 ms  dsldevice.lan [192.168.1.254]
  2     6 ms     4 ms     6 ms  27.34.24.1
  3     3 ms     3 ms     3 ms  be-82-8.45.gwc-ndc-core-01.wlink.com.np [202.79.45.8]
  4     3 ms     4 ms     7 ms  ae-20-136.41.gwj-htda-core-01.wlink.com.np [202.79.41.136]
  5     5 ms     5 ms     7 ms  ae-21-139.41.gwj-btwl-core-01.wlink.com.np [202.79.41.139]
  6     6 ms     5 ms     4 ms  ae52-ipt-bhwa-01.wlink.com.np [72.9.128.67]
  7    11 ms     6 ms    13 ms  125.17.58.157
  8   139 ms   140 ms   139 ms  116.119.61.204
  9   158 ms   160 ms   156 ms  mei-b5-link.ip.twelve99.net [62.115.42.118]
 10   167 ms   164 ms   166 ms  prs-bb1-link.ip.twelve99.net [62.115.124.54]
 11   162 ms   162 ms   162 ms  adm-bb1-link.ip.twelve99.net [62.115.134.96]
 12   168 ms   165 ms   165 ms  adm-b12-link.ip.twelve99.net [62.115.137.189]
 13   164 ms   164 ms   163 ms  railwaycorp-ic-390073.ip.twelve99-cust.net [62.115.196.223]
 14   177 ms   176 ms   179 ms  66.33.22.2

phin
EMPLOYEE

6 months ago

Kk


phin
EMPLOYEE

6 months ago

I’ll have to call an ISP to resolve this so I’ll get back to you later


smolpaw
HOBBYOP

6 months ago

i will attach some more info in a minute


smolpaw
HOBBYOP

6 months ago

This is the edge network, going straight to Europe. I have already attached the traceroute for this.

ping site-production-87d6.up.railway.app

Pinging edge.railway.app [66.33.22.2] with 32 bytes of data:
Reply from 66.33.22.2: bytes=32 time=175ms TTL=51
Reply from 66.33.22.2: bytes=32 time=179ms TTL=51
Reply from 66.33.22.2: bytes=32 time=177ms TTL=51
Reply from 66.33.22.2: bytes=32 time=178ms TTL=51

Ping statistics for 66.33.22.2:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 175ms, Maximum = 179ms, Average = 177ms

smolpaw
HOBBYOP

6 months ago

this is the gcp. going to singapore

ping site-production-4270.up.railway.app

Pinging trestle.proxy.rlwy.net [35.213.168.149] with 32 bytes of data:
Reply from 35.213.168.149: bytes=32 time=94ms TTL=104
Reply from 35.213.168.149: bytes=32 time=93ms TTL=104
Reply from 35.213.168.149: bytes=32 time=93ms TTL=104
Reply from 35.213.168.149: bytes=32 time=95ms TTL=104

Ping statistics for 35.213.168.149:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 93ms, Maximum = 95ms, Average = 93ms

Here's the traceroute

tracert 35.213.168.149

Tracing route to 149.168.213.35.bc.googleusercontent.com [35.213.168.149]
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  dsldevice.lan [192.168.1.254]
  2     6 ms     4 ms     6 ms  27.34.24.1
  3     3 ms     2 ms     4 ms  be-82-8.45.gwc-ndc-core-01.wlink.com.np [202.79.45.8]
  4     3 ms     4 ms     2 ms  ae-20-136.41.gwj-htda-core-01.wlink.com.np [202.79.41.136]
  5     8 ms     4 ms     4 ms  ae-21-139.41.gwj-btwl-core-01.wlink.com.np [202.79.41.139]
  6     5 ms     5 ms     8 ms  ae52-ipt-bhwa-01.wlink.com.np [72.9.128.67]
  7     5 ms     5 ms     6 ms  125.17.58.157
  8    81 ms    87 ms    93 ms  116.119.42.23
  9    95 ms    93 ms    94 ms  149.168.213.35.bc.googleusercontent.com [35.213.168.149]

Trace complete.
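
To put a number on the gap between the two ping runs above, the summary lines can be parsed and subtracted. This is a small sketch assuming the Windows `ping` summary format shown in this thread ("Average = 177ms"); the function names are hypothetical.

```python
import re

AVG_RE = re.compile(r"Average = (\d+)ms")


def average_rtt_ms(ping_output):
    """Extract the average round-trip time from Windows ping summary text."""
    m = AVG_RE.search(ping_output)
    if m is None:
        raise ValueError("no 'Average = <n>ms' line found")
    return int(m.group(1))


def compare(edge_output, gcp_output):
    """Return (edge_ms, gcp_ms, extra_ms), where extra_ms is edge minus gcp."""
    edge = average_rtt_ms(edge_output)
    gcp = average_rtt_ms(gcp_output)
    return edge, gcp, edge - gcp
```

With the two summaries pasted above (177 ms for the edge network, 93 ms for GCP), the difference comes to 84 ms per round trip.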

smolpaw
HOBBYOP

6 months ago

Hey @Phineas, I am not an expert at this, but I am guessing this is all a big DNS issue with my ISP or the next provider after my ISP, right?

One interesting observation: even when using a custom domain with Cloudflare (full orange cloud), my requests were still routed through Europe. CF has a PoP in the capital and peers directly with my ISP, so all the traffic should be handed over to CF by my ISP, right?

I also tested this with WARP (their VPN) and from a colocation center in Nepal. The results were the same.

I don't want my users to connect directly to Railway; I want to use CF in between.

This is what my traceroute looks like for an application hosted on railway with edge network using custom domain with cloudflare proxy

traceroute cf.700104.xyz
traceroute to cf.700104.xyz (172.67.206.87), 30 hops max, 60 byte packets
 1  DESKTOP-HK6N184 (172.30.80.1)  0.210 ms  0.201 ms  0.193 ms
 2  dsldevice.lan (192.168.1.254)  0.674 ms  1.480 ms  1.699 ms
 3  27.34.24.1 (27.34.24.1)  5.901 ms  3.977 ms  6.071 ms
 4  be-82-8.45.gwc-ndc-core-01.wlink.com.np (202.79.45.8)  4.134 ms  5.726 ms  5.923 ms
 5  * * *
 6  103.211.151.11 (103.211.151.11)  6.143 ms  3.877 ms  3.637 ms
 7  172.67.206.87 (172.67.206.87)  6.502 ms  3.096 ms  6.186 ms

smolpaw
HOBBYOP

6 months ago

with ipv6

tracert cf.700104.xyz

Tracing route to cf.700104.xyz [2606:4700:3034::ac43:ce57]
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  2400-1a00-4b40.ip6.wlink.com.np [2400:1a00:4b40:ca94::1]
  2    14 ms     6 ms    28 ms  2400-1a00-4b04.ip6.wlink.com.np [2400:1a00:4b04:0:c8bd:e0e0:2931:d308]
  3     *        *        3 ms  2400:1a00:0:45::8
  4     *        *        *     Request timed out.
  5     2 ms     5 ms     3 ms  2400:1a00:4:1151::6
  6     2 ms     8 ms     3 ms  2606:4700:3034::ac43:ce57

Trace complete.

smolpaw
HOBBYOP

6 months ago

I should have mentioned that this was done in a VM, hence the extra hop "DESKTOP-HK6N184".


smolpaw
HOBBYOP

6 months ago

@Phineas I have discovered another routing issue. I have 2 services hosted in Singapore, both using the Cloudflare proxy.
The services in question are a Next.js app and a Hono API. When Next.js makes a request to the API using the public CF URL, the request gets routed through us-west1.
I assumed it would first hit the CF server in Singapore and then get routed to Railway in Singapore. I know I can just use the internal URL, and I am working on it.

I was experiencing unusually high latency, so I checked the HTTP logs from the API, and the requests were being served from the us-east edge region.
Project ID: 92fbff68-d68f-467c-9438-705fc248eb2b


smolpaw
HOBBYOP

6 months ago

I decided to ssh into my nextjs container and do a quick curl to my api url to check.

Could not post the full output of curl so attached it to a file.


smolpaw
HOBBYOP

6 months ago

cf-ray: 94865a9e8f292284-SJC

why is it going to San Jose, CA ?
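
For anyone reading along: the suffix after the last hyphen of a `cf-ray` value is the code of the Cloudflare data center that handled the request, which is how SJC maps to San Jose here. A tiny sketch for pulling it out of a header value (the function name is hypothetical):

```python
def cf_ray_colo(cf_ray):
    """Return the data-center code at the end of a cf-ray value, or None."""
    ray_id, sep, colo = cf_ray.strip().rpartition("-")
    if not sep or not ray_id or not colo:
        return None  # value does not look like "<id>-<code>"
    return colo.upper()
```

Applied to the value above, `cf_ray_colo("94865a9e8f292284-SJC")` gives `"SJC"`.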


phin
EMPLOYEE

6 months ago

You don't have the metal edge enabled, so this is routed by Google's geodns stuff


phin
EMPLOYEE

6 months ago

@smol


smolpaw
HOBBYOP

6 months ago

because the metal edge causes routing issues with end users


phin
EMPLOYEE

6 months ago

we've made some adjustments recently, so you can try again


phin
EMPLOYEE

6 months ago

if you do have routing issues, you can ping me and I can debug them


phin
EMPLOYEE

6 months ago

but I cannot control the legacy edge network because it's google's network


smolpaw
HOBBYOP

4 months ago

Hi, it's me again. I have been using the GCP network without routing issues, but today I switched my applications to the Metal edge to see how things are, and nothing seems to have changed. Requests are still routed through Europe instead of going directly to Singapore. Also, the ISPs no longer matter, since all my application requests are proxied through Cloudflare. That is where the fix needs to be applied: requests originating at "96c960dee9d1cf60-KTM" are being routed through "railway/europe-west4-drams3a" when Singapore is an option.

This behaviour is consistent across all applications and requests.
Let me know if you need more information on this. Also it seems i cannot revert back to GCP after opting into Metal edge.
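
A sketch of how the "consistent across all requests" observation could be backed with a tally, assuming pairs of `cf-ray` and `x-railway-edge` values have been collected from responses. The function name is hypothetical, and the default `expected_prefix` is an assumption about how the Singapore edge is labelled, not something confirmed in this thread; adjust it to whatever your own responses report.

```python
from collections import Counter


def tally_edges(samples, expected_prefix="railway/asia-southeast1"):
    """Count edge header values and list the samples outside the expected region.

    samples: iterable of (cf_ray, edge) string pairs taken from responses.
    expected_prefix: ASSUMED prefix for the Singapore edge value.
    """
    samples = list(samples)  # allow generators; we iterate twice
    counts = Counter(edge for _, edge in samples)
    misrouted = [(ray, edge) for ray, edge in samples
                 if not edge.startswith(expected_prefix)]
    return counts, misrouted
```

The `misrouted` list keeps the `cf-ray` value next to each unexpected edge, so the originating Cloudflare PoP can be reported together with the edge that served the request.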


phin
EMPLOYEE

4 months ago

Are you able to disable the Cloudflare proxy?


phin
EMPLOYEE

4 months ago

For smaller Cloudflare PoPs (e.g. KTM), it is very hard for us to influence route selection because they do not have many peers and most outbound traffic goes over various transit providers


smolpaw
HOBBYOP

4 months ago

I need some level of proxy caching and configuration, which CF provides and Railway doesn't.
I could deploy a similar config on Railway itself, but that defeats the purpose of edge caching.
Does Railway have any plans to sunset the GCP network, and does anycast really need to be configured and optimized on a PoP-by-PoP basis?

I could try to contact CF about this issue, but it would be better coming from you guys. Also, the KTM PoP isn't very reliable (personal experience, I live here), so the next closest PoP that requests get routed through is Delhi (DEL).

I have experienced this routing issue only with Railway; I used to be on Fly and they were fine.


phin
EMPLOYEE

4 months ago

I've done some digging with Cloudflare's traceroute/diagnostics API and figured out that Airtel (a KTM regional upstream transit provider) was responsible for a lot of Cloudflare's bad routes to us via Europe. I've managed to send Airtel preferred routes for our anycast network over an IX in Singapore, and now most traces from Cloudflare KTM land correctly at Singapore (there is one that still seems to be suboptimal). Please let me know if this is fixed from your side.


smolpaw
HOBBYOP

4 months ago

Sorry, I forgot to mention that Airtel is indeed one of the major transit providers for many ISPs in Nepal, not just for Cloudflare.
Thanks, I will check it out.


smolpaw
HOBBYOP

4 months ago

It seems the routing issue has been resolved. I will report back with the railway-request-id if I find any requests being routed incorrectly. Airtel may not be the only transit provider CF is using, but so far so good.
Thanks, Phineas.


smolpaw
HOBBYOP

4 months ago

Does Railway have plans to offer dedicated IPs?
I was recommending Railway to one of my colleagues, and they said Railway doesn't have dedicated IPs, so they can't lock things down.

When I was on Fly, they had 2 kinds of dedicated IPs, region-specific and anycast-based, and anyone who had routing issues with anycast could switch to the regional one.

It would be nice if Railway had that at the project level, so all services in a project share the same IP and we could use an A record instead of a CNAME for DNS.


phin
EMPLOYEE

4 months ago

We would like to in the future, however it is not yet on our roadmap so I can't guarantee when they will be available.

