Multi-region service outage

a year ago

My APIs run on Node.js and everything was working like a charm, but today our Better Stack monitor (shorterloopstatus.com) reports an outage in some regions. The one thing I changed was adding replicas: Amsterdam, NL (Metal) and Virginia (Metal).

New York, USA: Error opening https://api.shorterloop.com: curl: (28) Operation timed out after 30001 milliseconds with 0 bytes received

San Francisco, USA: Error opening https://api.shorterloop.com: curl: (28) Operation timed out after 30001 milliseconds with 0 bytes received

Singapore: Error opening https://api.shorterloop.com: curl: (28) Operation timed out after 30001 milliseconds with 0 bytes received

Sydney, Australia: Error opening https://api.shorterloop.com: curl: (28) Operation timed out after 30001 milliseconds with 0 bytes received

Tokyo, Japan: Error opening https://api.shorterloop.com: curl: (28) Operation timed out after 30001 milliseconds with 0 bytes received

Dallas, USA: Error opening https://api.shorterloop.com: curl: (28) Operation timed out after 30001 milliseconds with 0 bytes received

Works:

Amsterdam, Netherlands | 18 ms | 21 ms | 0 ms | 106 ms | 107 ms | 112 ms | 112 ms | 147 B | 1.3 KB/s | 66.33.22.2

Frankfurt, Germany | 15 ms | 23 ms | 0 ms | 106 ms | 106 ms | 130 ms | 130 ms | 147 B | 1.1 KB/s | 66.33.22.3

London, United Kingdom | 20 ms | 29 ms | 0 ms | 127 ms | 128 ms | 143 ms | 143 ms | 147 B | 1022 B/s | 66.33.22.4

Please help to resolve this issue!!

Solved

10 Replies

a year ago

Removed: US East (Virginia, USA) Metal

That fixed all the issues. I've attached a PDF for your reference.

One thing is for sure: the new Railway (Metal) replicas either aren't handling global traffic properly or aren't registered correctly in the load balancer/edge.

Each replica serves traffic in its own region. Other regions may be routed to a dead or non-listening replica, resulting in curl: (28) Operation timed out.

For now I've kept only Amsterdam and removed Virginia and any other Metal replicas. Everything works again!

Can this be fixed, Railway?

Hey there,

A clarifying question: did the outage start when you deployed the new instances, or are you seeing all global traffic fail to be served?

Is there a staging environment we can test against for a reproducible example?


Status changed to Awaiting User Response Railway 12 months ago


angelo-railway

Hey there, a clarifying question: did the outage start when you deployed the new instances, or are you seeing all global traffic fail to be served? Is there a staging environment we can test against for a reproducible example?

a year ago

It happened after adding the replicas and deploying!


Status changed to Awaiting Railway Response Railway 12 months ago


Railway
BOT

a year ago

Hello!

We've escalated your issue to our engineering team.

We aim to provide an update within 1 business day.

Please reply to this thread if you have any questions!

Status changed to Awaiting User Response Railway 12 months ago


a year ago

Is this consistently reproducible with the US East region for you? And does it happen immediately on deploy?

(I've escalated to our network expert)


20k-ultra
EMPLOYEE

a year ago

Hello, I tried reproducing the error by configuring an application with EU West and US East, but did not see any errors routing to my application.

Can you try again? You can reproduce it in a development environment if you are concerned about impact on your production environment.

In the past 24 hours our monitors have not detected any issues and this is the first report of such an error.

Let us know if the issue persists.


20k-ultra
EMPLOYEE

a year ago

You can also perform your curl request with `-vvv` to include more information about which route is used.
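For readers who would rather script the check from Node.js than run curl by hand, here is a rough TypeScript sketch of a timed probe. It is not a substitute for `curl -vvv`, since fetch does not expose the DNS, TLS, or connection details that curl prints. It only records the status, the response headers, and the elapsed time, and it gives up after a timeout in the same way the monitor's 30-second curl does. The names `probe`, `ProbeResult`, `MinimalResponse`, and `FetchLike` are made up for this example, and the fetch function is passed in so the sketch can be exercised without a network.

```typescript
// Hypothetical result shape for one probe; not part of any library.
interface ProbeResult {
  url: string;
  ok: boolean;
  status?: number;
  elapsedMs: number;
  headers: Record<string, string>;
  error?: string;
}

// The small subset of a fetch response that the probe reads.
type MinimalResponse = {
  status: number;
  headers: { forEach(cb: (value: string, key: string) => void): void };
};

// Anything with this shape works, including the global fetch on Node.js 18+.
type FetchLike = (
  url: string,
  init: { signal: AbortSignal },
) => Promise<MinimalResponse>;

async function probe(
  url: string,
  timeoutMs: number,
  fetchImpl: FetchLike,
): Promise<ProbeResult> {
  const controller = new AbortController();
  // Abort the request once the time budget is used up.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  const started = Date.now();
  const headers: Record<string, string> = {};
  try {
    const res = await fetchImpl(url, { signal: controller.signal });
    // Keep every response header; some may identify the edge that answered.
    res.headers.forEach((value, key) => {
      headers[key] = value;
    });
    return { url, ok: true, status: res.status, elapsedMs: Date.now() - started, headers };
  } catch (err) {
    const error = controller.signal.aborted
      ? `timed out after ${timeoutMs} ms`
      : err instanceof Error
        ? err.message
        : String(err);
    return { url, ok: false, elapsedMs: Date.now() - started, headers, error };
  } finally {
    clearTimeout(timer);
  }
}
```

On a runtime with a global `fetch`, something like `probe("https://api.shorterloop.com", 30000, fetch)` run from each affected location would give comparable results. Whether the headers reveal which replica or edge answered depends on what the platform actually sends.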


20k-ultra
EMPLOYEE

a year ago

I see some HTTP requests on the us-east4 application that has been removed.

https://railway.com/project/2394abc5-3936-4f8d-b3a3-920ee7f91835/service/3bd9b3e0-0ee6-448b-8625-15d421cf2e7a?environmentId=715f26ad-9a53-4177-90b6-808425724c64&id=179348be-9c76-4507-8378-c64b0a342024#http

I am investigating, but it looks like requests were sent to this application and took 5 minutes to complete, while your curl request gives up after 30 seconds.
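A mismatch like this (a request that takes minutes on the server while the client gives up after 30 seconds) is easier to spot when the API bounds its own downstream calls. Below is a minimal TypeScript sketch of that idea. It assumes a hypothetical `withDeadline` helper, a simulated `slowQuery` standing in for a real downstream call, and a toy `handleRequest`; none of these names come from this thread, and this is not the poster's or Railway's actual code.

```typescript
// Hypothetical helper: rejects if `work` has not settled within `ms` milliseconds.
function withDeadline<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} exceeded ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Stand-in for a slow downstream call (for example, a database query).
function slowQuery(delayMs: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve("rows"), delayMs));
}

// Toy handler: answers 200 with the data, or 504 if the budget is exceeded.
async function handleRequest(
  queryDelayMs: number,
  budgetMs: number,
): Promise<{ status: number; body: string }> {
  try {
    const rows = await withDeadline(slowQuery(queryDelayMs), budgetMs, "db query");
    return { status: 200, body: rows };
  } catch (err) {
    return { status: 504, body: err instanceof Error ? err.message : String(err) };
  }
}
```

With a budget shorter than the monitor's 30-second timeout, a stalled dependency would show up as a fast 504 carrying a message instead of a silent client-side timeout. The sketch only stops waiting; it does not cancel the underlying work.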


20k-ultra
EMPLOYEE

a year ago

When someone makes a request to https://api.shorterloop.com/, does that service make another HTTP request to something else? Does it make a request to another service on Railway?


a year ago

Yes, a MySQL database!


Status changed to Awaiting Railway Response Railway 11 months ago


20k-ultra

You can also perform your curl request with `-vvv` to include more information about which route is used.

a year ago

Could you get the data Mig has requested here?


Status changed to Awaiting User Response Railway 11 months ago


Status changed to Solved parmstar 10 months ago

