Outage?
nickm
PRO OP

2 years ago

Are you experiencing any issues? We're in Singapore. Just checking the public channel since private support isn't responding. Builds aren't working and 5 different environments are down.

37 Replies

raleng
PRO

2 years ago

We have a service down as well in Singapore.



nickm
PRO OP

2 years ago

Sad state of affairs on our production infrastructure



devcsrj
PRO

2 years ago

Same here - nothing is getting deployed at the moment


nickm
PRO OP

2 years ago

Ping


2 years ago

please check <#846875565357006878> for updates


nickm
PRO OP

2 years ago

Thanks Brody! I will now that there's one there


nickm
PRO OP

2 years ago

Got "no available stackers found within resource limits" on an attempted redeploy.



2 years ago

Hi Nick, please stand by. We are investigating; an incident has been called.


nickm
PRO OP

2 years ago

Thanks david, adding some context where I have it in case it helps debugging


nickm
PRO OP

2 years ago

We came back online ~30 mins ago. Now we're back offline as of ~4 mins ago


2 years ago

Our apps and services are still down as well. We tried migrating to the US region, no luck.


khoavn02
PRO

2 years ago

Please help, I can't connect to my Postgres DB anymore.


nickm
PRO OP

2 years ago

Still down for us too.


khoavn02
PRO

2 years ago

Do you have a backup? I'm thinking of migrating the database to another provider.


nickm
PRO OP

2 years ago

Don't be too hasty – this should be resolved soon (given how long it took last time) though I'm not aware of your requirements. At a certain point that'd have to be an option but for us we won't as yet.


2 years ago

Starting to see our services up now…


nickm
PRO OP

2 years ago

Thanks partbot, trying to redeploy but no luck as yet. I'll also check in when we're up


2 years ago

Update: Partial recovery, 50% of capacity restored. Actively working on the rest. Thanks for your patience, on-call team working as swiftly as possible to restore service.


2 years ago

thanks david


jtechbit
PRO

2 years ago

ETA on full capacity restoration?


nickm
PRO OP

2 years ago

Thanks David and team


jtechbit
PRO

2 years ago

Time for another update? Just a reminder that people have production infrastructure that is affected.


rendercoder
PRO

2 years ago

I just deployed services in the Singapore region and encountered a similar issue. Unable to deploy service successfully



nickm
PRO OP

2 years ago

Still down. I'm trying regularly to redeploy, to no avail.


jtechbit
PRO

2 years ago

The level of communication from Railway on this incident is totally unacceptable. I hope processes can be improved as a result of the post-mortem. Even just a “we are continuing to work on it” would give some confidence an on-call team is actually working on this…


nickm
PRO OP

2 years ago

My production systems have been down 4 hours in this outage, and 6 hours 15 mins in total today. So far.


2 years ago

Update: The core issue has been identified and a resolution is in progress to restore service. The on-call team is working to roll it out.


nickm
PRO OP

2 years ago

I'm online now. Redeploying worked


devcsrj
PRO

2 years ago

4 out of my 5 services redeployed properly. One still hasn't recovered. It might take a while longer for the fix to be rolled out.


nickm
PRO OP

2 years ago

Almost 11pm here, going to be a nervous night's sleep given the day of issues.

Thanks for getting it resolved team. Echoing jtechbit – not enough comms given the severity


2 years ago

Thanks for the feedback, acknowledged. That's on me personally for not communicating more. We've had the full on-call team on this (with several additional engineers joining) for as many hours as service has been down.


2 years ago

Full service restoration in sight.


2 years ago

Fix implemented. Resolved.


jtechbit
PRO

2 years ago

Thank you for the update David! My services are now responding normally.


2 years ago

We've published a full incident retro here: https://blog.railway.app/p/2024-05-04-incident-report

