Production Redis instance down while re-deployment is stuck
fredgig
PROOP

10 days ago

This started after I renamed a Redis service's private networking address.

project/204594ed-2cbb-4f9e-924a-88cf5d199e27/service/64564b48-5b07-4326-b701-392e6a09ec5b?environmentId=767ba06f-5057-4e54-9c14-7104bd5ae0e2

Solved

31 Replies

fredgig
PROOP

10 days ago

Our main production operations have been down for 30 minutes now because of this.


fredgig
PROOP

10 days ago

The Redis service keeps randomly going down.


fredgig
PROOP

10 days ago

Our production Redis instance is still down after numerous deployment attempts.


10 days ago

Do you have any logs that you can share?


10 days ago

Is the deployment itself crashing?


fredgig
PROOP

10 days ago

No build/deploy logs being shown unfortunately. Sometimes the deployment itself fails, sometimes it passes but randomly goes offline after a couple minutes. At this time there is an "active" deployment (no logs at all) but the instance isn't reachable. Not sure if it's a state or UI bug.


10 days ago

Have you tried renaming back to the original private network address?


fredgig
PROOP

10 days ago

Good idea, I'll give it a try after this next deployment fails.


10 days ago

I'll see if I can repro this issue


fredgig
PROOP

10 days ago

Still no luck after changing back to the original address


10 days ago

I'm seeing a similar issue (related?) - redeployed a service and now it cannot talk to my Redis instances.


10 days ago

PHP Fatal error:  Uncaught RedisException: Connection timed out
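
A "Connection timed out" error by itself doesn't say whether the hostname failed to resolve, the port is closed, or packets are being dropped. Below is a minimal Python sketch (not the PHP client from the error above) that classifies a plain TCP connection attempt. The function name `probe_tcp` and the result labels are made up for illustration.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try one TCP connection and classify the outcome.

    Returns one of: "open", "timeout", "refused", "dns_error", "error".
    The labels are illustrative, not part of any Redis or Railway API.
    """
    try:
        # create_connection resolves the host and connects within `timeout` seconds.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all: often a stopped container or dropped packets.
        return "timeout"
    except ConnectionRefusedError:
        # Host reachable, but nothing is listening on that port.
        return "refused"
    except socket.gaierror:
        # The hostname did not resolve.
        return "dns_error"
    except OSError:
        # Any other network-level failure (e.g. no route to host).
        return "error"
```

For example, `probe_tcp("my-redis.railway.internal", 6379)` run from inside the calling service would check a Redis instance on its default port 6379; the hostname here is only a placeholder for your own service's private address.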

10 days ago

FWIW this Redis instance itself was not redeployed.


10 days ago

Trying to SSH into the Redis instance shows:

📦 Your service's container is not running (status: exited)
🔧 Deploy or restart your service, then try again.

10 days ago

@fred do you get that too?


fredgig
PROOP

10 days ago

I can't do that because the service is pending deletion right now, but I imagine I would get the same result as you. The container was definitely not running


10 days ago

A redeploy fixed it for me. FWIW, from my service, my-redis was not resolving, but my-redis.railway.internal was, which may be a factor.
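
To compare how the short name and the fully qualified name resolve from inside a service, something like the following Python sketch could be used. The helper `resolve_all` and its `resolver` parameter are hypothetical (the parameter exists only so the lookup can be swapped out in tests), and 6379 is simply Redis's default port.

```python
import socket
from typing import Callable, Dict, Iterable, List


def resolve_all(
    names: Iterable[str],
    port: int = 6379,
    resolver: Callable = socket.getaddrinfo,
) -> Dict[str, List[str]]:
    """Resolve each hostname and return its addresses (empty list if it fails)."""
    results: Dict[str, List[str]] = {}
    for name in names:
        try:
            infos = resolver(name, port, type=socket.SOCK_STREAM)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address for both IPv4 and IPv6.
            results[name] = sorted({info[4][0] for info in infos})
        except socket.gaierror:
            # Name did not resolve from this environment.
            results[name] = []
    return results
```

Calling `resolve_all(["my-redis", "my-redis.railway.internal"])` from the affected service would show which of the two names resolves there. If only the fully qualified name returns addresses, pointing the client at that name is a reasonable thing to try.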


fredgig
PROOP

10 days ago

Staff, please just delete the service that was linked in this thread. It's been pending deletion for half an hour now and blocks any other changes from going out. The environment is essentially unusable right now.


fredgig
PROOP

10 days ago

@Brody @Noah sorry for the ping but something must be done here


10 days ago

Try Railway Station; the team is more active there.


10 days ago

Working on a fix for this now


10 days ago

This bug is so very rare I haven't seen it in ~4 months


fredgig
PROOP

10 days ago

god bless you noah


10 days ago

The Redis2 instance you had pending deletion has now completed cc @fred


10 days ago

Sorry you got nailed by that, not ideal at all


10 days ago

The nature of the bug causing that deletion issue is so very rare it's hard to consistently reproduce. However we did allegedly fix it. Have some things coming down the pipe that should nuke it for good


10 days ago

If I'm able to help otherwise please let me know!


fredgig
PROOP

10 days ago

that should do the trick thanks for the post mortem


10 days ago

Absolutely!



Status changed to Solved medim 10 days ago

