Production Redis instance down while re-deployment is stuck
fredgig
PROOP

a month ago

This started after I renamed a Redis service's private networking address.

project/204594ed-2cbb-4f9e-924a-88cf5d199e27/service/64564b48-5b07-4326-b701-392e6a09ec5b?environmentId=767ba06f-5057-4e54-9c14-7104bd5ae0e2

Solved

31 Replies

fredgig
PROOP

a month ago

Our main production operations have been down for 30 minutes now because of this.


fredgig
PROOP

a month ago

The Redis service keeps randomly going down.


fredgig
PROOP

a month ago

Our production Redis instance is still down after numerous deployment attempts.


a month ago

Do you have any logs that you can share?


a month ago

Is the deployment itself crashing?


fredgig
PROOP

a month ago

No build or deploy logs are being shown, unfortunately. Sometimes the deployment itself fails; sometimes it passes but randomly goes offline after a couple of minutes. At this time there is an "active" deployment (no logs at all), but the instance isn't reachable. Not sure if it's a state or UI bug.


a month ago

Have you tried renaming back to the original private network address?


fredgig
PROOP

a month ago

Good idea, I'll give it a try after this next deployment fails.


a month ago

I'll see if I can repro this issue


fredgig
PROOP

a month ago

Still no luck after changing back to the original address


a month ago

I'm seeing a similar issue (related?): I redeployed a service and now it cannot talk to my Redis instances.


a month ago

PHP Fatal error:  Uncaught RedisException: Connection timed out

a month ago

FWIW this Redis instance itself was not redeployed.


a month ago

Trying to SSH into the Redis instance shows:

📦 Your service's container is not running (status: exited)
🔧 Deploy or restart your service, then try again.

a month ago

@fred do you get that too?


fredgig
PROOP

a month ago

I can't do that because the service is pending deletion right now, but I imagine I would get the same result as you. The container was definitely not running


a month ago

Redeploying fixed it for me. FWIW, from my service, my-redis was not resolving, but my-redis.railway.internal was, which may be a factor.
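
For anyone who wants to check the same thing from inside their own service, here is a minimal sketch that tests whether a hostname resolves. The hostnames are the ones mentioned above, and port 6379 is the Redis default; the helper names `resolves` and `check_hosts` are made up for illustration. It only checks DNS resolution, not whether Redis is actually accepting connections.

```python
import socket


def resolves(hostname, port=6379):
    """Return True if the hostname resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, port)) > 0
    except socket.gaierror:
        # Name resolution failed (unknown host, no DNS answer, etc.)
        return False


def check_hosts(hostnames, port=6379):
    """Map each hostname to whether it currently resolves."""
    return {name: resolves(name, port) for name in hostnames}


if __name__ == "__main__":
    # Short name vs. fully qualified private-network name from this thread.
    results = check_hosts(["my-redis", "my-redis.railway.internal"])
    for name, ok in results.items():
        print(f"{name}: {'resolves' if ok else 'does NOT resolve'}")
```

If only the fully qualified name resolves, pointing the client at that name is probably the safer choice.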


fredgig
PROOP

a month ago

Staff, please just delete the service that was linked in this thread. It's been pending deletion for half an hour now and blocks any other changes from going out. The environment is essentially unusable right now.


fredgig
PROOP

a month ago

@Brody @Noah sorry for the ping but something must be done here


a month ago

Try Railway Station; the team is more active there.


a month ago

Working on a fix for this now


a month ago

This bug is very rare; I haven't seen it in ~4 months.


fredgig
PROOP

a month ago

god bless you noah


a month ago

The deletion of the Redis2 instance you had pending has now completed, cc @fred


a month ago

Sorry you got nailed by that, not ideal at all


a month ago

The bug causing that deletion issue is so rare that it's hard to reproduce consistently. However, we did allegedly fix it. We have some things coming down the pipe that should nuke it for good.


a month ago

If I'm able to help otherwise please let me know!


fredgig
PROOP

a month ago

That should do the trick, thanks for the post-mortem.


a month ago

Absolutely!


Status changed to Solved medim 28 days ago

