2 months ago
Hello,
I have a rails app on US West with 2 replicas and a Redis cache store also on US West. The Redis cache is on us-west2. The rails app will sometimes deploy both replicas to us-west2 (which is great), however, sometimes it will deploy one replica to us-west2 and the other to us-west1.
The problem is that reading cache entries from Redis across us-west1 and us-west2 incurs about 20ms of network latency per read, which is extremely slow for an in-memory cache (normally less than 1ms). I have endpoints that read 20+ cache keys, so at 20ms each that adds 400ms or more per request, and my frontend clients wait much longer depending on which backend replica they hit.
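Editorial sketch, not part of the original post: the arithmetic above depends on each key costing its own round trip. The model below counts round trips for 20 sequential reads versus one batched read. `FakeRemoteCache` and `network_wait_ms` are hypothetical names invented for illustration, and the 20ms figure is taken from the post. In a real Rails app the batched call would be `Rails.cache.read_multi`; whether batching fits these endpoints is an assumption.

```ruby
# Hypothetical stand-in for a remote cache: it counts network round trips
# instead of talking to Redis, so the arithmetic can be checked offline.
class FakeRemoteCache
  attr_reader :round_trips

  def initialize(data)
    @data = data
    @round_trips = 0
  end

  # One key per call: one round trip each.
  def read(key)
    @round_trips += 1
    @data[key]
  end

  # Many keys per call: a single round trip, which is what a batched
  # read such as Rails.cache.read_multi is meant to achieve.
  def read_multi(*keys)
    @round_trips += 1
    keys.each_with_object({}) do |key, found|
      found[key] = @data[key] if @data.key?(key)
    end
  end
end

# Estimated time spent waiting on the network, in milliseconds.
def network_wait_ms(round_trips, latency_ms)
  round_trips * latency_ms
end

keys = (1..20).map { |i| "key:#{i}" }
data = keys.to_h { |k| [k, "value for #{k}"] }

sequential = FakeRemoteCache.new(data)
keys.each { |k| sequential.read(k) }

batched = FakeRemoteCache.new(data)
batched.read_multi(*keys)

puts "sequential: #{network_wait_ms(sequential.round_trips, 20)} ms"
puts "batched:    #{network_wait_ms(batched.round_trips, 20)} ms"
```

Batching would only soften the cross-zone penalty (one 20ms round trip instead of twenty); it does not address the zone placement itself.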
I tried to force us-west2 by adding a railway.json file:
{
  "$schema": "https://railway.com/schema.json",
  "deploy": {
    "region": "us-west2"
  }
}
The next deploy was both on us-west2 so I thought it worked, but a new deploy this morning put me back on us-west1 and us-west2. I tried to re-deploy to get both replicas back on us-west2 but it didn't work. I'll keep retrying deploys to see if I can get it to "stick" to us-west2 before our US users wake up, but please advise how we can get both app servers on the same host as our cache. Thank you!
13 Replies
Status changed to Awaiting User Response Railway • 2 months ago
noahd
Howdy! Would love to clarify: are you using internal networking on it?
2 months ago
Yes, we are using internal networking from web server to redis server.
Status changed to Awaiting Railway Response Railway • 2 months ago
2 months ago
Hello,
The current behavior is expected; we distribute workloads to either zone depending on availability at any given time, and we don't currently offer the ability to pin workloads to a specific zone.
Best,
Brody
Status changed to Awaiting User Response Railway • about 2 months ago
2 months ago
It's not acceptable for my cache reads to take 40x longer based on random zone distribution. Is this something that can be addressed soon or do I need to keep redeploying several times until replicas are on the same zone for the foreseeable future? Do all other regions have the same randomness?
Status changed to Awaiting Railway Response Railway • about 2 months ago
2 months ago
All regions have availability zones, but this split is most common in us-west, given the distance between the two zones.
Status changed to Awaiting User Response Railway • about 2 months ago
2 months ago
Can you answer the first question as well?
Status changed to Awaiting Railway Response Railway • about 2 months ago
Status changed to Awaiting User Response Railway • about 2 months ago
sam-a
We have no defined timeline for changing this behavior.
2 months ago
To be explicit -- the best (and only) way to ensure that replicas are on the same zone as the cache server is to manually check post-deployment and redeploy repeatedly until they are on the same zone? This seems extremely inefficient, so I want to triple-check that this is the official recommendation to accomplish this goal and have 40x faster cache reads.
Status changed to Awaiting Railway Response Railway • about 2 months ago
2 months ago
Update: we just lost a whole day of development because apparently this issue also occurs when there is only a single instance and not just with replicas. This issue caused our staging environment to get thousands of timeouts for hours until we realized it's the same issue.
It's hard to debug as the Deployments tab shows the web app and redis server are both on "us-west2" but when you look at the Metrics tab, and then click on Replicas (vs Sum), you can see the real zone where redis is on us-west2 and the web app is on us-west1.
Since staging auto-deploys, we need a way to programmatically fetch the real zone so we can compare and initiate a redeploy. Please advise on how to do this.
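Editorial sketch, not part of the original post: without a way to read the real zone, one possible stand-in is to infer a cross-zone placement from measured cache latency after each deploy. The helper names (`median`, `sample_latency_ms`, `likely_cross_zone?`) and the 5ms threshold are assumptions made for illustration; the threshold is simply a value between the sub-1ms and 20ms figures reported in this thread. How a redeploy would then be triggered is left open.

```ruby
# Returns the median of a non-empty array of numbers.
def median(values)
  sorted = values.sort
  mid = sorted.length / 2
  sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Runs the given block `samples` times and returns each duration in ms.
def sample_latency_ms(samples: 10)
  Array.new(samples) do
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
  end
end

# Heuristic only: a median above the threshold suggests the app and the
# cache are in different zones. The median ignores occasional slow outliers.
def likely_cross_zone?(latencies_ms, threshold_ms: 5.0)
  median(latencies_ms) > threshold_ms
end

# In a Rails app the probe could wrap a real cache read, for example:
#   latencies = sample_latency_ms { Rails.cache.read("zone-probe") }
#   warn "cache looks cross-zone" if likely_cross_zone?(latencies)
```

This measures a symptom rather than the zone itself, so a slow or overloaded Redis could also trip the check.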
2 months ago
Hi,
Sorry for your struggles. At this point we do not expose this information programmatically.
However, if you are willing to run a single-replica service, we could pin it to the Metal us-west2 region.
Let us know if you'd like to do that.
Kind regards,
Sam
Status changed to Awaiting User Response Railway • about 2 months ago
2 months ago
Appreciate the offer -- we would definitely like to do this for our staging environment with auto-deploys to ensure consistent behavior. I think this thread is public so what's the best way to share that information with you? You could probably find it pretty easily too -- Staging Env + Rails app name that ends in "-backend".
For production, we ran into performance issues when only running a single replica so I'd prefer to keep 2 instances there. If you have an internal feature request process, please add one for me to be able to configure this via railway.json.
Status changed to Awaiting Railway Response Railway • about 2 months ago
2 months ago
Thanks for your note. I think I see the environment you are talking about and will follow up.
We are working on resolving the underlying issue as well.
Status changed to Awaiting User Response Railway • about 2 months ago
2 months ago
This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!
Status changed to Solved Railway • about 2 months ago
