6 months ago
Since deploying to Railway metal last week, I've had many of these errors on Redis.
"Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis."
I only noticed just now because the service is really slow. Any ideas?
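One way to tell whether this warning is still firing is to watch the aof_delayed_fsync counter that Redis reports under INFO persistence. The snippet below is only a rough sketch in Python: it parses the text of an INFO reply that you have already fetched (for example with redis-cli INFO persistence). The helper names and the sample text are made up for illustration and are not taken from this service.

```python
def parse_info(text: str) -> dict[str, str]:
    """Parse the key:value lines of a Redis INFO reply into a dict."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Persistence".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def delayed_fsyncs(info_text: str) -> int:
    """Return the aof_delayed_fsync counter, or 0 if the field is absent."""
    return int(parse_info(info_text).get("aof_delayed_fsync", "0"))


# Hypothetical sample reply, for illustration only (not real measurements).
SAMPLE = "# Persistence\r\naof_enabled:1\r\naof_delayed_fsync:37\r\n"

if __name__ == "__main__":
    print(delayed_fsyncs(SAMPLE))
```

If the counter keeps climbing between two readings, the disk is still too slow for the AOF fsync to keep up.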
18 Replies
6 months ago
The slowness resolved on its own, but I still added these variables: REDIS_APPENDONLY=no, REDIS_SAVE=
Still curious if it's a known thing with the metal transition.
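For anyone reading later: in plain Redis terms, the two variables above presumably correspond to the appendonly and save configuration directives. How the Redis image on Railway actually consumes REDIS_APPENDONLY and REDIS_SAVE is not stated in this thread, so the mapping below is an assumption, and the helper name is hypothetical. A minimal Python sketch:

```python
def redis_conf_lines(env: dict[str, str]) -> list[str]:
    """Translate the two variables into redis.conf-style directives.

    Assumption: REDIS_APPENDONLY maps to `appendonly` and REDIS_SAVE maps
    to `save`. The thread does not confirm how the image reads them.
    """
    lines: list[str] = []
    if "REDIS_APPENDONLY" in env:
        lines.append(f"appendonly {env['REDIS_APPENDONLY']}")
    if "REDIS_SAVE" in env:
        value = env["REDIS_SAVE"]
        # An empty save value disables RDB snapshots: save ""
        lines.append(f"save {value}" if value else 'save ""')
    return lines


if __name__ == "__main__":
    for line in redis_conf_lines({"REDIS_APPENDONLY": "no", "REDIS_SAVE": ""}):
        print(line)
```

Note the trade-off: with both AOF and RDB snapshots turned off, Redis does no disk persistence, so the data is lost on restart. That is fine for a pure cache but not for data you need to keep.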
6 months ago
Hey there Cerefe,
We've been playing whack-a-mole, rescheduling workloads to make sure no one is hit by a noisy neighbor, so your slowdown was likely addressed by that action. With that said, those options will help make Redis a bit less disk heavy.
We're tracking this issue and are working with the Infra team to fix it.
Status changed to Awaiting User Response Railway • 6 months ago
6 months ago
Hi - any updates? This is causing an extreme slowdown on my app, and users have been reaching out to ask if the app is down because it is loading so slowly.
Status changed to Awaiting Railway Response Railway • 6 months ago
6 months ago
Noted. We can move your instances back to GCP, if that's okay with you, so that your business is not affected in the short term. Is that doable?
Status changed to Awaiting User Response Railway • 6 months ago
6 months ago
What would be the long term fix there? I thought all regions needed to be moved to metal by today.
Status changed to Awaiting Railway Response Railway • 6 months ago
6 months ago
There is a core fix planned: we plan to ship a fix for the core FSwait across the fleet. The issue is only affecting a few workloads on the machines. The timeline is within a week, and we are looking to delay the final call for certain impacted customers like you.
We would move you back as soon as we have confirmed that a core fix is out for Metal.
Status changed to Awaiting User Response Railway • 6 months ago
6 months ago
I have migrated back to a nonmetal region - it is performing better already. Appreciate your fast responses, and would also appreciate being notified before any automatic migrations happen for this service.
Status changed to Awaiting Railway Response Railway • 6 months ago
Status changed to Awaiting User Response Railway • 6 months ago
6 months ago
We had a slight reprieve from the slowness, but even off metal (for that one Redis service) we're still seeing much slower loading times. Are there other issues with metal too that might improve after the week?
Status changed to Awaiting Railway Response Railway • 6 months ago
6 months ago
As an update: my app is completely unreachable now.
6 months ago
I've seen some success with a config I'm attempting to roll out for metal here. Let me try it on the cloud machine you're on for now.
Status changed to Awaiting User Response Railway • 6 months ago
6 months ago
Hi Jake. Sounds great. Would be interested in any details or updates you can provide! Service is performing better now (still not as fast as before the switch to metal, but loading at all is much better than yesterday.)
Status changed to Awaiting Railway Response Railway • 6 months ago
5 months ago
Awesome to hear. We improved our cluster configuration to help balance the load across the fleet better.
Status changed to Awaiting User Response Railway • 6 months ago
5 months ago
I encountered a similar-looking issue. It seems this Redis IO problem exists on METAL. Please refer to my case and provide a solution. Thank you.
https://station.railway.com/questions/unstable-metal-disk-io-happed-2-times-i-1a7ec197
Status changed to Awaiting Railway Response Railway • 6 months ago
Status changed to Solved itsrems • 6 months ago
5 months ago
Hm, tried to leave a reply earlier but maybe you can't reply after a thread has been marked as solved? I think this was marked as solved too early though! Would still like an update on the automigration back to metal and if all the issues have been resolved.
Status changed to Awaiting Railway Response Railway • 5 months ago
5 months ago
Heya, sorry about that. As Chandrika mentioned, we've rolled these changes out across the fleet. Sounds like you're all set with performance?
As for moving you back to metal, I can do that with your approval in about 12h, so that our platform team is there in case anything goes wrong.
Status changed to Awaiting User Response Railway • 5 months ago
5 months ago
works for me!
Status changed to Awaiting Railway Response Railway • 5 months ago
5 months ago
Migration ran successfully - your Redis now lives in the same region as your other services.
Let me know if you need anything else!
Status changed to Awaiting User Response Railway • 5 months ago
Status changed to Solved cerefre • 5 months ago
