Sudden slow responses in Node + Redis setup

rishimohan
PRO

2 months ago

I have a Node app that uses Redis for caching:

  • Store image URLs in Redis on the first request in the Node app

  • Retrieve the URL from Redis on subsequent requests and convert it to a buffer

This specific part used to take <1.5 seconds until about 4 days ago; recently it has been taking no less than 4 seconds.
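For context, the flow is roughly this (a simplified sketch rather than my exact code; the key name, TTL, and resolveImageUrl() helper are placeholders, and convertImageToBinary is my URL-to-buffer function):

import { createClient } from "redis";

// Simplified sketch of the cache flow described above;
// key name, TTL, and resolveImageUrl() are placeholders
const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();

export async function getImageBuffer(imageId) {
  const cacheKey = `image-url:${imageId}`;

  // First request: look up the image URL and store it in Redis
  let imageUrl = await redis.get(cacheKey);
  if (!imageUrl) {
    imageUrl = await resolveImageUrl(imageId); // app-specific lookup (placeholder)
    await redis.set(cacheKey, imageUrl, { EX: 3600 });
  }

  // Subsequent requests: reuse the cached URL and convert it to a buffer
  return convertImageToBinary({ imageUrl });
}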

I also saw an auto-deployment on my Node app that said "the app is now switched to Metal". Not sure if that's related, or whether anything else changed recently.

I've spent the last 36 hours debugging this in my app and came to the conclusion that it's not the app. It would be nice if someone could help me understand this.

Awaiting User Response

8 Replies

chandrika
EMPLOYEE

2 months ago

Hi there, you've got multiple replicas of the Node service across different regions, so I'm wondering if the tests you're running are going between regions?


Status changed to Awaiting User Response railway[bot] about 2 months ago


chandrika
EMPLOYEE

2 months ago

Could you please share more about how you're measuring the latency? Is it 4s+ when connecting EU West <> EU West?


rishimohan
PRO

2 months ago

It's the Node service's replica region <> Redis in EU West. I had the exact same setup (regions, replicas, etc.) last week too, and the part I mentioned was consistently <1.5 seconds.

Let me know if I can help you debug this


Status changed to Awaiting Railway Response railway[bot] about 2 months ago


echohack
EMPLOYEE

2 months ago

Hey Rishimohan,

Can you share what you're using to measure this latency? What diagnostics are you using? We would expect higher latency from the US-West and Singapore replicas compared to your EU-West replicas, since your Redis instance is located in EU-West.


Status changed to Awaiting User Response railway[bot] about 2 months ago


rishimohan
PRO

2 months ago

I've run multiple tests using performance.now() before and after the function call.

The function below converts the image URL coming from the Redis cache into a buffer. This function alone now takes upwards of 1,500 ms, whereas it used to be about 500-600 ms. Redis latency actually seems better in my recent tests.

export async function convertImageToBinary({ imageUrl, timeout = 10000 }) {
  try {
    // Use AbortController for timeout support
    const controller = new AbortController();
    const timeoutId = setTimeout(() => controller.abort(), timeout);

    // Add cache-control headers to potentially improve performance
    const response = await fetch(imageUrl, {
      signal: controller.signal,
      headers: {
        Accept: "image/*",
        "Cache-Control": "max-age=3600",
      },
      // Add keepalive for better connection management
      keepalive: true,
      // Add compression support
      compress: true,
    });

    clearTimeout(timeoutId);

    if (!response.ok) {
      throw new Error(`HTTP error: ${response.status}`);
    }

    // Check content type to ensure we're handling an image
    const contentType = response.headers.get("content-type");
    if (!contentType?.startsWith("image/")) {
      throw new Error("Not an image response");
    }

    // Use streaming for large files
    if (response.body) {
      const reader = response.body.getReader();
      const chunks = [];
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        chunks.push(value);
      }
      return Buffer.concat(chunks);
    }

    // Fallback to arrayBuffer for smaller files
    const arrayBuffer = await response.arrayBuffer();
    return Buffer.from(arrayBuffer);
  } catch (error) {
    console.error(
      "Error converting image to binary:",
      error.name === "AbortError" ? "Request timed out" : error.message
    );
    return null;
  }
}
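
For reference, the measurement is roughly this kind of harness (a sketch; the key name and redis client here are placeholders), timing the Redis GET and the conversion step separately:

// Sketch of the performance.now() timing; key name and redis client are placeholders
const t0 = performance.now();
const imageUrl = await redis.get("image-url:some-id");
const t1 = performance.now();

const buffer = await convertImageToBinary({ imageUrl });
const t2 = performance.now();

console.log(
  `redis GET: ${(t1 - t0).toFixed(0)} ms, convert to buffer: ${(t2 - t1).toFixed(0)} ms`
);

The second number is the one that has jumped from roughly 500-600 ms to 1,500 ms+.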


Status changed to Awaiting Railway Response railway[bot] about 2 months ago


echohack
EMPLOYEE

2 months ago

Unfortunately, performance.now() isn't really measuring network latency; that figure can also include the image processing time. Have you tried running a traceroute, ping, or iperf from your Node.js service to your Redis service?

You can SSH into your containers using railway ssh, so you can run commands directly from the affected container.

I'd need to see a before and after using one of these tools to verify that network latency is affected.
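
For example, roughly like this (the hostname is a placeholder for your Redis service's private-network hostname, and this assumes ping/traceroute are available in the container image):

railway ssh                          # open a shell in the Node service's container
ping -c 5 redis.railway.internal     # placeholder hostname for your Redis service
traceroute redis.railway.internal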


Status changed to Awaiting User Response railway[bot] about 2 months ago


rishimohan
PRO

2 months ago

Thanks for the response! I just tried again and it's super fast again, just like it was before, and I didn't change anything in the code.

What was taking 4 to 6 seconds yesterday is now taking <1s.

Just curious: did you change anything in the last 24 hours, or did something change in my account? I want to make sure my customers get the quickest response times, so please help me get to the bottom of this. Thanks!


Status changed to Awaiting Railway Response railway[bot] about 2 months ago


echohack
EMPLOYEE

2 months ago

I'm not seeing any changes on your end, but again, it's difficult to say whether performance.now() is a good metric for tracking latency, since it can include processing time rather than just network time. Glad to hear everything seems to be back to normal for you.


Status changed to Awaiting User Response railway[bot] about 2 months ago