HTTP 502 errors during a traffic spike
epether
PRO
OP

a month ago

Hello

We had a traffic spike on January 8th around 20:00 UTC.

In the logs of our Node API service, we saw these errors:

"upstreamErrors": "[{\"deploymentInstanceID\":\"f55af9d8-2152-4036-aac3-bf051f52f287\",\"duration\":13121,\"error\":\"an unknown error occurred\"},{\"deploymentInstanceID\":\"6acb3dab-a3cc-4818-8bde-8fd77bdc9222\",\"duration\":5000,\"error\":\"connection dial timeout\"},{\"deploymentInstanceID\":\"d257e79c-a2c4-485d-b114-1ea1a1d50dbd\",\"duration\":5000,\"error\":\"connection dial timeout\"}]"

Could you help us understand what went wrong? We don't see any spike in CPU or RAM, but the service returned some HTTP 502 responses to clients.

Thanks

$30 Bounty

3 Replies

brody
EMPLOYEE

a month ago

This thread has been marked as public for community involvement, as it does not contain any sensitive or personal information. Any further activity in this thread will be visible to everyone.

Status changed to Open brody 28 days ago


fra
HOBBY
Top 10% Contributor

a month ago

Do you use any third-party services? It looks like your service is making a request that times out.


darseen
HOBBY
Top 1% Contributor

a month ago

If you are using Node.js, a single synchronous operation that blocks the event loop can cause this. However, that usually spikes the CPU. Since your CPU was flat, it's more likely that your application was idle, waiting on a promise that never resolved or on a slow external API call.
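To tell a blocked event loop apart from an application that is simply waiting, you could record event loop delay next to your CPU metrics. Here is a minimal sketch using `monitorEventLoopDelay` from Node's built-in `perf_hooks` module. The `startLoopMonitor` helper and its field names are made up for illustration, and the 20 ms resolution is an arbitrary choice.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');

// Hypothetical helper (not part of Node or Railway): samples event loop
// delay and reports it in milliseconds.
function startLoopMonitor(resolutionMs = 20) {
  const histogram = monitorEventLoopDelay({ resolution: resolutionMs });
  histogram.enable();
  return {
    // The histogram stores nanoseconds, so convert to milliseconds.
    snapshot() {
      return {
        meanMs: histogram.mean / 1e6,
        p99Ms: histogram.percentile(99) / 1e6,
        maxMs: histogram.max / 1e6,
      };
    },
    // Clear recorded samples, e.g. after each log line.
    reset() {
      histogram.reset();
    },
    stop() {
      histogram.disable();
    },
  };
}
```

Calling `snapshot()` on an interval and logging the result gives you a time series to compare with the 502s. Compare it against an idle baseline instead of reading the absolute numbers: a large jump in `maxMs` during the spike points to blocking code, while a flat line points to the app waiting on something external.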

Under normal load, a server accepts a connection in milliseconds. A 5-second dial timeout, as in your log, means the server was running but completely unresponsive to new network traffic.

Check your database metrics to see whether the database is the bottleneck, and whether active connections hit the limit. An external API you call could also be the cause.

I also recommend setting timeouts on database queries and external API calls so that you never await a slow promise indefinitely.
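As a rough sketch of what such timeouts could look like: `withTimeout` and `fetchWithTimeout` below are hypothetical helpers, not from any library, and the timeout values you pass are yours to pick. The second helper assumes Node 18 or later, where `fetch` and `AbortSignal.timeout` are available globally.

```javascript
// Hypothetical helper: rejects if `promise` has not settled within `ms`.
// Note that this only stops waiting; it does not cancel the underlying work.
function withTimeout(promise, ms, label = 'operation') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical helper for external HTTP calls (assumes Node 18+).
// The AbortSignal aborts the request itself once the time is up.
async function fetchWithTimeout(url, ms) {
  const res = await fetch(url, { signal: AbortSignal.timeout(ms) });
  if (!res.ok) {
    throw new Error(`upstream responded with HTTP ${res.status}`);
  }
  return res;
}
```

You could wrap a database call like `withTimeout(runQuery(), 3000, 'db query')`, where `runQuery` stands in for whatever your driver exposes. Many database drivers also have their own query or statement timeout settings, which are preferable when available because they cancel the work on the database side too.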

