Server keeps randomly spiking to 6+ GB memory and crashing
cryptothud
HOBBYOP

5 months ago

I’ve had the server up and running for over a year with no issues. The last deployment was up for over a day with no issues, then memory randomly spiked from a consistent ~500 MB (the normal average; it never goes above ~800 MB) to 6+ GB and crashed the server. It’s been doing this every 10-30 minutes now and I can’t figure it out for the life of me.

I’ve tried rolling back the deployment to code I used weeks ago, when there were absolutely zero issues. It still happens.

I tried updating Node.js from 18 to 21. It still happens.

I tried logging the heap and RSS every 5 seconds, and even every 100 ms to catch spikes, and there’s nothing. The server just stops logging anything for over a minute and then crashes with the error below, with no spike detected.
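For readers who want to reproduce this kind of logging, here is a minimal sketch, not the code actually used on this server. The function names `snapshotMemory` and `startMemoryLogger` are made up for illustration; the only real API calls are `process.memoryUsage()`, `setInterval`, and `fs.writeSync`. It writes synchronously to stderr on the assumption that a line should not sit in a buffer if the process dies right afterwards.

```javascript
const fs = require('node:fs');

const MB = 1024 * 1024;

// Take one snapshot of the process's memory counters, rounded to whole MB.
// (Hypothetical helper; the field names on the result are illustrative.)
function snapshotMemory() {
  const { rss, heapTotal, heapUsed, external, arrayBuffers } = process.memoryUsage();
  return {
    time: new Date().toISOString(),
    rssMb: Math.round(rss / MB),             // total resident memory of the process
    heapTotalMb: Math.round(heapTotal / MB), // V8 heap reserved
    heapUsedMb: Math.round(heapUsed / MB),   // V8 heap in use
    externalMb: Math.round(external / MB),   // C++ objects bound to JS objects
    arrayBuffersMb: Math.round(arrayBuffers / MB), // Buffers and ArrayBuffers
  };
}

// Log a snapshot every `intervalMs` milliseconds as one JSON line on stderr.
function startMemoryLogger(intervalMs) {
  const timer = setInterval(() => {
    try {
      // fd 2 is stderr; a synchronous write avoids losing the last lines.
      fs.writeSync(2, JSON.stringify(snapshotMemory()) + '\n');
    } catch (err) {
      // Ignore write errors so the logger itself can never crash the server.
    }
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for logging
  return timer;
}

startMemoryLogger(5000);
```

One caveat with any timer-based logger: the callback only runs when the event loop is free, so a long synchronous operation or a multi-second garbage collection pause shows up as a gap in the log rather than as a logged spike.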

Please, any help would be great 🙏

<--- Last few GCs --->
[23:0x826fe20] 109236 ms: Mark-Compact 4085.6 (4143.2) -> 4084.5 (4143.5) MB, 2509.32 / 0.00 ms (average mu = 0.164, current mu = 0.000) allocation failure; scavenge might not succeed
[23:0x826fe20] 114858 ms: Mark-Compact 4085.7 (4143.5) -> 4084.6 (4143.5) MB, 5621.82 / 0.00 ms (average mu = 0.060, current mu = 0.000) allocation failure; scavenge might not succeed
<--- JS stacktrace --->
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory

$20 Bounty

2 Replies

cryptothud
HOBBYOP

5 months ago

last 24hrs (abnormal) vs 30 days (normal)

Rate limits are in place (120/min with slowdown), and no more than 3 MB of data is allowed per request.


case
PRO

5 months ago

@cryptothud have you looked into what may have changed between the period of stability and the onset of instability? For example, do you have any dependencies auto-updating, or has the hosting environment changed (container versions, Linux distro versions, etc.)? Things like this usually don't change so drastically on their own without some correlated change in the codebase or in the hosting environment or config.

It might also help to make a timeline (with date and timestamps) showing the last known "good state" and the date-times when things started acting weird.
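As a rough way to check the "dependencies auto-updating" possibility, here is a minimal sketch that lists dependencies whose version spec in `package.json` is a range or tag rather than an exact version. The helper name `findFloatingDependencies` is hypothetical. Whether such ranges actually float between deployments depends on whether a lockfile is committed and honored by the build, which this sketch does not check.

```javascript
// List dependencies that are not pinned to an exact x.y.z version.
// (Hypothetical helper for illustration; `pkg` is a parsed package.json.)
function findFloatingDependencies(pkg) {
  // Exact semver such as 4.17.21, optionally with prerelease/build metadata.
  const exact = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;
  const sections = ['dependencies', 'devDependencies', 'optionalDependencies'];
  const floating = [];
  for (const section of sections) {
    for (const [name, spec] of Object.entries(pkg[section] ?? {})) {
      // Anything else (^, ~, >=, *, "latest", git URLs, ...) may resolve
      // to a different version on a fresh install.
      if (!exact.test(spec)) floating.push({ section, name, spec });
    }
  }
  return floating;
}

// Usage, assuming the script runs from the project root:
//   const pkg = JSON.parse(require('node:fs').readFileSync('package.json', 'utf8'));
//   console.table(findFloatingDependencies(pkg));
```

Comparing the resolved versions in the last known good deployment against the current one would then narrow down which of these, if any, actually changed.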

