VCPU peaks unexplained
jmunozz
PRO OP

4 months ago

I am seeing unexplained peaks of vCPU consumption in my API service.
I use NestJS with soft microservices, so load can escalate quickly, but here the figures seem way too high to be explained by a microservice shitstorm.
Furthermore, when I dig into the logs, there is no evidence of such activity (only a dozen log lines over the period).
How is this number calculated, and do you have any method to dig deeper into the root causes of vCPU consumption?
Thanks

Solved ($20 Bounty)

Pinned Solution

4 months ago

You might be running into an issue where the NestJS app triggers large garbage collections (that is, if your service handles JSON payloads or uses Redis). To debug the high vCPU, you can set:

NODE_OPTIONS="--trace_gc"

or pass the flag directly (node --trace-gc dist/main.js), or profile the service with clinic doctor -- node dist/main.js.
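If changing the start command is awkward, GC pauses can also be watched from inside the process. This is only a minimal sketch, assuming Node's built-in perf_hooks 'gc' entry type; SLOW_GC_MS, isSlowGc and watchGc are hypothetical names, and the 50 ms threshold is an arbitrary starting point rather than a recommended value:

```typescript
import { PerformanceObserver } from 'node:perf_hooks';

// Arbitrary threshold (assumption): pauses at or above this are logged.
const SLOW_GC_MS = 50;

// Minimal shape we rely on; real PerformanceEntry objects satisfy it.
interface GcLikeEntry {
  entryType: string;
  duration: number; // milliseconds
}

// True when the entry is a GC event at or above the threshold.
function isSlowGc(entry: GcLikeEntry, thresholdMs: number = SLOW_GC_MS): boolean {
  return entry.entryType === 'gc' && entry.duration >= thresholdMs;
}

// Starts observing GC events and warns about the slow ones.
function watchGc(thresholdMs: number = SLOW_GC_MS): PerformanceObserver {
  const gcObserver = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      if (isSlowGc(entry, thresholdMs)) {
        console.warn(`[gc] pause of ${entry.duration.toFixed(1)} ms`);
      }
    }
  });
  gcObserver.observe({ entryTypes: ['gc'] });
  return gcObserver;
}
```

Calling watchGc() once at startup (for example at the top of bootstrap) should be enough; if the warnings line up with the vCPU peaks, GC pressure is a likely suspect.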

You can also redeploy with more detailed logs enabled, something like:

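import { Logger } from '@nestjs/common';
import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module'; // path assumed from the default Nest project layout

async function bootstrap() {
  // Enable every log level down to 'debug' for the whole app
  const app = await NestFactory.create(AppModule, { logger: ['error', 'warn', 'log', 'debug'] });
  await app.listen(process.env.PORT || 3000, '0.0.0.0');
  Logger.debug('App started with full logging');
}

bootstrap();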
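Since the logs were nearly empty during the peaks, it may help to have the process log its own CPU usage so the spikes can be lined up against the platform graph. A rough sketch using process.cpuUsage(); cpuCoresUsed and startCpuSampler are hypothetical helpers, the interval and threshold are arbitrary, and this only measures the Node process itself, which is not necessarily how the platform computes its vCPU figure:

```typescript
import { performance } from 'node:perf_hooks';

// Snapshot shape returned by process.cpuUsage(): microseconds of CPU time.
interface CpuSnapshot {
  user: number;
  system: number;
}

// Average number of cores used between two snapshots taken elapsedMs apart.
function cpuCoresUsed(prev: CpuSnapshot, curr: CpuSnapshot, elapsedMs: number): number {
  const cpuMicros = (curr.user - prev.user) + (curr.system - prev.system);
  return cpuMicros / 1000 / elapsedMs; // CPU milliseconds per wall-clock millisecond
}

// Samples CPU usage periodically and warns when it crosses the threshold.
function startCpuSampler(intervalMs: number = 5000, warnAtCores: number = 0.8): ReturnType<typeof setInterval> {
  let prev: CpuSnapshot = process.cpuUsage();
  let prevTime = performance.now();
  const timer = setInterval(() => {
    const curr = process.cpuUsage();
    const now = performance.now();
    const cores = cpuCoresUsed(prev, curr, now - prevTime);
    if (cores >= warnAtCores) {
      console.warn(`[cpu] ${cores.toFixed(2)} cores over the last ${(now - prevTime).toFixed(0)} ms`);
    }
    prev = curr;
    prevTime = now;
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for sampling
  return timer;
}
```

If the platform graph spikes while this sampler stays quiet, the CPU is probably being burned outside the Node process (another process in the container, or a different service).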

Or add CPU-level diagnostics:

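import { performance, PerformanceObserver } from 'node:perf_hooks';

// Log every performance entry the observer receives
const obs = new PerformanceObserver((items) => {
  console.log(items.getEntries());
});

// 'function' entries are only emitted for functions wrapped with performance.timerify()
obs.observe({ entryTypes: ['function'] });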
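One caveat with the snippet above: the 'function' entry type only reports functions that were wrapped with performance.timerify(), so on its own the observer stays silent. A small sketch of the wrapping step; parsePayload is a made-up stand-in for whatever handler you suspect, not something from the original service:

```typescript
import { performance, PerformanceObserver } from 'node:perf_hooks';

// Hypothetical hot function standing in for a real handler (e.g. payload parsing).
function parsePayload(raw: string): unknown {
  return JSON.parse(raw);
}

// timerify returns a wrapper; only calls made through the wrapper are measured.
const timedParsePayload = performance.timerify(parsePayload);

// Print the name and duration of each measured call.
const fnObserver = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log(`[perf] ${entry.name} took ${entry.duration.toFixed(3)} ms`);
  }
});
fnObserver.observe({ entryTypes: ['function'] });

// The wrapper behaves like the original function and returns the same result.
const parsedPayload = timedParsePayload('{"id": 1}');
```

Wrapping only a handful of suspect functions keeps the overhead and the log volume manageable.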

It can also be a thread pool issue when your service performs async work.
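A cheap related signal is event loop delay, available through monitorEventLoopDelay in perf_hooks. Note the limits of this sketch: it shows when the main thread is blocked, not whether the libuv thread pool is saturated, so treat it as a way to narrow things down. startLoopDelayMonitor and nanosToMs are hypothetical names and the thresholds are arbitrary:

```typescript
import { monitorEventLoopDelay } from 'node:perf_hooks';

// The histogram reports nanoseconds; convert for readable logs.
function nanosToMs(nanos: number): number {
  return nanos / 1e6;
}

// Periodically logs the p99 event loop delay when it crosses the threshold.
function startLoopDelayMonitor(intervalMs: number = 10000, warnAtMs: number = 100) {
  const histogram = monitorEventLoopDelay({ resolution: 20 }); // sample every 20 ms
  histogram.enable();
  const timer = setInterval(() => {
    const p99Ms = nanosToMs(histogram.percentile(99));
    if (p99Ms >= warnAtMs) {
      console.warn(`[loop] p99 event loop delay ${p99Ms.toFixed(1)} ms`);
    }
    histogram.reset(); // start a fresh window for the next interval
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for monitoring
  return { histogram, timer };
}
```

High delay together with high vCPU points at synchronous work on the main thread (large JSON parsing, GC); low delay with high vCPU suggests the work is happening elsewhere.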

3 Replies


jmunozz
PRO OP

4 months ago

Thanks, I will run those diagnostic tools.
I use microservices with a separate RabbitMQ instance as the message broker: https://docs.nestjs.com/microservices/rabbitmq.


4 months ago

That does explain it: RabbitMQ activity typically wouldn't show up in your logs but can still drive high vCPU. Look for duplicate consumer tags in RabbitMQ, and compare deploy timestamps against the vCPU spikes. Adding the logging above will help pinpoint the exact issue, though.
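For the duplicate consumer check, here is a rough sketch that works on a consumer listing you have already exported (for instance from the RabbitMQ management API). The ConsumerInfo shape is an assumption about that listing, trimmed to the fields used here, and findDuplicateConsumers is a hypothetical helper:

```typescript
// Assumed minimal shape of one consumer record; adjust to what your broker reports.
interface ConsumerInfo {
  consumer_tag: string;
  queue: { name: string; vhost: string };
}

interface DuplicateConsumer {
  key: string;   // "vhost|queue|consumer_tag"
  count: number; // how many consumers share that key
}

// Groups consumers by vhost, queue and tag, and returns the groups seen more than once.
function findDuplicateConsumers(consumers: ConsumerInfo[]): DuplicateConsumer[] {
  const counts: Record<string, number> = {};
  for (const consumer of consumers) {
    const key = `${consumer.queue.vhost}|${consumer.queue.name}|${consumer.consumer_tag}`;
    counts[key] = (counts[key] || 0) + 1;
  }
  return Object.keys(counts)
    .filter((key) => counts[key] > 1)
    .map((key) => ({ key, count: counts[key] }));
}
```

An empty result does not rule RabbitMQ out; it only means this particular pattern is absent, so comparing deploy timestamps with the spikes is still worth doing.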


Status changed to Solved ray-chen 4 months ago
