I seem to be running out of memory, and it is causing issues (mainly in my API calls)

seanmosborn

PRO OP

4 days ago

I seem to be running out of memory, and it is causing issues (mainly in my API calls). My API health endpoint used to report about 2GB, and now it shows 512MB. In a session it should go above 512MB, and I think this is my issue: everything was running fine when cgroupMemoryLimit showed more than 512MB. I have memory in my settings at the max (I think 24GB). How can I increase this?

{"uptime":1894,"memory":{"rss":"350MB","heapUsed":"190MB","heapTotal":"203MB","external":"5MB","arrayBuffers":"1MB"},"env":{"ANTHROPIC_API_KEY":true,"BUILT_IN_FORGE_API_KEY":true,"BUILT_IN_FORGE_API_URL":true,"DATABASE_URL":true,"NODE_ENV":"production"},"nodeVersion":"v22.22.2","platform":"linux","arch":"x64","outboundHttps":{"status":200,"ok":true},"cgroupMemoryLimit":"512MB"}

$20 Bounty

7 Replies

Railway

BOT

4 days ago

Your Pro plan supports up to 24 GB of memory per service, but your cgroupMemoryLimit showing 512 MB indicates you have a Replica Limit set on this service that caps it. Go to your service's Settings > Deploy > Replica Limits and either increase or remove the memory limit there - that is separate from the plan-level maximum and directly controls the cgroup constraint your app sees.

Status changed to Awaiting User Response Railway • 4 days ago

seanmosborn

PRO OP

4 days ago

I do not see anything under Settings > Deploy that shows Replica Limits.

Status changed to Awaiting Railway Response Railway • 4 days ago

seanmosborn

PRO OP

4 days ago

Replica Limits shows 24GB, and that is what it used to say in my api/health.

seanmosborn

PRO OP

4 days ago

Screenshot 2026-05-27 at 10.36.33 AM.png

Railway

BOT

4 days ago

The setting is in your service's Settings tab, under the "Deploy" section, labeled "Replica Limits" - you can see the [docs reference here](https://docs.railway.com/pricing/cost-control#replica-limits). If you don't see a value set there, it means no manual cap is in place, and the 512 MB cgroupMemoryLimit you're seeing is from Railway's vertical autoscaler, which dynamically adjusts the memory allocation based on your service's current usage and scales up automatically as demand increases.

Status changed to Awaiting User Response Railway • 4 days ago

seanmosborn

PRO OP

4 days ago

Mine is set to the full 24GB.

Status changed to Awaiting Railway Response Railway • 4 days ago

Status changed to Open Railway • 4 days ago

dev-charles254

PRO Top 5% Contributor

4 days ago

A few things stand out here:

If Replica Limits is already set to 24GB, then the important part is that the runtime itself is still reporting:

cgroupMemoryLimit: 512MB

That means the container currently booted with a 512MB cgroup allocation, not just that Node is using 512MB.
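To make that distinction concrete, here is a small sketch that takes the numbers from the health output in the original post and compares the process's resident memory (rss) with the container limit. The helper names `parseMB` and `headroom` are made up for illustration, and rss is only an approximation of what the cgroup actually accounts for.

```javascript
// Values copied from the /api/health output in the original post.
const health = {
  memory: { rss: "350MB", heapUsed: "190MB", heapTotal: "203MB" },
  cgroupMemoryLimit: "512MB",
};

// Hypothetical helper: turn a string like "350MB" into a number of megabytes.
function parseMB(value) {
  const match = /^(\d+(?:\.\d+)?)MB$/.exec(value);
  if (!match) throw new Error(`Unexpected memory value: ${value}`);
  return Number(match[1]);
}

// Hypothetical helper: compare process memory (rss) with the container limit.
function headroom(report) {
  const rssMB = parseMB(report.memory.rss);
  const limitMB = parseMB(report.cgroupMemoryLimit);
  return {
    rssMB,
    limitMB,
    freeMB: limitMB - rssMB,
    usedPercent: Math.round((rssMB / limitMB) * 100),
  };
}

console.log(headroom(health));
// { rssMB: 350, limitMB: 512, freeMB: 162, usedPercent: 68 }
```

So at the moment of that health check the process was already at roughly two thirds of the 512MB container limit, with about 162MB to spare before a burst would hit the ceiling.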

A couple of questions that would help narrow this down:

Did this start after a redeploy/runtime update?
Is this a Web Service, Background Worker, or something else?
Are you seeing container restarts/OOM kills in the deploy logs?
Does memory usage spike during API calls or is it stable?
Are you using Node cluster/workers, streaming responses, or large payload buffering?

Also, Railway support mentioned vertical autoscaling, but in practice autoscaling usually reacts AFTER pressure increases. If the process gets killed or throttled before scale-up happens, API calls can start failing while the container still reports 512MB.
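One way to see whether usage climbs toward the limit before calls start failing is to log memory at a fixed interval and line the timestamps up with the failing requests. This is only a sketch: `sampleMemory` and `startMemorySampler` are hypothetical names, and the 5 second interval is an arbitrary choice.

```javascript
// Hypothetical helper: take one snapshot of process memory, in MB.
function sampleMemory() {
  const { rss, heapUsed, heapTotal } = process.memoryUsage();
  const toMB = (bytes) => Math.round(bytes / 1024 / 1024);
  return {
    time: new Date().toISOString(),
    rssMB: toMB(rss),
    heapUsedMB: toMB(heapUsed),
    heapTotalMB: toMB(heapTotal),
  };
}

// Log a snapshot every intervalMs; returns a function that stops the sampler.
function startMemorySampler(intervalMs = 5000, log = console.log) {
  const timer = setInterval(() => log(JSON.stringify(sampleMemory())), intervalMs);
  timer.unref(); // do not keep the process alive just for sampling
  return () => clearInterval(timer);
}
```

Calling `const stop = startMemorySampler();` once at startup should be enough. If rssMB is close to 512 right before the errors, that points at the container limit rather than at the application code.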

I would test these specifically:

Check the actual cgroup limit inside the container (this path applies to cgroup v2):

cat /sys/fs/cgroup/memory.max

Check whether Node itself is capped:

node -e "console.log(require('v8').getHeapStatistics())"

Temporarily raise the Node heap limit (the value is in MB):

NODE_OPTIONS=--max-old-space-size=4096

Watch deploy/runtime logs for:

OOMKilled
SIGKILL
container restart loops
memory pressure warnings
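If you cannot get a shell in the container, the first two checks can also be run from inside the app. The following is a minimal sketch, assuming a Linux container: it tries the cgroup v2 path mentioned above and falls back to the cgroup v1 path, then prints V8's own heap limit next to it. The helper names are made up for illustration.

```javascript
const fs = require("node:fs");
const v8 = require("node:v8");

// cgroup v2 exposes the limit in memory.max; cgroup v1 uses memory.limit_in_bytes.
const CGROUP_LIMIT_PATHS = [
  "/sys/fs/cgroup/memory.max",
  "/sys/fs/cgroup/memory/memory.limit_in_bytes",
];

// Parse the file contents: cgroup v2 writes the literal "max" when no limit is set.
function parseCgroupLimit(raw) {
  const text = raw.trim();
  if (text === "" || text === "max") return null;
  const bytes = Number(text);
  return Number.isFinite(bytes) ? bytes : null;
}

// Return the container memory limit in bytes, or null if none can be read.
function readCgroupLimit() {
  for (const path of CGROUP_LIMIT_PATHS) {
    try {
      return parseCgroupLimit(fs.readFileSync(path, "utf8"));
    } catch {
      // File missing or unreadable (for example, not on Linux): try the next path.
    }
  }
  return null;
}

const toMB = (bytes) => Math.round(bytes / 1024 / 1024);

const cgroupLimit = readCgroupLimit();
const heapLimit = v8.getHeapStatistics().heap_size_limit;

console.log("cgroup limit:", cgroupLimit === null ? "none found" : `${toMB(cgroupLimit)}MB`);
console.log("V8 heap limit:", `${toMB(heapLimit)}MB`);
```

If the cgroup limit comes back lower than the V8 heap limit, raising `--max-old-space-size` is unlikely to help on its own, because the container limit would be reached first.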

Another possibility is that Railway changed the initial allocation behavior recently and services now start at 512MB baseline unless sustained pressure triggers upscale. If that’s the case, API bursts could fail before autoscaling reacts.

The important clue is that your app was previously reporting ~2GB in health checks and now consistently reports 512MB despite unchanged Replica Limits. That suggests either:

a platform/runtime allocation behavior change,
an autoscaler regression,
or the service is no longer being classified the same way internally.
