Possible memory leak

bluefield-creator

PROOP

2 months ago

We recently deployed a headless 3D rendering service to Railway using Docker. Each deployment builds the latest renderer code and runs the service, which communicates over TCP.

Problem: memory consumption only ever goes up. At first we suspected a memory leak in our own code, so we inspected it thoroughly and ran extensive local tests, but we could not reproduce the behavior.

Because the renderer service kept hitting its memory limit (2 GB), we set up a cron job to restart the service automatically every 15 minutes. In theory, this should release all the memory if the fault were in our code.

However, after this change, memory kept climbing and was never released, as if the service had never been restarted.

We are therefore inclined to believe this may be an issue with either Docker or Railway.

The attached image shows 6 hours of usage. With the cron job restarting the service every 15 minutes, memory usage should have dropped back to 185 MB (our renderer's normal usage on startup) each time, but it does not.

We have been reviewing our code and deployment options and simply cannot find what is wrong. Technical assistance would be much appreciated.

Attachments

image.png

$20 Bounty

Pinned Solution

suryalim11

HOBBY

2 months ago

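To answer your question about Railway-level restarts: Railway does not have a built-in scheduled restart, but the simplest way is to make your container exit cleanly and let Railway restart it. This depends on the service's restart policy: a policy that restarts only on failure may not react to a clean exit.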

The cleanest approach is to wrap your renderer with a timeout in your Dockerfile CMD:

CMD timeout 900 your-renderer-binary; exit 0

This exits the container after 15 minutes (900 seconds), and Railway's restart policy brings it back up fresh. No cron job needed.
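The same idea can be sketched as a small entrypoint helper instead of an inline CMD. This is only an illustration: `run_with_deadline` and the binary name are placeholders, it assumes the `timeout` utility (GNU coreutils or similar) exists in the image, and whether a new container is started afterwards still depends on the service's restart policy.

```shell
#!/bin/sh
# Hypothetical helper: run a command for at most N seconds, then
# report a clean exit so the container shuts down normally.
run_with_deadline() {
    limit="$1"
    shift
    status=0
    # timeout sends SIGTERM to the command once the limit is reached.
    timeout "$limit" "$@" || status=$?
    # Log the real exit status for debugging, but always return 0.
    echo "command ended with status $status (limit ${limit}s)" >&2
    return 0
}

# Example with a placeholder binary name:
# run_with_deadline 900 your-renderer-binary
```

Logging the real status before returning 0 keeps crashes visible in the deploy logs even though the container itself exits cleanly.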

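Alternatively, from inside your container, you can send SIGTERM to PID 1 after a delay. This only works if the process running as PID 1 installs a SIGTERM handler, because PID 1 ignores signals it has no handler for:

sleep 900 && kill -TERM 1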

However, if memory is NOT clearing even after restarts, the issue is likely not your code. Here is what to check:

1. Confirm restarts are actually happening at container level

Check your Railway deployment logs for "Starting Container" entries. If you see them every 15 minutes, restarts are working. If not, your cron job may be killing a child process while the container's main process (PID 1) keeps running.

2. Check for Railway persistent volumes

If your service has a Railway Volume attached, files written there persist across restarts. If your renderer writes memory-mapped files or cache to a mounted volume path, they won't be cleared on restart.

3. Docker layer caching is not the cause

Docker image layers are read-only and don't accumulate memory at runtime, so that is not the issue.

4. Check if your renderer is using GPU/shared memory

Headless 3D rendering often uses OpenGL or Vulkan, which can allocate GPU or shared memory outside the process's normal RSS. This could show up as growing memory in Railway's metrics even though your process itself looks "clean". If that is the case, the metrics may be counting allocations that are not freed between renders.

Can you confirm whether Railway logs show "Starting Container" every 15 minutes, and whether your renderer uses GPU/OpenGL?

6 Replies

Status changed to Open Railway • 2 months ago

bluefield-creator

PROOP

2 months ago

We attached our Dockerfile:

https://raw.githubusercontent.com/visv4/renderer/refs/heads/main/Dockerfile?token=GHSAT0AAAAAAD324GMLB777ZL3XEGZEDRCM2PW7S5A

And our startup script:

https://raw.githubusercontent.com/visv4/renderer/refs/heads/main/railway.json?token=GHSAT0AAAAAAD324GMLKBX25U2OO26FN23C2PW7TFQ

bluefield-creator

PROOP

2 months ago

Thank you very much for the response, we will try this now. Is there any additional information on achieving Railway-level restarts? Is it done through a script inside the container, or through some service?

Cheers.

Status changed to Open brody • 2 months ago

Status changed to Solved mykal • about 2 months ago

mykal

EMPLOYEE

a month ago

You can trigger a Railway-level restart (fresh container) via the Public API using the deploymentRestart mutation, which restarts without rebuilding. You could call this from an external scheduler or a separate lightweight cron service. Alternatively, if your renderer can be structured as a short-lived task that exits after processing, you could use Railway's built-in cron jobs to spin up a fresh container on a schedule.
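A rough sketch of calling that mutation from an external scheduler is below. Treat the details as assumptions to verify against the Public API reference: the endpoint URL, the `deploymentRestart(id: ...)` argument shape, and the `RAILWAY_API_URL` and `RAILWAY_TOKEN` variable names are not confirmed in this thread, and both function names are made up for illustration.

```shell
#!/bin/sh
# Build the GraphQL request body for a deploymentRestart call.
# Assumption: the mutation takes the deployment ID as `id`.
restart_payload() {
    printf '{"query":"mutation { deploymentRestart(id: \\"%s\\") }"}' "$1"
}

# Send the request with curl.
# Assumptions: the endpoint URL and bearer-token authentication.
restart_deployment() {
    curl -sS -X POST "${RAILWAY_API_URL:-https://backboard.railway.com/graphql/v2}" \
        -H "Authorization: Bearer ${RAILWAY_TOKEN}" \
        -H "Content-Type: application/json" \
        -d "$(restart_payload "$1")"
}

# Example (placeholder deployment ID):
# restart_deployment "your-deployment-id"
```

Running this from a separate scheduler keeps the restart logic outside the renderer container, so it does not depend on anything inside the container staying healthy.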

Status changed to Awaiting User Response Railway • about 2 months ago

asepsaputra

HOBBY

a month ago

If the restart is truly happening, memory usage should drop. If it doesn’t drop, it most likely means the container is not actually being restarted, or there are child processes/cache that remain alive.

Status changed to Awaiting Railway Response Railway • about 2 months ago

asepsaputra

HOBBY

a month ago

A few things to check:

Does the deploy/runtime log show a full service restart every 15 minutes?

Does the process PID change after each scheduled restart?

Are there child renderer processes left running after the main process restarts?

Is the cron job running inside the same container? If yes, it may not be restarting the container itself.

Is the memory coming from the main renderer process, child processes, /dev/shm, filesystem cache, or browser/headless renderer cache?

I’d suggest SSHing into the service and checking memory per process:

ps aux --sort=-rss | head -20

free -m

cat /sys/fs/cgroup/memory.current 2>/dev/null || true
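To see which kind of memory is growing, the cgroup's memory.stat file can be read alongside memory.current. The sketch below assumes cgroup v2 mounted at /sys/fs/cgroup; `memory_breakdown` is a made-up helper name. In that file, `anon` is process heap and stack, `file` is page cache (including tmpfs), and `shmem` is the tmpfs and shared-memory part, such as /dev/shm.

```shell
#!/bin/sh
# Print anon, file, and shmem usage in MiB from a cgroup v2 memory.stat file.
# The default path assumes cgroup v2 is mounted at /sys/fs/cgroup.
memory_breakdown() {
    stat_file="${1:-/sys/fs/cgroup/memory.stat}"
    awk '$1 == "anon" || $1 == "file" || $1 == "shmem" {
        printf "%s %d MiB\n", $1, $2 / 1048576
    }' "$stat_file"
}

# Example:
# memory_breakdown
```

If `anon` stays flat while `file` or `shmem` grows, the growth is more likely cache or shared memory than a heap leak in the renderer.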

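Also log the main process PID and a start timestamp on startup. If neither changes every 15 minutes, the service is not actually being restarted. Keep in mind that a container's main process usually runs as PID 1 after every fresh start, so the timestamp is the more reliable signal for container-level restarts.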

If the PID changes but memory still keeps increasing, check for orphaned child processes or renderer/browser processes that survive the restart.
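A minimal way to get that startup line into the logs is sketched below. `log_startup` and the `renderer-start` prefix are made-up names. `$$` is the PID of the shell running the script, and a renderer launched with `exec` from the same script keeps that PID.

```shell
#!/bin/sh
# Emit one startup line with the shell's PID and a UTC timestamp,
# so restarts can be counted by searching the logs for the prefix.
log_startup() {
    echo "renderer-start pid=$$ at=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}

# Example entrypoint usage (placeholder binary name):
# log_startup
# exec your-renderer-binary
```

Seeing one such line roughly every 15 minutes, each with a new timestamp, would indicate that the process really is being started fresh.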

Status changed to Open chandrika • about 1 month ago
