23 days ago
Hi, my service "Ingest" is reporting high memory usage.
The code running in the service was deployed 3 months ago, and it was stable until now. In the last few days the memory consumption of the service has been spiking toward the limit.
I tried restarting the deployment, which reset the memory usage to normal levels, but after a day it suddenly spiked again.
Do you have any suggestion on how we can identify the cause?
4 Replies
Status changed to Awaiting Railway Response Railway • 23 days ago
22 days ago
This thread has been marked as public for community involvement, as it does not contain any sensitive or personal information. Any further activity in this thread will be visible to everyone.
Status changed to Open Railway • 22 days ago
21 days ago
The pattern described strongly suggests either a memory leak or unbounded growth (cache, queue, or retained objects). Is the growth linear, or more like a sudden spike? If this is a spike, can you be sure that this does not organically correlate to traffic?
21 days ago
It was a sudden spike in code that had been running for 3 months without any issue. That's why I think it has nothing to do with traffic patterns.
21 days ago
Without code or metric graphs it’s hard to diagnose precisely. Since the behavior is sudden spikes (not gradual growth), this usually points to workload-related memory pressure rather than a classic leak.
A good first step is to add basic memory logging:
- Track `process.memoryUsage()` (especially `rss` and `heapUsed`) on an interval
- Correlate spikes with request rate, payload size, or batch size
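As a rough illustration of the logging step, here is a minimal sketch that samples `process.memoryUsage()` on a timer and prints one JSON line per sample. The helper names (`toMiB`, `memorySample`, `startMemoryLogging`), the 30-second default interval, and the choice of fields are assumptions made for this example, not something taken from the "Ingest" service.

```javascript
// Convert bytes to MiB with one decimal place for readable logs.
const toMiB = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;

// Take one sample of the process memory counters.
function memorySample() {
  const { rss, heapUsed, heapTotal, external } = process.memoryUsage();
  return {
    ts: new Date().toISOString(),
    rssMiB: toMiB(rss),
    heapUsedMiB: toMiB(heapUsed),
    heapTotalMiB: toMiB(heapTotal),
    externalMiB: toMiB(external),
  };
}

// Log a sample every `intervalMs` milliseconds (30 s is an arbitrary default).
function startMemoryLogging(intervalMs = 30000, log = console.log) {
  const timer = setInterval(() => log(JSON.stringify(memorySample())), intervalMs);
  timer.unref(); // the logger alone should not keep the process alive
  return timer;
}
```

Calling `startMemoryLogging()` once at startup should be enough. If `rss` jumps while `heapUsed` stays flat, the growth is probably outside the JavaScript heap (for example in Buffers), which narrows the search. Adding request counts or payload sizes to the same log line makes the correlation easier to read.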
To pinpoint the cause, use profiling tools:
- `heapdump`: Capture heap snapshots before and during a spike, then compare them to see what objects grow.
- `clinic heapprofiler`: Helps identify where memory is being allocated and retained over time.
- `clinic doctor`: Provides a broader analysis (CPU, event loop, memory) and can highlight bottlenecks causing buildup.
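For the snapshot step, one possible sketch uses Node's built-in `v8.writeHeapSnapshot()` rather than the `heapdump` package, so no extra dependency is needed. The threshold value, the `snapshotIfAbove` helper, and the one-snapshot-per-process rule are assumptions for illustration only.

```javascript
const v8 = require('v8');

// Hypothetical threshold: pick a value somewhat below the service's memory limit.
const RSS_THRESHOLD_BYTES = 1.5 * 1024 * 1024 * 1024;

let snapshotTaken = false;

// Writes at most one heap snapshot per process once rss crosses the threshold.
// `write` is injectable so the logic can be exercised without producing a file.
function snapshotIfAbove(
  rssBytes,
  thresholdBytes = RSS_THRESHOLD_BYTES,
  write = v8.writeHeapSnapshot
) {
  if (snapshotTaken || rssBytes < thresholdBytes) return null;
  snapshotTaken = true;
  return write(); // v8.writeHeapSnapshot returns the generated file name
}
```

This could be called from the same timer that logs memory, for example `snapshotIfAbove(process.memoryUsage().rss)`. Writing a snapshot is synchronous, blocks the event loop, and needs a significant amount of extra memory, so triggering it very close to the limit may itself push the process over. A second snapshot taken manually right after a restart gives the baseline to compare against in Chrome DevTools.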
Also worth checking alongside profiling:
- Whether large payloads or batches are being buffered in memory
- If concurrency or queue size temporarily spikes
- If slow downstream services (DB/API) are causing in-memory buildup
These steps should help narrow it down to either input size, concurrency, or a specific allocation hotspot.
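To check the concurrency and slow-downstream hypotheses, it can help to log how many operations are in flight at the same time as the memory samples. The following is only a sketch: `createInFlightTracker` and its `track`/`read` methods are hypothetical names, and which calls get wrapped depends on the actual service.

```javascript
// Tracks how many async tasks are currently in flight, plus the peak
// observed since the last call to read().
function createInFlightTracker() {
  let current = 0;
  let peak = 0;
  return {
    // Wrap any promise-returning function, e.g. a DB write or an API call.
    async track(task) {
      current += 1;
      if (current > peak) peak = current;
      try {
        return await task();
      } finally {
        current -= 1; // runs on success and on failure
      }
    },
    // Returns the counts, then resets the peak to the current level.
    read() {
      const counts = { current, peak };
      peak = current;
      return counts;
    },
  };
}
```

A call site would look something like `await tracker.track(() => saveBatch(batch))`, where `saveBatch` stands in for whatever downstream call the service makes. If the peak in-flight count climbs just before each memory spike, the buildup is more likely caused by a slow dependency or a burst of work than by a leak.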
21 days ago
That's why I created a private ticket with Railway support, but they decided to open it to the community. I don't understand how someone without access to the system can be helpful here.