2 months ago
Hi everyone,
I'm running a website audit service on Railway that uses Playwright/Chromium to perform automated cookie consent audits on websites.
**My Setup:**
- Service: Node.js 20 backend
- Playwright with Chromium
- Browser Pool: 4 browsers × 10 contexts = 40 parallel browser contexts
- Resources: 32 vCPU, 32GB RAM (plenty available)
- Environment: Railway staging
- Shared memory: `SHM_SIZE_BYTES=1073741824` (1 GB, also tried 4 GB), set via the Railway-specific environment variable
My browser launch arguments are:
```typescript
const args: string[] = [
  '--disable-http2',
  '--disable-gpu',
  '--allow-insecure-localhost',
];
```
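For context, here is a minimal sketch of how a pool like the one described (4 browsers × 10 contexts) might be started all at once. `buildPool`, `BrowserLike`, and `ContextLike` are hypothetical names introduced only for illustration, and the launcher is injected so the sketch does not depend on Playwright itself; the actual pool implementation is not shown in this thread.

```typescript
// Minimal structural types (assumptions for this sketch). Playwright's
// Browser and BrowserContext objects satisfy these shapes.
interface ContextLike {
  close(): Promise<void>;
}
interface BrowserLike {
  newContext(): Promise<ContextLike>;
  close(): Promise<void>;
}

const args: string[] = [
  '--disable-http2',
  '--disable-gpu',
  '--allow-insecure-localhost',
];

// Hypothetical helper: launches `browsers` browser instances and opens
// `contextsPerBrowser` contexts in each of them, all concurrently.
async function buildPool(
  launch: (launchArgs: string[]) => Promise<BrowserLike>,
  browsers: number,
  contextsPerBrowser: number,
): Promise<{ browsers: BrowserLike[]; contexts: ContextLike[] }> {
  const launched = await Promise.all(
    Array.from({ length: browsers }, () => launch(args)),
  );
  const perBrowser = await Promise.all(
    launched.map((browser) =>
      Promise.all(
        Array.from({ length: contextsPerBrowser }, () => browser.newContext()),
      ),
    ),
  );
  return { browsers: launched, contexts: perBrowser.flat() };
}
```

With Playwright installed, the launcher would presumably be something like `(a) => chromium.launch({ args: a })`, called as `buildPool(launcher, 4, 10)`. Every browser and every page brings its own processes and threads, so starting everything simultaneously produces the highest task count the container will see.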
**The Problem:**
When I try to start 40 browser contexts simultaneously, the browser processes crash immediately with errors like:
- `Target crashed`
- `Browser unexpectedly closed`
- `page.evaluate: Target crashed`
Peak usage was 6 GB RAM and 7.9 vCPU.
This happens **even though CPU and RAM are barely utilized**.
**What Works:**
Interestingly, a smaller configuration runs perfectly fine:
- 2 browsers × 12 contexts = 24 parallel browser contexts in total
- No crashes, stable performance
Peak usage: 13 vCPU and 6 GB RAM
**What I've Already Checked:**
- CPU/RAM are not the bottleneck
- Increased file descriptors (ulimit -n 65536)
- Increased process limits (ulimit -u 4096)
- No "EMFILE: too many open files" errors
- No memory allocation failures
- Removing the `--disable-http2` launch argument
- Adding the `--disable-dev-shm-usage` launch argument
**My Question:**
Are there any **non-obvious or undocumented limits on Railway** that could cause this? For example:
- Kernel-level process spawning limits?
- Container isolation limits?
- Network connection limits?
- Other resource constraints not visible via `top` or `free`?
The jump from 24 contexts (stable) to 40 contexts (crashes) suggests there's a hard limit somewhere that I'm not seeing.
Any insights would be greatly appreciated!
Thanks!

3 Replies
2 months ago
We do not impose any documented kernel-level PID, thread, or process-spawning limits beyond the standard CPU and memory constraints you configure via replica limits. The configurable platform-side variable relevant to your use case is `RAILWAY_SHM_SIZE_BYTES` (which you're already using), and there are no hidden connection or file descriptor caps on our end. The "Target crashed" behavior you're seeing at 40 contexts is most likely driven by Chromium's own per-process resource consumption (each browser context spawns multiple subprocesses and threads) hitting a container-level limit that isn't surfaced by `top` or `free`, such as the PID cgroup limit. You can verify this inside your container by checking the value in `/sys/fs/cgroup/pids.max` during runtime to see if the container's PID ceiling is being reached.
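A minimal sketch of that check from inside the Node.js service, assuming a Linux container. It reads `pids.max` together with `pids.current` (the live task count) so the two can be compared. The second candidate directory is the usual cgroup v1 location and is an assumption about the container layout; `parsePidsValue` and `readPidsUsage` are hypothetical helper names.

```typescript
import { readFileSync } from 'node:fs';

// cgroup v2 typically exposes the pids files directly under /sys/fs/cgroup;
// cgroup v1 typically uses the /sys/fs/cgroup/pids subdirectory.
const CANDIDATE_DIRS = ['/sys/fs/cgroup', '/sys/fs/cgroup/pids'];

// pids.max holds either a number or the literal string "max" (no limit).
function parsePidsValue(raw: string): number {
  const value = raw.trim();
  return value === 'max' ? Infinity : Number.parseInt(value, 10);
}

// Returns the current task count and the limit, or null if no pids
// controller files are readable (for example, when not running on Linux).
function readPidsUsage(): { current: number; max: number } | null {
  for (const dir of CANDIDATE_DIRS) {
    try {
      return {
        current: parsePidsValue(readFileSync(`${dir}/pids.current`, 'utf8')),
        max: parsePidsValue(readFileSync(`${dir}/pids.max`, 'utf8')),
      };
    } catch {
      // Files are not present under this layout; try the next candidate.
    }
  }
  return null;
}

const usage = readPidsUsage();
if (usage === null) {
  console.log('No pids controller files found.');
} else {
  console.log(`tasks in cgroup: ${usage.current}, limit: ${usage.max}`);
}
```

Logging this periodically while the contexts start up should show whether `pids.current` climbs toward `pids.max` just before the crashes.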
Status changed to Awaiting User Response Railway • about 2 months ago
2 months ago
After debugging, I found that I am hitting a cgroup limit of 1000 PIDs, which is set at the host level on your side. Even if I raise the limits in my Dockerfile, I can't go beyond your limit:
```dockerfile
# Increase system limits for high-concurrency Playwright browser contexts
# File descriptor limit: 1024 -> 65536 (needed for 40+ browser contexts)
# Process limit: 1024 -> 4096 (needed for multiple browser processes)
USER root
RUN echo "appuser soft nofile 65536" >> /etc/security/limits.conf && \
    echo "appuser hard nofile 65536" >> /etc/security/limits.conf && \
    echo "appuser soft nproc 4096" >> /etc/security/limits.conf && \
    echo "appuser hard nproc 4096" >> /etc/security/limits.conf
USER appuser
```
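The `nofile` and `nproc` entries above are per-user resource limits, which are separate from the cgroup `pids.max` ceiling, so raising them would not be expected to move the 1000 limit. The pids controller also counts threads, not only processes, which may explain why 40 contexts reach the cap without anything close to 1000 processes. As a rough sketch (assuming a Linux `/proc` filesystem; the helper names are hypothetical), the task count can be broken down by process name to see what consumes the budget:

```typescript
import { readdirSync, readFileSync } from 'node:fs';

// Extracts the "Threads:" field from the text of a /proc/<pid>/status file.
function parseThreadCount(statusText: string): number {
  const match = /^Threads:\s+(\d+)$/m.exec(statusText);
  return Number.parseInt(match?.[1] ?? '0', 10);
}

// Extracts the "Name:" field (the process name) from the same file.
function parseProcessName(statusText: string): string {
  const match = /^Name:\s+(.*)$/m.exec(statusText);
  return (match?.[1] ?? 'unknown').trim();
}

// Sums thread counts per process name for every process visible in procDir.
function tasksByProcessName(procDir = '/proc'): Map<string, number> {
  const totals = new Map<string, number>();
  let entries: string[] = [];
  try {
    entries = readdirSync(procDir);
  } catch {
    return totals; // No /proc available (for example, not running on Linux).
  }
  for (const entry of entries) {
    if (!/^\d+$/.test(entry)) continue; // Only numeric entries are PIDs.
    try {
      const status = readFileSync(`${procDir}/${entry}/status`, 'utf8');
      const processName = parseProcessName(status);
      const previous = totals.get(processName) ?? 0;
      totals.set(processName, previous + parseThreadCount(status));
    } catch {
      // The process exited between listing and reading; skip it.
    }
  }
  return totals;
}

let totalTasks = 0;
tasksByProcessName().forEach((count, processName) => {
  totalTasks += count;
  console.log(`${processName}: ${count} tasks`);
});
console.log(`total tasks visible in /proc: ${totalTasks}`);
```

Running this once with the stable 24-context setup gives an approximate tasks-per-context figure, which can then be used to estimate how many contexts fit under the 1000 ceiling.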
Status changed to Awaiting Railway Response Railway • about 2 months ago
Status changed to Awaiting User Response Railway • about 2 months ago
Status changed to Closed brody • about 2 months ago