Process pool is not usable anymore
carrascocesar
HOBBY OP

4 months ago

I have a FastAPI app that runs requests in parallel. The app had been working fine, but then it suddenly started raising this error:

INFO - POST request to /run_app from Address(host='100.64.0.7', port=58342): Successfully received.
ERROR - A child process terminated abruptly, the process pool is not usable anymore
Traceback (most recent call last):
  File "/app/src/api/main.py", line 350, in upload_problem
    future = loop.run_in_executor(executor, process_request, simulation_request)
  File "uvloop/loop.pyx", line 2747, in uvloop.loop.Loop.run_in_executor
  File "/mise/installs/python/3.13.9/lib/python3.13/concurrent/futures/process.py", line 791, in submit
    raise BrokenProcessPool(self._broken)
concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore

I did a rebuild, which of course fixed the problem, and requests started processing again.
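
Note on why only a restart cleared it: once a child process dies abruptly, the executor is marked broken and every later submit raises BrokenProcessPool, as the traceback shows. Below is a minimal sketch of one way to recover in-process by replacing the broken executor. It is not the app's actual code: ResilientPool, crash, square and demo are hypothetical names used for illustration, and the failed task is simply re-raised rather than retried.

```python
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def crash() -> None:
    """Simulate a child dying abruptly (for example, killed for using too much memory)."""
    os._exit(1)


def square(x: int) -> int:
    return x * x


class ResilientPool:
    """Hypothetical wrapper that replaces the executor once it is marked broken."""

    def __init__(self, max_workers: int) -> None:
        self._max_workers = max_workers
        self._executor = ProcessPoolExecutor(max_workers=max_workers)
        self.rebuilds = 0

    def run(self, fn, *args):
        try:
            return self._executor.submit(fn, *args).result()
        except BrokenProcessPool:
            # A broken pool rejects all further work, so swap in a fresh one
            # and let the caller decide whether to retry the failed task.
            self._executor.shutdown(wait=False)
            self._executor = ProcessPoolExecutor(max_workers=self._max_workers)
            self.rebuilds += 1
            raise

    def close(self) -> None:
        self._executor.shutdown()


def demo() -> tuple:
    pool = ResilientPool(max_workers=2)
    try:
        pool.run(crash)  # the child exits without returning a result
    except BrokenProcessPool:
        pass  # this task is lost, but the pool has been replaced
    result = pool.run(square, 7)  # later work succeeds on the fresh pool
    pool.close()
    return pool.rebuilds, result


if __name__ == "__main__":
    print(demo())  # (1, 49)
```

This only hides the symptom. If children keep dying, the cause (for example memory pressure from too many processes) still needs fixing.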

While shutting down the logs showed the following:

/mise/installs/python/3.13.9/lib/python3.13/multiprocessing/resource_tracker.py:324: UserWarning: resource_tracker: There appear to be 10 leaked semaphore objects to clean up at shutdown: {'/loky-130-tbtbnu5p', '/loky-247-0n67tgdf', '/loky-6-asizrpi6', '/loky-120-td_xhej4', '/loky-154-4pn23wdt', '/loky-6-sq6kxylg', '/loky-6-yhbufqtz', '/loky-211-lma7dd7l', '/loky-6-2sq1syj7', '/loky-6-j4h2014r'}

I do not know if this was the source of the problem.

Do you have any suggestions?

Thanks for your help.

Solved $10 Bounty

Pinned Solution

carrascocesar
HOBBY OP

4 months ago

I think I found the issue. I was using a custom start command with --workers 17 when I only have 8 vCPUs. In my FastAPI app I was calling app.state.executor = ProcessPoolExecutor() without specifying max_workers. This, I found out, was spawning a large number of child processes. I am now using the following start command:

gunicorn -k uvicorn.workers.UvicornWorker --workers 2 main:app -b 0.0.0.0:8080 --max-requests 10 --max-requests-jitter 3 --pid /tmp/gunicorn.pid

I am attaching the new deployment logs. Would this explain my original problem?
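
To put rough numbers on this, here is a minimal sketch of the sizing arithmetic. It assumes the default pool size matched the 8 vCPUs: ProcessPoolExecutor() without max_workers sizes itself from the CPU count the interpreter reports, which inside a container may differ from the plan's vCPU count. The helper names worst_case_children and pool_size_per_worker are hypothetical and not part of the app.

```python
import os
from concurrent.futures import ProcessPoolExecutor
from typing import Optional


def worst_case_children(server_workers: int, pool_size: int) -> int:
    """Upper bound on pool child processes across all server workers."""
    return server_workers * pool_size


def pool_size_per_worker(server_workers: int, cpus: Optional[int] = None) -> int:
    """Give each server worker an equal share of the CPUs, at least one."""
    cpus = cpus or os.cpu_count() or 1
    return max(1, cpus // server_workers)


if __name__ == "__main__":
    # Old setup: 17 server workers, each with a default-sized pool (assumed 8).
    print(worst_case_children(17, 8))  # 136

    # New setup: 2 server workers, each with an explicitly bounded pool.
    size = pool_size_per_worker(server_workers=2, cpus=8)
    print(size)  # 4
    executor = ProcessPoolExecutor(max_workers=size)
    executor.shutdown()
```

Passing max_workers explicitly keeps the total process count predictable whatever CPU count the runtime reports.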

4 Replies


4 months ago

Can you share the entire deployment logs?


Status changed to Awaiting User Response Railway 4 months ago



Status changed to Awaiting Conductor Response Railway 4 months ago


4 months ago

That would explain the issue. If it appears again, feel free to create another thread.


Status changed to Awaiting User Response Railway 4 months ago


Status changed to Solved passos 4 months ago


