Process pool is not usable anymore
carrascocesar
HOBBYOP

19 days ago

I have a FastAPI app that runs requests in parallel. The app had been working fine but then suddenly started throwing this error:

INFO - POST request to /run_app from Address(host='100.64.0.7', port=58342): Successfully received.
ERROR - A child process terminated abruptly, the process pool is not usable anymore
Traceback (most recent call last):
  File "/app/src/api/main.py", line 350, in upload_problem
    future = loop.run_in_executor(executor, process_request, simulation_request)
  File "uvloop/loop.pyx", line 2747, in uvloop.loop.Loop.run_in_executor
  File "/mise/installs/python/3.13.9/lib/python3.13/concurrent/futures/process.py", line 791, in submit
    raise BrokenProcessPool(self._broken)
concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore

I did a rebuild, which of course fixed the problem, and requests started processing again.
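For anyone hitting the same error: once a ProcessPoolExecutor is marked broken, every later submit raises BrokenProcessPool until a new executor is created, which is why a rebuild clears it. Below is a minimal sketch of swapping in a fresh pool at runtime instead. ResilientPool and its factory argument are hypothetical names for illustration, not part of FastAPI or the standard library, and this is not necessarily how the app above is structured.

```python
import asyncio
from concurrent.futures import Executor
from concurrent.futures.process import BrokenProcessPool
from typing import Any, Callable


class ResilientPool:
    """Hypothetical wrapper that replaces the executor after it breaks."""

    def __init__(self, factory: Callable[[], Executor]) -> None:
        # factory builds a new executor, e.g.
        # lambda: ProcessPoolExecutor(max_workers=4)
        self._factory = factory
        self._executor = factory()
        self.rebuilds = 0

    async def run(self, fn: Callable[..., Any], *args: Any) -> Any:
        loop = asyncio.get_running_loop()
        try:
            # Covers both a failure at submit time and a failure while waiting.
            return await loop.run_in_executor(self._executor, fn, *args)
        except BrokenProcessPool:
            # The broken pool cannot be reused: discard it and build a new one.
            self._executor.shutdown(wait=False)
            self._executor = self._factory()
            self.rebuilds += 1
            # Re-raise so the caller decides whether to retry this request.
            raise

    def close(self) -> None:
        self._executor.shutdown(wait=True)
```

The failed request still errors out, but the next call to run() goes to the fresh pool. This only hides the symptom: if child processes keep dying (for example from memory pressure), the underlying cause still needs fixing.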

While shutting down the logs showed the following:

/mise/installs/python/3.13.9/lib/python3.13/multiprocessing/resource_tracker.py:324: UserWarning: resource_tracker: There appear to be 10 leaked semaphore objects to clean up at shutdown: {'/loky-130-tbtbnu5p', '/loky-247-0n67tgdf', '/loky-6-asizrpi6', '/loky-120-td_xhej4', '/loky-154-4pn23wdt', '/loky-6-sq6kxylg', '/loky-6-yhbufqtz', '/loky-211-lma7dd7l', '/loky-6-2sq1syj7', '/loky-6-j4h2014r'}

I do not know if this was the source of the problem.

Do you have any suggestions?

Thanks for your help.

Solved · $10 Bounty

4 Replies



Can you share the entire deployment logs?


Status changed to Awaiting User Response Railway 17 days ago


carrascocesar
HOBBYOP

17 days ago

I think I found the issue. I was using a custom start command with --workers 17 when I only have 8 vCPUs. In my FastAPI app I was calling app.state.executor = ProcessPoolExecutor() without specifying max_workers, which, I found out, was spawning a large number of child processes. I am now using the following start command:

gunicorn -k uvicorn.workers.UvicornWorker --workers 2 main:app -b 0.0.0.0:8080 --max-requests 10 --max-requests-jitter 3 --pid /tmp/gunicorn.pid

I am attaching the new deployment logs. Would this explain my original problem?
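To make the arithmetic concrete, here is a rough sketch of how the child-process count multiplies and one way to cap it. Each web worker creates its own pool, so the total is roughly web workers times pool size. The helper names (pool_size, total_children, make_executor) are made up for illustration, and the figure of 8 children per pool assumes the default max_workers resolved to the 8 vCPUs; inside a container the default may differ.

```python
from concurrent.futures import ProcessPoolExecutor


def pool_size(web_workers: int, cpu_budget: int) -> int:
    """Split a CPU budget across web workers, keeping at least one child each."""
    return max(1, cpu_budget // web_workers)


def total_children(web_workers: int, per_worker_pool: int) -> int:
    """Upper bound on pool child processes across all web workers."""
    return web_workers * per_worker_pool


def make_executor(web_workers: int, cpu_budget: int) -> ProcessPoolExecutor:
    """Create a pool with an explicit max_workers instead of the default."""
    return ProcessPoolExecutor(max_workers=pool_size(web_workers, cpu_budget))


if __name__ == "__main__":
    # Old setup: 17 web workers, each with an assumed default pool of 8.
    print(total_children(17, 8))
    # New setup: 2 web workers sharing 8 vCPUs, so 4 children per pool.
    print(total_children(2, pool_size(2, 8)))
```

Under those assumptions the old setup could reach 136 child processes on 8 vCPUs, while the new one stays at 8. In the app this would look like app.state.executor = make_executor(2, 8), with the numbers adjusted to the plan's actual limits.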



Status changed to Awaiting Conductor Response Railway 17 days ago



passos
MODERATOR

16 days ago

That would explain the issue. If it appears again, feel free to create another thread.


Status changed to Awaiting User Response Railway 16 days ago


Status changed to Solved passos 16 days ago



