19 days ago
I have a FastAPI app that runs requests in parallel. The app had been working fine, but then it suddenly started getting this error:
INFO - POST request to /run_app from Address(host='100.64.0.7', port=58342): Successfully received.
ERROR - A child process terminated abruptly, the process pool is not usable anymore
Traceback (most recent call last):
File "/app/src/api/main.py", line 350, in upload_problem
future = loop.run_in_executor(executor, process_request, simulation_request)
File "uvloop/loop.pyx", line 2747, in uvloop.loop.Loop.run_in_executor
File "/mise/installs/python/3.13.9/lib/python3.13/concurrent/futures/process.py", line 791, in submit
raise BrokenProcessPool(self._broken)
concurrent.futures.process.BrokenProcessPool: A child process terminated abruptly, the process pool is not usable anymore
I did a rebuild, and that of course fixed the problem: requests started processing again.
While shutting down the logs showed the following:
/mise/installs/python/3.13.9/lib/python3.13/multiprocessing/resource_tracker.py:324: UserWarning: resource_tracker: There appear to be 10 leaked semaphore objects to clean up at shutdown: {'/loky-130-tbtbnu5p', '/loky-247-0n67tgdf', '/loky-6-asizrpi6', '/loky-120-td_xhej4', '/loky-154-4pn23wdt', '/loky-6-sq6kxylg', '/loky-6-yhbufqtz', '/loky-211-lma7dd7l', '/loky-6-2sq1syj7', '/loky-6-j4h2014r'}
I do not know if this was the source of the problem.
Do you have any suggestions?
Thanks for your help.
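One way to stop a single crashed child from poisoning every later request is to catch `BrokenProcessPool` and rebuild the executor instead of reusing the dead one. A minimal sketch, assuming a top-level `process_request` function like the one in the traceback (the function body and the `holder` dict here are hypothetical stand-ins, not the app's real code):

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool

# Hypothetical stand-in for the app's real process_request in main.py.
def process_request(payload):
    return {"status": "done", "payload": payload}

async def run_with_retry(holder, payload):
    """Submit work to the pool; if the pool has died (BrokenProcessPool),
    replace it once and retry instead of failing every subsequent request."""
    loop = asyncio.get_running_loop()
    try:
        return await loop.run_in_executor(holder["pool"], process_request, payload)
    except BrokenProcessPool:
        # A crashed child marks the whole pool broken permanently; rebuild it.
        holder["pool"].shutdown(wait=False, cancel_futures=True)
        holder["pool"] = ProcessPoolExecutor(max_workers=2)
        return await loop.run_in_executor(holder["pool"], process_request, payload)

if __name__ == "__main__":
    holder = {"pool": ProcessPoolExecutor(max_workers=2)}
    result = asyncio.run(run_with_retry(holder, {"x": 1}))
    print(result)
    holder["pool"].shutdown()
```

This only papers over the symptom, though; if children keep dying (for example from the OOM killer), the underlying cause still needs fixing.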
4 Replies
19 days ago
Hey there! We've found the following might help you get unblocked faster:
If you find the answer from one of these, please let us know by solving the thread!
Status changed to Awaiting User Response Railway • 17 days ago
17 days ago
I think I found the issue. I was using a custom start command with --workers 17 when I only have 8 vCPUs. In my FastAPI app I was calling app.state.executor = ProcessPoolExecutor() without specifying max_workers. This, I found out, was spawning a large number of child processes. I am now using: gunicorn -k uvicorn.workers.UvicornWorker --workers 2 main:app -b 0.0.0.0:8080 --max-requests 10 --max-requests-jitter 3 --pid /tmp/gunicorn.pid as the start command. I am attaching the new deployment logs. Would this explain my original problem?
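The arithmetic behind the over-subscription: each web worker creates its own pool, and `ProcessPoolExecutor()` with no `max_workers` defaults to the CPU count, so 17 workers on an 8-vCPU box could try to run up to 17 × 8 = 136 child processes. A small sketch of sizing the pool explicitly (`pool_size` and the worker counts are illustrative assumptions, not part of the app):

```python
import os
from concurrent.futures import ProcessPoolExecutor

def pool_size(web_workers: int, vcpus: int) -> int:
    """Pick max_workers so that web_workers * max_workers <= vcpus.
    Each web worker owns its own ProcessPoolExecutor, so an unbounded
    default multiplies across workers."""
    return max(1, vcpus // web_workers)

if __name__ == "__main__":
    vcpus = os.cpu_count() or 1
    # Assumed values from this thread: 2 gunicorn workers on 8 vCPUs.
    size = pool_size(web_workers=2, vcpus=8)
    # Bounded pool instead of the unbounded ProcessPoolExecutor().
    executor = ProcessPoolExecutor(max_workers=size)
    print(size)  # 4 -> 2 workers * 4 pool processes = 8 total
    executor.shutdown()
```

Shutting the executor down cleanly at application shutdown (for example in a FastAPI lifespan handler) should also reduce the leaked-semaphore warnings seen in the logs.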
Attachments
Status changed to Awaiting Conductor Response Railway • 17 days ago
16 days ago
That would explain the issue, if the issue appears again feel free to create another thread.
Status changed to Awaiting User Response Railway • 16 days ago
Status changed to Solved passos • 16 days ago