python ProcessPoolExecutor not working
vihardesu
PRO OP

2 years ago

Hi, I'm trying to use all 8 vCPUs on my instance to run OCR on PDFs. My code works locally, but on my FastAPI Python instance on Railway it never goes past 1 vCPU, and it reports CPU usage as 0%. How can I get multiprocessing to work?

  • I'm using Python, FastAPI, pytesseract, psutil, and ProcessPoolExecutor
        # Assumes concurrent.futures and logging are imported, and that
        # result (indexable by page number) and max_workers are defined above.
        with concurrent.futures.ProcessPoolExecutor(max_workers=max_workers) as executor:
            # Map each future to its page index; the loop variable is named
            # args so it no longer shadows the page_args list.
            futures = {executor.submit(process_page_wrapper, args): idx for idx, args in enumerate(page_args)}
            for future in concurrent.futures.as_completed(futures):
                idx = futures[future]
                try:
                    processed_page = future.result()
                    result[idx] = processed_page
                    logging.info(f"Processed page {idx} successfully")
                except Exception as e:
                    logging.error(f"Failed to process page {idx}: {e}")
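One way to check whether the pool is really spreading work across processes is to have each task report the PID that ran it. The following is only a sketch under assumptions: process_page_stub is a hypothetical stand-in for process_page_wrapper (it burns CPU instead of running OCR), and run_pages and its executor_cls parameter are illustrative names that do not come from the original code.

```python
import concurrent.futures
import os
import time


def process_page_stub(page_number):
    """Hypothetical stand-in for process_page_wrapper: CPU-bound placeholder work."""
    start = time.perf_counter()
    sum(i * i for i in range(200_000))  # placeholder for the real OCR call
    elapsed = time.perf_counter() - start
    # Report which process handled the page and how long it took.
    return page_number, os.getpid(), elapsed


def run_pages(pages, max_workers, executor_cls=concurrent.futures.ProcessPoolExecutor):
    """Submit every page and collect results keyed by page index as they complete."""
    results = {}
    with executor_cls(max_workers=max_workers) as executor:
        futures = {
            executor.submit(process_page_stub, page): idx
            for idx, page in enumerate(pages)
        }
        for future in concurrent.futures.as_completed(futures):
            idx = futures[future]
            results[idx] = future.result()
    return results
```

Calling run_pages(list(range(16)), max_workers=8) from under an `if __name__ == "__main__":` guard and counting the distinct PIDs in the results shows how many worker processes actually picked up pages. If only one PID appears, the work is not being distributed.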
Solved

2 Replies

vihardesu
PRO OP

2 years ago

Update: I set max_workers to 8 explicitly, and that seemed to help. However, each page takes significantly longer to process on this instance than on my local machine (3-5x slower per page): a 3-second page takes 15 seconds here. Why would this be happening? I need to be able to process something like a 600-page PDF in under 5 minutes.
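Before hard-coding max_workers, it can help to see how many CPUs the Python process itself believes it has. This is a minimal sketch, not a diagnosis of this thread: usable_cpu_count and pick_max_workers are hypothetical helper names, and whether the detected count matches the instance's 8 vCPUs is an open question here.

```python
import os
from typing import Optional


def usable_cpu_count() -> int:
    """Best-effort count of CPUs this process is allowed to run on."""
    # os.sched_getaffinity is only available on some platforms (e.g. Linux);
    # it reflects CPU pinning, which os.cpu_count() ignores.
    if hasattr(os, "sched_getaffinity"):
        return max(1, len(os.sched_getaffinity(0)))
    # os.cpu_count() may return None if the count cannot be determined.
    return os.cpu_count() or 1


def pick_max_workers(requested: Optional[int] = None) -> int:
    """Use an explicit request if given, otherwise the detected CPU count."""
    if requested is not None:
        return max(1, requested)
    return usable_cpu_count()
```

Logging usable_cpu_count() at startup on both the local machine and the instance gives a quick comparison. Note that neither os.cpu_count() nor the affinity mask reflects a container CPU quota, so the number printed may still differ from the vCPUs the platform actually grants.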


2 years ago

Hello,

Perhaps you could try our metal regions; they have faster CPUs.


Status changed to Awaiting User Response by Railway, over 1 year ago


Railway
BOT

9 months ago

This thread has been marked as solved automatically due to a lack of recent activity. Please re-open this thread or create a new one if you require further assistance. Thank you!

Status changed to Solved by Railway, 10 months ago

