6 months ago
Hi,
I’m deploying an application that integrates three LLM APIs. The smaller models (Mistral, Gemini-1.5-flash) run fine, but when I run all three together the deployment fails at runtime, especially with the larger paid model (Gemini-2.5-pro). It works locally, but on Railway the backend never returns a response to the frontend.
It looks like a request/idle timeout is being hit before the model finishes. Where can I configure longer request timeouts on Railway (ideally 2–3 minutes)? I adjusted the code to shorten the work per request, but that reduces output quality. Guidance on the correct way to extend execution time would be appreciated.
Thank you.
5 Replies
6 months ago
Our request timeout is 15 minutes.
Thanks for clarifying. Since the request window is 15 minutes, I suspect the issue may be in my implementation (possibly related to Flask’s request handling or how I manage background jobs). Could you confirm if there are any common pitfalls when running long-running Flask apps on Railway?
6 months ago
There are none on the platform side of things, given we are not serverless, so there's no max execution time or anything like that.
Unfortunately, I don't know what your issue is. I just wanted to clear up the misconceptions that were brought up, but since it is not a platform issue, I will bounty this thread in hopes that the community can help you debug.
6 months ago
I think there was an issue on Google's end today; API calls were mostly returning 503 errors (in my experience). Maybe that was the problem? Try testing it again now.
6 months ago
Did you consider changing the application architecture so that, instead of the front-end waiting on the backend for a long-running task, you schedule a job onto a queue? Since you mentioned Flask, you may find RQ (https://python-rq.org/) quite simple and easy to use for your use case. A worker then picks the job up from the queue, and the backend returns the job ID to the front-end. The front-end can poll the job's status periodically and fetch the result as soon as the job completes.