8 months ago
Hello 👋
Can I host an LLM like Ollama on Railway in a way that lets it respond to multiple requests at once?
I'm talking about deploying an LLM behind an API that can respond quickly to a moderate number of concurrent requests.
8 months ago
I'm no expert in LLMs, but there are templates for Ollama and others in the marketplace, so it's totally possible. Performance-wise, though, Railway doesn't offer GPUs, so for fast inference you would need a serverless GPU platform.
8 months ago
Also, if it's a CPU/RAM-intensive process, you may want to consider upgrading to the Pro Plan.
Thanks for your answer. I actually found those templates, but I'm asking about handling multiple requests, and Ollama doesn't seem to be able to answer several requests simultaneously.
8 months ago
Yeah, that isn't related to Railway itself.
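For what it's worth, newer Ollama releases can serve requests in parallel: the server reads the `OLLAMA_NUM_PARALLEL` environment variable (and `OLLAMA_MAX_LOADED_MODELS` if you host several models), so it may just be a version or config issue. On Railway you'd set that as a service variable. Here's a rough sketch of a concurrency smoke test you could run against your deployment; the URL and model name are placeholders for whatever you actually deployed:

```python
# Minimal concurrency smoke test against an Ollama server.
# Assumes: the server is reachable at OLLAMA_URL, the model below is pulled,
# and the server was started with parallelism enabled, e.g.
#   OLLAMA_NUM_PARALLEL=4 ollama serve
import time
from concurrent.futures import ThreadPoolExecutor

import requests  # pip install requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama port; swap in your Railway URL
MODEL = "llama3"  # placeholder; use whatever model you have pulled

def ask(prompt: str) -> str:
    """Send one non-streaming generate request and return the response text."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": MODEL, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]

# Fire 4 requests at the same time and measure wall-clock time.
prompts = [f"Reply with one short sentence about topic #{i}" for i in range(4)]
start = time.time()
with ThreadPoolExecutor(max_workers=4) as pool:
    answers = list(pool.map(ask, prompts))
print(f"{len(answers)} responses in {time.time() - start:.1f}s")
# If the total time is roughly 4x a single request, the server is serializing;
# if it's close to 1x, parallel decoding is working.
```

Keep in mind that even with parallelism enabled, CPU-only inference shares the same cores across requests, so concurrent throughput on Railway will still be limited by the plan's CPU/RAM.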