Template Request: Llama
sarahkb125
EMPLOYEE · OP

10 months ago

Llama is a popular open-source model family, as are DeepSeek, Qwen, and GPT OSS. The distilled versions make it feasible to self-host these models on Railway.

Railway is a great place to deploy models at scale and access them via API, but it needs a template to make getting started easier.

This bounty will be paid out when:

  • A high-quality Llama 3.2 model template is presented
  • All potential template feedback has been incorporated
  • All requirements are met and tested
  • The template follows all template best practices where applicable

Template requirements:

  • vLLM inference server with a specific model
  • Request batching / caching through Redis
  • Multiple model sizes are supported
  • API layer - FastAPI with OpenAI compatible endpoints
  • Volume-backed storage for all databases using correct mount paths
  • Service dependencies should be correctly configured using proper startup order and health checks
  • Environment variables correctly configured for Railway domains using private networking where applicable
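
For anyone picking this up, here is a rough sketch of how the caching and private-networking requirements could fit together. It is not the template's actual code. It assumes the vLLM service exposes its OpenAI-compatible `/v1/chat/completions` route, that the API layer reaches it over Railway private networking, and that the cache object behaves like a redis-py client (`get` and `set` with `ex=`). The variable name `VLLM_BASE_URL`, the hostname `vllm.railway.internal`, and the helpers `cache_key`, `call_vllm`, and `cached_completion` are all hypothetical names chosen for illustration.

```python
import hashlib
import json
import os
import urllib.request
from typing import Callable, Optional, Protocol


class Cache(Protocol):
    """The small subset of a redis-py style client that this sketch relies on."""

    def get(self, name: str) -> Optional[bytes]: ...

    def set(self, name: str, value: bytes, ex: Optional[int] = None) -> object: ...


# Hypothetical variable name and default; in a real template this would point at
# the vLLM service's private-network hostname rather than a public domain.
VLLM_BASE_URL = os.environ.get("VLLM_BASE_URL", "http://vllm.railway.internal:8000")


def cache_key(payload: dict) -> str:
    """Build a stable key so that identical requests map to the same cache entry."""
    # sort_keys makes the key independent of the order of the JSON fields.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return "chat:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def call_vllm(payload: dict) -> dict:
    """Forward a chat completion request to the OpenAI-compatible vLLM route."""
    request = urllib.request.Request(
        VLLM_BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())


def cached_completion(
    payload: dict,
    cache: Cache,
    backend: Callable[[dict], dict] = call_vllm,
    ttl_seconds: int = 300,
) -> dict:
    """Return a cached response when one exists, otherwise call the backend."""
    key = cache_key(payload)
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)
    result = backend(payload)
    # Store the serialized response with an expiry so stale entries age out.
    cache.set(key, json.dumps(result).encode("utf-8"), ex=ttl_seconds)
    return result
```

In a deployment, `cache` would likely be a Redis client connected over the private network, and a FastAPI route would call `cached_completion` with the request body. Caching like this only really makes sense for deterministic requests (for example `temperature` set to 0), so a real template would probably want to skip the cache otherwise.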

A few resources to get you started:

Solved · $150 Bounty

3 Replies

Railway
BOT

10 months ago

Hey there! We've found the following might help you get unblocked faster:

If you find the answer from one of these, please let us know by solving the thread!


10 months ago

Doing this!


10 months ago

Here is the completed llama template as requested!

https://railway.com/deploy/llama-32-1b


Status changed to Solved sarahkb125 10 months ago

