10 months ago
GPT OSS is a popular open-source reasoning model, as are DeepSeek, Qwen, and Llama. The distilled versions make it feasible to self-host these models on Railway.
Railway is well suited to deploying high-scale models and accessing them via API, but it needs a template to make getting started easier.
This bounty will be paid out when:
- A high-quality GPT OSS model template is presented
- All template feedback has been incorporated
- All requirements are met and tested
- The template follows all template best practices where applicable
Template requirements:
- vLLM inference server with a specific model
- Request batching / caching through Redis
- Multiple model sizes are supported
- API layer: FastAPI with OpenAI-compatible endpoints
- Volume-backed storage for all databases using correct mount paths
- Service dependencies correctly configured, with proper startup order and health checks
- Environment variables correctly configured for Railway domains using private networking where applicable
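To make the caching and private-networking requirements above more concrete, here is a minimal sketch of two helpers the API layer might use. It is only an illustration under stated assumptions: the variable names `VLLM_HOST` and `VLLM_PORT`, the `completion:` key prefix, and the rule for what counts as cacheable are all hypothetical and not part of the bounty. It uses only the standard library; the actual Redis and FastAPI wiring is left out.

```python
import hashlib
import json
import os


def vllm_base_url(env=os.environ):
    """Build the internal base URL of the vLLM service.

    VLLM_HOST and VLLM_PORT are hypothetical variable names. On Railway they
    would presumably point at the vLLM service's private hostname and port
    rather than its public domain.
    """
    host = env.get("VLLM_HOST", "localhost")
    port = env.get("VLLM_PORT", "8000")
    return f"http://{host}:{port}/v1"


def cache_key(payload, prefix="completion:"):
    """Derive a deterministic cache key from an OpenAI-style request body."""
    # sort_keys and fixed separators make the key independent of dict ordering.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return prefix + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_cacheable(payload):
    """Assumed policy: cache only non-streaming requests with temperature 0."""
    return payload.get("temperature") == 0 and not payload.get("stream", False)
```

With helpers like these, the API layer could look up `cache_key(body)` in Redis before forwarding a request to `vllm_base_url()`, and store the response afterwards when `is_cacheable(body)` is true.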
A few resources to get you started:
- Template Best Practices
- Learn more about Railway Templates
- Railway Documentation
- GPT OSS information
- vLLM documentation
Pinned Solution
10 months ago
Here is the completed GPT-OSS template with authentication as requested.
5 Replies
10 months ago
I've tested this on another Hobby account; it just needs testing on the Pro tier.
https://railway.com/deploy/gpt-vllm
(Tip: try using "gpt2" if you're not sure where to start.)
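For anyone trying the deployed template, a request to an OpenAI-compatible completions endpoint could be built roughly like this. This is a sketch under assumptions rather than the template's documented usage: the base URL and API key are placeholders you would take from your own deployment, and the Bearer-token header is an assumed authentication scheme. It uses `/v1/completions`, the standard OpenAI-style path for plain text completion.

```python
import json
import urllib.request


def build_completion_request(base_url, api_key, prompt, model="gpt2", max_tokens=32):
    """Build (but do not send) a POST request for an OpenAI-style completion."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme: a Bearer token checked by the API layer.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; the response body should be JSON if the service follows the OpenAI completions format.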
Status changed to Solved sarahkb125 • 10 months ago