4 months ago
The Problem
I'm currently spending ~$35/month on Railway with extremely minimal usage - just 2 internal team members periodically testing the system. I'm concerned this won't scale sustainably when we get real users, and I need help diagnosing what's driving these costs.
What We're Running
Vizionari.ai - An AI-powered data analysis platform where users upload files (CSV/Excel) and our AI agents process them asynchronously to provide insights and analysis.
Railway Services:
vizionari-api - FastAPI application handling requests, file uploads, and WebSocket connections
celery-worker - Background task processor for AI workflows (this shows the highest memory usage, and therefore cost, in observability)
PostgreSQL (Railway plugin) - Main database
Redis (Railway plugin) - Used for both task queue and file caching
RabbitMQ (CloudAMQP, external) - WebSocket message broker
How It Works (High-Level)
User uploads file → API caches it → Celery worker processes with AI →
Updates sent via RabbitMQ → Results stored in PostgreSQL
Processing tasks typically take 3-40 seconds
Celery worker runs with 8 concurrent workers by default
AI processing uses OpenAI API (external cost, not Railway)
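The shape of this pipeline can be sketched with the stdlib's `ThreadPoolExecutor` standing in for Celery (an illustration of the flow only, not the actual stack; the dicts stand in for Redis and PostgreSQL, and all names here are invented for the example):

```python
# Sketch of the upload -> cache -> background worker -> results flow.
from concurrent.futures import ThreadPoolExecutor

cache, results = {}, {}  # stand-ins for Redis (file cache) and PostgreSQL

def upload(file_id, content):
    cache[file_id] = content                     # API caches the uploaded file

def analyze(file_id):
    data = cache[file_id]
    results[file_id] = f"insights for {data}"    # AI step (OpenAI call in prod)
    return file_id

with ThreadPoolExecutor(max_workers=2) as pool:  # like Celery concurrency=2
    upload("report.csv", "a,b\n1,2")
    pool.submit(analyze, "report.csv").result()  # worker runs asynchronously
```

In the real system the "worker" is a separate always-on service, which is exactly why it bills for memory even when the pool has nothing to do.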
My Main Questions
Why is celery-worker so expensive?
Only ~10-20 workflow executions per week
Is it consuming full resources 24/7 even when idle?
Should I be using autoscaling or a different strategy?
Resource allocation visibility:
What CPU/RAM are being allocated to each service?
How much of my cost is idle vs. active usage?
Optimization opportunities:
Can celery-worker scale down when there are no tasks?
Should I separate Redis usage (task queue vs. file caching)?
Is my worker concurrency setting (8) too high for actual usage?
Scalability concerns:
If I'm at $35 with 2 users, how do I ensure costs scale with actual usage rather than just reserved resources?
13 Replies
4 months ago
Hey there! We've found the following might help you get unblocked faster:
If you find the answer from one of these, please let us know by solving the thread!
4 months ago
Hello,
The majority of costs are coming from memory, which is used by your Celery worker.
If you would like to reduce costs, you may want to look into a more memory-efficient solution to handle what you are using Celery for.
Best,
Brody
Status changed to Awaiting User Response Railway • 4 months ago
In reply to brody:
4 months ago
Do you have any reference for a more memory-efficient solution?
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
I don't, but I'm sure the community can give you some advice. Would you like me to open this up to the community?
Status changed to Awaiting User Response Railway • 4 months ago
Status changed to Awaiting Railway Response Railway • 4 months ago
4 months ago
This thread has been marked as public for community involvement, as it does not contain any sensitive or personal information. Any further activity in this thread will be visible to everyone.
Status changed to Open brody • 4 months ago
4 months ago
Have you viewed this?
https://docs.celeryq.dev/en/latest/userguide/optimizing.html
They seem to have some tips on optimization. Unfortunately, if that doesn't work, that's likely just the nature of Celery. I wouldn't expect the jump from 10 to 100 users to be as large as the jump from 0 to 10.
4 months ago
I can't help with the memory optimization, but can you try limiting the memory and see if your app still works with less? It's not a real solution, but it might buy you time to find one...
In reply to fra:
4 months ago
Current costs aren't really the problem because we're still in the dev stage. I'm afraid of the cost of scaling once we finally deliver to our client, who has ~400 interested users.
In reply to samgordon:
4 months ago
I've checked this out and came to some conclusions: I need to change the worker prefetch multiplier to 1 to avoid task hoarding. But the most important question, and I don't know if it's even possible on Railway, is whether it supports autoscaling based on queue depth in Redis or on CPU/memory metrics. Is that possible? If it doesn't support scale-to-zero, I could use a cron job to check queue depth and start/stop the worker service, or move to a platform that supports serverless workers (AWS Lambda, Google Cloud Run).
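The prefetch change described here fits in a small Celery config fragment. A hedged sketch (these are real Celery setting names, but the `worker_max_tasks_per_child` value is an assumption added for illustration, not something from this thread):

```python
# celeryconfig.py (sketch, not the actual Vizionari config)

# Stop each worker slot from reserving a batch of tasks up front
# ("task hoarding"): with long 3-40 s tasks, prefetching helps nothing.
worker_prefetch_multiplier = 1

# Acknowledge tasks only after they finish, so a task is requeued if the
# worker is stopped mid-run; relevant if workers are started/stopped on demand.
task_acks_late = True

# Recycle worker processes periodically to cap gradual memory growth
# (assumed value; tune to the workload).
worker_max_tasks_per_child = 50
```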
In reply to rubenszinho:
4 months ago
Railway supports serverless; there is a toggle at the bottom of the service config. Other than that, autoscaling is pretty limited. You can create replicas, but I'm pretty sure they all run.
In reply to samgordon:
4 months ago
I've checked the serverless docs, and since the connection between Celery and the broker/database is always on, it won't work for what I need: the worker will just stay up, idle, the whole time.
4 months ago
How many workers do you have? I'm no expert on Celery. I did some quick research (I'm not going to lie, all with AI), and the suggestion was to run fewer workers and try --pool=threads. If you are using prefork you will use a lot of memory, but it really depends on the kind of job you need...
You can ignore this if it doesn't make sense; I'm just throwing more ideas in the bucket.
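For reference, the suggestion above would look something like this on the worker's start command (a sketch: `app` is a placeholder for the actual Celery application module, and the concurrency value is illustrative):

```shell
# Prefork (the default pool) forks one OS process per concurrency slot, each
# holding its own copy of the app in memory; --pool=threads keeps a single
# process and runs tasks in threads instead.
celery -A app worker --pool=threads --concurrency=2 --prefetch-multiplier=1
```

The thread pool only pays off if the tasks are I/O-bound (here they mostly wait on the OpenAI API, so it likely fits); CPU-bound tasks would contend on the GIL.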
In reply to fra:
4 months ago
Hmm, I'm already running 2 workers (down from 8), and I'm thinking about switching to cron mode (5-minute schedule) with webhook triggers for on-demand execution, so threads would only add very small extra savings. I'll consider it if I need more optimization later, though. The downside of the webhook+cron approach is that it might add at least a ~30 s delay to calls hitting the worker.
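The cron-side check described here can be quite small. A hedged sketch, not the actual Vizionari code: with a Redis broker, Celery keeps pending tasks in a Redis list whose default key is "celery", so queue depth is just `LLEN` on that key. How the worker is then started or stopped (e.g. via Railway's API) is left out as an assumption.

```python
# Cron-run check: is the Celery queue non-empty, i.e. should the worker be up?
QUEUE_KEY = "celery"  # Celery's default queue name when Redis is the broker

def should_run_worker(redis_client, queue_key=QUEUE_KEY):
    """True when tasks are waiting, i.e. the worker should be started/kept up."""
    return redis_client.llen(queue_key) > 0

# Usage with a real connection (requires the third-party `redis` package):
#   import os, redis
#   client = redis.from_url(os.environ["REDIS_URL"])
#   if should_run_worker(client):
#       ...  # start the worker service (hypothetical Railway API call)
#   else:
#       ...  # stop it
```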
