Testing out Ollama and it's really slow

tansandoteth
PRO

a month ago

I was just testing out LLMs on Railway to see how they fare. I just deployed the Ollama template and tried testing on both Hobby and Pro, but it seems pretty slow on a basic response. Testing with llama3:latest, I noticed memory is only hitting 10 GB.

Here's the command I'm running:

curl http://ollama.railway.internal:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3:latest","stream":false,"messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Why is the sky blue?"}]}'

Is this an LLM issue, or something to do with VRAM bandwidth? Any thoughts appreciated.
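One way to narrow this down: Ollama's non-streaming responses include timing fields (eval_count is the number of tokens generated, eval_duration is the nanoseconds spent generating them), so you can compute tokens per second directly. A rough sketch, assuming jq is available wherever the request is made:

# Measure generation speed from the /api/chat response fields.
# eval_count / eval_duration (ns) * 1e9 = tokens per second.
curl -s http://ollama.railway.internal:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3:latest","stream":false,"messages":[{"role":"user","content":"Why is the sky blue?"}]}' \
  | jq '{tokens: .eval_count, seconds: (.eval_duration / 1e9), tokens_per_sec: (.eval_count / .eval_duration * 1e9)}'

Single-digit tokens per second generally points to CPU inference rather than a problem with the model itself.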

Solved


tansandoteth
PRO

a month ago

c6b347a2-4303-459f-8d48-60e445d0b480


tansandoteth
PRO

a month ago

I'm realizing this might be a CPU vs. GPU issue.
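One quick way to confirm, assuming the same internal URL: Ollama's /api/ps endpoint lists the currently loaded models along with a size_vram field, and a size_vram of 0 means the model is loaded entirely in system RAM and inference is running on CPU.

# Check whether the loaded model is in VRAM or system RAM.
# size_vram == 0 -> CPU-only inference.
curl -s http://ollama.railway.internal:11434/api/ps \
  | jq '.models[] | {name, size, size_vram}'

(The ollama ps CLI shows the same thing in its PROCESSOR column, e.g. "100% CPU".)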


a month ago

Yep, that’s very likely it


a month ago

!s


Status changed to Solved by adam, 27 days ago

