tansandoteth
PRO
3 months ago
I was just testing out LLMs on Railway to see how they fare. I've deployed the Ollama template and tried it on both the Hobby and Pro plans, but a basic response seems pretty slow. Testing with llama3:latest, I noticed memory is only hitting 10 GB.
Here's the command I'm running:
curl http://ollama.railway.internal:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3:latest","stream":false,"messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Why is the sky blue?"}]}'
Is this an LLM issue, or something to do with VRAM bandwidth? Any thoughts appreciated.
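For anyone debugging the same thing, a rough sketch against the same internal host: Ollama's /api/ps endpoint lists the models currently loaded in memory, and a size_vram of 0 in its response would suggest the model is running CPU-only, which would explain the slowness; curl's -w flag can time the request itself.

# List loaded models; a size_vram of 0 means the model is on CPU, not GPU
curl http://ollama.railway.internal:11434/api/ps

# Time the same chat request; -w prints total wall-clock seconds
curl -s -o /dev/null -w '%{time_total}s\n' \
  http://ollama.railway.internal:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3:latest","stream":false,"messages":[{"role":"user","content":"Why is the sky blue?"}]}'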
3 months ago
Yep, that’s very likely it
3 months ago
!s
Status changed to Solved by adam • 3 months ago