3 months ago
How do I add models after the Ollama service is built and deployed? Also, the UI asks me to upload a folder; is there an easier way?
I'm trying to add multiple models to my Ollama setup. It keeps logging "invalid model" and crashing whenever I put more than one model in the variable. For example:
This works: OLLAMA_DEFAULT_MODELS="gemma3:4b"
But this fails: OLLAMA_DEFAULT_MODELS="llama3.2:3b,gemma3:4b"
So please assist; I'm building different services temporarily, and it's obviously annoying.
3 Replies
3 months ago
Heyy man, thank you for the reply, and for the amazing energy lol.
I tried using a space instead of a comma and still got this error:
time=2025-12-17T17:13:26.398Z level=INFO source=server.go:429 msg="starting runner" cmd="/usr/bin/ollama runner --ollama-engine --port 44797"
time=2025-12-17T17:13:26.388Z level=INFO source=routes.go:1554 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://[::]:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://llama.railway.internal:*http://localhosthttps://localhosthttp://localhost:*https://localhost:*http://127.0.0.1https://127.0.0.1http://127.0.0.1:*https://127.0.0.1:*http://0.0.0.0https://0.0.0.0http://0.0.0.0:*https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2025-12-17T17:13:26.394Z level=INFO source=images.go:522 msg="total blobs: 5"
time=2025-12-17T17:13:26.394Z level=INFO source=images.go:529 msg="total unused blobs removed: 0"
time=2025-12-17T17:13:26.395Z level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2025-12-17T17:13:26.395Z level=INFO source=routes.go:1607 msg="Listening on [::]:11434 (version 0.13.4)"
time=2025-12-17T17:13:26.395Z level=INFO source=runner.go:106 msg="experimental Vulkan support disabled. To enable, set OLLAMA_VULKAN=1"
time=2025-12-17T17:13:26.442Z level=INFO source=server.go:429 msg="starting runner" cmd="/usr/bin/ollama runner --ollama-engine --port 44297"
time=2025-12-17T17:13:26.472Z level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="29.8 GiB" available="29.8 GiB"
time=2025-12-17T17:13:26.472Z level=INFO source=routes.go:1648 msg="entering low vram mode" "total vram"="0 B" threshold="20.0 GiB"
Error: accepts 1 arg(s), received 2
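That `Error: accepts 1 arg(s), received 2` line looks like the whole variable is being handed to a single `ollama pull`, which takes exactly one model name per call. A minimal sketch of a workaround, assuming you can control the startup step yourself (the `pull_commands` helper is made up for illustration):

```python
import shlex

def pull_commands(value):
    """Build one `ollama pull` command per comma-separated model name,
    since the CLI accepts exactly one model argument per invocation."""
    return [f"ollama pull {shlex.quote(m.strip())}"
            for m in value.split(",") if m.strip()]

# Example: the list that failed when passed as a single value.
for cmd in pull_commands("llama3.2:3b,gemma3:4b"):
    print(cmd)
```

Running the generated commands one at a time (e.g. from a custom start script) sidesteps the one-argument limit, regardless of whether the template itself supports a list.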
And about the folder issue: I don't actually want to upload a folder. I was asking whether there's a way to add another model from the UI, or, if you know where that "folder" is supposed to come from, let me know, because either way I can't add any.
As you can see, I can only import Presets, and I have no idea what those are; excuse my ignorance man lol.
Finally, thanks for the heads-up on gemma2. If you have tips on even better models for my use case, or in general, that would be much appreciated.
(My use case: I'm building a RAG + vector search pipeline where I need the model for two prompts, one to generate the vector-search terms that find what the user needs, and another to analyze the results, which will be about event sessions.)
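The two-prompt flow described above could be sketched against Ollama's local HTTP API roughly like this; the model names, helper names, and prompt wording are all placeholder assumptions, not anything from this thread:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default local endpoint

def _post(path, payload):
    # Small helper to POST JSON to the Ollama server and decode the reply.
    req = urllib.request.Request(
        OLLAMA_URL + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def embed(text, model="nomic-embed-text"):
    # Prompt 1's role: turn the user's query into a vector for search.
    return _post("/api/embeddings", {"model": model, "prompt": text})["embedding"]

def analyze(results, question, model="llama3.2:3b"):
    # Prompt 2's role: analyze the retrieved event-session results.
    prompt = f"Question: {question}\nRetrieved sessions:\n{results}\nAnswer:"
    return _post("/api/generate",
                 {"model": model, "prompt": prompt, "stream": False})["response"]

def cosine(a, b):
    """Rank stored session vectors against the query vector."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)
```

The idea is to embed each event session once at index time, use `embed` + `cosine` to retrieve the closest sessions for a query, then pass that shortlist to `analyze` for the final answer.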
Thank you for your time.
Status changed to Open brody • 3 months ago
3 months ago
You helped with everything I needed, man. Thanks a lot.
Got any tips on the fastest, lowest-usage, best-performing model?
That's my Discord if you want to connect: USRID:328357119193513984
3 months ago
Oh, also: having tried Ollama on Railway, I don't find it fast or cheap at all; it costs about the same and runs much slower. What do you think is the best alternative? n8n workarounds, or something else?