Eval bug: Server returns 500 error on /api/generate and /api/chat requests #12176

blues-alex · 2025-03-04T12:25:24Z

Name and Version

Environment:

OS: Manjaro Linux
Ollama Version: 0.5.12
Installed Models: qwen2.5-coder:7b, MFDoom/deepseek-r1-tool-calling:32b, deepseek-r1:32b, qwen2.5-coder:32b

Operating systems

Linux

GGML backends

CUDA

Hardware

CPU/GPU:
- 12th Gen Intel(R) Core(TM) i5-12450H
- NVIDIA GA107M [GeForce RTX 3050 Mobile] (4096MiB)
  - SOFTWARE:
    - NVIDIA-SMI 570.124.04, Driver Version: 570.124.04, CUDA Version: 12.8

Models

DeepSeek-R1-Distill-Qwen-14B-Q8_0.gguf

Problem description & steps to reproduce

Steps to Reproduce:

1. Start the Ollama server using ollama serve.
2. Send a request to either the /api/generate or the /api/chat endpoint.
3. Observe the server response.

Expected Behavior:
The server should return a successful response with HTTP status code 200.

Actual Behavior:
The server returns an error response with HTTP status code 500.

Example Request:

curl -X POST http://localhost:11434/api/generate \
-H "Content-Type: application/json" \
-d '{"model":"deepseek-r1_Q8_0:14","prompt":"Hi. How are You?"}'
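To see the HTTP status code next to the response body, the request above could be wrapped roughly as follows. This is only a sketch: the function name ollama_generate and the OLLAMA_URL variable are illustrative and not part of Ollama. The prompt is pasted into the JSON without escaping, so it assumes a prompt with no double quotes or backslashes.

```shell
#!/bin/sh
# Sketch: send the /api/generate request and print the HTTP status and body.
# OLLAMA_URL and ollama_generate are illustrative names (assumptions).

OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Usage: ollama_generate MODEL PROMPT
# Prints "status=<code>" on the first line, then the response body.
# The body runs in a subshell ( ... ) so its variables stay local.
ollama_generate() (
  model="$1"
  prompt="$2"
  # "stream": false requests one JSON object instead of a stream of chunks.
  # -w appends a newline and the status code after the body.
  response="$(curl -sS -X POST "$OLLAMA_URL/api/generate" \
    -H "Content-Type: application/json" \
    -d "{\"model\":\"$model\",\"prompt\":\"$prompt\",\"stream\":false}" \
    -w '\n%{http_code}')"
  status="$(printf '%s\n' "$response" | tail -n 1)"  # last line: status code
  body="$(printf '%s\n' "$response" | sed '$d')"     # everything before it
  printf 'status=%s\n%s\n' "$status" "$body"
)

# Example call (requires a running server):
#   ollama_generate "deepseek-r1_Q8_0:14" "Hi. How are You?"
```

With the failure described here, the first output line would be expected to read status=500, followed by whatever error body the server returns.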

First Bad Commit

No response

Relevant log output

2025/03/04 16:02:35 routes.go:1205: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:10m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:$HOME/Data/Models/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:true OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:true OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2025-03-04T16:02:35.483+04:00 level=INFO source=routes.go:1256 msg="Listening on 127.0.0.1:11434 (version 0.5.12)"
time=2025-03-04T16:02:35.483+04:00 level=DEBUG source=sched.go:106 msg="starting llm scheduler"
time=2025-03-04T16:02:35.483+04:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-03-04T16:02:35.483+04:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-03-04T16:02:35.483+04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=libcuda.so*
time=2025-03-04T16:02:35.483+04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[/usr/lib/ollama/libcuda.so* $HOME/libcuda.so* /usr/local/cuda*/targets/*/lib/libcuda.so* /usr/lib/*-linux-gnu/nvidia/current/libcuda.so* /usr/lib/*-linux-gnu/libcuda.so* /usr/lib/wsl/lib/libcuda.so* /usr/lib/wsl/drivers/*/libcuda.so* /opt/cuda/lib*/libcuda.so* /usr/local/cuda/lib*/libcuda.so* /usr/lib*/libcuda.so* /usr/local/lib*/libcuda.so*]"
time=2025-03-04T16:02:35.502+04:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[/usr/lib/libcuda.so.570.124.04 /usr/lib32/libcuda.so.570.124.04 /usr/lib64/libcuda.so.570.124.04]"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:35.611+04:00 level=DEBUG source=gpu.go:125 msg="detected GPUs" count=1 library=/usr/lib/libcuda.so.570.124.04
[GPU-34fcc318-430c-5f94-4d32-922cfac59ff1] CUDA totalMem 3779 mb
[GPU-34fcc318-430c-5f94-4d32-922cfac59ff1] CUDA freeMem 3601 mb
[GPU-34fcc318-430c-5f94-4d32-922cfac59ff1] Compute Capability 8.6
time=2025-03-04T16:02:35.759+04:00 level=DEBUG source=amd_linux.go:419 msg="amdgpu driver not detected /sys/module/amdgpu"
releasing cuda driver library
time=2025-03-04T16:02:35.759+04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3050 Laptop GPU" total="3.7 GiB" available="3.5 GiB"
time=2025-03-04T16:02:44.071+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.2 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.1 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:44.240+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:44.240+04:00 level=DEBUG source=sched.go:182 msg="updating default concurrency" OLLAMA_MAX_LOADED_MODELS=3 gpu_count=1
time=2025-03-04T16:02:44.263+04:00 level=DEBUG source=sched.go:225 msg="loading first model" model=$HOME/Data/Models/.ollama/models/blobs/sha256-234bf54d16c772dc566325ffeadcb78b5a1d29b60e340da95d303da37ce33c13
time=2025-03-04T16:02:44.263+04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[3.5 GiB]"
time=2025-03-04T16:02:44.263+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.1 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.1 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:44.404+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:44.404+04:00 level=WARN source=ggml.go:132 msg="key not found" key=qwen2.attention.key_length default=128
time=2025-03-04T16:02:44.404+04:00 level=WARN source=ggml.go:132 msg="key not found" key=qwen2.attention.value_length default=128
time=2025-03-04T16:02:44.404+04:00 level=WARN source=ggml.go:132 msg="key not found" key=qwen2.attention.key_length default=128
time=2025-03-04T16:02:44.404+04:00 level=WARN source=ggml.go:132 msg="key not found" key=qwen2.attention.value_length default=128
time=2025-03-04T16:02:44.404+04:00 level=DEBUG source=memory.go:185 msg="gpu has too little memory to allocate any layers" id=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3050 Laptop GPU" total="3.7 GiB" available="3.5 GiB" minimum_memory=479199232 layer_size="407.0 MiB" gpu_zer_overhead="0 B" partial_offload="3.2 GiB" full_offload="2.6 GiB"
time=2025-03-04T16:02:44.405+04:00 level=DEBUG source=memory.go:329 msg="insufficient VRAM to load any model layers"
time=2025-03-04T16:02:44.405+04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[3.5 GiB]"
time=2025-03-04T16:02:44.405+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.1 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.1 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:44.523+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:44.523+04:00 level=WARN source=ggml.go:132 msg="key not found" key=qwen2.attention.key_length default=128
time=2025-03-04T16:02:44.523+04:00 level=WARN source=ggml.go:132 msg="key not found" key=qwen2.attention.value_length default=128
time=2025-03-04T16:02:44.523+04:00 level=WARN source=ggml.go:132 msg="key not found" key=qwen2.attention.key_length default=128
time=2025-03-04T16:02:44.524+04:00 level=WARN source=ggml.go:132 msg="key not found" key=qwen2.attention.value_length default=128
time=2025-03-04T16:02:44.524+04:00 level=DEBUG source=memory.go:185 msg="gpu has too little memory to allocate any layers" id=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3050 Laptop GPU" total="3.7 GiB" available="3.5 GiB" minimum_memory=479199232 layer_size="407.0 MiB" gpu_zer_overhead="0 B" partial_offload="3.2 GiB" full_offload="2.6 GiB"
time=2025-03-04T16:02:44.524+04:00 level=DEBUG source=memory.go:329 msg="insufficient VRAM to load any model layers"
time=2025-03-04T16:02:44.524+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.1 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.0 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:44.640+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:44.640+04:00 level=INFO source=server.go:97 msg="system memory" total="62.5 GiB" free="52.0 GiB" free_swap="24.0 GiB"
time=2025-03-04T16:02:44.640+04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[3.5 GiB]"
time=2025-03-04T16:02:44.640+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.0 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.1 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:44.754+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:44.754+04:00 level=WARN source=ggml.go:132 msg="key not found" key=qwen2.attention.key_length default=128
time=2025-03-04T16:02:44.754+04:00 level=WARN source=ggml.go:132 msg="key not found" key=qwen2.attention.value_length default=128
time=2025-03-04T16:02:44.754+04:00 level=WARN source=ggml.go:132 msg="key not found" key=qwen2.attention.key_length default=128
time=2025-03-04T16:02:44.754+04:00 level=WARN source=ggml.go:132 msg="key not found" key=qwen2.attention.value_length default=128
time=2025-03-04T16:02:44.754+04:00 level=DEBUG source=memory.go:185 msg="gpu has too little memory to allocate any layers" id=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3050 Laptop GPU" total="3.7 GiB" available="3.5 GiB" minimum_memory=479199232 layer_size="407.0 MiB" gpu_zer_overhead="0 B" partial_offload="3.2 GiB" full_offload="2.6 GiB"
time=2025-03-04T16:02:44.754+04:00 level=DEBUG source=memory.go:329 msg="insufficient VRAM to load any model layers"
time=2025-03-04T16:02:44.754+04:00 level=INFO source=server.go:130 msg=offload library=cuda layers.requested=-1 layers.model=49 layers.offload=0 layers.split="" memory.available="[3.5 GiB]" memory.gpu_overhead="0 B" memory.required.full="19.8 GiB" memory.required.partial="0 B" memory.required.kv="6.0 GiB" memory.required.allocations="[0 B]" memory.weights.total="19.1 GiB" memory.weights.repeating="18.3 GiB" memory.weights.nonrepeating="788.9 MiB" memory.graph.full="2.6 GiB" memory.graph.partial="3.2 GiB"
time=2025-03-04T16:02:44.754+04:00 level=WARN source=server.go:170 msg="flash attention enabled but not supported by gpu"
time=2025-03-04T16:02:44.754+04:00 level=DEBUG source=server.go:259 msg="compatible gpu libraries" compatible=[]
time=2025-03-04T16:02:44.755+04:00 level=DEBUG source=gpu.go:695 msg="no filter required for library cpu"
time=2025-03-04T16:02:44.755+04:00 level=INFO source=server.go:380 msg="starting llama server" cmd="/usr/bin/ollama runner --ollama-engine --model $HOME/Data/Models/.ollama/models/blobs/sha256-234bf54d16c772dc566325ffeadcb78b5a1d29b60e340da95d303da37ce33c13 --ctx-size 32768 --batch-size 512 --verbose --threads 4 --no-mmap --parallel 1 --port 42975"
time=2025-03-04T16:02:44.755+04:00 level=DEBUG source=server.go:398 msg=subprocess environment="[CUDA_PATH=/opt/cuda PATH=$HOME/.nvm/versions/node/v23.7.0/bin:$HOME/go/bin:$HOME/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/opt/cuda/bin:/opt/cuda/nsight_compute:/opt/cuda/nsight_systems/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/var/lib/snapd/snap/bin LD_LIBRARY_PATH=/usr/lib/ollama]"
time=2025-03-04T16:02:44.755+04:00 level=INFO source=sched.go:450 msg="loaded runners" count=1
time=2025-03-04T16:02:44.755+04:00 level=INFO source=server.go:557 msg="waiting for llama runner to start responding"
time=2025-03-04T16:02:44.756+04:00 level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server error"
time=2025-03-04T16:02:44.765+04:00 level=INFO source=runner.go:887 msg="starting ollama engine"
time=2025-03-04T16:02:44.766+04:00 level=INFO source=runner.go:943 msg="Server listening on 127.0.0.1:42975"
time=2025-03-04T16:02:44.790+04:00 level=WARN source=ggml.go:132 msg="key not found" key=general.description default=""
time=2025-03-04T16:02:44.790+04:00 level=INFO source=ggml.go:95 msg="" architecture=qwen2 file_type=Q8_0 name="DeepSeek R1 Distill Qwen 14B" description="" num_tensors=579 num_key_values=31
time=2025-03-04T16:02:44.790+04:00 level=DEBUG source=ggml.go:89 msg="ggml backend load all from path" path=/usr/lib/ollama
load_backend: loaded CPU backend from /usr/lib/ollama/libggml-cpu-alderlake.so
time=2025-03-04T16:02:44.792+04:00 level=INFO source=ggml.go:110 msg=cpu device.name=CPU device.description="12th Gen Intel(R) Core(TM) i5-12450H" device.kind=cpu device.free="0 B" device.total="0 B"
time=2025-03-04T16:02:44.792+04:00 level=INFO source=ggml.go:110 msg=cpu device.name=CPU device.description="12th Gen Intel(R) Core(TM) i5-12450H" device.kind=cpu device.free="0 B" device.total="0 B"
time=2025-03-04T16:02:45.106+04:00 level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server loading model"
panic: unsupported model architecture "qwen2"

goroutine 8 [running]:
github.com/ollama/ollama/runner/ollamarunner.(*Server).loadModel(0xc000710700, {0x7ffc860fbd04?, 0x0?}, {0x4, 0x0, 0x0, {0x0, 0x0, 0x0}}, {0x0, ...}, ...)
	/build/ollama/src/ollama/runner/ollamarunner/runner.go:815 +0x379
created by github.com/ollama/ollama/runner/ollamarunner.Execute in goroutine 1
	/build/ollama/src/ollama/runner/ollamarunner/runner.go:918 +0x9a5
time=2025-03-04T16:02:47.386+04:00 level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server not responding"
time=2025-03-04T16:02:47.637+04:00 level=ERROR source=sched.go:456 msg="error loading llama server" error="llama runner process has terminated: exit status 2"
time=2025-03-04T16:02:47.637+04:00 level=DEBUG source=sched.go:459 msg="triggering expiration for failed load" model=$HOME/Data/Models/.ollama/models/blobs/sha256-234bf54d16c772dc566325ffeadcb78b5a1d29b60e340da95d303da37ce33c13
time=2025-03-04T16:02:47.638+04:00 level=DEBUG source=sched.go:361 msg="runner expired event received" modelPath=$HOME/Data/Models/.ollama/models/blobs/sha256-234bf54d16c772dc566325ffeadcb78b5a1d29b60e340da95d303da37ce33c13
time=2025-03-04T16:02:47.638+04:00 level=DEBUG source=sched.go:376 msg="got lock to unload" modelPath=$HOME/Data/Models/.ollama/models/blobs/sha256-234bf54d16c772dc566325ffeadcb78b5a1d29b60e340da95d303da37ce33c13
[GIN] 2025/03/04 - 16:02:47 | 500 |  3.582745881s |       127.0.0.1 | POST     "/api/generate"
time=2025-03-04T16:02:47.638+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.1 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.1 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:47.781+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:47.781+04:00 level=DEBUG source=server.go:1081 msg="stopping llama server"
time=2025-03-04T16:02:47.781+04:00 level=DEBUG source=sched.go:381 msg="runner released" modelPath=$HOME/Data/Models/.ollama/models/blobs/sha256-234bf54d16c772dc566325ffeadcb78b5a1d29b60e340da95d303da37ce33c13
time=2025-03-04T16:02:48.032+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.1 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.0 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:48.186+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:48.283+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.0 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="51.9 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:48.443+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:48.532+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="51.9 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="51.9 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:48.668+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:48.783+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="51.9 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="51.9 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:48.919+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:49.033+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="51.9 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="51.9 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:49.172+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:49.282+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="51.9 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="51.8 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:49.421+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:49.532+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="51.8 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="51.8 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:49.709+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:49.782+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="51.8 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.0 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:49.922+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:50.033+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.0 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.0 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:50.165+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:50.283+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.0 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="51.9 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:50.422+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:50.532+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="51.9 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.0 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:50.669+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:50.782+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.0 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.1 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:50.917+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:51.032+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.1 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.0 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:51.172+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:51.282+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.0 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.0 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:51.417+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:51.533+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.0 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.0 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:51.670+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:51.783+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.0 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.1 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:51.919+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:52.032+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.1 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.1 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:52.168+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:52.283+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.1 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.0 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:52.435+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:52.532+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.0 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="51.9 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:52.664+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:52.782+04:00 level=WARN source=sched.go:647 msg="gpu VRAM usage didn't recover within timeout" seconds=5.144594331 model=$HOME/Data/Models/.ollama/models/blobs/sha256-234bf54d16c772dc566325ffeadcb78b5a1d29b60e340da95d303da37ce33c13
time=2025-03-04T16:02:52.782+04:00 level=DEBUG source=sched.go:385 msg="sending an unloaded event" modelPath=$HOME/Data/Models/.ollama/models/blobs/sha256-234bf54d16c772dc566325ffeadcb78b5a1d29b60e340da95d303da37ce33c13
time=2025-03-04T16:02:52.782+04:00 level=DEBUG source=sched.go:309 msg="ignoring unload event with no pending requests"
time=2025-03-04T16:02:52.782+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="51.9 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="52.0 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1
time=2025-03-04T16:02:52.917+04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-34fcc318-430c-5f94-4d32-922cfac59ff1 name="NVIDIA GeForce RTX 3050 Laptop GPU" overhead="0 B" before.total="3.7 GiB" before.free="3.5 GiB" now.total="3.7 GiB" now.free="3.5 GiB" now.used="177.4 MiB"
releasing cuda driver library
time=2025-03-04T16:02:53.032+04:00 level=WARN source=sched.go:647 msg="gpu VRAM usage didn't recover within timeout" seconds=5.394038201 model=$HOME/Data/Models/.ollama/models/blobs/sha256-234bf54d16c772dc566325ffeadcb78b5a1d29b60e340da95d303da37ce33c13
time=2025-03-04T16:02:53.032+04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="62.5 GiB" before.free="52.0 GiB" before.free_swap="24.0 GiB" now.total="62.5 GiB" now.free="51.9 GiB" now.free_swap="24.0 GiB"
initializing /usr/lib/libcuda.so.570.124.04
dlsym: cuInit - 0x7f047fd0de60
dlsym: cuDriverGetVersion - 0x7f047fd0de80
dlsym: cuDeviceGetCount - 0x7f047fd0dec0
dlsym: cuDeviceGet - 0x7f047fd0dea0
dlsym: cuDeviceGetAttribute - 0x7f047fd0dfa0
dlsym: cuDeviceGetUuid - 0x7f047fd0df00
dlsym: cuDeviceGetName - 0x7f047fd0dee0
dlsym: cuCtxCreate_v3 - 0x7f047fd0e180
dlsym: cuMemGetInfo_v2 - 0x7f047fd0e900
dlsym: cuCtxDestroy - 0x7f047fd6ca80
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 1

fairydreaming · 2025-03-05T09:44:28Z

This is the llama.cpp project. You seem to be having problems with Ollama, so post your issue here: https://github.com/ollama/ollama

blues-alex added the bug-unconfirmed label Mar 4, 2025
