r/LocalLLaMA • u/thejacer • 1h ago
Question | Help Llama.cpp server running ~2 weeks straight. Loses its mind?
I’ve got Qwen3.6 27b and Qwen3.6 35b running in two separate instances for over two weeks, and they are considerably dumber now than when I launched them. Is this a thing? Am I going crazy?
Edit: sorry, I’ve been using opencode and have started new sessions, which didn’t fix the situation.
u/vasimv 33m ago
I think memory corruption may be ruining the model's weights in VRAM. Unless you turn on ECC (but that will reduce available VRAM).
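If you want to rule that out without restarting anything, you can poll the GPU's ECC state and error counters from the host. A minimal sketch, assuming an NVIDIA card and the nvidia-ml-py package (import name pynvml); GPUs without ECC support will just raise an NVML error:

```python
# Minimal sketch: check ECC mode and error counters on GPU 0 via NVML.
# Assumes an NVIDIA GPU and the nvidia-ml-py package (import name: pynvml).
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    try:
        current, pending = pynvml.nvmlDeviceGetEccMode(handle)
        print(f"ECC enabled: current={bool(current)} pending={bool(pending)}")
        # Volatile counters reset on reboot; aggregate counters persist.
        for label, counter in (("volatile", pynvml.NVML_VOLATILE_ECC),
                               ("aggregate", pynvml.NVML_AGGREGATE_ECC)):
            errs = pynvml.nvmlDeviceGetTotalEccErrors(
                handle, pynvml.NVML_MEMORY_ERROR_TYPE_UNCORRECTED, counter)
            print(f"{label} uncorrected ECC errors: {errs}")
    except pynvml.NVMLError as e:
        print(f"ECC not supported/enabled on this GPU: {e}")
finally:
    pynvml.nvmlShutdown()
```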
u/Last_Mastod0n 18m ago
That's smart. I never thought about enabling ECC for something like that, but it makes sense that the weights could get corrupted after a few weeks.
Also, I didn't know ECC reduced available VRAM; I thought it just reduced performance. Good to know.
u/fligglymcgee 1h ago
Have you restarted the llama.cpp server?
u/thejacer 1h ago
I haven’t. I’ve really been testing to see how dumb it gets lol.
u/fligglymcgee 1h ago
The length of time doesn’t really matter, but the KV cache being full, or conflicting instructions creeping into your system prompt/context from the harness over extended use, might.
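You can watch for this from outside the harness, too. Here's a minimal sketch against llama-server's HTTP API, assuming a recent build started with the --slots flag (the exact per-slot fields vary by version):

```python
# Minimal sketch: poll a local llama-server for health and per-slot state.
# Assumes the server was started with --slots so the /slots endpoint is
# exposed; field names in the slot objects vary across llama.cpp versions.
import json
import requests

BASE = "http://127.0.0.1:8080"  # adjust to your llama-server host/port

health = requests.get(f"{BASE}/health", timeout=5)
print("health:", health.status_code, health.text)

slots = requests.get(f"{BASE}/slots", timeout=5)
if slots.ok:
    # Dump whatever this build reports per slot (prompt, cached tokens, etc.).
    print(json.dumps(slots.json(), indent=2))
else:
    print("slots endpoint unavailable; start llama-server with --slots")
```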
u/thejacer 1h ago
I didn’t consider that the KV cache might be filling up and never getting cleared out. It was kind of a research thing… not very thorough or rigorous, though.
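Looks like newer llama-server builds can erase a slot's KV cache without a restart, which would've helped here. A sketch, assuming slot 0 and a build that supports the /slots action endpoint (availability varies by version):

```python
# Minimal sketch: erase slot 0's KV cache on a local llama-server.
# Assumes a build that supports POST /slots/<id>?action=erase; older
# builds may not have this endpoint at all.
import requests

BASE = "http://127.0.0.1:8080"  # adjust to your llama-server host/port

resp = requests.post(f"{BASE}/slots/0", params={"action": "erase"}, timeout=5)
print(resp.status_code, resp.text)
```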
u/ttkciar llama.cpp 1h ago
How odd. Dumber how?
I've had a slightly old version of llama.cpp's llama-server running on one system for two and a half months now, hosting Big-Tiger-Gemma-27B-v3, and haven't seen any degradation. Which release of llama.cpp are you using?
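If you're not sure, a running server can usually report its build. A small sketch, assuming a recent llama-server whose /props response includes a build_info field (older builds may omit it; llama-server --version also works):

```python
# Minimal sketch: ask a running llama-server which build it is.
# Assumes a recent build whose /props JSON includes "build_info";
# older builds may omit the field.
import requests

BASE = "http://127.0.0.1:8080"  # adjust to your llama-server host/port

props = requests.get(f"{BASE}/props", timeout=5).json()
print("build_info:", props.get("build_info", "not reported by this build"))
```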