Sem Karaman's picture

Sem Karaman

SemKara

·

AI & ML interests

None yet

Recent Activity

repliedto danielhanchen's post about 11 hours ago

Qwen3.6-35B-A3B can now be run locally! 💜 The model is the strongest mid-sized LLM on nearly all benchmarks. Run on 23GB RAM via Unsloth Dynamic GGUFs. GGUFs to run: https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF Guide: https://unsloth.ai/docs/models/qwen3.6

repliedto danielhanchen's post about 14 hours ago

Qwen3.6-35B-A3B can now be run locally! 💜 The model is the strongest mid-sized LLM on nearly all benchmarks. Run on 23GB RAM via Unsloth Dynamic GGUFs. GGUFs to run: https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF Guide: https://unsloth.ai/docs/models/qwen3.6

repliedto danielhanchen's post about 14 hours ago

Qwen3.6-35B-A3B can now be run locally! 💜 The model is the strongest mid-sized LLM on nearly all benchmarks. Run on 23GB RAM via Unsloth Dynamic GGUFs. GGUFs to run: https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF Guide: https://unsloth.ai/docs/models/qwen3.6

View all activity

Organizations

None yet

replied to danielhanchen's post about 11 hours ago

me too, if I could fit it on my GPU. but I highly doubt it's the quantization. 3.5 35B and all sorts of merges of it works perfectly fine at pretty much any quant

i see, in that case if you already tried sampling (repeat penalty 1.05-1.1) and quant seems fine then only other possible fix that worked sometimes for me is a direct system prompt to 'not overthink' and 'allow knock on errors after first solution'

also i think it's a given that you should be running 0.2 temp for 'precise' responses

replied to danielhanchen's post about 14 hours ago

i would try Q8 as last resort

replied to danielhanchen's post about 14 hours ago

I see, this is usually because you are using Q4 quants locally? Try to go up to Q5 or Q8, reduce GPU layers and context length to fit into you VRAM.

New activity in Jackrong/Qwopus3.5-27B-v3.5-GGUF about 14 hours ago

So far my top choice

#1 opened about 14 hours ago by

liked a model about 14 hours ago

Jackrong/Qwopus3.5-27B-v3.5-GGUF

Image-Text-to-Text • 27B • Updated 3 days ago • 6.27k • 19

New activity in unsloth/Qwen3.6-35B-A3B-GGUF 1 day ago

Sampling?

#13 opened 1 day ago by

replied to danielhanchen's post 1 day ago

oh 3.6 35B is a literal never ending reasoning loop for me. like 3 out 6 times need to kill the server type of deal

post your sampling setup

New activity in BugTraceAI/BugTraceAI-Apex-G4-26B-Q4 1 day ago

Dropping Tools

#1 opened 1 day ago by

New activity in Ex0bit/Gemma4-26B-A4B-PRISM-PRO-DQ-GGUF 6 days ago

Why "Mythos"?

#1 opened 6 days ago by

New activity in douyamv/Gemma-4-31B-JANG_4M-CRACK-GGUF 10 days ago

vision capabilites

#4 opened 10 days ago by

New activity in Jackrong/Qwopus3.5-27B-v3-GGUF 10 days ago

[Help] Why is Qwopus3.5-27B-v3 outputting its internal thinking and duplicating text?

#11 opened 11 days ago by

Failed to generate a valid tool call.

#10 opened 13 days ago by

New activity in Jackrong/Qwopus3.5-27B-v3-GGUF 12 days ago

Doesnt work in lm studio

#8 opened 16 days ago by

liked a model 12 days ago

Jackrong/Qwopus3.5-27B-v3-GGUF

Image-Text-to-Text • 27B • Updated 3 days ago • 166k • 341