Qwopus3.5-27B-v3-TQ3_4S

TQ3_4S is a 3.5-bit weight format based on the Walsh-Hadamard transform, with four per-8-weight scales in each 32-weight block.

This release is a TQ3_4S GGUF quantization of Jackrong/Qwopus3.5-27B-v3, which is itself derived from the Qwen3.5-27B family.

Quantization Source

  • HF source checkout:
    • Jackrong/Qwopus3.5-27B-v3
  • upstream family:
    • Qwen/Qwen3.5-27B
  • F16 GGUF used as the quantization source:
    • Qwopus3.5-27B-v3-f16.gguf

Quantized with:

# llama-quantize <input-gguf> <output-gguf> <type> [nthreads]
./build/bin/llama-quantize \
  /path/to/Qwopus3.5-27B-v3-f16.gguf \
  /path/to/Qwopus3.5-27B-v3-TQ3_4S.gguf \
  TQ3_4S \
  8
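Before uploading, it is worth a quick sanity check that the quantized file exists and loads for a short generation. A minimal sketch (paths and the test prompt are illustrative, not part of this release):

```shell
# Confirm the quantized GGUF was written and is loadable.
ls -lh /path/to/Qwopus3.5-27B-v3-TQ3_4S.gguf

# Short non-interactive generation as a load test.
./build/bin/llama-cli \
  -m /path/to/Qwopus3.5-27B-v3-TQ3_4S.gguf \
  -p "Hello" -n 16
```

If the model loads and emits a few tokens without errors, the quantization pass completed cleanly.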

Quality

Full pass over wiki.test.raw at a context of 2048:

  • Final PPL = 6.3433 +/- 0.03999
  • Median chunk PPL = 6.1953
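These numbers are the kind reported by llama.cpp's perplexity tool; a sketch of the invocation used to reproduce them (paths are illustrative):

```shell
# Full-pass perplexity over wiki.test.raw at context 2048.
./build/bin/llama-perplexity \
  -m /path/to/Qwopus3.5-27B-v3-TQ3_4S.gguf \
  -f /path/to/wiki.test.raw \
  -c 2048
```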

Runtime Validation

Validated on a clean public checkout of llama.cpp-tq3 main:

  • runtime commit: 62eb27dce
  • runtime requirement:
    • turbo-tan/llama.cpp-tq3
  • strict chat smoke:
    • prompt: Write ONLY the word ok.
    • response: ok
  • multimodal projector:
    • mmproj.gguf

Validated server profile:

./build/bin/llama-server \
  -m /path/to/Qwopus3.5-27B-v3-TQ3_4S.gguf \
  -mm /path/to/mmproj.gguf \
  -a qwopus35-27b-v3-tq3_4s \
  --host 127.0.0.1 --port 8080 \
  -ngl 99 -c 8192 -np 1 \
  -ctk q8_0 -ctv q8_0 -fa on \
  --no-warmup --jinja \
  --reasoning off --reasoning-budget 0 --reasoning-format deepseek \
  --cache-ram 0 --no-mmproj-offload
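Once the server is up, the strict chat smoke above can be reproduced against its OpenAI-compatible endpoint. A sketch, assuming the host/port and alias from the profile above:

```shell
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwopus35-27b-v3-tq3_4s",
    "messages": [{"role": "user", "content": "Write ONLY the word ok."}]
  }'
```

The response body should contain a single assistant message whose content is `ok`.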

Recommended Chat Settings

For cleaner short-answer behavior on this reasoning-distilled model, the best local setting I found was:

--reasoning on --reasoning-budget 0 --temp 0.6 --top-k 20 --min-p 0 --repeat-penalty 1.0

On simple prompts, this suppresses visible thinking-tag spill more reliably than --reasoning off.
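When talking to llama-server rather than a local CLI, the sampler portion of these settings can also be sent per request; llama-server accepts extra sampling fields alongside the standard OpenAI ones. A sketch (the prompt is illustrative):

```shell
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwopus35-27b-v3-tq3_4s",
    "messages": [{"role": "user", "content": "What is 2+2? Answer with one number."}],
    "temperature": 0.6,
    "top_k": 20,
    "min_p": 0,
    "repeat_penalty": 1.0
  }'
```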

Vision / Image Input

The repo includes mmproj.gguf for multimodal use.

If your frontend reports that image input is unsupported, it is usually talking to an older server process that was started without the mmproj (-mm / --mmproj) flag.
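A quick end-to-end check that image input is wired up is an OpenAI-style request with an embedded image, assuming the server was started with the mmproj as in the profile above. A sketch (the image path and prompt are illustrative; `base64 -w0` is the GNU coreutils form):

```shell
# Base64-encode a local image and send it as a data URI.
IMG=$(base64 -w0 /path/to/test.png)
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwopus35-27b-v3-tq3_4s",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this image in one sentence."},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,'"$IMG"'"}}
      ]
    }]
  }'
```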

Notes

  • This is a weight quantization release for the Qwopus v3 model line.
  • Running this GGUF requires the TQ3_4S runtime in:
    • turbo-tan/llama.cpp-tq3

