Abnormally poor spelling?

by nawoalanor - opened 24 days ago

Discussion

nawoalanor

24 days ago

I experimented with Q4_K_M quantization from here: mradermacher/XORTRON.CriminalComputing.LARGE.2026.3-i1-GGUF

I used this startup command:

llama-server.exe -m "A:\AI\Llama\Models\etc\XORTRON.CriminalComputing.LARGE.2026.3.i1-Q4_K_M.gguf" -ctk q8_0 -ctv q8_0 -ngl 99 -ub 2048 --parallel 1 --alias llama --no-mmap --ctx-size 131072 --port 5001 --flash-attn on --host 0.0.0.0

Based on the UGI benchmark result I was expecting very good (or at least interesting) results but, without exaggerating, I got the worst output quality of any model I've ever tried, worse than a 4B or 2B model running on my phone. The most obvious problem was terrible spelling: roughly 1 in 15 words had an obvious spelling error. For example, the word "amber" became "ambor".

Any idea what I might've done wrong to get such bad results? It's hard to believe it's a problem with the model itself. My parameters were very "normal", the same as I'd use with any other model.

darkc0de

Owner 23 days ago

I'm guessing, but it might be caused by the --cache-type-k q8_0 and --cache-type-v q8_0 flags (-ctk and -ctv in your command).
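
One way to test that guess is to rerun the exact same command without the KV-cache quantization flags and compare the spelling. The sketch below only builds the baseline command from the one posted above. The helper name `strip_kv_cache_flags` is hypothetical, and it assumes that omitting these flags makes llama-server fall back to its default (unquantized) cache type. It does not confirm that the flags are the actual cause.

```python
# Sketch: derive a baseline llama-server command with KV-cache quantization removed.
# Assumption: leaving out -ctk/-ctv makes llama-server use its default cache type.

# Short and long forms of the cache-type flags; each one takes a single value.
KV_CACHE_FLAGS = {"-ctk", "--cache-type-k", "-ctv", "--cache-type-v"}


def strip_kv_cache_flags(args):
    """Return a copy of args without the KV-cache type flags and their values."""
    cleaned = []
    skip_next = False
    for arg in args:
        if skip_next:
            # This is the value that belonged to a removed flag (e.g. "q8_0").
            skip_next = False
            continue
        if arg in KV_CACHE_FLAGS:
            skip_next = True
            continue
        cleaned.append(arg)
    return cleaned


# The command from the original post, split into arguments.
original = [
    "llama-server.exe",
    "-m", r"A:\AI\Llama\Models\etc\XORTRON.CriminalComputing.LARGE.2026.3.i1-Q4_K_M.gguf",
    "-ctk", "q8_0",
    "-ctv", "q8_0",
    "-ngl", "99",
    "-ub", "2048",
    "--parallel", "1",
    "--alias", "llama",
    "--no-mmap",
    "--ctx-size", "131072",
    "--port", "5001",
    "--flash-attn", "on",
    "--host", "0.0.0.0",
]

baseline = strip_kv_cache_flags(original)
print(" ".join(baseline))
```

If the misspellings disappear with the baseline command, the cache quantization is the likely culprit. If they persist, the next things to vary would be the quant file itself or the other flags, one at a time.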