How to use 8F-ai/Verus-4B with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("image-text-to-text", model="8F-ai/Verus-4B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("8F-ai/Verus-4B", dtype="auto")
How to use 8F-ai/Verus-4B with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "8F-ai/Verus-4B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "8F-ai/Verus-4B",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "Describe this image in one sentence."
},
{
"type": "image_url",
"image_url": {
"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
}
}
]
}
]
}'
# Alternatively, run the model with Docker Model Runner:
docker model run hf.co/8F-ai/Verus-4B
How to use 8F-ai/Verus-4B with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "8F-ai/Verus-4B" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "8F-ai/Verus-4B",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "Describe this image in one sentence."
},
{
"type": "image_url",
"image_url": {
"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
}
}
]
}
]
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "8F-ai/Verus-4B" \
--host 0.0.0.0 \
--port 30000
# Call the server using the same curl request as above.
How to use 8F-ai/Verus-4B with Docker Model Runner:
docker model run hf.co/8F-ai/Verus-4B
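The curl requests above can also be sent from Python. The following is a minimal sketch using only the standard library, assuming a server started with one of the commands above. The helper names `build_chat_payload` and `chat` are hypothetical, and the `choices[0].message.content` lookup assumes the server returns the usual OpenAI chat completions response shape.

```python
import json
import urllib.request

MODEL_ID = "8F-ai/Verus-4B"


def build_chat_payload(prompt, image_url=None, model=MODEL_ID):
    """Build an OpenAI-style chat completions request body (hypothetical helper)."""
    content = [{"type": "text", "text": prompt}]
    if image_url is not None:
        # Same "image_url" content part as in the curl examples.
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {"model": model, "messages": [{"role": "user", "content": content}]}


def chat(payload, base_url="http://localhost:8000"):
    """POST the payload to /v1/chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumes the OpenAI-compatible response layout.
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server):
# payload = build_chat_payload("Describe this image in one sentence.",
#                              image_url="https://example.com/image.jpg")
# print(chat(payload))
```

The default `base_url` targets the vLLM port used above (8000); for the SGLang server, pass `base_url="http://localhost:30000"`.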
This repository contains model weights and configuration files for Verus-4B in the Hugging Face Transformers format.
The model is compatible with Hugging Face Transformers, vLLM, SGLang, and other major inference frameworks.
Primary intended use cases are code generation, code review, debugging, and general coding assistance.
| Property | Value |
|---|---|
| Parameters | ~4B |
| Context Length | 262,144 tokens |
| Architecture | Qwen3.5 |
| Chat Format | ChatML (`<\|im_start\|>` / `<\|im_end\|>`) |
| Dtype | bfloat16 |
| License | Apache 2.0 |
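To make the Chat Format row concrete, here is a rough sketch of how a text-only conversation looks in plain ChatML. The function `render_chatml` is a hypothetical illustration, not part of any library. The chat template shipped with the tokenizer is authoritative and may emit additional tokens, so `tokenizer.apply_chat_template` should be used for real prompts.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render text-only messages in plain ChatML (illustrative only)."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|>.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chatml([
    {"role": "system", "content": "You are Verus, a coding assistant."},
    {"role": "user", "content": "Reverse a string in Python."},
])
print(prompt)
```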
pip install "transformers>=4.52.0" accelerate torch
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

MODEL_ID = "8F-ai/Verus-4B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

messages = [
    {
        "role": "system",
        "content": "You are Verus, a coding assistant made by 8F-ai. You help with coding tasks and keep responses focused and clean."
    },
    {
        "role": "user",
        "content": "Write a Python async context manager that manages a PostgreSQL connection pool using asyncpg."
    }
]

# Render the conversation with the model's chat template, then tokenize it.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.inference_mode():
    # temperature and top_p only take effect when sampling is enabled.
    generated_ids = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.1, top_p=0.95)

# Decode only the newly generated tokens, skipping the prompt.
output = tokenizer.decode(generated_ids[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(output)
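from transformers import AutoProcessor, AutoModelForImageTextToText
import torch

MODEL_ID = "8F-ai/Verus-4B"

# Image inputs need the processor (tokenizer plus image preprocessing) and the
# image-text-to-text model class; a tokenizer alone cannot encode pixels.
# This assumes the repository ships a processor configuration.
processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "path/to/screenshot.png"},
            {"type": "text", "text": "Convert this UI screenshot into a React component using Tailwind CSS."}
        ]
    }
]

# Apply the chat template, load the image, and tokenize in one step.
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.inference_mode():
    # temperature and top_p only take effect when sampling is enabled.
    generated_ids = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.1, top_p=0.95)

# Decode only the newly generated tokens, skipping the prompt.
output = processor.decode(generated_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(output)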
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# 4-bit NF4 quantization with double quantization; compute runs in bfloat16.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained("8F-ai/Verus-4B")
model = AutoModelForCausalLM.from_pretrained(
    "8F-ai/Verus-4B",
    quantization_config=quantization_config,
    device_map="auto",
)
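As a rough guide to what 4-bit loading saves, the arithmetic below estimates the memory taken by the weights alone. It is a back-of-the-envelope sketch: the parameter count is taken as exactly 4e9 (the card only states "~4B"), the helper `weight_memory_gib` is hypothetical, and the result ignores quantization constants, any layers kept in higher precision, activations, and the KV cache, so real usage will be higher.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate weight storage in GiB for a given precision."""
    return num_params * bits_per_param / 8 / 1024**3


NUM_PARAMS = 4e9  # assumption: "~4B" treated as exactly 4 billion

bf16 = weight_memory_gib(NUM_PARAMS, 16)  # bfloat16: 2 bytes per parameter
nf4 = weight_memory_gib(NUM_PARAMS, 4)    # 4-bit: 0.5 bytes per parameter

print(f"bfloat16: ~{bf16:.1f} GiB, 4-bit: ~{nf4:.1f} GiB")
# prints: bfloat16: ~7.5 GiB, 4-bit: ~1.9 GiB
```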
| Use Case | Example |
|---|---|
| Code Generation | Write functions, classes, scripts in any language |
| Debugging | Identify and fix bugs from error messages or code |
| Code Review | Suggest improvements, catch issues, explain code |
| UI to Code | Convert screenshots or diagrams into working code |
| Long Context Codebase | Reason over entire repos up to ~200K tokens |
| General Q&A | Answer programming questions clearly and concisely |
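For the long-context use case, the prompt and the generated tokens share the same context window, so it can help to check the budget before sending a large codebase. The sketch below uses the 262,144-token context length from the properties table. The helper names are hypothetical, and in practice the prompt token count would come from the tokenizer (for example `len(tokenizer(text).input_ids)`).

```python
CONTEXT_LENGTH = 262_144  # from the model properties table


def max_prompt_tokens(max_new_tokens, context_length=CONTEXT_LENGTH):
    """Largest prompt that still leaves room for max_new_tokens of output."""
    if not 0 <= max_new_tokens <= context_length:
        raise ValueError("max_new_tokens must be between 0 and context_length")
    return context_length - max_new_tokens


def fits_in_context(prompt_tokens, max_new_tokens, context_length=CONTEXT_LENGTH):
    """True if prompt plus generation stays within the context window."""
    return prompt_tokens + max_new_tokens <= context_length


print(max_prompt_tokens(2048))         # 260096
print(fits_in_context(200_000, 2048))  # True
print(fits_in_context(262_144, 1))     # False
```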
@misc{verus4b2026,
title = {Verus-4B: A Coding-Focused Multimodal Language Model with 262K Context},
author = {8F-ai},
year = {2026},
howpublished = {\url{https://huggingface.co/8F-ai/Verus-4B}},
note = {Apache 2.0 License}
}
Verus-4B is released under the Apache License 2.0. See LICENSE for full terms.
Derived from Qwen/Qwen3.5-4B (Apache 2.0).