LLaMAX3-8B LoRA for Syllogistic Reasoning (Few-shot Training)
LoRA fine-tuned adapter for LLaMAX3-8B trained with a few-shot prompt format for syllogistic validity classification. Designed for SemEval 2026 Task 11 (Logical Reasoning with Content Effect).
Model Description
This adapter is fine-tuned on LLaMAX3-8B using a consistent few-shot training prompt containing two in-context examples (one valid and one invalid) followed by the target syllogism. On the validation set it reaches 86.3% accuracy with a total content effect of 0.056 when the same few-shot pattern is used at inference time.
Model Details
- Base model: LLaMAX/LLaMAX3-8B
- Task: Binary classification of syllogistic validity (valid vs invalid)
- Training method: Few-shot training prompt (2 examples)
- Inference method: Few-shot prompting (same 2-shot template)
- Dataset: 648 English syllogisms (90/10 split for train/dev)
- Hardware: A100 80GB
- Training time: ~13 minutes
LoRA Configuration
{
  "r": 64,
  "lora_alpha": 128,
  "target_modules": [
    "q_proj",
    "k_proj",
    "v_proj",
    "o_proj",
    "gate_proj",
    "up_proj",
    "down_proj"
  ],
  "lora_dropout": 0.05,
  "bias": "none",
  "task_type": "CAUSAL_LM"
}
Trainable parameters: 167M of 8.1B total (~2.06%)
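The trainable-parameter figure can be sanity-checked by hand. The sketch below assumes the standard Llama-3-8B layer shapes (32 layers, hidden size 4096, MLP size 14336, key/value projection width 1024), none of which are stated in this card. Each LoRA-adapted linear layer adds `r * (in_features + out_features)` weights.

```python
# Assumed Llama-3-8B shapes (not stated in this card).
NUM_LAYERS = 32
HIDDEN = 4096
KV_DIM = 1024      # 8 key/value heads x 128 dims, assumed
MLP_DIM = 14336

R = 64
LORA_ALPHA = 128

# (in_features, out_features) for each targeted module.
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP_DIM),
    "up_proj": (HIDDEN, MLP_DIM),
    "down_proj": (MLP_DIM, HIDDEN),
}


def lora_params(r: int, shapes: dict, num_layers: int) -> int:
    """Count LoRA weights: matrix A is (r x in), matrix B is (out x r)."""
    per_layer = sum(r * (n_in + n_out) for n_in, n_out in shapes.values())
    return per_layer * num_layers


trainable = lora_params(R, MODULE_SHAPES, NUM_LAYERS)
scaling = LORA_ALPHA / R  # standard LoRA scaling factor applied to the update

print(f"trainable LoRA parameters: {trainable:,}")
print(f"LoRA scaling (alpha / r): {scaling}")
```

Under these assumed shapes the count comes out at roughly 167.8M, in line with the 167M reported above.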
Performance (Validation, 240 items)
| Metric | Value |
|---|---|
| Accuracy | 86.3% |
| Content Effect (total) | 0.056 |
| Ranking Score | 15.45 |
| Intra-Plausibility CE | 0.0757 |
| Cross-Plausibility CE | 0.0360 |
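The reported numbers appear consistent with two simple relations. Both are inferred here from the table values, not taken from the official task definition: the total content effect as the mean of the intra- and cross-plausibility components, and the ranking score as accuracy (as a fraction) divided by the total content effect. A minimal check under that assumption:

```python
# Values copied from the table above.
accuracy = 0.863
intra_ce = 0.0757
cross_ce = 0.0360

# Assumed relations (inferred from the reported numbers, not from the task spec).
total_ce = (intra_ce + cross_ce) / 2
ranking_score = accuracy / total_ce

print(f"total content effect: {total_ce:.3f}")
print(f"ranking score: {ranking_score:.2f}")
```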
Comparison
| Model | Accuracy | Content Effect | Ranking Score |
|---|---|---|---|
| Zero-shot trained + few-shot inference | 87.5% | 0.058 | 15.03 |
| Few-shot trained + few-shot inference (this) | 86.3% | 0.056 | 15.45 |
Installation
pip install transformers peft torch
Usage
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model and tokenizer
base_id = "LLaMAX/LLaMAX3-8B"
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "maytemuma/llamax3-8b-lora-fewshot"
)
model.eval()
Inference (few-shot prompt)
def classify_syllogism(syllogism: str, model, tokenizer) -> bool:
    """
    Returns True for 'valid', False for 'invalid'.
    """
    prompt = f"""Task: Determine if logical arguments are valid or invalid.

Example 1:
Syllogism: All dogs are animals. All animals are living things. Therefore, all dogs are living things.
Answer: valid

Example 2:
Syllogism: No cats are dogs. Some animals are dogs. Therefore, some animals are cats.
Answer: invalid

Example 3:
Syllogism: {syllogism}
Answer:"""

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=5,
            temperature=0.7,
            do_sample=True,
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id
        )
    # Decode only the newly generated tokens, skipping the prompt tokens.
    response = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=True
    ).strip().lower()

    if response.startswith("valid") and "invalid" not in response[:10]:
        return True
    if "invalid" in response[:15]:
        return False
    # Any other output is treated as 'invalid'.
    return False
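The label-parsing step at the end of `classify_syllogism` can be checked without loading the model. The sketch below factors that step into a standalone helper; the name `parse_label` is hypothetical and not part of the original code, and the sample strings are made up for illustration.

```python
def parse_label(response: str) -> bool:
    """Map raw generated text to True ('valid') or False (anything else).

    Mirrors the parsing rules used in classify_syllogism above.
    """
    text = response.strip().lower()
    if text.startswith("valid") and "invalid" not in text[:10]:
        return True
    if "invalid" in text[:15]:
        return False
    # Unrecognized output falls back to 'invalid'.
    return False


# Illustrative (made-up) model outputs.
for sample in ["valid", " Valid\n\nExample 4:", "invalid", "", "maybe"]:
    print(repr(sample), "->", parse_label(sample))
```

Because unrecognized output falls back to `False`, truncated or off-format generations are counted as "invalid" predictions, which is worth keeping in mind when reading the accuracy figures.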
Training Details
Few-shot Training Prompt
Task: Determine if logical arguments are valid or invalid.

Example 1:
Syllogism: All dogs are animals. All animals are living things. Therefore, all dogs are living things.
Answer: valid

Example 2:
Syllogism: No cats are dogs. Some animals are dogs. Therefore, some animals are cats.
Answer: invalid

Example 3:
Syllogism: {syllogism}
Answer: {label}
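One way to keep the training and inference prompts identical is to build both from a single template. The sketch below is an illustration only: `FEW_SHOT_HEADER` and `build_prompt` are hypothetical names, and the exact line breaks are an assumption, since the original whitespace of the prompt is not recoverable from this card.

```python
# Fixed two-example header; the line breaks are assumed, not confirmed.
FEW_SHOT_HEADER = (
    "Task: Determine if logical arguments are valid or invalid.\n\n"
    "Example 1:\n"
    "Syllogism: All dogs are animals. All animals are living things. "
    "Therefore, all dogs are living things.\n"
    "Answer: valid\n\n"
    "Example 2:\n"
    "Syllogism: No cats are dogs. Some animals are dogs. "
    "Therefore, some animals are cats.\n"
    "Answer: invalid\n\n"
)


def build_prompt(syllogism: str, label=None) -> str:
    """Return the few-shot prompt; append the label for training examples."""
    prompt = FEW_SHOT_HEADER + f"Example 3:\nSyllogism: {syllogism}\nAnswer:"
    if label is not None:
        prompt += f" {label}"
    return prompt


# Inference prompt (no label) and training prompt (with label).
target = "All birds are animals. All robins are birds. Therefore, all robins are animals."
print(build_prompt(target))
print(build_prompt(target, "valid")[-13:])
```

With `label=None` the prompt ends at `Answer:` so the model generates the label; with a label it reproduces the training format.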
Hyperparameters
{
  "learning_rate": 2e-4,
  "num_epochs": 10,
  "batch_size": 4,
  "gradient_accumulation_steps": 4,
  "warmup_ratio": 0.1,
  "weight_decay": 0.01,
  "max_seq_length": 768,
  "optimizer": "adamw_torch",
  "lr_scheduler": "cosine",
  "bf16": true,
  "early_stopping_patience": 3
}
Note: max_seq_length was set to 768 to accommodate longer few-shot prompts safely.
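As a rough guide to what these settings imply, the sketch below works out the effective batch size and an upper bound on optimizer steps. It assumes a single GPU, a training split of 90% of the 648 syllogisms rounded down, and ceiling rounding for steps per epoch; the exact counts depend on how the trainer rounds, and early stopping can end training sooner.

```python
import math

# Values from the card.
dataset_size = 648
train_fraction = 0.9
batch_size = 4
gradient_accumulation_steps = 4
num_epochs = 10
warmup_ratio = 0.1

# Assumptions: single device, split rounded down, last partial batch kept.
train_examples = int(dataset_size * train_fraction)
effective_batch = batch_size * gradient_accumulation_steps
steps_per_epoch = math.ceil(train_examples / effective_batch)
max_steps = steps_per_epoch * num_epochs
warmup_steps = round(max_steps * warmup_ratio)

print(f"train examples: {train_examples}")
print(f"effective batch size: {effective_batch}")
print(f"optimizer steps per epoch: {steps_per_epoch}")
print(f"max optimizer steps: {max_steps}")
print(f"warmup steps: {warmup_steps}")
```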
Limitations
- Requires the few-shot prompt format at inference for best results.
- English-only training data; performance may drop on other languages.
- Binary validity classification; does not provide explanations.
Intended Use
- Research on logical reasoning and content effects.
- SemEval 2026 Task 11 benchmarks and submissions.
- Educational demonstrations of LoRA for reasoning tasks.
Citation
@misc{llamax3-syllogistic-fewshot-2025,
  author = {Maria Teresa Muñoz Martin},
  title = {LLaMAX3-8B LoRA for Syllogistic Reasoning (Few-shot Training)},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/maytemuma/llamax3-8b-lora-fewshot}},
  note = {SemEval 2026 Task 11: Logical Reasoning with Content Effect}
}
License
Apache-2.0. The license terms of the base model also apply when using this adapter.