LLaMAX3-8B LoRA for Syllogistic Reasoning (Few-shot Training)

A LoRA adapter for LLaMAX3-8B, fine-tuned with a few-shot prompt format for syllogistic validity classification. It was designed for SemEval 2026 Task 11 (Logical Reasoning with Content Effect).

Model Description

This adapter is fine-tuned on LLaMAX3-8B using a fixed few-shot training prompt that contains two in-context examples (one valid, one invalid) followed by the target syllogism. When the same few-shot pattern is used at inference time, it reaches 86.3% accuracy with a total content effect of 0.056 on the validation set.

Model Details

  • Base model: LLaMAX/LLaMAX3-8B
  • Task: Binary classification of syllogistic validity (valid vs invalid)
  • Training method: Few-shot training prompt (2 examples)
  • Inference method: Few-shot prompting (same 2-shot template)
  • Dataset: 648 English syllogisms (90/10 split for train/dev)
  • Hardware: A100 80GB
  • Training time: ~13 minutes

LoRA Configuration

{
  "r": 64,
  "lora_alpha": 128,
  "target_modules": [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj"
  ],
  "lora_dropout": 0.05,
  "bias": "none",
  "task_type": "CAUSAL_LM"
}

Trainable parameters: 167M of 8.1B total (~2.06%)
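The trainable-parameter figure can be sanity-checked from the LoRA configuration. The sketch below assumes the standard Llama-3-8B layer shapes (hidden size 4096, MLP size 14336, 32 decoder layers, 8 key/value heads of dimension 128); these dimensions are not stated in this card. Each adapted linear layer with shape (d_in, d_out) contributes r * (d_in + d_out) parameters.

```python
# Assumed Llama-3-8B shapes (not stated in this card).
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 8 * 128  # 8 key/value heads x head dim 128 (grouped-query attention)
NUM_LAYERS = 32

# Values taken from the LoRA configuration above.
R = 64
LORA_ALPHA = 128

# (in_features, out_features) of each target module in one decoder layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(r, shapes, num_layers):
    """LoRA adds A (r x d_in) and B (d_out x r) per adapted linear layer."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


trainable = lora_param_count(R, TARGET_SHAPES, NUM_LAYERS)
scaling = LORA_ALPHA / R  # factor applied to the low-rank update

print(f"trainable LoRA parameters: {trainable:,}")
print(f"LoRA scaling (alpha / r): {scaling}")
```

Under these assumptions the count comes to roughly 167.8M, in line with the 167M reported above.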

Performance (Validation, 240 items)

| Metric | Value |
|---|---|
| Accuracy | 86.3% |
| Content Effect (total) | 0.056 |
| Ranking Score | 15.45 |
| Intra-Plausibility CE | 0.0757 |
| Cross-Plausibility CE | 0.0360 |
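The card does not define how the total content effect and the ranking score are derived. The reported values are numerically consistent with the total being the unweighted mean of the two CE components and the ranking score being accuracy divided by the total CE. The check below is only an inference from the table, not the official task definition.

```python
# Values copied from the validation table above.
accuracy = 0.863
intra_ce = 0.0757
cross_ce = 0.0360
reported_total_ce = 0.056
reported_ranking = 15.45

# Assumption: total CE is the unweighted mean of the two components.
total_ce = (intra_ce + cross_ce) / 2

# Assumption: ranking score is accuracy divided by total CE.
ranking = accuracy / total_ce

print(f"mean CE: {total_ce:.5f} (reported: {reported_total_ce})")
print(f"accuracy / CE: {ranking:.3f} (reported: {reported_ranking})")
```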

Comparison

| Model | Accuracy | Content Effect | Ranking Score |
|---|---|---|---|
| Zero-shot trained + few-shot inference | 87.5% | 0.058 | 15.03 |
| Few-shot trained + few-shot inference (this) | 86.3% | 0.056 | 15.45 |

Installation

pip install transformers peft torch accelerate

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model and tokenizer
base_id = "LLaMAX/LLaMAX3-8B"
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "maytemuma/llamax3-8b-lora-fewshot",
)
model.eval()

Inference (few-shot prompt)

def classify_syllogism(syllogism: str, model, tokenizer) -> bool:
    """Return True for 'valid', False for 'invalid'."""
    prompt = f"""Task: Determine if logical arguments are valid or invalid.

Example 1: Syllogism: All dogs are animals. All animals are living things. Therefore, all dogs are living things. Answer: valid

Example 2: Syllogism: No cats are dogs. Some animals are dogs. Therefore, some animals are cats. Answer: invalid

Example 3: Syllogism: {syllogism} Answer:"""

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=5,
            temperature=0.7,
            do_sample=True,
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id,
        )

    # Decode only the newly generated tokens (everything after the prompt).
    prompt_len = inputs["input_ids"].shape[1]
    response = tokenizer.decode(
        outputs[0][prompt_len:], skip_special_tokens=True
    ).strip().lower()

    # Any output that does not start with "valid" (including "invalid")
    # is treated as invalid.
    return response.startswith("valid")
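The function above treats any output that does not start with "valid" as invalid, so an empty or off-format generation cannot be told apart from a genuine "invalid" prediction. A minimal sketch of a stricter parser follows. `parse_label` is a hypothetical helper, not part of the released code.

```python
from typing import Optional


def parse_label(response: str) -> Optional[bool]:
    """Map raw generated text to True (valid), False (invalid), or None."""
    text = response.strip().lower()
    # "invalid" does not start with "valid", so the two checks cannot collide.
    if text.startswith("invalid"):
        return False
    if text.startswith("valid"):
        return True
    return None  # unparseable: the caller decides how to count it
```

Returning `None` lets an evaluation script report off-format generations separately instead of silently scoring them as "invalid".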

Training Details

Few-shot Training Prompt

Task: Determine if logical arguments are valid or invalid.

Example 1: Syllogism: All dogs are animals. All animals are living things. Therefore, all dogs are living things. Answer: valid

Example 2: Syllogism: No cats are dogs. Some animals are dogs. Therefore, some animals are cats. Answer: invalid

Example 3: Syllogism: {syllogism} Answer: {label}
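Because training and inference share this template, building both from one function helps keep them in sync. The sketch below is an illustration: `build_prompt` and `FEW_SHOT_HEADER` are hypothetical names, and the exact whitespace between blocks is an assumption based on how the card displays the template.

```python
# Few-shot header copied from the training prompt above. The whitespace
# (one blank line between blocks) is an assumption, not confirmed by the card.
FEW_SHOT_HEADER = (
    "Task: Determine if logical arguments are valid or invalid.\n\n"
    "Example 1: Syllogism: All dogs are animals. All animals are living "
    "things. Therefore, all dogs are living things. Answer: valid\n\n"
    "Example 2: Syllogism: No cats are dogs. Some animals are dogs. "
    "Therefore, some animals are cats. Answer: invalid\n\n"
)

VALID_LABELS = ("valid", "invalid")


def build_prompt(syllogism, label=None):
    """Build the 2-shot prompt; append the gold label for training rows."""
    prompt = FEW_SHOT_HEADER + f"Example 3: Syllogism: {syllogism} Answer:"
    if label is None:
        return prompt  # inference: the model completes the answer
    if label not in VALID_LABELS:
        raise ValueError(f"label must be one of {VALID_LABELS}, got {label!r}")
    return f"{prompt} {label}"
```

Calling `build_prompt(text)` yields the inference prompt, and `build_prompt(text, "valid")` yields the same string with the label appended, as in the training format.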

Hyperparameters

{
  "learning_rate": 2e-4,
  "num_epochs": 10,
  "batch_size": 4,
  "gradient_accumulation_steps": 4,
  "warmup_ratio": 0.1,
  "weight_decay": 0.01,
  "max_seq_length": 768,
  "optimizer": "adamw_torch",
  "lr_scheduler": "cosine",
  "bf16": true,
  "early_stopping_patience": 3
}

Note: max_seq_length was set to 768 to accommodate longer few-shot prompts safely.
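The hyperparameters above imply an effective batch size and an approximate step budget. The arithmetic below assumes a single GPU and a training set of 583 items (90% of 648, rounded down); the card does not state the exact split size, and the real step count depends on the trainer implementation and on early stopping.

```python
import math

# From the hyperparameters above.
batch_size = 4
grad_accum = 4
num_epochs = 10
warmup_ratio = 0.1

# Assumption: 90% of the 648 syllogisms, rounded down, on a single GPU.
n_train = int(648 * 0.9)

effective_batch = batch_size * grad_accum            # examples per optimizer step
steps_per_epoch = math.ceil(n_train / effective_batch)
max_steps = steps_per_epoch * num_epochs             # upper bound before early stopping
warmup_steps = math.ceil(max_steps * warmup_ratio)   # warmup portion of the cosine schedule

print(f"effective batch size: {effective_batch}")
print(f"optimizer steps per epoch (approx.): {steps_per_epoch}")
print(f"maximum optimizer steps: {max_steps}")
print(f"warmup steps: {warmup_steps}")
```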

Limitations

  • Requires the few-shot prompt format at inference for best results.
  • English-only training data; performance may drop on other languages.
  • Binary validity classification; does not provide explanations.

Intended Use

  • Research on logical reasoning and content effects.
  • SemEval 2026 Task 11 benchmarks and submissions.
  • Educational demonstrations of LoRA for reasoning tasks.

Citation

@misc{llamax3-syllogistic-fewshot-2025,
  author = {Maria Teresa Muñoz Martin},
  title = {LLaMAX3-8B LoRA for Syllogistic Reasoning (Few-shot Training)},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/maytemuma/llamax3-8b-lora-fewshot}},
  note = {SemEval 2026 Task 11: Logical Reasoning with Content Effect}
}

License

Apache-2.0. Use of this adapter also remains subject to the license terms of its base model.
