DeAR-Reranking Collection
DeAR (Deep Agent Rank): Dual-Stage Document Reranking with Reasoning Agents. Accepted at EMNLP Findings 2025.
DeAR-8B-Reranker-CE-LoRA-v1 is a LoRA (Low-Rank Adaptation) adapter for neural reranking, trained with a binary cross-entropy (CE) loss. This lightweight adapter requires only ~100MB of storage and can be applied to LLaMA-3.1-8B to achieve near full-model performance with minimal overhead.
- **Ultra Lightweight:** only ~100MB of storage
- **Efficient:** 3x faster training than full fine-tuning
- **High Performance:** 98% of full-model accuracy
- **Easy Integration:** simple adapter loading via PEFT
- **Classification-based:** binary relevance prediction
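Quick start: load the adapter, merge it into LLaMA-3.1-8B, and score a single query-document pair.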
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel, PeftConfig

# Resolve the base model from the adapter config
adapter_path = "abdoelsayed/dear-8b-reranker-ce-lora-v1"
config = PeftConfig.from_pretrained(adapter_path)

# Load tokenizer; LLaMA has no pad token, so reuse EOS
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load base model with a single-logit classification head
base_model = AutoModelForSequenceClassification.from_pretrained(
    config.base_model_name_or_path,
    num_labels=1,
    torch_dtype=torch.bfloat16,
)
# The classification head needs the pad token id to locate the
# last real token in each padded sequence
base_model.config.pad_token_id = tokenizer.pad_token_id

# Load and merge the LoRA weights into the base model
model = PeftModel.from_pretrained(base_model, adapter_path)
model = model.merge_and_unload()
model.eval().cuda()

# Score a query-document pair
query = "What is machine learning?"
document = "Machine learning is a subset of artificial intelligence..."
inputs = tokenizer(
    f"query: {query}",
    f"document: {document}",
    return_tensors="pt",
    truncation=True,
    max_length=228,
    padding="max_length",
)
inputs = {k: v.cuda() for k, v in inputs.items()}
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
print(f"Relevance score: {score}")
```
```python
@torch.inference_mode()
def rerank(tokenizer, model, query: str, documents, batch_size=64):
    scores = []
    device = next(model.parameters()).device
    for i in range(0, len(documents), batch_size):
        batch = documents[i:i + batch_size]
        queries = [f"query: {query}"] * len(batch)
        docs = [f"document: {title} {text}" for title, text in batch]
        inputs = tokenizer(queries, docs, return_tensors="pt",
                           truncation=True, max_length=228, padding=True)
        inputs = {k: v.to(device) for k, v in inputs.items()}
        logits = model(**inputs).logits.squeeze(-1)
        scores.extend(logits.cpu().tolist())
    return sorted(enumerate(scores), key=lambda x: x[1], reverse=True)
```
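A minimal usage sketch (the candidate titles and passages below are made-up placeholders):

```python
docs = [
    ("Machine Learning", "Machine learning is a subset of artificial intelligence..."),
    ("Sourdough", "Bread is leavened with a natural starter culture..."),
]
# Print candidates from most to least relevant
for idx, score in rerank(tokenizer, model, "What is machine learning?", docs):
    print(f"{score:+.3f}  {docs[idx][0]}")
```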
The adapter was trained with the following LoRA configuration:

```json
{
  "r": 16,
  "lora_alpha": 32,
  "target_modules": ["q_proj", "v_proj", "k_proj", "o_proj",
                     "gate_proj", "up_proj", "down_proj"],
  "lora_dropout": 0.05,
  "bias": "none",
  "task_type": "SEQ_CLS"
}
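```

For reference, the same settings expressed as a `peft` `LoraConfig` (a sketch; only the values shown above come from the released adapter):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                  # low-rank dimension
    lora_alpha=32,         # scaling factor (alpha / r = 2.0)
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="SEQ_CLS",   # sequence classification head
)
```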
| Feature | LoRA Adapter | Full Model |
|---|---|---|
| Storage | ~100 MB | ~16 GB |
| Training time | 12 h | 34 h |
| Performance | 98% | 100% |
| Memory | 28 GB | 38 GB |
If you use this model, please cite:

```bibtex
@article{abdallah2025dear,
  title={DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation},
  author={Abdallah, Abdelrahman and Mozafari, Jamshid and Piryani, Bhawna and Jatowt, Adam},
  journal={arXiv preprint arXiv:2508.16998},
  year={2025}
}
```
License: MIT
Base model: meta-llama/Llama-3.1-8B