Qwen3-14B-PubMedQA-LoRA-Adapters

Model Overview

This repository contains LoRA adapters for fine-tuning the Qwen3-14B model specifically for biomedical question answering. The model has been trained on the PubMedQA dataset to provide accurate yes/no/maybe answers followed by detailed explanations for biomedical questions.

  • Developed by: huseyincavus
  • Base Model: unsloth/qwen3-14b-unsloth-bnb-4bit
  • License: Apache 2.0
  • Model Type: Parameter-Efficient Fine-Tuning (PEFT) using LoRA adapters
  • Language: English
  • Domain: Biomedical Question Answering
  • Project Repository: GitHub

Fine-Tuning Details

This model was fine-tuned with the Unsloth library, which reports roughly 2x faster training with reduced memory usage. The fine-tuning process was set up to run on a single free-tier Google Colab T4 GPU.

Training Configuration

  • Dataset: PubMedQA (pqa_artificial split)
  • Technique: Low-Rank Adaptation (LoRA); a setup sketch follows this list
  • Quantization: 4-bit quantization for memory efficiency
  • Training Steps: 300 (a short demonstration run)
  • Training Time: Approximately 2 hours on T4 GPU
  • Optimization: Unsloth acceleration framework
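
For reference, attaching LoRA adapters with Unsloth looks roughly like the sketch below. All hyperparameter values are illustrative assumptions; the exact settings are in the training notebook.

from unsloth import FastLanguageModel

# Attach trainable LoRA adapters to the frozen 4-bit base model.
# Rank, alpha, and target modules below are illustrative, not the
# exact values used for this checkpoint.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                     # LoRA rank
    lora_alpha=16,            # LoRA scaling factor
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",  # extra memory savings
)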

Key Features

  • Specialized Domain: Biomedical question answering
  • Output Format: Direct yes/no/maybe answer with detailed explanation
  • Memory Efficient: 4-bit quantization enables training on limited hardware
  • Fast Training: 2x speed improvement with Unsloth optimization
  • Lightweight: LoRA adapters require minimal storage and can be easily shared

Model Capabilities

The model is designed to:

  • Answer biomedical questions with yes/no/maybe responses (see the parsing sketch after this list)
  • Provide detailed explanations based on scientific context
  • Process complex biomedical literature and research questions
  • Maintain accuracy while being resource-efficient
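
Because every response leads with a yes/no/maybe label, downstream code can separate the label from the explanation. Below is a minimal sketch; parse_answer is a hypothetical helper, not part of the model or any library.

import re

def parse_answer(text):
    """Split a response into its yes/no/maybe label and explanation.

    Hypothetical helper: assumes the response begins with the label,
    as the system prompt instructs.
    """
    match = re.match(r"\s*(yes|no|maybe)\b[.,:]*\s*(.*)",
                     text, re.IGNORECASE | re.DOTALL)
    if match is None:
        return None, text.strip()
    return match.group(1).lower(), match.group(2).strip()

label, explanation = parse_answer("Maybe. Observational studies suggest an association.")
# label == "maybe"; explanation == "Observational studies suggest an association."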

Usage

Loading the Model

from unsloth import FastLanguageModel
from peft import PeftModel
import torch

# Load base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen3-14b-unsloth-bnb-4bit",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

# Load LoRA adapters
model = PeftModel.from_pretrained(model, "huseyincavus/Qwen3-14B-PubMedQA-lora-adapters")
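
Unsloth also exposes a fast-inference mode; enabling it after the adapters are attached is optional but typically speeds up generation. Whether this step is needed depends on your Unsloth version, so treat it as a suggestion rather than a required call.

# Switch Unsloth to its optimized generation path before inference.
FastLanguageModel.for_inference(model)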

Inference Example

from transformers import TextStreamer

# Define the conversation using the chat template structure
system_prompt = "You are a helpful biomedical assistant. Your task is to answer the given question based on the provided context. First, provide a simple 'yes', 'no', or 'maybe' answer, followed by a detailed explanation."

user_question = "Is there a definitive link between coffee consumption and a reduced risk of Parkinson's disease?"

user_context = "Several epidemiological studies have suggested an inverse association between coffee consumption and the risk of Parkinson's disease (PD). A large meta-analysis of 26 studies found that the risk of PD was, on average, 30% lower in coffee drinkers compared to non-drinkers. The association appears to be dose-dependent. However, the mechanism is not fully understood, though caffeine's role as an adenosine A2A receptor antagonist is a leading hypothesis. It's important to note that these are observational studies, which show correlation but cannot prove causation."

# Combine the question and context into the user's message
user_prompt = f"Question: {user_question}\nContext: {user_context}"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]

# Apply the chat template - add_generation_prompt=True is crucial
prompt = tokenizer.apply_chat_template(
    messages, 
    tokenize=False, 
    add_generation_prompt=True,
    enable_thinking=False  # Prevents thinking blocks for direct answers
)

# Tokenize and move to GPU
model_inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Set up streaming for live output
streamer = TextStreamer(tokenizer, skip_prompt=True)

# Generate response with streaming
print("\n" + "="*50)
print(" BIOMEDICAL QA MODEL RESPONSE")
print("="*50 + "\n")

outputs = model.generate(
    **model_inputs,
    streamer=streamer,
    max_new_tokens=256,
    temperature=0.6,
    top_p=0.9,
    do_sample=True
)
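
If you want the completion as a plain string rather than streamed output, decode only the newly generated tokens:

# model.generate returns the prompt plus the completion, so slice off
# the prompt tokens before decoding.
prompt_len = model_inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print(response)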

Training Process

The complete training pipeline is available in the GitHub repository as the Jupyter notebook Qwen3_(14B)_PubMed_QA.ipynb, which includes:

  1. Environment Setup: Installation of required libraries (unsloth, transformers, peft, trl)
  2. Model Loading: Loading Qwen3-14B with 4-bit quantization
  3. Data Preprocessing: Formatting the PubMedQA dataset for chat-based training (sketched after this list)
  4. Fine-tuning: Using SFTTrainer for parameter-efficient training
  5. Evaluation: Inference testing on biomedical questions
  6. Model Saving: Pushing LoRA adapters to Hugging Face Hub
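
As a sketch of the preprocessing step, PubMedQA records can be rendered into chat-formatted training strings. The field names below (question, context.contexts, final_decision, long_answer) are taken from the qiaojin/PubMedQA dataset card, and the prompt wording is an assumption; the notebook has the exact version. The snippet reuses the tokenizer loaded earlier.

from datasets import load_dataset

dataset = load_dataset("qiaojin/PubMedQA", "pqa_artificial", split="train")

system_prompt = (
    "You are a helpful biomedical assistant. Your task is to answer the "
    "given question based on the provided context. First, provide a simple "
    "'yes', 'no', or 'maybe' answer, followed by a detailed explanation."
)

def format_example(example):
    # Join the abstract passages into a single context string.
    context = " ".join(example["context"]["contexts"])
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user",
         "content": f"Question: {example['question']}\nContext: {context}"},
        {"role": "assistant",
         "content": f"{example['final_decision']}. {example['long_answer']}"},
    ]
    # Render the conversation into one training string with the
    # model's chat template.
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(format_example)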

Performance

In qualitative testing on biomedical question answering, the model provides:

  • Accurate yes/no/maybe classifications
  • Detailed, context-aware explanations
  • Consistent response format suitable for downstream applications

Limitations

  • Trained specifically on biomedical domain; may not generalize to other domains
  • Limited to English language
  • Requires base Qwen3-14B model for full functionality
  • Performance may vary on questions outside the PubMedQA distribution

Getting Started

To reproduce the training or use the model:

  1. Open the notebook in Google Colab
  2. Enable T4 GPU: Runtime > Change runtime type > T4 GPU
  3. Run all cells to execute the complete pipeline
  4. Optional: Set up HF_TOKEN in Colab Secrets to push your own adapters to the Hub (see the snippet below)
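
For step 4, reading the token from Colab Secrets and authenticating looks like this (assumes a secret named HF_TOKEN exists in the Colab Secrets panel):

from google.colab import userdata
from huggingface_hub import login

# Read the HF_TOKEN secret from Colab's Secrets panel and authenticate
# the session so push_to_hub can upload your adapters.
login(token=userdata.get("HF_TOKEN"))

# Afterwards, push the trained adapters with something like:
# model.push_to_hub("your-username/your-adapter-repo")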

About This Project

This is a personal project exploring biomedical AI applications. Feel free to use, modify, or build upon this work! If you find it helpful, a mention or star would be appreciated but not required.

Acknowledgments

  • Unsloth Team for the acceleration framework
  • Qwen Team for the base model
  • PubMedQA Dataset creators for the training data
  • Hugging Face for the model hosting and tools
