Custom LLM with SFT + LoRA + RAG

Model Description

This model is a Qwen2.5-7B large language model fine-tuned with LoRA, a parameter-efficient fine-tuning (PEFT) method, on a custom supervised fine-tuning (SFT) dataset. It is designed to provide enhanced responses within a specific context defined by the user.

Training Procedure

  1. Synthetic SFT pairs generated with ChatGPT.
  2. Expansion of the SFT dataset to cover broader contexts.
  3. LoRA adapters trained on Qwen2.5-7B for efficient fine-tuning.
  4. RAG integration with a FAISS vector database for document retrieval.
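Step 3 rests on the core LoRA idea: the pretrained weight matrix W stays frozen, and only two small matrices A and B of rank r are trained, giving an effective weight W + (alpha/r)·B·A. The NumPy sketch below illustrates that idea and the resulting parameter savings; the shapes and values are illustrative, not taken from the actual training setup.

```python
import numpy as np

d, r = 1024, 8  # hidden size and LoRA rank (illustrative values)
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                 # trainable up-projection (zero-init)
alpha = 16                           # LoRA scaling factor

# Effective weight at inference: frozen W plus the scaled low-rank update.
# With B zero-initialized, training starts exactly from the base model.
W_eff = W + (alpha / r) * (B @ A)

# Only A and B are trained: 2*r*d parameters instead of d*d.
full_params = W.size
lora_params = A.size + B.size
print(full_params, lora_params)  # 1048576 vs 16384, 64x fewer trainable params
```

In practice this is what a library such as Hugging Face PEFT does when it attaches LoRA adapters to the base model's attention projections.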

Intended Use

  • Conversational AI in specific domains
  • Enhanced question-answering using RAG
  • Applications requiring lightweight fine-tuning without full model training

Limitations

  • Requires GPU for training
  • RAG performance depends on quality and coverage of the document corpus
  • Behavior outside the trained context may be unpredictable

Example Usage

Please follow the complete instructions on GitHub: repo

from backend.main import HealthRAG

# Build the RAG pipeline (LoRA-adapted Qwen2.5-7B plus the FAISS index).
llm = HealthRAG()

# Retrieve relevant documents and generate a context-grounded answer.
response = llm.ask_enhanced_llm("Explain preventive healthcare tips")
print(response)
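The retrieval half of `ask_enhanced_llm` can be sketched without FAISS: embed each document, embed the query, and take the top-k neighbors by inner product, which is what a FAISS IndexFlatIP computes. The hash-based embedding below is a toy stand-in for a real sentence encoder, and the corpus and function names are invented for illustration; only the retrieval logic mirrors the RAG step.

```python
import zlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Toy deterministic embedding; a real pipeline would use a sentence encoder.
    rng = np.random.default_rng(zlib.crc32(text.encode()))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

docs = [
    "Regular exercise lowers cardiovascular risk.",
    "Annual checkups help detect conditions early.",
    "Balanced diets support immune function.",
]
index = np.stack([embed(d) for d in docs])   # stand-in for a FAISS IndexFlatIP

def retrieve(query: str, k: int = 2) -> list[str]:
    scores = index @ embed(query)            # inner-product similarity
    top = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in top]

# The retrieved passages would then be prepended to the prompt
# before the LoRA-adapted model generates its answer.
context = retrieve("preventive healthcare tips")
print(context)
```

This also makes the limitation above concrete: the answer can only be as good as the documents the index returns.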

How to Cite

If you use this model in your research or projects, please cite it as:

Pacheco, Gabriel. Custom LLM with SFT + LoRA + RAG. 2025.