Custom LLM with SFT + LoRA + RAG

Model Description

This model is a Qwen2.5-7B large language model fine-tuned with LoRA, a parameter-efficient fine-tuning (PEFT) method, on a custom supervised fine-tuning (SFT) dataset. It is designed to provide enhanced responses within a specific context defined by the user.

Training Procedure

  1. Synthetic SFT pairs generated with ChatGPT.
  2. Expansion of the SFT dataset to cover broader contexts.
  3. LoRA adapters trained on Qwen2.5-7B for efficient fine-tuning.
  4. RAG integration with a FAISS vector database for document retrieval.
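Step 3 rests on the core LoRA idea: the pretrained weight matrix W stays frozen, and only two small matrices A and B of rank r are trained, giving an effective weight W + (alpha/r)·B·A. The NumPy sketch below illustrates that idea and the resulting parameter savings; the shapes and values are illustrative, not taken from the actual training setup.

```python
import numpy as np

d, r = 1024, 8  # hidden size and LoRA rank (illustrative values)
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                 # trainable up-projection (zero-init)
alpha = 16                           # LoRA scaling factor

# Effective weight at inference: frozen W plus the scaled low-rank update.
# With B zero-initialized, training starts exactly from the base model.
W_eff = W + (alpha / r) * (B @ A)

# Only A and B are trained: 2*r*d parameters instead of d*d.
full_params = W.size
lora_params = A.size + B.size
print(full_params, lora_params)  # 1048576 vs 16384, 64x fewer trainable params
```

In practice this is what a library such as Hugging Face PEFT does when it attaches LoRA adapters to the base model's attention projections.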

Intended Use

  • Conversational AI in specific domains
  • Enhanced question-answering using RAG
  • Applications requiring lightweight fine-tuning without full model training

Limitations

  • Requires GPU for training
  • RAG performance depends on quality and coverage of the document corpus
  • Behavior outside the trained context may be unpredictable

Example Usage

Please follow the complete instructions on GitHub: repo

from backend.main import HealthRAG

# Build the RAG pipeline (LoRA-adapted Qwen2.5-7B plus the FAISS index).
llm = HealthRAG()

# Retrieve relevant documents and generate a context-grounded answer.
response = llm.ask_enhanced_llm("Explain preventive healthcare tips")
print(response)
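The retrieval half of `ask_enhanced_llm` can be sketched without FAISS: embed each document, embed the query, and take the top-k neighbors by inner product, which is what a FAISS IndexFlatIP computes. The hash-based embedding below is a toy stand-in for a real sentence encoder, and the corpus and function names are invented for illustration; only the retrieval logic mirrors the RAG step.

```python
import zlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Toy deterministic embedding; a real pipeline would use a sentence encoder.
    rng = np.random.default_rng(zlib.crc32(text.encode()))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

docs = [
    "Regular exercise lowers cardiovascular risk.",
    "Annual checkups help detect conditions early.",
    "Balanced diets support immune function.",
]
index = np.stack([embed(d) for d in docs])   # stand-in for a FAISS IndexFlatIP

def retrieve(query: str, k: int = 2) -> list[str]:
    scores = index @ embed(query)            # inner-product similarity
    top = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in top]

# The retrieved passages would then be prepended to the prompt
# before the LoRA-adapted model generates its answer.
context = retrieve("preventive healthcare tips")
print(context)
```

This also makes the limitation above concrete: the answer can only be as good as the documents the index returns.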

How to Cite

If you use this model in your research or projects, please cite it as:

Pacheco, Gabriel. Custom LLM with SFT + LoRA + RAG. 2025.