Gaokerena-R

This is Gaokerena-R, a model trained with a limited-data approach to enhance the Persian medical reasoning capabilities of the aya-expanse-8b model. Despite using far less data, Gaokerena-R outperforms our previous model, Gaokerena-V, which was trained on a much larger dataset, demonstrating the effectiveness of our reasoning-focused training strategy under data-constrained conditions.

Model Sources

Risks and Limitations

While Gaokerena aims to provide accurate information, it is not a substitute for professional medical advice. The model may have limitations in:

  • Handling medical emergencies.
  • Addressing highly specialized or rare medical conditions.
  • Offering region-specific guidance, as the training data does not include localized Persian medical practices.

How to Get Started with the Model

Since the model is built on top of aya-expanse-8b, you can use it the same way you would use the base Aya model; the only extra step is loading the Gaokerena-R adapter and merging it into the base weights.

Inference

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16

# Load the base Aya Expanse model and its tokenizer.
model = AutoModelForCausalLM.from_pretrained(
    "CohereForAI/aya-expanse-8b",
    torch_dtype=dtype,
    device_map=device,
)
tokenizer = AutoTokenizer.from_pretrained("CohereForAI/aya-expanse-8b")

# Apply the Gaokerena-R adapter and merge it into the base weights.
model = PeftModel.from_pretrained(model, "gaokerena/gaokerena-r1.0")
model = model.merge_and_unload()

# Run chat-style generation with greedy decoding.
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
pipe_output = pipe(
    [{"role": "user", "content": "چگونه استرس می‌تواند باعث ایجاد آفت دهان شود؟"}],  # "How can stress cause mouth ulcers?"
    max_new_tokens=1024,
    eos_token_id=[tokenizer.eos_token_id],
    do_sample=False,
)

# The pipeline returns the full chat history; take the assistant's reply.
output = pipe_output[0]["generated_text"][-1]["content"]
print(output)
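
If you prefer not to use the pipeline helper, you can also generate from the merged model directly via the tokenizer's chat template. The snippet below is a minimal sketch, assuming the `model` and `tokenizer` objects from the example above; the prompt string is the same illustrative question.

# Minimal sketch: direct generation without the pipeline helper,
# reusing the merged `model` and `tokenizer` from the example above.
messages = [{"role": "user", "content": "چگونه استرس می‌تواند باعث ایجاد آفت دهان شود؟"}]  # "How can stress cause mouth ulcers?"

# Format the chat turns with Aya's chat template and tokenize.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Greedy decoding, mirroring the pipeline settings above.
gen_ids = model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=False,
)

# Strip the prompt tokens and decode only the newly generated reply.
reply = tokenizer.decode(gen_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(reply)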


Training Details

Bibtex

If you found our model useful, feel free to cite us!

@misc{Gaokerena-R1.0,
  title={Enhancing Reasoning Skills in Small Persian Medical Language Models Can Outperform Large-Scale Data Training},
  author={Ghassabi, Mehrdad and Hakim, Sadra and Baradaran Kashani, Hamidreza and Rostami, Pedram},
  year={2025},
  eprint={2510.20059},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}