The model was trained on paired preferences from the MathDial and MRBench datasets.
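
As a rough illustration of what training on paired preferences usually involves, the sketch below computes a Bradley-Terry style pairwise loss for one (chosen, rejected) pair of reward scores. This card does not state the actual training objective, so the loss form, the function name `pairwise_preference_loss`, and the example scores are all assumptions made for illustration.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry style loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Assumption: this is a common reward-model objective, not necessarily
    the one used to train this model.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)) equals softplus(-margin); this form avoids overflow
    # for large positive or negative margins.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical scores: the preferred tutor response is scored higher.
loss_correct_order = pairwise_preference_loss(2.0, 0.0)
# Hypothetical scores: the preferred tutor response is scored lower.
loss_wrong_order = pairwise_preference_loss(0.0, 2.0)
```

The loss shrinks as the preferred response is scored further above the rejected one, and grows when the ordering is reversed, which is what pushes a reward model to rank preferred tutor responses higher.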

For more information, and to cite this work, see:

@article{macina2025mathtutorbench,
      title={MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors},
      author={Jakub Macina and Nico Daheim and Ido Hakimi and Manu Kapur and Iryna Gurevych and Mrinmaya Sachan},
      year={2025},
      eprint={2502.18940},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.18940},
}

Safetensors
Model size: 2B params
Tensor type: F32

Model tree for eth-nlped/Qwen2.5-1.5B-pedagogical-rewardmodel

Base model: Qwen/Qwen2.5-1.5B
Finetuned: Qwen/Qwen2.5-1.5B-Instruct
Finetuned: this model
