Anatoly Potapov committed
Commit · a7f8ed0
1 Parent(s): 0b3dabc
Cite Vikhrmodels
README.md CHANGED
@@ -73,7 +73,7 @@ This benchmark was carefully translated into Russian and measured with [LLM Judg
 
 ### 🏟️ [Arena](https://github.com/lm-sys/arena-hard-auto)
 
-We used Russian version of Arena benchmark and [Arena Hard Auto](https://github.com/lm-sys/arena-hard-auto) codebase
+We used Russian version of Arena benchmark from [Vikhrmodels](https://huggingface.co/datasets/Vikhrmodels/ru-arena-general) and [Arena Hard Auto](https://github.com/lm-sys/arena-hard-auto) codebase
 for evaluation. As baseline model we chose gpt3.5-turbo-0125 and the judge was gpt-4-1106-preview.
 
 <style>