This fine-tune would score 56 and place 1st on the leaderboard, but I didn't add it; I only include full trainings (or further tunings by the same company) in the leaderboard.
LLM builders in general are not doing a great job of making human-aligned models.
I don't want to say this is a proxy for p(doom)... But it could be if we are not careful.
The most probable cause is recklessly training LLMs on the outputs of other LLMs, not caring about dataset curation, and not asking 'what is beneficial for humans?'...
Our leaderboard could be used for human alignment in an RL setting: ask the same question to the top models and the worst models, give answers from top models a +1 reward, and give answers from bad models a -1. Asking many times with a higher temperature generates more answers per question. A rough sketch of the idea is below. What do you think?
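As an illustration only, here is a minimal Python sketch of building such a +1/-1 reward dataset. The model lists and the `ask_model` helper are hypothetical placeholders, not a real API; in practice `ask_model` would call whatever endpoint serves each model, and the resulting dataset would feed a reward model or RL fine-tuning step.

```python
import random

# Hypothetical helper -- stand-in for whatever API serves each model.
def ask_model(model: str, question: str, temperature: float = 1.0) -> str:
    """Placeholder: return one sampled answer from `model`."""
    return f"{model}'s answer to: {question}"

# Hypothetical model lists, taken from the leaderboard extremes.
TOP_MODELS = ["top-model-a", "top-model-b"]      # highest leaderboard scores
WORST_MODELS = ["bad-model-x", "bad-model-y"]    # lowest leaderboard scores

def build_reward_dataset(questions, samples_per_model=4, temperature=1.2):
    """Label answers from top models +1 and from worst models -1.

    Sampling several times at a higher temperature yields more
    diverse answers per question, as described above.
    """
    dataset = []
    for q in questions:
        for model in TOP_MODELS:
            for _ in range(samples_per_model):
                dataset.append({"question": q,
                                "answer": ask_model(model, q, temperature),
                                "reward": +1})
        for model in WORST_MODELS:
            for _ in range(samples_per_model):
                dataset.append({"question": q,
                                "answer": ask_model(model, q, temperature),
                                "reward": -1})
    random.shuffle(dataset)
    return dataset

if __name__ == "__main__":
    data = build_reward_dataset(["What is beneficial for humans?"])
    print(len(data), "labeled answers")
```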