Bad Performance

by JanPf - opened Apr 9

Apr 9

Hey,

thank you very much for your dedication! We tried to use the model in Q4_K_S, but the reply doesn't seem to have anything to do with the question. Is this a problem on our end? This issue does not appear when using our GGUF: https://huggingface.co/LSX-UniWue/LLaMmlein_7B_chat-gguf

Best,
Jan & the LLaMmlein-Gang ( @Julia287 , @arthur3131 )

mradermacher

Owner Apr 9 • edited Apr 9

Works for me, but not well (see below). If our ggufs are affected, then likely everybody's ggufs are affected, because we just use standard settings. Did you have to do anything special to create your gguf?

> Was sind Primzahlen?
Primzahlen ist der Überbegriff für Zahlen. die auch als Primzahlen bekannt sind.

Zunächst eine Definition für die Bedeutung von Primzahlen.


Dieser Begriff ist nicht sehr weit bekannt.

Für den Begriff einer Primzahl, die Sie kennen.

Diese Begriffe sind nicht sehr bekannt, aber das ist gut für sie.

Dieser Begriff ist nicht sehr bekannt.

Dieser Begriff ist nicht sehr bekannt.

Dieser Begriff ist nicht sehr bekannt.

Dieser Begriff ist nicht sehr bekannt.

Dieser Begriff ist nicht sehr bekannt.


Dieser Begriff ist nicht sehr bekannt.
Dieser Begriff ist nicht sehr bekannt.
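
For anyone wanting to compare how the two ggufs were produced: a quick look at the file header shows whether both are well-formed and whether they carry the same number of tensors and metadata entries. Below is a minimal sketch, not part of llama.cpp. The function name `read_gguf_header` is made up for illustration, and it assumes the GGUF v2+ header layout (4-byte magic `GGUF`, little-endian uint32 version, then uint64 tensor count and uint64 metadata key/value count).

```python
import struct


def read_gguf_header(path):
    """Read the fixed-size GGUF header (assumes GGUF v2 or later, little-endian)."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file, magic was {magic!r}")
        # uint32 format version
        (version,) = struct.unpack("<I", f.read(4))
        # uint64 tensor count, uint64 metadata key/value count
        tensor_count, kv_count = struct.unpack("<QQ", f.read(16))
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

If the two files differ in `metadata_kv_count`, that would hint at different conversion settings (for example a missing chat template), though the header alone cannot say which keys differ.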

Julia287

Apr 9

When prompting our GGUF file (https://huggingface.co/LSX-UniWue/LLaMmlein_7B_chat-gguf), which we built using the llama.cpp defaults (python3 convert_hf_to_gguf.py /Users/juliawunderle/our_model --outfile our_model.gguf --outtype f16), we get a much better response:

> Was sind Primzahlen?

Eine Primzahl ist eine Zahl, die nur durch sich selbst und 1 teilbar ist. Das bedeutet, dass sie keinen Teiler kleiner als sich selbst hat, der größer als 1 ist.

Die ersten 10 Primzahlen sind 2, 3, 5, 7, 11, 13, 17, 19, 23 und 29.

Es gibt unendlich viele Primzahlen.

Primzahlen sind sehr wichtig in der Mathematik und werden unter anderem zur Verschlüsselung von Informationen verwendet.

Best,
LLaMmlein-Gang 🐑

mradermacher

Owner Apr 9 • edited Apr 9

Hmm, your gguf seems just as bad, though:

> Was sind Primzahlen?











. Person

mradermacher

Owner Apr 9

I cannot reproduce your results. How do you invoke llama to get this response?

mradermacher

Owner Apr 9

In fact, I always seem to get the same response with your gguf, lots of newlines and ". Person".

mradermacher

Owner Apr 9

Well, not always:

> Was ist Berlin?














































 Einwilligung Eingreifens . Ehe damaligen damaligen Kindheit Direktwerbung Direktwerbung vertrauens Verfügung

Julia287

Apr 9

We run inference with LM Studio on a Mac, and our GGUFs work there. Which framework are you using?

 ➜ md5sum /Users/juliawunderle/.lmstudio/models/lmstudio-community/Selected16/selected16.gguf
39cfb471aabe64d7caf270bc865a5bfb  /Users/juliawunderle/.lmstudio/models/lmstudio-community/Selected16/selected16.gguf
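
To rule out that we are testing different files, the same checksum can be computed on any platform. This is a minimal Python sketch equivalent to the `md5sum` call above. The helper name `md5_of_file` is only illustrative, and the path in the usage line is a placeholder.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so multi-gigabyte gguf files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Usage would look like `print(md5_of_file("selected16.gguf"))`. If the digest matches the one posted above, both sides are loading the same bytes, and any difference in output would have to come from the runtime or its settings.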

mradermacher

Owner Apr 9 • edited Apr 9

No framework, I use llama.cpp, version 5102

When I use -p, I often, but not always, get somewhat sensible answers with either gguf. To be honest, this looks like a model problem: if the unquantized model already performs this badly, the quantized versions can be expected to have even more trouble.
