🚩 Report: Not working

#14

by frank0071 - opened Sep 28, 2024

Discussion

frank0071

Sep 28, 2024

wtf is this model, wasting my time on GGG?
Whatever you say, the response is GGG.

bartowski

Owner Sep 28, 2024

You're the first one in a long time to see this issue; usually it indicates an out-of-date install.

frank0071

Sep 28, 2024

I've actually tried llama.cpp, sir, but thanks for the response. I'll install the latest version of both apps (GPT4All, llama.cpp).

SFBAI

Sep 28, 2024 • edited Sep 28, 2024

frank0071, also check the SHA256 hash of the gguf file and compare it with the original string 9da71c45c90a821809821244d4971e5e5dfad7eb091f0b8ff0546392393b6283; it must be the same.
If the SHA256 hash is different, your gguf file is corrupt; download the gguf file again.
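
The hash check suggested above can be sketched in Python as follows. This is a minimal sketch, not the only way to do it: the function names `sha256_of_file` and `matches_expected` are made up for illustration, and the thread does not say which quantization file the published hash belongs to, so treat `EXPECTED_SHA256` as the value for whichever file SFBAI had in mind.

```python
import hashlib

# Hash quoted in the post above. Which .gguf file it belongs to is not
# stated in the thread, so this pairing is an assumption.
EXPECTED_SHA256 = "9da71c45c90a821809821244d4971e5e5dfad7eb091f0b8ff0546392393b6283"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of a file, reading it in 1 MiB chunks
    so that a multi-gigabyte gguf file never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """True if the file's digest equals the expected hex string.
    The comparison ignores case and surrounding whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example call (the path is hypothetical):
# matches_expected("/path/to/model.gguf", EXPECTED_SHA256)
```

If `matches_expected` returns `False`, the file on disk differs from the published one and should be downloaded again.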

frank0071

Sep 29, 2024 • edited Sep 29, 2024

Thanks for the answer, softyfluffyboy, appreciated.
I updated my GPT4All tool to v3.3.1, which is the latest one. I've got another model, Llama 3.2 (the one in the image is 3.1), and I still get "GGGGG" in the response.

I've also updated my llama.cpp; there I get GGG as well.

I compared the checksum with the string you provided and it's different, but I have no idea why. I'm sure my internet connection is not the problem. I've just changed the gguf file name a couple of times, and anyway, if that were the cause, it shouldn't have worked at all rather than printing GGG, I guess?
Am I right? Does changing the file name actually change the checksum? I don't think so.

bartowski

Owner Sep 29, 2024

No, the file name shouldn't change the checksum. Can you try downloading again? Corruption can happen over even the best networks; it's extremely rare, but it happens.
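
A quick way to convince yourself of this is the small experiment below. It is only an illustration: the file names and contents are stand-ins, not a real gguf file. SHA256 is computed over the file's bytes, so a rename leaves the digest unchanged, while altering even one byte changes it.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_hex(path):
    """SHA256 of a small demo file (reads it fully, which is fine for a few bytes)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as tmp_dir:
    original = Path(tmp_dir) / "model.gguf"  # hypothetical file name
    original.write_bytes(b"stand-in for gguf contents")
    before = sha256_hex(original)

    # Renaming changes only the directory entry, not the file's bytes.
    renamed = original.rename(Path(tmp_dir) / "renamed-model.gguf")
    after_rename = sha256_hex(renamed)

    # Flipping a single bit in the content produces a different digest.
    data = bytearray(renamed.read_bytes())
    data[0] ^= 0x01
    renamed.write_bytes(bytes(data))
    after_edit = sha256_hex(renamed)

print(before == after_rename)  # True
print(before == after_edit)    # False
```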

frank0071

Sep 29, 2024

Sure, Mr. bartowski, I will download it tonight and I'll keep you posted. Thanks for the support though, really appreciate it.

There is an idea in my mind about what may be causing the problem: I have about 10 different LLMs, all working, but I downloaded them using Firefox.

But your gguf LLM I downloaded using "wget" on Linux, so maybe that is causing the problem. I'll test using Firefox again and let you know.
Best regards

frank0071

Sep 30, 2024

Hi again guys, I downloaded using Firefox's built-in download manager.

I kind of changed the LLM though, but I downloaded three different LLMs from bartowski's Hugging Face repo, two of them using wget with the -c flag (which stands for continue), which I think screwed up my LLM files (not sure how, this shouldn't have happened). So this time I downloaded using Firefox and now it's OK. I just wanted to share this: if anybody sees similar behavior, use Firefox.
Thanks, bartowski.
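
Whichever download tool is used, the advice in this thread boils down to "verify, and if the hash is wrong, download again from scratch". A rough sketch of that loop is below. Everything here is hypothetical scaffolding: `download_verified` is an invented name, and `fetch` stands for any callable you supply that writes the file to `dest`. The thread did not establish why the resumed downloads were corrupt; the sketch simply deletes any existing file before each attempt so that nothing is ever resumed.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_verified(fetch, dest, expected_sha256, attempts=2):
    """Call fetch(dest) until the file at dest matches expected_sha256.

    `fetch` is a hypothetical, caller-supplied function that downloads
    the file and writes it to the Path `dest`.
    """
    dest = Path(dest)
    for _ in range(attempts):
        # Start clean: drop any partial or corrupt file from an earlier try.
        dest.unlink(missing_ok=True)
        fetch(dest)
        if sha256_of_file(dest) == expected_sha256.strip().lower():
            return dest
    # Do not leave a known-bad file behind for a model loader to pick up.
    dest.unlink(missing_ok=True)
    raise ValueError(f"{dest.name}: checksum mismatch after {attempts} attempts")
```

Raising an error after the last failed attempt, rather than returning the bad file, means a corrupt download is caught at download time instead of showing up later as garbage output.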

frank0071 changed discussion status to closed Sep 30, 2024
