🚩 Report: Not working
You're the first one in a long time to see this issue; usually it indicates an out-of-date install.
I've actually tried llama.cpp, sir, but thanks for the response. I'll install the latest version of both apps (GPT4All, llama.cpp).
frank0071, also check the SHA256 hash of the GGUF file and compare it with the original string 9da71c45c90a821809821244d4971e5e5dfad7eb091f0b8ff0546392393b6283. It must be the same.
If the SHA256 hash is different, your GGUF file is corrupt; download the GGUF file again.
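For anyone who wants to run that check from a script, here is a minimal sketch in Python using only the standard library. The function names `sha256_of_file` and `matches_expected` are made up for illustration; the expected hash is the string quoted above. The file is read in chunks, so a multi-gigabyte GGUF does not have to fit in memory.

```python
import hashlib

# Expected hash quoted earlier in this thread.
EXPECTED_SHA256 = "9da71c45c90a821809821244d4971e5e5dfad7eb091f0b8ff0546392393b6283"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's digest with an expected hex string, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling `matches_expected("model.gguf", EXPECTED_SHA256)` returns `True` only for a byte-identical download; `model.gguf` is a placeholder, so substitute the actual path of your file.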
Thanks for the answer, softyfluffyboy, appreciated.
I updated my GPT4All tool to v3.3.1, which is the latest one, and I got another model, Llama 3.2 (the one in the image is 3.1), and I still get "GGGGG" in the response.
I've also updated my llama.cpp; there I get "GGG" as well.
I compared the checksum with the string you provided and it's different, but I have no idea why. My internet connection is not the problem, I'm sure of that. I've just changed that GGUF file's name a couple of times, and anyway, if that were the cause, it shouldn't have worked at all rather than printing "GGG", I guess?
Am I right? Does changing the file name actually change the checksum? I don't think so.
No, the file name shouldn't change the checksum. Can you try downloading again? Corruption can happen over even the best networks; it's extremely rare, but it happens.
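A quick way to convince yourself of this: SHA256 is computed over the file's bytes only, and the name is not part of the input. The sketch below (plain Python standard library; the file names and the helper `rename_vs_edit` are invented for the demonstration) writes a small temporary file, renames it, and then changes a single byte.

```python
import hashlib
import tempfile
from pathlib import Path


def digest(path):
    """Hex SHA256 of a small file (reads it fully; fine for this tiny demo file)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def rename_vs_edit():
    """Return (same_after_rename, differs_after_edit) for a throwaway file."""
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "model.gguf"
        original.write_bytes(b"GGUF" + bytes(range(256)))
        before = digest(original)

        # Renaming touches only the directory entry, not the content.
        renamed = original.rename(Path(tmp) / "renamed-model.gguf")
        same_after_rename = digest(renamed) == before

        # Changing even one byte of content gives a different digest.
        data = bytearray(renamed.read_bytes())
        data[-1] ^= 0xFF
        renamed.write_bytes(bytes(data))
        differs_after_edit = digest(renamed) != before

        return same_after_rename, differs_after_edit
```

`rename_vs_edit()` reports that the digest survives the rename and changes after the one-byte edit, so a mismatching checksum points at the file's contents, not at its name.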
Sure, Mr. bartowski, I will download it tonight and I'll keep you posted. Thanks for the support, really appreciate it.
I have an idea about what may be causing the problem: I have about 10 different LLMs, all working, but I downloaded them using Firefox.
Your GGUF model, though, I downloaded using "wget" on Linux, so maybe that's causing the problem. I'll test using Firefox again and let you know.
Best regards
Hi again guys, I downloaded using Firefox's built-in download manager.
I kind of changed the LLM, though: I downloaded three different LLMs from bartowski's Hugging Face repo, two of them using wget with the -c flag (which stands for "continue"), which I think screwed up my LLM files (not sure how; this shouldn't have happened). So this time I downloaded using Firefox, and now it's OK. I just wanted to share this: if anybody sees similar behavior, use Firefox.
Thanks, bartowski.


