# Llama3-ThinkQ8

A fine-tuned version of Llama 3 that shows explicit thinking using `` and `` tags. This model is quantized to 8-bit (Q8) for efficient inference.

## Model Details

- **Base Model**: Llama 3
- **Quantization**: 8-bit (Q8)
- **Special Feature**: Explicit thinking process with tags

## How to Use with Ollama

### 1. Install Ollama

If you haven't already installed Ollama, follow the instructions at [ollama.ai](https://ollama.ai).

### 2. Download the model file

Download the GGUF file from this repository.

### 3. Create the Ollama model

Create a file named `Modelfile` with this content:

```
FROM llama3-thinkQ8.gguf

# Model parameters
PARAMETER temperature 0.8
PARAMETER top_p 0.9

# System prompt
SYSTEM """You are a helpful assistant. You will check the user request and you will think and generate brainstorming and self-thoughts in your mind and respond only in the following format: {your thoughts here} {your final answer here} . Use the tags once and place all your output inside them ONLY"""
```

Then run:

```bash
ollama create llama3-think -f Modelfile
```

### 4. Run the model

```bash
ollama run llama3-think
```

## Example Prompts

Try these examples:

```
Using each number in this tensor ONLY once (5, 8, 3) and any arithmetic operation like add, subtract, multiply, divide, create an equation that equals 19.
```

```
Explain the concept of quantum entanglement to a high school student.
```
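
## Calling the Model from Code

If you want to query the model from a script rather than the interactive `ollama run` session, you can go through Ollama's local REST API. The snippet below is a minimal sketch, assuming Ollama is running on its default port (11434) and that the model was created with the name `llama3-think` as in step 3; the `ask` helper and `MODEL_NAME` constant are just illustrative names. You can confirm the model exists beforehand with `ollama list`.

```python
import requests

# Minimal sketch: send one prompt to the model created in step 3 via
# Ollama's local REST API (assumes Ollama is running on the default
# port 11434 and the model is named "llama3-think").
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_NAME = "llama3-think"


def ask(prompt: str) -> str:
    """Return the model's full, non-streamed reply for a single prompt."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": MODEL_NAME, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]


if __name__ == "__main__":
    reply = ask(
        "Using each number in this tensor ONLY once (5, 8, 3) and any "
        "arithmetic operation like add, subtract, multiply, divide, "
        "create an equation that equals 19."
    )
    # The reply contains the model's thinking followed by the final
    # answer, wrapped in the tags defined by the system prompt.
    print(reply)
```

The returned string contains the thinking portion followed by the final answer, wrapped in the tags defined by the system prompt, so you can post-process or strip the thinking section as needed for your application.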