# Magma-8B-4bit
## Model Description
**Magma-8B-4bit** is a 4-bit quantized version of the [Magma-8B](https://huggingface.co/microsoft/Magma-8B) model, optimized for efficient inference with reduced memory usage.
## Features
- **Quantization**: 4-bit precision using `BitsAndBytesConfig`.
- **Reduced VRAM Usage**: Runs efficiently on consumer GPUs.
- **Performance**: Maintains strong text generation capabilities while being lightweight.
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "PritamcodesAGI/Magma-8B-4bit",
    device_map="auto",  # place the quantized weights on the available device
)
tokenizer = AutoTokenizer.from_pretrained("PritamcodesAGI/Magma-8B-4bit")

# Move inputs to the same device as the model before generating
inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## License
This model follows the same licensing terms as the original [Magma-8B](https://huggingface.co/microsoft/Magma-8B).