# Magma-8B-4bit
## Model Description
**Magma-8B-4bit** is a 4-bit quantized version of the [Magma-8B](https://huggingface.co/microsoft/Magma-8B) model, optimized for efficient inference with reduced memory usage.
## Features
- **Quantization**: 4-bit precision using `BitsAndBytesConfig`.
- **Reduced VRAM Usage**: Runs efficiently on consumer GPUs.
- **Performance**: Maintains strong text generation capabilities while being lightweight.
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "PritamcodesAGI/Magma-8B-4bit",
    device_map="auto",  # place the quantized weights on the available device
)
tokenizer = AutoTokenizer.from_pretrained("PritamcodesAGI/Magma-8B-4bit")

# Move inputs to the same device as the model before generating
inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## License
This model follows the same licensing terms as the original [Magma-8B](https://huggingface.co/microsoft/Magma-8B).