AhmedBou committed · Commit e1e014a · verified · 1 Parent(s): 101472a

Update README.md

Files changed (1):
1. README.md +45 -1

README.md CHANGED
@@ -1,6 +1,7 @@
 ---
 language:
 - en
+- ar
 license: apache-2.0
 tags:
 - text-generation-inference
@@ -9,8 +10,51 @@ tags:
 - llama
 - trl
 base_model: unsloth/llama-3-8b-bnb-4bit
+datasets:
+- AhmedBou/EngText-ArabicSummary
 ---
 
+## Inference code
+Use this Python code for inference:
+
+````python
+# Installs Unsloth, Xformers (Flash Attention) and all other packages!
+!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
+!pip install --no-deps xformers trl peft accelerate bitsandbytes
+
+from unsloth import FastLanguageModel
+
+max_seq_length = 2048
+dtype = None          # None = auto-detect (float16 on T4/V100, bfloat16 on Ampere+)
+load_in_4bit = True   # load weights in 4-bit to fit on small GPUs
+
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name = "AhmedBou/Llama-3-EngText-ArabicSummary",
+    max_seq_length = max_seq_length,
+    dtype = dtype,
+    load_in_4bit = load_in_4bit,
+)
+FastLanguageModel.for_inference(model)  # Enable native 2x faster inference
+
+# NOTE: the exact training prompt is not published in this card. This two-slot
+# Alpaca-style template is an assumption that matches the
+# alpaca_prompt.format(article, "") call below; swap in the template actually
+# used during fine-tuning if it differs.
+alpaca_prompt = """Below is a news article. Write a concise summary of it in Arabic.
+
+### Input:
+{}
+
+### Response:
+{}"""
+
+article = """
+paste a news article here
+"""
+
+inputs = tokenizer(
+    [
+        alpaca_prompt.format(
+            article,  # input
+            "",       # output - leave this blank for generation!
+        )
+    ], return_tensors = "pt").to("cuda")
+
+outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
+tokenizer.batch_decode(outputs)
+````
+
 # Uploaded model
 
 - **Developed by:** AhmedBou
@@ -19,4 +63,4 @@ base_model: unsloth/llama-3-8b-bnb-4bit
 
 This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
 
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
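
Note that `tokenizer.batch_decode(outputs)` in the snippet added by this commit returns the whole sequence, prompt included. Below is a minimal follow-up sketch, reusing `model`, `tokenizer`, and `inputs` from that snippet, that decodes only the newly generated tokens so the result is just the summary; the slicing trick is standard `transformers` usage, not something from this card.

````python
# Decode only the tokens generated after the prompt. `inputs` and `model`
# come from the inference snippet above; nothing new is loaded here.
prompt_length = inputs["input_ids"].shape[1]          # number of prompt tokens
outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
summary = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens = True)
print(summary)
````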
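
The commit also adds AhmedBou/EngText-ArabicSummary to the card's `datasets` metadata. Here is a minimal sketch for pulling it down to inspect the training pairs; the `"train"` split name and column layout are assumptions, so check the dataset card if they differ.

````python
from datasets import load_dataset

# Load the dataset named in the card metadata.
ds = load_dataset("AhmedBou/EngText-ArabicSummary")
print(ds)              # available splits and columns
print(ds["train"][0])  # first example -- assumes a "train" split exists
````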
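
The card says the model was trained with Unsloth and TRL but does not include the training script. For orientation only, here is a minimal sketch of that recipe as it appears in the standard Unsloth notebooks; the LoRA settings, hyperparameters, and the `dataset_text_field = "text"` column are placeholders, not the author's actual configuration, and newer TRL versions move these arguments into `SFTConfig`.

````python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit base model named in the card metadata.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach LoRA adapters. Rank, alpha, and target modules are the defaults
# from the Unsloth notebooks, not the author's published settings.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("AhmedBou/EngText-ArabicSummary", split = "train")

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",   # assumes a pre-formatted "text" column
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        max_steps = 60,            # placeholder training budget
        learning_rate = 2e-4,
        output_dir = "outputs",
    ),
)
trainer.train()
````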