AhmedBou committed · Commit e1e014a · verified · 1 Parent(s): 101472a

Update README.md

Files changed (1):
1. README.md +45 -1

README.md CHANGED
@@ -1,6 +1,7 @@
 ---
 language:
 - en
+- ar
 license: apache-2.0
 tags:
 - text-generation-inference
@@ -9,8 +10,51 @@ tags:
 - llama
 - trl
 base_model: unsloth/llama-3-8b-bnb-4bit
+datasets:
+- AhmedBou/EngText-ArabicSummary
 ---
 
+## Inference code
+Use this Python code for inference:
+
+````python
+# Installs Unsloth, Xformers (Flash Attention) and all other packages!
+!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
+!pip install --no-deps xformers trl peft accelerate bitsandbytes
+
+from unsloth import FastLanguageModel
+
+max_seq_length = 2048
+dtype = None          # None = auto-detect (float16 on T4/V100, bfloat16 on Ampere+)
+load_in_4bit = True   # load weights in 4-bit to fit on small GPUs
+
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name = "AhmedBou/Llama-3-EngText-ArabicSummary",
+    max_seq_length = max_seq_length,
+    dtype = dtype,
+    load_in_4bit = load_in_4bit,
+)
+FastLanguageModel.for_inference(model)  # Enable native 2x faster inference
+
+# NOTE: the exact training prompt is not published in this card. This two-slot
+# Alpaca-style template is an assumption that matches the
+# alpaca_prompt.format(article, "") call below; swap in the template actually
+# used during fine-tuning if it differs.
+alpaca_prompt = """Below is a news article. Write a concise summary of it in Arabic.
+
+### Input:
+{}
+
+### Response:
+{}"""
+
+article = """
+paste a news article here
+"""
+
+inputs = tokenizer(
+    [
+        alpaca_prompt.format(
+            article,  # input
+            "",       # output - leave this blank for generation!
+        )
+    ], return_tensors = "pt").to("cuda")
+
+outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
+tokenizer.batch_decode(outputs)
+````
+
 # Uploaded model
 
 - **Developed by:** AhmedBou
@@ -19,4 +63,4 @@ base_model: unsloth/llama-3-8b-bnb-4bit
 
 This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
 
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
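
Note that `tokenizer.batch_decode(outputs)` in the snippet added by this commit returns the whole sequence, prompt included. Below is a minimal follow-up sketch, reusing `model`, `tokenizer`, and `inputs` from that snippet, that decodes only the newly generated tokens so the result is just the summary; the slicing trick is standard `transformers` usage, not something from this card.

````python
# Decode only the tokens generated after the prompt. `inputs` and `model`
# come from the inference snippet above; nothing new is loaded here.
prompt_length = inputs["input_ids"].shape[1]          # number of prompt tokens
outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
summary = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens = True)
print(summary)
````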
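
The commit also adds AhmedBou/EngText-ArabicSummary to the card's `datasets` metadata. Here is a minimal sketch for pulling it down to inspect the training pairs; the `"train"` split name and column layout are assumptions, so check the dataset card if they differ.

````python
from datasets import load_dataset

# Load the dataset named in the card metadata.
ds = load_dataset("AhmedBou/EngText-ArabicSummary")
print(ds)              # available splits and columns
print(ds["train"][0])  # first example -- assumes a "train" split exists
````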
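
The card says the model was trained with Unsloth and TRL but does not include the training script. For orientation only, here is a minimal sketch of that recipe as it appears in the standard Unsloth notebooks; the LoRA settings, hyperparameters, and the `dataset_text_field = "text"` column are placeholders, not the author's actual configuration, and newer TRL versions move these arguments into `SFTConfig`.

````python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit base model named in the card metadata.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach LoRA adapters. Rank, alpha, and target modules are the defaults
# from the Unsloth notebooks, not the author's published settings.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("AhmedBou/EngText-ArabicSummary", split = "train")

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",   # assumes a pre-formatted "text" column
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        max_steps = 60,            # placeholder training budget
        learning_rate = 2e-4,
        output_dir = "outputs",
    ),
)
trainer.train()
````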