evenfarther committed (verified)
Commit 246153d · 1 Parent(s): 4adb946

Upload adapter_lora_quickstarts_qwen3_14b.ipynb

adapter_lora_quickstarts_qwen3_14b.ipynb ADDED
@@ -0,0 +1,251 @@
+ {
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Qwen3-14B LoRA Adapter — Colab Quickstart\n",
+ "\n",
+ "This notebook shows how to load the `unsloth/Qwen3-14B-unsloth-bnb-4bit` base model, attach the fine-tuned LoRA adapter (P/U classifier), and run classification on a Google Colab T4 runtime.\n",
+ "\n",
+ "**Workflow overview**\n",
+ "1. Install a compatible software stack (NumPy 1.x + SciPy 1.11) together with Unsloth, Transformers, BitsAndBytes, and PEFT.\n",
+ "2. Restart the runtime once (Colab quirk), then re-run the install cell.\n",
+ "3. Configure the Hugging Face adapter repo name.\n",
+ "4. Load base model + attach adapter.\n",
+ "5. Run classification with the same prompt used in evaluation, including a lightweight thinking prefill.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "install"
+ },
+ "outputs": [],
+ "source": [
+ "# Step 1 – Install compatible dependencies.\n",
+ "!pip install -q --upgrade --force-reinstall --no-cache-dir \\\n",
+ "    \"numpy==1.26.4\" \"scipy<1.12\" unsloth unsloth_zoo transformers accelerate bitsandbytes peft huggingface_hub\n",
+ "print(\"Installation complete. Restart the runtime (Runtime > Restart runtime) and run this cell ONCE MORE before moving on.\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 2 – Configure paths\n",
+ "Set `ADAPTER_REPO` to the Hugging Face repo where you uploaded the LoRA adapter files (it must contain `adapter_config.json`, `adapter_model.safetensors`, `chat_template.jinja`, tokenizer files, etc.).\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "config"
+ },
+ "outputs": [],
+ "source": [
+ "BASE_MODEL = 'unsloth/Qwen3-14B-unsloth-bnb-4bit'\n",
+ "ADAPTER_REPO = 'evenfarther/Qwen3-14b-chemical-synthesis-adapter'\n",
+ "MAX_SEQ_LEN = 180  # matches evaluation context length\n",
+ "SYSTEM_PROMPT = 'You are a helpful assistant for P/U classification of synthesizability.'\n",
+ "PROMPT_TEMPLATE = (\n",
+ "    'As chief synthesis scientist, judge {compound} critically. ' +\n",
+ "    'Avoid bias. P=synthesizable, U=not:'\n",
+ ")\n",
+ "DEVICE_MAP = {'': 0}  # keep everything on GPU 0\n",
+ "MAX_MEMORY = {0: '14GiB'}  # Colab T4 budget\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 3 – Load base model + adapter\n",
+ "This places the 4-bit (bitsandbytes) base model and the LoRA adapter on the GPU. If you see a device-map error, make sure the runtime is a single T4 GPU session and that no other large models are loaded.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "load"
+ },
+ "outputs": [],
+ "source": [
+ "import torch\n",
+ "from unsloth import FastLanguageModel\n",
+ "from peft import PeftModel\n",
+ "\n",
+ "torch.cuda.empty_cache()\n",
+ "model, tokenizer = FastLanguageModel.from_pretrained(\n",
+ "    model_name=BASE_MODEL,\n",
+ "    max_seq_length=MAX_SEQ_LEN,\n",
+ "    dtype=None,\n",
+ "    load_in_4bit=True,\n",
+ "    device_map=DEVICE_MAP,\n",
+ "    max_memory=MAX_MEMORY,\n",
+ "    attn_implementation='eager',\n",
+ ")\n",
+ "model = PeftModel.from_pretrained(\n",
+ "    model,\n",
+ "    ADAPTER_REPO,\n",
+ "    device_map=DEVICE_MAP,\n",
+ ")\n",
+ "FastLanguageModel.for_inference(model)\n",
+ "\n",
+ "# Tokenizer safety\n",
+ "if tokenizer.pad_token is None:\n",
+ "    tokenizer.pad_token = tokenizer.eos_token\n",
+ "    tokenizer.pad_token_id = tokenizer.eos_token_id\n",
+ "\n",
+ "vram_gb = torch.cuda.memory_allocated() / 1e9\n",
+ "print(f'VRAM usage after load: {vram_gb:.2f} GB')\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 4 – (Optional) Use the adapter's chat template\n",
+ "The evaluation uses a checkpoint-provided chat template and pre-fills a lightweight thinking block after the assistant start. If your adapter repo contains a `chat_template.jinja`, load it into the tokenizer for consistent formatting.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "template"
+ },
+ "outputs": [],
+ "source": [
+ "from huggingface_hub import hf_hub_download\n",
+ "\n",
+ "try:\n",
+ "    tmpl_path = hf_hub_download(repo_id=ADAPTER_REPO, filename='chat_template.jinja')\n",
+ "    with open(tmpl_path, 'r', encoding='utf-8') as f:\n",
+ "        tokenizer.chat_template = f.read()\n",
+ "    print('Loaded chat_template.jinja from adapter repo.')\n",
+ "except Exception:\n",
+ "    print('No chat_template.jinja found in adapter repo (this is OK). Using tokenizer default.')\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 5 – Classification helper\n",
+ "The helper below mirrors the training/evaluation prompt. It also inserts a minimal thinking prefill so the model conditions identically to evaluation.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "helper"
+ },
+ "outputs": [],
+ "source": [
+ "THINK_STUB = '<think>\\n\\n</think>\\n\\n'\n",
+ "\n",
+ "def classify(composition: str) -> str:\n",
+ "    \"\"\"Return 'P' or 'U' for the given composition string.\"\"\"\n",
+ "    user_prompt = PROMPT_TEMPLATE.format(compound=composition)\n",
+ "    messages = [\n",
+ "        {\"role\": \"system\", \"content\": SYSTEM_PROMPT},\n",
+ "        {\"role\": \"user\", \"content\": user_prompt},\n",
+ "    ]\n",
+ "\n",
+ "    # Preferred: pass enable_thinking=False so the template injects the stub.\n",
+ "    try:\n",
+ "        inputs = tokenizer.apply_chat_template(\n",
+ "            messages,\n",
+ "            return_tensors='pt',\n",
+ "            add_generation_prompt=True,\n",
+ "            enable_thinking=False,\n",
+ "        ).to(model.device)\n",
+ "        used_stub = True\n",
+ "    except TypeError:\n",
+ "        # Fallback: render the prompt as text and append the stub manually.\n",
+ "        prompt_text = tokenizer.apply_chat_template(\n",
+ "            messages,\n",
+ "            tokenize=False,\n",
+ "            add_generation_prompt=True,\n",
+ "        )\n",
+ "        if not prompt_text.endswith(THINK_STUB):\n",
+ "            prompt_text = prompt_text + THINK_STUB\n",
+ "        inputs = tokenizer(prompt_text, return_tensors='pt').to(model.device)[\"input_ids\"]\n",
+ "        used_stub = True\n",
+ "\n",
+ "    with torch.no_grad():\n",
+ "        outputs = model.generate(\n",
+ "            inputs,\n",
+ "            max_new_tokens=1,\n",
+ "            temperature=0.0,\n",
+ "            do_sample=False,\n",
+ "            pad_token_id=tokenizer.pad_token_id,\n",
+ "            eos_token_id=tokenizer.eos_token_id,\n",
+ "        )\n",
+ "    gen_ids = outputs[0][inputs.shape[1]:]  # newly generated tokens only\n",
+ "    gen_text = tokenizer.decode(gen_ids, skip_special_tokens=True)\n",
+ "    # Extract the first non-whitespace char and map to P/U if present.\n",
+ "    gen_text = gen_text.strip()\n",
+ "    return gen_text[:1] if gen_text[:1] in ('P', 'U') else gen_text\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 6 – Run a few examples\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "demo"
+ },
+ "outputs": [],
+ "source": [
+ "examples = [\n",
+ "    'Ta1Tl1Cr1Pt1',\n",
+ "    'Rh1Cl3',\n",
+ "    'Co2Bi1Ru1',\n",
+ "    'Co15Mo55Fe10Ni10Cu10',\n",
+ "]\n",
+ "for comp in examples:\n",
+ "    pred = classify(comp)\n",
+ "    output = pred if pred else '<empty>'\n",
+ "    print(f'{comp}: {output}')\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Notes\n",
+ "- VRAM: Colab T4 (≈15 GB) is typically sufficient for 4-bit base + adapter.\n",
+ "- If you see memory errors, reduce `MAX_SEQ_LEN` (e.g., 128) or clear other sessions.\n",
+ "- The adapter targets chemical synthesizability classification; generalization outside this domain is not guaranteed.\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "accelerator": "GPU",
+ "colab": {
+ "name": "Qwen3-14B LoRA Adapter — Colab Quickstart",
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ },
+ "language_info": {
+ "name": "python"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+ }