evenfarther committed (verified)
Commit 246153d · 1 Parent(s): 4adb946

Upload adapter_lora_quickstarts_qwen3_14b.ipynb

adapter_lora_quickstarts_qwen3_14b.ipynb ADDED
@@ -0,0 +1,251 @@
+ {
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Qwen3-14B LoRA Adapter — Colab Quickstart\n",
+ "\n",
+ "This notebook shows how to load the `unsloth/Qwen3-14B-unsloth-bnb-4bit` base model, attach the fine-tuned LoRA adapter (P/U classifier), and run classification on a Google Colab T4 runtime.\n",
+ "\n",
+ "**Workflow overview**\n",
+ "1. Install a compatible software stack (NumPy 1.x + SciPy 1.11) together with Unsloth, Transformers, BitsAndBytes, and PEFT.\n",
+ "2. Restart the runtime once (Colab quirk), then re-run the install cell.\n",
+ "3. Configure the Hugging Face adapter repo name.\n",
+ "4. Load base model + attach adapter.\n",
+ "5. Run classification with the same prompt used in evaluation, including a lightweight thinking prefill.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "install"
+ },
+ "outputs": [],
+ "source": [
+ "# Step 1 – Install compatible dependencies.\n",
+ "!pip install -q --upgrade --force-reinstall --no-cache-dir \\\n",
+ "    \"numpy==1.26.4\" \"scipy<1.12\" unsloth unsloth_zoo transformers accelerate bitsandbytes peft huggingface_hub\n",
+ "print(\"Installation complete. Restart the runtime (Runtime > Restart runtime) and run this cell ONCE MORE before moving on.\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 2 – Configure paths\n",
+ "Set `ADAPTER_REPO` to the Hugging Face repo where you uploaded the LoRA adapter files (it must contain `adapter_config.json`, `adapter_model.safetensors`, `chat_template.jinja`, tokenizer files, etc.).\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "config"
+ },
+ "outputs": [],
+ "source": [
+ "BASE_MODEL = 'unsloth/Qwen3-14B-unsloth-bnb-4bit'\n",
+ "ADAPTER_REPO = 'evenfarther/Qwen3-14b-chemical-synthesis-adapter'\n",
+ "MAX_SEQ_LEN = 180  # matches evaluation context length\n",
+ "SYSTEM_PROMPT = 'You are a helpful assistant for P/U classification of synthesizability.'\n",
+ "PROMPT_TEMPLATE = (\n",
+ "    'As chief synthesis scientist, judge {compound} critically. ' +\n",
+ "    'Avoid bias. P=synthesizable, U=not:'\n",
+ ")\n",
+ "DEVICE_MAP = {'': 0}  # keep everything on GPU 0\n",
+ "MAX_MEMORY = {0: '14GiB'}  # Colab T4 budget\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 3 – Load base model + adapter\n",
+ "This places the 4-bit (bitsandbytes) base model and the LoRA adapter on the GPU. If you see a device-map error, make sure the runtime is a single T4 GPU session and that no other large models are loaded.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "load"
+ },
+ "outputs": [],
+ "source": [
+ "import torch\n",
+ "from unsloth import FastLanguageModel\n",
+ "from peft import PeftModel\n",
+ "\n",
+ "torch.cuda.empty_cache()\n",
+ "model, tokenizer = FastLanguageModel.from_pretrained(\n",
+ "    model_name=BASE_MODEL,\n",
+ "    max_seq_length=MAX_SEQ_LEN,\n",
+ "    dtype=None,\n",
+ "    load_in_4bit=True,\n",
+ "    device_map=DEVICE_MAP,\n",
+ "    max_memory=MAX_MEMORY,\n",
+ "    attn_implementation='eager',\n",
+ ")\n",
+ "model = PeftModel.from_pretrained(\n",
+ "    model,\n",
+ "    ADAPTER_REPO,\n",
+ "    device_map=DEVICE_MAP,\n",
+ ")\n",
+ "FastLanguageModel.for_inference(model)\n",
+ "\n",
+ "# Tokenizer safety\n",
+ "if tokenizer.pad_token is None:\n",
+ "    tokenizer.pad_token = tokenizer.eos_token\n",
+ "    tokenizer.pad_token_id = tokenizer.eos_token_id\n",
+ "\n",
+ "vram_gb = torch.cuda.memory_allocated() / 1e9\n",
+ "print(f'VRAM usage after load: {vram_gb:.2f} GB')\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 4 – (Optional) Use the adapter's chat template\n",
+ "The evaluation uses a checkpoint-provided chat template and pre-fills a lightweight thinking block after the assistant start. If your adapter repo contains a `chat_template.jinja`, load it into the tokenizer for consistent formatting.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "template"
+ },
+ "outputs": [],
+ "source": [
+ "from huggingface_hub import hf_hub_download\n",
+ "\n",
+ "try:\n",
+ "    tmpl_path = hf_hub_download(repo_id=ADAPTER_REPO, filename='chat_template.jinja')\n",
+ "    with open(tmpl_path, 'r', encoding='utf-8') as f:\n",
+ "        tokenizer.chat_template = f.read()\n",
+ "    print('Loaded chat_template.jinja from adapter repo.')\n",
+ "except Exception:\n",
+ "    print('No chat_template.jinja found in adapter repo (this is OK). Using tokenizer default.')\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 5 – Classification helper\n",
+ "The helper below mirrors the training/evaluation prompt. It also inserts a minimal thinking prefill so the model conditions identically to evaluation.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "helper"
+ },
+ "outputs": [],
+ "source": [
+ "THINK_STUB = '<think>\\n\\n</think>\\n\\n'\n",
+ "\n",
+ "def classify(composition: str) -> str:\n",
+ "    \"\"\"Return 'P' or 'U' for the given composition string.\"\"\"\n",
+ "    user_prompt = PROMPT_TEMPLATE.format(compound=composition)\n",
+ "    messages = [\n",
+ "        {\"role\": \"system\", \"content\": SYSTEM_PROMPT},\n",
+ "        {\"role\": \"user\", \"content\": user_prompt},\n",
+ "    ]\n",
+ "\n",
+ "    # Preferred: pass enable_thinking=False so the template injects the stub.\n",
+ "    try:\n",
+ "        inputs = tokenizer.apply_chat_template(\n",
+ "            messages,\n",
+ "            return_tensors='pt',\n",
+ "            add_generation_prompt=True,\n",
+ "            enable_thinking=False,\n",
+ "        ).to(model.device)\n",
+ "        used_stub = True\n",
+ "    except TypeError:\n",
+ "        # Fallback: render the prompt as text and append the stub manually.\n",
+ "        prompt_text = tokenizer.apply_chat_template(\n",
+ "            messages,\n",
+ "            tokenize=False,\n",
+ "            add_generation_prompt=True,\n",
+ "        )\n",
+ "        if not prompt_text.endswith(THINK_STUB):\n",
+ "            prompt_text = prompt_text + THINK_STUB\n",
+ "        inputs = tokenizer(prompt_text, return_tensors='pt').to(model.device)[\"input_ids\"]\n",
+ "        used_stub = True\n",
+ "\n",
+ "    with torch.no_grad():\n",
+ "        outputs = model.generate(\n",
+ "            inputs,\n",
+ "            max_new_tokens=1,\n",
+ "            temperature=0.0,\n",
+ "            do_sample=False,\n",
+ "            pad_token_id=tokenizer.pad_token_id,\n",
+ "            eos_token_id=tokenizer.eos_token_id,\n",
+ "        )\n",
+ "    gen_ids = outputs[0][inputs.shape[1]:]  # newly generated tokens only\n",
+ "    gen_text = tokenizer.decode(gen_ids, skip_special_tokens=True)\n",
+ "    # Extract the first non-whitespace char and map to P/U if present.\n",
+ "    gen_text = gen_text.strip()\n",
+ "    return gen_text[:1] if gen_text[:1] in ('P', 'U') else gen_text\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Step 6 – Run a few examples\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "demo"
+ },
+ "outputs": [],
+ "source": [
+ "examples = [\n",
+ "    'Ta1Tl1Cr1Pt1',\n",
+ "    'Rh1Cl3',\n",
+ "    'Co2Bi1Ru1',\n",
+ "    'Co15Mo55Fe10Ni10Cu10',\n",
+ "]\n",
+ "for comp in examples:\n",
+ "    pred = classify(comp)\n",
+ "    output = pred if pred else '<empty>'\n",
+ "    print(f'{comp}: {output}')\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Notes\n",
+ "- VRAM: Colab T4 (≈15 GB) is typically sufficient for 4-bit base + adapter.\n",
+ "- If you see memory errors, reduce `MAX_SEQ_LEN` (e.g., 128) or clear other sessions.\n",
+ "- The adapter targets chemical synthesizability classification; generalization outside this domain is not guaranteed.\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "accelerator": "GPU",
+ "colab": {
+ "name": "Qwen3-14B LoRA Adapter — Colab Quickstart",
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ },
+ "language_info": {
+ "name": "python"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+ }