# Qwen3-14B Chemical Synthesis Classifier - LoRA Adapter
## Model Overview
This repository contains the LoRA adapter for a Qwen3-14B model fine-tuned to classify chemical synthesizability (P = synthesizable, U = unsynthesizable). Training uses a P/U-only focal loss. Prompts follow this template:
```text
As chief synthesis scientist, judge {compound} critically. Avoid bias. P=synthesizable, U=not:
```
The base checkpoint is Unsloth's 4-bit bitsandbytes (bnb) build (`unsloth/Qwen3-14B-unsloth-bnb-4bit`). Attaching this adapter reproduces the best validation performance among the evaluated epochs (Epoch 2).
- Task: Binary classification (P = synthesizable, U = unsynthesizable)
- Training Objective: QLoRA with focal loss (gamma = 2.0, alpha_P = 8.12, alpha_U = 1.0)
- Max Sequence Length: 2048 tokens (training) / 180 tokens (evaluation)
- Dataset: 316,442 training samples (`train_pu_hem.jsonl`) / 79,114 validation samples (`validate_pu_hem.jsonl`), ~11% P / 89% U
- Adapter Size: ~981 MB (`adapter_model.safetensors`)
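The class-weighted focal loss named above can be sketched as follows. This is a minimal per-sample illustration under stated assumptions (scalar probabilities, no batching or reduction), not the actual training code:

```python
import math

# Sketch of the P/U focal loss: FL(p_t) = -alpha_c * (1 - p_t)**gamma * log(p_t),
# with a class-dependent alpha (alpha_P = 8.12 upweights the rare P class).
GAMMA = 2.0
ALPHA = {"P": 8.12, "U": 1.0}

def focal_loss(prob_correct: float, label: str) -> float:
    """Loss for one sample given the model's probability of the true label."""
    p_t = max(prob_correct, 1e-12)  # clamp to avoid log(0)
    return -ALPHA[label] * (1.0 - p_t) ** GAMMA * math.log(p_t)

# A confident correct prediction is barely penalized; a confidently wrong
# prediction on the rare P class is penalized heavily.
easy = focal_loss(0.95, "U")
hard = focal_loss(0.10, "P")
```

The `(1 - p_t)**gamma` factor down-weights easy examples, while `alpha_P` compensates for the ~1:8 P/U class imbalance.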
## Prompt & Thinking Prefill
Evaluation constructs prompts via the included chat template and pre-fills a lightweight thinking block to match SFT conditions. Effective structure:
- System: `You are a helpful assistant for P/U classification of synthesizability.`
- User: `As chief synthesis scientist, judge {compound} critically. Avoid bias. P=synthesizable, U=not:`
- Assistant (prefill): `<think>\n\n</think>\n\n`
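The message structure above can be assembled as in the sketch below (the message layout mirrors the card; the `build_prompt` helper and the example compound are illustrative — the actual checkpoint applies its bundled `chat_template.jinja`):

```python
THINK_STUB = "<think>\n\n</think>\n\n"

def build_prompt(compound: str) -> list[dict]:
    """Messages matching the SFT-time structure; the assistant turn is
    pre-filled with an empty thinking block so no reasoning is generated."""
    return [
        {"role": "system",
         "content": "You are a helpful assistant for P/U classification of synthesizability."},
        {"role": "user",
         "content": f"As chief synthesis scientist, judge {compound} critically. "
                    f"Avoid bias. P=synthesizable, U=not:"},
        # Prefill: generation continues after the closed, empty <think> block.
        {"role": "assistant", "content": THINK_STUB},
    ]

messages = build_prompt("LiFePO4")
```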
Two equivalent setups in the evaluation script:
- Template-driven: `--use_checkpoint_chat_template --enable_thinking false --think_stub false` (recommended; the template inserts the stub).
- Manual stub: `--think_stub true` (forces appending `<think>…</think>` after the assistant start).
## Validation Metrics (Epoch 2 - Best)
| Metric | Value |
|---|---|
| TPR (P Recall) | 0.9562 |
| TNR (U Specificity) | 0.9001 |
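The two reported rates follow directly from per-class confusion counts. For reference (the counts below are illustrative only, not the actual validation tallies):

```python
def rates(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """TPR = recall on P (synthesizable); TNR = specificity on U."""
    tpr = tp / (tp + fn)  # fraction of true P correctly called P
    tnr = tn / (tn + fp)  # fraction of true U correctly called U
    return tpr, tnr

# Illustrative counts:
tpr, tnr = rates(tp=956, fn=44, tn=900, fp=100)
```

With the ~11%/89% class skew, reporting TPR and TNR separately is more informative than plain accuracy, which a constant "U" classifier would already push to ~89%.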
## Dataset Sources
The training and validation splits combine multiple public sources and internal curation:
- P/U labelled data from J. Am. Chem. Soc. 2024, 146, 29, 19654-19659 (doi:10.1021/jacs.4c05840).
- High-entropy materials data from Data in Brief 2018, 21, 2664-2678 (doi:10.1016/j.dib.2018.11.111).
- Additional candidates via literature queries and manual screening of high-entropy materials.
After de-duplication across all sources, approximately 2,560 unique compositions were appended to the base corpus. The combined dataset contains 316,442 training samples and 79,114 validation samples with an imbalanced label ratio (~11% P / 89% U).
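The cross-source de-duplication step could look like the sketch below (the normalization rule is an assumption; the card does not describe the actual curation pipeline):

```python
def dedup_compositions(sources: list[list[str]]) -> list[str]:
    """Merge composition lists across sources, keeping the first occurrence
    of each formula after naive whitespace normalization."""
    seen: set[str] = set()
    unique: list[str] = []
    for source in sources:
        for formula in source:
            key = formula.strip().replace(" ", "")  # naive normalization only
            if key not in seen:
                seen.add(key)
                unique.append(formula.strip())
    return unique
```

Real composition matching would also need to handle element ordering and stoichiometric notation, which this sketch ignores.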
## VRAM & System Requirements
- GPU VRAM: >=16 GB recommended for loading the 4-bit base with this adapter.
- RAM: >=16 GB recommended for tokenization and batching.
- Libraries: `unsloth`, `transformers`, `peft`, `bitsandbytes`.
- CPU-only inference is not supported with the 4-bit bitsandbytes weights.
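The 16 GB recommendation is consistent with rough back-of-envelope arithmetic (the headroom term for KV cache and activations is an assumption):

```python
def estimate_vram_gb(n_params_b: float, bits: int, adapter_gb: float,
                     overhead_gb: float = 4.0) -> float:
    """Quantized base weights + adapter + KV-cache/activation headroom."""
    base_gb = n_params_b * bits / 8  # billions of params * bytes per param
    return base_gb + adapter_gb + overhead_gb

# 14B params at 4 bits ~= 7 GB, plus the ~1 GB adapter and headroom.
total = estimate_vram_gb(n_params_b=14, bits=4, adapter_gb=1.0)
```

This lands around 12 GB in the steady state, leaving the remainder of a 16 GB card as margin for longer sequences and framework overhead.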
## Limitations & Notes
- This adapter targets chemical synthesizability judgments; generalization outside this domain is not guaranteed.
- For consistent results, use a chat template aligned with training (a `chat_template.jinja` is included in this checkpoint).
## Model Tree
- Adapter repository: `evenfarther/Qwen3-14b-chemical-synthesis-adapter`
- Base model: `Qwen/Qwen3-14B-Base`