SciReason-LFM2-2.6B
Model Overview
SciReason-LFM2-2.6B is a fine-tuned version of LiquidAI/LFM2-2.6B, trained with Unsloth on the OpenScienceReasoning-2 dataset.
The fine-tuning aims to improve the base model's multi-step scientific reasoning and its ability to produce coherent chain-of-thought explanations.
Training Configuration
- Framework: Unsloth
- Dataset: nvidia/OpenScienceReasoning-2
- Examples: ~11,000
- Epochs: 1
- Total Steps: 1,375
- Batch size per device: 2
- Gradient Accumulation Steps: 4
- Effective Batch Size: 8
- Trainable Parameters: ~20M (LoRA / PEFT with Unsloth smart offloading)
- Optimizer: AdamW
- Learning Rate: 2e-4
- Weight Decay: 0.01
- LR Scheduler: cosine with warmup
- Hardware: Single GPU (Unsloth offloading enabled)
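The numbers in this list are mutually consistent, which a few lines of arithmetic can confirm. The sketch below is illustrative only and is not the training script: the helper names are hypothetical, the single-device count is taken from the hardware line, and `warmup_steps=50` is an assumed value because the card does not state the warmup length.

```python
import math


def total_optimizer_steps(num_examples, per_device_batch, grad_accum,
                          epochs=1, num_devices=1):
    """Return (effective_batch_size, total_optimizer_steps)."""
    effective_batch = per_device_batch * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch * epochs


def cosine_lr_with_warmup(step, total_steps, peak_lr, warmup_steps):
    """Linear warmup to peak_lr, then cosine decay towards zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Values from the configuration list: ~11,000 examples, batch 2, accumulation 4.
effective_batch, steps = total_optimizer_steps(11_000, 2, 4)
print(effective_batch, steps)  # 8 1375

# warmup_steps=50 is an assumption for illustration, not a value from the card.
for step in (0, 50, 700, 1375):
    print(step, cosine_lr_with_warmup(step, steps, 2e-4, warmup_steps=50))
```

With an effective batch of 8, one pass over roughly 11,000 examples gives the 1,375 steps listed above.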
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load model and tokenizer
model_id = "yasserrmd/SciReason-LFM2-2.6B"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype="bfloat16",
    # attn_implementation="flash_attention_2"  <- uncomment on compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Generate answer
# Raw string: in a normal string literal, "\b" in "\boxed" would be parsed as a backspace escape.
prompt = r"""
Solve the following problem. Make sure to put the answer (and only answer) inside \boxed{}.
Based on analysis of multinational aeromedical databases (e.g., EASA's EMPR, FAA's CAMI database, and military longitudinal studies), which statement accurately characterizes a fundamental limitation in definitively establishing cause-and-effect relationships for cardiovascular morbidity trends among commercial aircrew?
A: Stratified sampling protocols universally eliminate survivorship bias
B: Retroactive harmonization of biochemical markers across jurisdictions enables precise meta-analysis
C: Inability to fully adjust for dominant confounding variables (e.g., socioeconomic status, undisclosed supplement use)
D: Cohort studies consistently show declining age-adjusted myocardial infarction rates compared to the general population
E: Mandatory polysomnography data provides complete correction for sleep disorder comorbidities
F: Radiation dose metrics exhibit a linear correlation with arrhythmia incidence in jet aircraft pilots
G: Genome-wide association studies have identified fully penetrant monogenic risk variants specific to aviators
H: Continuous blood pressure monitoring during all flight phases yields statistically significant longitudinal datasets
I: Pharmacokinetic interactions between hypoxia and statins are conclusively established in CRF models
J: Regulatory divergence causes morbidity rates to universally decline across all regions after 2018"""
input_ids = tokenizer.apply_chat_template(
    [
        {
            "role": "system",
            "content": """
You are a reasoning assistant.
When solving problems:
- Always place your reasoning inside think tags.
- Think in structured steps, but keep it concise (3–4 short steps maximum).
- Avoid repeating yourself or giving unnecessary background.
- Use bullet points or brief numbered steps for clarity inside think tag.
- After think end tag, provide only the final answer clearly and directly.
- Do not include reasoning outside of the think tags.
""",
        },
        {"role": "user", "content": prompt},
    ],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
).to(model.device)
output = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,
    max_new_tokens=1024,
)
print(tokenizer.decode(output[0], skip_special_tokens=False))
# output[0] holds the prompt tokens plus the generated tokens, and
# skip_special_tokens=False keeps the chat-template markers, so the printed text
# shows the system and user turns followed by the assistant turn, which should
# contain the reasoning and the final \boxed{} answer.
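Because the prompt asks for the answer inside \boxed{} and the system message asks for reasoning inside think tags, the decoded text usually needs a small post-processing step. The following is a minimal sketch under stated assumptions: the closing tag is assumed to be the literal `</think>` (the card only says "think tags"), the helper names are hypothetical, and the sample string is hand-written for illustration, not real model output.

```python
import re


def split_reasoning_and_answer(text, think_close="</think>"):
    """Split decoded text at the last closing think tag (assumed literal)."""
    head, sep, tail = text.rpartition(think_close)
    if sep:
        return head, tail
    # No closing tag found: treat the whole text as the answer part.
    return "", text


def extract_boxed(text):
    """Return the content of the last \\boxed{...}, or None if absent.

    Nested braces inside \\boxed{} are not handled by this simple pattern.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None


# Hand-written sample for illustration only; not actual model output.
sample = (
    "<think>\n- Causal claims require confounders to be controlled.\n</think>\n"
    + r"\boxed{C}<|im_end|>"
)

reasoning, answer = split_reasoning_and_answer(sample)
print(extract_boxed(answer))  # prints: C
```

Taking the last match guards against a \boxed{} that appears earlier in the text, for example when the prompt itself is echoed in the decoded output.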
Intended Use
This model is designed for:
- Scientific reasoning tasks
- Educational Q&A
- Step-by-step logical problem solving
⚠️ Disclaimer: Not intended for clinical or legal decision-making.
License
Apache-2.0 License. See LICENSE.
Acknowledgements
- LiquidAI for LFM2-2.6B
- NVIDIA for OpenScienceReasoning-2
- Unsloth for efficient fine-tuning with gradient offloading