SmolVLA Fine-tuned (VLM Frozen)

SmolVLA fine-tuned on the Bcryan/BIMAN_PICK_AND_PLACE2 dataset with the Vision-Language Model (VLM) backbone frozen, so that only the action expert is updated during training.

Model Configuration

  • freeze_vision_encoder: True
  • train_expert_only: True
  • train_state_proj: True
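
As a rough illustration of how these three flags combine, the sketch below maps them to the parts of the policy that receive gradient updates. The component names (vision_encoder, language_model, action_expert, state_proj) and the helper trainable_components are hypothetical and simplified. They are not the lerobot implementation, only one reading of what the flags mean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreezeFlags:
    # Field names mirror the configuration listed above.
    freeze_vision_encoder: bool = True
    train_expert_only: bool = True
    train_state_proj: bool = True


def trainable_components(flags: FreezeFlags) -> dict[str, bool]:
    """Return which (hypothetical) components would be trained.

    Assumption: train_expert_only freezes the whole VLM, including
    its vision encoder, regardless of freeze_vision_encoder.
    """
    return {
        "vision_encoder": not (flags.freeze_vision_encoder or flags.train_expert_only),
        "language_model": not flags.train_expert_only,
        "action_expert": True,
        "state_proj": flags.train_state_proj,
    }


if __name__ == "__main__":
    for name, trained in trainable_components(FreezeFlags()).items():
        print(f"{name}: {'trained' if trained else 'frozen'}")
```

Under this reading, the configuration above leaves both VLM parts frozen and trains the action expert together with the state projection.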

Training Details

  • Base model: lerobot/smolvla_base
  • Dataset: Bcryan/BIMAN_PICK_AND_PLACE2
  • Training steps: 70,000
  • Batch size: 16
  • VLM: Frozen (only the action expert and the state projection are trained)
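
For a sense of scale, the step count and batch size above give the number of training samples processed. This is a minimal sketch that assumes a single process and no gradient accumulation, neither of which is stated in this card. The helper samples_seen is hypothetical.

```python
def samples_seen(steps: int, batch_size: int) -> int:
    """Total samples processed, assuming one optimizer step per batch."""
    return steps * batch_size


if __name__ == "__main__":
    total = samples_seen(steps=70_000, batch_size=16)
    print(f"{total:,} samples")  # 1,120,000 samples
```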

Usage

from lerobot.common.policies.smolvla.modeling_smolvla import SmolVLAPolicy
policy = SmolVLAPolicy.from_pretrained("Autobrik/smolvla-finetunned-vlm-off")

Training Command

python lerobot/scripts/train.py \
  --policy.path=lerobot/smolvla_base \
  --dataset.repo_id=Bcryan/BIMAN_PICK_AND_PLACE2 \
  --batch_size=16 \
  --steps=70000