DeepSWE-Preview-FP8
This is an FP8 quantized version of the agentica-org/DeepSWE-Preview model. All credit for the original model goes to the Agentica team.
Description
DeepSWE-Preview-FP8 is an FP8 quantized version of the DeepSWE-Preview model, an open-source coding agent trained exclusively with reinforcement learning (RL) to excel at software engineering tasks. Built on top of Qwen3-32B with thinking mode enabled, the model demonstrates strong reasoning capabilities in navigating complex codebases and handling multiple files. This quantized version maintains the capabilities of the original while offering reduced memory requirements and faster inference.
The original model achieves 59.0% on SWE-Bench-Verified, making it #1 in the open-weights category.
Architecture
- Base Model: Qwen/Qwen3-32B
- Quantization Format: FP8 (8-bit floating point)
- Original Training Method: Pure Reinforcement Learning (no Supervised Fine-Tuning)
- Parameters: 32.8 Billion (original), quantized to FP8 precision
- Tensor Type: FP8 (quantized from F32)
- Context Length: Supports up to 65,536 tokens
Key Technical Details
- Trained with only 200 steps of RL, showing significant performance gains
- Pass@1 performance of 42.2% (averaged over 16 runs) for the original model
- Enhanced GRPO algorithm incorporating innovations from DAPO, Dr. GRPO, and LOOP/RLOO (a simplified illustration of the group-relative advantage idea follows this list)
- Uses R2E-Gym environment for training and evaluation
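The exact RL recipe is described in the DeepSWE-Preview report; the sketch below only illustrates the core GRPO-style idea of a group-relative advantage, where each rollout's reward is normalized against the other rollouts for the same problem. It is a generic illustration, not the Agentica training code, and the reward values are invented.

```python
import numpy as np

def grpo_style_advantages(rewards, eps=1e-6):
    """Group-relative advantages: compare each rollout's reward to the
    mean (and std) of all rollouts for the same problem. Variants such
    as Dr. GRPO drop the std normalization; this is a generic sketch,
    not DeepSWE's exact formulation."""
    rewards = np.asarray(rewards, dtype=np.float32)
    baseline = rewards.mean()
    return (rewards - baseline) / (rewards.std() + eps)

# Hypothetical rewards for 8 rollouts of one SWE task (1 = tests pass).
print(grpo_style_advantages([1, 0, 0, 1, 0, 0, 0, 1]))
```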
Quantization Benefits
- Reduced Memory Footprint: Lower VRAM requirements compared to the original model (see the rough weight-size estimate after this list)
- Faster Inference: Improved inference speed due to FP8 precision
- Maintained Performance: Preserves most of the original model's capabilities
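As a rough back-of-the-envelope check of weight memory only (ignoring KV cache, activations, and runtime overhead), 32.8B parameters at different precisions work out to roughly:

```python
PARAMS = 32.8e9  # parameter count from the Architecture section

for name, bytes_per_param in [("F32", 4), ("BF16/FP16", 2), ("FP8", 1)]:
    gib = PARAMS * bytes_per_param / 2**30
    print(f"{name}: ~{gib:.0f} GiB of weights")
# Prints roughly: F32 ~122 GiB, BF16/FP16 ~61 GiB, FP8 ~31 GiB
```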
Intended Use Cases
- Software Engineering Tasks: Designed primarily for coding-related activities
- Codebase Navigation: Excels at understanding and modifying complex codebases
- Multi-file Editing: Capable of viewing and editing multiple files in a project
- Automated Testing: Can execute bash commands and run test suites
- Research Foundation: Serves as a base model for developing future coding agents
Training Data
- 4.5K problems from a subset of R2E-Gym
- Filtered to avoid data contamination (e.g., removed problems from sympy repository)
- Each problem maps to an individual Docker image (a hypothetical launch sketch follows this list)
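Concretely, training or evaluating on one problem means running the agent and its test suite inside that problem's container. The snippet below is only a hypothetical illustration using the `docker` Python SDK; the image name and test command are placeholders, not actual R2E-Gym identifiers.

```python
import docker  # pip install docker

client = docker.from_env()

# Placeholder image and command; real R2E-Gym images and test
# entrypoints are problem-specific and documented in R2E-Gym itself.
IMAGE = "example/r2e-gym-problem:latest"
logs = client.containers.run(
    IMAGE,
    command="python -m pytest -x",  # run the problem's test suite
    remove=True,                    # clean up the container afterwards
)
print(logs.decode())
```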
Deployment Recommendations
- Temperature: 1.0
- Max Tokens: 32K-64K
- Serving Options: vLLM (recommended), Hugging Face TGI, SGLang, TensorRT-LLM
- Special Tools: Works with R2EGym's system prompt and tools (file_editor.py, execution_bash.py, search.py, finish.py); a minimal serving sketch follows this list
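A minimal offline-inference sketch with vLLM using the sampling settings above; the repository id is a placeholder, since this card does not state the exact FP8 checkpoint name, and settings such as `max_model_len` and `tensor_parallel_size` should be adjusted to the available GPU memory:

```python
from vllm import LLM, SamplingParams

# Placeholder repo id; substitute the actual FP8 checkpoint path/name.
MODEL_ID = "path/to/DeepSWE-Preview-FP8"

llm = LLM(
    model=MODEL_ID,
    max_model_len=65536,      # matches the supported context length
    tensor_parallel_size=2,   # adjust to the number of available GPUs
)

params = SamplingParams(temperature=1.0, max_tokens=32768)
outputs = llm.generate(["Fix the failing test in utils/date_parse.py ..."], params)
print(outputs[0].outputs[0].text)
```

For the full agent workflow with the R2EGym tools listed above, the model would typically be served as an OpenAI-compatible endpoint (for example via `vllm serve`) and driven by the R2E-Gym harness rather than called directly as shown here.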
Performance
While the FP8 quantization reduces memory requirements, the model should maintain comparable performance to the original on most software engineering tasks.
License
The model is released under the MIT License, emphasizing open and accessible AI development.
Relationship to Original Model
This model is a quantized version of agentica-org/DeepSWE-Preview. It maintains the core capabilities of the original while offering reduced memory requirements and faster inference through FP8 quantization.