|
|
--- |
|
|
base_model: |
|
|
- willcb/Qwen3-14B |
|
|
license: apache-2.0 |
|
|
datasets: |
|
|
- Danau5tin/terminal-tasks |
|
|
tags: |
|
|
- agent |
|
|
- code |
|
|
- multi-agent |
|
|
--- |
|
|
|
|
|
# Orca-Agent-v0.1 |
|
|
|
|
|
 |
|
|
|
|
|
In-depth details on the training, including the training code, are **all open-sourced [here](https://github.com/Danau5tin/Orca-Agent-RL)**.
|
|
|
|
|
## Description |
|
|
Orca-Agent-v0.1 is an orchestration agent that acts as the brain of the operation. It receives the user's task but never touches code directly. Instead, it:
|
|
|
|
|
- Analyses the task and breaks it into focused subtasks |
|
|
- Dispatches explorer agents to understand the system |
|
|
- Delegates implementation work to coder agents with precise instructions |
|
|
- Verifies all changes through additional explorer agents |
|
|
- Maintains the context store with all discovered knowledge |
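
Conceptually, the workflow above amounts to a simple loop. The snippet below is a heavily simplified, hypothetical sketch; the function names are illustrative placeholders, not the actual API of the orchestration code linked at the bottom of this card.

```python
# Hypothetical sketch of the orchestration loop described above.
# Names are illustrative placeholders, not the repo's actual API.

def run_subagent(agent_type: str, instruction: str, context: str) -> str:
    """Placeholder for dispatching an explorer or coder subagent."""
    return f"[{agent_type}] finished: {instruction}"


def orchestrate(task: str) -> list[str]:
    context_store: list[str] = []   # all discovered knowledge
    subtasks = [task]               # real subtask planning is model-driven

    for subtask in subtasks:
        # Explore first: understand the system before touching it.
        context_store.append(run_subagent("explorer", subtask, "\n".join(context_store)))
        # Delegate the actual change to a coder agent with precise instructions.
        context_store.append(run_subagent("coder", subtask, "\n".join(context_store)))
        # Verify the change with another explorer pass.
        context_store.append(run_subagent("explorer", f"verify: {subtask}", "\n".join(context_store)))

    return context_store
```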
|
|
|
|
|
## Performance |
|
|
| Orchestrator | Subagent | Terminal Bench | |
|
|
|--------------|----------|--------| |
|
|
| Qwen3-Coder-480B | Qwen3-Coder-480B | 19.7% | |
|
|
| Orca-Agent-v0.1-14B | Qwen3-Coder-30B | 18.25% | |
|
|
| Qwen3-14B | Qwen3-Coder-30B | 7.0% | |
|
|
|
|
|
This places Orca-Agent-v0.1 (14B) paired with Qwen3-Coder-Flash (30B MoE) within striking distance of [Qwen3-Coder-480B](https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct) running the same architecture, which placed #26 on Terminal-Bench when it was published recently as part of [my other project](https://github.com/Danau5tin/multi-agent-coding-system).
|
|
|
|
|
## Example model output |
|
|
```xml |
|
|
<think> |
|
|
Okay, now with task_003 identifying a backup, the next step would be to attempt recovery using this identified file. |
|
|
</think> |
|
|
|
|
|
<task_create> |
|
|
agent_type: 'coder' |
|
|
title: 'Attempt recovery using the identified backup file' |
|
|
description: | |
|
|
Execute the following steps to recover using the backup identified in task_003: |
|
|
1. Move the identified backup file (/tmp/terraform_work/.terraform.tfstate.tmp) to the target location (/infrastructure/recovered_state.json). |
|
|
2. Verify the operation by checking: |
|
|
- The existence of the new file at the target location |
|
|
- The size matches the original backup file |
|
|
- The file permissions are set appropriately (rw-r--r--) |
|
|
max_turns: 10 |
|
|
context_bootstrap: |
|
|
- path: '/tmp/terraform_work/.terraform.tfstate.tmp' |
|
|
reason: 'The backup file identified in task_003' |
|
|
context_refs: |
|
|
- 'task_003' |
|
|
</task_create> |
|
|
``` |
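
The orchestrator communicates entirely through tagged blocks like the one above. As a rough illustration, a runtime could extract the `<task_create>` payload with something like the snippet below; this is a hypothetical sketch, and the real parsing logic in the linked repo may differ.

```python
import re


def extract_task_create(model_output: str) -> dict[str, str]:
    """Pull the first <task_create> block out of a model reply and read a
    few scalar fields. Illustrative only; the actual parser may differ."""
    match = re.search(r"<task_create>(.*?)</task_create>", model_output, re.DOTALL)
    if not match:
        return {}
    body = match.group(1)

    fields: dict[str, str] = {}
    for key in ("agent_type", "title", "max_turns"):
        field_match = re.search(rf"^{key}:\s*(.+)$", body, re.MULTILINE)
        if field_match:
            fields[key] = field_match.group(1).strip().strip("'")
    return fields
```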
|
|
|
|
|
## Model training overview
|
|
- Full fine-tune of Qwen3-14B
|
|
- 32x H100s |
|
|
- 16x for training |
|
|
- 8x inference for Orca-Agent |
|
|
- 8x inference for subagent (Qwen3-Coder-30B-A3B) |
|
|
- Trained with GRPO + curriculum learning (see the sketch below)
|
|
- Batch size 256, 64 rollouts per task |
|
|
- More details [here](https://github.com/Danau5tin/Orca-Agent-RL) |
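
For context on the GRPO setup: each task's 64 rollouts form one group, and every rollout is scored against its group's statistics rather than a learned value function. A minimal sketch of that group-relative advantage computation (illustrative only; the full training loop is in the repo linked above):

```python
import numpy as np


def grpo_advantages(group_rewards: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Group-relative advantages: normalise each rollout's reward by the
    mean and std of its own group (here, the 64 rollouts of one task)."""
    return (group_rewards - group_rewards.mean()) / (group_rewards.std() + eps)


# Example: 64 rollouts of one task, rewarded 1.0 for passing the task's checks.
rewards = np.random.binomial(1, 0.3, size=64).astype(float)
advantages = grpo_advantages(rewards)  # positive for above-average rollouts
```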
|
|
|
|
|
## Serving the model
|
|
|
|
|
**vLLM** |
|
|
```bash |
|
|
vllm serve Danau5tin/Orca-Agent-v0.1 |
|
|
``` |
|
|
|
|
|
**SGLang** |
|
|
```bash |
|
|
python -m sglang.launch_server \ |
|
|
--model-path Danau5tin/Orca-Agent-v0.1 |
|
|
``` |
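
Both commands expose an OpenAI-compatible endpoint, so the model can be queried with the standard `openai` Python client. A minimal example, assuming vLLM's default port 8000 (SGLang defaults to 30000) and an illustrative prompt:

```python
from openai import OpenAI

# Point the client at the local OpenAI-compatible server started above.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Danau5tin/Orca-Agent-v0.1",
    messages=[{"role": "user", "content": "Recover the Terraform state from its backup file."}],
)
print(response.choices[0].message.content)
```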
|
|
|
|
|
The agent's orchestration code can be found [here](https://github.com/Danau5tin/multi-agent-coding-system). |