---
library_name: transformers
tags:
- legal
- summarization
license: apache-2.0
language:
- en
metrics:
- bleu
- rouge
base_model:
- VISAI-AI/Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-stage2
datasets:
- VISAI-AI/JUSTNLP2025-L-Summ-formatted
pipeline_tag: summarization
---

# JUST-NLP 2025 Shared Tasks: L-SUMM RL-r2 (rank=2) Model

One of the models submitted to the [JUST-NLP 2025 Shared Task on L-SUMM](https://exploration-lab.github.io/JUST-NLP/task/) by the 4corners team. The training code is publicly available [here](https://github.com/tann9949/justnlp-2025-legal-summ).

## Finetuning Parameters

This model was finetuned with Unsloth's GRPO pipeline using a LoRA adapter and the following hyperparameters:

- LoRA rank: 2
- LoRA alpha: 4
- LoRA modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Learning rate: 8e-5, constant schedule
- Num epochs: 1 (the model collapsed at around 550 steps)
- Global batch size and num generations (rollouts): 16
- Optimizer: adamw_8bit
- Temperature: 1.0
- Max training length: 12000
- Max gradient norm: 0.2
- GSPO enabled (importance weights aggregated at the sequence level)
- Loss type: DAPO
- Epsilon high: 0.28
- Reward functions: ROUGE-L, ROUGE-2, and BLEU (BLEU was scaled 3x because of the base model's lower BLEU score)

We use the official training data provided by the JUST-NLP Shared Task for L-SUMM, with some data filtering. The dataset and its details are available [here](https://huggingface.co/datasets/VISAI-AI/JUSTNLP2025-L-Summ-formatted).

## Results

Validation leaderboard results:

| model | Avg | ROUGE-2 | ROUGE-L | BLEU |
|:------------------------------------------------|------:|--------:|--------:|------:|
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-stage1 | 25.47 | 31.25 | 31.42 | 13.74 |
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-stage2 | 25.57 | 31.51 | 31.77 | 13.43 |

Test leaderboard results:

| model | Avg | ROUGE-2 | ROUGE-L | BLEU |
|:--------------------------------------------------------|------:|--------:|--------:|------:|
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-stage2 | 23.94 | 30.35 | 30.19 | 11.27 |
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-1step | 21.62 | 28.46 | 28.42 | 7.97 |
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-rl-r1-ckpt150 | 27.21 | 33.36 | 32.25 | 16.01 |
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-rl-r2-ckpt500 | 29.91 | 34.91 | 33.34 | 21.49 |

## Hardware Usage

We used 1x A100 80GB to finetune this model.

## Authors

Chompakorn Chaksangchaichot & Pawitsapak Akarajaradwong
`{chompakornc_pro,pawitsapaka_visai}@vistec.ac.th`
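
## Usage

The card does not include an inference snippet, so below is a minimal sketch using the standard `transformers` chat-template API. The repo id placeholder and the prompt wording are assumptions, not the official format; the exact prompt used during training is in the [training repository](https://github.com/tann9949/justnlp-2025-legal-summ).

```python
# Minimal inference sketch. Assumptions: the repo id is a placeholder, and the
# prompt wording below is illustrative -- check the training repo for the exact
# prompt format used during finetuning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "VISAI-AI/<this-model-repo-id>"  # replace with this model's Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

judgment = "..."  # full text of the legal judgment to summarize

# Build the prompt with the model's chat template (Qwen3-Instruct style).
messages = [
    {"role": "user", "content": f"Summarize the following legal judgment:\n\n{judgment}"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=1024, do_sample=False)
summary = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(summary)
```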