---
library_name: transformers
tags:
- legal
- summarization
license: apache-2.0
language:
- en
metrics:
- bleu
- rouge
base_model:
- VISAI-AI/Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-stage2
datasets:
- VISAI-AI/JUSTNLP2025-L-Summ-formatted
pipeline_tag: summarization
---

# JUST-NLP 2025 Shared Tasks: L-SUMM RL-r2 (rank=2) Model

One of the models submitted to the [JUST-NLP 2025 Shared Task on L-SUMM](https://exploration-lab.github.io/JUST-NLP/task/) by the 4corners team. The training code is publicly available [here](https://github.com/tann9949/justnlp-2025-legal-summ).

## Finetuning Parameters

This model was finetuned with Unsloth's GRPO pipeline using a LoRA adapter and the following hyperparameters:

- LoRA rank: 2
- LoRA alpha: 4
- LoRA modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Learning rate: 8e-5, constant schedule
- Num epochs: 1 (the model collapsed at around 550 steps)
- Global batch size and num generations (rollouts): 16
- Optimizer: adamw_8bit
- Temperature: 1.0
- Max training length: 12000
- Max gradient norm: 0.2
- GSPO enabled (importance weights aggregated at the sequence level)
- Loss type: DAPO
- Epsilon high: 0.28
- Reward functions: ROUGE-L, ROUGE-2, and BLEU (BLEU was scaled 3x because of the base model's lower BLEU score)

We use the official training data provided by the JUST-NLP Shared Task for L-SUMM, with some data filtering. The dataset and its details are available [here](https://huggingface.co/datasets/VISAI-AI/JUSTNLP2025-L-Summ-formatted).

## Results

Validation leaderboard results:

| model | Avg | ROUGE-2 | ROUGE-L | BLEU |
|:------------------------------------------------|------:|--------:|--------:|------:|
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-stage1 | 25.47 | 31.25 | 31.42 | 13.74 |
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-stage2 | 25.57 | 31.51 | 31.77 | 13.43 |

Test leaderboard results:

| model | Avg | ROUGE-2 | ROUGE-L | BLEU |
|:--------------------------------------------------------|------:|--------:|--------:|------:|
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-stage2 | 23.94 | 30.35 | 30.19 | 11.27 |
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-1step | 21.62 | 28.46 | 28.42 | 7.97 |
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-rl-r1-ckpt150 | 27.21 | 33.36 | 32.25 | 16.01 |
| Qwen3-4B-Instruct-2507-L-SUMM-fourcorners-rl-r2-ckpt500 | 29.91 | 34.91 | 33.34 | 21.49 |

## Hardware Usage

We used 1x A100 80GB to finetune this model.

## Authors

Chompakorn Chaksangchaichot & Pawitsapak Akarajaradwong
`{chompakornc_pro,pawitsapaka_visai}@vistec.ac.th`
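
## Usage

The card does not include an inference snippet, so below is a minimal sketch using the standard `transformers` chat-template API. The repo id placeholder and the prompt wording are assumptions, not the official format; the exact prompt used during training is in the [training repository](https://github.com/tann9949/justnlp-2025-legal-summ).

```python
# Minimal inference sketch. Assumptions: the repo id is a placeholder, and the
# prompt wording below is illustrative -- check the training repo for the exact
# prompt format used during finetuning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "VISAI-AI/<this-model-repo-id>"  # replace with this model's Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

judgment = "..."  # full text of the legal judgment to summarize

# Build the prompt with the model's chat template (Qwen3-Instruct style).
messages = [
    {"role": "user", "content": f"Summarize the following legal judgment:\n\n{judgment}"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=1024, do_sample=False)
summary = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(summary)
```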