FinRL-DAPO-GRPO-Sentiment-Risk
This repository contains the code and trained model for our FinRL Contest 2025 Task 1 submission:
"A New DAPO Algorithm for Stock Trading".
We propose a novel trading agent that integrates a Group Relative Policy Optimization (GRPO) approach with:
- Insights from the DAPO algorithm (used in LLM preference tuning),
- Sentiment and risk signals derived from LLM-extracted financial news,
- An exponentiated sentiment-risk reward function for more robust decision-making (a rough sketch is given below).
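The snippet below is a minimal illustration of the exponentiated sentiment-risk reward idea: positive news sentiment amplifies the raw portfolio return while higher extracted risk dampens it. The coefficients, score ranges, and exact functional form are assumptions for readability, not the tuned values from our submission.

import numpy as np

def sentiment_risk_reward(portfolio_return, sentiment_score, risk_score,
                          alpha=0.1, beta=0.1):
    """Scale the raw portfolio return by exponential sentiment/risk factors.

    sentiment_score: LLM-extracted news sentiment, e.g. in [-1, 1]
    risk_score:      LLM-extracted risk level,     e.g. in [0, 1]
    alpha, beta:     illustrative weights (assumed, not the tuned values)
    """
    # Positive sentiment increases the reward, higher risk reduces it.
    sentiment_factor = np.exp(alpha * sentiment_score)
    risk_factor = np.exp(-beta * risk_score)
    return portfolio_return * sentiment_factor * risk_factor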
Model Highlights
- Framework: Custom RL agent using GRPO with decoupled clipping (a loss sketch follows after this list).
- Reward: Weighted sentiment-risk adjusted return.
- Dataset: FNSPID, based on the Nasdaq-100 from 1999–2023.
- Performance (test period 2020–2023):
  - Cumulative return: 230.49%
  - Information ratio: 0.37
  - Maximum drawdown: -49.11%
  - Outperforms the CPPO-DeepSeek 10% baseline
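The decoupled clipping referenced above follows the DAPO idea of using separate lower and upper bounds on the policy importance ratio, combined with GRPO-style group-relative advantages. The sketch below assumes illustrative clip values (eps_low, eps_high) and a simple mean/std group normalisation; it is not the exact training configuration.

import torch

def grpo_decoupled_clip_loss(log_probs, old_log_probs, rewards,
                             eps_low=0.2, eps_high=0.28):
    """Clipped surrogate loss with decoupled (asymmetric) clipping bounds.

    log_probs, old_log_probs, rewards: tensors of shape (group_size,)
    for one group of sampled actions/trajectories.
    """
    # Group-relative advantage: normalise rewards within the group.
    advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)

    # Importance ratio between the current policy and the sampling policy.
    ratio = torch.exp(log_probs - old_log_probs)

    # Decoupled clipping: different lower and upper bounds on the ratio.
    clipped = torch.clamp(ratio, 1.0 - eps_low, 1.0 + eps_high)

    # Pessimistic (min) surrogate objective, averaged over the group.
    return -torch.min(ratio * advantages, clipped * advantages).mean()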
Files
- model_rl.pth: Trained PyTorch model checkpoint.
- README.md: This file.
- (Optionally add: training logs, config files, etc.)
Usage
This model is designed for research use in financial RL.
You can load the model via PyTorch:
import torch

# Load the trained checkpoint; map_location="cpu" avoids requiring a GPU.
# Recent PyTorch versions may also need weights_only=False for a pickled full model.
model = torch.load("model_rl.pth", map_location="cpu")
model.eval()
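torch.load returns whatever was saved: if the checkpoint holds only a state_dict rather than a pickled model object, the network has to be instantiated first. A minimal sketch, using a hypothetical PolicyNetwork stub whose layer sizes would need to match the real training architecture:

import torch
import torch.nn as nn

# Hypothetical architecture stub; replace the layer sizes with the ones used in training.
class PolicyNetwork(nn.Module):
    def __init__(self, state_dim=16, action_dim=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, action_dim),
        )

    def forward(self, x):
        return self.net(x)

checkpoint = torch.load("model_rl.pth", map_location="cpu")
if isinstance(checkpoint, dict):   # state_dict case
    model = PolicyNetwork()
    model.load_state_dict(checkpoint)
else:                              # full pickled model case
    model = checkpoint
model.eval()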