FinRL-DAPO-GRPO-Sentiment-Risk

This repository contains the code and trained model for our FinRL Contest 2025 Task 1 submission:
"A New DAPO Algorithm for Stock Trading".

We propose a novel trading agent that integrates a Group Relative Policy Optimization (GRPO) approach with:

  • Insights from the DAPO algorithm (used in LLM preference tuning),
  • Sentiment and risk signals derived from LLM-extracted financial news,
  • An exponentiated sentiment-risk reward function for more robust decision-making.
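
As a rough illustration of the last point, the sketch below scales a raw return by an exponential of weighted sentiment and risk signals. The function name, the weights, and the exact functional form are assumptions made for readability here, not the formula used in training.

import numpy as np

def sentiment_risk_reward(raw_return, sentiment, risk, w_s=1.0, w_r=1.0):
    # Hypothetical sketch: scale the raw portfolio return by an exponential
    # of weighted sentiment minus weighted risk, so favorable news amplifies
    # the reward and high estimated risk dampens it.
    return raw_return * np.exp(w_s * sentiment - w_r * risk)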

Model Highlights

  • Framework: Custom RL agent using GRPO with decoupled clipping (see the sketch after this list).
  • Reward: Weighted sentiment-risk adjusted return.
  • Dataset: FNSPID, based on Nasdaq-100 data from 1999–2023.
  • Performance:
    • Cumulative return: 230.49% (2020–2023)
    • Info Ratio: 0.37
    • Max Drawdown: -49.11%
    • Outperforms the CPPO-DeepSeek 10% baseline
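
Decoupled clipping here refers to replacing PPO's single clip range with separate lower and upper bounds on the policy ratio, following the DAPO "clip-higher" idea. The sketch below is illustrative only; the epsilon values and objective details are assumptions, not the settings of the submitted model.

import torch

def decoupled_clip_objective(log_prob_new, log_prob_old, advantage,
                             eps_low=0.2, eps_high=0.28):
    # Probability ratio between the updated policy and the behavior policy.
    ratio = torch.exp(log_prob_new - log_prob_old)
    # Decoupled clipping: the upper bound is allowed to be wider than the lower one.
    clipped = torch.clamp(ratio, 1.0 - eps_low, 1.0 + eps_high)
    # Pessimistic (min) surrogate objective, averaged over the sampled group.
    return torch.min(ratio * advantage, clipped * advantage).mean()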

Files

  • model_rl.pth: Trained PyTorch model checkpoint.
  • README.md: This file.
  • (Optionally add: training logs, config files, etc.)

Usage

This model is designed for research use in financial RL.
You can load the model via PyTorch:

import torch

# Load the trained checkpoint and switch the agent to inference mode.
model = torch.load("model_rl.pth")
model.eval()
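
Depending on how the checkpoint was saved, torch.load may return the full pickled agent or only a state_dict. One way to check, using only PyTorch itself:

import torch

# Load on CPU so the file can be inspected without a GPU. On recent PyTorch
# versions a fully pickled model may also need weights_only=False.
obj = torch.load("model_rl.pth", map_location="cpu")
print(type(obj))  # an nn.Module subclass -> call .eval() and use directly;
                  # a dict of tensors -> rebuild the agent class from the
                  # training code and load it via load_state_dict()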