Q-Learning Agent - Taxi-v3
A Q-learning agent trained on the Taxi-v3 environment using Gymnasium.
Training setup
- Episodes: 25000
- Learning rate: 0.7
- Gamma: 0.95
- Epsilon decay: 0.005
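
As a rough illustration of how these hyperparameters typically enter tabular Q-learning, here is a minimal sketch. The learning rate, gamma, and decay rate come from the list above. The exponential form of the epsilon schedule and the `MAX_EPSILON` / `MIN_EPSILON` bounds are assumptions, since this card does not state them, and the helper names `epsilon_for_episode` and `q_update` are hypothetical.

```python
import math

LEARNING_RATE = 0.7   # from the training setup above
GAMMA = 0.95          # from the training setup above
DECAY_RATE = 0.005    # from the training setup above
MAX_EPSILON = 1.0     # assumption: not stated in this card
MIN_EPSILON = 0.05    # assumption: not stated in this card


def epsilon_for_episode(episode: int) -> float:
    """Exploration rate for a given episode (assumed exponential decay)."""
    return MIN_EPSILON + (MAX_EPSILON - MIN_EPSILON) * math.exp(-DECAY_RATE * episode)


def q_update(q_table, state, action, reward, next_state):
    """Apply one tabular Q-learning update in place and return the new value."""
    best_next = max(q_table[next_state])      # greedy value of the next state
    td_target = reward + GAMMA * best_next    # bootstrapped target
    q_table[state][action] += LEARNING_RATE * (td_target - q_table[state][action])
    return q_table[state][action]


# Taxi-v3 has 500 discrete states and 6 discrete actions.
q_table = [[0.0] * 6 for _ in range(500)]
new_value = q_update(q_table, state=0, action=1, reward=-1.0, next_state=2)
```

Under this assumed schedule, epsilon starts near 1.0 and approaches the lower bound well before the 25000 training episodes are over, so most of training would run with near-greedy action selection.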
Evaluation
- Mean reward: 8.30 ± 2.45
- Success rate: 100.00%
- Episodes evaluated: 20
- Max steps per episode: 99
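
The figures above could be reproduced with a greedy evaluation loop along these lines. This is a sketch, not the script that produced the numbers: the function name `evaluate` is hypothetical, the spread is assumed to be a population standard deviation, and an episode is assumed to count as a success when the environment reports `terminated` (in Taxi-v3, a completed drop-off) within the step limit. The `env` argument is expected to follow the Gymnasium `reset` / `step` interface.

```python
import statistics


def evaluate(env, q_table, n_episodes=20, max_steps=99):
    """Run greedy episodes; return (mean reward, std of reward, success rate)."""
    rewards, successes = [], 0
    for _ in range(n_episodes):
        state, _info = env.reset()
        total = 0.0
        for _ in range(max_steps):
            # Greedy action: index of the largest Q-value for this state.
            action = max(range(len(q_table[state])), key=lambda a: q_table[state][a])
            state, reward, terminated, truncated, _info = env.step(action)
            total += reward
            if terminated:
                successes += 1  # assumption: termination means a successful drop-off
            if terminated or truncated:
                break
        rewards.append(total)
    # Population standard deviation (assumption about how the spread was computed).
    return statistics.fmean(rewards), statistics.pstdev(rewards), successes / n_episodes
```

With Gymnasium installed, `env` could be created with `gymnasium.make("Taxi-v3")` and `q_table` loaded from the trained model; how the table is stored in this repository is not specified here.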
Demo
https://huggingface.co/jAbreu24/q-Taxi-v3-optimized/resolve/main/replay.mp4
Model trained and uploaded automatically on 2025-11-05 23:58:52.