McGill-NLP's Collections
The Markovian Thinker
Updated Oct 9
Reformulating reinforcement learning (RL) for reasoning LLMs through the Markovian Thinking paradigm.
McGill-NLP/delethink-24k-1.5b • 2B • Updated Oct 9 • 190 • 5
McGill-NLP/longcot-24k-1.5b • 2B • Updated Oct 9 • 151 • 1
McGill-NLP/longcot-8k-1.5b • 2B • Updated Oct 9 • 7
McGill-NLP/delethink-96k-base-1.5b • 2B • Updated Oct 3 • 1 • 1
McGill-NLP/openmath-filtered • Viewer • Updated Oct 3 • 200k • 82
McGill-NLP/delethink-96k-1.5b • 2B • Updated Oct 9 • 9 • 3
The Markovian Thinker • Paper • 2510.06557 • Published Oct 8 • 30