McGill-NLP's Collections

The Markovian Thinker

updated 18 days ago

Reformulating RL for reasoning LLMs through the Markovian Thinking paradigm.


  • McGill-NLP/delethink-24k-1.5b

    2B • Updated 18 days ago • 497 • 5

  • McGill-NLP/longcot-24k-1.5b

    2B • Updated 18 days ago • 27 • 1

  • McGill-NLP/longcot-8k-1.5b

    2B • Updated 18 days ago • 19

  • McGill-NLP/delethink-96k-base-1.5b

    2B • Updated 25 days ago • 13 • 1

  • McGill-NLP/openmath-filtered

    Viewer • Updated 25 days ago • 200k • 95

  • McGill-NLP/delethink-96k-1.5b

    2B • Updated 18 days ago • 36 • 3

  • The Markovian Thinker

    Paper • 2510.06557 • Published 20 days ago • 29