Piotr Wilkin

ilintar

https://buymeacoffee.com/ilintar

AI & ML interests

Doing some model adaptation stuff for llama.cpp / random projects here and there

Recent Activity

updated a model about 16 hours ago

ilintar/MiniMax-M2-GGUF

updated a model about 16 hours ago

ilintar/Qwen3-Nemotron-32B-160k-GGUF

published a model about 17 hours ago

ilintar/Qwen3-Nemotron-32B-160k-GGUF

updated 2 models about 16 hours ago

ilintar/MiniMax-M2-GGUF

229B • Updated about 16 hours ago • 7 • 1

ilintar/Qwen3-Nemotron-32B-160k-GGUF

33B • Updated about 16 hours ago • 412 • 1

published a model about 17 hours ago

ilintar/Qwen3-Nemotron-32B-160k-GGUF

33B • Updated about 16 hours ago • 412 • 1

published a model 2 days ago

ilintar/MiniMax-M2-GGUF

229B • Updated about 16 hours ago • 7 • 1

reacted to onekq's post with 👍 4 days ago

Post

Context rot is such a catchy phrase, but the problem was identified 2+ years ago and called attention decay.
Lost in the Middle: How Language Models Use Long Contexts (2307.03172)

I spotted the same problem in coding tasks and documented it in my book (https://www.amazon.com/dp/9999331130).

Why did this problem become hot again? Because many of us thought long-context models had solved it, which is not true.

Here we were misled by benchmarks. Most long-context benchmarks are built around the QA scenario, i.e. "finding a needle in a haystack". But in agentic scenarios, the model needs to find EVERYTHING in the haystack, and just can't afford enough attention for this challenge.