AgentRank-Small: Embedding Model for AI Agent Memory Retrieval
AgentRank is the first embedding model family specifically designed for AI agent memory retrieval. Unlike general-purpose embedders, AgentRank encodes temporal context, memory type, and importance, all of which matter for agents that need to recall past interactions.
Key Results
| Model | MRR | Recall@1 | Recall@5 | NDCG@10 |
|---|---|---|---|---|
| AgentRank-Small | 0.6375 | 0.4460 | 0.9740 | 0.6797 |
| all-MiniLM-L6-v2 | 0.5297 | 0.3720 | 0.7520 | 0.6370 |
| all-mpnet-base-v2 | 0.5351 | 0.3660 | 0.7960 | 0.6335 |
AgentRank-Small improves MRR by about 20% relative to the base all-MiniLM-L6-v2 model (0.6375 vs. 0.5297).
Why AgentRank?
AI agents need memory that understands:
| Challenge | General Embedders | AgentRank |
|---|---|---|
| "What did I say yesterday?" | No temporal awareness | Temporal embeddings |
| "What's my preference?" | Mixes preferences with events | Memory type awareness |
| "What's most important?" | No notion of priority | Importance prediction |
Installation
pip install transformers torch
Usage
Basic Usage
from transformers import AutoModel, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModel.from_pretrained("vrushket/agentrank-small")
tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-small")

def encode(texts):
    """Return L2-normalized, mean-pooled embeddings for a list of strings."""
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    embeddings = outputs.last_hidden_state.mean(dim=1)
    embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)
    return embeddings

# Encode memories and query
memories = [
    "User prefers Python over JavaScript",
    "User asked about machine learning yesterday",
    "User is working on a web project",
]
query = "What programming language does the user like?"
memory_embeddings = encode(memories)
query_embedding = encode([query])

# Compute cosine similarities (embeddings are already L2-normalized)
similarities = torch.mm(query_embedding, memory_embeddings.T)
print(f"Most relevant: {memories[similarities.argmax().item()]}")
# Output: "User prefers Python over JavaScript"
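The `encode` helper above averages over every token position, padding included. A common alternative for MiniLM-based sentence encoders is to mask out padding before averaging. The sketch below shows that variant on dummy tensors, together with a top-k ranking step. The names `masked_mean_pool` and `top_k` are illustrative, and whether AgentRank was trained with masked pooling is an assumption that this card does not confirm.

```python
import torch
import torch.nn.functional as F


def masked_mean_pool(last_hidden_state, attention_mask):
    """Average token vectors, counting only positions where attention_mask == 1."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(dim=1)                   # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1e-9)                         # avoid divide-by-zero
    return summed / counts


def top_k(query_embedding, memory_embeddings, k=2):
    """Return (indices, scores) of the k memories most similar to the query."""
    query_embedding = F.normalize(query_embedding, p=2, dim=1)
    memory_embeddings = F.normalize(memory_embeddings, p=2, dim=1)
    scores = query_embedding @ memory_embeddings.T                   # cosine similarity
    k = min(k, memory_embeddings.shape[0])
    best = torch.topk(scores[0], k=k)
    return best.indices.tolist(), best.values.tolist()


# Dummy batch: 2 sequences, 3 token positions, hidden size 2.
hidden = torch.tensor([
    [[1.0, 0.0], [3.0, 0.0], [100.0, 100.0]],   # third position is padding
    [[0.0, 2.0], [0.0, 4.0], [0.0, 6.0]],       # no padding
])
mask = torch.tensor([[1, 1, 0], [1, 1, 1]])

pooled = masked_mean_pool(hidden, mask)          # padding row is ignored
indices, scores = top_k(torch.tensor([[1.0, 0.0]]), pooled, k=2)
```

Inside `encode`, this would amount to passing `outputs.last_hidden_state` and `inputs["attention_mask"]` to `masked_mean_pool` in place of the plain `.mean(dim=1)` call.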
With Temporal & Memory Type Metadata (Full Power)
# Full AgentRank features, including temporal awareness, require the agentrank package:
# pip install agentrank (coming soon; not yet released)
from agentrank import AgentRankEmbedder

model = AgentRankEmbedder.from_pretrained("vrushket/agentrank-small")

# Encode with metadata
embedding = model.encode(
    "User mentioned they prefer morning meetings",
    days_ago=3,              # Memory is 3 days old
    memory_type="semantic",  # It's a preference, not an event
)
Architecture
AgentRank-Small is based on all-MiniLM-L6-v2 with novel additions:
   ┌─────────────────────────────────────────┐
   │  MiniLM Transformer Encoder (6 layers)  │
   └─────────────────────────────────────────┘
                        │
       ┌────────────────┼────────────────┐
       │                │                │
┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│  Temporal   │  │   Memory    │  │ Importance  │
│  Position   │  │    Type     │  │ Prediction  │
│    Embed    │  │    Embed    │  │    Head     │
└─────────────┘  └─────────────┘  └─────────────┘
       │                │                │
       └────────────────┼────────────────┘
                        │
             ┌─────────────────────┐
             │    L2 Normalized    │
             │  384-dim Embedding  │
             └─────────────────────┘
Novel Features:
- Temporal Position Embeddings: 10 learnable buckets (today, 1-3 days, week, month, etc.)
- Memory Type Embeddings: Episodic, Semantic, Procedural
- Importance Prediction Head: Auxiliary task during training
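As a rough illustration of how these three components might attach to the encoder output, here is a minimal sketch. The bucket boundaries, the choice to add the metadata embeddings to the pooled sentence vector, the sigmoid on the importance head, and all names (`days_to_bucket`, `AgentRankHeads`, `MEMORY_TYPES`, `BUCKET_UPPER_BOUNDS`) are assumptions for illustration. The card itself only states that there are 10 temporal buckets, three memory types, and an auxiliary importance head.

```python
import bisect

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical upper bounds (in days) for the first 9 buckets; older memories
# fall into bucket 9. The real boundaries are not given in the model card.
BUCKET_UPPER_BOUNDS = [0, 3, 7, 14, 30, 90, 180, 365, 730]
MEMORY_TYPES = {"episodic": 0, "semantic": 1, "procedural": 2}


def days_to_bucket(days_ago):
    """Map a memory's age in days to a bucket index in [0, 9]."""
    return bisect.bisect_left(BUCKET_UPPER_BOUNDS, max(days_ago, 0))


class AgentRankHeads(nn.Module):
    """Adds temporal and memory-type embeddings to a pooled sentence vector."""

    def __init__(self, hidden_size=384, num_buckets=10, num_types=3):
        super().__init__()
        self.temporal = nn.Embedding(num_buckets, hidden_size)
        self.memory_type = nn.Embedding(num_types, hidden_size)
        self.importance = nn.Linear(hidden_size, 1)

    def forward(self, pooled, bucket_ids, type_ids):
        # Assumption: metadata embeddings are summed with the text embedding.
        combined = pooled + self.temporal(bucket_ids) + self.memory_type(type_ids)
        embedding = F.normalize(combined, p=2, dim=1)   # L2-normalized, 384-dim
        importance = torch.sigmoid(self.importance(combined)).squeeze(-1)
        return embedding, importance


heads = AgentRankHeads()
pooled = torch.randn(2, 384)  # stand-in for the MiniLM encoder output
bucket_ids = torch.tensor([days_to_bucket(0), days_to_bucket(3)])
type_ids = torch.tensor([MEMORY_TYPES["episodic"], MEMORY_TYPES["semantic"]])
embedding, importance = heads(pooled, bucket_ids, type_ids)
print(embedding.shape, importance.shape)
```

Here the importance score is returned alongside the embedding and does not feed into it, which matches its description as an auxiliary training task.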
Training
- Dataset: 500K synthetic agent memory samples
- Memory Types: Episodic (40%), Semantic (35%), Procedural (25%)
- Loss: Multiple Negatives Ranking Loss + Importance MSE
- Hard Negatives: 5 types (temporal, type confusion, topic drift, etc.)
- Hardware: NVIDIA RTX 6000 Ada (48GB) with FP16
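The training objective can be sketched as follows, assuming the usual in-batch formulation of Multiple Negatives Ranking Loss: cross-entropy over scaled cosine similarities, where each query's positive is the memory at the same index and the other memories in the batch act as negatives. The `scale` of 20.0, the `importance_weight` of 0.1, and the function name `agentrank_loss` are illustrative assumptions, not values taken from this card.

```python
import torch
import torch.nn.functional as F


def agentrank_loss(query_emb, memory_emb, importance_pred, importance_target,
                   scale=20.0, importance_weight=0.1):
    """In-batch ranking loss plus a weighted MSE term on predicted importance."""
    query_emb = F.normalize(query_emb, p=2, dim=1)
    memory_emb = F.normalize(memory_emb, p=2, dim=1)
    # scores[i, j] = scaled cosine similarity between query i and memory j
    scores = scale * (query_emb @ memory_emb.T)
    # The matching memory for query i sits at column i
    labels = torch.arange(scores.shape[0], device=scores.device)
    ranking_loss = F.cross_entropy(scores, labels)
    importance_loss = F.mse_loss(importance_pred, importance_target)
    return ranking_loss + importance_weight * importance_loss


torch.manual_seed(0)
memory_emb = torch.randn(4, 384)
aligned_queries = memory_emb + 0.01 * torch.randn(4, 384)   # near-copies of the positives
shuffled_queries = aligned_queries.roll(shifts=1, dims=0)   # positives deliberately misaligned
importance = torch.tensor([0.9, 0.2, 0.5, 0.7])

loss_aligned = agentrank_loss(aligned_queries, memory_emb, importance, importance)
loss_shuffled = agentrank_loss(shuffled_queries, memory_emb, importance, importance)
print(f"aligned: {loss_aligned.item():.4f}, shuffled: {loss_shuffled.item():.4f}")
```

The loss is much lower when queries line up with their positives. Mined hard negatives, such as the types listed above, would typically be appended as extra columns of `memory_emb` so that they compete with the positive in the same softmax.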
Benchmarks
Evaluated on AgentMemBench (500 test samples, 8 candidates each):
| Metric | AgentRank-Small | all-MiniLM-L6-v2 | Relative Improvement |
|---|---|---|---|
| MRR | 0.6375 | 0.5297 | +20.4% |
| Recall@1 | 0.4460 | 0.3720 | +19.9% |
| Recall@5 | 0.9740 | 0.7520 | +29.5% |
| NDCG@10 | 0.6797 | 0.6370 | +6.7% |
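For reference, the metrics in this table can be computed from the 1-based rank at which the correct memory appears among the candidates. The sketch below assumes one relevant memory per query, which this card does not state explicitly. The function names and the toy ranks are illustrative.

```python
import math


def mrr(ranks):
    """Mean reciprocal rank of the correct memory across queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def recall_at_k(ranks, k):
    """Fraction of queries whose correct memory is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def ndcg_at_k(ranks, k):
    """NDCG@k with one relevant item per query (so the ideal DCG is 1)."""
    return sum(1.0 / math.log2(r + 1) for r in ranks if r <= k) / len(ranks)


def relative_improvement(new, baseline):
    """Percentage gain of `new` over `baseline`."""
    return 100.0 * (new - baseline) / baseline


# Toy example: rank of the correct memory for four queries
ranks = [1, 2, 4, 1]
print(f"MRR: {mrr(ranks):.4f}")                                   # MRR: 0.6875
print(f"Recall@1: {recall_at_k(ranks, 1):.2f}")                   # Recall@1: 0.50
print(f"MRR gain: {relative_improvement(0.6375, 0.5297):.1f}%")   # MRR gain: 20.4%
```

The last line reproduces the +20.4% MRR figure from the reported scores, which shows that the Improvement column is relative, not an absolute difference.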
Coming Soon
- AgentRank-Base: 110M parameters, expected to improve on AgentRank-Small
- AgentRank-Reranker: Cross-encoder for top-k refinement
- Python Package: pip install agentrank
Citation
@misc{agentrank2024,
author = {Vrushket More},
title = {AgentRank: Embedding Models for AI Agent Memory Retrieval},
year = {2024},
publisher = {HuggingFace},
url = {https://huggingface.co/vrushket/agentrank-small}
}
License
Apache 2.0, free for commercial use.
Acknowledgments
Built on top of sentence-transformers and MiniLM.