Dennis

denniscraandijk · DennisCraandijk

AI & ML interests

None yet

Recent Activity

liked a model 5 days ago

utter-project/EuroLLM-9B

liked a Space 6 days ago

lightonai/LightOnOCR-1B-Demo

liked a Space 9 days ago

UWV/wimbert-space

upvoted a collection 2 months ago

Apertus LLM

Democratizing Open and Compliant LLMs for Global Language Environments: 8B and 70B open-data open-weights models, multilingual in >1000 languages • 4 items • Updated Oct 1 • 292

upvoted a paper 7 months ago

SuperBPE: Space Travel for Language Models

Paper • 2503.13423 • Published Mar 17 • 13

upvoted 4 papers 8 months ago

Gemini Embedding: Generalizable Embeddings from Gemini

Paper • 2503.07891 • Published Mar 10 • 44

EuroBERT: Scaling Multilingual Encoders for European Languages

Paper • 2503.05500 • Published Mar 7 • 79

Training Sparse Mixture Of Experts Text Embedding Models

Paper • 2502.07972 • Published Feb 11 • 8

It's All in The [MASK]: Simple Instruction-Tuning Enables BERT-like Masked Language Models As Generative Classifiers

Paper • 2502.03793 • Published Feb 6 • 4

upvoted a collection 8 months ago

Babel

Open Multilingual Large Language Models Serving Over 90% of Global Speakers • 5 items • Updated Apr 15 • 18

upvoted a paper 8 months ago

Rank1: Test-Time Compute for Reranking in Information Retrieval

Paper • 2502.18418 • Published Feb 25 • 28

upvoted 2 collections 9 months ago

Nomic Embed v2

Multilingual Embedding Models • 5 items • Updated Apr 30 • 20

Tulu 3 Models

All models released with Tulu 3 -- state of the art open post-training recipes. • 11 items • Updated Apr 30 • 103

upvoted a paper 9 months ago

SPLADE-v3: New baselines for SPLADE

Paper • 2403.06789 • Published Mar 11, 2024 • 5

upvoted 2 collections 10 months ago

Lychee-KaLM-embedding

16 items • Updated 5 days ago • 25

Granite 3.1 Language Models

A series of language models with 128K context length, trained by IBM and licensed under Apache 2.0. • 9 items • Updated 2 days ago • 67

upvoted 2 collections 12 months ago

Common Corpus

Largest multilingual pretraining data. • 1 item • Updated Nov 13, 2024 • 12

POTION

These are the flagship POTION models. Load them and use them with model2vec (https://github.com/MinishLab/model2vec) or sentence-transformers • 6 items • Updated May 23 • 13

upvoted 2 papers about 1 year ago

Contextual Document Embeddings

Paper • 2410.02525 • Published Oct 3, 2024 • 24

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3, 2024 • 83

upvoted 3 papers over 1 year ago

Improving Text Embeddings with Large Language Models

Paper • 2401.00368 • Published Dec 31, 2023 • 82

GROVE: A Retrieval-augmented Complex Story Generation Framework with A Forest of Evidence

Paper • 2310.05388 • Published Oct 9, 2023 • 4

Weaver: Foundation Models for Creative Writing

Paper • 2401.17268 • Published Jan 30, 2024 • 45