GenKnowSub: Improving Modularity and Reusability of LLMs through General Knowledge Subtraction Paper • 2505.10939 • Published May 16, 2025 • 3
SearchInstruct: Enhancing Domain Adaptation via Retrieval-Based Instruction Dataset Creation Paper • 2509.10708 • Published Sep 12, 2025 • 17
Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models Paper • 2506.18945 • Published Jun 23, 2025 • 40
🧠 Reasoning datasets Collection • Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 174
Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM Article • Published Mar 12, 2025 • 471
Training and Finetuning Embedding Models with Sentence Transformers v3 Article • Published May 28, 2024 • 259
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response Paper • 2412.14922 • Published Dec 19, 2024 • 88
Law of the Weakest Link: Cross Capabilities of Large Language Models Paper • 2409.19951 • Published Sep 30, 2024 • 54
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale Paper • 2409.17115 • Published Sep 25, 2024 • 63
Instruction Following without Instruction Tuning Paper • 2409.14254 • Published Sep 21, 2024 • 30
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • 2409.12183 • Published Sep 18, 2024 • 39
Fine-tuning LLMs to 1.58bit: extreme quantization made easy Article • Published Sep 18, 2024 • 272
Welcome Falcon Mamba: The first strong attention-free 7B model Article • Published Aug 12, 2024 • 113
Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published Jun 28, 2024 • 104
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20, 2024 • 95