I was quite confused by the claim that `linear` corresponds to the task arithmetic paper.
That paper operates on the weight difference between the base and fine-tuned (FT) models, so when the FT model was trained with LoRA, the difference is a vector of dimension `(num_params,)` with (see the sketch after this list):
- 0s on modules that were not LoRAed
- `(BA).flatten()` on modules that got a LoRA
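A minimal sketch of what that task vector looks like, assuming a toy two-module model where only one module gets a LoRA (the module names, shapes, and rank are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: two weight matrices, only the first has a LoRA.
shapes = {"attn.W_q": (8, 8), "mlp.W_in": (8, 16)}
rank = 2

# LoRA factors for the adapted module (made-up values).
B = rng.normal(size=(8, rank))
A = rng.normal(size=(rank, 8))

# Task-arithmetic vector: flatten the *full* weight delta of every module.
delta_chunks = []
for name, shape in shapes.items():
    if name == "attn.W_q":              # LoRAed module: delta is BA
        delta_chunks.append((B @ A).flatten())
    else:                               # non-LoRAed module: delta is 0
        delta_chunks.append(np.zeros(np.prod(shape)))

task_vector = np.concatenate(delta_chunks)  # shape: (num_params,)
print(task_vector.shape)                    # (192,) here: 8*8 + 8*16
```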
The merging from that paper then corresponds to CAT, not linear, in my opinion: summing the full deltas gives `sum_i B_i A_i`, which is exactly what concatenating the factors computes, since `[B_1 ... B_n] @ [A_1; ...; A_n] = sum_i B_i A_i`. Treating the flattened LoRA weights themselves (the raw `A_i`/`B_i` parameters) as the task vector doesn't make sense in this context.
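A quick numeric check of that claim, assuming `linear` means summing the LoRA factors directly while CAT concatenates them along the rank dimension (toy shapes, uniform merge weights):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 8, 2, 3  # hypothetical dims: width, LoRA rank, number of adapters

Bs = [rng.normal(size=(d, r)) for _ in range(n)]
As = [rng.normal(size=(r, d)) for _ in range(n)]

# Task arithmetic on full weight deltas: sum_i B_i A_i
task_arith = sum(B @ A for B, A in zip(Bs, As))

# CAT: concatenate factors along the rank dimension -> one rank n*r adapter
B_cat = np.concatenate(Bs, axis=1)  # (d, n*r)
A_cat = np.concatenate(As, axis=0)  # (n*r, d)
cat = B_cat @ A_cat                 # equals sum_i B_i A_i exactly

# linear (as I read it): sum the factors first, then multiply
lin = sum(Bs) @ sum(As)             # = sum_{i,j} B_i A_j, with cross terms

print(np.allclose(cat, task_arith))  # True
print(np.allclose(lin, task_arith))  # False (extra B_i A_j cross terms)
```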