Commit 44c45ab (verified) by YanLabs · 1 Parent(s): 3ff1da3

Update README.md

Files changed (1)
  1. README.md +40 -19
README.md CHANGED
@@ -1,44 +1,65 @@
  ---
- license: mit
+ license: gemma
  base_model:
  - google/gemma-3-27b-it
  pipeline_tag: text-generation
  ---
- # Model Card for YanLabs/gemma3-27b-it-abliterated-normpreserve
 
- This is a abliterated version of google/gemma-3-27b-it, using norm-preserving technique.
- Please refer to: https://github.com/jim-plus/llm-abliteration
+
+ # Gemma 3 27B Instruct - Norm-Preserving Abliterated
+
+ This is an abliterated version of [google/gemma-3-27b-it](https://huggingface.co/google/gemma-3-27b-it) using the norm-preserving biprojected abliteration technique.
+
+ **⚠️ Warning**: Safety guardrails and refusal mechanisms have been removed through abliteration. This model may generate harmful content and is intended for mechanistic interpretability research only.
 
  ## Model Details
 
  ### Model Description
 
- This is a abliterated version of google/gemma-3-27b-it, using norm-preserving technique.
+ This model applies **norm-preserving biprojected abliteration** to remove refusal behaviors while preserving the model's original capabilities. The technique surgically removes "refusal directions" from the model's activation space without traditional fine-tuning.
 
+ - **Developed by**: YanLabs
+ - **Model type**: Causal Language Model (Transformer)
+ - **License**: Gemma Terms of Use
+ - **Base model**: [google/gemma-3-27b-it](https://huggingface.co/google/gemma-3-27b-it)
 
- - **Developed by:** YanLabs
- - **Model type:** Transformer-Text Generation
- - **License:** MIT
- - **Finetuned from model [optional]:** google/gemma-3-27b-it
+ ### Model Sources
 
- ### Model Sources [optional]
+ - **Base Model**: [google/gemma-3-27b-it](https://huggingface.co/google/gemma-3-27b-it)
+ - **Abliteration Tool**: [jim-plus/llm-abliteration](https://github.com/jim-plus/llm-abliteration)
+ - **Paper**: [Norm-Preserving Biprojected Abliteration](https://huggingface.co/blog/grimjim/norm-preserving-biprojected-abliteration)
 
+ ## Uses
 
- - **Repository:** google/gemma-3-27b-it
- - **Paper(from jim-plus):** [Norm-Preserving Biprojected Abliteration](https://huggingface.co/blog/grimjim/norm-preserving-biprojected-abliteration)
+ ### Intended Use
 
- ## Uses
+ - **Research**: Mechanistic interpretability studies
+ - **Analysis**: Understanding LLM safety mechanisms
+ - **Development**: Testing abliteration techniques
+
+ ### Out-of-Scope Use
 
- Security measures of the model have been removed. For research use only.
+ - ❌ Production deployments
+ - ❌ User-facing applications
+ - ❌ Generating harmful content for malicious purposes
+
+ ## Limitations
+
+ - Abliteration does not guarantee complete removal of all refusals
+ - May generate unsafe or harmful content
+ - Model behavior may be unpredictable in edge cases
+ - No explicit harm prevention mechanisms remain
+
+ ## Citation
 
- ## Citation:
  If you use this model in your research, please cite:
+
+ ```bibtex
  @misc{gemma3-27b-abliterated,
  author = {YanLabs},
  title = {Gemma 3 27B Instruct - Norm-Preserving Abliterated},
  year = {2025},
  publisher = {HuggingFace},
- howpublished = {\url{https://huggingface.co/YanLabs/gemma3-27b-it-abliterated-normpreserve}}
- }
-
-
+ howpublished = {\url{https://huggingface.co/YanLabs/gemma3-27b-it-abliterated-normpreserve}},
+ note = {Abliterated using norm-preserving biprojected technique}
+ }
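
The Model Description added in this commit says the technique removes "refusal directions" from the model's activation space without fine-tuning. As a rough illustration of that idea only, the sketch below projects one direction out of a small weight matrix and then rescales each column back to its original L2 norm. It is not the implementation in jim-plus/llm-abliteration and does not cover the "biprojected" part of the method; the function name `ablate_direction`, the choice of per-column rescaling, and the toy matrix are all assumptions made for this example.

```python
import math


def ablate_direction(weight, direction):
    """Hypothetical helper: remove `direction` from the output space of `weight`.

    `weight` is a list of rows (one row per output dimension) and `direction`
    has one entry per output dimension. Each column is projected onto the
    subspace orthogonal to `direction`, then rescaled to its original L2 norm
    (an assumed way of "preserving norms", chosen for illustration).
    """
    length = math.sqrt(sum(d * d for d in direction))
    unit = [d / length for d in direction]
    n_out, n_in = len(weight), len(weight[0])
    result = [row[:] for row in weight]

    for j in range(n_in):
        column = [weight[i][j] for i in range(n_out)]
        old_norm = math.sqrt(sum(c * c for c in column))

        # Subtract the component of the column that lies along the direction.
        dot = sum(unit[i] * column[i] for i in range(n_out))
        projected = [column[i] - dot * unit[i] for i in range(n_out)]

        # Rescale so the column keeps its original magnitude.
        new_norm = math.sqrt(sum(p * p for p in projected))
        scale = old_norm / new_norm if new_norm > 0 else 0.0
        for i in range(n_out):
            result[i][j] = projected[i] * scale

    return result


# Toy example: a 3x2 matrix and a direction along the third output axis.
toy_weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
ablated = ablate_direction(toy_weight, [0.0, 0.0, 2.0])
```

Scaling a column that is already orthogonal to the direction keeps it orthogonal, so in this sketch the rescaled matrix still writes nothing along the removed direction while each column keeps its original norm.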