warshanks/Josiefied-Qwen3-4B-Instruct-2507-gabliterated-v1-AWQ

	---
	base_model:
	- Goekdeniz-Guelmez/Josiefied-Qwen3-4B-Instruct-2507-gabliterated-v1
	tags:
	- chat
	pipeline_tag: text-generation
	---
	# JOSIEFIED Model Family

	The JOSIEFIED model family is a series of language models built on established architectures such as Alibaba’s Qwen2/2.5/3, Google’s Gemma3, and Meta’s LLaMA3/4. Covering sizes from 0.5B to 32B parameters, these models have been significantly modified (“gabliterated”) and further fine-tuned to maximize uncensored behavior without compromising tool usage or instruction-following abilities.

	Despite their rebellious spirit, the JOSIEFIED models often outperform their base counterparts on standard benchmarks — delivering both raw power and utility.
	These models are intended for advanced users who require unrestricted, high-performance language generation.

	## Model Card for Goekdeniz-Guelmez/Josiefied-Qwen3-4B-Instruct-2507-gabliterated-v1

	### Model Description

	Introducing Josiefied-Qwen3-4B-Instruct-2507-gabliterated-v1, a new addition to the JOSIEFIED family — fine-tuned and gabliterated with a focus on openness and instruction alignment.

	### Gabliteration

	With this model series, I introduce Gabliteration, a novel neural weight modification technique that goes beyond traditional abliteration methods by using adaptive multi-directional projections with regularized layer selection.
	Gabliteration addresses a fundamental limitation of existing abliteration methods: they compromise model quality while attempting to modify specific behavioral patterns.

	#### Technical Background

	Building upon the foundational work of Arditi et al. (2024) on single-direction abliteration, Gabliteration extends that approach to a multi-directional framework with theoretical guarantees. My method applies singular value decomposition to difference matrices between harmful and harmless prompt representations to extract multiple refusal directions.
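
The multi-direction idea described above can be illustrated with a small NumPy sketch. This is not the Gabliteration implementation: it only shows how an SVD of a difference matrix yields several orthonormal directions and how those directions could be projected out of a weight matrix. The function names, the `scale` parameter, the assumption that harmful and harmless activations come in equal-sized pairs, and the synthetic data are all hypothetical; adaptive scaling and regularized layer selection are not modeled.

```python
import numpy as np


def extract_refusal_directions(harmful, harmless, k):
    """Return k orthonormal directions, shape (k, d).

    harmful, harmless: hidden-state matrices of shape (n, d).
    Assumption: rows are paired, so a row-wise difference is meaningful.
    """
    diff = harmful - harmless  # (n, d) difference matrix
    # Rows of vt are right singular vectors, sorted by singular value.
    _, _, vt = np.linalg.svd(diff, full_matrices=False)
    return vt[:k]


def project_out(weight, directions, scale=1.0):
    """Remove the span of `directions` from the output space of `weight`.

    weight: (d, d_in) matrix whose outputs live in the hidden space.
    directions: (k, d) with orthonormal rows.
    scale: hypothetical strength knob; 1.0 removes the components fully.
    """
    projector = directions.T @ directions  # (d, d) projector onto the span
    return weight - scale * (projector @ weight)


# Synthetic demo: the "harmful" states differ from the "harmless" ones
# only along the first two coordinate axes.
rng = np.random.default_rng(0)
n, d, k = 32, 16, 2
harmless = rng.normal(size=(n, d))
shift = np.zeros((n, d))
shift[:, 0] = 3.0 * rng.normal(size=n)
shift[:, 1] = 2.0 * rng.normal(size=n)
harmful = harmless + shift

directions = extract_refusal_directions(harmful, harmless, k)
weight = rng.normal(size=(d, d))
ablated = project_out(weight, directions)

# After projection, the weight has no output component along any direction.
print(np.abs(directions @ ablated).max())
```

With `scale=1.0` the projection is exact, so the printed value is at floating-point noise level. A smaller `scale` would leave part of each component in place, which is one simple way to trade behavioral change against preservation of the original weights.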


	- Developed by: Goekdeniz-Guelmez
	- Funded by: Goekdeniz-Guelmez
	- Shared by: Goekdeniz-Guelmez
	- Model type: qwen3
	- Finetuned from model: Qwen/Qwen3-4B-Instruct-2507

	## Bias, Risks, and Limitations

	This model has reduced safety filtering and may generate sensitive or controversial outputs.
	Use responsibly and at your own risk.