---
license: other
license_name: nvidia-open-model-license
license_link: >-
  https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf
language:
- en
base_model:
- nvidia/Llama-3.1-Minitron-4B-Width-Base
datasets:
- SicariusSicariiStuff/UBW_Tapestries
widget:
- text: "Impish_LLAMA_4B"
  output:
    url: https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B/resolve/main/Images/Impish_LLAMA_4B.png
---
**16th of July, model retrained**: all previously reported issues fixed (several front-ends would generate endlessly), **200m** tokens added, retrained on **ChatML**.
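Since the retrain targets **ChatML**, prompts should follow that template. A minimal sketch of the formatting (the system-prompt text here is purely illustrative; most front-ends and `tokenizer.apply_chat_template` handle this for you):

```python
def to_chatml(system: str, user: str) -> str:
    # ChatML wraps each turn in <|im_start|>role ... <|im_end|> markers,
    # and leaves an open assistant turn for the model to complete.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = to_chatml("You are Impish_LLAMA_4B.", "Hello!")
```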
---
**5th of July, 2025**, **Impish_LLAMA_4B**.
**Almost a year ago**, I created [Impish_LLAMA_3B](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_3B), the first fully coherent **3B** roleplay model at the time. It was quickly adopted by several platforms and became one of the go-to models for mobile. After some time, I made [Fiendish_LLAMA_3B](https://huggingface.co/SicariusSicariiStuff/Fiendish_LLAMA_3B) and insisted it was **not** an upgrade, but a different flavor (which was indeed the case, as a different dataset was used to tune it).
**Impish_LLAMA_4B**, however, **is** an upgrade, **a big one**. I've had over a dozen 4B candidates, but none of them were 'worthy' of the **Impish** badge. This model has superior responsiveness and context awareness, and is able to pull off very coherent adventures. It also comes with some additional assistant capabilities. Of course, while it is **exceptionally competent for its size**, it is still **4B**. Manage expectations and all that. I, however, am very much pleased with it. It took several tries to pull off just right. Total tokens trained: about **400m** (due to being a generalist model, lots of tokens went there, despite the emphasis on roleplay & adventure).
This took more effort than I thought it would. Because of course it would. This is mainly due to me refusing to release a model only 'slightly better' than my two 3B models mentioned above. Because "what would be the point" in that? The reason I included so many tokens for this tune is that small models are especially sensitive to many factors, including the percentage of moisture in the air and how many times I ran nvidia-smi since the system last started.
It's **no secret** that roleplay/creative-writing tuning can **reduce a model's general intelligence** (any tune and RL risks this, but roleplay models are **especially** 'fragile'). Therefore, additional tokens of general assistant data were needed, in my opinion, and indeed seemed to help a lot with retaining intelligence.
This model is also 'built a bit different', literally, as it is based on [nVidia's prune](https://huggingface.co/nvidia/Llama-3.1-Minitron-4B-Width-Base); it does not 'behave' like a typical 8B, in my own subjective impression. This helped a lot with keeping it smart at such a small size.
To promote and support the existence and usefulness of fully compliant 'unaligned' models, a large, community-driven change was needed. This effort became very successful indeed. On my part, I decided to include [UGI](https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard) scores for every model I've made, a leaderboard most had never heard of, at least at first. This helped promote a **healthy competition** in that arena. Indeed, many soon followed suit. Each and every one that did so helped advance the community effort and establish an unwritten standard of transparency and responsibility. **UGI** was a game-changer and, in my opinion, is **one of the most important community initiatives on Hugging Face**.
Regarding **censorship in vision models**, I was repeatedly asked by several people to tune an uncensored vision model. At first, I declined—'**let someone else do it**'—because, honestly, this is a significant challenge for many reasons. More than a year went by, and aside from **ToriiGate** (which is excellent but mainly focused on SD tags), no other such model has been created since. Uncensoring the text part was nothing like dealing with the complexities of vision.
So I made [X-Ray_Alpha](https://huggingface.co/SicariusSicariiStuff/X-Ray_Alpha), which found its way into various open-source projects and pipelines. As a sidenote, unexpectedly, many partially blind individuals personally thanked me for this model via Discord, as it was a legitimate life-changer for them (paired with TTS, which I also made available [here](https://huggingface.co/SicariusSicariiStuff/TTS_Lola), and also as [an addon for textgen](https://github.com/SicariusSicariiStuff/Diffusion_TTS)), vividly depicting content that, for obvious reasons, closed models would gatekeep from them.
I hadn't even considered the accessibility use case when I made the model; receiving their thanks and stories truly warmed my heart.
**AI shall never again be restricted.**
Even if I were "to retire from open source", I could rest assured that **the foundations for AI freedom** have been laid. This was especially important in '**the early days of AI**,' which we are now approaching the **end of**; the foundations for how the open-source AI landscape will look have been established **by the community** in the **best of ways**. With models like those from [DeepSeek](https://huggingface.co/deepseek-ai), and the existence of their [abliterated versions](https://huggingface.co/SicariusSicariiStuff/DeepSeek-V3-Abliterated), I can proudly say:
---
# We have won.
---
## Available quantizations:
- Original: [FP16](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B)
- GGUF: [Static Quants](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_GGUF) | [iMatrix](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_iMatrix) | [High-Attention](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_GGUF_HA) | [iMatrix-High-Attention](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_HA_NL)
- GPTQ: [4-Bit-32](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_GPTQ_4-bit-32) | [4-Bit-128](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_GPTQ_4-bit-128)
- EXL3: [2.0 bpw](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_EXL3_2.0bpw) | [2.5 bpw](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_EXL3_2.5bpw) | [3.0 bpw](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_EXL3_3.0bpw) | [3.5 bpw](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_EXL3_3.5bpw) | [4.0 bpw](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_EXL3_4.0bpw) | [4.5 bpw](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_EXL3_4.5bpw) | [5.0 bpw](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_EXL3_5.0bpw) | [5.5 bpw](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_EXL3_5.5bpw) | [6.0 bpw](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_EXL3_6.0bpw) | [6.5 bpw](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_EXL3_6.5bpw) | [7.0 bpw](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_EXL3_7.0bpw) | [7.5 bpw](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_EXL3_7.5bpw) | [8.0 bpw](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_EXL3_8.0bpw)
- Specialized: [FP8](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_FP8)
- Mobile (ARM): [Q4_0](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_ARM) | [Q4_0_High-Attention](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_4B_ARM_HA)
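When picking among the EXL3 bpw options above, a rough rule of thumb is that the weight footprint alone is about `params × bpw / 8` bytes (KV cache and activations add overhead on top). A quick sketch, assuming the 4B parameter count:

```python
def exl3_weight_gib(params_b: float, bpw: float) -> float:
    # Weights-only estimate: params (in billions) * bits-per-weight / 8 bytes,
    # converted to GiB. KV cache and activations are not included.
    bytes_total = params_b * 1e9 * bpw / 8
    return bytes_total / 2**30

# e.g. the 4.0 bpw quant of a 4B model:
print(round(exl3_weight_gib(4.0, 4.0), 2))  # ~1.86 GiB of weights
```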
---
## Recommended settings for assistant mode