Kondensator-Nano is a highly optimized, transformer-based text classification model designed to detect fraudulent messages (scams) in Polish. It was specifically fine-tuned for the unique linguistic environment of online gaming communities, such as Minecraft chat.
This model is the first and smallest in the planned "Kondensator" series of detection models and serves as the core engine for the OpenBoilerAI project - a free, open-source plugin for protecting game servers. The model classifies messages into two categories: FRAUD or SAFE.
This version is quantized to INT8 and provided in the ONNX format for performance and compatibility, ensuring it can run efficiently even on servers with limited hardware resources.
Model Details
Model Description
The Kondensator series was developed to address the growing problem of QR code and premium SMS scams targeting players on Polish Minecraft servers. The models grew out of frustration with the lack of effective moderation tools and were created as part of a community-driven effort to build a reliable and accessible automated defense system.
Kondensator-Nano, as the smallest variant, prioritizes speed and low resource usage without a significant compromise on accuracy, making it well suited to real-time chat moderation.
- Developed by: madebytoilets (OpenBoilerAI Project)
- Model type: Transformer-based Text Classification (quantized to INT8)
- Language(s) (NLP): Polish (pl)
- License: Apache-2.0
- Finetuned from model: distilroberta-base
Model Sources
- Repository: https://github.com/openboilerai
- Hugging Face: https://huggingface.co/madebytoilets
Uses
The primary intended use of Kondensator-Nano is for real-time moderation of public chat messages on game servers to protect players from financial fraud.
Direct Use
This model is provided in the ONNX format and can be used with inference runtimes such as onnxruntime for efficient text classification. It is designed to classify short text messages, such as those found in Minecraft chat. The model was trained on raw message text, so strip chat prefixes such as "❤ [48★] fakegracz123 »" from each message before classifying it.
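The prefix cleanup mentioned above could look roughly like the sketch below. It assumes the server's chat format separates the rank and nickname from the message body with "»", as in the example prefix; the name `strip_chat_prefix` is hypothetical and not part of the model or the plugin, and other chat formats will need a different pattern.

```python
import re

# Assumption: everything up to and including the first "»" is the chat
# prefix (rank, level, nickname). Adjust the pattern to your server's format.
PREFIX_PATTERN = re.compile(r"^.*?»\s*")


def strip_chat_prefix(line: str) -> str:
    """Return the message body with the leading chat prefix removed.

    Lines without a "»" separator are returned unchanged (apart from
    surrounding whitespace).
    """
    return PREFIX_PATTERN.sub("", line, count=1).strip()


print(strip_chat_prefix("❤ [48★] fakegracz123 » darmowe gvipy na dc"))
```

Note that a message which itself contains "»" but has no prefix would be cut at that character, so in practice it is safer to strip the prefix using the chat plugin's own format string when that is available.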
Out-of-Scope Use
This model should NOT be used for:
- Automatic banning of players without human review. The model may produce false positives, and its output should be treated as a strong recommendation, not a final verdict.
- Analyzing private messages (DMs) without user consent.
- Making final legal or financial decisions. The model is a tool for moderation, not a legal authority.
- Use outside of the Polish language and gaming slang context, as its performance may significantly degrade.
Bias, Risks, and Limitations
The model was trained on a private, curated dataset of messages from Polish-language online gaming servers. While diverse, the dataset may not cover all types of scams or linguistic variations present on every server.
Risk of False Positives: The model may incorrectly flag legitimate messages that contain keywords associated with scams (e.g., "free gvipy", "rozdanie"). For this reason, I recommend using the model's output only to temporarily mute players, not to ban them permanently.
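One way to act on this recommendation is to turn the model's two logits into a probability and only mute above a conservative threshold. The sketch below is an illustration, not the plugin's actual logic: the threshold values, the action names, and the function names are all hypothetical, and it assumes index 1 is the fraud class, as in the quick-start code in this card.

```python
import math

FRAUD_INDEX = 1        # assumption: index 1 is the FRAUD class
MUTE_THRESHOLD = 0.90  # hypothetical value; tune it on your own chat logs
FLAG_THRESHOLD = 0.50  # hypothetical value


def fraud_probability(logits):
    """Apply a numerically stable softmax and return P(FRAUD)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    return exps[FRAUD_INDEX] / sum(exps)


def moderation_action(logits):
    """Map one message's logits to a reversible moderation action."""
    p = fraud_probability(logits)
    if p >= MUTE_THRESHOLD:
        return "temp_mute_and_flag"  # temporary mute, queued for human review
    if p >= FLAG_THRESHOLD:
        return "flag_for_review"     # no automatic action, staff is notified
    return "allow"


# Usage with one row of model output, e.g. logits[0].tolist()
print(moderation_action([-3.0, 3.0]))
```

Keeping every automatic action reversible means that a false positive costs the player a short mute rather than a ban.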
How to Get Started with the Model
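from huggingface_hub import hf_hub_download
import onnxruntime as ort
from transformers import AutoTokenizer
# Download the quantized ONNX model and load the matching tokenizer.
onnx_path = hf_hub_download("madebytoilets/Kondensator_Nano_onnx_int8", "model.onnx")
tokenizer = AutoTokenizer.from_pretrained("madebytoilets/Kondensator_Nano_onnx_int8")
session = ort.InferenceSession(onnx_path)
# A single chat message with the rank/nickname prefix already removed.
msg = "darmowe gvipy i svipy za pomoc z featherem dodaj na dc moj nick to fake_oszust123"
# Tokenize to NumPy arrays; truncation guards against overly long messages.
inputs = tokenizer([msg], return_tensors="np", truncation=True)
# The first output holds the logits, one row per message.
logits = session.run(None, dict(inputs))[0]
# Index 1 is treated as the fraud class.
pred = logits.argmax(-1)[0]
print("scam" if pred == 1 else "not scam")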