Raphael Mota da Costa da Paz (RaphaPAZ)
0 followers · 12 following
XaszD
AI & ML interests
None yet
Recent Activity
reacted to samerzaher80's post with 👍 (2 days ago)
**AetherMind_SRL: How I beat 7B models on MMLU with 184M params and a $300 GPU**

I'm Sameer, a solo researcher from Iraq working on a single RTX 3050 8GB laptop. Today I'm releasing AetherMind_SRL, a 184M-parameter NLI model trained only on NLI tasks (SNLI, MNLI, ANLI, and a small clinical Alzheimer's dataset). It was never fine-tuned on, or even shown, a single MMLU question during training. Yet here are the zero-shot MMLU (57 subjects) results:

| Model | Params | MMLU Zero-Shot | Training Data |
|---|---|---|---|
| AetherMind_SRL (me) | 184M | 36.05 % | Only NLI (SNLI/MNLI/ANLI + ADNI) |
| DeBERTa-v3-base | 278M | ~30.8 % | General pre-training |
| BERT-large | 340M | 27–30 % | General pre-training |
| LLaMA-1 7B | 7B | 34–35 % | Massive text corpus |
| LLaMA-2 7B | 7B | ~45 % | Bigger + better data |

Yes: my 184M model beats every classic 300–400M model and the original 7-billion-parameter LLaMA-1, all while running at 300+ samples/sec on a $300 laptop GPU.

How did this happen? I built a standardized self-improvement loop called AetherMind Self-Reflective Learning (SRL) v1.0:

1. Train normally on NLI.
2. Let the model predict on hard adversarial data (ANLI).
3. Log every mistake + low-confidence case.
4. Build a balanced "SMART" buffer (60% errors + 40% correct anchors).
5. Fine-tune with a tiny LR and an error-weighted loss.
6. Repeat until stable.

That's it. No external knowledge, no MMLU data, no cluster. Just pure reasoning transfer from entailment/contradiction patterns to real-world knowledge.

Try it yourself:

```python
from transformers import pipeline
import torch

nli_pipeline = pipeline(
    "text-classification",
    model="samerzaher80/AetherMind_SRL",
    device=0 if torch.cuda.is_available() else -1,
)

# DEFINE YOUR TEST HERE
premise = "Patient shows progressive memory decline."
hypothesis = "Patient shows progressive memory decline."

input_text = f"{premise} [SEP] {hypothesis}"
result = nli_pipeline(input_text)[0]

print(f"Prediction: {result['label']}")
print(f"Confidence: {result['score']:.4f}")
```

Model: https://huggingface.co/samerzaher80/AetherMind_SRL
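Step 4 of the SRL loop (the balanced "SMART" buffer of 60% errors and 40% correct anchors) can be sketched as follows. This is a minimal illustration, not the author's code: the `build_smart_buffer` name, the 0.7 confidence threshold, and the record format are my own assumptions.

```python
import random

def build_smart_buffer(records, error_frac=0.6, buffer_size=10,
                       conf_threshold=0.7, seed=0):
    """Build a balanced 'SMART' replay buffer: a fixed fraction of hard
    cases (errors or low-confidence predictions), topped up with correct,
    high-confidence anchors.

    Each record is a dict: {'correct': bool, 'confidence': float}.
    """
    rng = random.Random(seed)
    # Hard cases: the model was wrong, or right but unsure.
    hard = [r for r in records
            if not r["correct"] or r["confidence"] < conf_threshold]
    # Anchors: correct and confident predictions.
    anchors = [r for r in records
               if r["correct"] and r["confidence"] >= conf_threshold]
    n_hard = min(int(buffer_size * error_frac), len(hard))
    n_anchor = min(buffer_size - n_hard, len(anchors))
    return rng.sample(hard, n_hard) + rng.sample(anchors, n_anchor)
```

Fine-tuning on this buffer with a small learning rate (step 5) then emphasizes exactly the cases the model currently gets wrong, while the anchors guard against forgetting what it already does well.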
reacted to piercus's post with 👍 (29 days ago)
Starts erasing! 🎉 🎉 🎉 This is made with a one-step SD1.5 LBM [1] eraser! Data is open. Data pipeline is open. Training code is open. On our LBM fork: https://github.com/finegrain-ai/LBM
[1] https://huggingface.co/papers/2503.07535
new activity on Qwen/Qwen3-VL-8B-Thinking (about 2 months ago):
"Is there any way of using this model along with Qwen-Image on ComfyUI?"
Organizations
RaphaPAZ's activity
upvoted a collection (about 2 months ago)
Qwen3-VL · Collection · 37 items · Updated 28 days ago · 456