Model Name with Parameters

SalesA AI 125M — Multimodal Mixture-of-Experts Transformer

  • Full Name Example:
    SalesA AI 125M (MoE, Multimodal, N.E.N & SalesA Team)

  • Breakdown:

    • SalesA AI: The model's brand and architecture family.
    • 125M: Number of parameters (the training logs report Total parameters: 125,106,858 ≈ 125M).
    • MoE: Mixture-of-Experts.
    • Multimodal: Handles text, vision, audio, code, action.
    • Author: N.E.N (Nthuku Elijah Nzeli) and SalesA Team.

    Created by N.E.N (Nthuku Elijah Nzeli) and SalesA Team

    • Model architecture: SalesAModel
    • Library: Custom (see included code)
    • Parameters: 125M
    • Modalities: Text, Vision, Audio, Code, Action/Robotics
    • MoE: Yes

    This repository contains the SalesA AI model, a modular, extensible, and efficient multimodal transformer with Mixture-of-Experts (MoE) layers.

Why This Naming Convention?

  • Clarity: Users immediately know the model’s size and capabilities.
  • Discoverability: Easy to search/filter on Hugging Face and other platforms.
  • Professionalism: Follows conventions used by top models (e.g., “Qwen2.5-7B-Minivoc-32k”, “distilbert-base-uncased”).

Summary Table

Field         Value
Model Name    SalesA AI 125M
Architecture  Multimodal Mixture-of-Experts Transformer
Parameters    125M
Author        N.E.N (Nthuku Elijah Nzeli) and SalesA Team

Model Description

SalesA AI is a lightweight, CPU-optimized, multimodal model with a Mixture-of-Experts (MoE) architecture. It supports:

  • Text generation & classification (including financial sentiment)
  • Vision (image-to-text, classification)
  • Audio (audio-to-text, classification)
  • Code generation
  • Action prediction for robotics/locomotion
  • Ethical and bias-aware outputs

The model is designed for extensibility, ethical deployment, and real-world applications in finance, sales, stock/market analysis, and robotics.
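
To make the Mixture-of-Experts idea concrete, here is a minimal pure-Python sketch of routing one input vector through a gated set of experts. The card does not state the number of experts, the gating function, or the top-k value, so the softmax gate, `top_k=2`, the four toy experts, and the names `moe_forward` and `gate` are all illustrative assumptions, not the SalesAModel implementation.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
Expert = Callable[[Sequence[float]], Vector]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x: Sequence[float], gate_weights: Sequence[Sequence[float]],
                experts: Sequence[Expert], top_k: int = 2) -> Vector:
    """Route one vector through its top_k experts (hypothetical layout)."""
    # One gate logit per expert: dot product of the input with that expert's gate row.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    probs = softmax(logits)
    # Keep the top_k most probable experts and renormalise their weights.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    out = [0.0] * len(x)
    for i in chosen:
        y = experts[i](x)  # each toy expert returns a vector of the same size as x
        for d, value in enumerate(y):
            out[d] += (probs[i] / norm) * value
    return out


# Toy experts (assumed): each scales the input by a different constant.
experts = [lambda v, s=s: [s * vi for vi in v] for s in (1.0, 2.0, 3.0, 4.0)]
gate = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
print(moe_forward([2.0, 1.0], gate, experts))
```

Only the selected experts are evaluated, which is the property that lets an MoE model keep per-input compute low relative to its total parameter count.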

Intended Uses & Limitations

Intended Uses

  • Financial news and sentiment analysis
  • Sales and market trend analysis
  • General text, vision, and audio tasks
  • Code generation and completion
  • Robotics: action/command prediction from multimodal input
  • Research and educational use

Limitations

  • Not suitable for high-stakes financial decisions without human oversight
  • May not generalize to all languages or domains
  • Biases may exist in the training data; see the diagnostic plots (confusion_matrix.png, class_distribution.png, per_class_metrics.png)
  • Not for commercial use without review (see license)

Datasets Used

Training Details

  • Architecture: Mixture-of-Experts (MoE), multimodal encoders (text, vision, audio)
  • Parameters: ~125M
  • Hardware: CPU-optimized, trainable on commodity hardware
  • Losses: Cross-entropy for classification/generation, multitask loss
  • Optimizer: AdamW
  • Batch size: 4 (default)
  • Epochs: 10 (default)
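
The settings above can be collected in a small configuration object, and the loss terms sketched in plain Python. This is an illustration only: the card does not give a learning rate or say how per-task losses are combined, so `learning_rate`, the weighted-sum form of `multitask_loss`, and the task names are assumptions.

```python
import math
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class TrainConfig:
    """Defaults listed in the card; learning_rate is a placeholder assumption."""
    optimizer: str = "AdamW"
    batch_size: int = 4
    epochs: int = 10
    learning_rate: float = 1e-4  # not stated in the card


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Cross-entropy of one example: -log softmax(logits)[target]."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


def multitask_loss(task_losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-task losses; tasks without a weight default to 1.0."""
    return sum(weights.get(task, 1.0) * loss for task, loss in task_losses.items())


config = TrainConfig()
losses = {
    "sentiment": cross_entropy([2.0, 0.5, -1.0], target=0),
    "code": cross_entropy([0.1, 0.2, 0.3, 0.4], target=3),
}
total = multitask_loss(losses, weights={"sentiment": 1.0, "code": 0.5})
print(config, round(total, 4))
```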

Evaluation Results

  • Financial sentiment (accuracy): 0.85
  • Financial sentiment (F1): 0.83
  • General text/vision/audio: See per-task metrics in training logs
  • Bias/diagnostic plots: See confusion_matrix.png, class_distribution.png, per_class_metrics.png in the model directory
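
For readers who want to recompute the two headline metrics on their own predictions, the following sketch implements accuracy and F1 in plain Python. The card does not say which F1 averaging was used, so macro averaging is an assumption here, and the three sentiment labels are example data rather than model output.

```python
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 scores (macro averaging, assumed)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up labels for illustration only.
y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "negative", "positive", "positive"]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```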

Ethical Considerations & Bias Analysis

  • Model includes bias and diagnostic visualizations for transparency
  • Not for use in applications requiring guaranteed fairness or absence of bias
  • See Hugging Face Model Card Guide for best practices

Files Included

  • model.safetensors: Model weights
  • config.json: Model configuration
  • tokenizer.json, vocab.json, tokenizer.model: Tokenizer files
  • merge.txt: Tokenizer merges (placeholder)
  • generation_config.json: Generation parameters
  • model.safetensors.index.json: Index for sharded weights (placeholder)
  • chat_template.jinja: Chat UI template (placeholder)
  • training_history.pkl: Training history
  • confusion_matrix.png, class_distribution.png, per_class_metrics.png: Diagnostic plots
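
A quick way to confirm that a downloaded copy is complete is to compare a local directory against the list above. The helper name `missing_files` is hypothetical; the file names are taken verbatim from this list, and the temporary directory only stands in for a real model directory.

```python
import tempfile
from pathlib import Path
from typing import List

# File names copied from the "Files Included" list.
EXPECTED_FILES = [
    "model.safetensors", "config.json",
    "tokenizer.json", "vocab.json", "tokenizer.model",
    "merge.txt", "generation_config.json",
    "model.safetensors.index.json", "chat_template.jinja",
    "training_history.pkl",
    "confusion_matrix.png", "class_distribution.png", "per_class_metrics.png",
]


def missing_files(model_dir: str) -> List[str]:
    """Return the expected files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Demo on a throwaway directory that contains only config.json.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text("{}")
    demo_missing = missing_files(tmp)
print(len(demo_missing), "files missing")
```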

How to Use

from transformers import AutoTokenizer, AutoModelForSequenceClassification
# Or use the custom loading code included in this repository (the architecture is the custom SalesAModel)
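
Because the architecture is custom, one low-risk first step is to inspect the weights file directly instead of instantiating the model. The sketch below assumes the `safetensors` and `torch` packages are installed and that `model.safetensors` sits in the working directory as a single, unsharded file. The helper names are hypothetical, and the resulting count may differ from the logged 125,106,858 if any weights are tied or shared.

```python
from typing import Any, Mapping


def count_parameters(state_dict: Mapping[str, Any]) -> int:
    """Sum element counts over all tensors (each value must expose .numel())."""
    return sum(tensor.numel() for tensor in state_dict.values())


def format_count(total: int) -> str:
    """Render a raw count in the style used by this card, e.g. '~125M'."""
    return f"{total:,} parameters (~{total / 1e6:.0f}M)"


def load_state_dict(weights_path: str) -> Mapping[str, Any]:
    """Load a safetensors file into a name -> tensor mapping."""
    # Imported lazily so the helpers above work without these packages installed.
    from safetensors.torch import load_file
    return load_file(weights_path)


def describe(weights_path: str = "model.safetensors") -> str:
    """Load the weights and report their total parameter count."""
    return format_count(count_parameters(load_state_dict(weights_path)))
```

Calling `describe()` next to the downloaded weights prints a summary that can be compared with the parameter count reported in the training logs.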

Citation

If you use this model, please cite:

@article{Malo2014GoodDO,
  title={Good debt or bad debt: Detecting semantic orientations in economic texts},
  author={P. Malo and A. Sinha and P. Korhonen and J. Wallenius and P. Takala},
  journal={Journal of the Association for Information Science and Technology},
  year={2014},
  volume={65}
}

License

apache-2.0

Contact

For questions or commercial licensing, contact the SalesA Team.
