
MetaCLIP-2-Cifar10

MetaCLIP-2-Cifar10 is an image classification model fine-tuned from the vision-language encoder facebook/metaclip-2-worldwide-s16 for single-label classification. It uses the MetaClip2ForImageClassification architecture to assign each input image to one of the ten CIFAR-10 object classes.

Paper: MetaCLIP 2: A Worldwide Scaling Recipe (https://huggingface.co/papers/2507.22062)

Classification report:

              precision    recall  f1-score   support

    airplane     0.9813    0.9685    0.9748      2000
  automobile     0.9777    0.9850    0.9813      2000
        bird     0.9560    0.9560    0.9560      2000
         cat     0.9104    0.9395    0.9247      2000
        deer     0.9566    0.9580    0.9573      2000
         dog     0.9476    0.9215    0.9343      2000
        frog     0.9774    0.9735    0.9755      2000
       horse     0.9704    0.9670    0.9687      2000
        ship     0.9782    0.9890    0.9836      2000
       truck     0.9774    0.9735    0.9755      2000

    accuracy                         0.9631     20000
   macro avg     0.9633    0.9632    0.9632     20000
weighted avg     0.9633    0.9631    0.9632     20000
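
The per-class F1 scores and the macro averages in the report can be re-derived from the precision and recall columns. The following is a minimal sketch; the variable names are illustrative, and differences in the last digit are expected because the inputs are already rounded to four decimals.

```python
# (precision, recall) per class, copied from the classification report above.
report = {
    "airplane": (0.9813, 0.9685),
    "automobile": (0.9777, 0.9850),
    "bird": (0.9560, 0.9560),
    "cat": (0.9104, 0.9395),
    "deer": (0.9566, 0.9580),
    "dog": (0.9476, 0.9215),
    "frog": (0.9774, 0.9735),
    "horse": (0.9704, 0.9670),
    "ship": (0.9782, 0.9890),
    "truck": (0.9774, 0.9735),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


f1_scores = {name: f1(p, r) for name, (p, r) in report.items()}

# Macro averages weight every class equally.
macro_precision = sum(p for p, _ in report.values()) / len(report)
macro_recall = sum(r for _, r in report.values()) / len(report)
macro_f1 = sum(f1_scores.values()) / len(f1_scores)

for name, score in f1_scores.items():
    print(f"{name:>12}  f1={score:.4f}")
print(
    f"   macro avg  precision={macro_precision:.4f}  "
    f"recall={macro_recall:.4f}  f1={macro_f1:.4f}"
)
```

Because every class has the same support (2000), the macro-averaged recall coincides with the overall accuracy, up to rounding.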


The model classifies images into the following categories:

  • Class 0: airplane
  • Class 1: automobile
  • Class 2: bird
  • Class 3: cat
  • Class 4: deer
  • Class 5: dog
  • Class 6: frog
  • Class 7: horse
  • Class 8: ship
  • Class 9: truck

Run with Transformers

!pip install -q transformers torch pillow gradio

import gradio as gr
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Load model and processor
model_name = "prithivMLmods/MetaCLIP-2-Cifar10"
model = AutoModelForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

def cifar10_classification(image):
    """Predicts the CIFAR-10 class represented in an image."""
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    labels = {
        "0": "airplane",
        "1": "automobile",
        "2": "bird",
        "3": "cat",
        "4": "deer",
        "5": "dog",
        "6": "frog",
        "7": "horse",
        "8": "ship",
        "9": "truck"
    }
    predictions = {labels[str(i)]: round(probs[i], 3) for i in range(len(probs))}

    return predictions

# Create Gradio interface
iface = gr.Interface(
    fn=cifar10_classification,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(label="Prediction Scores"),
    title="CIFAR-10 Classification",
    description="Upload an image to classify it into one of the CIFAR-10 categories."
)

# Launch the app
if __name__ == "__main__":
    iface.launch()
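
The function above returns a dictionary that maps each class name to a rounded probability. When the scores are consumed outside the Gradio UI, a small helper can rank them. This is only a sketch: `top_k` and the example scores are hypothetical and are not part of the model repository.

```python
def top_k(predictions, k=3):
    """Return the k highest-scoring (label, score) pairs, best first."""
    ranked = sorted(predictions.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical scores in the same shape that cifar10_classification returns.
example_scores = {
    "airplane": 0.002, "automobile": 0.001, "bird": 0.010, "cat": 0.820,
    "deer": 0.004, "dog": 0.150, "frog": 0.006, "horse": 0.005,
    "ship": 0.001, "truck": 0.001,
}

best_label, best_score = top_k(example_scores, k=1)[0]
print(best_label, best_score)
print(top_k(example_scores))
```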

Sample Inference:

[Seven screenshots of the CIFAR-10 Classification Gradio demo showing sample predictions]

Intended Use:

The MetaCLIP-2-Cifar10 model is designed for object classification across the ten CIFAR-10 categories. Potential use cases include:

  • Educational & Research Applications: Benchmarking experiments, model comparison, and deep learning studies.
  • Lightweight Vision Systems: Useful for systems requiring simple object recognition.
  • Dataset Exploration: Assisting in data inspection, annotation, and visualization.
  • Prototype Systems: Ideal for rapid prototyping in classification pipelines.
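
For the benchmarking and model-comparison use case, predictions collected from the model can be scored with a few lines of plain Python. The sketch below assumes two equal-length lists of class names; `accuracy`, `per_class_recall`, and the toy labels are hypothetical illustrations, not outputs of this model.

```python
from collections import Counter

CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Fraction of each class's examples that were predicted correctly."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Classes that never occur in y_true are skipped to avoid dividing by zero.
    return {name: correct[name] / totals[name] for name in CLASSES if totals[name]}


# Hypothetical toy labels, for illustration only.
y_true = ["cat", "cat", "dog", "dog", "ship", "truck"]
y_pred = ["cat", "dog", "dog", "dog", "ship", "truck"]

print(f"accuracy: {accuracy(y_true, y_pred):.4f}")
print(per_class_recall(y_true, y_pred))
```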

Model size: 21.7M params (Safetensors, tensor type F32)