Cancer Image Classification using Vision Transformer (ViT)

Model Description

This model is a fine-tuned Vision Transformer (ViT) for binary classification of medical images to detect cancer. The model classifies images into two categories:

  • Normal (Label 0): Healthy tissue
  • Malignant (Label 1): Cancerous tissue

Model Details

  • Base Model: google/vit-base-patch16-224-in21k
  • Architecture: Vision Transformer (ViT-Base)
  • Input Size: 224x224 pixels
  • Number of Classes: 2 (Binary Classification)
  • Framework: PyTorch + Transformers

Performance

The model achieves the following performance on the test dataset:

| Metric               | Value  |
|----------------------|--------|
| Accuracy             | 83.38% |
| Precision            | 71.50% |
| Recall (Sensitivity) | 72.25% |
| Specificity          | 88.02% |
| F1 Score             | 0.7187 |
| MCC                  | 0.6009 |

Clinical Interpretation

  • Accuracy: correctly classifies 83.4% of test images
  • ⚠️ Recall: detects 72.25% of malignant cases
  • Precision: 71.5% of malignant predictions are correct
  • ⚠️ False Negative Rate: 27.75%, so roughly 1 in 4 malignant cases is missed
  • ⚠️ False Positive Rate: 11.98%, so roughly 1 in 8 healthy images is incorrectly flagged
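The reported figures can be cross-checked against one another. The sketch below recomputes each metric from a confusion matrix. The four counts are an assumption: they are back-calculated from the reported rates and the test split listed under Training Data (459 Normal + 191 Malignant), and are not figures published with the model.

```python
import math

# Confusion-matrix counts back-calculated from the reported rates and the
# test split (459 Normal + 191 Malignant). They are an inference, not
# figures published with the model.
TP, FN = 138, 53   # malignant images: detected / missed
TN, FP = 404, 55   # normal images: cleared / wrongly flagged


def binary_metrics(tp, fn, tn, fp):
    """Return the standard binary-classification metrics as fractions."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
        "mcc": (tp * tn - fp * fn) / mcc_den,
        "false_negative_rate": fn / (tp + fn),
        "false_positive_rate": fp / (tn + fp),
    }


metrics = binary_metrics(TP, FN, TN, FP)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts, every recomputed value agrees with the table to within rounding, which suggests the reported metrics are internally consistent.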

Usage

from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

# Load model and processor
model = AutoModelForImageClassification.from_pretrained("Shivamnegi92/cancer-classification-vit")
processor = AutoImageProcessor.from_pretrained("Shivamnegi92/cancer-classification-vit")

# Load and preprocess image
image = Image.open("path/to/your/image.jpg").convert("RGB")  # ensure 3 channels (handles grayscale/RGBA files)
inputs = processor(images=image, return_tensors="pt")

# Make prediction
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1)

# Interpret results
class_names = ["Normal", "Malignant"]
confidence = predictions[0][predicted_class].item()
result = class_names[predicted_class.item()]

print(f"Prediction: {result}")
print(f"Confidence: {confidence:.4f}")
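For two classes, taking the argmax is the same as applying a 0.5 threshold to the Malignant probability (apart from exact ties). Given the reported false negative rate, one option is to lower that threshold and accept more false positives in exchange for higher recall. The following is a minimal sketch in plain Python. The `malignant_threshold` parameter and the value 0.3 are hypothetical illustrations, not settings tuned or published by the author, and the logits are made up.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, malignant_threshold=0.5):
    """Label an image Malignant when P(Malignant) reaches the threshold.

    `logits` is [normal_logit, malignant_logit], matching the label mapping
    Normal=0, Malignant=1. The threshold parameter is hypothetical and is
    not part of the published model.
    """
    p_malignant = softmax(logits)[1]
    label = "Malignant" if p_malignant >= malignant_threshold else "Normal"
    return label, p_malignant


# Made-up logits for illustration; real values would come from
# outputs.logits[0].tolist() in the snippet above.
example_logits = [0.6, 0.1]
print(classify(example_logits))                           # same decision as argmax here
print(classify(example_logits, malignant_threshold=0.3))  # more sensitive decision
```

Any threshold other than 0.5 would need to be chosen on a held-out validation set, since it changes every metric in the Performance table.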

Training Details

Training Data

  • Dataset: Private medical image dataset
  • Training Samples: Custom split
  • Validation Samples: Custom split
  • Test Samples: 650 images (459 Normal + 191 Malignant)

Training Configuration

  • Epochs: 5
  • Batch Size: 16
  • Learning Rate: 5e-5
  • Optimizer: AdamW
  • Loss Function: Cross Entropy
  • Metric for Best Model: Accuracy
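The configuration above maps naturally onto the Hugging Face `Trainer` API. The sketch below assumes `Trainer` was used, which the model card does not state. The per-epoch evaluation and save strategy and the `output_dir` value are likewise assumptions added so that best-model selection works. Cross entropy needs no setting, because it is the default loss for single-label image classification in Transformers.

```python
# Hyperparameters listed in the model card, expressed with Hugging Face
# TrainingArguments parameter names. Use of Trainer is an assumption.
TRAINING_KWARGS = {
    "num_train_epochs": 5,
    "per_device_train_batch_size": 16,
    "learning_rate": 5e-5,
    "optim": "adamw_torch",               # AdamW
    "eval_strategy": "epoch",             # assumption: evaluate once per epoch
    "save_strategy": "epoch",             # must match eval_strategy for best-model tracking
    "load_best_model_at_end": True,
    "metric_for_best_model": "accuracy",  # compute_metrics must return an "accuracy" key
}


def build_training_args(output_dir="vit-cancer-out"):
    """Build TrainingArguments; `output_dir` is a hypothetical path."""
    # Imported lazily so the dictionary can be inspected without
    # transformers and torch installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```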

Data Processing

  • Image Preprocessing: ViT image processor normalization
  • Image Size: 224x224 pixels
  • Data Augmentation: Random resized crop
  • Label Mapping: Applied adjust_labels function (Normal=0, Malignant=1)
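To make the normalization step concrete, the sketch below shows the per-pixel arithmetic. It assumes this model keeps the base checkpoint's processor settings (rescale by 1/255, then mean 0.5 and standard deviation 0.5 per channel). The actual values can be read from `processor.image_mean` and `processor.image_std` after loading the processor.

```python
# Assumed preprocessing constants, taken from the base checkpoint's image
# processor. Verify them against processor.image_mean / processor.image_std.
IMAGE_MEAN = 0.5
IMAGE_STD = 0.5


def normalize_pixel(value):
    """Map an 8-bit channel value (0-255) to the range the model sees."""
    rescaled = value / 255.0          # rescale to [0, 1]
    return (rescaled - IMAGE_MEAN) / IMAGE_STD


for raw in (0, 128, 255):
    print(raw, round(normalize_pixel(raw), 4))
```

Under these assumptions, pixel values land in the range [-1, 1] before they reach the model.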

Limitations and Ethical Considerations

⚠️ Medical Disclaimer

This model is for research and educational purposes only. It should NOT be used for actual medical diagnosis.

Limitations

  • High False Negative Rate: 27.75% of cancer cases may be missed
  • Dataset Bias: Trained on a single private dataset, so it may not generalize to other populations
  • Image Quality Dependency: Performance varies with image quality and acquisition method
  • Limited Scope: Trained on only one specific type of medical imaging

Ethical Considerations

  • Always consult qualified healthcare professionals for medical diagnosis
  • Model outputs should be interpreted by medical experts
  • Consider potential bias in training data
  • Ensure proper validation in clinical settings before any medical application

Technical Specifications

Hardware Requirements

  • Minimum RAM: 2GB
  • Recommended RAM: 4GB
  • GPU: Optional (CPU inference supported)
  • Storage: ~400MB for model weights

Software Requirements

  • Python: >=3.8
  • PyTorch: >=2.0.0
  • Transformers: >=4.46.0
  • Pillow: >=9.0.0

Citation

If you use this model in your research, please cite:

@misc{cancer-classification-vit-2024,
  title={Cancer Image Classification using Vision Transformer},
  author={Shivam Negi},
  year={2024},
  howpublished={\url{https://huggingface.co/Shivamnegi92/cancer-classification-vit}}
}

Contact

For questions or issues regarding this model, please open an issue in the model repository.


Keywords: cancer detection, medical imaging, vision transformer, image classification, healthcare AI, diagnostic assistance

Model size: 85.8M parameters (Safetensors, F32)