Whisper Tiny - CTranslate2 (INT8 Quantized)

This is a CTranslate2-converted version of OpenAI's Whisper Tiny model, quantized to INT8 for faster CPU inference.

Model Details

  • Base Model: openai/whisper-tiny
  • Quantization: INT8 (see the conversion sketch below)
  • Framework: CTranslate2
  • Optimized for: CPU inference
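
For reference, CTranslate2 models like this one are typically produced with the ct2-transformers-converter tool that ships with the ctranslate2 package. A minimal sketch of the likely conversion command, assuming ctranslate2 and transformers are installed (the output directory name is illustrative, and the exact command used to build these weights is an assumption):

ct2-transformers-converter --model openai/whisper-tiny \
    --output_dir whisper-tiny-ct2-int8 \
    --copy_files tokenizer.json preprocessor_config.json \
    --quantization int8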

Usage

Install dependencies

pip install faster-whisper
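
faster-whisper declares ctranslate2 as a dependency, so the CTranslate2 runtime is installed automatically; no separate install step is needed.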

Python code

from faster_whisper import WhisperModel

# Load model
model = WhisperModel("Kofi24/asr-whisper-tiny-ct2-int8", device="cpu", compute_type="int8")

# Transcribe
segments, info = model.transcribe("audio.mp3", language="en")

for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")

Performance

  • Faster CPU inference than the original openai-whisper implementation; the faster-whisper project reports speedups of up to ~4x at the same accuracy
  • INT8 quantization reduces model size and memory use compared to the FP32 original
  • Suitable for real-time transcription on CPU (see the benchmark sketch below)
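
One way to verify the real-time claim on your own hardware is to measure the real-time factor (processing time divided by audio duration; below 1.0 means faster than real time). A minimal sketch, assuming a local audio.mp3; note that segments is a lazy generator, so it must be consumed before stopping the timer:

import time

from faster_whisper import WhisperModel

model = WhisperModel("Kofi24/asr-whisper-tiny-ct2-int8", device="cpu", compute_type="int8")

start = time.perf_counter()
segments, info = model.transcribe("audio.mp3")
text = " ".join(segment.text for segment in segments)  # consume the lazy generator
elapsed = time.perf_counter() - start

print(f"{info.duration:.1f}s of audio in {elapsed:.1f}s (RTF = {elapsed / info.duration:.2f})")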

Citation

@misc{whisper-ctranslate2,
  author = {Your Name},
  title = {Whisper Tiny CTranslate2 INT8},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Kofi24/asr-whisper-tiny-ct2-int8}}
}