Hubert Base ONNX Model for Voice Conversion

This is the ONNX-exported version of the Hubert Base model, fine-tuned for voice conversion and compatible with modern inference pipelines. This model allows fast and efficient audio processing in ONNX Runtime environments.

Features

  • Converts audio features into high-quality embeddings for voice conversion tasks.
  • Fully ONNX-compatible for optimized inference on CPUs and GPUs.
  • Lightweight and easy to integrate in custom voice processing pipelines.
  • No extra dependencies beyond numpy and onnxruntime.

ONNX Model Report

Model: hubert_base.onnx
Producer: pytorch 2.0.0
IR Version: 8
Opsets: ai.onnx:18
Parameters: 94,370,816


🟦 Inputs

  • source | type: float32 | shape: [batch_size, sequence_length]
    • 32-bit float PCM waveform, sample rate 16,000 Hz, mono
  • padding_mask | type: bool | shape: [batch_size, sequence_length]
    • Usually an all-False array with the same shape as the waveform (True marks padded positions): padding_mask = np.zeros(waveform.shape, dtype=np.bool_)
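As a sketch of how the two inputs can be prepared (the waveform here is random, hypothetical data; in practice you would load and resample real audio to 16 kHz float32 mono first):

```python
import numpy as np

# Hypothetical 1-second mono waveform at 16 kHz; in practice, load and
# resample your audio (e.g. with librosa or soundfile) to float32 PCM.
waveform = np.random.randn(16000).astype(np.float32)

# Add the batch dimension expected by the model: (batch_size, sequence_length)
source = waveform[np.newaxis, :]

# All-False mask: no positions are padding for a single, unpadded waveform
padding_mask = np.zeros(source.shape, dtype=np.bool_)

print(source.shape, source.dtype)  # (1, 16000) float32
```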

🟩 Outputs

  • features | type: float32 | shape: [batch_size, frames, 768]
    • frames is the downsampled sequence length produced by the model's convolutional front-end, not the input sequence_length
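To estimate the output frame count: Hubert's convolutional front-end strides total roughly 320 samples per frame (20 ms at 16 kHz, i.e. about 50 frames per second), so the number of output frames is approximately the input length divided by 320. A quick back-of-the-envelope sketch (the exact count from the model may differ by a frame or two because of convolution padding):

```python
# Rough estimate of the output frame count for a 2.5-second clip
sample_rate = 16000
seconds = 2.5
n_samples = int(sample_rate * seconds)  # 40000 samples
n_frames = n_samples // 320             # ~125 frames; the exact model output
                                        # may differ slightly due to conv padding
print(n_frames)  # 125
```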

Usage

import numpy as np
import onnxruntime as ort

class OnnxHubert:
    """
    Class to load and run the ONNX model exported by Hubert.
    
    Attributes:
        session (ort.InferenceSession): The ONNX Runtime session.
        input_name (str): The name of the input node.
        output_name (str): The name of the output node.
    
    Methods:
        extract_features_batch (source, padding_mask): Run the ONNX model and extract features from the batch.
        extract_features (source, padding_mask): Run the ONNX model and extract features from a single input.
    """
    def __init__(self, model_path: str, thread_num: int = None):
        """
        Initialize the OnnxHubert object.

        Parameters:
            model_path (str): The path to the ONNX model file.
            thread_num (int, optional): The number of intra-op threads to use
                for inference. Defaults to None, letting ONNX Runtime decide.

        Attributes:
            session (ort.InferenceSession): The ONNX Runtime session.
            input_name (str): The name of the first input node.
            output_name (str): The name of the first output node.
        """
        options = ort.SessionOptions()
        if thread_num is not None:
            options.intra_op_num_threads = thread_num
        self.session = ort.InferenceSession(model_path, sess_options=options)

        self.input_name = self.session.get_inputs()[0].name
        self.output_name = self.session.get_outputs()[0].name

    def extract_features(
        self, 
        source: np.ndarray, 
        padding_mask: np.ndarray
    ) -> np.ndarray:
        """
        Extract features from the batch using the ONNX model.

        Inputs:
            source: ndarray of shape (batch_size, sequence_length) float32
            padding_mask: ndarray of shape (batch_size, sequence_length) bool

        Returns:
            ndarray of shape (D, 768) with the extracted features
        """
        result = self.session.run(None, {
            "source": source,
            "padding_mask": padding_mask
        })
        return result[0]
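When batching waveforms of different lengths, the shorter clips must be zero-padded to a common length and the padded positions flagged in the mask. A minimal numpy-only sketch, assuming the usual fairseq convention that True marks padded positions (the helper name and the random clips are hypothetical):

```python
import numpy as np

def batch_waveforms(waveforms):
    """Zero-pad waveforms to a common length and build the padding mask.

    Assumes the fairseq convention: True marks padded positions.
    """
    max_len = max(len(w) for w in waveforms)
    source = np.zeros((len(waveforms), max_len), dtype=np.float32)
    padding_mask = np.ones((len(waveforms), max_len), dtype=np.bool_)
    for i, w in enumerate(waveforms):
        source[i, : len(w)] = w
        padding_mask[i, : len(w)] = False  # real samples are not padding
    return source, padding_mask

# Two clips of different lengths (hypothetical data)
a = np.random.randn(16000).astype(np.float32)
b = np.random.randn(12000).astype(np.float32)
source, mask = batch_waveforms([a, b])
print(source.shape, mask[1, 12000:].all())  # (2, 16000) True
```

The resulting source and mask can be passed directly to extract_features above.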

Installation

You can install the required libraries with:

pip install onnxruntime numpy