Hubert Base ONNX Model for Voice Conversion
This is the ONNX-exported version of the Hubert Base model, fine-tuned for voice conversion and compatible with modern inference pipelines. It enables fast, efficient audio processing in ONNX Runtime environments.
It builds upon facebook/hubert-base-ls960.
Features
- Converts audio features into high-quality embeddings for voice conversion tasks.
- Fully ONNX-compatible for optimized inference on CPUs and GPUs.
- Lightweight and easy to integrate in custom voice processing pipelines.
- No extra dependencies beyond numpy and onnxruntime.
ONNX Model Report
Model: hubert_base.onnx
Producer: pytorch 2.0.0
IR Version: 8
Opsets: ai.onnx:18
Parameters: 94,370,816
Inputs
- source | type: `float32` | shape: `[batch_size, sequence_length]` - mono PCM waveform as 32-bit floats, sampled at 16,000 Hz.
- padding_mask | type: `bool` | shape: `[batch_size, sequence_length]` - usually an all-False array with the same shape as the waveform (True marks padded samples):

  ```python
  padding_mask = np.zeros(waveform.shape, dtype=np.bool_)
  ```
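To make the input contract concrete, here is a minimal sketch of preparing both tensors for a single clip (the one-second silent waveform is an illustrative stand-in for real audio, which you would load and resample to 16 kHz float32 mono):

```python
import numpy as np

SAMPLE_RATE = 16_000  # the model expects 16 kHz mono audio

# Illustrative stand-in for a real recording: one second of silence.
waveform = np.zeros(SAMPLE_RATE, dtype=np.float32)

# Add the batch dimension: (batch_size, sequence_length).
source = waveform[np.newaxis, :]

# All-False mask: no samples are padding in this single-clip batch.
padding_mask = np.zeros(source.shape, dtype=np.bool_)

print(source.shape, source.dtype)  # (1, 16000) float32
print(padding_mask.shape, padding_mask.dtype)  # (1, 16000) bool
```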
Outputs
- features | type: `float32` | shape: `[batch_size, frames, 768]` - one 768-dimensional embedding per output frame.
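As a rough sanity check on the output length, assuming the standard HuBERT convolutional front end with a total stride of 320 samples (about one frame per 20 ms at 16 kHz), the number of output frames can be estimated from the input length:

```python
SAMPLE_RATE = 16_000
TOTAL_STRIDE = 320  # assumed total stride of the HuBERT conv feature extractor

num_samples = SAMPLE_RATE * 1  # one second of audio
approx_frames = num_samples // TOTAL_STRIDE
print(approx_frames)  # 50 frames per second of audio
```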
Usage
```python
import numpy as np
import onnxruntime as ort


class OnnxHubert:
    """
    Load and run the ONNX model exported from Hubert.

    Attributes:
        session (ort.InferenceSession): The ONNX Runtime session.
        input_name (str): The name of the first input node.
        output_name (str): The name of the output node.
    """

    def __init__(self, model_path: str, thread_num: int = None):
        """
        Initialize the OnnxHubert object.

        Parameters:
            model_path (str): The path to the ONNX model file.
            thread_num (int, optional): Number of intra-op threads to use for
                inference. Defaults to None (let ONNX Runtime decide).
        """
        options = ort.SessionOptions()
        if thread_num is not None:
            options.intra_op_num_threads = thread_num
        self.session = ort.InferenceSession(model_path, sess_options=options)
        self.input_name = self.session.get_inputs()[0].name
        self.output_name = self.session.get_outputs()[0].name

    def extract_features(
        self,
        source: np.ndarray,
        padding_mask: np.ndarray,
    ) -> np.ndarray:
        """
        Extract features from a batch of waveforms using the ONNX model.

        Parameters:
            source: ndarray of shape (batch_size, sequence_length), float32.
            padding_mask: ndarray of shape (batch_size, sequence_length), bool.

        Returns:
            ndarray of shape (batch_size, frames, 768) with the extracted features.
        """
        result = self.session.run(None, {
            "source": source,
            "padding_mask": padding_mask,
        })
        return result[0]
```
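When batching clips of different lengths, shorter waveforms are zero-padded to the longest length and the mask marks the padded samples as True. A minimal numpy sketch (the clip lengths and model filename are illustrative):

```python
import numpy as np

# Two illustrative clips of different lengths (float32 mono at 16 kHz).
clips = [
    np.zeros(12_000, dtype=np.float32),
    np.zeros(16_000, dtype=np.float32),
]

max_len = max(len(c) for c in clips)
source = np.zeros((len(clips), max_len), dtype=np.float32)
padding_mask = np.ones((len(clips), max_len), dtype=np.bool_)

for i, clip in enumerate(clips):
    source[i, : len(clip)] = clip
    padding_mask[i, : len(clip)] = False  # real samples are not padding

# features = OnnxHubert("hubert_base.onnx").extract_features(source, padding_mask)
print(source.shape)  # (2, 16000)
print(padding_mask[0, -1], padding_mask[1, -1])  # True False
```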
Installation
You can install the required libraries with:
```shell
pip install onnxruntime numpy
```
Model tree for MidFord327/Hubert-Base-ONNX
- Base model: facebook/hubert-base-ls960