FireRedChat-turn-detector

Description

A compact end-of-turn detection model used in FireRedChat. A LiveKit plugin is available here.

  • chinese_best_model_q8.onnx: FireRedChat turn-detector model (Chinese only)
  • multilingual_best_model_q8.onnx: FireRedChat turn-detector model (Chinese and English)
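The two checkpoints differ only in language coverage, so selecting one can be reduced to a small helper. This is a hypothetical sketch (the `pick_model` function is not part of the release; only the two filenames above are):

```python
def pick_model(language: str) -> str:
    """Pick the checkpoint by expected input language.

    The Chinese-only model for "zh"; the multilingual
    (Chinese + English) model otherwise.
    """
    if language == "zh":
        return "chinese_best_model_q8.onnx"
    return "multilingual_best_model_q8.onnx"

print(pick_model("zh"))  # chinese_best_model_q8.onnx
print(pick_model("en"))  # multilingual_best_model_q8.onnx
```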

Roadmap

  • 2025/09
    • Release the ONNX checkpoints and LiveKit plugin.

Usage

import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

def softmax(x):
    # Numerically stable softmax over the class axis
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

# Load the quantized ONNX model for CPU inference
session = ort.InferenceSession(
    "chinese_best_model_q8.onnx", providers=["CPUExecutionProvider"]
)

# Truncate from the left so the most recent tokens of the utterance are kept
tokenizer = AutoTokenizer.from_pretrained(
    "./tokenizer",
    local_files_only=True,
    truncation_side="left"
)

text = "这是一句没有标点的文本"  # "A sentence of text without punctuation"
inputs = tokenizer(
            text,
            truncation=True,
            padding='max_length',
            add_special_tokens=False,
            return_tensors="np",
            max_length=128,
        )
# Run inference
outputs = session.run(None, 
                      {
                          "input_ids": inputs["input_ids"].astype("int64"), 
                          "attention_mask": inputs["attention_mask"].astype("int64")
                      })
# The last class probability is the end-of-utterance probability
eou_probability = softmax(outputs[0]).flatten()[-1]
print(eou_probability, eou_probability > 0.5)

Acknowledgment

  • Base model: google-bert/bert-base-multilingual-cased (license: "apache-2.0")