FireRedChat-turn-detector
Descriptions
A compact end-of-turn detection model used in FireRedChat. A LiveKit plugin is also available.
- chinese_best_model_q8.onnx: FireRedChat turn-detector model (Chinese only)
- multilingual_best_model_q8.onnx: FireRedChat turn-detector model (Chinese and English)
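As a rough illustration of choosing between the two checkpoints, the sketch below maps language codes to a file name. The `pick_checkpoint` helper and the `"zh"`/`"en"` codes are hypothetical and not part of FireRedChat; only the file names come from the list above.

```python
# Hypothetical helper: not part of FireRedChat or its LiveKit plugin.
CHINESE_ONLY = "chinese_best_model_q8.onnx"
MULTILINGUAL = "multilingual_best_model_q8.onnx"


def pick_checkpoint(languages):
    """Return the checkpoint file name for an iterable of language codes.

    Assumes "zh" means Chinese and "en" means English (illustrative codes).
    """
    langs = set(languages)
    if not langs:
        raise ValueError("at least one language code is required")
    unsupported = langs - {"zh", "en"}
    if unsupported:
        raise ValueError(f"unsupported languages: {sorted(unsupported)}")
    # The Chinese-only model suffices when no English input is expected.
    return CHINESE_ONLY if langs == {"zh"} else MULTILINGUAL


print(pick_checkpoint(["zh"]))
print(pick_checkpoint(["zh", "en"]))
```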
Roadmap
- 2025/09: Release the ONNX checkpoints and LiveKit plugin.
Usage
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer


def softmax(x):
    # Subtract the row-wise max before exponentiating for numerical stability.
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)


# Load the quantized ONNX checkpoint on CPU.
session = ort.InferenceSession(
    "chinese_best_model_q8.onnx", providers=["CPUExecutionProvider"]
)

# Truncate from the left so the most recent tokens are kept.
tokenizer = AutoTokenizer.from_pretrained(
    "./tokenizer",
    local_files_only=True,
    truncation_side="left",
)

text = "这是一句没有标点的文本"  # "This is a sentence without punctuation"
inputs = tokenizer(
    text,
    truncation=True,
    padding="max_length",
    add_special_tokens=False,
    return_tensors="np",
    max_length=128,
)

# Run inference
outputs = session.run(
    None,
    {
        "input_ids": inputs["input_ids"].astype("int64"),
        "attention_mask": inputs["attention_mask"].astype("int64"),
    },
)

# The last softmax entry is used as the end-of-utterance (EOU) probability.
eou_probability = softmax(outputs[0]).flatten()[-1]
print(eou_probability, eou_probability > 0.5)
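The final step, turning logits into an end-of-turn decision, can be tried without the ONNX file. The sketch below is a pure-Python restatement of the softmax and threshold. The `eou_decision` name, the example logits, and the assumption that the last class is the end-of-utterance class (as the `flatten()[-1]` indexing above suggests) are illustrative only.

```python
import math


def eou_decision(logits, threshold=0.5):
    """Return (probability, is_end_of_turn) for one row of logits.

    Assumption: the last entry is the end-of-utterance class,
    mirroring the flatten()[-1] indexing in the usage snippet.
    """
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    prob = exps[-1] / sum(exps)
    return prob, prob > threshold


# Made-up logits, not real model output.
print(eou_decision([0.0, 0.0]))   # equal logits give probability 0.5
print(eou_decision([-1.2, 2.3]))
```

Raising `threshold` above 0.5 makes the decision more conservative: the model must be more confident before a turn is treated as finished.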
Acknowledgment
- Base model: google-bert/bert-base-multilingual-cased (license: "apache-2.0")