What could be the use cases for this?
#1 opened by krigeta
This model is not intended to be used on its own, but I think it has the potential to serve as a foundational audio model similar to HuBERT or WavLM.
In fact, in my preliminary experiments, fine-tuning it on an environmental sound classification task worked to some extent, although the accuracy was not particularly high.
It may also be possible to use it as the audio encoder component of other speech LLMs.
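For anyone who wants to try the classification route, here is a rough sketch of what such a setup could look like in PyTorch. It is not the code from my experiments: `DummyAudioEncoder` is a hypothetical stand-in for the real encoder, and the feature shape (128 mel bins), hidden size (256), and class count (50) are placeholder assumptions. The idea is simply to mean-pool the frame-level outputs over valid frames and train a linear head on top.

```python
import torch
import torch.nn as nn


class DummyAudioEncoder(nn.Module):
    """Hypothetical stand-in for the real audio encoder.

    Maps (batch, frames, n_mels) features to (batch, frames, hidden) embeddings.
    Replace it with the actual encoder when experimenting.
    """

    def __init__(self, n_mels: int = 128, hidden: int = 256):
        super().__init__()
        self.proj = nn.Linear(n_mels, hidden)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.proj(feats))


class PooledClassifier(nn.Module):
    """Masked mean pooling over frames followed by a linear classification head."""

    def __init__(self, encoder: nn.Module, hidden: int, num_classes: int,
                 freeze_encoder: bool = True):
        super().__init__()
        self.encoder = encoder
        if freeze_encoder:
            # Linear-probe setting: only the head receives gradients.
            for p in self.encoder.parameters():
                p.requires_grad = False
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, feats: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        frames = self.encoder(feats)                    # (B, T, H)
        weights = mask.unsqueeze(-1).to(frames.dtype)   # (B, T, 1), 1 = valid frame
        pooled = (frames * weights).sum(dim=1) / weights.sum(dim=1).clamp(min=1.0)
        return self.head(pooled)                        # (B, num_classes)


torch.manual_seed(0)
model = PooledClassifier(DummyAudioEncoder(), hidden=256, num_classes=50)

# Random placeholder batch: 4 clips, 100 frames, 128 mel bins.
feats = torch.randn(4, 100, 128)
mask = torch.ones(4, 100, dtype=torch.bool)
mask[1, 60:] = False                    # second clip is shorter (padded)
labels = torch.randint(0, 50, (4,))

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)

# One training step.
logits = model(feats, mask)
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()
optimizer.step()
```

Setting `freeze_encoder=False` turns this into full fine-tuning rather than a linear probe; in that case a smaller learning rate for the encoder is usually a safer starting point.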
Amazing work for audio.
Could you share the code you used to convert Qwen3-omni into Qwen3-omni-Audio?
In principle, this is a Whisper pro
Using Whisper as an analogy, which part or module of the Whisper model is this model comparable to?