espnet/geolid_vl107only_independent_trainable

language-identification


qingzhengwang committed on Aug 27

Commit 9f40eb4 · 1 Parent(s): fc03db7

Add arXiv link.

Files changed (1)

README.md +3 -1

README.md CHANGED

@@ -120,6 +120,8 @@ license: cc-by-4.0
 ### `espnet/geolid_vl107only_independent_trainable`
+[Paper](https://arxiv.org/pdf/2508.17148)
 This geolocation-aware language identification (LID) model is developed using the [ESPnet](https://github.com/espnet/espnet/) toolkit. It integrates the powerful pretrained [MMS-1B](https://huggingface.co/facebook/mms-1b) as the encoder and employs [ECAPA-TDNN](https://arxiv.org/pdf/2005.07143) as the embedding extractor to achieve robust spoken language identification.
 The main innovations of this model are:
@@ -127,7 +129,7 @@ The main innovations of this model are:
 2. Conditioning the intermediate representations of the self-supervised learning (SSL) encoder on intermediate-layer information.
 This geolocation-aware strategy greatly improves robustness, especially for dialects and accented variations.
-For further details on the geolocation-aware LID methodology, please refer to our paper: *Geolocation-Aware Robust Spoken Language Identification* (arXiv link to be added).
+For further details on the geolocation-aware LID methodology, please refer to our paper: *Geolocation-Aware Robust Spoken Language Identification* ([arXiv](https://arxiv.org/pdf/2508.17148)).
 ### Usage Guide: How to use in ESPnet2