Add Swiss German adapter
- README.md +22 -2
- config.json +3 -2
- pytorch_model.bin +2 -2
README.md CHANGED

````diff
@@ -5,6 +5,7 @@ language:
 - fr
 - it
 - rm
+- gsw
 - multilingual
 inference: false
 ---
@@ -19,6 +20,9 @@ In addition, we used a Switzerland-specific subword vocabulary.
 
 The pre-training code and usage examples are available [here](https://github.com/ZurichNLP/swissbert). We also release a version that was fine-tuned on named entity recognition (NER): https://huggingface.co/ZurichNLP/swissbert-ner
 
+## Update 2024-01: Support for Swiss German
+
+We added a Swiss German adapter to the model.
+
 ## Languages
 
 SwissBERT contains the following language adapters:
@@ -29,6 +33,7 @@ SwissBERT contains the following language adapters:
 | 1 | `fr_CH` | French |
 | 2 | `it_CH` | Italian |
 | 3 | `rm_CH` | Romansh Grischun |
+| 4 | `gsw` | Swiss German |
 
 ## License
 Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).
````
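SwissBERT follows the X-MOD design, in which exactly one language adapter is active per forward pass, selected by the codes in the table above. A minimal sketch of activating the new `gsw` adapter via the standard `transformers` API (the Swiss German sample sentence is our own; the repository's official usage examples are linked above):

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ZurichNLP/swissbert")
model = AutoModel.from_pretrained("ZurichNLP/swissbert")

# Route all subsequent forward passes through the Swiss German adapter
model.set_default_language("gsw")

# Illustrative sentence in written Swiss German (our own example)
inputs = tokenizer("Das isch es churzes Bischpil.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```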
````diff
@@ -87,6 +92,10 @@ SwissBERT is not designed for generating text.
 - Training data: German, French, Italian and Romansh documents in the [Swissdox@LiRI](https://t.uzh.ch/1hI) database, until 2022.
 - Training procedure: Masked language modeling
 
+The Swiss German adapter was trained on the following two datasets of written Swiss German:
+1. [SwissCrawl](https://icosys.ch/swisscrawl) ([Linder et al., LREC 2020](https://aclanthology.org/2020.lrec-1.329)), a collection of Swiss German web text (forum discussions, social media).
+2. A custom dataset of Swiss German tweets
+
 ## Environmental Impact
 - Hardware type: RTX 2080 Ti.
 - Hours used: 10 epochs × 18 hours × 8 devices = 1440 hours
@@ -95,7 +104,7 @@ SwissBERT is not designed for generating text.
 - Carbon efficiency: 0.0016 kg CO2e/kWh ([source](https://t.uzh.ch/1rU))
 - Carbon emitted: 0.6 kg CO2e ([source](https://mlco2.github.io/impact#compute))
 
-##
+## Citations
 ```bibtex
 @article{vamvas-etal-2023-swissbert,
     title={Swiss{BERT}: The Multilingual Language Model for Switzerland},
````
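As a sanity check, the emissions figures in the hunks above are mutually consistent if one assumes an average draw of about 250 W per RTX 2080 Ti (the card model is stated in the diff; the wattage is our assumption, roughly its TDP):

```python
# Reproducing the carbon estimate above; the 0.25 kW per-device draw is assumed
device_hours = 10 * 18 * 8          # epochs x hours/epoch x devices = 1440
energy_kwh = device_hours * 0.25    # 360 kWh at the assumed draw
carbon_kg = energy_kwh * 0.0016     # kg CO2e per kWh (carbon efficiency above)
print(f"{carbon_kg:.2f} kg CO2e")   # 0.58, matching the stated ~0.6 kg
```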
````diff
@@ -106,4 +115,15 @@ SwissBERT is not designed for generating text.
     primaryClass={cs.CL},
     url={https://arxiv.org/abs/2303.13310}
 }
-```
+```
+
+Swiss German adapter:
+```bibtex
+@inproceedings{vamvas-etal-2024-modular,
+    title={Modular Adaptation of Multilingual Encoders to Written Swiss German Dialect},
+    author={Jannis Vamvas and No{\"e}mi Aepli and Rico Sennrich},
+    booktitle={First Workshop on Modular and Open Multilingual NLP},
+    year={2024},
+}
+```
+
````
config.json CHANGED

````diff
@@ -18,7 +18,8 @@
     "de_CH",
     "fr_CH",
     "it_CH",
-    "rm_CH"
+    "rm_CH",
+    "gsw"
   ],
   "layer_norm_eps": 1e-05,
   "ln_before_adapter": true,
@@ -30,7 +31,7 @@
   "position_embedding_type": "absolute",
   "pre_norm": false,
   "torch_dtype": "float32",
-  "transformers_version": "4.
+  "transformers_version": "4.33.2",
   "type_vocab_size": 1,
   "use_cache": true,
   "vocab_size": 50262
````
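The `languages` list in `config.json` is how an X-MOD checkpoint declares its adapter set, so appending `"gsw"` here is what makes the new adapter selectable by name. A quick check with the standard `transformers` API:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("ZurichNLP/swissbert")

# After this commit the adapter list includes the Swiss German entry
print(config.languages)          # [..., "rm_CH", "gsw"]
assert "gsw" in config.languages
```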
pytorch_model.bin CHANGED

````diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:3621abd43ac00e35367a180626eccb4091493178ed6f922fc78717e2a4c06fed
+size 640768013
````
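Only the Git LFS pointer changes in the repository itself; the new weights are stored out of band and identified by the SHA-256 digest and byte size above. A small sketch for verifying a downloaded checkpoint against this pointer (the local file path is hypothetical):

```python
import hashlib
import os

EXPECTED_SHA256 = "3621abd43ac00e35367a180626eccb4091493178ed6f922fc78717e2a4c06fed"
EXPECTED_SIZE = 640768013  # bytes, from the pointer file above

path = "pytorch_model.bin"  # hypothetical path to the downloaded weights
digest = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)

print(os.path.getsize(path) == EXPECTED_SIZE)
print(digest.hexdigest() == EXPECTED_SHA256)
```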