diff --git a/model_cards/codegram/calbert-base-uncased/README.md b/model_cards/codegram/calbert-base-uncased/README.md
new file mode 100644
index 000000000..77cb5254a
--- /dev/null
+++ b/model_cards/codegram/calbert-base-uncased/README.md
@@ -0,0 +1,25 @@
+---
+language: catalan
+---
+
+# CALBERT: a Catalan Language Model
+
+## Introduction
+
+CALBERT is an open-source language model for Catalan based on the ALBERT architecture.
+
+It is now available on Hugging Face in its `base-uncased` version, and was pretrained on the [OSCAR dataset](https://traces1.inria.fr/oscar/).
+
+For further information or requests, please go to the [GitHub repository](https://github.com/codegram/calbert)
+
+## Pre-trained models
+
+| Model                               | Arch.            | Training data                     |
+|-------------------------------------|------------------|-----------------------------------|
+| `codegram` / `calbert-base-uncased` | Base (uncased)   | OSCAR (4.3 GB of text)            |
+
+
+## Authors
+
+CALBERT was trained and evaluated by [Txus Bach](https://twitter.com/txustice), as part of [Codegram](https://www.codegram.com)'s applied research.
+