Büble-LM
Büble-LM is our new state-of-the-art 2-billion-parameter language model (LM) for German!
BübleLM is a state-of-the-art German language model based on Gemma-2B, adapted using trans-tokenization with a custom German SentencePiece tokenizer.
Büble significantly outperforms other German LMs such as Sauerkraut-2B and LLäMmlein-1B on most of the benchmarks we evaluated. It was trained with a novel trans-tokenization approach by Pieter Delobelle while he was a guest researcher at our chair!
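As a rough illustration of the idea behind trans-tokenization, the sketch below initializes the embedding of each German token as a weighted average of the embeddings of the source-vocabulary tokens it is aligned with. It is a minimal sketch under stated assumptions, not the code used to train Büble: the function name, the toy vocabularies, the alignment counts, and the mean-embedding fallback for unaligned tokens are all hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def init_target_embeddings(
    source_embeddings: Dict[str, Vector],
    alignment_counts: Dict[str, Dict[str, float]],
) -> Dict[str, Vector]:
    """Initialize each target-token embedding as a weighted mean of the
    embeddings of the source tokens it is aligned with.

    alignment_counts[target_token][source_token] is a non-negative weight,
    e.g. how often the two tokens were aligned in a parallel corpus
    (hypothetical input format, chosen for this illustration).
    """
    dim = len(next(iter(source_embeddings.values())))
    n = len(source_embeddings)
    # Assumed fallback for target tokens without any usable alignment:
    # the mean of all source embeddings.
    fallback = [sum(v[i] for v in source_embeddings.values()) / n for i in range(dim)]

    target_embeddings: Dict[str, Vector] = {}
    for target_token, counts in alignment_counts.items():
        # Keep only aligned source tokens that exist and carry positive weight.
        known = {s: c for s, c in counts.items() if s in source_embeddings and c > 0}
        total = sum(known.values())
        if total == 0:
            target_embeddings[target_token] = list(fallback)
            continue
        target_embeddings[target_token] = [
            sum(c * source_embeddings[s][i] for s, c in known.items()) / total
            for i in range(dim)
        ]
    return target_embeddings


# Toy data: 2-dimensional embeddings and made-up alignment counts.
source_embeddings = {
    "▁house": [1.0, 0.0],
    "▁home": [0.0, 1.0],
    "▁tree": [2.0, 2.0],
}
alignment_counts = {
    "▁Haus": {"▁house": 3.0, "▁home": 1.0},
    "▁Baum": {"▁tree": 5.0},
    "▁Xyz": {},  # no alignment found, so the fallback is used
}
german_embeddings = init_target_embeddings(source_embeddings, alignment_counts)
```

In this toy setup, "▁Haus" ends up three quarters of the way towards "▁house" and one quarter towards "▁home", while a token with a single aligned source token simply copies that embedding. A real pipeline would derive the alignment weights from a parallel corpus and apply the same mapping to the full embedding matrix of the base model.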
More details on this model coming soon!
Getting Started
- Try out the model
- Check the evaluation results