Research

My research focuses on efficient and practical machine learning for NLP. This includes resource-efficient training of large language models (LLMs) and state-of-the-art methods for information extraction (IE) from text. In particular, my group develops tangible outputs in the form of open-source libraries, publicly available datasets, and online platforms. See the highlights below!


Flair NLP

We develop Flair, a widely used library for state-of-the-art NLP that powers thousands of industrial, academic, and open-source projects.

Read more

Büble-LM

Büble-LM is our new state-of-the-art 2-billion-parameter language model (LM) for German!

Read more

Zitatsuchmaschine

Our German-language search engine for quotes!

Read more

Fundus

Need to crawl online news? With Fundus, you can crawl millions of pages of online news with just a few lines of code!

Read more

TransformerRanker

TransformerRanker automatically finds the best-suited LM for your NLP task!

Read more

OpinionGPT

OpinionGPT is a ChatGPT-style model trained specifically to be biased and opinionated!

Read more

CleanCoNLL

CleanCoNLL is a nearly noise-free dataset for named entity recognition (NER). Use it to train and evaluate your NER models!

Read more

LM Pub Quiz

With LM Pub Quiz, you can measure and compare the factual knowledge of language models!

Read more

NoiseBench

With NoiseBench, you can measure the robustness of your ML approach to real-world label noise!

Read more