Research

My research focuses on efficient and practical machine learning for NLP. This includes resource-efficient training of large language models (LLMs) and state-of-the-art methods for information extraction (IE) from text. In particular, my group aims to produce tangible output in the form of open-source libraries, publicly available datasets, and online platforms. See highlights below!

Flair NLP

We develop Flair, a widely used library for state-of-the-art NLP. It is used in thousands of industrial, academic, and open-source projects.

Büble-LM

Büble-LM is our new state-of-the-art 2-billion-parameter language model (LM) for German!

Zitatsuchmaschine

Our German-language search engine for quotes!

Fundus

Need to crawl online news? With Fundus, you can crawl millions of pages of online news with just a few lines of code!

TransformerRanker

TransformerRanker automatically finds the best-suited LM for your NLP task!

OpinionGPT

OpinionGPT is a ChatGPT-style model trained specifically to be biased and opinionated!

CleanCoNLL

CleanCoNLL is a nearly noise-free dataset for named entity recognition (NER). Use it to train and evaluate your NER models!

LM Pub Quiz

Measure and compare the factual knowledge of language models!

NoiseBench

With NoiseBench, you can measure the robustness of your ML approach to real-world label noise!