Prof. Dr. Alan Akbik

Hi, I'm a professor at the Humboldt University of Berlin, leading the chair of machine learning. I focus on natural language processing (NLP) research and the development of popular open-source libraries such as Flair NLP.

Check out my research, my publications, and my Chair!

Pinned Message

Boldt-1B Released!

State-of-the-art performance for German NLP: We are releasing Boldt-1B, a new open-source foundation LLM that outperforms existing major LLMs in its parameter class. Check it out!

Latest News

05/05/2026

New Release

Our German foundation LLM Boldt-1B is now publicly available! It is a compact German model that outperforms existing state-of-the-art models in its parameter class. Try it out!

05/05/2026

New Paper (arXiv 2026)

Our paper "Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling", which explains the background to our Boldt family of German models and benchmarks, is now on arXiv!

04/15/2026

New Research Grant

The Investitionsbank Berlin (IBB) approved a new research grant for a Forschungstransfer project to industry partner Wordliner GmbH! We're hiring again!

04/07/2026

Paper accepted (ACL 2026)

Our paper "Beyond Marginal Distributions: A Framework to Evaluate the Representativeness of Demographic-Aligned LLMs" accepted to ACL 2026!

02/04/2026

New Research Grant

The Investitionsbank Berlin (IBB) approved a new research grant for a Forschungstransfer project to industry partner Slomofone GmbH! We are hiring again!

01/28/2026

Paper accepted (EACL 2026)

Our paper "Using Subword-Embeddings for Bilingual Lexicon Induction in Bantu Languages" accepted to EACL 2026 (AfricaNLP Workshop)!

01/04/2026

Paper accepted (EACL 2026)

Our paper "FiNERweb: Datasets and Artifacts for Scalable Multilingual Named Entity Recognition" accepted to EACL 2026!

10/18/2025

New Research Grant

The Investitionsbank Berlin (IBB) approved a new ProFIT research grant for a 3-year research project together with industry partner OMQ GmbH! We're hiring again!

10/01/2025

New Lab Member

We welcome our new PhD student Filipe Laitenberger to the team!

10/01/2025

New Lab Member

We welcome our new research engineer Max Dallabetta to the team!

09/25/2025

New Research Grant

The Investitionsbank Berlin (IBB) approved a new research grant for a Forschungstransfer project to industry partner Wordliner GmbH!

09/15/2025

Paper accepted (BabyLM 2025)

Our paper "Babies Learn to Look Ahead: Multi-Token Prediction in Small LMs" accepted to BabyLM 2025 (EMNLP Workshop)!

09/15/2025

Paper accepted (BabyLM 2025)

Our paper "Sample-Efficient Language Modeling with Linear Attention and Lightweight Enhancements" accepted to BabyLM 2025 (EMNLP Workshop)!

08/20/2025

Paper accepted (EMNLP 2025)

Our paper "Improving Online Job Advertisement Analysis via Compositional Entity Extraction" accepted to EMNLP 2025 (main conference)!

08/20/2025

Paper accepted (EMNLP 2025)

Our paper "Token-Level Metrics for Detecting Incorrect Gold Annotations in Named Entity Recognition" accepted to EMNLP 2025 (findings)!

08/20/2025

Paper accepted (EMNLP 2025)

Our paper "Lemma Dilemma: On Lemma Generation Without Domain- or Language-Specific Training Data" accepted to EMNLP 2025 (findings)!

06/22/2025

Paper accepted (ACL 2025)

Our paper "Question Decomposition for Retrieval-Augmented Generation" accepted to ACL 2025 (SRW workshop)!

06/11/2025

Paper accepted (ACL 2025)

Our paper "From Data to Knowledge: Evaluating How Efficiently Language Models Learn Facts" accepted to ACL 2025 (L2M2 workshop)!

06/11/2025

Paper accepted (ACL 2025)

Our paper "Towards a Principled Evaluation of Knowledge Editors" accepted to ACL 2025 (L2M2 workshop)!

05/22/2025

Paper accepted (ACL 2025)

Our paper "Measuring Label Ambiguity in Subjective Tasks using Predictive Uncertainty Estimation" accepted to ACL 2025 (LAW workshop)!

05/16/2025

Paper accepted (ACL 2025)

Our paper "Evaluating Design Decisions for Dual Encoder-based Entity Disambiguation" accepted to ACL 2025 (main conference)!

05/16/2025

Paper accepted (ACL 2025)

Our paper "Pre-Training Curriculum for Multi-Token Prediction in Language Models" accepted to ACL 2025 (main conference)!

05/04/2025

Paper accepted (CVPR 2025)

Our paper "Don't Mesh with Me: Generating Constructive Solid Geometry Instead of Meshes by Fine-Tuning a Code-Generation LLM" accepted to CVPR 2025 (AI for Content Creation Workshop)!

04/19/2025

New Paper (arXiv 2024)

Our paper "Empirical Evaluation of Knowledge Distillation from Transformers to Subquadratic Language Models" now on arXiv!

03/13/2025

Paper accepted (ICLR 2025)

Our paper "MastermindEval: A Simple But Scalable Reasoning Benchmark" accepted to ICLR 2025 (LLM Reasoning Workshop)!

03/05/2025

New Lab Member

We welcome our new PhD student Piet Wagner to the team!

03/01/2025

Paper accepted (NAACL 2025)

Our paper "TransformerRanker: A Tool for Efficiently Finding the Best-Suited Language Models for Downstream Classification Tasks" accepted to NAACL 2025 (system demonstrations)!

03/01/2025

Paper accepted (NAACL 2025)

Our paper "LM-Pub-Quiz: A Comprehensive Framework for Zero-Shot Evaluation of Relational Knowledge in Language Models" accepted to NAACL 2025 (system demonstrations)!

01/25/2025

Paper accepted (NAACL 2025)

Our paper "Familarity: Better Evaluation of Zero-Shot Named Entity Recognition by Quantifying Label Shifts in Synthetic Training Data" accepted to NAACL 2025 (main conference)!