Welcome!

I am a professor of machine learning at Humboldt-Universität zu Berlin. My research focuses on natural language processing (NLP), i.e. methods that enable machines to understand human language. This spans research topics such as transfer learning, few-shot learning and semantic parsing, as well as application areas in large-scale text analytics. My research is operationalized in the form of the open source NLP framework Flair, which allows anyone to use state-of-the-art NLP methods in their research or applications. My group and I maintain and develop the Flair framework together with the open source community.

If you'd like to know more, check out my publications, the Flair NLP project, the Universal Proposition Banks, or contact me.


Open positions!

Our group is growing quickly, so we are always looking for PhD candidates and student assistants to join us!

Currently we have three open positions for PhD candidates (full time, fully paid):
  • one position to begin at the end of 2021
  • two positions to begin in October 2022 (application deadline is already November 2021!)
For all three positions, the application deadline is soon (October/November). Consider applying, and contact me if you are interested!

Latest News

  • 06.09.2021 - Research grant: DFG-funded project "Modeling Neurogenesis for Continuous Learning" in the Science of Intelligence cluster of excellence approved! Positions for PhD candidates available from October 2022!
  • 30.08.2021 - New release: Flair (v0.9) released, bringing speed improvements and many new NLP tasks to Flair!
  • 23.08.2021 - Research grant: DFG-funded project "Efficient Model Learning from Data with Partially Incorrect Labels" in the Science of Intelligence cluster of excellence approved! Positions for PhD candidates available from October 2022!
  • 01.08.2021 - Research grant: The BMWi-funded project "ML-SEBIRA" with industry partner neofonie starts today! Positions available!
  • 01.07.2021 - Paper accepted: Full paper on early predator detection accepted to ACL 2021!
  • 01.05.2021 - Major research grant: The DFG-funded project "Eidetische Repräsentationen natürlicher Sprache" starts today. This is a massively funded 6-year project looking to advance the state of the art in neural language modeling (BERT, GPT-3). Many positions available in our group!
  • 01.05.2021 - Research grant: The DFG-funded project "Neural Representations for Lifelong Learning" in the Science of Intelligence cluster of excellence starts today!
  • 01.04.2021 - Research grant: The BMWi-funded project "ML-ENA" with industry partner deepset starts today! Positions available!
  • 05.03.2021 - New release: Flair (v0.8) released, adding our powerful FLERT approach and HuggingFace model hub support to Flair!
  • 01.03.2021 - Research grant: The IBB-funded project "AIM" with industry partners Ubermetrics and Webtrekk starts today!

Latest Publications

Early Detection of Sexual Predators in Chats. Matthias Vogt, Ulf Leser and Alan Akbik. Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021. [pdf]

HunFlair: An easy-to-use tool for state-of-the-art biomedical named entity recognition. Leon Weber, Mario Sänger, Jannes Münchmeyer, Maryam Habibi, Ulf Leser and Alan Akbik. Bioinformatics. 2021 [pdf]

FLERT: Document-Level Features for Named Entity Recognition. Stefan Schweter and Alan Akbik. arXiv. 2020 [pdf]

more publications


Main Research

Flair NLP. My current main line of research focuses on new neural approaches to core NLP tasks. In particular, we present an approach that leverages character-level neural language modeling to learn latent representations that encode "general linguistic and world knowledge". These representations are then used as word embeddings to set new state-of-the-art scores for classic NLP tasks such as multilingual named entity recognition and part-of-speech tagging. Check out the project overview page for more details.
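
As a rough illustration of the idea (not Flair's actual implementation), the toy sketch below reads a sentence character by character in both directions and builds each word's vector from the hidden state at the word's last character (left-to-right pass) and at its first character (right-to-left pass). Everything in it is an assumption made for the demo: the names `run_char_rnn` and `embed_words`, the hidden size, the simple tanh recurrence, and above all the random, untrained weights, which stand in for a trained character-level language model.

```python
import math
import random
import re

HIDDEN = 8  # hypothetical state size, chosen only to keep the demo small


def _random_vector(seed_text, size):
    """Deterministic pseudo-random vector derived from a string seed."""
    rng = random.Random(seed_text)
    return [rng.uniform(-0.5, 0.5) for _ in range(size)]


def run_char_rnn(text, name):
    """Run an untrained character-level RNN over `text`.

    Returns one hidden state per character. `name` selects an independent
    set of weights, so "forward" and "backward" act as two separate models.
    """
    weights = [_random_vector(f"{name}:row:{i}", HIDDEN) for i in range(HIDDEN)]
    state = [0.0] * HIDDEN
    states = []
    for ch in text:
        char_input = _random_vector(f"{name}:char:{ch}", HIDDEN)
        state = [
            math.tanh(char_input[i] + sum(w * s for w, s in zip(weights[i], state)))
            for i in range(HIDDEN)
        ]
        states.append(state)
    return states


def embed_words(sentence):
    """Return (word, vector) pairs whose vectors depend on the whole sentence."""
    forward = run_char_rnn(sentence, "forward")
    # Read the sentence right to left, then flip the list so that
    # backward[i] is the state after reading from the end down to character i.
    backward = run_char_rnn(sentence[::-1], "backward")[::-1]
    result = []
    for match in re.finditer(r"\S+", sentence):
        first, last = match.start(), match.end() - 1
        # Concatenate: forward state at the word's last character,
        # backward state at the word's first character.
        result.append((match.group(), forward[last] + backward[first]))
    return result


if __name__ == "__main__":
    for word, vector in embed_words("the river bank"):
        print(word, [round(v, 3) for v in vector[:4]])
```

Because the states are computed over the whole character sequence, the same word receives a different vector in "the river bank" than in "the bank account". With random weights these vectors carry no linguistic knowledge; the sketch only shows how word-level representations can be read off a character-level model.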

Universal Proposition Banks. In this line of research, I am investigating methods for semantically parsing text data in a wide range of languages, such as Arabic, Chinese, German, Hindi, Russian and many others. In order to train such parsers, we are automatically generating Proposition Bank-style resources from parallel corpora. We are making all resources publicly available, so check out the project overview page for more details and the generated Proposition Banks.


Alan Akbik

Professor of Machine Learning
Humboldt-Universität zu Berlin
alan [dot] akbik [ät] hu-berlin [dot] de