I am a professor of machine learning at Humboldt-Universität zu Berlin. My research focuses on natural language processing (NLP), i.e. methods that enable machines to understand human language. This spans research topics such as transfer learning, few-shot learning and semantic parsing, as well as application areas in large-scale text analytics. My research is operationalized in the form of the open source NLP framework Flair, which allows anyone to use state-of-the-art NLP methods in their research or applications. I maintain and develop the Flair framework together with my group and the open source community.
If you'd like to know more, check out my publications, the Flair NLP project, the Universal Proposition Banks, or contact me.
Our group is growing quickly, so we are always looking for PhD candidates and student assistants to join us. Contact me if you are interested!
Fabricator: An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs. Jonas Golde, Patrick Haller, Felix Hamborg, Julian Risch and Alan Akbik. ArXiv, 2023. [pdf]
OpinionGPT: Modelling Explicit Biases in Instruction-Tuned LLMs. Patrick Haller, Ansar Aynetdinov and Alan Akbik. ArXiv, 2023. [pdf]
ZELDA: A Comprehensive Benchmark for Supervised Entity Disambiguation. Marcel Milich and Alan Akbik. 17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023. [pdf]
Medical Coding with Biomedical Transformer Ensembles and Zero/Few-shot Learning. Angelo Ziletti, Alan Akbik, Christoph Berns, Thomas Herold, Marion Legler, Martina Viell. 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track, NAACL 2022. [pdf]
Flair NLP. My current main line of research focuses on new neural approaches to core NLP tasks. In particular, we present an approach that leverages character-level neural language modeling to learn latent representations that encode "general linguistic and world knowledge". These representations are then used as word embeddings to set new state-of-the-art scores for classic NLP tasks such as multilingual named entity recognition and part-of-speech tagging. Check out the project overview page for more details.
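As a rough illustration of the character-level idea, here is a minimal Python sketch of how word representations could be read off a character sequence. It assumes a forward language model whose state is taken at a word's last character and a backward model whose state is taken at its first character. These positions, and the helper names `word_char_spans` and `extraction_points`, are illustrative assumptions rather than Flair's actual implementation; no model is run, only the character offsets are computed.

```python
from typing import List, Tuple


def word_char_spans(text: str) -> List[Tuple[str, int, int]]:
    """Return (word, start, end) for whitespace-separated words; end is exclusive."""
    spans = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)  # locate the word at or after the previous end
        end = start + len(word)
        spans.append((word, start, end))
        pos = end
    return spans


def extraction_points(text: str) -> List[Tuple[str, int, int]]:
    """For each word, return (word, forward_index, backward_index).

    forward_index: position of the word's last character (assumed read-out
    point of a forward character-level language model).
    backward_index: position of the word's first character (assumed read-out
    point of a backward character-level language model).
    """
    return [(word, end - 1, start) for word, start, end in word_char_spans(text)]


if __name__ == "__main__":
    for word, fwd, bwd in extraction_points("George Washington was born"):
        print(f"{word}: forward state at char {fwd}, backward state at char {bwd}")
```

In such a setup, the two character-level states for a word would be concatenated to form its embedding, so the same word can receive a different vector depending on the surrounding characters.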
Universal Proposition Banks. In this line of research, I am investigating methods for semantically parsing text data in a wide range of languages, such as Arabic, Chinese, German, Hindi, Russian and many others. In order to train such parsers, we are automatically generating Proposition Bank-style resources from parallel corpora. We are making all resources publicly available, so check out the project overview page for more details and the generated Proposition Banks.
Professor of Machine Learning
Humboldt-Universität zu Berlin
alan [dot] akbik [ät] hu-berlin [dot] de