ZAP: An Open-Source Multilingual Annotation Projection Framework. Alan Akbik and Roland Vollgraf. 11th Language Resources and Evaluation Conference, LREC 2018. (forthcoming)
FEIDEGGER: A Multi-modal Corpus of Fashion Images and Descriptions in German. Leonidas Lefakis, Alan Akbik and Roland Vollgraf. 11th Language Resources and Evaluation Conference, LREC 2018. (forthcoming)
CROWD-IN-THE-LOOP: A Hybrid Approach for Annotating Semantic Roles. Chenguang Wang, Alan Akbik, Laura Chiticariu, Yunyao Li, Fei Xia, Anbang Xu. 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017. [pdf]
Multilingual Information Extraction with PolyglotIE. Alan Akbik, Laura Chiticariu, Marina Danilevsky, Yonas Kbrom, Yunyao Li and Huaiyu Zhu. 26th International Conference on Computational Linguistics, COLING 2016. [pdf][video]
K-SRL: Instance-based Learning for Semantic Role Labeling. Alan Akbik and Yunyao Li. 26th International Conference on Computational Linguistics, COLING 2016. [pdf]
Multilingual Aliasing for Auto-Generating Proposition Banks. Alan Akbik, Xinyu Guan and Yunyao Li. 26th International Conference on Computational Linguistics, COLING 2016. [pdf]
Improving Data Quality by Leveraging Statistical Relational Learning. Larysa Visengeriyeva, Alan Akbik and Manohar Kaul. 21st International Conference on Information Quality, ICIQ 2016.
Towards Semi-Automatic Generation of Proposition Banks for Low-Resource Languages. Alan Akbik, Vishwajeet Kumar and Yunyao Li. 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016. [pdf]
Polyglot: Multilingual Semantic Role Labeling with Unified Labels. Alan Akbik and Yunyao Li. 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016. [pdf]
Exploratory Relation Extraction in Large Multilingual Data. Alan Akbik. PhD Thesis.
Generating High Quality Proposition Banks for Multilingual Semantic Role Labeling. Alan Akbik, Laura Chiticariu, Marina Danilevsky, Yunyao Li, Shivakumar Vaithyanathan and Huaiyu Zhu. 53rd Annual Meeting of the Association for Computational Linguistics, ACL 2015. [pdf]
SCHNÄPPER: A Web Toolkit for Exploratory Relation Extraction. Thilo Michael and Alan Akbik. 53rd Annual Meeting of the Association for Computational Linguistics, ACL 2015. [pdf]
Proceedings of the First AHA!-Workshop on Information Discovery in Text. Alan Akbik and Larysa Visengeriyeva. 25th International Conference on Computational Linguistics, COLING 2014. [pdf]
Extracting a Repository of Events and Event References from News Clusters. Silvia Julinda, Christoph Boden and Alan Akbik. AHA! Workshop on Information Discovery in Text, COLING 2014. [pdf]
Nerdle: Topic-Specific Question Answering Using Wikia Seeds. Umar Maqsud, Sebastian Arnold, Michael Hülfenhaus and Alan Akbik. 25th International Conference on Computational Linguistics, COLING 2014. [pdf]
Exploratory Relation Extraction from Large Text Corpora. Alan Akbik, Thilo Michael and Christoph Boden. 25th International Conference on Computational Linguistics, COLING 2014. [pdf]
The Weltmodell: A Data-Driven Commonsense Knowledge Base. Alan Akbik and Thilo Michael. 9th Edition of the Language Resources and Evaluation Conference, LREC 2014. [pdf]
Freepal: A Large Collection of Deep Lexico-Syntactic Patterns for Relation Extraction. Johannes Kirschnick, Alan Akbik, Holmer Hemsen. 9th Edition of the Language Resources and Evaluation Conference, LREC 2014. [pdf]
Effective Selectional Restrictions for Unsupervised Relation Extraction. Alan Akbik, Larysa Visengeriyeva, Johannes Kirschnick, Alexander Löser. 6th International Joint Conference on Natural Language Processing, IJCNLP 2013. [pdf]
Propminer: A Workflow for Interactive Information Extraction and Exploration using Dependency Trees. Alan Akbik, Oresti Konomi and Michail Melnikov. The 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013. [pdf]
Automatic Preservation Watch using Information Extraction on the Web. Luis Faria, Alan Akbik, Barbara Sierman, Marcel Ras, Miguel Ferreira and Jose Carlos Ramalho. 10th International Conference on Preservation of Digital Objects, iPres 2013. [pdf]
QuoteMine: A Repository of Newsworthy Quotes. Alan Akbik, Martin Schenck. International Conference of the German Society for Computational Linguistics and Language Technology, GSCL 2013. [pdf]
Unsupervised Discovery of Relations and Discriminative Extraction Patterns. Alan Akbik, Larysa Visengeriyeva, Priska Herger, Holmer Hemsen, Alexander Löser. 24th International Conference on Computational Linguistics, COLING 2012. [pdf]
KrakeN: N-ary Facts in Open Information Extraction. Alan Akbik, Alexander Löser. The Knowledge Extraction Workshop at NAACL-HLT, 2012. [pdf]
Master Thesis: Extracting a Repository of Events and Event References from News Clusters
Silvia Julinda created an approach for mining events and their textual representations from news clusters on the Web. Her thesis work led to a publication at the AHA! Workshop on Information Discovery at COLING 2014.
Priska Herger used clustering methods on large corpora of text to determine broad relationship types that hold between nouns, such as hypernymy, meronymy, co-hypernymy and others.
Diplomarbeit: Automatisierte Extraktion textueller Änderungen aus dem Bearbeitungsverlauf von Online-Nachrichtenartikeln
("Extracting Microedits from Online News Articles")
Christian Niedrich mined online news for 'microedits', i.e. small changes that are made to online news articles after they have been published. He is developing a method that automatically constructs a corpus of such edits.
Bachelor Thesis: A Workflow for Defining Information Extraction Patterns
Oresti Konomi designed a workflow and implemented a tool for defining Information Extraction patterns, aimed at users without a background in NLP. Together with the work from Michail Melnikov's bachelor thesis, this work has been published as an ACL 2013 demo.
Bachelor Thesis: Design und Umsetzung eines Systems zur verteilten Ausführung von patternbasierter Informationsextraktion
("Design and Implementation of a System for Distributed Pattern-Based Information Extraction")
Michail Melnikov built a system that executes complex Information Extraction patterns in a distributed environment. Together with the work from Oresti Konomi's bachelor thesis, this work has been published as an ACL 2013 demo.
Diplomarbeit: Automatisierte Extraktion von Zitaten und zugehörigen Themen aus Webdokumenten
("Automatic Extraction of Quotes and Speakers from Web Documents")
Philipp Keese built and evaluated an Information Extraction system that finds quotes and their speakers in Web documents.
Bachelor Thesis: Mining von Events in Twitter mit Fokus auf deutscher Politik
("Mining Twitter Events with Focus on German Politics")
Kfir Admoni built a system that continuously mines Twitter for 'hot topics' in news related to German politics.
Bachelor Thesis: Gezieltes Retrieval von faktenstarken Sätzen im Web auf Basis von Wikipedia
("Targeted Retrieval of Sentences with High Information Content from the Web")
Do Tuan Ahn built a system that finds and retrieves sentences that contain factual data from the Web.
Stephan Pieper investigated machine learning approaches to automatically learn patterns for N-ary open information extraction.
Bachelor Thesis: Information Extraction von Zitaten in türkischsprachigen Quellen
("Quote-Extraction from Turkish Newswire Text")
Ahmet Karakas built a pipeline for extracting quotes and speakers from Turkish-language text. He will also conduct a survey of existing NLP resources for the Turkish language. The results of his work will be integrated into the QuoteMine project.
Bachelor Thesis: Extraktion von Relationen und Konzepten von komplexen Nominalphrasen
("Extraction of Relations and Concepts from Complex Noun Phrases")
Umar Maqsud extracted information from complex noun phrases and investigated when such phrases can be used as concepts in a knowledge base.
Bachelor Thesis: Implementierung und Evaluierung eines Verfahrens zur Erhöhung der Qualität der flachen Extraktion komplexer Nominalphrasen
("Using World Knowledge to find Complex Noun Phrases in Shallow Parsing")
Stefan Schramm investigated a number of 'world knowledge' features in a CRF classifier for finding complex noun phrases, something that can normally only be achieved using a deep syntactic parser. He trained a classifier using different feature sets and evaluated the results.
This is my Master's thesis, which I submitted in May 2009. I examine Relation Extraction as a method for automatically generating a semantically annotated wiki from the English Wikipedia. This work has given rise to many ideas I have since been exploring.
Text and Data Mining
alan [dot] akbik [ät] zalando [dot] de