Publications

On this page you will find an overview of my scientific publications, ordered by year. Further down the page, you will also find other scientific output. Finally, there is a list of the bachelor and master theses I advised while I was a research associate at the Berlin Institute of Technology.

2017

The Projector: An Interactive Annotation Projection Visualization Tool. Alan Akbik and Roland Vollgraf. 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017.

CROWD-IN-THE-LOOP: A Hybrid Approach for Annotating Semantic Roles. Chenguang Wang, Alan Akbik, Laura Chiticariu, Yunyao Li, Fei Xia, Anbang Xu. 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017.


2016

Multilingual Information Extraction with PolyglotIE. Alan Akbik, Laura Chiticariu, Marina Danilevsky, Yonas Kbrom, Yunyao Li and Huaiyu Zhu. 26th International Conference on Computational Linguistics, COLING 2016. [pdf][video]

K-SRL: Instance-based Learning for Semantic Role Labeling. Alan Akbik and Yunyao Li. 26th International Conference on Computational Linguistics, COLING 2016. [pdf]

Multilingual Aliasing for Auto-Generating Proposition Banks. Alan Akbik, Xinyu Guan and Yunyao Li. 26th International Conference on Computational Linguistics, COLING 2016. [pdf]

Improving Data Quality by Leveraging Statistical Relational Learning. Larysa Visengeriyeva, Alan Akbik and Manohar Kaul. 21st International Conference on Information Quality, ICIQ 2016.

Towards Semi-Automatic Generation of Proposition Banks for Low-Resource Languages. Alan Akbik, Vishwajeet Kumar and Yunyao Li. 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016. [pdf]

Polyglot: Multilingual Semantic Role Labeling with Unified Labels. Alan Akbik and Yunyao Li. 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016. [pdf]

Exploratory Relation Extraction in Large Multilingual Data. Alan Akbik. PhD Thesis.


2015

Generating High Quality Proposition Banks for Multilingual Semantic Role Labeling. Alan Akbik, Laura Chiticariu, Marina Danilevsky, Yunyao Li, Shivakumar Vaithyanathan and Huaiyu Zhu. 53rd Annual Meeting of the Association for Computational Linguistics, ACL 2015. [pdf]

SCHNÄPPER: A Web Toolkit for Exploratory Relation Extraction. Thilo Michael and Alan Akbik. 53rd Annual Meeting of the Association for Computational Linguistics, ACL 2015. [pdf]


2014

Proceedings of the First AHA!-Workshop on Information Discovery in Text. Alan Akbik and Larysa Visengeriyeva. 25th International Conference on Computational Linguistics, COLING 2014. [pdf]

Extracting a Repository of Events and Event References from News Clusters. Silvia Julinda, Christoph Boden and Alan Akbik. AHA! Workshop on Information Discovery in Text, COLING 2014. [pdf]

Nerdle: Topic-Specific Question Answering Using Wikia Seeds. Umar Maqsud, Sebastian Arnold, Michael Hülfenhaus and Alan Akbik. 25th International Conference on Computational Linguistics, COLING 2014. [pdf]

Exploratory Relation Extraction from Large Text Corpora. Alan Akbik, Thilo Michael and Christoph Boden. 25th International Conference on Computational Linguistics, COLING 2014. [pdf]

The Weltmodell: A Data-Driven Commonsense Knowledge Base. Alan Akbik and Thilo Michael. 9th Edition of the Language Resources and Evaluation Conference, LREC 2014. [pdf]

Freepal: A Large Collection of Deep Lexico-Syntactic Patterns for Relation Extraction. Johannes Kirschnick, Alan Akbik, Holmer Hemsen. 9th Edition of the Language Resources and Evaluation Conference, LREC 2014. [pdf]


2013

Effective Selectional Restrictions for Unsupervised Relation Extraction. Alan Akbik, Larysa Visengeriyeva, Johannes Kirschnick, Alexander Löser. 6th International Joint Conference on Natural Language Processing, IJCNLP 2013. [pdf]

Propminer: A Workflow for Interactive Information Extraction and Exploration using Dependency Trees. Alan Akbik, Oresti Konomi and Michail Melnikov. The 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013. [pdf]

Automatic Preservation Watch using Information Extraction on the Web. Luis Faria, Alan Akbik, Barbara Sierman, Marcel Ras, Miguel Ferreira and Jose Carlos Ramalho. 10th International Conference on Preservation of Digital Objects, iPres 2013. [pdf]

QuoteMine: A Repository of Newsworthy Quotes. Alan Akbik, Martin Schenck. International Conference of the German Society for Computational Linguistics and Language Technology, GSCL 2013. [pdf]


2012 and earlier

Unsupervised Discovery of Relations and Discriminative Extraction Patterns. Alan Akbik, Larysa Visengeriyeva, Priska Herger, Holmer Hemsen, Alexander Löser. 24th International Conference on Computational Linguistics, COLING 2012. [pdf]

KrakeN: N-ary Facts in Open Information Extraction. Alan Akbik, Alexander Löser. The Knowledge Extraction Workshop at NAACL-HLT, 2012. [pdf]

Wanderlust: Extracting Semantic Relations from Natural Language Text Using Dependency Grammar Patterns. Alan Akbik, Jürgen Broß. Workshop on Semantic Search, WWW 2009. [paper][video]


Other Scientific Output

Workshop Organizer

Together with Larysa Visengeriyeva, I organized the First AHA!-Workshop on Information Discovery in Text, held at the 25th International Conference on Computational Linguistics, COLING 2014. You can download the workshop proceedings here: [pdf]

Programme Committee

  • 2017 European Chapter of the Association for Computational Linguistics, EACL 2017
  • 26th International Conference on Computational Linguistics, COLING 2016
  • 2016 Conference on Information and Knowledge Management, CIKM 2016
  • 5th Workshop on Automated Knowledge Base Construction, AKBC 2016
  • 25th International Conference on Computational Linguistics, COLING 2014
  • 4th Workshop on Automated Knowledge Base Construction, AKBC 2014

Thesis Advisor

While at the Berlin Institute of Technology, I advised a number of Bachelor and Master theses. I am happy to say that many of these works either contributed to or directly resulted in academic publications. Some of the theses were written in German, so I am giving English translations of the titles in parentheses.

Master Thesis: Extracting a Repository of Events and Event References from News Clusters
Silvia Julinda created an approach for mining events and their textual representations from news clusters on the Web. Her thesis work led to a publication at the AHA! Workshop on Information Discovery in Text at COLING 2014.

Master Thesis: Learning Semantic Relations with Distributional Similarity
Priska Herger used clustering methods on large corpora of text to determine broad relationship types that hold between nouns, such as hypernymy, meronymy, co-hypernymy and others.

Diplomarbeit: Automatisierte Extraktion textueller Änderungen aus dem Bearbeitungsverlauf von Online-Nachrichtenartikeln ("Extracting Microedits from Online News Articles")
Christian Niedrich mined online news for 'microedits', i.e. small changes that are made to online news articles after they have been published. He is developing a method that automatically constructs a corpus of such edits.

Bachelor Thesis: A Workflow for Defining Information Extraction Patterns
Oresti Konomi designed a workflow and implemented a tool for defining Information Extraction patterns, aimed at users without a background in NLP. Together with the work from Michail Melnikov's bachelor thesis, this work has been published as an ACL 2013 demo.

Bachelor Thesis: Design und Umsetzung eines Systems zur verteilten Ausführung von patternbasierter Informationsextraktion ("Design and Implementation of a System for Distributed Pattern-Based Information Extraction")
Michail Melnikov built a system that executes complex Information Extraction patterns in a distributed environment. Together with the work from Oresti Konomi's bachelor thesis, this work has been published as an ACL 2013 demo.

Diplomarbeit: Automatisierte Extraktion von Zitaten und zugehörigen Themen aus Webdokumenten ("Automatic Extraction of Quotes and Speakers from Web Documents")
Philipp Keese built and evaluated an Information Extraction system that finds quotes and their speakers in Web documents.

Bachelor Thesis: Mining von Events in Twitter mit Fokus auf deutscher Politik ("Mining Twitter Events with Focus on German Politics")
Kfir Admoni built a system that continuously mines Twitter for 'hot topics' in news related to German politics.

Bachelor Thesis: Gezieltes Retrieval von faktenstarken Sätzen im Web auf Basis von Wikipedia ("Targeted Retrieval of Sentences with High Information Content from the Web")
Do Tuan Ahn built a system that finds and retrieves sentences that contain factual data from the Web.

Bachelor Thesis: Generation and Evaluation of N-ary Extraction Patterns for Open Information Extraction
Stephan Pieper investigated machine learning approaches to automatically learn patterns for N-ary open information extraction.

Bachelor Thesis: Information Extraction von Zitaten in türkischsprachigen Quellen ("Quote-Extraction from Turkish Newswire Text")
Ahmet Karakas built a pipeline for extracting quotes and speakers from Turkish-language text. He will also conduct a survey of existing NLP resources for the Turkish language. The results of his work will be integrated into the QuoteMine project.

Bachelor Thesis: Extraktion von Relationen und Konzepten von komplexen Nominalphrasen ("Extraction of Relations and Concepts from Complex Noun Phrases")
Umar Maqsud extracted information from complex noun phrases and investigated when such phrases can be used as concepts in a knowledge base.

Bachelor Thesis: Implementierung und Evaluierung eines Verfahrens zur Erhöhung der Qualität der flachen Extraktion komplexer Nominalphrasen ("Using World Knowledge to find Complex Noun Phrases in Shallow Parsing")
Stefan Schramm investigated a number of 'world knowledge' features in a CRF classifier for finding complex noun phrases, something that can normally only be achieved using a deep syntactic parser. He trained a classifier using different feature sets and evaluated the results.


My Master Thesis

Wanderlust: Extracting Semantic Relations from Natural Language Text Using Dependency Grammar Patterns

This is my Master thesis, which I submitted in May 2009. I examine Relation Extraction as a method for automatically generating a semantically annotated wiki from the English Wikipedia. This work has given rise to many ideas I have since been exploring.

Alan Akbik

Text and Data Mining
IBM Research
akbika [ät] us [dot] ibm [dot] com