About

I am a postdoctoral researcher at IBM Research Almaden in San Jose, California, working at the intersection of natural language processing (NLP) and large-scale data mining technologies. My current research focus is Shallow Semantic Parsing and Information Extraction in multilingual data. To enable this, I am researching an approach to auto-generate semantic role labelers for arbitrary languages that parse different languages into a shared semantic abstraction. I pursue this research in order to enable the development of crosslingual Information Extraction and Question Answering applications, as well as to facilitate studies of crosslingual semantics.

If you'd like to know more, check out my publications and my projects, or contact me.


Latest News

  • News (09.12.2016): Version 1.0 of Universal Proposition Banks released! It consists of treebanks in several languages with "universal" semantic role labeling annotation.
  • News (11.10.2016): Will give a guest talk at UC Berkeley on multilingual SRL on November 17th!
  • News (30.09.2016): Demo paper on Multilingual IE accepted at COLING 2016!
  • News (20.09.2016): Two full papers accepted at COLING 2016!
  • News (30.08.2016): Check out a screencast of PolyglotIE, our multilingual Information Extraction system!
  • News (29.07.2016): Paper on Semantic Role Labeling of underresourced languages accepted at EMNLP 2016!
  • News (15.06.2016): Check out a screencast of our Multilingual Semantic Role Labeler!
  • News (23.05.2016): Demonstration paper on Multilingual Semantic Role Labeling with Unified labels accepted to ACL 2016!
  • News (13.04.2016): Successfully defended my PhD thesis!


Latest Publications

Multilingual Information Extraction with PolyglotIE. Alan Akbik, Laura Chiticariu, Marina Danilevsky, Yonas Kbrom, Yunyao Li and Huaiyu Zhu. 26th International Conference on Computational Linguistics, COLING 2016. [pdf][video]

K-SRL: Instance-based Learning for Semantic Role Labeling. Alan Akbik and Yunyao Li. 26th International Conference on Computational Linguistics, COLING 2016. [pdf]

Multilingual Aliasing for Auto-Generating Proposition Banks. Alan Akbik, Xinyu Guan and Yunyao Li. 26th International Conference on Computational Linguistics, COLING 2016. [pdf]

Towards Semi-Automatic Generation of Proposition Banks for Low-Resource Languages. Alan Akbik, Vishwajeet Kumar and Yunyao Li. 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016. [pdf]

Polyglot: Multilingual Semantic Role Labeling with Unified Labels. Alan Akbik and Yunyao Li. 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016. [pdf]

Generating High Quality Proposition Banks for Multilingual Semantic Role Labeling. Alan Akbik, Laura Chiticariu, Marina Danilevsky, Yunyao Li, Shivakumar Vaithyanathan and Huaiyu Zhu. 53rd Annual Meeting of the Association for Computational Linguistics, ACL 2015. [pdf]

more publications


Main Research

Multilingual Semantic Role Labeling. In this line of research, I am investigating methods for semantically parsing text data in a wide range of languages, such as Arabic, Chinese, German, Hindi, Russian and many others. In order to train such parsers, we are automatically generating Proposition Bank-style resources from parallel corpora. We are making all resources publicly available, so check out the project overview page for more details and the generated Proposition Banks.

Multilingual Information Extraction. Text data is readily available in a multitude of human languages; on the Web and elsewhere, trends point to a relative decline of English and a rise in the use of non-English languages. Effectively mining such data for structured information of interest is a huge challenge, since traditionally, separate NLP pipelines and extractors need to be built for each language. In this research, I am investigating more cost-effective ways of creating high-quality extractors for multilingual data.

multilingual text analytics


Alan Akbik

Text and Data Mining
IBM Research
akbika [ät] us [dot] ibm [dot] com