Benchmark for Slot Filling Relation Classification


The code provided on this page can be used to create the Slot Filling relation classification benchmark presented in [1] (pdf).

The resulting dataset consists of labeled sentences for relation classification. Statistics of the dataset are provided here.

We also provide scripts to reproduce our dev-eval splits (both year-wise and genre-wise).



Code: codeSFbenchmark

To run the code, you need access to the TAC source corpus (2010, 2013), the query files (2012, 2013, 2014) and the assessments (2012, 2013, 2014). This data is distributed by LDC (TACdata).

The code calls the Stanford CoreNLP toolkit [2].

Additional Resources for the Paper

For the paper, we trained and evaluated different features and architectures for each model type. For each type, the model with the best macro F1 score was selected, and its relation-specific results are presented in the paper. The relation-specific results of the other models can be found here.
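To illustrate model selection by macro F1, here is a minimal Python sketch. It is not taken from the released code. The function names (macro_f1, select_best_model), the model names, and the input format (one gold and one predicted relation label per sentence) are assumptions made for illustration, and the exact scoring protocol used in the paper may differ.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Return (macro F1, per-relation F1) for parallel lists of labels.

    Assumption: each sentence has exactly one gold and one predicted
    relation label; the average runs over the relations seen in `gold`.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted relation p, but it was wrong
            fn[g] += 1  # missed the gold relation g

    per_relation = {}
    for rel in sorted(set(gold)):
        p_den = tp[rel] + fp[rel]
        r_den = tp[rel] + fn[rel]
        precision = tp[rel] / p_den if p_den else 0.0
        recall = tp[rel] / r_den if r_den else 0.0
        pr_sum = precision + recall
        per_relation[rel] = 2 * precision * recall / pr_sum if pr_sum else 0.0

    # Macro averaging: every relation counts equally, regardless of frequency.
    macro = sum(per_relation.values()) / len(per_relation) if per_relation else 0.0
    return macro, per_relation


def select_best_model(gold, predictions_by_model):
    """Return the name of the model whose predictions reach the highest macro F1."""
    return max(
        predictions_by_model,
        key=lambda name: macro_f1(gold, predictions_by_model[name])[0],
    )


if __name__ == "__main__":
    # Toy labels and hypothetical model names, for illustration only.
    gold = ["a", "a", "b", "b"]
    candidates = {
        "model_x": ["a", "a", "b", "a"],
        "model_y": ["a", "b", "a", "b"],
    }
    best = select_best_model(gold, candidates)
    print(best, macro_f1(gold, candidates[best]))
```

Because macro F1 averages the per-relation scores without weighting by frequency, a model that does well only on frequent relations is not favored over one that performs evenly across all relations.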


If you use the provided code in your work, please cite the following paper:

  title={Comparing Convolutional Neural Networks to Traditional Models for Slot Filling},
  author={Heike Adel and Benjamin Roth and Hinrich Sch\"{u}tze},
  booktitle={{NAACL} {HLT} 2016, The 2016 Conference of the North American Chapter
               of the Association for Computational Linguistics: Human Language Technologies,
               San Diego, California, USA, June 12 - June 17, 2016}


[1] Heike Adel, Benjamin Roth and Hinrich Schütze: "Comparing Convolutional Neural Networks to Traditional Models for Slot Filling", NAACL 2016.

[2] Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky: "The Stanford CoreNLP Natural Language Processing Toolkit", ACL System Demonstrations 2014.

Contact: Heike Adel (website)