Embedding Learning Through Multilingual Concept Induction


This page provides additional information and data for the paper named above.

Supplementary material (including a readme with a detailed description) can be downloaded below.

We provide the concept-based wordspaces for the methods SAMPLE, CLIQUE and N(t) (for both WORD and CHAR). The embedding spaces are based on 1664 Bible editions and span 1259 languages. The embedding dimension is 200.
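
As a rough illustration of how such a wordspace might be used, the sketch below parses embeddings and compares two entries by cosine similarity. The file layout is an assumption, not something stated on this page: it supposes one entry per line in the form `token v1 v2 ... v200`, separated by whitespace. The readme in the supplementary material describes the actual format and should be consulted first. The function names `load_wordspace` and `cosine` and the language-prefixed tokens in the sample are hypothetical.

```python
import io
import math


def load_wordspace(lines, dim=200):
    """Parse lines of the ASSUMED form 'token v1 ... v_dim' into a dict.

    The real file format is documented in the readme; adjust as needed.
    """
    vectors = {}
    for line in lines:
        parts = line.split()
        # Skip blank lines and any line (e.g. a header) of unexpected length.
        if len(parts) != dim + 1:
            continue
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Tiny in-memory stand-in with 3 dimensions; tokens are made up for illustration.
sample = io.StringIO(
    "eng:water 1.0 0.0 0.0\n"
    "deu:wasser 1.0 0.0 0.0\n"
    "eng:fire 0.0 1.0 0.0\n"
)
space = load_wordspace(sample, dim=3)
similarity = cosine(space["eng:water"], space["deu:wasser"])
```

For the provided data, an open file handle would be passed in place of the in-memory sample, with `dim` left at 200. Given the 27GB archive size, streaming line by line and keeping only the entries of interest is likely preferable to loading everything into memory.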


wordspaces.zip [27GB]

supplementary.zip [<1MB]


If you use the provided data in your work, please cite the following paper.

title={Embedding Learning Through Multilingual Concept Induction},
author={Dufter, Philipp and Zhao, Mengjie and Schmitt, Martin and Fraser, Alexander and Sch{\"u}tze, Hinrich},
booktitle={Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},

Contact: Philipp Dufter