Extend language model to handle sources available in multiple languages
Many Datasets, e.g. on KLiVO, are available in multiple languages. Duplicating such a Dataset, once in English and once in German, seems wasteful.
The easiest solution would be to change the language field in the dataset structure from a single value into a vector of languages.
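A rough sketch of what this could look like, written in Python purely for illustration. The class name `Dataset`, the fields `title`, `url` and `languages`, and the use of two-letter language codes are all assumptions, not the project's actual structure. The sketch also accepts a bare string, so that existing single-language entries would not need to be rewritten.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    """Hypothetical dataset record; all field names are illustrative."""
    title: str
    url: str
    # Assumed change: a list of language codes instead of a single code.
    languages: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Accept a bare string so legacy single-language entries still load.
        if isinstance(self.languages, str):
            self.languages = [self.languages]
        # Lower-case the codes and drop duplicates while preserving order.
        self.languages = list(dict.fromkeys(code.lower() for code in self.languages))

    def available_in(self, code: str) -> bool:
        """Return True if the dataset is offered in the given language."""
        return code.lower() in self.languages


# One entry covers both languages instead of two duplicated entries.
multilingual = Dataset("Example dataset", "https://example.org/data", ["en", "de"])
# A legacy entry with a single language string is wrapped into a list.
legacy = Dataset("Legacy dataset", "https://example.org/legacy", "EN")
```

Filtering by language then becomes a membership test, e.g. `multilingual.available_in("de")`, rather than an equality check on a single field.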