Consider hyphenation during tokenization
We currently do not find "Hochwasserereignisse" when searching for "Hochwasser". To address this, we could consider hyphenation to produce additional tokens from longer words, with some limits to avoid useless break points, e.g. a minimum length for the source word and for the generated tokens. The implementation could be based on either hypher or hyphenation.
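A rough sketch of what this could look like, assuming the hypher crate with its `hyphenate(word, Lang::German)` API: split a sufficiently long word at each hyphenation boundary into a prefix and a suffix and keep only the parts above a minimum length. The thresholds, the two-way split strategy and the `sub_tokens` helper are illustrative assumptions, not a decided design.

```rust
use hypher::{hyphenate, Lang};

/// Minimum length of the original word before we try to split it at all.
const MIN_SOURCE_LEN: usize = 12;
/// Minimum length of each generated sub-token.
const MIN_TOKEN_LEN: usize = 6;

/// Produces additional tokens for `word` by splitting it at each hyphenation
/// boundary into a prefix and a suffix, keeping only sufficiently long parts.
/// (Case folding and other normalization are left out of this sketch.)
fn sub_tokens(word: &str) -> Vec<String> {
    if word.chars().count() < MIN_SOURCE_LEN {
        return Vec::new();
    }

    let mut tokens = Vec::new();
    let mut offset = 0;

    // hypher yields the syllables of the word as string slices; the running
    // byte offset after each syllable is a possible split point.
    for syllable in hyphenate(word, Lang::German) {
        offset += syllable.len();
        if offset == word.len() {
            break; // no split after the last syllable
        }
        let (prefix, suffix) = word.split_at(offset);
        if prefix.chars().count() >= MIN_TOKEN_LEN {
            tokens.push(prefix.to_string());
        }
        if suffix.chars().count() >= MIN_TOKEN_LEN {
            tokens.push(suffix.to_string());
        }
    }

    tokens
}

fn main() {
    // Should include "Hochwasser" among the sub-tokens if the German
    // patterns place a break point after "Hochwasser".
    println!("{:?}", sub_tokens("Hochwasserereignisse"));
}
```

Whether both halves should be indexed, and how this interacts with the query side, would still need to be decided.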