Explicitly handle duplicate dataset identifiers per source

Created by: adamreichold

It appears that some sources allocate duplicate dataset identifiers, which we currently handle implicitly via last-write-wins. This should be replaced by explicit handling that decides which version should be used.
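
To make the discussion concrete, here is a minimal sketch of what explicit handling could look like. Everything in it is an assumption for illustration: the `Dataset` struct, its `modified` field, the `prefer_new` policy and the `dedup_datasets` function are hypothetical names, not existing code. The policy shown (most recently modified version wins, ties keep the first one seen) is only one possible choice.

```rust
use std::collections::hash_map::{Entry, HashMap};

/// Hypothetical, simplified dataset record; the real type has more fields.
#[derive(Debug, Clone, PartialEq)]
struct Dataset {
    id: String,
    title: String,
    /// Stand-in for a real modification date, larger means newer.
    modified: u32,
}

/// Assumed policy: the most recently modified version wins,
/// and on a tie the version seen first is kept.
fn prefer_new(existing: &Dataset, candidate: &Dataset) -> bool {
    candidate.modified > existing.modified
}

/// Collects the datasets of one source, resolving duplicate identifiers
/// through `prefer_new` instead of silently overwriting earlier entries.
/// Also returns one identifier per duplicate occurrence so it can be reported.
fn dedup_datasets(
    datasets: impl IntoIterator<Item = Dataset>,
) -> (HashMap<String, Dataset>, Vec<String>) {
    let mut by_id = HashMap::new();
    let mut duplicates = Vec::new();

    for dataset in datasets {
        match by_id.entry(dataset.id.clone()) {
            // First time this identifier is seen: just store it.
            Entry::Vacant(entry) => {
                entry.insert(dataset);
            }
            // Identifier already present: record it and apply the policy.
            Entry::Occupied(mut entry) => {
                duplicates.push(dataset.id.clone());
                if prefer_new(entry.get(), &dataset) {
                    entry.insert(dataset);
                }
            }
        }
    }

    (by_id, duplicates)
}
```

The returned list of duplicate identifiers could be logged or counted per source, so that the affected sources become visible instead of being masked by the overwrite.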