Drop identical datasets to avoid racing overwrites
If a task overwrites a dataset while the previous writer is still active, the dataset can end up corrupted. The simplest way to avoid this seems to be dropping the second, identical dataset instead of overwriting it, which is also more efficient.
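
A minimal sketch of the intended behaviour, assuming a store keyed by dataset identifier (names here are illustrative, not the actual code):

    from dataclasses import dataclass

    @dataclass
    class Dataset:
        identifier: str
        payload: dict

    def store_or_drop(dataset: Dataset, seen_ids: set, write) -> bool:
        """Return True if the dataset was stored, False if it was dropped."""
        if dataset.identifier in seen_ids:
            # A writer for this identifier is (or was) already active;
            # dropping the identical dataset avoids the racing overwrite
            # and saves the second write entirely.
            return False
        seen_ids.add(dataset.identifier)
        write(dataset)  # hypothetical storage call
        return True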
This has not hit us yet: no harvester spawns additional tasks, so we are currently limited to a single task per source, and identical datasets are only a problem within a given source. We should fix it nevertheless, as there is no mechanism that prevents a harvester from using multiple tasks to increase throughput.
This also changes the nomenclature from "duplicate" to "identical" to differentiate this case of duplicate identifiers from the duplicate detection based on fingerprints.