Select base dataset when merging datasets with the same fingerprint

Jakob Deller requested to merge merge-by-modified into main

Datasets with the same fingerprint have to be merged into one.

The "base" dataset should be selected according to the guidelines given by the DCAT-AP.de manual, e.g. the dataset with the most recent modification date has to be selected. In practice this is complicated, as many datasets do not carry a modification date, or all sources provide the same date, in which case it cannot be used for selection.

After analysis of the data, this MR proposes a ranking of dataset sources to be considered in the selection process.
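
As a rough illustration of the selection logic described above (not the code in this MR), the following sketch picks the base dataset from a group that already shares one fingerprint. The `Dataset` struct, the `select_base` and `source_rank` functions, the use of a Unix timestamp for `modified`, and the source names are all hypothetical. It also assumes that a dataset with a modification date beats one without, and that the source ranking only breaks ties.

```rust
/// Hypothetical, simplified dataset record; the real type has many more fields.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Dataset {
    /// Name of the source (harvester) the dataset came from.
    source: String,
    /// Modification date as a Unix timestamp, if the source provides one.
    modified: Option<i64>,
}

/// Position of `source` in the ranking (lower is preferred).
/// Sources missing from the ranking sort after all ranked ones.
fn source_rank(ranking: &[&str], source: &str) -> usize {
    ranking
        .iter()
        .position(|ranked| *ranked == source)
        .unwrap_or(ranking.len())
}

/// Select the base dataset from a group sharing the same fingerprint.
///
/// The most recent modification date wins (`None` orders before any `Some`).
/// If dates are missing or identical, the better-ranked source wins.
fn select_base<'a>(group: &'a [Dataset], ranking: &[&str]) -> Option<&'a Dataset> {
    group.iter().max_by(|a, b| {
        a.modified.cmp(&b.modified).then_with(|| {
            // Reversed comparison: a lower rank index means a higher preference.
            source_rank(ranking, &b.source).cmp(&source_rank(ranking, &a.source))
        })
    })
}
```

With a ranking of `["portal-a", "portal-b"]`, two datasets that both lack a modification date resolve to the one from `portal-a`, while a dataset from `portal-b` with a strictly newer date still wins over an older one from `portal-a`.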

Also, this MR implements some Rust-Fu to optimize hardware access when de-duplicating datasets.

Edited by Adam Reichold