Port ELWIS harvester to Rust to simplify future schema changes.

Adam Reichold requested to merge OC000014987132/metadaten:port-elwis into main

This is a verbatim port that keeps all the TODO markers "intact", i.e. the metadata quality is the same as before. I did this because I would prefer to get rid of the Python infrastructure and iterate on the Rust code base. In addition, PDF indexing seems to be broken in ELWIS itself for now, so I could not have tested it even if I had implemented it.

Note that this can currently only be run with the CHECK_SOURCE_URL part in xtask/src/main.rs commented out, because the Python code produced technically invalid URLs (missing percent-encoding of characters like +) and I did not want to reproduce that behaviour.
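
To illustrate the kind of difference meant here, the following is a minimal sketch of strict percent-encoding for a single URL component, using only the standard library. The function name `percent_encode_component` is hypothetical and is not part of the harvester. The choice to encode everything outside the RFC 3986 "unreserved" set is an assumption made for illustration. The ported code may well rely on a URL library with a different encode set.

```rust
/// Hypothetical helper, for illustration only: percent-encode every byte
/// that is not an RFC 3986 "unreserved" character (ALPHA / DIGIT / - . _ ~).
/// Non-ASCII input is encoded byte by byte from its UTF-8 representation.
fn percent_encode_component(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            // Unreserved characters are passed through unchanged.
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            // Everything else, including '+' and ' ', becomes %XX (uppercase hex).
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}
```

With this sketch, `percent_encode_component("a+b c")` yields `a%2Bb%20c`, whereas a URL built without such encoding would contain the literal `+`, so the two no longer compare equal in a check against the previously harvested source URLs.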

When this is done, the locales setup code here and here can be removed, cf. umwelt-info/infrastruktur/entwicklung!58 (merged) and umwelt-info/infrastruktur/testbetrieb!74 (merged).

Edited by Adam Reichold
