Technical rework of HND Bayern harvester
Some remarks in no particular order:
- Functions
fetch_links
andfetch_hnd
just called once with their result forwarded, i.e. they do not structure anything and can be removed. - Filter
all_links
before deduplication to avoid sorting/deduplicating irrelevant entries. - Use an enum type
Parameter
to avoid unchecked matching on stringly-typed data, i.e. make invalid states irrepresentable. -
to_string
is almost always "wrong": It is used to convert types into a human-readable representation, e.g. numbers into text. To turn a string slice into an owned text, meaning text into another kind of text,to_owned
should be preferred. -
Sites
is a structure describe a single site, hence it is renamed toSite
. But the variable containing a list of sites is stillsites
of typeVec<Site>
. - In regular expression, use capture groups to capture exactly what is required,
nothing more. This avoids any post-processing using e.g.
replace
. - If the regular expression contains a capture group which is always present
if the regular expressed matched at all, there is no need for error handling, i.e.
captures[index]
orcaptures.get(index).unwrap()
can be used. - Reduce the scope of bindings as much as possible and avoid making them mutable, e.g.
prefer
let region = if let Some(muni) = muni { Region::Other(muni) } else { Region::BY}
tolet mut region = Region::BY; if let Some(muni) = muni { region = Region::Other(muni); }
as the first form is clearer as it indicates that there can be no more changes later on. - Avoid producing owned types like
String
if they will just beformat!
ted into a larger text elsewhere to avoid needless allocations via calls toto_owned
. - When turning
Option
(e.g. the captures of a regular expression which might have matched or not) intoOption
(e.g. a site ID which might be missing), this can often be expressed more succinctly using theOption::map
combinator. It is not just about less text though, asmap
immediately implies a standard mode of operation whereas pattern matching can express anything and the reader has to figure it out each time it is read.