Draft: Profit from scraper lfu-rlp and integrate further federal institutions of Rheinland-Pfalz

Stefan Krämer requested to merge other_rlp into main

RLP websites structured similarly to lfu.rlp.de:

  • snu.rlp.de
  • wegezumholz.rlp.de
  • fawf.wald.rlp.de
  • umdenken.rlp.de
  • naturschutzstationen.rlp.de
  • effnet.rlp.de
  • badeseen.rlp.de
  • umgebungslaerm.rlp.de
  • luft.rlp.de
  • mkuem.rlp.de
  • lua.rlp.de

Websites that will be considered separately:

  • wald.rlp.de: the sitemap selector does not work
  • klimaneutrales.rlp.de: the sitemap selector does not work

The website bildung.rlp.de was not included, as environment and nature conservation are barely represented in its sitemap.

Still in draft mode, as several minor issues have to be resolved first:

  • Some titles contain too much information ("Alle Badeseen", "Aktuelle Projekte SNU")
  • Some titles on the detail page are more informative than the titles in the sitemap (e.g. Fischotter [SNU])
  • FAWF FAQs have unspecific titles
  • Descriptions in Wegezumholz contain "Menü" at the end
  • More pages than are actually available are scraped for the category "Aktuelles"
  • SNU has empty pages in "Links und Downloads"
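
For the trailing "Menü" in the Wegezumholz descriptions, one possible cleanup step is sketched below. This is only an illustration in Python: the function name `strip_trailing_menu` and its `marker` parameter are hypothetical and not part of the scraper in this merge request, and the sketch assumes that "Menü" appears as a standalone last word of the extracted description.

```python
def strip_trailing_menu(description: str, marker: str = "Menü") -> str:
    """Remove a standalone trailing marker word (hypothetical helper).

    Assumption: the marker is navigation residue appended to the
    description text, separated from it by whitespace.
    """
    # Trailing whitespace is always removed, marker or not.
    text = description.rstrip()
    if not text.endswith(marker):
        return text
    head = text[: -len(marker)]
    # Only strip when the marker stands alone; a word that merely ends
    # in the marker (e.g. "Mittags-Menü") is left untouched.
    if head and not head[-1].isspace():
        return text
    return head.rstrip()
```

Applied to a description such as "Holz als Baustoff Menü", this would yield "Holz als Baustoff", while descriptions that do not end in the standalone marker are returned unchanged apart from trailing whitespace.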

Another task that has to be done separately:

  • Solve PDF error
Edited by Stefan Krämer
