
Increase compression level and simplify PDF before caching to reduce storage requirements

Adam Reichold requested to merge smaller-pdf into main

The Zstd compression level mainly affects compression speed, not decompression speed. By now we are storing multiple gigabytes of cached responses, and the cost of compression is only incurred together with the even slower network requests. It therefore seems prudent to raise the compression level significantly, from its default value of 3 to 9, which should reduce our storage requirements as individual responses are updated.

Furthermore, large PDF documents take up a lot of space even though we are only interested in their text content and metadata. Hence, a new function pdf::simplify is introduced, which uses the QPDF library to remove all images and improve the compressibility of these files.
