Start adding scripts for ad-hoc analyses of the metadata catalogue.
Created by: adamreichold
Such analyses can, for example, be used to inform decisions on software optimizations, as in the second commit included here.
The analysis itself finishes in about 10 seconds when run against a local release build of the server and a catalogue of 65k datasets:
> time python3 resources.py
1: 36.0 %
2: 59.4 %
3: 69.0 %
4: 78.1 %
0: 87.0 %
5: 94.5 %
13: 96.1 %
6: 97.1 %
8: 97.6 %
9: 98.1 %
...
44: 100.0 %
57: 100.0 %
84: 100.0 %
61: 100.0 %
65: 100.0 %
real 0m8,661s
user 0m1,306s
sys 0m0,043s
Fetching the same amount of data over an SSH tunnel from our server takes a bit more than a minute and produces between 20% and 30% CPU utilization on that server.
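
Judging by the output, resources.py seems to print, for each number of resources per dataset, the cumulative share of datasets, ordered by the most common count first. The following is a minimal sketch of such an analysis; the base URL, the /search endpoint, its paging parameters, and the results/resources field names are assumptions made for illustration, not the actual catalogue API.

```python
#!/usr/bin/env python3
# Sketch of an ad-hoc analysis in the spirit of resources.py.
# The endpoint, paging parameters, and field names below are assumptions.

import json
from collections import Counter
from urllib.request import urlopen

BASE_URL = "http://localhost:8081"  # assumed local release build of the server


def fetch_datasets():
    """Page through a hypothetical JSON search endpoint, yielding datasets."""
    offset, limit = 0, 1000
    while True:
        with urlopen(f"{BASE_URL}/search?offset={offset}&limit={limit}") as resp:
            page = json.load(resp)
        if not page["results"]:
            break
        yield from page["results"]
        offset += limit


def main():
    # Count how many resources each dataset carries.
    counts = Counter(len(dataset.get("resources", [])) for dataset in fetch_datasets())
    total = sum(counts.values())

    # Print the cumulative share of datasets, most common resource count first.
    cumulative = 0
    for resources, datasets in counts.most_common():
        cumulative += datasets
        print(f"{resources}: {100 * cumulative / total:.1f} %")


if __name__ == "__main__":
    main()
```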