Hi Jonas!
Jonas Waeber wrote on 19.04.2018 at 09:30:
> Yes, it is an interesting endeavor. Unfortunately it will most likely
> stay theoretical as I am only staying on the Project for two more months
> and our financial resources are limited.
Oh, what a pity. What is the future of the Bartoc Skosmos installation
then? Are you (as an institution) planning to continue maintaining it?
> Then this is not an option as the global search is the core feature of
> Bartoc Skosmos. Even though it is way too slow currently to be very useful.
Yes, global search is a big challenge. We are not happy with how it
works in Finto.fi either, and it's much smaller than your installation.
> Probably the biggest problem. Everything is currently running on a 16 GB
> machine. (e.g. Skosmos, Fuseki, Varnish, Upload Routines). It would
> probably make more sense to run Skosmos & Varnish on a Server and Fuseki
> on another. Fuseki reserves 8 GB, which leaves 8 GB for the rest. Fuseki
> runs out of RAM from time to time.
16 GB isn't very much for what you are doing. I would recommend
increasing this; at least doubling it would be good.
> Does this happen when reloading vocabularies with PUT request to the
> Fuseki? Or would I have to actually delete the index files and then load
> all the vocabularies from scratch.
Unfortunately TDB tends to grow each time you update the data, whether
using PUT or POST or SPARQL updates. Even if the number of triples stays
the same, the size on disk tends to grow. With TDB1 the only way to fix
this is to start over with an empty database. With the new TDB2 there is
also a "compact" operation which will essentially rebuild the database,
but you will have to take down Fuseki for the operation. Even then you
will end up with two copies of the database - old and new - and you will
have to manually delete the old database. So it doesn't help much.
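
To illustrate the cleanup step after a TDB2 compact, here is a minimal
Python sketch. It assumes the usual TDB2 layout, where each generation of
the database lives in a subdirectory named Data-NNNN; please check this
against your own database directory. The function name stale_generations
is made up for this example. It only lists candidates and deletes nothing.

```python
import re
from pathlib import Path

# Assumption: TDB2 stores each generation in a subdirectory named Data-NNNN.
GENERATION = re.compile(r"^Data-(\d+)$")


def stale_generations(db_dir):
    """Return every Data-NNNN directory except the newest one, oldest first."""
    found = []
    for child in Path(db_dir).iterdir():
        match = GENERATION.match(child.name)
        if child.is_dir() and match:
            found.append((int(match.group(1)), child))
    found.sort(key=lambda pair: pair[0])
    # The last entry is the current generation, so it is never reported.
    return [path for _, path in found[:-1]]
```

You would call it with the database directory, review the returned paths,
and remove them by hand once you are sure the new generation works.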
> The issue here is, that downloading, processing and uploading takes now
> more than a day for all vocabularies. Which uses all the RAM for large
> vocabularies.
Maybe the TDB could be built on another machine? It's just a bunch of
files in a directory, so easy to transfer or to use some kind of shared
filesystem.
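
A rough sketch of that idea in Python: build the TDB on a separate machine
with Jena's tdbloader, then copy the directory over with rsync. The file
names, paths and host name are placeholders, and build_plan is a made-up
helper. It only assembles the two commands and runs nothing. How the files
map to named graphs is left out, and Fuseki should not have the target
directory open while the files are swapped.

```python
import shlex


def build_plan(rdf_files, local_db, remote_target):
    """Return (load_cmd, sync_cmd) as argument lists; nothing is executed."""
    # Jena's TDB1 bulk loader: tdbloader --loc=<dir> <files...>
    load_cmd = ["tdbloader", f"--loc={local_db}", *rdf_files]
    # The trailing slash makes rsync copy the directory contents.
    source = local_db.rstrip("/") + "/"
    sync_cmd = ["rsync", "-a", "--delete", source, remote_target]
    return load_cmd, sync_cmd


load_cmd, sync_cmd = build_plan(
    ["vocab-a.ttl", "vocab-b.ttl"],   # placeholder file names
    "/data/tdb-build",                # placeholder build directory
    "skosmos-host:/data/tdb-live/",   # placeholder host and path
)
print(shlex.join(load_cmd))
print(shlex.join(sync_cmd))
```

Keeping the commands as plain lists makes it easy to review them first and
pass them to subprocess.run later.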
> This is definitely a bottleneck then. The current server has only 2
> cores... Will have to see if this can be changed.
Sounds like adding more cores (and RAM) would be the easiest way of
improving the situation!
For the record, Finto.fi currently has 4 (virtual) CPU cores, 16 GB RAM
and a fast SAN disk (at least partly SSD backed). We are currently
setting up new servers with 4 cores and 32 GB RAM.
-Osma