Traffic analysis on crystal NoSkE

Nikola Ljubešić

Apr 11, 2024, 8:51:51 AM
to no...@sketchengine.co.uk
Dear all,

We would like to analyse traffic on our CLARIN.SI noSketch concordancers, i.e. https://www.clarin.si/ske/ and its variant with user logins, https://www.clarin.si/skelog/. We want to find out how much, and in what way, specific corpora are being used, so that we can better decide which corpora and features are worth developing further.

We see two main options for doing this (we may be missing some):
- cache analysis - previous searches could give some (limited) insight, but the cache was not really meant for this, and its format seems difficult to analyse
- web traffic analysis based on the Apache web logs - there might already be tools / solutions out there
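
To make the second option concrete, here is a minimal sketch of counting requests per corpus from Apache access log lines. It rests on assumptions that have not been checked against our instances: that requests reach bonito through a path containing /run.cgi/, that the corpus is passed as a `corpname` query parameter, and that the last path segment names the operation. The sample log lines, corpus names and paths are made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# The request line is the quoted field of the form "METHOD target HTTP/x.y"
# in Apache's common and combined log formats.
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[^"]*"')


def count_corpus_usage(lines):
    """Count (corpus, endpoint) pairs for requests that go to run.cgi."""
    usage = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # not a parsable request line
        url = urlsplit(match.group("target"))
        if "/run.cgi/" not in url.path:
            continue  # static files and other non-bonito requests
        corpora = parse_qs(url.query).get("corpname")
        if not corpora:
            continue  # e.g. POST requests that carry parameters in the body
        endpoint = url.path.rsplit("/", 1)[-1]
        usage[(corpora[0], endpoint)] += 1
    return usage


# Hypothetical log lines; real paths and parameters may differ.
sample = [
    '192.0.2.1 - - [11/Apr/2024:08:51:51 +0200] "GET /ske/bonito/run.cgi/concordance?corpname=corpus_a&q=x HTTP/1.1" 200 512',
    '192.0.2.2 - - [11/Apr/2024:08:52:10 +0200] "GET /ske/bonito/run.cgi/concordance?corpname=corpus_a&q=y HTTP/1.1" 200 734',
    '192.0.2.3 - - [11/Apr/2024:08:53:02 +0200] "GET /ske/bonito/run.cgi/wordlist?corpname=corpus_b&wlattr=word HTTP/1.1" 200 301',
    '192.0.2.4 - - [11/Apr/2024:08:53:40 +0200] "GET /ske/index.html HTTP/1.1" 200 1024',
    '192.0.2.5 - - [11/Apr/2024:08:54:15 +0200] "POST /ske/bonito/run.cgi/concordance HTTP/1.1" 200 622',
]

usage = count_corpus_usage(sample)
for (corpus, endpoint), hits in usage.most_common():
    print(corpus, endpoint, hits)
```

A sketch like this would miss any request that sends its parameters in a POST body, so the counts should be read as a lower bound unless the web server is configured to log those as well.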

Has anybody else tried something similar in the context of noSketch Engine, so that we don't have to start from scratch?

Thanks,

Nikola and Tomaž

Omar Siam

Apr 11, 2024, 2:48:33 PM
to NoSketch Engine, nlju...@gmail.com
We tried something like this for a NoSkE instance of ours. The idea was to log each request in the web server that runs bonito/run.cgi and pass it on to Matomo.
We ended up not doing this because it was too much for what we needed, but I still think it should work.
https://github.com/acdh-oeaw/noske-ubi9/blob/main/lighttpd.conf#L36
https://github.com/acdh-oeaw/noske-ubi9/blob/main/import_logs.py

Nikola Ljubešić

Apr 15, 2024, 9:50:01 AM
to Omar Siam, NoSketch Engine
Thank you, Omar, for this! We will have a detailed look.

Open to input from others, of course!

Best,

Nikola